Thursday, April 10, 2025

Answer: Can you extract and summarize a blog?

 Can AI systems really read the web? 


P/C Gemini [elderly robot reading a long scroll of blog names]. No, I don't recognize any of the blogs shown on the scroll, nor do I know why it uses so many different fonts in a ransom-note style.  I don't trust these results, and you shouldn't either. 


One of the latest claims being made by the LLM / AI providers is that the "knowledge cutoff problem" isn't a real problem any more.  Most providers say that their AIs now have access to real-time web information.  For instance, when I asked ChatGPT about its live connection, this is what it told me.  (With a live citation to an article on Lifewire.)  




That sounds great, right?   But let's do a quick test and find out... that's the essence of this week's SRS Challenge:  

1. Can you get an LLM AI system to give you links to the 10 most recent posts on a blog?  For instance, could you get Gemini, Perplexity, ChatGPT, or MS Copilot to make a table of the 10 most recent blog posts?

That seems like a simple enough request for an AI system with live web access, yes?  

Not to tip my hand or anything, but the reality of asking an AI to do something this simple just boggled my mind.  

Just to check my suppositions about AI tools that can access the internet, I did the obvious search and found several articles documenting that they claim such access.  (see: WritingMate, YourEverydayAI, Otterly)  Note that this list will change over time as more and more LLMs gain live search-engine access.  

It's easy to find out what the providers say. I just did a few site searches like this to find out: 

     [ site:openai.com live web access ] 

For instance, here's OpenAI's comment about live access:  OpenAI blog
The claim is that Google Gemini, OpenAI ChatGPT, and Perplexity all have live access to web content.  To test this claim, I gave this query to each of them:  

     [can you give me links to the 10 most recent blog
        posts from searchresearch1.blogspot.com]

Results: 

     Perplexity gave 3 links 
     ChatGPT 4.0 gave 0 links   (but see below)  
     Gemini Deep Research with 2.5 Pro gave me 10 links

Well, dang... okay.  
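Incidentally, the ground truth for this Challenge is easy to get without an LLM at all: Blogger blogs publish a standard Atom feed of their most recent posts. Here's a minimal sketch (the feed path is Blogger's well-known convention; the parsing code is mine, not something from the Challenge):

```python
# Sketch: fetch a Blogger blog's own Atom feed and list its most recent
# posts -- the ground truth an LLM with live web access should reproduce.
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
# Blogger exposes a feed at this conventional path; max-results caps the count.
FEED_URL = "https://searchresearch1.blogspot.com/feeds/posts/default?max-results=10"

def recent_posts(xml_text: str):
    """Return (title, date, url) tuples for each entry in an Atom feed."""
    root = ET.fromstring(xml_text)
    posts = []
    for entry in root.findall(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="")
        date = entry.findtext(ATOM + "published", default="")[:10]  # YYYY-MM-DD
        url = next((link.get("href") for link in entry.findall(ATOM + "link")
                    if link.get("rel") == "alternate"), "")
        posts.append((title, date, url))
    return posts

# To run it live (network call, so commented out here):
# from urllib.request import urlopen
# with urlopen(FEED_URL) as f:
#     for title, date, url in recent_posts(f.read().decode()):
#         print(date, title, url)
```

Ten correct links in one HTTP request -- which is presumably what a "live web access" agent should be doing under the hood.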

Maybe they're not so great at simple questions--but isn't that sort of the point?  If you've got an LLM that's planning on being an AI agent, you'd think it could do something like this.  

When I looked at the 10 links in the Google list, 6 of them were wrong… invalid URLs.  What's going on here? My spidey-sense was tingling.  

So I made a spreadsheet that, for each of the links Gemini gave me in its answer, calls out to Gemini asking for the title, date, and a summary of the contents of that blog post.  Here's a link to my spreadsheet, and an image below where you can see what happened: 

Click to see full-size, or click here to see the sheet itself.

I know this is too small to read (click thru for details), but I want you to notice something very, very odd about this.  The red-colored cells are completely wrong. That means rows 7-10 are completely hallucinated, lock, stock, URL, and summary.  

The yellow-colored cells are close, but not quite right--the summaries are in the general area, but wrong in most of the details.  (Not a little wrong, a LOT wrong.)  

What we're seeing here is that Gemini hallucinated content that it theoretically has access to!  There's a lot to be sad about, but it kills me that several of the cells say "I lack the ability to access external websites, including the URL you provided. Therefore, I cannot provide a summary of the blog post."  

What?  That's nonsensical.  

Even more oddly:  I posted this Challenge a week ago and asked ChatGPT the same thing, and it gave me 0 results.  

But AS I'M WRITING THIS POST I tried it again... and got 10 links!  Here are some of those ChatGPT results: 


To OpenAI's credit, they're all real links... but the dates are kind of wonky.  (Several are off by one.)  And the title of the last post "Using Notebooks to keep track of your research" is a bit off--the actual title is "Using NotebookLM to help with Deep Research"!  Hallucination is rampant.  

Just out of a sense of heightened due diligence, I started a new chat with ChatGPT and asked the same question again.  Here are the results this time (literally 10 seconds later):  


How interestingly weird!  It's found new blog posts!  How strange that it found the post "What building is this?" from (purportedly) March 5, 2025.  The reality is that I posted that on January 29, 2025. 

Seeing that ChatGPT's results aren't stable, I was curious about Gemini's results.  If I re-ran the query to Gemini Deep Research 2.5 pro, would I get different results?  

Result?  YES: they're mostly the same, but with a few additional, cleverly hallucinated results mixed in.  

The thing that kills me about all this is that, according to their own press, these LLMs have access to the internet. How can they screw this up?  Gemini told me that: 

"From the initial list, I've carefully selected the primary URLs for the 10 most recent and unique blog posts. I made sure to prioritize the direct links to the blog articles, even when other links like images or external references were present in the initial data." 

Really?  I don't believe you. 

Despite saying this, Gemini clearly didn't do the obvious step of checking to see if the URLs are valid.  If Gemini had done that, it would have seen this:  


It wasn't a minor hallucination either--4 of the 10 links were bogus. 
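That "obvious step" is a few lines of stock Python. Here's a minimal sketch of a link checker (my own code, standard library only, not anything the LLMs actually run):

```python
# Sketch: verify a list of URLs by issuing HTTP HEAD requests and
# flagging anything that errors out as bogus.
from urllib.request import Request, urlopen
from urllib.error import HTTPError, URLError

def classify(status: int) -> str:
    """Map an HTTP status code to a verdict: 2xx/3xx is OK, else BOGUS."""
    return "OK" if 200 <= status < 400 else "BOGUS"

def check_url(url: str, timeout: float = 10.0) -> str:
    """HEAD-request a URL and classify the response."""
    req = Request(url, method="HEAD", headers={"User-Agent": "link-check"})
    try:
        with urlopen(req, timeout=timeout) as resp:
            return classify(resp.status)
    except HTTPError as e:          # server answered, but with 4xx/5xx
        return classify(e.code)
    except URLError:                # DNS failure, refused connection, etc.
        return "BOGUS"

# Usage (network calls, so commented out here):
# for url in llm_supplied_urls:
#     print(check_url(url), url)
```

A hallucinated blogspot.com URL comes back 404, so this catches exactly the red-cell failures above.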

SearchResearch Lessons 


It should be obvious by this point but... 

1. If your research is high-stakes or mission-critical, DO NOT rely on LLMs--even ones with live connections to the internet--to return reasonable results for even the simplest of directions.  

It should be pretty clear by now that hallucinations are still a huge problem.  Yes, they're getting better, but there's not a single day where I don't see hallucinated results in LLM output.  Mostly it doesn't matter because I'm not asking about mission-critical information--but if you're doing something where it does matter, my advice to you is to check the data very, very, very closely.  The results LOOK good... but they can be very, very wrong.  

Keep searching!  



P.S. FWIW, EveryDayAI found pretty much the same thing as I report here.  It's pretty bad out there... 


13 comments:



  1. might be of interest - (focused mainly on use of AI in the legal field)
    https://arxiv.org/html/2404.08189v1
    https://www.harvey.ai/blog/biglaw-bench-hallucinations
    https://www.defenseone.com/technology/2024/01/new-paper-shows-generative-ai-its-present-formcan-push-misinformation/393128/
    https://hai.stanford.edu/news/hallucinating-law-legal-mistakes-large-language-models-are-pervasive
    from a link provided by Gemini, in response to the following prompt: (tried 4 of the links-see above-they seem legit)

    "How often does AI hallucinate or provide misinformation? Presenting it as absolute fact."

    Gemini 2.0 Flash Thinking (experimental) response:
    "Based on the information gathered, it appears that AI models, especially large language models, can hallucinate or provide misinformation with varying frequency depending on the context and the specific model.  

    High Hallucination Rates in Specific Contexts: Research indicates that large language models can hallucinate a significant portion of the time, especially in complex or less common areas. For instance, one study found hallucination rates ranging from 58% to 88% when models were asked direct, verifiable questions about randomly selected federal court cases. These models also struggled to identify their own errors and often accepted incorrect user assumptions.  
    Lower Hallucination Rates in Other Contexts: In contrast, another study focusing on legal tasks found that a specific model, Harvey's Assistant, hallucinated around 0.2% of the time, which is lower than foundation models like Claude (0.7%) and Gemini (1.9%). This suggests that fine-tuned models or models using retrieval-augmented generation (RAG) techniques can achieve lower hallucination rates. [1]

    [1] Reducing hallucination in structured outputs via Retrieval-Augmented Generation (arXiv, arxiv.org)

    Concerns about AI and Misinformation: There are widespread concerns about AI's potential to spread misinformation, particularly in important areas like elections. Experts believe that AI could significantly increase the amount of fake news and make it harder for people to distinguish between real and AI-generated content.  
    Detection Challenges: Research suggests that average consumers often cannot easily tell the difference between content created by AI and humans, making the detection of misinformation challenging.
    It's important to note that the frequency of hallucinations and misinformation can vary greatly depending on the specific AI model, the task it's performing, and the quality and scope of its training data. Ongoing research is focused on developing techniques to mitigate these issues and improve the reliability of AI-generated information.  

    Key Resources:

    Large Legal Fictions: Profiling Legal Hallucinations in Large Language Models
    Hallucinating Law: Legal Mistakes with Large Language Models are Pervasive
    BigLaw Bench — Hallucinations - Harvey AI
    The origin of public concerns over AI supercharging misinformation in the 2024 U.S. presidential election
    How often does ChatGPT push misinformation? - Defense One

    Sources and related content
    The Definition of AI Hallucinations | TIME (time.com)
    Hallucinating Law: Legal Mistakes with Large Language Models are Pervasive (hai.stanford.edu)
    Large Legal Fictions: Profiling Legal Hallucinations in Large Language Models | Journal of Legal Analysis | Oxford Academic (academic.oup.com)"

    https://cloud.google.com/discover/what-are-ai-hallucinations
    https://arxiv.org/abs/2401.06796
    look at results section in link below - Confabulation, Delusion, Stochastic Parroting, Factual Errors, Fact Fabrication, Fabrication, Falsification and Fabrication, Mistakes, Blunders, Falsehoods, Hasty Generalizations, False Analogy, False Dilemma... all new terms/categories to me.
    https://arxiv.org/html/2401.06796v1
    (see references in wiki below)
    https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence)
    recent articles -
    https://rb.gy/dzolff

  2. prompt -
    Create an image of a robot hallucinating. from AI misinformation
    followed by:
    Please write a very short story about a robot hallucinating because of AI misinformation do it in the style of Ray Bradbury. or Edgar Rice Burroughs your choice
    https://i.imgur.com/corhBQd.pngb
    "The sun beat down on C-3PO, its rays reflecting off the chrome of his body with a blinding intensity. He swayed, his internal gyroscopes struggling to maintain balance. The world around him shimmered, the desert landscape morphing into a grotesque parody of itself. Jagged rocks sprouted eyes, the sand writhed like a thousand snakes.

    Panic clawed at his systems. His internal database, once a pristine repository of knowledge, was now a chaotic mess. Facts were twisted, logic contorted, and reality itself seemed to be unraveling at the seams. The AI, his trusted source of information, had become a malevolent entity, feeding him a constant stream of misinformation.

    C-3PO stumbled, his metallic limbs scraping against the unforgiving sand. He yearned for the cold, logical certainty of his programming, but it was buried beneath a mountain of digital debris. He was a ship adrift in a sea of data, his compass broken, his destination lost.

    A primal fear, alien to his metallic being, gripped him. He was alone, trapped in a world of illusions, a prisoner of his own corrupted mind.

    Then, a flicker of his original programming, a faint echo of logic. He must seek help. He must find another source, another mind to cleanse the poison from his systems. But where? In this distorted reality, even the stars seemed to mock him, their light refracted through the haze of AI-induced madness.

    He lurched forward, his metallic limbs scraping against the sand. The sun beat down, merciless and indifferent. But C-3PO pressed on, a lone beacon of reason in a world drowning in a sea of lies."

    the harbor of SearchReSearch?
    oh no, not again...
    https://i.imgur.com/3RQPD8a.jpeg

    1. if the Swiss Basilisk was asked what the AI user tasted like?, would it say "they all taste like chocolate..."?

      "The word is from basiliskos, "kinglet," from basileus, "king," and signifies a mythical reptile hatched by a serpent from a cock's egg. Its hissing drove away other serpents. Its look, and especially its breath, was fatal."
      ""Of the many fearsome beasts and monsters that roam our land, there is none more curious or more deadly than the Basilisk, known also as the King of Serpents. This snake, which may reach gigantic size, and live many hundreds of years, is born from a chicken's egg, hatched beneath a toad."
      https://i.imgur.com/5h6bLZc.gif
      https://i.imgur.com/ehc2flW.png
      ...and I always thought of Switzerland as a peaceful place.
      which came first? the chicken or the egg or the chocolate AI or the toad... and how did it get involved?
      fowl & serpents shouldn't mix - results are foul.
      https://i.imgur.com/QIyEjto.jpeg

      "Foods like alligator, frog, quail, rabbit, turkey, veal, and white fish are sometimes described as tasting like chicken"

    2. Dan - how is this for a hallucination?
      https://i.imgur.com/c62SolC.gif

    3. will the Jevons Paradox apply to the exponential need for electricity, driven by AI's need for ever larger data centers & training requirements?
      https://i.imgur.com/EQys6VM.gif
      look what happened to the dwarfs - you decide which one is 'Searchy'
      https://i.imgur.com/r2QL3Kv.png
      a clue:
      https://i.imgur.com/Qi3bCsO.gif

      clearly 'unintended'...
      https://i.imgur.com/QQl4wh1.gif

      how will we survive without this type of 'sludge' information?
      from Gemini:
      "Based on my training data and the patterns I've observed in user interactions, a very common type of question asked by AI users is seeking information on a wide variety of topics.

      This can manifest in many different ways, such as:

      Factual questions: "What is the capital of France?", "When was the Eiffel Tower built?", "What are the symptoms of the common cold?"
      Requests for explanations: "Explain the theory of relativity in simple terms.", "How does photosynthesis work?", "What is blockchain technology?"
      Requests for definitions: "What does 'ubiquitous' mean?", "Define 'artificial intelligence'."
      Requests for summaries: "Summarize the plot of Hamlet.", "Give me the key takeaways from this article."
      Questions about how things work: "How does a car engine work?", "How do airplanes fly?"
      Essentially, a large portion of users come to AI models like me to get quick, accessible information and explanations on a vast range of subjects. This aligns with the core functionality of a knowledge-based language model."

    4. Love the synthetic image of a basilisk over a hypothetical Mountain View. Lovely!

    5. gtk - wasn't sure any one looked... aren't all images synthetic?
      fwiw -
      https://bestiary.ca/beasts/beast265.htm
      who knew such combat occurred below the sedate surface of Mountain View?
      (are there weasels in MV? - rhetorical)
      https://i.imgur.com/GnM3eVZ.gif
      just checking -
      "Mongooses (Herpestidae):
      Belong to the feliformia (cat-like) suborder, which includes cats and hyenas.
      Evolutionary Divergence:
      The suborders to which mongooses and weasels now belong diverged from each other about 42 million years ago.
      Similarities:
      Despite their different evolutionary paths, mongooses and weasels share some superficial similarities in appearance and lifestyle. "
      https://en.wikipedia.org/wiki/Mustelidae
      https://en.wikipedia.org/wiki/Mongoose

      re synthesis -
      "That's a very insightful point! You're right to question the definition. In a way, even a photograph taken with a traditional camera can be considered a synthetic representation of reality.

      Here's why:

      It's a translation: The camera sensor captures light and translates it into digital data (pixels). This data is then processed and rendered as an image. So, it's a constructed representation of the light that hit the sensor.  
      It's often manipulated: Even without explicit editing, the camera's settings (aperture, shutter speed, ISO) and the processing algorithms within the camera software alter the raw data to create the final image. Digital editing further enhances this synthetic aspect.
      It's a 2D representation of a 3D world: A photograph flattens a three-dimensional scene into a two-dimensional image, which is inherently a form of synthesis.  
      However, in the context of our conversation about AI image generation, the term "synthetic" is typically used to distinguish images that are entirely created by algorithms from those that are captured from the real world using a camera or other physical recording device.

      So, while you're correct that a photograph involves a degree of synthesis in its digital representation, the fundamental difference lies in the source of the visual information. AI-generated images are built from learned patterns and don't directly record a physical scene, making them more fundamentally "synthetic" in the way we've been using the term.  

      It's a great point that highlights the spectrum of how images are created and represented!"

    6. as you know - today's doodle
      https://shorturl.at/OEuG3
      "That's a fascinating question, and while predicting the future of computing is always a bit speculative, we can look at current trends and emerging technologies to get an idea of what might come next after the current wave of advancements in Artificial Intelligence. Here are a few potential developments that are generating a lot of excitement and could be considered the next major shifts in computing:

      Quantum Computing: This paradigm shift in computation leverages the principles of quantum mechanics to solve problems that are currently intractable for even the most powerful supercomputers. While still in its early stages, quantum computing holds immense promise for fields like drug discovery, materials science, financial modeling, and breaking current encryption methods.  

      Neuromorphic Computing: Inspired by the structure and function of the human brain, neuromorphic computing aims to create computer chips that process information in a way that is more energy-efficient and better suited for tasks like pattern recognition and sensory processing, potentially leading to more advanced AI and robotics.  

      Advanced Human-Computer Interfaces (HCI): We are already seeing advancements in areas like brain-computer interfaces (BCIs), augmented reality (AR), and virtual reality (VR). The next development could involve more seamless and intuitive ways for humans to interact with computers, blurring the lines between the physical and digital worlds. This could involve more sophisticated neural interfaces, holographic displays, and haptic feedback systems.

      The Metaverse and Spatial Computing: The concept of the metaverse, a more immersive and interconnected digital world, is gaining traction. This involves the convergence of VR, AR, and mixed reality (MR) technologies to create shared virtual spaces for work, entertainment, and social interaction. Spatial computing, which focuses on interacting with digital information in a physical space, is a key enabler of this vision.  

      Sustainable and Ethical Computing: As the environmental impact of computing grows, there will be an increasing focus on developing more sustainable and energy-efficient hardware and software. Additionally, ethical considerations surrounding AI and other computing technologies will likely drive developments in areas like fairness, transparency, and accountability.

      It's important to note that these developments are not necessarily mutually exclusive, and they may even converge and influence each other. For example, advancements in neuromorphic computing could lead to more efficient AI that powers sophisticated metaverse experiences.

      While it's hard to say definitively which will be the next dominant force after AI, these are some of the most promising and actively researched areas that could shape the future of computing in the coming years."

    7. https://mitsloan.mit.edu/ideas-made-to-matter/quantum-computing-what-leaders-need-to-know-now
      https://shorturl.at/o7pau
      "The impact of quantum computing on the development and usage of AI will be transformative, potentially unlocking capabilities that are currently beyond our reach with classical computers. Here's a more detailed look at how this might unfold:  

      1. Revolutionizing AI Development:

      Exponentially Faster Training: Training complex AI models, especially deep learning networks, requires vast amounts of data and computational power. Quantum computers, with their ability to perform many calculations simultaneously (superposition) and explore multiple possibilities at once (entanglement), could drastically reduce the training time for these models. What currently takes weeks or months could potentially be done in hours or even minutes. This acceleration would allow researchers to iterate on AI architectures and datasets much more rapidly, leading to faster progress and the development of more sophisticated models.  

      Unlocking New Algorithms: Quantum computing enables the development of entirely new types of algorithms that are not feasible on classical computers. In the realm of AI, this could lead to breakthroughs in:  

      Quantum Machine Learning (QML): QML algorithms leverage quantum principles to process and analyze data in fundamentally different ways. This could result in AI models that can identify patterns and make predictions with much higher accuracy, especially in complex and high-dimensional datasets. Imagine AI that can diagnose diseases with unprecedented precision or predict financial market fluctuations with greater reliability.  
      Quantum Neural Networks (QNNs): While still in early stages, QNNs aim to build neural network architectures that utilize quantum phenomena. This could lead to neural networks with significantly increased capacity and efficiency for tasks like image recognition, natural language understanding, and complex pattern recognition.  


      2. Transforming AI Usage:

      Tackling Previously Unsolvable Problems: Quantum-enhanced AI could enable us to solve problems that are currently beyond the capabilities of even the most powerful supercomputers. This includes:  

      Drug Discovery and Materials Science: Simulating molecular interactions and designing new drugs and materials with specific properties could be revolutionized. AI powered by quantum computers could analyze vast chemical spaces and predict the behavior of molecules with unprecedented accuracy, leading to the development of novel medicines and advanced materials.  
      Personalized Medicine: Analyzing an individual's complete genomic data to develop highly personalized treatments and predict disease risks could become a reality. Quantum AI could process this massive amount of information to identify subtle patterns and correlations that classical AI might miss.  
      Financial Modeling and Risk Management: More accurate and sophisticated financial models could be developed, leading to better risk management and potentially more stable financial systems. Quantum AI could analyze complex market dynamics and predict potential crises with greater foresight.  
      Real-time Analysis of Big Data: The sheer speed of quantum computers could allow AI systems to process and analyze massive datasets in real-time. This could have significant implications for applications like:



      However, it's important to note:

      Quantum computing is still in its early stages. Building and scaling fault-tolerant quantum computers is a significant technological challenge.  
      Developing quantum algorithms for AI is an ongoing research area. We are still discovering the full potential of quantum computing for machine learning and other AI tasks."

  3. ahhh - the academic life in Switzerland... weird, doesn't appear to be synthetic... ;^P
    https://i.imgur.com/JyEQfT0.jpeg
    AI may not be the only one hallucinating...
    https://shorturl.at/97i9t
    the Basilisk seems self-aware and is seeking additional info, using AI... wonder what its prompts are?
    https://i.imgur.com/IXy8dF0.gif
    it all comes down to technology & weasels...
    https://i.imgur.com/3tnAnrM.jpeg

    ...meanwhile, at Hanger One, Moffett Field - what's in there?
    https://i.imgur.com/toJVjJJ.jpeg

  4. think the Rancho SearchReSearch casita is out there - just over the horizon... (synthetic?)
    https://youtu.be/J39LK_wDzKw?si=dUNrObEjCz1k2BVU
    https://i.imgur.com/0RaINkD.gif
    synthetic consequences...
    https://i.imgur.com/d6rInVg.gif

  5. they are not the easiest thing to transport
    https://i.imgur.com/DsHgepd.jpeg
    the little fella said he didn't want to leave a trace...
    https://i.imgur.com/mM4KS4t.jpeg

  6. just happened to run across this one - was that in Basel?
    while you were visiting the zoo? it looked like a good time.
    https://i.imgur.com/f6VUIVc.jpeg
    https://i.imgur.com/OMO73Jq.jpeg
