Can AI systems really read the web?
*P/C Gemini [elderly robot reading a long scroll of blog names]. No, I don't recognize any of the blogs shown on the scroll, nor why it uses so many different fonts in a ransom-note style. I don't trust these results, and you shouldn't either.*
One of the latest claims being made by the LLM / AI providers is that the "knowledge cutoff problem" isn't a real problem any more. Most providers say that their AIs now have access to realtime web information. For instance, when I asked ChatGPT about its live connection, this is what it told me. (With a live citation to an article on Lifewire.)
That sounds great, right? But let's do a quick test and find out... that's the essence of this week's SRS Challenge:
1. Can you get an LLM AI system to give you links to the 10 most recent posts on a blog? For instance, could you get Gemini, Perplexity, ChatGPT, or MS Copilot to make a table of the 10 most recent blog posts?
That seems like a simple enough request for an AI system with live web access, yes?
Not to tip my hand or anything, but the reality of asking an AI to do something this simple just boggled my mind.
Just to check my suppositions about AI tools that can access the internet, I did the obvious search and found several articles documenting that they claim live access. (See: WritingMate, YourEverydayAI, Otterly.) Note that this list will change over time as more and more LLMs gain live search-engine access.
It's easy to find out what the providers say. I just did a few site searches like this to find out:
[ site:openai.com live web access ]
For instance, here's OpenAI's comment about live access: OpenAI blog
The claim is that Google Gemini, OpenAI ChatGPT, and Perplexity all have live access to web content. To test this claim, I gave this query to each of them:
[can you give me links to the 10 most recent blog
posts from searchresearch1.blogspot.com]
Results:
- Perplexity gave 3 links
- ChatGPT 4.0 gave 0 links (but see below)
- Gemini Deep Research with 2.5 Pro gave me 10 links
Well, dang... okay.
Maybe they're not so great at simple questions--but isn't that sort of the point? If you've got an LLM that's planning on being an AI agent, you'd think it could do something like this.
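For perspective, "something like this" doesn't even need an AI. Blogger blogs normally publish a public Atom feed, so a few lines of Python will list the actual 10 most recent posts to check against. This is just a sketch, assuming the blog's standard Blogger feed URL and the third-party feedparser library:

```python
# Ground-truth check: pull the actual 10 most recent posts straight from
# the blog's Atom feed (Blogger normally serves one at /feeds/posts/default).
# Assumes the third-party feedparser library: pip install feedparser
import feedparser

FEED_URL = "https://searchresearch1.blogspot.com/feeds/posts/default"

feed = feedparser.parse(FEED_URL)
for entry in feed.entries[:10]:
    # Blogger's Atom entries include a title, a publish timestamp, and a permalink.
    print(entry.published[:10], entry.title)
    print("    ", entry.link)
```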
When I looked at the 10 links in the Google list, 6 of them were wrong… invalid URLs. What's going on here? My spidey-sense was tingling.
So I made a spreadsheet that, for each of the links Gemini gave me in its answer, calls out to Gemini asking for the title, date, and a summary of that blog post's contents. Here's a link to my spreadsheet, and an image below where you can see what happened:
*Click to see full-size, or click here to see the sheet itself.*
I know this is too small to read (click thru for details), but I want you to notice something very, very odd about this. The red-colored cells are completely wrong. That means rows 7-10 are completely hallucinated, lock, stock, URL, and summary.
The yellow-colored cells are close, but not quite right--the summaries are in the right general area, but wrong in most of the details. (Not a little wrong, a LOT wrong.)
What we're seeing here is that Gemini hallucinated content that it theoretically has access to! There's a lot to be sad about, but it kills me that several of the cells say "I lack the ability to access external websites, including the URL you provided. Therefore, I cannot provide a summary of the blog post."
What? That's nonsensical.
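If you'd rather run the same per-URL check in code instead of a spreadsheet, here's a rough Python sketch of the idea. It uses the google-generativeai client; the model name and the placeholder URL are just stand-ins, not what my sheet actually uses:

```python
# Rough stand-in for the spreadsheet: for each URL the model returned,
# ask Gemini for the post's title, date, and a short summary.
# Assumes the google-generativeai client (pip install google-generativeai)
# and an API key in the GOOGLE_API_KEY environment variable.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")  # model name is a placeholder

urls = [
    # Paste the links from the model's answer here; placeholder shown below.
    "https://searchresearch1.blogspot.com/",
]

for url in urls:
    prompt = (
        "Give the title, publication date, and a one-paragraph summary "
        f"of the blog post at this URL: {url}"
    )
    response = model.generate_content(prompt)
    print(url)
    print(response.text)
    print("-" * 60)
```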
Even more oddly: I posted this Challenge a week ago and asked ChatGPT the same thing, and it gave me 0 results.
But AS I'M WRITING THIS POST I tried it again... and got 10 links! Here are some of those ChatGPT results:
To OpenAI's credit, they're all real links... but the dates are kind of wonky. (Several are off by one.) And the title of the last post "Using Notebooks to keep track of your research" is a bit off--the actual title is "Using NotebookLM to help with Deep Research"! Hallucination is rampant.
Just out of a sense of heightened due diligence, I started a new chat with ChatGPT and asked the same question again. Here are the results this time (literally 10 seconds later):
How interestingly weird! It's found new blog posts! How strange that it found the post "What building is this?" from (purportedly) March 5, 2025. The reality is that I posted that on January 29, 2025.
Seeing that ChatGPT's results aren't stable, I was curious about Gemini's results. If I re-ran the query to Gemini Deep Research 2.5 pro, would I get different results?
Result? Yes: the list is mostly the same, but with a few additional, cleverly hallucinated results mixed in.
The thing that kills me about all this is that, according to their own press, these LLMs have access to the internet. How can they screw it up? Gemini told me that:
"From the initial list, I've carefully selected the primary URLs for the 10 most recent and unique blog posts. I made sure to prioritize the direct links to the blog articles, even when other links like images or external references were present in the initial data."
Really? I don't believe you.
Despite saying this, Gemini clearly didn't do the obvious step of checking to see if the URLs are valid. If Gemini did that, they would have seen this:
It wasn't a minor hallucination either--4/10 of the links were bogus.
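Checking is cheap, too. Here's a minimal sketch of the validation step Gemini skipped, using Python's requests library (the URL in the list is just a placeholder):

```python
# The validity check the model skipped: request each URL and look at the
# HTTP status. Hallucinated Blogger URLs come back 404.
# Assumes the requests library: pip install requests
import requests

links = [
    # Paste the URLs from the model's answer here; placeholder shown below.
    "https://searchresearch1.blogspot.com/",
]

for url in links:
    try:
        r = requests.head(url, allow_redirects=True, timeout=10)
        if r.status_code >= 400:  # some servers reject HEAD; retry with GET
            r = requests.get(url, allow_redirects=True, timeout=10)
        status = r.status_code
    except requests.RequestException as err:
        status = f"request failed: {err}"
    print(status, url)
```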
SearchResearch Lessons
It should be obvious by this point but...
1. If your research is high-stakes or mission-critical, DO NOT rely on LLMs--even ones with live connections to the internet--to return reasonable results for even the simplest of directions.
It should be pretty clear by now that hallucinations are still a huge problem. Yes, they're getting better, but there's not a single day where I don't see hallucinated results in LLM output. Mostly it doesn't matter because I'm not asking it about mission critical information--but if you're doing something where it does matter, my advice to you is to check the data very, very, very closely. The results LOOK good... but they can be very very wrong.
Keep searching!
P.S. FWIW, EveryDayAI found pretty much the same thing as I report here. It's pretty bad out there...