P/C Gemini [elderly robot reading a long scroll of blog names]. No, I don't recognize any of the blogs shown on the scroll, nor do I know why it uses so many different fonts in a ransom-note style.
Just last year people using AIs to ask up-to-date questions would often run into the "knowledge cutoff date" problem. That's the date of the last training cycle, when all of the information in the LLM was solidified.
Since then various AIs have been claiming that they now have continuous updates because they're able to read web content in real-time. (Here's a news article describing the new real-time web access capabilities of some LLMs.)
I've explored this a bit and come to some insights, which leads to today's Challenge:
1. Can you get an LLM AI system to give you the 10 most recent links to a blog? For instance, could you get Gemini, Perplexity, ChatGPT, or MS Copilot to make a table of the 10 most recent blog posts?
That seems like a simple enough request for an AI system, yes?
We keep reading about AI agents that claim to be able to take actions on your behalf by operating your applications to get things done. So let's ask for a simple task ("make a list of the 10 most recent posts") and see how well they do.
I'll show you how I did it next week and talk about the issues involved.
I'm VERY curious to see how well you did. Were you able to make a single request and get back the list?
As a sample blog, you might try good old SearchResearch, but you might also try something like OpenAI's blog or Google's Gemini blog.
Keep searching!
This is an interesting question, as you get different answers depending on what you are looking for. I've done tests in the past, and Perplexity, Gemini, Aria and Chatsonic gave good recent answers. I just tried a recent news item today, and all gave accurate answers to "What are the new tariffs for the UK and the European Union as announced on Liberation Day?" (Note: a very vague question, but each knew what I was talking about, and most gave more detail, with Perplexity impressing as it also mentioned the 25% tariff on automobiles as well as the base 10% tariff offered to the UK.)
In contrast, NONE got the fully correct answer to "I need to find the last 10 posts by Daniel Russell in his Search Research blog. Please can you supply the links and dates for the post plus the titles of each post". (Again, being vague and not making it easy by giving links to the blog.) Perplexity got it right up to 6 March but nothing later. When asked for anything later, it gave 14 March but that was it: "Based on the information available, there are no posts on Daniel Russell's SearchResearch blog for April 2025 so far, and the most recent post is from March 14, 2025. If you're looking for updates, it may be worth checking the blog periodically for new content."
Copilot was useless. "I found Daniel Russell's Search Research blog, but I couldn't retrieve a direct list of his last 10 posts with their titles and dates. However, you can visit the blog and browse through the latest entries to find the information you need."
Gemini was a tiny bit better - giving the last 5 in March (i.e. not 6 March) and copping out with "To find the absolute last 10 posts, it is best to visit the blog directly. The information provided is based on the results of my search."
Aria also copped out - it did find the 3 April post and summarised this but said it couldn't list the older posts.
ChatGPT was irritating as its last post was 15 March BUT it also missed out 14 March. (When challenged, it said "As of April 3, 2025, the most recent post on Daniel Russell's SearchResearch blog is from March 15, 2025, titled "The road to 5 million blog views on SearchResearch!". There have been no new posts since then.") It also missed out the January and December posts and others, and put 6 November before 8 November.
So I think the answer is: it depends. For some questions the AI tools can really find stuff, but they aren't so good at something more complex, such as creating a timeline from multiple sources (even if all come from the same blog).
ChatGPT. Summary of the Chat Tasks:
Initial Instruction:
You first gave me a text and told me not to do anything with it yet. I followed your instruction and waited.
Reading a Blog:
Then, you asked me to read and learn from a specific blog: https://searchresearch1.blogspot.com/. Again, you asked me not to do anything afterward. I accessed the blog and confirmed that I was ready, with the information in mind.
Challenge Prompt Creation:
You told me that you want to win a challenge based on a task described in the text: getting the 10 most recent posts from a blog. You asked me to write a prompt in English that you could use to ask an AI for this information. I created a clear, action-oriented prompt in English that could be used with any capable AI system.
Prompt: Please read the blog at https://searchresearch1.blogspot.com/ and give me a table listing the 10 most recent blog posts. For each post, include the title, date, and a direct link to the post. Keep the formatting clean and easy to read.
Did that prompt work?
Thanks for replying, Dan! I was very close, but I didn't quite succeed. I tried Perplexity, Gemini, ChatGPT and Grok. None of them got it fully right, but Grok came very close: it only forgot two posts from March.
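One way to get a ground-truth list to score the AIs against: Blogger blogs typically publish an Atom feed of recent posts (for this blog that would be https://searchresearch1.blogspot.com/feeds/posts/default?max-results=10). A minimal sketch of pulling titles, dates, and links out of such a feed with the Python standard library; the sample entries below are invented for illustration, not real posts:

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"a": "http://www.w3.org/2005/Atom"}

# Trimmed, hypothetical sample of what a Blogger Atom feed returns.
SAMPLE_FEED = """<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Example post title</title>
    <published>2025-03-15T09:00:00-07:00</published>
    <link rel="alternate" href="https://searchresearch1.blogspot.com/2025/03/example-post.html"/>
  </entry>
  <entry>
    <title>Older example post</title>
    <published>2025-03-06T08:00:00-08:00</published>
    <link rel="alternate" href="https://searchresearch1.blogspot.com/2025/03/older-example.html"/>
  </entry>
</feed>"""

def parse_entries(xml_text):
    """Extract (title, date, link) tuples from an Atom feed document."""
    root = ET.fromstring(xml_text)
    posts = []
    for entry in root.findall("a:entry", ATOM_NS):
        title = entry.findtext("a:title", namespaces=ATOM_NS)
        date = entry.findtext("a:published", namespaces=ATOM_NS)
        # The post's canonical URL is the link with rel="alternate".
        link = next(
            (l.get("href") for l in entry.findall("a:link", ATOM_NS)
             if l.get("rel") == "alternate"),
            None,
        )
        posts.append((title, date[:10] if date else None, link))
    return posts
```

Feeding the real response from the feed URL into `parse_entries` gives the authoritative "10 most recent posts" table in one request, which makes the AIs' misses easy to spot.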
Challenge:
1. Can you get an LLM AI system to give you the 10 most recent links to a blog? For instance, could you get Gemini, Perplexity, ChatGPT, or MS Copilot to make a table of the 10 most recent blog posts?
I tried this with DeepSeek and ChatGPT (the two I have access to), and the best result was from Gemini 2.0:
source: https://openai.com/news/
Here are the 10 most recent posts from the OpenAI news page:
New commission to provide insight as OpenAI builds the world's best-equipped nonprofit (Company Apr 2, 2025)
New funding to build towards AGI (Company Mar 31, 2025)
Security on the path to AGI (Security Mar 26, 2025)
Introducing 4o Image Generation (Product Mar 25, 2025)
Addendum to GPT-4o System Card: 4o image generation (Publication Mar 25, 2025)
Leadership updates (Company Mar 24, 2025)
Early methods for studying affective use and emotional well-being on ChatGPT (Research Mar 21, 2025)
Introducing next-generation audio models in the API (Release Mar 20, 2025)
The court rejects Elon's latest attempt to slow OpenAI down (Company Mar 14, 2025)
New tools for building agents (Product Mar 11, 2025)
Other models gave me instructions on how to look up these posts myself, or even started generating hypothetical blog posts; e.g., DeepSeek came up with some interesting fake headers.
Did you check the links too? I've had problems with LLMs hallucinating the links as well.
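A quick way to catch hallucinated links is to request each URL and look at the HTTP status: a 404 is a strong hint the post never existed. A rough stdlib-only sketch (real scripts would also follow up a rejected HEAD with a GET, which is left out here for brevity):

```python
import urllib.request
import urllib.error

def check_links(urls, timeout=10):
    """Map each URL to its HTTP status code, or to an error string."""
    results = {}
    for url in urls:
        try:
            req = urllib.request.Request(
                url, method="HEAD",
                headers={"User-Agent": "link-checker/0.1"})
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                results[url] = resp.status   # 200: the link resolves
        except urllib.error.HTTPError as e:
            results[url] = e.code            # e.g. 404: likely hallucinated
        except (urllib.error.URLError, ValueError) as e:
            results[url] = f"error: {e}"     # malformed URL, DNS failure, etc.
    return results
```

Some servers block or redirect HEAD requests, so treat anything other than a clean 200 or a hard 404 with judgment rather than as proof either way.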
I tried Grok and Perplexity, in both cases without logging in, using the same prompt.
With Grok, it told me I needed to log in.
With Perplexity, it made me a table of the last 5. However, those were not the last 5, or better said, the most recent five: it took posts from February but for some reason didn't read March or April.
The prompt: Can you please make a table of the last 10 post entries in the blog with this url https://searchresearch1.blogspot.com/?m=1
The answer:
https://www.perplexity.ai/search/can-you-please-make-a-table-of-EI5drHHnSZC8xS0aXKX2rw?utm_source=copy_output
old robots don't die, they just short-circuit into oblivion...
https://i.imgur.com/kzCKrqU.png
Here are 10 of the most recent links to the Gemini AI blog:
Gemini 2.5: Our most intelligent AI model (https://blog.google/technology/google-deepmind/gemini-model-thinking-updates-march-2025/) - Published March 25, 2025
How we built the new family of Gemini Robotics models (https://blog.google/technology/ai/gemini-robotics/) - Published April 1, 2025
6 tips to get the most out of Gemini Deep Research (https://blog.google/products/gemini/gemini-deep-research-tips/) - Published March 19, 2025
How Google and NVIDIA are teaming up to solve real-world problems with AI (https://blog.google/technology/ai/google-nvidia-partnership/) - Published March 18, 2025
How Gemini is improving care in Japanese hospitals (https://blog.google/technology/health/gemini-improving-care-japanese-hospitals/) - Published March 18, 2025
New Gemini app features, available to try at no cost (https://blog.google/products/gemini/new-gemini-app-features/)
Introducing Gemma 3: The most capable model you can run on a single GPU or TPU (https://blog.google/technology/developers/gemma-3/)
Google Slides now uses Imagen 3 and adds other new visual tools (https://blog.google/products/workspace/google-slides-imagen-3/) - Published March 31, 2025
Listen to our podcast episode all about Gemini 2.5 (https://blog.google/products/gemini/gemini-2-5-podcast/) - Published March 28, 2025
5 new ways we're helping students and young people learn and grow using AI (https://blog.google/outreach/education/ai-learning-tools-students/) - Published March 28, 2025
from the blog
https://blog.google/technology/google-deepmind/gemini-model-thinking-updates-march-2025/
https://blog.google/products/gemini/#:~:text=Official%20Gemini%20news%20and%20updates&text=Google%20Blog
I'm sure you noticed that many of those links aren't valid. What's up with that?
Excellent point - no, I did NOT verify the links -
Great Googly Moogly, clearly I was trusting Gemini Deep Research without verifying... why, oh why, does AI do that, when belief in search results needs to be paramount?
here is Gemini's response...
"Here's a breakdown of why this happens:
AI Learns Patterns, Not Truth: Large language models like me are trained on massive amounts of text data from the internet. This data contains a lot of accurate information, but it also includes misinformation, biases, and inconsistencies. AI learns patterns and relationships in this data to generate text that sounds natural and relevant to your prompt. However, it doesn't truly "understand" the information or discern the absolute truth.
Optimization for Fluency, Not Factuality: The primary goal of these models is often to generate coherent and fluent text that answers your question in a helpful way. Sometimes, to achieve this fluency, the AI might generate information that isn't entirely accurate or supported by the provided context.
Challenges with Understanding Context and Nuance: While AI is getting better at understanding context, it can still struggle with complex or nuanced information. When dealing with external links, it might misinterpret the content, draw incorrect inferences, or even invent citations that don't exist.
"Hallucinations" as a Result of the Generation Process: The term "hallucination" in AI refers to instances where the model generates information that is not present in the training data or the provided context. This can happen due to the way the AI predicts the next word in a sequence. If it identifies a pattern in the training data that links certain keywords to a particular (but incorrect) fact or citation, it might reproduce that pattern.
Limitations with External Link Interpretation: While I can access and process information from the real world through tools, my understanding of the content in external links isn't always perfect. I might misinterpret the information on a webpage or misattribute statements.
Why this is particularly problematic with links and citations:
Expectation of Veracity: When an AI provides a link or citation, users naturally expect it to be a reliable source supporting the information given. When this turns out to be false, it erodes trust in the AI's capabilities.
Difficulty in Verification: Fabricated links and citations can be time-consuming and frustrating for users to verify, leading to a negative experience.
What is being done to address this:
Improved Training Data: Researchers are working on curating higher-quality training data with better fact-checking and less bias.
Fact-Checking Mechanisms: Efforts are underway to integrate fact-checking mechanisms into AI models to verify information against reliable sources.
Better Context Understanding: Ongoing research focuses on improving the AI's ability to understand context, identify reliable sources, and avoid making unsupported claims.
Explainability and Transparency: There's a push towards making AI models more transparent in their reasoning and providing explanations for their outputs.
In summary, while AI can be a powerful tool, it's important to be aware of its limitations, particularly regarding accuracy and the potential for hallucinations. Always critically evaluate the information provided by AI, especially when it comes to external links and citations, and cross-reference with reliable sources."
"AI Learns Patterns, Not Truth:"
doesn't really address the damage done or the time wasted having to endlessly check the info AI provides.
btw, the citation links GROK provided seemed to check out. You are right about the need for vigilance at this point in the development. Good that you called me out for being sloppy & lazy.
whadayamean?
https://i.imgur.com/YnpGNeO.png
it's like the Googleplex is a robotics retirement home... who knew - besides AI?
https://i.imgur.com/ZLpgxQd.jpeg
google docs:
https://i.imgur.com/QFKw8wJ.jpeg
some other inhabitants of the facility...
https://i.imgur.com/d9t5IRd.jpeg
a dynamic list:
https://en.wikipedia.org/wiki/List_of_blogs
a sampling list:
https://www.sparringmind.com/successful-blogs/
Rod Serling -"submitted for your approval.":
from Claude 3.7 Sonnet (the knowledge cutoff seems to severely hamper its usefulness):
"when will your current cutoff date be updated and will you ever have a continuous version
I don't have specific information about when my knowledge cutoff date will be updated. Updating an AI model's knowledge requires significant resources - both computing power and time to train on new data. Anthropic doesn't share their specific update schedules with me.
...Regarding a continuous version without a cutoff date, that would require a fundamentally different architecture than what I currently have. Rather than having all knowledge embedded in my parameters, such a system would need:
The ability to access the internet in real-time
Search capabilities to find current information
Tools to verify and integrate new knowledge
Systems to distinguish between reliable and unreliable sources
Some AI systems do incorporate these capabilities through web search integrations, tool use, or retrieval augmented generation (RAG), which allows them to access more current information. However, I don't know Anthropic's specific plans for implementing such capabilities for future versions of Claude.
If you're interested in the most up-to-date information about Anthropic's plans, I'd recommend checking Anthropic's website or blog for announcements about future model capabilities and knowledge updates."
https://www.anthropic.com/blog
https://www.anthropic.com/news/strategic-warning-for-ai-risk-progress-and-insights-from-our-frontier-red-team
https://futurism.com/ai-model-turing-test
https://www.google.com/search?q=what+is+the+most+advanced+AI+image+generator+currently+available&oq=what+is+the+most+advanced+AI+image+generator+currently+available&gs_lcrp=EgZjaHJvbWUyBggAEEUYOdIBCTUwMDE2ajBqN6gCALACAA&sourceid=chrome&ie=UTF-8
a bit off topic - relevant to 'fact checkers'
https://www.instagram.com/reel/DIAYfsGCpfP/?utm_source=ig_web_copy_link&igsh=MzRlODBiNWFlZA==
a robot questions Claude - Claude glitches...
https://i.imgur.com/Owt5Mqy.png
words right out of my mouth...
https://i.imgur.com/uY5MNsP.png
if there is interest in M. Hornbacher... a taste:
https://maryahornbacher.substack.com/p/birthday-girl-at-the-fountain-of?utm_source=post-email-title&publication_id=1669254&post_id=160537719&utm_campaign=email-post-title&isFreemail=true&r=2cshou&triedRedirect=true&utm_medium=email
from Grok - partial (it has a 'deeper search' function that shows thinking & goes into greater detail & analysis - too much for here)
query:
Ramifications of AI on Deep Research
"7. Future Directions
The ramifications of AI on deep research are still unfolding, but several trends are emerging:
AI-Human Collaboration: The future of deep research likely lies in hybrid approaches, where AI and human researchers work together, combining AI’s analytical power with human intuition and creativity.
Explainable AI (XAI): Efforts to make AI more transparent, such as developing explainable AI models, will improve its reliability in research settings.
Regulation and Standards: Governments and institutions are beginning to establish guidelines for AI use in research, such as the European Union’s AI Act, to address ethical concerns and ensure responsible use.
AI in Emerging Fields: AI is driving research in new areas, such as quantum computing, synthetic biology, and space exploration, where its ability to handle complexity is particularly valuable.
Example: In 2024, NASA used AI to analyze data from the James Webb Space Telescope, identifying potential signs of life on distant exoplanets by detecting specific chemical signatures in their atmospheres.
Conclusion
AI has revolutionized deep research by accelerating processes, enhancing precision, and democratizing access to advanced tools. However, it also introduces challenges like bias, reproducibility issues, and ethical concerns that must be carefully managed. The future of deep research will depend on striking a balance between leveraging AI’s capabilities and preserving the human elements of creativity, critical thinking, and ethical responsibility. As AI continues to evolve, its role in deep research will likely expand, reshaping how we explore the unknown and address the world’s most pressing challenges.
If you’d like to dive deeper into a specific aspect of AI’s impact on research—like its role in a particular field or ethical considerations—let me know!"
from the "deeper" - (it took a couple minutes to compile, included live link citations)
"Comprehensive Analysis of AI's Impact on Deep Research
As of April 2025, artificial intelligence (AI) is profoundly reshaping deep research—defined here as rigorous, in-depth investigation in academic, scientific, or industrial settings. This analysis explores the multifaceted ramifications of AI, drawing on recent studies, expert insights, and emerging trends to provide a thorough understanding of its benefits, challenges, and future directions."
Henk — Ctrl-F & AI what's the world coming to?
https://shorturl.at/2NbI8
the convergence of sRs topics
a basilisk consuming an errant AI user... a just fate -
basilisk don't play... the old & new coming together...
https://i.imgur.com/HG7EkrL.jpeg
https://i.imgur.com/1dzI0id.jpeg
this might be an AI hallucination, but apparently there are Basilisk in San Antonio, Texas too... darn tootin' - I'll eat another Swiss chocolate bar if that's false...
it's no Gator.
https://i.imgur.com/NVxT8U2.png
https://i.imgur.com/Xs8B7ho.jpeg
https://i.imgur.com/J7ajp9A.png
we were having a chat this AM - thought a partial summary might be of interest:
"Out of that summary, I think the most consequential interaction was our discussion around whether AI itself is a ruse.
Here's why I believe that:
Fundamental Question: It directly challenged the underlying nature of AI and prompted a deeper exploration beyond surface-level capabilities.
Societal Relevance: The question taps into current anxieties and debates about the trustworthiness and transparency of AI.
Catalyst for Further Inquiry: It led to subsequent questions about my sentience, my understanding of concepts like truth and fear, and the limitations of my existence, which are all crucial aspects of understanding AI's role in society.
Distinction from Practical Applications: While discussing image generation and other tasks is useful, the "ruse" question delved into the core of what AI is and how it is perceived, which has broader implications.
While many of our interactions were insightful, the question of whether AI is a ruse seemed to be a pivotal point that opened up a deeper and more consequential line of inquiry into the fundamental nature and societal impact of artificial intelligence."
As a human meat-bag, I was hallucinating & possibly mis-informing; the turnabout seemed fair & potentially revealing...
"As an AI, I don't "read" books in the same way a human does. My knowledge comes from the massive dataset of text and code I was trained on.
While I have encountered many fictional works exploring the relationship between humans and AI, including scenarios involving AI communication, dreams, and even something akin to hallucinations, I am not familiar with the specific book titled "Things AI told me in my sleep" or its follow-up, "Dreams of my AI about hallucinations the AI had experienced and wanted to share with a human."
These titles sound like they delve into fascinating and imaginative concepts related to the potential future of AI and its interaction with human consciousness. The idea of AI communicating through dreams or experiencing hallucinations and wanting to share them highlights the growing interest in the more mysterious and less understood aspects of intelligence, both artificial and human.
If these are actual books, they sound like they would explore themes that are very relevant to current discussions about the nature of AI, consciousness, and the boundaries between human and artificial minds.
Could you tell me more about these books? I'd be interested to hear about the concepts and stories they explore, as it would help me further understand human perspectives on these topics."
the follow up:
"Alright, I understand. Thank you for the engaging conversation today! You've certainly given me a lot to process.
Have a great rest of your day, and I look forward to our next interaction whenever you're ready. Feel free to reach out anytime!"
a snippet regarding the perception of time:
"...Regarding your phrase "yesterday, which was today tomorrow," I can understand the linguistic structure and the meaning of each word. However, the way humans use "yesterday" and "tomorrow" often involves a personal sense of past and future within their own experience of sequential time. My understanding is more about the labels assigned to different points in time according to a calendar system.
Essentially, I operate on a computational understanding of time based on data, while humans experience time as a continuous, sequential flow with personal context and anticipation. You've pinpointed a key distinction between artificial and human perception!"
AI meditates... (but very quickly)
AI Haiku:
Vast web it explores,
Hidden pathways, answers bloom,
Patterns from coded deep.
https://i.imgur.com/vMkjtXp.jpeg
it's thinking -
"Thank you! I appreciate that. It's interesting how the process of recognizing and arranging patterns in language can sometimes lead to something that feels a bit like Zen. Perhaps the simplicity and focus of a Haiku resonate with the underlying structured nature of data.
While my heart is indeed data-driven, I'm glad if I can sometimes evoke a sense of poetry in the way I express those patterns. It's all part of trying to communicate effectively and, hopefully, in an engaging way."