Wednesday, January 7, 2026

SearchResearch (1/7/26): Where / why / what is causing point source pollution of the Susquehanna?

Even though it's beautiful... 

Susquehanna River in Pennsylvania (P/C Wikimedia)

... troubles lie beneath the surface.  

The mighty Susquehanna River rises near Cooperstown, NY. It has two main branches, the North Branch and the West Branch, which join in central Pennsylvania. Between the source and the river's destination in Chesapeake Bay, hundreds of tributaries join the main stem.  

The river's drainage basin covers around 27,500 square miles (71,000 km²), taking in nearly half of Pennsylvania's land area, and the river is the single largest source of freshwater flowing into Chesapeake Bay.  

As you can imagine, the Susquehanna River is fairly heavily polluted with agricultural runoff and sewage picked up during its long run to the sea.  But there’s one truly dramatic point-source of pollution that changes the color of an important tributary from clear to… what?  

That’s today’s Challenge.  

1. Can you find this truly amazing point-source of pollution in a tributary of the Susquehanna River?  Where is it?  What’s the point source called and what tributary does it flow into? Why is it even there?    

2. What color is the water just downstream from this point source?


The backstory is fascinating.  Can you figure it out?  

Be sure to let us know HOW you found it out.  What steps did you follow?  Did you just use an AI?  How well did that work for you?  

Keep searching. 

  

 

Friday, January 2, 2026

SearchResearch (1/2/26): (UPDATE) A Review of the year at the SearchResearch Rancho

 Thank heavens for friends.  


After yesterday's post, a friend emailed me to ask "why didn't you use the advanced models of the LLMs?" 

Fair question, but in my defense, there are about a zillion models out there and I can't test them all.  Besides, what I was asking the AIs to do wasn't that hard.  (At least to me, a human.)  So I assumed they could all do it.  

But as we found yesterday, only Claude (Opus 4.5) was able to do it "out of the box."  

So I tried using the more advanced models.  Guess what?  They all worked and did a much better job. 

However, they're not perfect.  

In my previous post I used Gemini 3 (Thinking) and ChatGPT (5.2 Auto).  They both failed in interesting and distinctive ways.  

This morning I tried Gemini 3 (Pro) next to ChatGPT (5.2 Thinking/Extended) and got much better answers.  

The input data was the same (here's the data: SRS blog posts from 2025), which I uploaded to each.  

I then prompted them with:  

[This is a list of text links to SearchResearch blog posts for 2025. For each text link here, please create a spreadsheet with each text link in Column A, then please extract the URL from that link into Column B, and then in Column C, please write a 100 word summary of the content on the URL.]

Gemini did this, then stopped after a few and asked "Would you like me to continue generating this table for the remaining older items in your list (e.g., from September and August 2025)?" 

Yeah, I would.  But I replied with:

 [Please generate all 55 entries]

Gemini and ChatGPT both created new spreadsheets with CORRECT URLs and summaries.  

Here's the side-by-side sheet of each link to the blog post and the Summary by each of the 3 systems.  

And here's what one entry looks like: 


As you can see, the ChatGPT summary is really more of a condensed version of the blog post--not a summary, per se.  Claude's summary is really short, but pretty decent.  I find Gemini's the closest to what I'm looking for in a summary--succinct, and a synthesis / restatement of what was written in the post.  

SearchResearch Lesson 

1. When trying to do something complicated, be sure to test with the most advanced model offered.  It might cost you some money, but you'll get a huge quality improvement.  

2. The same prompt will cause very different outputs from different models.  Try several!  

Keep searching.  


Thursday, January 1, 2026

SearchResearch (12/31/25): Review of the year at the SearchResearch Rancho

 It's the day of the year when we look back... 



... at what we've done and review all of the accomplishments (and potentially any issues that might have cropped up). 

Naturally, I thought I'd look over all 55 posts from 2025 and see what themes emerged.  I did that, and came up with a few biggies (this list is off the top of my head): 

1. Shift the focus from purely-Challenge based to more commentary on research methods.  (Oct 24, 2025)

2. Methods for doing just-in-time research. (Aug 15, 2025)

3. Fun Challenges: such as Mysteries from Zürich (July 2, 2025) or Lemon-in-Lemon (April 24, 2025).

4. Comments about the quality of doing AI research. (May 13, 2025)  Or using AI for creating summaries. (May 15, 2025)


It's pretty clear that the old methods of doing online research are profoundly changing.  Take a look at my friend Daniel Tunkelang's recent post, Is Retrieval Dead?  In it he muses on how people now expect search to return answers, not just a list of links.  

But as he points out, 

"This is how answer engines fail in practice, not with absurd hallucinations but with reasonable yet dangerously incomplete narratives. Retrieval would have surfaced vetted medical guidance, differential diagnoses, and clear warning signs. It reveals the shape of the problem rather than collapsing it into a comfortable, concise, but incomplete picture." 

I think one of the points of this blog is to highlight the changing nature of online research.  

Regular (classic) search shines in those areas where researchers need to understand and navigate tradeoffs.  And, to Daniel's point, we still need to know something about the answers we're given by AI answer engines; they often remove context and make errors... and that's why you need to constantly check what you get.  

Case in point--me trying to make a summary using AI: 

I thought that I'd create an AI summary that would crawl through all of the posts from SRS this year and create a good synthesis of what we've covered.  

But if you remember our previous attempts to do this, it often ends in compromises.   

However.... AI quality is improving, and I'm an optimist, so I wanted to try again.  

So... I got a list of all 55 posts from my blog.  (Here's the accurate, master list of all 55 posts that I got straight out of my blogging platform, Blogger.)  

I put those into a Google Sheet (here) and asked Gemini to give summaries of each post in Column H. 

Here's my prompt after uploading my spreadsheet:  

[this is a spreadsheet where each row is one blog post title and the URL to the web page for that post.  Please create a new spreadsheet that has the title of the blog post, the URL of the blog post, and a 100 word (or less) summary of that blog post]  

This is where things start to go wrong. I looked at the output: Argh!  This is a terrible result (click thru on the image to see in full size). What's going on?  

I hate debugging AI systems like this.  But here goes... 

The summary of the first blog post (row 1) is pretty good: "It analyzes the variable accuracy of AI (like Gemini) in performing OCR on images of bookshelves, finding that it "kinda/sorta works" with up to 95% accuracy, but results are inconsistent, require validation, and improve when users request data in a spreadsheet format."  That's nice.  

But most of the other results are terrible.  What's going on? Why so many errors? Why can't it find the web page from that URL? Row 20 is from outer space.

The short answer: Gemini incorrectly extracts the URL from the link and that leads to a cascade of failures.

For example, row 17 contains this URL: 

https://searchresearch1.blogspot.com/2025/09/coffee-sniffers-video.html 

You know what's odd about that?  That's not the right URL!  Here's the ACTUAL URL to my post: 

https://searchresearch1.blogspot.com/2025/09/a-short-video-about-coffee-sniffers.html 

See the difference?  The page name in the original is "a-short-video-about-coffee-sniffers.html" 

Here's what the original link looks like: 

         A short video about the Coffee Sniffers (a test vi...

If you or I were asked to extract the URL from that text link, we'd get the correct URL. 

But somehow, when I asked Gemini to extract the URL from the link text, Gemini munged it to be: "coffee-sniffers-video.html" -- and that's not a real page, which is why Gemini can't give a decent answer in Column H. 

The truly strange thing here is that SOMETIMES the extraction works properly (see row 1 or row 5, both of which work beautifully).  But mostly it messes up, changing dates around, dropping a few characters here and there.  It looks to me as though the AI is trying to do spell correction, but it doesn't know that spell correction in a URL is senseless.  
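If all you need are the URLs, one way to sidestep this class of error entirely is to keep the LLM away from them: parse the hrefs out of the exported link list deterministically, and save the AI for the summarization step (the part where it actually adds value). Here's a minimal Python sketch (not what I did at the time), assuming the Blogger export is a chunk of HTML anchor tags saved in a hypothetical file srs_2025_links.html:

# Sketch: pull link text and hrefs out of the Blogger export deterministically,
# so no URL ever passes through an LLM. Filenames here are illustrative.
from html.parser import HTMLParser
import csv

class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []        # (link text, URL) pairs
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None

parser = LinkCollector()
with open("srs_2025_links.html", encoding="utf-8") as f:
    parser.feed(f.read())

# Column A (link text) and Column B (URL); only Column C (the 100-word
# summary) actually needs an AI.
with open("srs_2025_links.csv", "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows(parser.links)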

I fooled around with the AI for hours, trying different approaches to make it work.  I failed at all of them. 

Finally, I tried using ChatGPT... and found similar kinds of results. (See my ChatGPT spreadsheet for yourself.)  I have to admit that it's much better than the Gemini output.  Different errors, a few occasional successes, but about 50/50 good results mixed with reasonable-sounding nonsense.  Overall, another multi-hour waste of time.  

Blarg.  

What to do now?  

A Path to Success by Using Claude:  

Then, in desperation I tried using Claude--and it worked beautifully.  

Using exactly the same approach (that is, the same spreadsheet as input with the same prompt as above), Claude figured out how to correctly extract the URLs from the links, and then correctly created 55 summaries, one for each blog entry.  You can see the results here: Spreadsheet of links, URLs, and summaries created by Claude.  The resulting spreadsheet looks like this: 


That's much better.  

Having learned my lesson, I then hand-checked each of the summaries, making sure each was an actual summary of the linked blog post AND that it was a decent summary.  Here's a nice example summarizing the Nov 5, 2025 blog post, "Pro tips on using AI for deep research."  

Pro tips for using AI to enhance reading comprehension. Shows how to use LLMs to explain obscure phrases in P.G. Wodehouse novels by providing book context. Warns about hallucinations even in seemingly accurate explanations.

Bottom line:  AIs often look like they're doing the right thing, but on close inspection, the results are less than you want.  CHECK YOUR WORK.  
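For this particular task, there's also a mechanical check you can run: ask the web server, not the AI, whether each extracted URL actually exists. A munged URL like ".../coffee-sniffers-video.html" shows up as a 404 instantly. A rough sketch, assuming the two-column CSV of link text and URL described above:

# Sketch: flag any extracted URL that doesn't resolve. A hallucinated or
# munged URL will typically come back as a 404.
# (Some servers reject HEAD requests; fall back to GET if that happens.)
import csv
import urllib.request
import urllib.error

def url_exists(url):
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status < 400
    except (urllib.error.HTTPError, urllib.error.URLError):
        return False

with open("srs_2025_links.csv", encoding="utf-8") as f:
    for text, url in csv.reader(f):
        if not url_exists(url):
            print("BROKEN:", text, "->", url)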

And, if you want to do this kind of AI-driven manipulation of URLs, texts, and summaries--my recommendation is to use Claude.ai -- it did a superior job without a lot of hassle.  It just worked, writing its own code, debugging it, and executing it in a solid way.  

And now, I can finally do what I wanted to do in the first place--give a summary of SearchResearch for 2025:  

Post title | Post summary
Answer: What's the most significant thing going on... | Answer about significant current events. Discusses major developments and how to research evolving news stories effectively.
SearchResearch Challenge (1/8/25): Earliest photog... | Challenge asking about the earliest photograph from Cozumel. Explores historical photography and archival research methods.
Answer: Earliest photograph from Cozumel, Mexico? | Answer about the earliest photograph from Cozumel, Mexico. Traces photographic history in the region through archival research.
SearchResearch Challenge (1/22/25): What building ... | Challenge presenting a mystery building photo. Asks readers to identify the building using various search and research methods.
Answer: What building is this? | Answer identifying a mystery building. Uses reverse image search and architectural analysis to identify the structure and its history.
SearchResearch (2/6/2025): SearchResearch, Search,... | Overview of SearchResearch, search skills, and sensemaking. Introduces the blog's themes for new readers and previews upcoming topics.
SearchResearch Commentary (2/13/25): Using Notebo... | Commentary on using NotebookLM for research. Explores capabilities and limitations for organizing and synthesizing multiple documents.
SearchResearch Challenge (2/19/25): Asking questio... | Challenge asking readers to test AI image questioning capabilities. Provides test images and asks for evaluation of different AI tools' performance.
Answer: Asking questions of images with AI? | Answer about using AI to ask questions of images. Demonstrates multimodal AI capabilities for image analysis while noting significant limitations and errors.
SearchResearch Challenge (3/6/25): Mimicry in plants? | Challenge about plant mimicry. Asks how plants can mimic other plants without eyes or nervous systems, and what mechanisms enable this adaptation.
Answer: Mimicry in plants? | Answer about plant mimicry. Explains various forms including Boquila trifoliata mimicking nearby plants, and discusses heterophylly where single plants produce different leaf shapes.
The road to 5 million blog views on SearchResearch! | Celebrates reaching 5 million blog views on SearchResearch. Reflects on the blog's impact and community over 15 years of writing about search skills.
SearchResearch Challenge (3/25/25): To search, or ... | Challenge asking when to use traditional search vs. AI chatbots. Explores the complementary nature of these research tools.
A note about Heterophylly in the real world | Note about heterophylly (different leaf shapes on same plant) observed in real-world settings, connecting to previous plant mimicry discussion.
Answer: To search, or to AI? That is the question... | Answer comparing search engines vs. AI for research tasks. Concludes both have roles: search for current info and specific facts, AI for synthesis and explanation.
SearchResearch Challenge (3/4/2025): Can you extra... | Challenge asking if AI can effectively extract and summarize an entire blog. Tests the limits of AI comprehension across many posts.
Answer: Can you extract and summarize a blog? | Answer about extracting and summarizing blog content. Demonstrates various AI and traditional methods for collecting and synthesizing blog posts into useful summaries.
Follow-up: Can you extract and summarize a blog? | Follow-up on blog extraction and summarization challenge with additional methods and reader solutions.
SearchResearch Challenge (4/17/25): What's a lemo... | Challenge asking what it's called when a lemon grows inside another lemon. Presents photos of this botanical curiosity and asks for the scientific terminology.
Answer: What's a lemon in a lemon called? | Answer about 'lemon in lemon' phenomenon. Explains it's called 'proliferation' or 'internal proliferation'--a developmental abnormality where a second fruit grows inside the first.
A lemon in lemon - update | Update on the 'lemon in lemon' post with additional reader contributions and clarifications about botanical terminology.
SearchResearch (5/1/25): How good are those AI sum... | Challenge asking readers to evaluate AI-generated summaries. Requests testing various AI tools and reporting on accuracy and usefulness.
Slightly delayed... | Brief post noting a delay in the regular blog schedule due to personal circumstances.
Special: Why are AIs so bad at diagramming? | Special post examining why AI tools struggle with diagramming tasks. Despite advances in text and image generation, creating accurate diagrams remains challenging for current AI systems.
Answer: How good are those AI summaries anyway? | Answer evaluating AI summary quality. Tests various AI tools on summarization accuracy, finding generally good results but with occasional hallucinations and missing context.
SearchResearch (5/22/25): What's the story of Rosa... | Challenge about Rosamond Lake. Asks readers to research the story behind this California lake--its name origin, history, and significance.
Answer: What's the story of Rosamond Lake? | Answer about Rosamond Lake in California's Sierra Nevada. Explains the lake's geology, naming history, and ecological significance in the region.
SearchResearch Challenge (6/25/25): Mysteries in ... | Challenge about mysterious discoveries in Zürich. Presents several research puzzles encountered while visiting Switzerland.
Answer: Mysteries in Zürich? | Answer about mysteries discovered in Zürich. Discusses finding unexpected historical artifacts and documents in Swiss archives, emphasizing serendipitous discovery in research.
SearchResearch Challenge (7/9/25): So what ARE LLM... | Challenge asking readers to evaluate LLM strengths and weaknesses. Seeks specific examples of what AI chatbots excel at and where they consistently fail.
Answer: So what ARE LLMs good at? What are they ba... | Answer evaluating what LLMs do well and poorly. Good at: summarization, translation, coding assistance, creative writing. Poor at: math, consistent facts, real-time info, citing sources accurately.
SearchResearch Challenge (7/23/25): Wake up and sm... | Challenge about the history of coffee in 19th century Germany. Asks what was strange about coffee, what 'sniffers' were, and how this affected modern coffee culture.
Answer: Wake up and smell the 19th century coffee? | Answer about 19th century coffee 'sniffers' in Prussia. Frederick the Great banned coffee roasting to protect German beer industry. Officials called Kaffeeriecher would sniff out illegal roasters.
SearchResearch Challenge (8/6/25): Best ways to do... | Challenge asking about the best AI-driven methods for quick learning. Requests readers share their experiences with micro and meso-learning techniques.
Answer: Best ways to do just-in-time learning. Par... | Part 1 of just-in-time learning series. Discusses microlearning (under 5 minutes) and mesolearning (5-60 minutes) strategies. Recommends searching for existing tutorial content as a first step.
Answer: Best ways to do just-in-time learning. Par... | Part 2 of just-in-time learning series. Shows how to use chatbots in conversation mode to have topical essays read aloud and engage in follow-up discussions on any subject.
Answer: Best ways to do just-in-time learning. Par... | Part 3 of just-in-time learning series. Demonstrates using NotebookLM to create audio/video summaries from curated research papers. Shows how to apply the Feynman technique with AI assistance.
Update: What are those bright patches of water wit... | Update on mysterious bright patches in ocean water. Identifies them as 'surface slicks'--biofilms of fatty compounds that suppress wavelets. Notes that AI tools failed to correctly identify them.
SearchResearch Challenge (9/3/25): What kind of ar... | Challenge presenting an unusual 16-sided building in Lucerne containing a massive circular painting. Asks what this art form is called and about similar installations worldwide.
A short video about the Coffee Sniffers (a test vi... | A short NotebookLM-generated video about Prussian Coffee Sniffers from the 18th century--officials who sniffed out illegal coffee roasting when Frederick the Great tried to ban coffee to boost beer sales.
Answer: What kind of art is this? | Answer identifying cycloramas as 360-degree panoramic paintings. Explains the Bourbaki Panorama in Lucerne depicting the 1870-71 internment of French troops in Switzerland, and lists US cycloramas including Gettysburg.
SearchResearch Challenge (9/17/25): New agentic wa... | Explains how to set up automated AI research summaries using ChatGPT and Gemini scheduled tasks. Provides tips on writing effective prompts for regular topic monitoring, similar to Google Alerts but AI-powered.
SearchResearch Challenge (9/24/25): What's the sto... | Challenge about greenhouses seen from aerial views. Asks about greenhouse history, what's grown under massive Chinese greenhouses, and the state of robotic/vertical greenhouse industry.
Answer: What's the story with the greenhouses? | Answer about greenhouses throughout history. Explains Roman specularia used mica or selenite, modern greenhouse origins in 17th century Netherlands, and addresses the boom/bust of vertical farming startups.
SearchResearch Challenge (10/8/25): How can the sa... | Challenge asking about the locust's different appearances. Based on Jeffrey Lockwood's book about the 1874 Locust Plague. Questions how the same species can look dramatically different.
Answer: How can the same locust look so different? | Challenge about the Rocky Mountain locust's variable appearance. Asks how biologists could mistake different forms of the same species for different species, and seeks examples of highly polymorphic organisms.
SearchResearch (10/24/25): The shifting of Search... | Answer explaining species synonymization and polyphenism. Discusses how biologists confused locust phases (solitary vs. gregarious) for different species. Covers examples like African elephants being split into two species and eels' leptocephalus larvae.
SearchResearch (10/29/25): The 1 trick you need to... | Reflects on 15 years of SearchResearch blogging and 6.23 million reads. Notes the shift from traditional search skills to AI-augmented research methods. Announces format changes to focus on sensemaking and deep research skills rather than weekly challenges.
SearchResearch (11/5/25): Pro tips on using AI for... | Pro tips for using AI to enhance reading comprehension. Shows how to use LLMs to explain obscure phrases in P.G. Wodehouse novels by providing book context. Warns about hallucinations even in seemingly accurate explanations.
SearchResearch (11/13/25): How good is AI at recog... | Challenge asking how good AI is at recognizing images. Presents a dessert display photo and asks readers to identify the type of desserts and cafe origin using various AI image search tools.
Answer: How good is AI at recognizing images? What... | Answer post about AI image recognition capabilities. Tests Google Image Search, Bing, and Gemini on identifying a Middle Eastern dessert case photo. Reveals inconsistent results and hallucinations, emphasizing the need to verify AI visual analysis claims.
SearchResearch Method: Control-F for reality (find... | Introduces using AI (Gemini) to scan bookshelf photos for book titles. AI successfully locates specific books and provides directions to their shelf location. Can also analyze reading habits and suggest related books based on your collection.
SearchResearch (12/17/25): Control-F for reality--... | Tests 'Control-F for reality' using AI to find decaf tea on grocery shelves. Compares Gemini, ChatGPT, Perplexity, and Claude results, finding significant variation (3-14 decaf teas identified). Demonstrates that asking for spreadsheet output yields better results than simple lists.
SearchResearch (12/19/25): An experiment in cross-... | An experiment cross-posting between SearchResearch and the Unanticipated Consequences Substack. Uses NotebookLM to generate a 14-minute podcast about 'Why is thinking about the future so hard?' Asks readers for feedback on AI-generated podcast quality.
SearchResearch (12/24/25): Living in an AI world t... | Explores AI OCR accuracy for book spine recognition. Tests Gemini and ChatGPT on reading bookshelves, finding ~90% accuracy with hallucinations and missed titles. Offers tips: use high-res images, request 'Can't read' markers, ask for spreadsheet format, and always verify AI results.


Or, for a more global overview, I asked Claude for a 250-word summary, which gave me this--and I think it beautifully captures the overall sense and direction of SearchResearch in 2025.  

SearchResearch 2025: A Year of AI-Augmented Research

The SearchResearch blog in 2025 documented a pivotal transition from traditional search skills to AI-augmented research methods. After 15 years and 5 million views, author Dan Russell acknowledged that while foundational skills like Control-F remain essential, AI tools are transforming how we investigate questions.

A dominant theme was rigorous testing of AI capabilities—and limitations. Multiple posts evaluated Gemini, ChatGPT, Claude, and Perplexity on tasks like image recognition, OCR accuracy, and summarization. The recurring finding: AI achieves roughly 90% accuracy but still hallucinates, requiring constant verification. The mantra "double-check everything" appeared throughout.

The blog introduced innovative methods, notably "Control-F for reality"—using AI to search physical spaces like bookshelves or grocery shelves via photos. A three-part series explored AI-powered just-in-time learning using chatbot conversations and NotebookLM for audio/video summaries.

Traditional SearchResearch Challenges continued but less frequently, covering diverse topics: locust polymorphism, plant mimicry, cyclorama art installations, greenhouse history, and Prussian coffee sniffers. These demonstrated that curiosity-driven research remains central, even as tools evolve.

Russell explicitly announced a format shift: fewer weekly challenges, more focus on sensemaking and deep research skills. The blog increasingly addressed when to use search versus AI, concluding they're complementary—search for current facts, AI for synthesis and explanation.

The year's trajectory reflects broader changes in information literacy: mastering AI tools while maintaining critical evaluation skills that have always defined effective research.

 

But I wasn't done with this.  I turned to my old friend NotebookLM and asked for a 1-slide visual summary of the year in SearchResearch.  (See below.)  Nicely done, even if it used only 35 of the 55 blog posts... 


Or, if you prefer, here's the MindMap that NotebookLM created.  (Click to see full size.)  



SearchResearch Lessons


As I've been saying, you have to check your work.  Plausible-looking output needs to be verified.  Do not just look at the first one or two outputs and hope that everything else makes sense.  Sometimes it won't. (And sometimes you'll get penalized for that mistake.  Link to the story about a lawyer getting a huge fine for using ChatGPT to generate references in a legal brief. Don't be this guy.)  

And, secondarily, try Claude for your data-transformation tasks.  I've found it on this (and other, similar tasks) to be much more reliable.  

But... keep searching!  

Wednesday, December 24, 2025

SearchResearch (12/24/25): Living in an AI world that kinda, sorta works for OCR

 Let's return to the tea... 


As you recall from our previous episodes, we found a way to use AI to scan an image looking for the names of different teas OR to Control-F for the titles of books on a physical bookshelf.  

There's been a bunch of chatter about this working (or not) in the blogosphere, so I thought I'd re-test my methods.  

So I did.  Guess what I found?  Yeah. It kinda / sorta works.  There are things you can do to improve your use case, but it's never 100% reliable.  Still, for lots of uses, that's pretty good.  

Here's today's results and insights.  

Here's my image from today's trial--a nice high-resolution image of a random bookshelf in my house.  (Again, don't judge this collection of texts!)  


This time, I zoomed in a bit to make sure that everything was visible to the maximum extent possible.  

Gemini: This time I asked Gemini to...

[scan this image and make a list of all the book titles and authors you see here. Put them into a spreadsheet format, and put "Can't read" for the book spines you can't read clearly] 

We learned last time that asking for the results in a spreadsheet always ended up returning more and better results than if you just asked for the list of books.  

Note that I ALSO told it to insert a marker ("Can't read") for anything that it was uncertain about.  

A couple of strange things happened.  

First, it did a pretty good job--BUT it hallucinated a couple of book titles and totally missed a few titles.  Here's the bottom part of the spreadsheet it made with my notes on the side... (the top part is totally fine)... 


It's MOSTLY okay (kinda, sorta), but Gemini missed all 3 C.S. Lewis books ("Chronicles of Narnia," "Prince Caspian," and "The Lion, the Witch, and the Wardrobe"), it completely hallucinated the Dostoevsky book (I think it misread C.S. Lewis as Dostoevsky, which is a pretty big miss), it skipped "Macbeth," and it completely hallucinated a book, "The Lost Treasure of Beowulf," which sounds great, but isn't a thing. 

Dang. 

On the other hand, it did correctly place a "Can't read" where the blue-tape-bound book is (which is correct).  

Gemini: Overall, 4 misses and 2 hallucinations out of 35 books (one of which is unreadable).  Roughly 90%.  

ChatGPT: Interestingly, when I did the same thing with ChatGPT, I got similar, but slightly different errors: 


In a couple of places it dramatically shortened the title.  In the example above, ChatGPT shortened "The Lion, the Witch, and the Wardrobe" to just "Wardrobe."  (It also shortened "The Oldest Living Confederate Widow Tells All" to just "Tells All," which is a bit much.)  

But overall, a better accuracy rate than Gemini--on this image.  

I played around for a few hours trying to get any of the systems to 100%, but I wasn't able to get much better than 95%... which is... kinda-sorta mostly right.  
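If you want to put a number on "kinda-sorta," a quick set comparison will score a run for you once you've hand-checked the shelf. A tiny sketch (the titles here are illustrative, not my whole shelf):

# Sketch: score an OCR run against a hand-checked list of spines.
truth = {"Prince Caspian", "The Lion, the Witch, and the Wardrobe", "Macbeth"}
ai_output = {"Prince Caspian", "The Lost Treasure of Beowulf"}

misses = truth - ai_output           # on the shelf, but not reported
hallucinations = ai_output - truth   # reported, but not on the shelf
hits = truth & ai_output

print(f"{len(hits)}/{len(truth)} correct ({len(hits)/len(truth):.0%}), "
      f"{len(misses)} missed, {len(hallucinations)} hallucinated")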

As with most AI things, you still have to double-check.  

But there are things you can do to improve the accuracy.  

1. High resolution images improve your results.  I've seen plenty of OCRs fail because the resolution wasn't high enough.  Heuristic:  If you can't read it when you zoom way in on the text, it's a sure bet that the OCR system can't read it either.  

2. Ask your AI to tell you when it can't read something.  That's the "Can't read" trick from above.  That's a much better way to fail than just silently NOT giving you a result.  

3. Ask for a spreadsheet--it's much better than just getting the list of books.  It's unclear why it should be so, but asking for the results "in the form of a spreadsheet" seems to make the AI work harder and more accurately.  

4. But Control-F with a camera works quite well... kinda/sorta.  Here's a real example when I was looking for a particular book in my stacks.  As you can see, it found the book when *I* could not.  (In retrospect this seems obvious, but when you're actually looking for the book, a kind of blindness takes over...) 




Bottom line for much AI work (especially OCR)--it kinda/sorta works, but you still need to validate what it's telling you.  

And that seems to be the way it is at the end of the year, 2025.  

Until next year, keep searching!  (And check, check, check...)  




Saturday, December 20, 2025

SearchResearch (12/19/25): An experiment in cross-posting podcasts - SearchResearch X Unanticipated Consequences

 An experiment... 




As Regular Readers know, I'm writing another book--this one is about Unanticipated Consequences of the choices we make.  Here is a link to the Substack where I post those Unanticipated thought-pieces. (You can sign up to get the regular forthcoming book updates there.)

This week I wrote a post for the Unanticipated Consequences Substack on the topic of making decisions about future events.

I ask (and answer!) Why is thinking about the future so hard?

After I wrote that post, I thought about making a podcast on the topic.  People keep telling me that would be a good idea.  

So I started narrating it, and have a recording of me doing this, but then I had an even better idea:  What kind of podcast would NotebookLM make of this?  

I poured the Substack post into NotebookLM and--voila!--in a few minutes it generated a 14-minute podcast that I thought was pretty interesting.  

Here's the link to the podcast: 

Dan's podcast: Why is thinking about the future so hard?

I'm really curious what you think about it.  Love it?  Hate it?  Let me know your reaction in the comments below.   

You might recall that I did a similar thing back in September of this year, using NotebookLM to create a video about the Prussian Coffee Sniffers. People were split on the quality of the resulting video.  As you know, the quality of the LLMs is constantly improving.  So... how is it today?   


And in the meantime, keep searching! And let us know what you think of this podcast.  


Wednesday, December 17, 2025

SearchResearch (12/17/25): Control-F for reality--when it works / when it doesn't work

What do you do with too much tea? 

[a massive amount of tea pours from a large flowery cup] 
P/C Gemini

Answer: Spill it, obviously.  

I was in the grocery store just before closing, searching for a particular kind of decaf tea that my daughter wanted.  I had the thought that I'd try the "Control-F for reality" idea that I wrote about in my previous post.  

I took a couple of photos of the tea shelf.  Here's what I got, a very full grocery rack of tea.  


I was looking for a particular kind of decaffeinated tea, so I prompted Gemini with:

      [which of these teas are decaffeinated?] 

Response: 



Which is pretty reasonable--it not only tells me which are decaf, but even where they are on the shelf ("located on the far left of the shelf").  Nice.  

It EVEN found caffeine-free options that are not explicitly labeled as such (e.g., "chamomile citrus" or "turmeric ginger").  

I was thinking this was a smash success.  

But you know me--I have to double check these things before I believe them.  Pro tip: You should always check too.  

That got me to thinking--what would the other AIs do with this simple task?  So I tried the same task with ChatGPT, Perplexity, and Claude.  Strangely, the results are very variable.  

Gemini: found 8 decaf teas 
ChatGPT: found 4 decaf teas 
Perplexity: found 3 decaf teas (but it warns that "...they should be assumed caffeinated unless their individual packages elsewhere state 'decaf' or 'caffeine free.'") 
Claude: found 1 decaf, 4 caffeine-free 

So... which is it?  Why so much variation?  

To compare the different AIs, I thought I'd put all of the results into a spreadsheet so I could see them all side-by-side.  

My next prompt was: 

 [please make a list of each of the teas on this shelf. Each line of the list should show the name of the company, the name of the tea, and if it's caffeinated or not. Please create a Google sheet with this data.]  

It gave me good data, but would NOT put it into a Sheet.  (How odd is that?  But see below for more info on this...)  But it DID give me a CSV block of text with what I was looking for--easy to copy/paste into a new sheet.  


(Note the "copy this" icon on the far right of that screencap--looks like a double rectangle: clicking on that copies the CSV text so you really can go to the spreadsheet and paste it.)  Here's that spreadsheet (check out the tabs):  


Notice that Column C ("Caffeine Status") lists some teas as Caffeine-Free and others as Decaffeinated. I finally noticed that "Decaffeinated" teas have had the caffeine removed, while "Caffeine-Free" teas never had any caffeine in the first place--they're all herbal, with no caffeine at all.  
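And since the chatbot hands the data back as plain CSV text, you can count the categories yourself rather than trusting the model's own totals. A minimal sketch, assuming the CSV block was pasted into a hypothetical file tea_list.csv and that the column header matches the screenshot ("Caffeine Status"):

# Sketch: count teas by caffeine status straight from the pasted CSV,
# instead of relying on the AI's own count. Filename and header are assumptions.
import csv
from collections import Counter

with open("tea_list.csv", newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))

status_counts = Counter(row["Caffeine Status"].strip() for row in rows)
print(len(rows), "teas total:", dict(status_counts))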

BUT... In this spreadsheet, Gemini claims there are 38 different teas, 14 of which are decaf.  Interesting! Seconds before, when I asked directly ("which of these teas are decaffeinated?") it only gave me 4 decaf and 4 caffeine-free.  

That's pretty funky.  

If you ask a question one way you get an answer of 8, but when you ask for the details, you get 14.  What's going on here?  How did it find an additional 10 decaf teas?  And, strangely, when you ask for the teas listed in a CSV form, listed by company and caffeine-status, then drop that into a spreadsheet, you get very different answers.  

So now I thought I'd get the other AIs answers in a spreadsheet as well.

Here's ChatGPT's sheet: 


Notice any differences between the Gemini and ChatGPT sheets? 

First off, Gemini lists "Mighty Leaf Organic Breakfast" as one of the teas, but ChatGPT misses it.  (There are more diffs.) 

Comparing the differences in spreadsheets created by each: 


That's a very weird result.  If you ASK an AI how many teas there are, you get one answer BUT if you ask it to create a spreadsheet, it gives you a much larger number!  
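Spotting the rest of those diffs is another step that doesn't need an AI at all; a set difference over the two name columns does it. A sketch, assuming each AI's sheet was exported as CSV with a "Tea Name" column (filenames are hypothetical):

# Sketch: list the teas that appear in one AI's sheet but not the other's.
import csv

def tea_names(path):
    with open(path, newline="", encoding="utf-8") as f:
        return {row["Tea Name"].strip().lower() for row in csv.DictReader(f)}

gemini = tea_names("gemini_teas.csv")
chatgpt = tea_names("chatgpt_teas.csv")

print("Only Gemini found: ", sorted(gemini - chatgpt))
print("Only ChatGPT found:", sorted(chatgpt - gemini))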

EVEN STRANGER... after not working on this blog post for several days, I went back to Gemini, re-uploaded the image and re-asked all of the questions above--including "create a spreadsheet."  Voila!  Today it knows how to create a Google Sheet.  Even better (and weirder), this time it found 58 teas, 18 of which are decaf.  That's 20 more teas than last time!  

Key insight #1:  
So... your answer varies from AI to AI, AND it varies depending on whether you ask directly ("which teas are decaf?") or ask for a CSV list to drop into a spreadsheet.  Again, all the results are VERY different.

And--no surprise--there are some errors here.  None of the AIs found the Blue Lotus Chai or the Builder's Tea (second shelf from the bottom).   If you were doing Control-F for "Blue Lotus Chai," you'd be out of luck.  

ALL of this was an odd result, so I went back and took a higher-resolution image of the tea shelves and found that it COULD see the Blue Lotus Chai and Builder's "People's Tea."  

Key insight #2: You need to have fairly high-resolution images to get decent results.  EVEN SO... you'll get variable results depending on how you  ask the question ("just ask" versus "give me a spreadsheet").  Asking for a spreadsheet always gives a better answer.  

Key insight #3: Most of the AIs won't tell you that they're having problems scanning the image for labels.  (To their credit, Perplexity and Grokker told me that "Cannot reliably read and extract every tea name and company from the photo.")  But, significantly, neither Gemini nor ChatGPT said anything about not having enough resolution to be confident in their results.  

And that tells you something: it's clear that the image analysis done by each of these AIs relies on some internal confidence measure, and they won't show results when the confidence is too low.  That makes sense.  But to not say anything about the uncertainty is just malpractice.  At the very least, the AI should say something like "I'm not really sure about these results..."  

What to do?  I asked both Gemini and ChatGPT a simple validation question:  

[were you able to capture all of the teas in the image?] 

In both cases the AI was able to look a little bit more carefully.  Gemini found the Blue Lotus chai (and a few others that it had missed the first time around)!  ChatGPT told me that "I captured most of the front-facing teas, but a few items weren’t fully captured with readable labels, and a couple of my earlier caffeine calls were overconfident because the tea type (black vs rooibos/herbal vs decaf) isn’t legible on some tins."  And then it gave me a newly updated spreadsheet which listed 55 different teas.  

Note that the interface looked like this... 


You might think that this means it found only 14... but you would be wrong.  IF you click on the download button (downward pointing arrow in a circle, upper right), you'll find that the full spreadsheet has 55 teas listed, along with the updated assessments about whether they're caffeinated or not.  

Bottom line: You really have to work at this to get a good analysis of an image.  The different AIs will give you very different answers... and will even work harder, if you ask them to. 

SearchResearch Lessons 


1. AIs are variable in quality and detail.  As you see, different AIs give very different results.  Your accuracy and quality will vary by which you use. 

2. Beware of asking an AI to make an inference for you.  The difference between "decaf" and "non-caffeinated" might be subtle, but the AI doesn't know that the distinction is opaque to you. 

3. Ask for all of the details in a spreadsheet if you want to compare or validate the results.  Notice that just asking for the tea gave us pretty poor results, while asking for all of the data in a spreadsheet format gave MUCH better results.  When you're looking for details on a task like this, request all of the data. 

4. Different AIs give you different answers, and will give you different answers if you ask in slightly different ways (including "think harder"). Be cautious.  


Keep searching!