It's easy to complain...
P/C DS Studio at Pexels.com
... about the quality of AI-powered search tools. (I've done my fair share!) But when I'm searching for something fairly difficult, I've often found the answers incredibly helpful, especially when the research question is vague. They really help you find the right puzzle piece, even in a massive soup of pieces that all look pretty much the same.
If you think about Google as a database search, that's the wrong mental model. A database search implies that the query will find everything that matches. If my query is something like "magic trick," then I'd expect the database to give me back a complete and accurate list of all the hits. Those are the key ideas: "accurate" and "complete."
But that's not the way any search engine works. Instead of database records, a search engine indexes all kinds of documents--text files, Word documents, PDFs, videos, spreadsheets, images, etc. Your search engine finds the most probable hits and then rank-orders them by what it thinks is best. Usually, that means sorting the hits by relevance. (What makes something relevant is a topic of long debate and discussion, quickly approaching the zenith of technical discussion. Here's an article with more details, should you wish to learn more.)
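To make the contrast concrete, here is a minimal sketch of the two mental models. Everything in it is an assumption made for illustration: the four-document corpus is invented, and the "relevance" score is a crude count of matching words, nothing like what a real search engine computes.

```python
from collections import Counter

# Toy corpus; the documents and their IDs are invented for this sketch.
DOCS = {
    "d1": "a classic magic trick with cards",
    "d2": "card games for rainy days",
    "d3": "magic shows and stage magic for beginners",
    "d4": "how to do a coin trick",
}

def database_search(phrase):
    """Database-style lookup: return every document containing the exact phrase."""
    return sorted(doc_id for doc_id, text in DOCS.items() if phrase in text)

def ranked_search(query, top_k=3):
    """Search-engine-style lookup: score every document, return the best few."""
    terms = query.lower().split()
    scores = {}
    for doc_id, text in DOCS.items():
        counts = Counter(text.lower().split())
        score = sum(counts[t] for t in terms)  # crude relevance: term frequency
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties are broken by document ID so results repeat.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc_id for doc_id, _ in ranked[:top_k]]

print(database_search("magic trick"))  # complete list of exact matches
print(ranked_search("magic trick"))    # best guesses, in ranked order
```

The database-style function returns only the document that contains the exact phrase, while the ranked version also surfaces documents that mention just "magic" or just "trick," ordered by score and cut off at `top_k`. That cutoff is the point: a ranked list is a best guess, not a promise of completeness.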
With the additional AI overviews, the search engines now have another tool that tries to answer your question. A good old-fashioned Google query (short and to the point) isn't as helpful to the AI as an extended question: for AI questions, longer (with more detail) is often better.
My point is that people often complain about the changes to search engines. You're right to complain about inaccuracies and errors, but it's also worth taking a moment to celebrate how truly magical some of the AI-augmented search experience really is.
These are really decent suggestions, although there are a couple of errors: neither Kalinka nor the Slavic Shop is at the listed address anymore. But they're both plausible places to buy quark. (And it was simple to find their current addresses with a quick Google search.)
As members of the SearchResearch Rancho have noted, it's often true that just putting in old SRS challenges works pretty well. Current AI search technology just answers them.
For instance, if you remember the Carolina Parakeet Challenge (find an image drawn from life), copy-pasting the Challenge with the default current Google search gives more-or-less the same answer that we worked out by hand.
Other AI engines also do pretty well. Here's Claude's lovely answer…
But wait, there's more! Specialty Search Tools!
You might have also noticed that there are an increasing number of other kinds of search engines. We've talked before about music identification systems like Shazam, but there are more: Star Walk 2 for things in the night sky, or Vivino for wine identification.
I want to mention another special-purpose search app that I've been using recently.
Merlin Bird ID is incredibly accurate, at roughly 98% for photo identification and 70-80% for sound recognition. Developed by the Cornell Lab of Ornithology, it relies on massive, crowdsourced databases to deliver reliable results in your immediate area.
Identifying birds just by their songs still requires human verification. A part of good search practice is double-checking. For instance, there's the problem of the Northern Mockingbird, which imitates other species (hence, "mocking bird"). Merlin will often identify the mockingbird as the original bird... that is, the one being mocked. Ah well. (Pro tip: Listen for consistency. If Merlin flags an unusual bird but you only hear a split-second snippet of it once, it is likely a misidentification. If the song repeats continuously, it's generally accurate.)
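The "listen for consistency" tip can be written down as a small rule of thumb. This is only a sketch under stated assumptions: the log is a hypothetical, hand-kept list of (species, seconds) pairs, not something Merlin exports, and the thresholds (three repeats across at least ten seconds) are arbitrary numbers picked for illustration.

```python
from collections import defaultdict

def split_by_consistency(detections, min_repeats=3, min_span_seconds=10.0):
    """Separate species heard repeatedly from species heard only in passing.

    detections: list of (species, seconds_from_start) pairs from a
    hypothetical hand-kept log. The thresholds are illustrative guesses.
    """
    times = defaultdict(list)
    for species, t in detections:
        times[species].append(t)

    consistent, doubtful = [], []
    for species, ts in times.items():
        span = max(ts) - min(ts)  # how long the bird kept singing
        if len(ts) >= min_repeats and span >= min_span_seconds:
            consistent.append(species)
        else:
            doubtful.append(species)
    return sorted(consistent), sorted(doubtful)

# Made-up morning log: one bird sings repeatedly, one is heard once.
log = [
    ("American Robin", 2.0), ("American Robin", 9.5),
    ("American Robin", 21.0), ("American Robin", 40.0),
    ("Painted Bunting", 14.2),
]
consistent, doubtful = split_by_consistency(log)
print("Probably real:", consistent)
print("Double-check:", doubtful)
```

A species that lands in the "double-check" pile isn't necessarily wrong; it's just the one worth confirming by eye or with a second recording.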
AND, when Merlin hears a bird, it will show a picture of that bird and give you additional spotting information. (Such as "look in the top branches of a nearby tree; they love to perch there...")
Here's an example from this morning's birds:
The Merlin Bird ID interface.
SearchResearch Lessons
1. AI search engines sometimes make mistakes... but they're often useful. For what it's worth, *I* sometimes make mistakes as well. Learn from the errors and try to figure out what happened, why, and how to work around the issue. (Big tip: CHECK EVERYTHING!)
2. What used to be hard SearchResearch Challenges are now (mostly) straightforward. This is a huge shift! And I'm celebrating the increase in our ability to find the answers to complex questions. And, as always, be sure you understand the answer.
3. Consider other kinds of special-purpose search tools. There are a large number of them, and it's good to learn which ones are useful for the tasks you do. (I'll try to collect a list of them in a future post. It will go out of date quickly, but it will show us the range of possibilities!)
And...
... keep searching.




Other questions:
* How accurate is the training data for a large language model like Gemini, Claude, or ChatGPT, compared to the information it gives about things that happened after that data was most recently updated? (According to Gemini, the former is pretty accurate but takes a lot of time and money to update, while the latter relies on Google Search. But, since you've been working with chatbots for 3 years, I figured I'd ask to check on that.)
* I found out that using the site operator when doing a search on Gemini tells it to only search at the site that you tell it to. And, according to what it said, the minus operator also works in the same way that it would for a standard web search. So, what's been your experience with other search operators when using an LLM? Or, if you've already written about this in a blog post, would you let me know which one?
Thanks for reading these, and I look forward to hearing from you.
The training data is just a large corpus of text collected from a number of different sources. It's so large that nobody goes through to verify everything in it. So "accuracy" isn't really a thing, although some LLM training sets are higher quality than others (mostly because they're scraped from places that are more-or-less trustworthy, leaving out sources that are known to be unreliable).
I actually haven't tried using a site: operator (or minus) with Gemini. It never occurred to me because the web search component of an LLM response is just a piece of the larger story.
If Gemini's "Thinking" models or underlying conversational engine try to answer strictly from their own training data, they might occasionally ignore your search query parameters. To force Gemini to use its live search engine, and to obey your search operators, preface your query with a direct command, such as "Do not answer from memory. Use the Google Search tool to verify this." or "Search the web using the following operators..."
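If you find yourself typing that preface a lot, it can be wrapped in a tiny helper. This is just a sketch: `build_search_prompt` is a made-up name, it only assembles text to paste into the chat box, and nothing here guarantees that the model will actually honor the operators.

```python
def build_search_prompt(question, site=None, exclude=()):
    """Compose a prompt that asks the model to search instead of recalling.

    Hypothetical helper: it builds a string and does not call any API.
    """
    operators = []
    if site:
        operators.append(f"site:{site}")  # restrict results to one site
    operators.extend(f"-{term}" for term in exclude)  # minus operator

    lines = ["Do not answer from memory. Use the Google Search tool to verify this."]
    if operators:
        lines.append(
            "Search the web using the following operators: " + " ".join(operators)
        )
    lines.append(question)
    return "\n".join(lines)

prompt = build_search_prompt(
    "Where can I buy quark near me?",
    site="example.org",     # placeholder domain, not a recommendation
    exclude=["pinterest"],  # placeholder term to exclude
)
print(prompt)
```

As always, check the answer: see whether the cited results really come from the site you asked for.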
I'll ask around inside and see what the engineers tell me.