Wednesday, September 25, 2024

SearchResearch Challenge (9/25/24): How literate are people wrt road signs?

I've wondered for a while... 


... to what extent people actually understand common signs and symbols they see every day.  You probably understand all of the symbols in the above image, but based on some of the stellar driving skills I see, it's pretty clear that not everyone shares that skill.  What's the difference between the first symbol in row 2 and the third symbol in row 2?  Do you know?  

That brings up an interesting SearchResearch Challenge for the week.  How literate are we? The goal of a roadside symbol is to be rapidly understood--you see it / you understand it.  It's supposed to be automatic.  But... 

1.  How many of the most common roadside symbols DO people understand?   

Don't just give me your estimate but find where this has actually been studied, and what the results are.  Are you surprised by what you found?  Do you think the study (studies) are valid and sound?  

Does this make you feel better or worse about your fellow drivers? 

As always, tell us HOW you found your results. 

And... 

Keep Searching! 


The basis of reading literacy is glyph recognition. Just sayin'... 



Wednesday, September 18, 2024

Answer: What is the oldest city in the Americas?

 Simple questions are

sometimes harder... 

Mythical ancient city by Imagen

... than you'd expect.   

This Challenge is like that.  It's a very straight-forward question that might not have the simplest answer.  It's up to you to figure this out! 

1.  What is the oldest city in the Americas? 


You'd think that just asking a search engine or an LLM would give you the answer and you'd be done.  Right?  

Well... if there's any deep lesson from SearchResearch, it's that things are never as simple as you'd expect... there's always something deeper and more interesting behind the question.  

Every question, even something as straight-forward as this, needs a bit of definition help.  The answer--no surprise--is going to depend on how you define "city" and how you define "oldest."  

First things first: How do we define a city?  Is it just population or "level of sophistication" or some combination?  

By the dictionary definition, a city is just "a large number of people who live fairly close together."  True, that, but not particularly precise.  What's "large" and "close together"?  

Remember that in the year 0, Rome was around 1 million people in size, while London was only around 1.4 sq km (0.5 sq mi) and home to fewer than 5,000 people.  Of course, by the year 537 AD, Rome's population had fallen to around 30,000 souls, while London had risen to around the same number.  

Populations come and go--cities are built, grow, prosper, decline, and sometimes lose everyone, becoming less than a hamlet.  

So we have a couple of definitional questions to answer before we get to the key Challenge. 

a. how many people make up a city? 

b. how large an area does a city have to be?  (Or does population density make a city?) 

c. does the length of time a city is occupied make a difference in our question?  

I mention all of these variables because in order to answer the question, we need to pick some values.  (In some sense, it doesn't really matter which values you pick, as long as most people will agree that "this is a city at this time.")  When Rome fell to 30,000 people, was it still a city?  I'd say so, partly because of history, but also because they were fairly densely packed together.  

So, for our purposes here, a city is an assembly of more than 2000 people living in a small area that supports commercial activity, with some kind of government or ceremonial / religious functions.  (We'll ignore continuity for the moment.  If the city lasted for more than 5 years, it is--or was--a city for our discussion.)  
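The working definition above can be written down as a simple check. This is only a minimal sketch: the `Settlement` fields and the `is_city` name are made up for illustration, "small area" isn't quantified in the definition so it's left out, and the occupation length for Rome is a placeholder (anything over 5 years gives the same result).

```python
from dataclasses import dataclass


@dataclass
class Settlement:
    # All field names are hypothetical, chosen to mirror the definition above.
    name: str
    population: int
    years_occupied: float
    has_commerce: bool
    has_civic_or_religious_role: bool


def is_city(s: Settlement) -> bool:
    """Apply this post's working definition of a city.

    "Living in a small area" is not quantified in the definition,
    so this sketch does not try to test for it.
    """
    return (
        s.population > 2000
        and s.years_occupied > 5
        and s.has_commerce
        and s.has_civic_or_religious_role
    )


# Rome at its low point (population from the text; occupation length is a placeholder).
rome_537 = Settlement("Rome, 537 AD", 30_000, 1000, True, True)
# A made-up seasonal trading camp, for contrast.
camp = Settlement("Trading camp", 500, 50, True, False)

print(is_city(rome_537))
print(is_city(camp))
```

Under this definition, Rome at 30,000 people still counts as a city, while a small trading camp does not.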

Once you get beyond definitions, you might think you could just ask all of the ChatBot LLMs and Search Engines this question, "What is the oldest city in the Americas?"  

When you do that, this is what you get:  



The answers vary based on differing assumptions that each system makes.  

Bing, for example, only shows St. Augustine, FL... but the first organic result points to the Wikipedia article on "List of cities in the Americas by year of foundation" (we'll look at that in a minute).  

Some of the systems expose their assumptions.  Gemini's reply includes the supposition for each: 

"If you consider the oldest continuously inhabited city founded by Europeans, it's Santo Domingo in the Dominican Republic, established in 1496.    If you're looking for the oldest continuously inhabited city in North America founded by Europeans, that would be St. Augustine, Florida, founded in 1565. However, if you're talking about the oldest city in the Americas overall, the answer is Caral, Peru. It's an archaeological site dating back to around 3500 BCE, making it one of the oldest cities in the world."

That's a good response, but it does NOT expose any of the other cities that might qualify.  (And is pretty North American biased as well since it lists Euro-towns before the older places in Peru, but I digress.)  

If you just look at the cities listed here and collect their founding dates (as given by the search/AI systems), you'll have this:


THIS is why doing a comparison table is a great idea--you can see the different options and the different assumptions that were made.  With this table we can look at each of these claims one at a time.  Let's do a little digging for each of these claims.  

Tlapacoya is claimed by Perplexity to date to 7500 BCE.  That's quite a claim.  By doing a bit of Google searching, it's clearly a city by 1500 BCE, but there are some artifacts going waaay back, including some rather controversial ones dated to 25,000 BCE.  (They're so controversial that we're going to ignore them here.) But where did the 7500 BCE claim come from? 

When I pushed Perplexity on this [why did you say that Tlapacoya dates to 7500 BCE] it rapidly backpedaled and claimed--falsely--that "I did not mention that date."  Harrumph.  Yes you did.  I have the screencapture to prove it.  Sigh. So this seems bogus... The actual founding date of Tlapacoya seems to be 1500 BCE.  

Footnote:  I figured out where the claim of 7500 BCE for the start of Tlapacoya comes from... it was scraped from the Wikipedia article, List of Cities in the Americas without careful verification.  Oops!  There is an error on the internet... Perplexity just forgot what it was trained on.  

Aspero was pretty clearly a city (with major buildings, large temples, and agricultural fields) by 3000 BCE.  It was part of the Caral-Supe civilization which goes back even farther. (The Caral-Supe culture seems to date to 5000 BCE, but cities started later and can be reliably dated to ca. 3700 BCE.)  

Huaricanga was also connected with the Caral-Supe culture and dates to 3500 BCE.  There are more major buildings and temples and possibly a connection to Aspero.  

Caral seems to have had several thousand inhabitants starting around 2600 BCE, centrally located to all of the Caral-Supe sites.

HOWEVER... while reading about Caral, I stumbled across a mention of a site that was possibly older--a place called Bandurria.  Curious about this place (which wasn't mentioned by any search engine or LLM), I did a bit of searching and found dates for Bandurria that are around 3000 BCE--older than Caral, but newer than Huaricanga. (See Paleodiet in Late Preceramic Peru: Preliminary Isotopic Data From Bandurria)

Odd, isn't it?  A major city that is a contender for oldest city in the Americas, and it doesn't show up in any of the search/LLMs.  

There are a couple of other sites that are quite old, e.g., Puerto Hormiga in Colombia, or Celilo Falls (aka Wyam) in Washington state--but both of these seem to be ephemeral villages or trading locations--they have long histories as temporary settlements, but never quite made it to city status.  

Bottom line:  The "oldest city in the Americas" tag has to go to Aspero (3710 BCE), with Huaricanga (3500 BCE) and Bandurria (3000 BCE) close behind. All of these cities had more than 2,000 inhabitants, lasted for many years, and were centers of commerce and religion.  

And our new table has moved Tlapacoya to the fifth position, and added Bandurria into position three.
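If you'd like to rebuild that ordering yourself, here's a minimal sketch. It uses only the five sites and the dates settled on in the text above (the full table may well have more rows), and the names `candidates` and `rank_oldest_first` are just illustrative.

```python
# Founding dates as settled on in this post; negative years are BCE.
candidates = {
    "Tlapacoya": -1500,
    "Caral": -2600,
    "Aspero": -3710,
    "Huaricanga": -3500,
    "Bandurria": -3000,
}


def rank_oldest_first(dates):
    """Return (city, year) pairs sorted from oldest to newest."""
    return sorted(dates.items(), key=lambda item: item[1])


ranking = rank_oldest_first(candidates)
for position, (city, year) in enumerate(ranking, start=1):
    # Every date here is BCE, so abs() gives the usual way of writing it.
    print(f"{position}. {city:<11} {abs(year)} BCE")
```

Sorting by date puts Bandurria third and Tlapacoya fifth, which matches the revised table.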



SearchResearch Lessons

1. Compare and contrast different sources.  You know, we've talked about "second sourcing" your results. In this case, I compared eight different systems (search engines + LLM chatbots).  As you can see in the table above, the answers are VERY different from each other.  In some cases, the results are just plain wrong.  (Interestingly, not because they're hallucinating, but because they trained on data that was incorrect, which then surfaced in their outputs.)  

2. Building a comparison table is handy. Not just because you can then use the table to work through the different results, but also so you can see the huge variety of results.  When the "answers" are this different from each other, you have to be fairly skeptical... which we found was the right thing to be.  

3. Remember that search results might be incomplete!  I found Bandurria because I noticed the unusual name when scanning the results.  Checking into it, I found the 3rd oldest city in the Americas... and a result that NONE of the systems surfaced!  

4. When doing search comparisons like this, make your definitions clear so people will know what you're comparing. 


Keep searching!  







Wednesday, September 11, 2024

SearchResearch Challenge (9/11/24): What is the oldest city in the Americas?

 Simple questions are
sometimes harder... 

Mythical ancient city by Imagen

... than you'd expect.  We've seen this multiple times in our SRS explorations.  Simple questions can lead to fascinating side-excursions into topics that you didn't expect.  

Today's Challenge is like that.  It's a very straight-forward question that might not have the simplest answer.  It's up to you to figure this out! 

1.  What is the oldest city in the Americas? 


Easy, right?  

When you find the answer BE SURE to tell us what you did to find it AND why you believe this is the oldest city in the Americas.  

There's an interesting reason I'm posing the Challenge in this way.  I hope you'll also find out why I'm bothering to ask what seems to be so simple, and so obvious.  

Keep searching!  



Thursday, September 5, 2024

Answer: How can you find search phrases beyond your own brain's power to imagine?

 Expanding your thoughts... 

The "Walkie Talkie" building in central London has a giant unanticipated consequence. At certain times of day, the windows reflect the sun's rays to a point creating a hot spot that can damage cars unlucky enough to be parked at the focus.

 
 ... is a key step for doing SearchResearch research.



1. What can a researcher do to find other words and phrases that would help in doing online searching for such a topic?  Let's consider my topic--unanticipated consequences--how can we find other helpful search terms and phrases to seek out and understand this topic?  Ideas?  

Note that what we're looking for are other ways to say "unanticipated consequences." While getting synonyms for each of these two words isn't a bad idea (unanticipated = unexpected, surprise, etc. while consequences = effects, results, etc.), you'll miss a bunch of equivalent phrases or terms. From my own reading in the area I knew that phrases like "cobra effect" or "perverse outcomes" were equivalent, but my brain doesn't answer questions like "tell me all of the other terms and phrases" for this idea.

You could do the obvious synonym search with a search engine.

The Challenge this week was to imagine that you're doing research on a topic that's big, complicated, and difficult to render in just a few words.  How do you start to search for such a beast? How do you expand your mind (and search behavior) to other phrases, terms, and ideas? That is, you could do this (note that you need to use double quotes, or you'll get synonyms for "unexpected"):


It's interesting that Google is giving me synonyms for "unforeseen consequences."  I guess that's the same as "unanticipated consequences," but it's a bit odd.  

If I try Bing, I get this, which isn't quite as useful: 


EXCEPT that in the lower right corner, there's a contribution from CoPilot (i.e., ChatGPT4) suggesting that I try "boomerang effect," "collateral damage," or "Dutch disease." (I hadn't heard of Dutch disease before, so that was a new one to me--it means an apparent causal relationship between the increase in the development of a specific sector (for example natural resources) and a decline in other sectors (like the manufacturing sector or agriculture).)  

But I thought this might be a great chance to try out LLMs as a way to get suggestions about equivalent phrases--that is, after all, what they're really good at doing.  

So I prompted each of the top 5 LLMs for [give me a list of the top ten terms or words that mean the same thing as "unanticipated consequences"].  Here's the spreadsheet with the results, which looks like this: 


You can see a pretty good suite of options here, including several that I hadn't thought about ("unforeseeable repercussion" or "collateral impacts").  

But we had a great set of suggestions from SRS Regular Readers: 

remmij: 

type in "unintended" and let the suggestions populate.

Here's what I get when I follow remmij's strategy.  It's not a bad way to start.  



From Arthur Weiss:  

Now we can use ChatGPT and similar tools. So I put in this prompt: [I'm trying to find books on the topic of "unanticipated consequences". What synonyms should i use in searching for this topic?] 

(Arthur generates the same list I show above.)  

You can combine these with broader concepts such as "policy," "technology," "social change," "economics," or "decision-making" to refine your search further.
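Combining phrases with broader concepts is easy to do mechanically. Here's a rough sketch, assuming you want each phrase in double quotes (so it's searched as a phrase) followed by one concept word. The phrase and concept lists are just samples from this post, and `build_queries` is a made-up helper name.

```python
from itertools import product
from urllib.parse import quote_plus

# Sample phrases and broader concepts, all taken from this post.
phrases = ["unanticipated consequences", "cobra effect", "perverse outcomes"]
concepts = ["policy", "technology", "economics"]


def build_queries(phrases, concepts):
    """Pair every quoted phrase with every broader concept."""
    return [f'"{phrase}" {concept}' for phrase, concept in product(phrases, concepts)]


queries = build_queries(phrases, concepts)
print(len(queries), "queries, for example:")
print(queries[0])

# URL-encode a query so it can be pasted into a browser as a search link.
print("https://www.google.com/search?q=" + quote_plus(queries[0]))
```

Three phrases and three concepts give nine queries, so it's worth keeping both lists short.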

I thought it odd that "black swan" wasn't listed so I asked again. (It's useful to have some ideas to back up what the AI gives you).

"Black Swan" is indeed a relevant term when exploring the topic of unanticipated consequences, especially in the context of rare, high-impact events that are difficult to predict. Here are some additional related terms that might be useful:  Black Swan events, Chaos theory, Butterfly effect, Emergent phenomena, Systemic risk, Tipping points, Complexity theory, Cascading failures, Disruptive events, Unknown unknowns, Contingency theory, Rare events, Outliers, Wild cards (used in futures studies), High-impact low-probability (HILP) events

UC Librarian Donald Barclay wrote:  

My go-to for this is to turn to the experts and try to figure out what language they use. In the case of unanticipated consequences, I might search the term in PsycInfo ($) to see if psychology has anything to say on the topic. I might also search the phrase in Social Science Abstracts ($) and/or in an engineering database like Compendex’s “Engineering Village.” ($)  If I find that experts in a field use different, possibly more precise, language to describe what I’m researching, I can then search those terms. An obvious example is that non-experts commonly drop the phrase “split personality” to describe a certain kind of mental illness. Psychologists, on the other hand, use phrases like “dissociative identity disorder” to more accurately describe the phenomenon commonly, but incorrectly, known as “split personality.” By searching the term used by the experts, I’m more likely to encounter information created by experts and less likely to encounter information created by someone who has watched too many episodes of Dr. Phil.

Searching “unintended consequences” in PsycInfo got 340 hits, so I will need to refine that search to make it useful. It does happen that, as an unintended consequence, the first hit in PsycInfo told me about a book I do not know but which seems promising, Good Intentions: Max Weber and the Paradox of Unintended Consequences. Looking at the full record for that first hit, I was treated to some PsycInfo subject terms that might lead me in fruitful directions if I combine them with “unintended consequences.” These are: Choice Behavior (major); Intention (major); Social Processes (major); Analysis; Consequence

Pro tip: When you find a promising article in a database, always look at the subject terms that have been assigned to it.


Krossbow:  
Suggested using PowerThesaurus to find wide-ranging synonyms.   (Dan: This is a great tool that I'd totally forgotten about!)   

 

Here's what it gave me:  unintended consequences, unforeseen consequences, unforeseen effects, their indiscriminate effects, uncontrolled effects, accidental consequences, accidental effects, accidental results, inadvertent consequences, inadvertent effects, incidental effects, unanticipated effects, unanticipated outcomes, unanticipated results, undesirable effects, undesirable outcomes, undesirable reactions, undesirable results, undesired effects, undesired outcomes, undesired reactions, undesired results.  


RB: 
Built Wiki-Guided Google Search (a kind of front end for running effective Google searches over Wikipedia). The first tool on that site, Wiki-Guided Google Search, searches Wikipedia articles to find concepts related to the ones you're searching for. When RB tried searching for "unintended consequences," he got query word recommendations which were both specific (North American Free Trade Agreement, Eliza Armstrong case) and conceptual (Perverse incentive, Structural functionalism).

The second tool on that site is Clumpy Bounce Topic Search. It determines what Wikipedia categories a relevant page belongs to, determines the most popular pages in that category, and makes the ten most popular pages available. This time when RB searched for unintended consequences, he discovered the category "Consequentialism" and generated a search containing the terms unintended consequences, "Consequentialism", and "Experience machine", which dropped him deep into the wilds of philosophy search results.

2. Same question, except this time I want to search for books on the topic of unanticipated consequences.  (Yes, I know I can go to Amazon, Barnes and Noble, AbeBooks, Google Books, or the Internet Archive.)  What's the best way to find the top 10 books on the topic? 

Unsurprisingly, many of the same tricks that worked above also work here.

When I did the cross-comparison of LLM suggestions for books, I found this:


which was a pretty decent selection.

Donald Barclay wrote:

As for the books part of your question, you can always resort to not just looking at reviews, but considering where a book has been reviewed to get some idea of how it is regarded. Has the book been reviewed in important journals for the field it covers? Has it been reviewed in outlets like the NY Review of Books, the NY Times, the Times, etc.? On the other hand, metrics like “New York Times bestseller” are worthless because it’s beyond simple for publishers to manipulate bestseller lists. You can also look up a book title in a source such as Google Scholar to see how many times it has been cited. A lot of citations is not necessarily an endorsement, but at least it shows you how much attention has been paid to the book. Google Scholar will also direct you to book reviews that might not turn up in Amazon.  

remmij also suggested going to YouTube and searching there for our topic. I've done this before, and I'm glad remmij brought this up. The results are surprisingly rich and worth reading:



SearchResearch Lessons

There are so many here... but here's a quick summary:

1. Try synonym search. Seems obvious, but you'll find a few great ideas.

2. Try using a thesaurus. Krossbow recommends PowerThesaurus, and I see why.

3. Use LLMs and ask for new synonymous phrases. It works surprisingly well, especially if you compare and contrast different LLMs. (Check out my spreadsheet.) This is an especially good use for LLMs in SearchResearch tasks.

4. Check out Wiki-Guided Google Search as another way to look at Wikipedia.  

5. Don't forget about searching on YouTube. There's more there than you might expect.    
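Lessons 1 through 3 all produce lists of phrases, and it helps to merge them and see which terms more than one source agrees on. Here's a minimal sketch of that. The terms are a small sample pulled from this post (not the full spreadsheet), and the source labels and the `merge_suggestions` name are just for illustration.

```python
from collections import Counter

# A small sample of suggestions mentioned in this post, grouped by source.
suggestions = {
    "own reading": ["cobra effect", "perverse outcomes"],
    "Copilot": ["boomerang effect", "collateral damage", "Dutch disease"],
    "PowerThesaurus": [
        "unintended consequences",
        "unforeseen consequences",
        "unanticipated outcomes",
    ],
    "reader comments": ["Unintended Consequences", "black swan"],
}


def merge_suggestions(by_source):
    """Count how many sources suggest each term, ignoring case and stray spaces."""
    counts = Counter()
    for terms in by_source.values():
        # Use a set so one source can't count the same term twice.
        counts.update({term.strip().lower() for term in terms})
    # Most widely suggested first, then alphabetical.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


merged = merge_suggestions(suggestions)
for term, n_sources in merged:
    print(f"{n_sources}  {term}")
```

Terms that show up in several lists are good first searches, while the one-off terms (like "Dutch disease") are often the ones your own brain wouldn't have produced.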

Keep searching.