Wednesday, October 16, 2024

Answer: Finding similar content?

 Finding something similar... 

P/C Gemini. [create an image of a hall of mirrors with trees and mountains]

... but not exactly the same is a common SearchResearch thing to do.  As I'm doing my own research work, I'll want to find another take on a topic, and would really like to know how to find a similar / closely-related page or site.    

This common research step is our Challenge for this week: 

1. How can you find related sites (or content) to something that you know about?  Suppose you want to learn more about Search and Research methods:  Can you find similar or related websites to our trusty SearchResearch1.blogspot.com?  

Last week I gave you a BIG HINT:  If you do the obvious search, e.g., [how to find related sites on Google], it will tell you that you should use the related: operator.  Don't believe it: the related: operator has been deprecated since June, 2023.  It is no more--it has ceased to be. (See the note from Google's SearchLiaison.)  

Why is it so mistaken?  A: There are thousands of pages that discuss how to use the related: operator... but only a very small number that tell you that it no longer works.  Unfortunately, both search engines and LLMs work with the vast majority rather than actual bits of knowledge.  You could say that the site: operator no longer works (don't worry--it still does!), but it would take the LLMs just about forever to figure this out. 
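Here's a toy sketch of that "majority wins" effect.  It is NOT how any real search engine or LLM works; the page counts, years, and claim strings are all invented for illustration.  It only shows how an answer chosen by sheer repetition can drown out a newer correction.

```python
from collections import Counter

# Hypothetical toy corpus of (year, claim) pairs. The numbers are made up.
pages = [(2015, "use related:")] * 1000 + [(2023, "related: is deprecated")] * 3


def majority_answer(pages):
    """Return the claim that appears on the most pages."""
    counts = Counter(claim for _, claim in pages)
    return counts.most_common(1)[0][0]


def most_recent_answer(pages):
    """Return the claim from the newest page, ignoring how often it is repeated."""
    return max(pages, key=lambda page: page[0])[1]


print(majority_answer(pages))     # the stale advice wins on volume
print(most_recent_answer(pages))  # the correction wins on recency
```

The takeaway for searchers: when the answer is about how a tool works, check the date on what you're reading.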

Alas, here is what Google currently tells you when you ask [how to search for related pages on Google]:  

WRONG!
Searching for how to find related sites on Google.  It gives you a bunch of bogus information.  If I were a linguist, I'd put a giant asterisk next to this set of results.

What IS a reasonable "find related" strategy?  

I asked a couple of LLMs "how to find sites that are similar to X," and while a lot of the advice wasn't great, I DID extract a few gems.  

SimilarSites.com - doesn't work for our SearchResearch1.blogspot.com site (but see below--it DOES work for 3blue1brown... bear in mind -- your mileage may vary).  

SimilarWeb.com - DOES work for SRS, but the similar sites are mostly math or education sites (e.g., 11011110.github.io or blog.discoveryeducation.com).  Those are fine, but not quite on topic.  

If you're a Regular Reader, you know that what our SearchResearch site is REALLY all about is how to be a better searcher.  We cover a lot of other ground in our quest to improve online research skills, but at heart, we're a team that knows about how to do online search.  

But the blog is a long series of Challenges followed by Answers. What is it about?  

To get the gist of SRS, I took a few posts (in particular, the last 8--four Challenges and four Answers), copy/pasted them into a single file, uploaded it to Gemini and asked it: 

This file is a sampler of several pages from the searchresearch1.blogspot.com web site.  Can you summarize these pages and tell me what this website is all about? 

It's not a perfect summary, but it's not bad.  Here are a couple of excerpts:  

This website, searchresearch1.blogspot.com, is a blog dedicated to exploring the intricacies of online search and research methodologies. It delves into the art of utilizing search engines effectively, with a particular focus on Google, and provides valuable insights into how to conduct research efficiently and intelligently.

The blog's author, Dan Russell, frequently presents "SearchResearch Challenges," thought-provoking questions designed to test and enhance readers' search skills. These challenges range from identifying the oldest city in the Americas to finding related websites with similar content.

A recurring theme throughout the blog is the concept of "unanticipated consequences," highlighting the unexpected outcomes that can arise from research and the importance of being prepared for surprises. The author emphasizes the need for persistence in research, advocating for continuous learning and exploration.

Overall, this website serves as a valuable resource for anyone looking to improve their online search and research capabilities. It encourages critical thinking, effective search strategies, and a deeper understanding of the complexities involved in finding and evaluating information online.

But from this summary, we could extract a couple of decent search queries to find related sites.  I extracted two: 

     [ art of utilizing search engines effectively,
       valuable insights into how to conduct research
       efficiently and intelligently ] 

     [ anyone looking to improve their online search
       and research capabilities. It encourages critical
       thinking, effective search strategies, and a deeper
       understanding of the complexities involved in 
       finding and evaluating information online.] 

Yes, I know those are very long queries, but the old advice has to be updated: One used to look for short, precise queries that would get you just a few, highly targeted results.  But THIS Challenge is asking a broad question that has multiple aspects to it.  For questions like this, longer queries actually work very well.  Here are some of the results:  

Note that not all of the query terms are in each result.  Doesn't matter. 
We're looking for related sites, not exact duplicates.
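If you want to script this step, here is a minimal sketch that turns a multi-line excerpt from a summary into a single long query and a search URL.  The helper names (long_query, search_url) are my own, and the base URL is an assumption about the usual Google search address; swap in whatever search engine you prefer.

```python
import re
from urllib.parse import quote_plus


def long_query(text: str) -> str:
    """Collapse line breaks and runs of spaces so an excerpt becomes one query string."""
    return re.sub(r"\s+", " ", text).strip()


def search_url(query: str, base: str = "https://www.google.com/search?q=") -> str:
    """Build a search URL. The default base URL is an assumption, not part of the post."""
    return base + quote_plus(query)


excerpt = """
    art of utilizing search engines effectively,
    valuable insights into how to conduct research
    efficiently and intelligently
"""

query = long_query(excerpt)
print(query)
print(search_url(query))
```

The query text goes in unquoted on purpose: a long query with no quotation marks lets the search engine match loosely, which is what you want when looking for related (not identical) sites.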


These two queries led to quite a nice collection of sites that are closely related to SRS. Here's a short list of interestingly related sites that I didn't know about!   

Zampila (on research methods), UniversalClass (on how to do online searches), Marcos Esteve blog on data science and search... There are a bunch!  


2. Same question, but about a different website:  Again, one of my favorite sites, www.3blue1brown.com (an excellent science/math/tech site with lots of exquisite videos illustrating complex topics).  

SimilarSites.com works for 3blue1brown, and points to sites like TomRocksMath.com and BetterExplained.com -- those are great sites that ARE very similar to 3blue1brown.  

Our "summarize the website" approach works less well for 3blue1brown since it's primarily a video site, but you can still patch together a set of writings (such as the "About" page for the site) and ask for another summary, then use that for your search.  When I did that, it led me to: 

Mathologer: This YouTube channel delves into fascinating math puzzles and concepts with a focus on visual proofs and surprising connections.

Numberphile: Another YouTube channel, Numberphile explores a wide range of mathematical ideas through interviews with experts and engaging visuals.

Desmos: This free online graphing calculator allows you to visualize functions, explore geometric concepts, and create interactive mathematical models. It's a fantastic tool for hands-on learning.

GeoGebra: Similar to Desmos, GeoGebra offers powerful tools for visualizing geometry, algebra, calculus, and statistics. It's great for dynamic exploration and creating interactive simulations.

Overall, this approach works rather well.  


In the comments, remmij reminds us that for any question, there is a pretty decent chance that there's a Subreddit about it.  In this case, searching on Reddit for "3blue1brown" quickly leads to "Looking for more channels like 3blue1brown" and a lot of suggestions for similar sites.  
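The "more X like Y" pattern is easy to generate for any site you care about.  Here's a small sketch; the function name and the particular phrasings are just my guesses at useful variants, and it leans on the site: operator to restrict results to Reddit.

```python
def more_like_queries(name: str, kinds=("sites", "channels", "blogs")) -> list[str]:
    """Build a few 'more X like Y' queries restricted to Reddit via site:."""
    queries = [f'site:reddit.com "{kind} like {name}"' for kind in kinds]
    # One extra, more general phrasing (hypothetical; adjust to taste).
    queries.append(f'site:reddit.com "similar to {name}"')
    return queries


for q in more_like_queries("3blue1brown"):
    print(q)
```

Paste any of the printed lines into your search engine of choice and see which phrasing people actually used.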

Mott Given wrote in to point out https://explore2.marginalia.nu/ -- a site that finds nearly related sites.  

Regular Reader Krossbow pointed out the Yogurrt "discovery engine" (which didn't work so well for me, but YMMV).  

There are probably more such tools out there.  Let me know and I'll add them to this page.  


SearchResearch Lessons 

1. Be wary of the specific details about how to use software given to you by search engines or LLMs.  They're often out-of-date--partly because operations change quickly, and partly because there might be millions of citations about how something works, but only ONE reference to that function being deprecated.  Unfortunately, that's not the way LLMs work--you can't just say "this doesn't work anymore" and have it "learn" that!  

2. There are tools worth knowing: SimilarSites.com and SimilarWeb.com both give decent results. 

3. Creating a longer summary description of the site and asking your search engine to do a search on that description often gets you into the neighborhood. As I mentioned, that was a crazy method to use a couple of years back, but these days, it often works just fine.  

4. Don't forget to look specifically for recommendations for "more X like Y"--you never know when Reddit will have exactly the thing you seek.  

5. Ask your friends...  That's how we found a few additional "related" tools--friends' comments left here!  

 


Keep searching.  


5 comments:

  1. https://misinforeview.hks.harvard.edu/
    https://shorensteincenter.org/
    "The image of a snake eating its tail is historically, spiritually, and metaphysically significant as a symbol of eternity and progressively repeating cycles. Even though the serpent devours itself, it simultaneously regenerates, making its self-consumption and self-regeneration eternal."
    https://en.wikipedia.org/wiki/Ouroboros
    https://study.com/academy/lesson/ancient-ouroboros-symbol-history-significance-examples.html
    https://x.com/searchliaison
    Danny Sullivan?
    https://youtu.be/m3M_tal884c?si=-RggLhqj5HryqBq5
    no-speaky:
    https://en.wikipedia.org/wiki/Julia_(programming_language)
    https://en.wikipedia.org/wiki/Richard_Rusczyk
    https://www.ams.org/notices/202210/rnoti-p1789.pdf
    https://artofproblemsolving.com/
    https://en.wikipedia.org/wiki/Backronym
    https://en.wikipedia.org/wiki/RDS-1
    https://en.wikipedia.org/wiki/Orobas
    "In this same chapter of John's gospel, the uroboros image first appears in the call to be born from above (water and the spirit). Uroboros is associated with water (as in the Genesis story of creation), and it is affiliated with Okeanos, which has the dual meaning of life (zoë) and death (Thanatos)."

    https://history.howstuffworks.com/history-vs-myth/ouroboros.htm

  2. related/similar… tossed in the surf/waves
    Jörmungandr
    https://en.wikipedia.org/wiki/J%C3%B6rmungandr
    https://youtu.be/ln1gz-F29h4?si=yiX6nIQvHYzqFDgz
    "You could definitely scan a wikipedia page and get something in 3 seconds but you would not be reading deeply. If you took 30 years and only read 10% of the articles, now you get about 5 minutes per article. As others mentioned, reading everything is just not possible."
    https://news.ycombinator.com/item?id=38951168
    https://en.wikipedia.org/wiki/First_Wikipedia_edit
    https://theonion.com/employee-lost-like-sailor-in-maelstrom-after-hr-fails-t-1851570669/
    https://coastalreview.org/2021/01/burnside-faces-maelstrom-of-hatteras-inlet/
    https://www.poynter.org/fact-checking/media-literacy/2023/lateral-reading-the-best-media-literacy-tip-to-vet-credible-sources/
    https://cor.inquirygroup.org/curriculum/collections/teaching-lateral-reading/

  3. https://mathwithbaddrawings.com/2023/10/30/a-brief-collection-of-math-metaphors-in-literature/
    https://abstractmath.org/MM/MMImagesMetaphors.htm
    https://www.chronicle.com/article/your-brain-on-metaphors/
    "And so may be the ability to create asymmetric neural linkages that say this is like (but not identical to) that. In an age of brain scanning as well as poetry, that’s where metaphor gets you." Michael Chorost
    https://web.archive.org/web/20101228095816/http://searchresearch1.blogspot.com/

  4. he who
    https://the-oasis.org/product/birds/lovebirds/hewho/

  5. AI managers get creative - Ai-Da (not DaDa nor "Ob-La-Di, Ob-La-Da" )
    https://nypost.com/2024/10/16/tech/sothebys-to-auction-painting-by-humanoid-robot-in-a-futuristic-first/
