Friday, January 3, 2025

Answer: What's the most significant thing going on here? (3/3)

I didn't mean to write three posts...  


But this is a big problem!     

I have to sympathize with Ramón and remmij’s comments about feeling overwhelmed in the SRS space.  We’re living in a frenzied time, when new systems and products are bubbling up every day and in every way.  It’s easy to give up… but don’t.

A Gemini-generated image of Dan being overwhelmed by the number of choices when doing Deep Research.


Here, at the SearchResearch Rancho, we’ll try to shed some light on what works, and what you can safely ignore. Isn’t that why you come here?  For a bit of clarity and guidance in these complicated times?  Let’s see what we can figure out.  

As I mentioned earlier this week, a new trend in SRS is the launch of several new “deep research” tools.  In today’s post I’m going to compare / contrast three of them.  (For simplicity, and to make a small pun, I’m going to call these tools “DR” tools.)


(1) Google’s DR tool is their “Deep Research” mode for Gemini.  (I can’t give a direct link to it because Gemini doesn’t use URL parameters.  You have to manually select the “1.5 Pro with Deep Research” option to get it to show up.  Sorry about that.)  The idea is that "1.5 Pro with Deep Research" will write you a short summary on some research topic.  

(2) Undermind.ai: a "personal assistant" to help you with collecting, analyzing, and summarizing work in a particular topic area.


(3) OpenScholar (from Allen.ai): a DR tool that scans the scholarly literature (and sometimes it refuses to do so!) and writes a summary much in the style of Google’s 1.5 Pro w/ DR.


(For a list of other deep research systems, take a look at the Slashdot list of DR systems.)

The idea behind all of these DR systems is to use AI techniques to analyze large volumes of complex data, looking for in-depth insights and discovery beyond traditional research methods. The hope is that these tools will find complex relationships and patterns within the data that might be difficult to identify manually.  Each of them writes a little report on what it found, giving citations to the literature it used.

To compare each of these systems I asked two questions.  

A. What has been the effects of the creation of Lake Nasser on the ecosystems around it?

B. What has been the effect of the creation of Lake Nasser on the incidence of schistosomiasis in Egypt? 


Question A is meant to answer our primary question (“what are the most important changes…”).

Question B dives more deeply into a particular question about schistosomiasis, a parasitic disease that’s caused by blood flukes (trematode worms) of the genus Schistosoma, which are transmitted by freshwater snails that often live in agricultural canals. It’s a very serious disease. Can these DR systems help us understand this important change?

Here’s what I found when using these DR systems… 


Question A:  (effect of Lake Nasser on the ecosystems)  


Google:  The DR tool (using Google’s “Gemini Advanced 1.5 Pro with Deep Research” on January 2, 2025) created a 2,400-word report (link to report) that covers a bit of the history of Lake Nasser, and then lists positive and negative effects. Ecological projects are mentioned, and a section entitled “Scholarly Research on the Ecological Impact of Lake Nasser” includes a very obscure data set on water quality.  (The data is important, but why copy/paste part of the raw data into the report?)  It mentions the “South Valley Project” (another name for the Toshka Lakes).

Oddly, the report cites TWO Kids.Britannica.com reports, a few studies by international organizations (e.g., WorldFish international), a few Wikipedia articles, and a couple of scientific literature studies.

Overall grade: It’s not bad, but there are a few sections that are just odd–not something a human would ever write.  


=================================

Undermind:  When you use Undermind, it starts with a little back-and-forth trying to get you to add more detail to the research question. That’s fine, but it also ends up narrowing the scope of the research. In my case, the final question posed to Undermind was “The ecological impacts of the creation of Lake Nasser, focusing on changes and adaptations in both aquatic ecosystems within the lake and terrestrial ecosystems surrounding the area.”  


Here’s the top of the Undermind report: 


Undermind also provides a helpful summary of the Categories of papers it found: 


And a very helpful timeline of research work: 


As well as a very interesting set of clusters of research groups and contributions by each group: 


The references (at the end of the report) show the expected citation info, but also a measure of the topic match, the number of citations / year (which indicates how often the paper gets cited), and a summary of the relevance of the paper to the topic:


Overall grade: Undermind gives you much more information than Google’s DR report, including analysis that Google won’t give you (e.g., the clustering).


=================================

OpenScholar: By contrast, Ai2 OpenScholar took the same research question and wrote a fairly short, cursory report. Here’s the top of that report: 


There are only 3 references given, and one of them (Goher et al. 2021) is used for 6 of the 10 citations. What’s more, the Goher paper was published in the journal Water, which is a publication of MDPI (Multidisciplinary Digital Publishing Institute), which has a not-great reputation. (You can read the Wikipedia page to learn more.)  In any case, it’s not a paper I would choose as the centerpiece of my critical review of Lake Nasser ecosystems, even though the paper’s data seems reasonable enough.

Overall grade: It’s not bad, but there are a few sections that are just odd–not something a human would ever write.  And the overall quality of the cited works was a little suspicious.  (I found other examples of papers I’m not sure I would cite in OpenScholar analysis reports.  OpenScholar–what are you doing??)  



=================================


Question B: (effect of Lake Nasser on Schistosomiasis)


Google:  In reply to the Schistosomiasis research question, Google’s DR tool created another report (link to report) that is structurally similar to the other report… it too covers a bit of the history of Lake Nasser and environmental issues.  It has a section on public health interventions, and then the effects of Lake Nasser on the incidence of Schistosomiasis, pointing out that a massive anti-snail / anti-Schistosomiasis campaign has caused an overall REDUCTION in the incidence of the disease.


Overall grade: Again, it’s not bad... but there are some contradictory statements (e.g., while Schistosomiasis overall has gone down, another form of the disease (specifically, Schistosomiasis mansoni, the form caused by Schistosoma mansoni) has actually increased).  The obvious question a human would ask is “is this overall good, or bad for the country?”  That’s never really addressed.

=================================

Undermind:  By contrast, Undermind doesn’t really write much of a report; it mostly gives a bunch of research results from the literature.  What little it says about the incidence of Schistosomiasis slightly contradicts Google.  In particular, it writes that:

  “The creation of Lake Nasser after the construction of the Aswan High Dam significantly increased schistosomiasis transmission in Upper and Middle Egypt by altering ecological conditions that favored the proliferation of snail vectors …, though public health interventions such as mass drug administration and mollusciciding effectively reduced disease prevalence in most areas despite persistent hotspots.”  


Overall grade:  Undermind gives a completely different set of relevant papers than Google Gemini!  (There is zero overlap.)  But it does give a deeper analysis about why Lake Nasser changed the way that Egyptians use canals for agriculture, leading to an increase in the disease!  
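
If you want to check this kind of overlap between two tools’ reference lists yourself, here’s a minimal sketch in Python.  It’s just one way to do it, not how I made the comparison above.  The function names (normalize, reference_overlap) and the example titles are made up for illustration, and matching on cleaned-up titles is an assumption on my part; matching on DOIs would probably be more reliable when you have them.

```python
def normalize(title: str) -> str:
    """Lowercase a title and replace punctuation with spaces so near-identical titles match."""
    kept = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return " ".join(kept.split())


def reference_overlap(refs_a, refs_b):
    """Return (shared normalized titles, Jaccard similarity) for two reference lists."""
    a = {normalize(t) for t in refs_a}
    b = {normalize(t) for t in refs_b}
    shared = a & b
    union = a | b
    # Jaccard similarity: size of the intersection divided by size of the union.
    jaccard = len(shared) / len(union) if union else 0.0
    return shared, jaccard


# Placeholder titles only; these are NOT the references from either report.
tool_a_refs = ["Example Paper One", "Example paper two!", "Example Paper Three"]
tool_b_refs = ["example paper TWO", "Example Paper Four"]

shared, jaccard = reference_overlap(tool_a_refs, tool_b_refs)
print(sorted(shared), round(jaccard, 2))
```

A score of 0 means the two lists share nothing (which is what I saw here), and a score of 1 means they’re identical.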

=================================

OpenScholar: I’m not sure what happened here, but I was completely unable to get it to give me any kind of answer to my schistosomiasis questions!  I tried multiple variations on the research statement question, but all I ever got was a failure notification.  

“Referenced task failed. Error: We were unable to retrieve any relevant papers for your query. Please try a different query. OpenScholar is not designed to answer non-scientific questions or questions that require sources outside the scientific literature.” 

Overall grade: Not a great performance.  I spent probably an hour trying different variations on the theme (including logging in on a different account), but for whatever reason, it just refused to answer.  




SearchResearch Lessons

Let’s start with the post from 2 weeks ago… 

1. When getting an overview, consider using maps… and in particular, consider using time lapse.  They’re relatively easy to get, and give you a very different perspective on regional questions.  


2. Asking LLMs these questions is a good idea... BUT ONLY if you look at multiple different AI systems.  As we saw last week, each of these can give you a very different idea about what the issues are.  But aggregating the results can give you a decent overview.  (Do not, however, take the frequency of topics mentioned as a proxy for importance!  That’s kind of random: multiple mentions do not equal overall importance!)


3. The Deep Research (DR) tools are a new kind of thing for doing serious research. While interesting, they’re not a substitute for real human research.  (At least not yet.)  Like LLMs, each has its own perspective (which can be useful when taken together), but they don’t seem to have great quality control over which papers are high quality.



As ever, check your work--now more than ever.  The DR systems are really interesting tools, but they’re not quite a replacement for your good research skills and discernment.  


In the future we’ll talk about other DR tools, including ResearchRabbit, Elicit, Iris, Affor.ai and NotebookLM.  (But I didn’t want to write a book about them... at least not yet!)



Keep searching!  







4 comments:

  1. …not easy, but the thrill is waning, it all seems kinda alien…
    (“DR” tools - funny)
    https://i.imgur.com/PYcsi7B.jpeg
    https://en.wikipedia.org/wiki/Anhedonia

    undermind - not to be confused with underhanded.ai
    "Started by two quantum physics PhDs from MIT, with decades of experience in deep research."
    there goes the anti-gravitational work - oh well
    identification of complex systems doesn't equal comprehension/revelation of said systems.

    who names the new AI iterations?
    did it start with Archie & ask Jeeves?
    https://www.sunsethq.com/blog/what-happened-to-ask-jeeves
    https://www.captechu.edu/blog/alan-emtage-creator-of-archie-worlds-first-search-engine

    this is only a pre-summary of the aforementioned, and lost, detailed summary that failed to populate in this comment.

    bits:
    https://pmc.ncbi.nlm.nih.gov/articles/PMC4293883/

    https://www.amacad.org/publication/thinking-historically-about-water-security-public-health
    - it's all clustering
    research for your upcoming book (whatever form that takes)
    https://www.hbs.edu/faculty/Pages/item.aspx?num=56633

  2. just planting a seed…
    what does AI eat? (seems tied to ancient Egyptian/Nile agriculture - in a titular way)
    https://arxiv.org/abs/2405.09597
    https://askdruniverse.wsu.edu/2023/09/14/what-do-robots-eat/
    MAD:
    https://deepgram.com/learn/when-ai-eats-itself
    would AI scientists make this kind of sacrifice?
    https://en.wikipedia.org/wiki/Pavlovsk_Experimental_Station
    https://deepenglish.com/lessons/scientists-gave-lives-protecting-seeds/
    https://www.sciencehistory.org/stories/magazine/the-tragedy-of-the-worlds-first-seed-bank/
    https://www.theguardian.com/world/2024/nov/12/food-source-famine-leningrad-seed-bank-nikolai-vavilov
    current version:
    https://www.croptrust.org/work/svalbard-global-seed-vault/
    https://www.clir.org/wp-content/uploads/sites/6/2024/09/Story-of-the-Modern-Seed-Library.pdf
    https://bays3rdgrade.weebly.com/uploads/4/2/5/4/42542857/2_the_most_important_seed.pdf
    https://svalbardi.com/blogs/news/what-seeds-are-in-the-svalbard-global-seed-vault
    https://en.wikipedia.org/wiki/List_of_edible_seeds

  3. other "DR" tools…
    will it be popular in Australia? perhaps overrun the country?
    https://www.researchrabbit.ai/mission
    https://youtu.be/phWqcGcxeE4?si=d7MIU0qysEfw7E_Y
    Rascally Rabbit
    https://tvtropes.org/pmwiki/pmwiki.php/Main/RascallyRabbit
    https://elicit.com/
    https://www.xprize.org/prizes/artificial-intelligence/teams/iris_ai
    Affor.ai ---- really?
    https://youtu.be/dQw4w9WgXcQ?si=KxP05Czllo0xrxKx
    https://blog.google/technology/ai/notebooklm-audio-overviews/
    https://teaching.ucr.edu/notebooklm
    check twitter on NotebookLM



  4. Thank you, Dr Russell

    It's great to have this information and comparing the results.

    I read about NotebookLM. Or at least I think so. People describe it as a better Google Keep. I don't think that's the purpose but that was mentioned when I was reading about Google Keep now becoming a system app.
