Wednesday, January 24, 2024

SearchResearch Challenge (1/24/24): How can I get an AI to summarize a document?

We’ve got AI tools all over the place… 



… now, how can we use them effectively? 

An important piece of SearchResearch (or more generally, sensemaking) is finding relevant documents, reading them, and extracting the pieces of knowledge you need to get on with your work.  

I find that I spend roughly half of my research time just browsing, scanning, and reading—it’s what you need to do to understand a topic area enough to do real work. 

But what if we could accelerate the scanning and reading part?  What if we could take a long document and summarize it accurately?  

This is one of the great promises of the current crop of LLMs—they offer the ability to summarize long documents.  

Or do they?  Will they make the long document into a shorter, more focused work?  Or will that tool end up blurring everything into mush?  


The Challenge for this week is to explore how well an LLM can summarize a document in a way that’s useful for us SearchResearchers.  Here’s the Challenge for this week: 

1. How well does your favorite bit of AI technology do at summarizing long texts?  What system do you like for summarizing, and why?  

2. Test out your summarization method on two long-form texts.  First, let’s try that classic gothic tale, Frankenstein by Mary Shelley.  (Here’s a link to the full text that you can use.) How good is that summary?  

3. Second, let's try your summarization method on a piece of text you might have read, but might have slightly forgotten--Chapter 10 of my book, The Joy of Search (link to the full text of Chapter 10 here), "When was oil first discovered in California?"  Did your summary method create a good/useful summary?  


Try using your fave summary method on these two works and let us know what you find.  We’re interested in practical ways of getting a good summary of long texts.  When you do this, think about what your goal is in getting the summary—is it just general interest?  Or do you have a particular question in mind?  When is a summary useful?  (And contrariwise, when does it not work well?)  Or, how do you know your summary is a good one?  

Let us know what you've learned in the comments section.  

Next week I’ll summarize what we’ve learned, and what techniques seem to work best for SearchResearchers.  


Keep searching! 

 

Friday, January 19, 2024

Special: How do you use LLMs in your SearchResearch?

 Perhaps... 


... I was a little hasty in getting this week's SRS post out.  

Not long after I posted it, a few Regular SRS Readers wrote in with their methods of using LLMs. 

Regular Reader Unknown wrote in the comments that: 

I've found Bard useful for suggesting lines of enquiry. For instance, I asked "When did people in China first realise dolphins are not fish?" and it suggested some ancient Chinese texts which apparently discuss dolphins. I don't know if this info is accurate, but at least it's a good starting point. ChatGPT couldn't help with this query, and conventional searching didn't get me very far. I also asked Bard "Which art historians believe that Johannes Vermeer painted St Praxedis, and which are skeptical?" and it gave me two lists of names - again a good place to start.

I like this use of LLMs quite a bit--as a research starter.  It's an advanced use of an LLM resource. 


And I'd missed cmarlowe's earlier comment that: 

As a retired librarian I use LLMs to provide a basic understanding of new topic or to provide further insights into a topic I am researching and have some knowledge of. One of the features that helps me in researching is the conversations, because it requires me to think what else I should consider in my reference query. But always having that librarian mindset, I do check multiple search tools since often one finds different and helpful information to complete one’s research query and these can include general search engines or research sites. This also helps to avoid misinformation from LLM hallucination by comparing one to another. One unique LLM tool is Poe https://poe.com/ where you can search multiple AI sites and tools. 

Both Unknown and cmarlowe are using LLMs in a sophisticated way--as lead generators.  That is, the LLM gives them ideas about where to look and which resources might be useful.  

*

On the other hand, my friend Aditi sent me three examples where she used ChatGPT to answer questions that would have taken a lot longer with traditional search, and one where it's unclear that ANY search engine could have helped at all. This is a nice example of using LLMs to answer questions directly.  

Her description of her truly difficult search task was this: 

"[I wanted to] Find out why I remember an old Telugu nursery rhyme my grandma used to sing. I tried so hard with Google all these years and chatGPT helped me finally understand why I loved it and remembered it all these years." 

Here's their conversation of the first very difficult search task: 

Aditi:  I'm trying to figure out what an old Telugu nursery rhyme my grandma used to sing to me means. I'm not sure I remember the words right, but if I transcribe it to the best of my memory, might you be able to help me figure  it out? 

ChatGPT: Absolutely, I'd be happy to help. Please go ahead and transcribe the nursery rhyme as best as you remember it. 

Aditi: Ok here it is. I remember almost all of it clearly except for the fruit parts, which I've enclosed like (this?). I say fruit because it's about which bird eats which fruit I think? 

Gaagi, gaganaagi  

Ganganagi intilo paalupandi 

Aa pandu thecchi chilakakicchi 

Chilaka thinna pandu (cheedi?) pandu, Hamsa thinna pandu (arati?) pandu, [Baby's name] thinna pandu? Bangaaaara pandu!! 

ChatGPT:  Thank you for sharing the nursery rhyme. It's a charming Telugu rhyme about different birds and the fruits they eat, with a sweet twist at the end involving the baby's name. Here's a rough translation and interpretation: 

"Gaagi, gaganaagi 

Ganganagi intilo paalupandi" 

This part sets a playful and melodic tone, akin to calling everyone to gather around,  possibly imitating the sound of birds. 

"Aa pandu thecchi chilakakicchi Chilaka thinna pandu (cheedi?) pandu," 

It talks about bringing a fruit to a parrot (chilaka). The fruit could be "(cheedi?) pandu," where "cheedi" typically means a berry. The exact fruit is uncertain, but it's something the parrot enjoys. 

"Hamsa thinna pandu (arati?) pandu," 

This line is about a swan (hamsa) eating a fruit, possibly "(arati?) pandu," which could refer to a banana in Telugu. 

"[Baby's name] thinna pandu? 

Bangaaaara pandu!!" 

This is the delightful twist where the baby's name is inserted, asking what fruit the baby eats, followed by "Bangaaaara pandu," meaning 'golden fruit', symbolizing that the child is precious and loved, like a rare and valuable fruit. 

It's a nursery rhyme that combines the joy of nature (birds and fruits) with the delight in a beloved child, turning a simple rhyme into a celebration of the child's presence. The uncertainty in the fruits adds a bit of mystery and room for imagination, which is often a charming aspect of traditional nursery rhymes. 

This is pretty impressive.  Bear in mind that Aditi transliterated the lyrics of the song she mostly remembered into an English-based transcription.  (Normally, Telugu is written in Telugu script, which looks like this: తెలుగు. That's "Telugu" in Telugu characters.)  And yet, based on her approximation of the Telugu lyrics, it somehow recognized the song and produced an English parallel translation.  

Both Aditi and I did regular searches to try and find such a resource, but we both failed.  There's a bit of knowledge buried deep in ChatGPT that includes Telugu nursery rhymes.  

And, in the spirit of double-checking all results (especially surprising results like this), Aditi checked in with her relatives in India and confirmed that what ChatGPT answered is, in fact, correct!  

*

In another search task, Aditi was searching for something to do with her young family on a cold day in Denver: 

Aditi:  What's something we can do as a family in Denver right now? It's cold so I was thinking something indoor and fun for a baby 

ChatGPT: Visiting the Denver Aquarium could be a delightful experience. Imagine it as a serene underwater world, a tapestry of vibrant marine life, captivating for both you and your young daughter. It's a warm, inviting space, away from the cold, where your baby can marvel at the wonders of aquatic life, creating a tapestry of early, colorful memories. 

Aditi: Nice! Any other ideas? 

ChatGPT: Consider the Denver Children's Museum. It's like a playground of learning and discovery, tailored for little explorers. Your baby would find a world scaled to her size, filled with interactive exhibits that engage the senses and ignite curiosity, while you could enjoy watching her discoveries and growth in such a playful environment. 

The responses are a little wordy, but they have the advantage of being high quality, and the disadvantage of being just one response.

Note that Aditi used ChatGPT's "custom instructions" feature to tell ChatGPT ahead of time who she was, about her family, and what kinds of responses she prefers.  (That's why the system knew that her family includes a baby.)  

This is a great way to tune ChatGPT to become more like the information resource that you'd like to use.  Aditi gave custom instructions saying that she has a graduate degree in information sciences, and that the replies it gives should be concise and accurate.  This has the effect of tuning the responses to be more like what you need. (I'll write more about this in the future--very handy.)  
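For readers who use the API rather than the chat interface, the rough equivalent of "custom instructions" is a standing system message that rides along with every request. Here's a minimal sketch in Python; the instruction wording is illustrative, not Aditi's actual text, and the helper function is hypothetical:

```python
# A standing "custom instruction" -- prepended to every conversation
# as the system message, the way ChatGPT's custom instructions are.
CUSTOM_INSTRUCTIONS = (
    "I have a graduate degree in information sciences and a young baby. "
    "Keep replies concise and accurate."
)  # illustrative wording only

def build_messages(question, history=None):
    """Assemble a chat request with the standing instructions up front."""
    return (
        [{"role": "system", "content": CUSTOM_INSTRUCTIONS}]
        + list(history or [])
        + [{"role": "user", "content": question}]
    )

msgs = build_messages("What's something indoor and fun for a baby in Denver?")
print(msgs[0]["role"])  # system -- the instructions are sent every time
```

A message list like this is what you'd pass to a chat-completion endpoint; the point is simply that the "who I am and what I want" context is attached once and reused, rather than retyped in every query.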


SearchResearch Lessons

1. You can use an LLM as a smart collaborator that knows a lot about the world and can give useful direction.  This is probably the way I use LLMs most--as an extension of my brain.  (I recently asked Bard for [who is a famous writer on voodoo and hallucinogenic plants] when I couldn't remember a particular writer's name.  It listed the top three authors in the field, with their names and brief bios.  The first one was the author whose name I was trying to recall.)

2. As we saw, even approximations of non-English texts can lead to great answers. I'm deeply impressed that a transliterated nursery rhyme led ChatGPT to relevant information.  

3. Using an LLM in conversational mode can be incredibly useful.  Going back and forth with an LLM can focus the results to be more precisely what you seek.  

But, as always... double check your results. 



Wednesday, January 17, 2024

Answer: How do you use LLMs in your SearchResearch?

So.... 

P/C Dalle3. [an evocative picture of data, data tables, line charts, histograms]

It looks like many of us are using LLMs (Bard, ChatGPT, etc.) to ask SRS-style questions, especially ones that are a little more difficult to shape into a simple Google-style query.  

That's why I asked this week:  

1.  How have you found yourself using an LLM system (or more generally, any GenAI system) to help solve a real SearchResearch question that you've had?  

I've also gotten LLM help in finding answers to SRS questions that I couldn't figure out how to frame as a "regular" Google search.  Bard has answered a couple of such questions for me (which I then fact-checked immediately!).  While the answers might have a few inaccuracies, they're often great lead-ins to regular SRS.  

For instance, I asked Pi (their home page) a question that has been in the news recently:  "What's the percentage of homeless people in the US?"  Here's what it told me: 



My first reaction was the obvious one--that can't possibly be correct!  18%???

The strangeness of the answer drove me to do a regular Google search for some data.  At the Housing and Urban Development website I found this: "The Department of Housing and Urban Development (HUD) counted around 582,000 Americans experiencing homelessness in 2022. That's about 18 per 10,000 people in the US, up about 2,000 people from 2020."  

Or, if you do the math, that's around 0.18% of the US population.  

When you look at multiple sites, you'll keep seeing that figure "18 per 10,000 people," which I could imagine an LLM rendering as "18%" in some twisted way.  Arggh!!  
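The arithmetic is worth spelling out, since it's exactly the kind of thing you should check yourself. A quick sketch (the 582,000 figure is HUD's 2022 count; the ~330M US population is my rough assumption):

```python
# Sanity-check HUD's homelessness figure: is it 18% or 18 per 10,000?
homeless_2022 = 582_000          # HUD 2022 point-in-time count
us_population = 330_000_000      # rough 2022 US population (assumption)

rate_percent = homeless_2022 / us_population * 100
per_10k = homeless_2022 / us_population * 10_000

print(f"{rate_percent:.2f}%")          # ~0.18% -- nowhere near 18%
print(f"{per_10k:.0f} per 10,000")     # the "18 per 10,000" HUD reports
```

Two decimal shifts separate "18 per 10,000" from "18 per 100"--an easy slip for a language model (or a human) to make.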

Interestingly, when I did this on Bard and ChatGPT on the same day, they ALSO quoted 18% as the homeless rate. 

However... if you ask all of the LLMs this same question today (1 week after they told me it was 18%), they all get it right: the real percentage is 0.18%.  

That's great... but also worrying.  I'm glad they're in alignment with more official sources, but the underlying data didn't change--some training must have kicked in.  

You can't count on consistency with an LLM.  

But... I find myself increasingly turning to LLMs to answer questions that really are difficult to frame as a search query.  Previously I wrote about finding "Other words that are like cottagecore" as an example of the kind of task that an LLM can do well.  They can take fairly abstract language ideas (e.g., "what words are like these words...<insert list>") and come up with something fairly useful.  This is a great capability that will let us take SearchResearch questions in new directions.  

You can ask follow-up questions without having to include everything you've found out thus far.  THAT is a pretty big advantage, especially for more complex tasks that take more than 2 or 3 queries to answer.  

This past week I found a really interesting paper on our topic by some folks at Microsoft Research.  In their Arxiv paper, "Comparing Traditional and LLM-based Search for Consumer Choice: A Randomized Experiment" they compared ordinary search tasks where some people used a "traditional" search engine, while another set used an "LLM-based" search engine.  (The LLM was ChatGPT and was primed with an extensive prompt before people started their searching.) 

Intriguingly, they found a couple of interesting things: 

* people spent less time doing LLM search than trad search,

* they did fewer, but more complex, searches than trad search,

* on the whole, the two conditions were just about as accurate (but see below), 

* the searcher's perceptions of accuracy were pretty high, even when the answers weren't great.

They then tried a nice intervention: color-coding uncertain results in the LLM's output to see if it would make everything more accurate.  The result?  Yes, significantly so.  (Although they still needed a magic oracle to know which results were low-quality!)  

However, one of the Challenges they had their participants do was intentionally a hard question, one they knew the LLM would answer badly. But it didn't matter much--people believe LLMs beyond what they should. 

That's not a huge surprise, but a good cautionary note for us.  You still need to check to make sure that the results you're getting actually jibe with reality. 

So... what do I use LLM search for these days?  I'm asking it a lot of knowledge summarization questions.  Things like "Give me an overview of how epigenetics works in humans.  Tell me assuming I have a college level degree of genetics knowledge."  

With this prompt, ChatGPT does a pretty good job.  Here's the opening paragraph of its response: 

Epigenetics is a fascinating field of study that explores the mechanisms by which gene expression is regulated without altering the underlying DNA sequence. It plays a crucial role in development, cellular differentiation, and the adaptation of organisms to their environment. Here's an overview of how epigenetics works in humans, tailored to someone with a college-level understanding of genetics... 

On the other hand, if you ask for an explanation at a 6th grade level of biology knowledge, you'll see: 

Alright, let's imagine our body is like a huge library. This library (our body) has lots of books (cells), and each book has the same set of instructions or recipes (DNA) for making you, well, you! But even though all the books have the same recipes, not all recipes are used in every book. Some recipes are for making eye color, some for hair color, and others for how tall you'll be. Now, epigenetics is like having special bookmarks or notes in those books that decide which recipes are going to be used or ignored....

These kinds of summaries are incredibly useful, and much more sophisticated than the abstracts / snippets beneath each web result on the SERP.  The ability to ask for more-or-less detail in the prompt is incredibly useful.  

On the other hand... you still have to be careful about the simplest of things.  

If you ask Bard [ how many cups are in a gallon? ] it will reply: "There are 16 cups in a gallon. This applies to both US liquid gallons and imperial gallons, although they have slightly different volumes." 

Of course, that doesn't make sense. An Imperial gallon is 1.2 US gallons, so the two gallons can't hold the same number of US cups!  This is a classic measurement blunder: there are, in fact, 16 Imperial cups in an Imperial gallon.  (For the record: there are 19.2 US cups in an Imperial gallon.)  As always, check your facts... and check your units!  (And for the record, ChatGPT explains this: "However, it's worth noting that the imperial cup and the US cup are not the same in terms of volume: 1 Imperial gallon = 16 Imperial cups, 1 Imperial cup = 10 Imperial fluid ounces.") 
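The unit arithmetic is easy to verify yourself. A quick sketch, working in milliliters from the standard definitions of each unit:

```python
# Standard volumes in milliliters.
US_GALLON_ML = 3785.41    # 231 cubic inches
US_CUP_ML = 236.588       # 8 US fluid ounces
IMP_GALLON_ML = 4546.09   # Imperial gallon
IMP_CUP_ML = 284.131      # 10 Imperial fluid ounces

print(US_GALLON_ML / US_CUP_ML)    # 16 US cups per US gallon
print(IMP_GALLON_ML / IMP_CUP_ML)  # 16 Imperial cups per Imperial gallon
print(IMP_GALLON_ML / US_CUP_ML)   # ~19.2 US cups per Imperial gallon
```

So "16 cups per gallon" is true within each system, but mixing a US cup with an Imperial gallon gives you 19.2--which is why the units matter.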

SearchResearch Lessons

1. It's worth reiterating that you shouldn't assume that the output of an LLM is accurate.  Think of it as incredibly handy and useful, but NOT authoritative. Tattoo that on your hand if you need to, but never forget it. 

2. Always double check.  As we saw, LLMs will make bone-headed mistakes that sound good... so since you're not assuming that the output of an LLM is accurate, make sure you double-or-triple source everything. Do it now, more than ever.  



Keep searching!   

 


Wednesday, January 10, 2024

SearchResearch Challenge (1/10/24): How do you use LLMs in your SearchResearch?

 It's a New Year! 

P/C Dalle3. [an evocative picture of data, data tables, line charts, histograms]

Let's reflect on what SearchResearch is all about... 

I started this blog back in January, 2010 with About this blog--Why SearchReSearch?  That was 5,093 days ago.  

So far, there have been 1,374 posts, with 4.75M views and 13,700 comments.  We're running around 40K blog views / month, and that doesn't include the various syndicated versions of the blog.  If you include those, I'm Fermi Estimating the readership at 50K /month.  That's a decent number (roughly 1,660 people / day).   

As longtime SRS readers know, one of my goals was to use the blog as an effective prompt to write a book.  That book--The Joy of Search: A Google Insider's Guide to Going Beyond the Basics--came out in September 2019 and has done reasonably well. At least well enough to be translated into Korean and Chinese, as well as go into a paperback edition.  

As my Regular Readers also know, I'm working on another book ("Unanticipated Consequences") which I'm trying to finish up in the first part of this year, 2024.  

Writing a regular blog is a serious investment of time and energy, at least it is for me.  On average, I spend around 4-8 hours each week bringing you the most interesting SearchResearch tidbits I can find.  

But it's a big time commitment, and I'd like to use those hours to work on the new book.  

So I'm going to try an experiment--scaling back a bit, trying to do each blog post in two hours or less.  

To do that, I'm going to try to involve you Regular Readers a bit more in the search for insights.  

As we've discussed, the advent of Generative AI is promising to radically change the SearchResearch process.  It has the ability to give us new insights quickly (an example), and also to generate convincing-sounding nonsense with ease (an example).  

I know that many of you are using ChatGPT or Bard or Llama to answer questions, so I want to tap into the collective intelligence that SRS Regular Readers can bring to the discussion.  

I've asked this before in general (How can we use LLMs to search even better?) and for the specific case of medical searches (How might we best use LLMs for online medical research?).  But if there's one thing to know about LLMs and Generative AI in general, it's that this field is changing fast--really fast--fast as a flash rifling down the electric blue gun barrel into a sea of new-age synth-pop psychedelia.  

So... 

Our SearchResearch Challenge of the week is to revisit this: 

1.  How have you found yourself using an LLM system (or more generally, any GenAI system) to help solve a real SearchResearch question that you've had?  

 

I've told you before that I've used ChatGPT to help me with some data table manipulation. I've solved a few problems in a minute that would have taken me an hour or more to do with my regular programming skills.  Those are huge savings. 

I've also gotten LLM help in finding answers to SRS questions that I couldn't figure out how to frame as a "regular" Google search.  Bard has answered a couple such questions for me (which I then fact-checked immediately!).  

We all want to know what you've found to be useful.  What has worked for you in the past few months?  

Alternatively, what has NOT worked out for you?  

We're all ears here at SRS! 

Keep searching. 


P/C Duet AI (Google Slides) [kangaroo on top of piles of text listening ]