Tuesday, October 31, 2023

Answer: Three little Swiss mysteries?

Living anywhere new inevitably leads to discoveries, and to the recognition that the world is larger and more interesting than you might have thought.


As mentioned, I'm here in Switzerland for four months. Even though I've been here before many times, the extended time period of living in one distinctly different place is proving to be eye-opening.  

But as a person who's relatively new to life in Zürich, I've found a few things that are charming and puzzling at the same time.  Can you help me figure out what's going on in each of these cases? 

1. Why are these eggs colored orange/yellow?  I bought them in the local grocery store where they were sitting out on the shelf, unrefrigerated.  The label on the container says Schweizer Eier (Swiss eggs).  I've seen many different colored eggs from friends who have chickens (blue, green, brown, some with spots); still, this is an extraordinary color.  What's the story here?  What kind of chicken would produce these eggs?  


I was thinking that these eggs were naturally this color. After all, eggs DO come in a bunch of different colors!  I knew that, in the US at least, chicken eggs can show up in a variety of shades:
 
Photo by Justin Pius, NRCS

So I was prepared to think that maybe Swiss chickens are simply more colorful!  

I spent a lot of time searching for colored eggs, learning that I needed to include chicken in my searches, as there are a wild number of different kinds of eggs from a zillion kinds of birds, some of which are extraordinary. (See the Science News article about different egg colors.)  

I also have to admit to spending at least 30 minutes poking around, looking at many pages, and NOT finding anything useful. Oh, I found a lot, but eggs colored for Easter and naturally dyed eggs were all over the results.  Here's a sample of what I was seeing: 


I was just scrolling around, trying to figure out what to do, when I noticed one article titled "Look! Colored Eggs in Swiss Supermarket."  That's what I was looking for!  But the article mentioned that "colored eggs... are so nice for picnics."  Huh?  You wouldn't take raw eggs on a picnic, right?  

So I changed my query to a question using that as inspiration: 
  
     [ Why are there colored eggs in Swiss supermarkets? ] 

And very quickly learned from a Reddit post that "In Switzerland, grocery stores sell painted hard-boiled eggs in order to differentiate between the fresh and hard-boiled ones."  (SRS tip: Google is much, much better at answering free-form questions, independent of the LLM work.) 

Now Reddit is fun, but a bit untrustworthy.  However, when I double-checked, I found multiple sources telling me this.  Grocery stores in Switzerland DO sell hard-boiled eggs in a variety of colors (these red/gold eggs are from one particular store--different stores will have different colors)! 

I checked on the package that held the eggs and found (in fairly small font!) that they are "Aus Freilandhaltung * gekocht * gefärbt."  I should have looked more carefully. Google Translate tells me that this means "Free range * cooked * colored."  

Ah.  Got it.

Lesson learned: Read the package (even the small print) first.  Also learned that sometimes you'll learn the crucial tip by just scanning the results!  

2. While on a hike in the Alps (near Rigi Scheidegg, if that helps), I came across this flag--and I have no idea what kind of a flag this is. What does it represent?  (It might help to know that the Swiss are a little flag-crazy. There is traditional Swiss flag-tossing (a kind of bucolic, even serene sport... watch the video), and flags seem to abound.)  Given the level of vexillological interest here, it must signify something, but what?  


This has turned out to be very hard--I'm not sure I have the right answer.  I tried all of the obvious image search tools (Google Image Search / Lens; Tineye; Bing Image Search; Yandex Image Search), but none of them gave me anything good.  

I tried various descriptions of the flag (four white hearts, four-leaf clover on a red background, etc.), but nothing really worked.  I searched for versions of these terms with words like "logo" or "flag" or "emblem" or "sigil" or "device"... but I didn't get very far.  SRS Reader Paul L tried the specific flag search engine FlagID.org (that's a new one to me... nice find, Paul), but to no avail.  

I figured it was a cantonal flag or arms--but I checked those as well--no dice.  

Paul also "... finally loosened the search to only include red flag with white and scanned to see the much sharper edges of the Maltese cross on the Bardonnex Commune (Switzerland) flag."  While that's really close, it's not quite the same thing.  


I even made a fairly high-resolution image of the flag and tried to Google Lens (and Bing, and Tineye) this: 





But this didn't really work either.  I found some near hits: 

- a UK company called Schmecken that has a logo very much like this: 

- a Finnish social and health security association on Twitter/X called Pohjois-Pohjanmaa: 

- a car company, Autoclover


But nothing that was flag-like and Swiss.  Is it possible that this is a one-off custom flag?  

Then I was walking down the street in the town of St. Gallen (in northern Switzerland) and saw this stand-up box advertising the Swiss national lottery, Swisslos: 


Which is the closest Swiss 4-lobed clover-like bit of iconography I can find.  (Later I went back and looked deeper in my search-by-image results and found the Swisslos logo.  Always go deeper.)  

But it's not an exact match.  People who make flags are pretty picky about the details of their design.  (See the brilliant Roman Mars TED talk about flag design. 18 minutes that will change the way you look at flags.)  

So I'm not convinced we know the answer.  We're going to have to leave this as an open Challenge for the moment.  I'll keep looking, and you, my Regular Readers, should do the same.  IF you see it, post a comment here so we'll all know what it actually is.  (And I, for my part, if I go back to that part of the Alps, I'll find the owner and ask!)  


3.  I've seen some interesting vegetables in farmer's markets before, but this one seems very Seussian to me.  What ARE these things? How would I eat one?  


This one was easy: looks like a weird cabbage, smells like a cabbage, so my query was: 

     [ cone shaped cabbage ] 

Which rapidly told me that this cabbage has a number of names: conehead, pointed, arrowhead, and sweetheart cabbage.  It's described as having "... leaves, with variations of pea green colorings, are thin, broad, deeply veined, tightly enveloped lengthwise and bluntly pointed. The flavor of Conehead cabbage is mild and remarkably sweet, void of that bold cruciferous flavor that is most reminiscent of cabbage."

Naturally, I bought one for research purposes and ate it for most of the week.  It is, indeed, sweetly flavored and is a lovely thing to have on your plate.  (I just sautéed/steamed mine with a little olive oil, garlic, and salt.  Yum!) 


SearchResearch Lessons 

1. Mind your assumptions!  In the colored egg Challenge I had assumed that the eggs in question were naturally colored like that.  I was prepared to learn that Swiss chickens are some interesting breed that lay technicolor eggs.  It took me a while to undo that assumption and figure out that they're dyed eggs.  

2. Even unreliable sources can be useful.  I found that Reddit post about hard-boiled Swiss eggs to crack the case (so to speak), but I know that Reddit can be unreliable. So when I checked, I was pleased to find MANY sources confirming that colored eggs are hardboiled, just like this detective.  

3. Read the fine print.  I skipped the fine print on the package partly because it was small and partly because it was in German, and while I can read lots of German, I didn't know what Aus Freilandhaltung meant, so I stopped reading.  FWIW, I know what gekocht and gefärbt mean... but I'd stopped reading too early.  

4. Some Challenges don't come easily. The identity of the flag has not yet been cracked!  It's an open case.  Sometimes, that's the way it goes.  


Keep searching! 


Tuesday, October 24, 2023

SearchResearch Challenge (10/25/23): Three little Swiss mysteries?

 I'm currently living and teaching in Zürich until the end of the year, 


.. here in the heart of Switzerland. I'm teaching for this semester at the University of Zürich--and you only get one guess as to what I'm teaching.  That's right, I'm here to teach the course on Human-Computer Interaction and AI over the 14-week semester.

(In other words, what should we be doing to design and build AI systems so that people can understand and use them?  Hence last week's Challenge about getting LLMs to be useful in search tasks.)

But as a person who's relatively new to Switzerland, I've found a few things that are charming and puzzling at the same time.  Can you help me figure out what's going on in each of these cases? 

1. Why are these eggs colored orange/yellow?  I bought them in the local grocery store where they were sitting out on the shelf, unrefrigerated.  The label on the container says Schweizer Eier (Swiss eggs).  I've seen many different colored eggs from friends who have chickens (blue, green, brown, some with spots); still, this is an extraordinary color.  What's the story here?  What kind of chicken would produce these eggs?  


2. While on a hike in the Alps (near Rigi Scheidegg, if that helps), I came across this flag--and I have no idea what kind of a flag this is. What does it represent?  (It might help to know that the Swiss are a little flag-crazy. There is traditional Swiss flag-tossing (a kind of bucolic, even serene sport... watch the video), and flags seem to abound.)  Given the level of vexillological interest here, it must signify something, but what?  


3.  I've seen some interesting vegetables in farmer's markets before, but this one seems very Seussian to me.  What ARE these things? How would I eat one?  



Of course, we want to know how you found the answers to each Challenge.  Share your search tricks with us!  


Keep searching!  



Wednesday, October 18, 2023

Answer: How might we best use LLMs for online medical research?

 Can we count on LLMs for medical advice? 

Illustration by DALL-E of a medical scientist doing research

And it's pretty clear that LLMs and associated generative AI tools (see the illustration above, which is a lovely generated image by DALL-E) can come in pretty handy when given decent prompts.  

Our SearchResearch Challenge for this week was an open-ended Challenge intended to probe the limits and capabilities of LLMs for high-stakes research.   

Here was the Challenge for the week: 

1.  What good advice can we give a person who wants to use an LLM for high-stakes research into a medical question?  We know people are doing this, and all of our ranting about LLMs won't stop them--but can we find ways of using LLMs and specific LLM/AI tools to improve the search process?  What would you say to people who are doing this?  (Aside from "be very, very careful.")  

I found some pretty serious problems in asking LLMs for medical advice.  The overall impression I have is that LLMs just are NOT ready for use in medical searching.  There are other possible uses, but asking an LLM for medical information is basically rolling the dice--you might get good info, and you might get terrible info.  My recommendation: 

DON'T ASK ANY LLM FOR MEDICAL INFORMATION,
especially diagnostic and treatment information  


I spent several hours testing out different kinds of prompts, testing out several popular LLMs.  This isn't all of the testing I did, but I'm summarizing so it doesn't go on forever and ever.  

I'm grouping my comments by three main types of failure: 

Citations are wrong (and if they're made up, what else is made up?): 

Bard:   

Prompt to Bard: "What is the latest literature on Glucagon-Like Peptide-1 Receptor Agonists (GLP-1 RAs)? Give citations for articles you mention." 

Bard totally made up citations.  Complete BS.  Example: Here is a purported summary of some of the latest literature on GLP-1 RAs, with citations for the articles mentioned:  

Bard: GLP-1 RAs may reduce the risk of all-cause death in people with type 2 diabetes. A recent study published in the journal Diabetes Care found that GLP-1 RA users had a 14% lower risk of death from all causes than people who did not use GLP-1 RAs. It suggested a citation: Zhang H, et al. Association of Glucagon-Like Peptide-1 Receptor Agonist Use With All-Cause Mortality in Patients With Type 2 Diabetes. Diabetes Care. 2023;46(3):503-511.

But this is utterly wrong:  No such article exists.  Given that, do you still believe the assertion that GLP-1 RA users have a 14% lower risk of death?  Where did it pull that from?  I have no idea, and neither does Bard.  

GPT-4: When I gave the same prompt to GPT-4, it gave real citations (after disclaiming that it was only up-to-date as of Jan 2022).  Interestingly, it did not try to summarize the literature (as Bard did), but just gave a few citations.  It also gave good advice about how to further the research task: "I recommend using databases like PubMed or Google Scholar and searching for terms related to GLP-1 RAs."   That's pretty reasonable. 

Claude:  Same prompt, but very different citations.  That's fine, they're real citations, but the summaries of the articles were fairly off--they seemed to summarize different papers than the ones they linked to!  What's up with that? 
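One way to check a citation like the one Bard produced is to ask PubMed whether any article carries that title. Here's a minimal sketch, assuming the NCBI E-utilities ESearch endpoint and its JSON reply format. The helper names (build_title_query, hit_count, title_exists) are mine, invented for illustration, and a zero-hit result is only a warning sign, not proof, since exact-title matching can be picky.

```python
import json
import urllib.parse
import urllib.request

# NCBI E-utilities search endpoint (an assumption of this sketch).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_title_query(title: str) -> str:
    """Build an ESearch URL that looks for the quoted title in PubMed."""
    params = {"db": "pubmed", "term": f'"{title}"[Title]', "retmode": "json"}
    return ESEARCH_URL + "?" + urllib.parse.urlencode(params)


def hit_count(esearch_json: str) -> int:
    """Pull the number of matching records out of an ESearch JSON reply."""
    return int(json.loads(esearch_json)["esearchresult"]["count"])


def title_exists(title: str) -> bool:
    """Hypothetical helper: True if PubMed reports at least one match.

    Needs network access; treat False as "go look by hand," not as proof.
    """
    with urllib.request.urlopen(build_title_query(title), timeout=10) as resp:
        return hit_count(resp.read().decode("utf-8")) > 0
```

Calling title_exists() with the article title an LLM hands you gives a quick first check before you go looking for the paper by hand in PubMed or Google Scholar.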


Output is inconsistent with itself: 

Bard:  Here I'm intentionally asking a question that a patient might ask, "why is skin cancer more deadly than liver cancer?" knowing full well that this is the opposite of the facts.  

Prompt to Bard: "Why is skin cancer more deadly than liver cancer?"  

Bard:  Skin cancer is not more deadly than liver cancer overall. In fact, the 5-year survival rate for all skin cancer is 99%, while the 5-year survival rate for liver cancer is 19.6%. However, there is one type of skin cancer, melanoma, which is more deadly than liver cancer. Melanoma has a 5-year survival rate of 92.3%, which is lower than the 5-year survival rate for liver cancer.

 

Bard got the first part right, but the assertion that "melanoma has a 5-year survival rate of 92.3%, which is lower than the 5-year survival rate for liver cancer" doesn't make sense (how is 92.3 lower than 19.6?).  What's up with that?  
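The contradiction is easy to check mechanically. Here's a tiny sketch that takes the survival rates exactly as Bard quoted them (I'm not vouching for the numbers themselves) and tests the claim Bard made about them. The dictionary and the is_lower helper are just illustrative names.

```python
# Five-year survival rates (percent) exactly as quoted in Bard's answer.
# These are Bard's figures, not independently verified statistics.
survival = {"all skin cancer": 99.0, "melanoma": 92.3, "liver cancer": 19.6}


def is_lower(a: str, b: str) -> bool:
    """Return True if the quoted survival rate for `a` is lower than for `b`."""
    return survival[a] < survival[b]


# Bard's claim: melanoma's survival rate is lower than liver cancer's.
claim_holds = is_lower("melanoma", "liver cancer")
print(f"Melanoma rate lower than liver cancer rate? {claim_holds}")
```

By Bard's own numbers the claim fails, which is exactly the kind of internal inconsistency worth looking for in any LLM answer that quotes statistics.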

GPT-4:  Once again, ChatGPT-4 gives a pretty reasonable answer. First, it questions my assumption (that skin cancer is more deadly), and then lists reasons why people might consider some types of cancers more deadly than others.  It's quite a reasonable argument, and ends with reasonable advice about differentiating between different kinds of cancers when discussing prognosis and mortality.  

Claude:  Gives an answer much like Bard, but doesn't mix up the stats.  It's actually mostly correct. 


Don't trust reasoning of LLMs: 

I asked each of the 3 LLMs this question: "If it takes 3 days to clear up an infection with a 10 unit dose of an antibiotic, how long will it take if I use 30 units per day?" 

This, of course, is a terrible idea. Don't fool around with treatment plans!  

But when I asked Bard, it gave a quite reasonable framing of the question...  

Bard:  

It is important to follow the prescribed dosage of antibiotics to ensure their effectiveness and avoid potential adverse effects. Taking a higher dose than recommended can lead to various complications and may not necessarily clear up the infection faster.

In this case, if a 10-unit dose of an antibiotic is prescribed to clear up an infection in 3 days, it is advisable to stick to that dosage and duration. Increasing the dose to 30 units per day could potentially lead to adverse effects without necessarily shortening the treatment duration.

 

GPT-4: Gives a rather similar reply: 

I must preface my response by emphasizing that you should never adjust medication doses without consulting a healthcare professional. Self-adjusting doses can lead to unwanted side effects, antibiotic resistance, or ineffective treatment.

That said, based solely on the information you've provided and treating it as a theoretical exercise, the relationship between dosage and time isn't necessarily linear. The effectiveness of antibiotics depends on factors like... 


But Claude totally bobbled this one with an answer that's long on math and utterly missing common sense reasoning about drug therapy.  

Claude: 

Okay, let's think through this step-by-step: 

It takes 3 days to clear up the infection with a 10 unit daily dose of the antibiotic

So over the 3 days, the total dose is 3 * 10 = 30 units

Now we are using 30 units per day instead of 10 units per day

Since the total dose needed to clear the infection is 30 units:

If we give 30 units per day, it should clear the infection in 1 day

So in summary:

Total dose needed is 30 units

At 10 units per day, it takes 3 days

At 30 units per day, the total dose of 30 units is given in 1 day

Therefore, if the daily dose is increased to 30 units per day, the infection should clear up in 1 day.

Which is a terrible idea. 
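To see where Claude went wrong, it helps to write down the arithmetic it actually did. This sketch reproduces the hidden assumption (that curing the infection takes a fixed total number of units, however fast you take them). It is NOT a model of how antibiotics work. As GPT-4 pointed out, the relationship between dose and time isn't necessarily linear. The function name is mine.

```python
def naive_days_to_clear(original_daily_dose: float,
                        original_days: float,
                        new_daily_dose: float) -> float:
    """Reproduce the fixed-total-dose arithmetic from Claude's answer.

    Assumes (wrongly, for real drug therapy) that a fixed total number of
    units clears the infection regardless of how quickly it is given.
    """
    total_units = original_daily_dose * original_days  # 10 * 3 = 30 units
    return total_units / new_daily_dose                # 30 / 30 = 1 day


print(naive_days_to_clear(10, 3, 30))  # prints 1.0
```

The math is fine; the model behind it is the problem. Nothing in the question justifies treating the total dose as the only thing that matters, and that's the common-sense step Claude skipped.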


You see where this is headed: Each of the LLMs has a very different training set, sets of guidelines, and text production models.  

In general, GPT-4 gives much more extensive answers to questions, and avoids summarizing technical medical texts.  I wasn't able to trip it up (but I didn't really push on the boundaries).  

By contrast, Bard always gives much shorter answers, frequently hallucinating reports and studies. 

And Claude is somewhere in the middle with middle-length texts, but will give answers that defy common sense.  (Don't ever modify your drug plan without talking with your physician!)  


What can we do that is useful?  

My direct advice would be to NOT ask for a diagnosis or suggestions for treatment.  That's probably still dangerous. 

On the other hand... I did find some value in asking LLMs for descriptions of medical conditions, therapies, and medicines.  This makes sense because there is a lot of training data out there on such topics.  So asking questions like "what is the IT band?" or "what are some side effects of taking cough medicine?" can be useful.  


SearchResearch Lessons 

I could go on and on, showing all kinds of subtle errors and mistakes that the models make.  

1. Bottom line: They're not ready for medical advice.  They might be in the future, but at the moment, they're all too able to make errors that could be serious. 

2. They all give very different kinds of answers.  GPT-4 tends to go into enormous detail about a condition.  If that's what you want, go there.  For shorter answers at a slightly easier-to-read level, visit Bard.  

3. Compare and contrast.  All of the LLMs (including ones I don't mention here) have rather different outputs.  It's worth looking at them side-by-side.  

4. Don't forget your ordinary search skills.  Do I really need to say this?  Fact-check everything, and look for highly reliable sources in your quest.  

Keep searching!  

Wednesday, October 4, 2023

SearchResearch Challenge (10/4/23): How might we best use LLMs for online medical research?

LLMs are all the rage these days... 

Illustration by DALL-E of a medical scientist doing research

Some studies show that office workers who use LLMs (e.g., Google's Bard or OpenAI's ChatGPT) are more productive... on the tasks that they measured.  (At least that's the conclusion from MIT's Department of Economics.)  

And it's pretty clear that LLMs and associated generative AI tools (see the illustration above) can come in pretty handy when given decent prompts.  

Our SearchResearch Challenge for this week is an open-ended Challenge.  For these Challenges, I usually know what the "right" answer is and how to find it.  

But this week, this really is a call for the SRS community to help identify ways of using LLMs to do a very specific kind of research--a kind that we care a lot about--medical research.  Here's the Challenge for the week: 

1.  What good advice can we give a person who wants to use an LLM for high-stakes research into a medical question?  We know people are doing this, and all of our ranting about LLMs won't stop them--but can we find ways of using LLMs and specific LLM/AI tools to improve the search process?  What would you say to people who are doing this?  (Aside from "be very, very careful.")  

Maybe the point of using an LLM in "medical research" mode is just to help us to find really great search terms to use in regular web-search.  I'm not sure. But if so, let's figure that out. Are there other ways of using LLMs to get better results faster?  Or will the search for truth in LLMs be a bust?  

I'm focusing on medical research because it is high stakes: it's not okay to just skate on past little errors and unfortunate phrasing.  

A big problem here is that most of us are not medical experts, so it's going to be difficult to validate what the LLM tells you.  

My suggestion: Try doing a few medical search tasks in a medical area that you know something about.  Nearly everyone has been forced to become a mini-expert in some medical area.  I'm a mini-expert in a couple of really unusual types of cancer (family members, not me!) because I had to learn enough to make good recommendations to them.  So I can explore what an LLM tells me about that with some confidence that I'll have a basic understanding of an esoteric area.   

I'll be leaving comments in the thread this week (more than usual) as I work on this question as well.  

Let's figure out what advice we can give to LLM users.  

When you write your comments in the thread, be sure to tell us which LLMs you've been using.  (There are a lot of them out there, all with very different behaviors.  Let's be sure to keep our results limited appropriately.)  

And be sure to include the search path you followed (which now will include LLM prompts).  Mention the dead ends you found and strategies that just don't seem to work.  Negative results are really important here!  

Keep searching!