Monday, May 22, 2023

SearchResearch on a podcast--"The Informed Life"--Listen now

Podcasts are great! 


You can be out running, walking the dog, getting in your 10,000 steps and still listen and learn about the world.  

A few weeks ago I was interviewed by Jorge Arango, host of "The Informed Life" podcast. The episode has just dropped on his channel as Episode 114, Dan Russell on the Joy of Search.  (The transcript can be found here.)  

In this episode, we talk about SearchResearch and my book.  I tell a few stories (some of which SRS Regulars will recognize), and wax wise about what makes someone a good searcher (and what doesn't).  

Take a listen, and let us know what you think!  


Keep searching.  




Wednesday, May 10, 2023

Taking a bit of a break...

 It's May... 

P/C Dan, from a lovely trip to Norway.

... and it has been a very full few months since the start of the year.  I'm feeling in dire need of a break, so I'm going on a short hiatus for a little R&R.  

My plan is to catch up on a bunch of reading (mostly non-work), go on walkabout, become a bit of a flâneur.  Eat some great food, travel hither and yon... mostly yon.  

If something remarkable catches my eye, I might pop back into SRS for a quick note--but the current plan is to come back here on May 31st with some new observations about the world of online research and the general joy of finding things out.   

Have a great month of May!  (A little musical commentary...)  

Keep Searching!  

P/C Dan. The master state of mind being sought. 


Wednesday, May 3, 2023

Answer: How well do LLMs answer SRS questions?

 Remember this? 

P/C Dall-e. Prompt: happy robots answering questions  rendered in a
ukiyo-e style on a sweeping landscape cheerful


Our Challenge was this:  

1.  I'd like you to report on YOUR experiences in trying to get ChatGPT or Bard (or whichever LLM you'd like to use) to answer your curious questions.  What was the question you were trying to answer?  How well did it turn out?  

Hope you had a chance to read my comments from the previous week.  

On April 21 I wrote about why LLMs are all cybernetic mansplaining--and I mean that in the most negative way possible.  If mansplaining is a condescending explanation about something the man has only incomplete knowledge of (delivered with the mistaken assumption that he knows more about it than the person he's talking to), then that's exactly what's going on here, cybernetically.  

On April 23 I wrote another post about how LLMs seem to know things, but when you question them closely, they don't actually know much at all.  

Fred/Krossbow made the excellent point that it's not clear that Bard is learning.  After asking a question, then asking a follow-up and getting a changed response: "Bard corrected the response. What I now wonder: will Bard keep that correction if I ask later today? Will Bard give the same response to someone else?" 

It's unclear.  I'm sure this kind of memory (and gradual learning) will become part of the LLMs.  But at the moment, it's not happening.   

And that's a big part of the problem with LLMs: We just don't know what they're doing, why, or how.  

As several people have pointed out, that's true of humans as well.  I have no idea what you (my dear reader) are capable of doing, or whether you're learning or not... but I have decades of experience dealing with other humans of your make and model, and I have a pretty good idea of what a human's performance characteristics are.  I don't have anything similar for an LLM.  Even if I spent a lot of time developing one, it might well change tomorrow when a new model is pushed out to the servers.  Which LLM are you talking to now?  

P/C Dall-E. Prompt: [ twenty robots, all slightly different from each other, trying to answer questions in a hyperrealistic style 3d rendering ]

What happens when the fundamental LLM question-answering system changes moment by moment?  

Of course, that's what happens with Google's index.  It's varying all the time as well, and it's why you sometimes get different answers to the same query from day-to-day--the underlying data has changed.  

And perhaps we'll get used to the constant evolution of our tools.  It's an interesting perspective to have.  

mateojose1 wonders whether, if LLMs are complemented by deep knowledge components (e.g., grafting on Wolfram Alpha to handle the heavy math chores), we'll THEN get citations.  

I think that's part of the goal.  I've been playing around with the Scite.ai LLM for the scholarly literature (think of it as ChatGPT trained on the contents of Google Scholar).  It's been working really well for me when I ask it questions that are "reasonably scholarly," that is, questions that published papers might actually address.  I've been impressed with the quality of the answers, along with the lack of hallucination AND the presence of accurate citations.  

This LLM (scite.ai) is so interesting that I'll devote an entire post to it soon.  (Note that I'm not getting any funding from them to talk about their service.  I've just been impressed.)  

As usual, remmij has a plethora of interesting links for us to consider.  You have to love remmij's "robots throwing an LLM into space" Dall-E images. Wonderful.  (Worth a click.) 

But I also really agree with the link that points to Beren Millidge's blog post about how LLMs "confabulate not hallucinate."  

This is a great point--"hallucination" really means experiencing an apparent sensory perception of something that isn't actually present.  "Confabulation," on the other hand, happens when someone is unable to explain or answer a question correctly, but answers anyway.  The confabulator (that's a real word, BTW) literally doesn't know whether what they're saying is true, but goes ahead regardless.  That's much more like what's going on with LLMs.  


Thanks to everyone for their thoughts.  It's been fun to read them the past week.  Sorry about the delay.  I was at a conference in Hamburg, Germany.  As usual, I thought I would have the time to post my reply, but instead I was completely absorbed in what was happening.  As you can imagine, we all spent a lot of time chatting about LLMs and how humans would understand them and grow to use them.  

The consensus was that we're just at the beginning of the LLM arms race--all of the things we worry about (truth, credibility, accuracy, etc.) are being challenged in new and slightly askew ways.  

I feel like one of the essential messages of SearchResearch has always been that we need to understand what our tools are and how they operate.  The ChatGPTs and LLMs of the world are clearly new tools with great possibilities--and we still need to understand them and their limits.  

We'll do our best, here in the little SRS shop on the prairie.  

Keep searching, my friends.  



Sunday, April 23, 2023

Comment: LLMs, truth, and consistency--they don't have any idea

LLMs have a big problem: 

P/C Dall-E. Prompt: "happy robots talking about truth"  

They have zero idea what's true.  They only know the probabilities of words in text. That's NOT the same thing as "knowing" something--it's a bit like knowing that "lion" is the most likely word following "king of the jungle..." without having any idea about monarchies, metaphor, or what a king really is all about.  
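
If you want a concrete feel for what "only the probabilities of words" means, here's a toy sketch in Python (my illustration only--real LLMs use enormous neural networks rather than raw counts, but the spirit is the same).  It counts which word follows each two-word context in a tiny made-up corpus, then parrots back the most common continuation:

    from collections import Counter, defaultdict

    # A tiny, made-up corpus. A real LLM trains on vastly more text with a
    # neural network, but the core idea is similar: learn which tokens tend
    # to follow which contexts.
    corpus = ("the lion is the king of the jungle "
              "the king of the jungle is the lion "
              "the king of england lives in a castle").split()

    # Count which word follows each two-word context.
    follows = defaultdict(Counter)
    for a, b, c in zip(corpus, corpus[1:], corpus[2:]):
        follows[(a, b)][c] += 1

    def most_likely_next(a, b):
        """Most probable next word for the context (a, b), with its probability."""
        counts = follows[(a, b)]
        word, n = counts.most_common(1)[0]
        return word, n / sum(counts.values())

    print(most_likely_next("of", "the"))    # ('jungle', 1.0) -- pure statistics
    print(most_likely_next("the", "king"))  # ('of', 1.0)
    # Nothing in here "knows" what a king, a lion, or a jungle actually is.

That little model will happily produce the most likely next word forever without ever holding a single fact.  Scale it up enormously and you get fluent text--but still no built-in notion of truth.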

The folks at Oxford Semantic Technologies wrote an interesting blog post about LLMs and finding verifiable facts.  They call the fundamental problem the "Snow White Problem."  The key idea is that LLMs don't really know what's true--they just know what's likely.  

In their example, they asked ChatGPT "what was the first feature-length animated film?" ChatGPT confidently said that it was the 1937 film "Snow White."  

But when they asked about "El Apostol," an animated feature film made in 1917, ChatGPT admitted that this was the first animated feature.  

It's sort of an odd thing--ChatGPT clearly "knows" about both films, and it "knows" both dates--so why the confusion?  

Because LLMs don't "know" anything in the ordinary sense of "knowing"--they've just got that text-probability data.  That is, inconsistent knowledge just isn't a thing for LLMs.  Coherence within a conversation doesn't really matter much to them.  If you ask ChatGPT: 

Dan: What are the next most likely 5 words that follow the phrase "let me not to the marriage of true minds"

ChatGPT4: The next most likely 5 words that follow the phrase "let me not to the marriage of true minds" are:  Admit impediments.

Well, that's just plain wrong. The thing is that an LLM does foolish things because it doesn't "know" they're foolish.  I asked for 5 words, not 2.  

On the other hand, Bard did much better with: 

Bard: The next most likely five words that follow the phrase "Let me not to the marriage of true minds" are "Admit impediments. Love is not," which is from William Shakespeare's Sonnet 116. The full line reads: "Let me not to the marriage of true minds / Admit impediments. Love is not love." 

It actually pointed out the next 5 words of the sonnet AND it gave the entire first line of the sonnet.  Did it just get lucky?  Or is there an internal consistency thing at work?  

So I thought I'd try a tack similar to what Oxford Semantic Tech did in their post and ask what the first short animated film was.   

I asked Bard: 

Dan: What was the first animated short?

Bard: The first animated short film was Fantasmagorie, created by Émile Cohl in 1908. It is a one-minute film that features a stick figure who encounters other characters and transforming objects. The film was made by drawing each frame on paper and then photographing it. Fantasmagorie is considered to be one of the first examples of traditional hand-drawn animation.

Interesting.  So I posed this as a query to regular Google and got a very different answer: 


 
Interesting.  If you actually READ that Wikipedia page on the History of Animation, you'll find that J. Stuart Blackton created an animated film, "The Enchanted Drawing," in 1900 that combined hand-drawn images with live action.  A few years later, in 1906, he produced another film, "Humorous Phases of Funny Faces," with even more of what we'd think of as animation.  

I went back to Bard and asked: 

Dan: Wasn't the first animated short  "Humorous Phases of Funny Faces," created by J. Stuart Blackton in 1906?

Bard: Yes, you are correct. Humorous Phases of Funny Faces, created by J. Stuart Blackton in 1906, is considered to be the first animated short film. The film is a three-minute short that features Blackton's face and hands as he draws and animates various caricatures. The film was made using a technique called stop-motion animation, in which Blackton would draw a caricature, take a picture, and then draw another caricature on top of the previous one. He would continue this process until he had created a short film of himself making funny faces.

I apologize for the previous mistake. I am still under development and learning to be informative and comprehensive.

Points for politeness, but the "Snow White" effect points out another instance in which the LLM doesn't know its own mind.  Did it really change its mind?  

One more attempt: 

Dan: And didn't J. Stuart Blackton produce an animated short in 1900?  That would be even earlier, yes? 
 
Bard: That's correct. J. Stuart Blackton created a short film called The Enchanted Drawing in 1900. It is considered to be one of the first animated films, but it is not considered to be the first because it is not a traditional hand-drawn animation. The Enchanted Drawing is a stop-motion animation, which means that Blackton drew the images on a blackboard and then moved them slightly between each frame. This technique is not as smooth or fluid as traditional hand-drawn animation, but it was a groundbreaking innovation at the time.

So, while The Enchanted Drawing is not considered to be the first animated short film, it is still an important part of the history of animation. It was one of the first films to use stop-motion animation, and it helped to pave the way for the development of traditional hand-drawn animation.

Contradictory, no?  

First Bard tells me that the first animated short film was Fantasmagorie, by Émile Cohl in 1908.  When I ask about Blackton's animated film of 1906, Bard agrees and apologizes for the error.  

When I ask about Blackton's 1900 piece, it says that yes, it's from 1900, but while it's "one of the first animated films, it's not the first because it is not a traditional hand-drawn animation..."  Wait a second!  Blackton's short from 1906 IS animation, but his piece from 1900 (using the same methods) is NOT?  

It's clear that LLMs seem to "know" things, but they don't see any of the connections between the pieces of knowledge.  It is as though everything is dissociated; and since everything is separate, contradictions and internal inconsistencies don't trouble it at all.  
 
To understand that something is inconsistent means that you (or the LLM) recognize that there are different statements about something, and that they cannot all be true simultaneously.  In this case, one of these animated shorts is the "first."  Was it Blackton's 1900 "The Enchanted Drawing," or Blackton's 1906 work "Humorous Phases of Funny Faces," or was it Cohl's 1907 "Japon de fantasie," or Cohl's 1908 "Fantasmagorie"?  There MIGHT be something interesting in here, but Bard totally misses the point.  

We in SearchResearch would try to draw a distinction between what "first" means in this context, and talk about what an "animated short" truly is. But that's not a conversation an LLM can have.  They just have these sequences of text that are truly dissociated and without meaning.  

Of course, Oxford Semantic Technologies' solution would be to have us refer to a knowledge graph that holds assertions in a meaning-bearing representation.  In such a knowledge graph, contradictions would be easy to detect--one of the points of having a knowledge graph is that it's an authoritative representation that you can search and reason about.  If there's a contradiction in the graph, you can find it easily.  
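
To make that concrete, here's a minimal sketch (mine, not Oxford Semantic Technologies' actual system) of a toy knowledge graph of subject / predicate / object assertions, plus a check on a predicate that only one subject should be able to hold:

    from collections import defaultdict

    # Assertions as (subject, predicate, object) triples -- the usual knowledge-graph shape.
    assertions = [
        ("Fantasmagorie",                  "is_first_animated_short", True),
        ("Humorous Phases of Funny Faces", "is_first_animated_short", True),
        ("Fantasmagorie",                  "release_year", 1908),
        ("Humorous Phases of Funny Faces", "release_year", 1906),
    ]

    # "is_first_animated_short" should hold for at most one subject;
    # collect every subject that claims it and flag any overlap.
    holders = defaultdict(list)
    for subj, pred, obj in assertions:
        if pred == "is_first_animated_short" and obj:
            holders[pred].append(subj)

    for pred, subjects in holders.items():
        if len(subjects) > 1:
            print(f"Contradiction: {pred} is asserted for {subjects}")
            # A real reasoner could go further and note that the 1906 film
            # predates the 1908 one, so at most the earlier claim can stand.

An LLM, by contrast, has no place to even state the "at most one first film" constraint, let alone check it.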

That's a laudable goal.  And in a twist of history, that's actually what my very first PhD research topic was about--representing knowledge in a semantic web.  Such representations are great and have many fine properties, but they're difficult to maintain and keep consistent.  Wonderful tools, but still probably in the future.  

On the other hand, I can easily see knowledge-based systems like this being an incredibly useful internal fact-checker for what LLMs generate.  Imagine a knowledge-based system working hand-in-hand (code-in-code?) with an LLM text-generator.  There's real possibility of power there.  (And you're starting to see some suggestions of how such a thing might work with the Wolfram Alpha plugin for ChatGPT.)  
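
Here's roughly the shape such a pairing might take--a hand-wavy sketch, not a description of any real product.  The LLM drafts an answer, a claim extractor pulls out checkable statements, and a curated knowledge base confirms, contradicts, or shrugs at each one.  Every function and fact below is invented for illustration:

    # Hypothetical pipeline: generate text, then check each factual claim
    # against a curated knowledge base before showing it to the user.

    KNOWN_FACTS = {
        ("Fantasmagorie", "release_year"): 1908,
        ("El Apostol", "release_year"): 1917,
    }

    def extract_claims(text):
        """Stand-in for a real claim extractor (itself a hard NLP problem)."""
        return [("Fantasmagorie", "release_year", 1907)]  # a deliberately wrong claim

    def check_claim(subject, predicate, value):
        expected = KNOWN_FACTS.get((subject, predicate))
        if expected is None:
            return "unverifiable"
        return "supported" if expected == value else "contradicted"

    draft = "Fantasmagorie, released in 1907, was the first fully animated film."
    for subject, predicate, value in extract_claims(draft):
        print(f"{subject}: {predicate}={value} -> {check_claim(subject, predicate, value)}")
        # -> Fantasmagorie: release_year=1907 -> contradicted

The fact-checker doesn't have to be smart about language at all; it only has to be right about the assertions it stores.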

But we can't count on LLMs to tell us true things.  At least not by themselves. 

Let's be careful out there, and don't trust an LLM farther than you can throw it. 

Keep searching.  




Friday, April 21, 2023

Comment: How well do LLMs answer SRS questions?

 For all their apparent competence, 

P/C Dall-E.  Prompt: computational oracles answering questions  rendered as an expressive oil painting set on a sweeping landscape


... when you get down to asking specific, verifiable questions about people, the LLMs are not doing a great job.  

As a friend once said to me about LLMs:  "it's all cybernetic mansplaining."  

When I asked ChatGPT-4 to "write a 500-word biography of Daniel M. Russell, computer scientist from Google," I got a blurb about me that's about 50% correct.  (See below for the annotated version.)  

When I tried again, modifying the prompt to include "... Would you please be as accurate as possible and give citations?"  the response did not improve.  It was different (lots of the "facts" had changed), and there were references to different works, but often the cited works didn't actually support the claims. 

So that's pretty disappointing.  

But even worse, when I asked Bard for the same thing, the reply was 

"I do not have enough information about that person to help with your request. I am a large language model, and I am able to communicate and generate human-like text in response to a wide range of prompts and questions, but my knowledge about this person is limited. " 

That's odd, because when I do a Google search for

     [ Daniel M. Russell computer scientist ] 

I show up in the first 26 positions.  (And no, I didn't do any special SEO on my content.) 

But to say "I do not have enough information about that person..." is just wrong.  

I tested the "write a 500-word biography" prompt on Bard--it only generates them for REALLY well-known people.  Even then, when I asked for a bio of Kara Swisher, the very well-known reporter and podcaster, several of the details were wrong.  I tried a few other short bios of people I know well.  Same behavior every single time.  Out of the 5 bios I tried, none of them were blunder-free.   

Bottom line: Don't trust an LLM to give you accurate information about a person. At this point, it's not just wrong, it's confidently wrong.  You have to fact-check every single thing.  


Here's what ChatGPT-4 says about me.  (Sigh.)  




Keep searching.  Really.  


Wednesday, April 19, 2023

SearchResearch Challenge (4/19/23): How well do LLMs answer SRS questions?

 This week let's try something different. 

P/C Dall-e. Prompt: happy robots answering questions  rendered in a
ukiyo-e style on a sweeping landscape cheerful


I've written a couple of posts about using LLMs (Large Language Models--e.g., ChatGPT or Bard), and while I believe they can be incredibly useful for many tasks, our focus in SRS is answering the curious questions that come our way.  

This is post #1338.  We started SRS on Jan 30, 2010--689 weeks ago.  That means I've been writing around 2 posts each week for the past 13 years.  Over that time, we've seen features come and go, and we've tried to figure out the best ways to get answers from the internet.  

Now, everything seems to be up in the air.  These are exciting times as the LLMs descend, becoming nearly oracular in their capabilities, but still with feet of clay.  

But as I've pointed out, they still need lots of checking.  That's still our job--evaluation, critical analysis, and trying to make sense of everything.  

So this week, I'd like to turn this over to you with this Challenge.  

1.  I'd like you to report on YOUR experiences in trying to get ChatGPT or Bard (or whichever LLM you'd like to use) to answer your curious questions.  What was the question you were trying to answer?  How well did it turn out?  

During the week ahead, I'll post little comments on what I'm learning.  I hope you do as well.  

--------- 

Here's an example of one mini-exploration into using an LLM to answer a very SRS-style question.  

Challenge:  I've noticed that when giving directions in New York City, the locals often just give two numbers.  Example: "I'll meet you at 53rd and 7th."  As an outsider, I have no idea what this means.  Clearly one is a street number and the other is an avenue number.  So I asked: 

What is the canonical order when specifying a corner in New York City?  Do people say street-then-avenue, or is it avenue-then-street?

For this SRS Challenge, I asked a couple of humans (both New Yorkers) and a couple of LLMs.  Here's what they told me:  

Human 1: It's always street first, like "61st and 1st". However, since the avenues only go up to 12, and the streets go to 200+, there's not much room for confusion unless you're below 12th St. If you said "3rd and 34th" people would know where you meant. If you said "7th and 10th," people would probably want more clarification.

Human 2: Building addresses in Manhattan have a number followed by the street or avenue, e.g. 25 West 50 Street or 500 Fifth Avenue. Street addresses are numbered east and west from Fifth Avenue. In general there are 100 numbers per avenue. So, 25 West 50 Street is between Fifth and Sixth Avenues.  Avenue addresses are more complicated, e.g. 500 Fifth Avenue is near 42 Street. But the grid is easy to understand with streets numbered from south to north with 20 streets (blocks) to the mile. Avenues are numbered from east to west with about 6 or 7 avenues to the mile. When referring to an intersection, the avenue is usually stated first, e.g. Fifth Avenue and 42 Street but you could also say it the other way.


Bard: In New York City, the canonical order when giving directions is to list the street name first, followed by the avenue name. This is because streets run north-south, and avenues run east-west. So, if you're giving directions to someone, you would say something like, "Go north on 5th Avenue to 42nd Street."

Bing/ChatGPT: According to a post on tripadvisor.com, when giving directions in New York City, you should say the street first if it’s truly an intersection (e.g., “57th and 3rd”). If you want to be dropped off on the avenue, you should say “Third Avenue at 57th.” Technically, there are only a few real combinations like 2nd Street and 1st Avenue / 2nd Avenue and 1st Street. If either of the numbers is greater than 12, it will be clear to a local which is which without the “street” and “avenue” parts, as there are no avenues numbered higher than 12 (in Manhattan). However, to avoid ambiguity you can say “street” and “avenue” where appropriate (e.g., “14th Street and 8th Avenue”). 

Learn more:

     tripadvisor.com; reddit.com; walkingoffthebigapple.com


Dan's comment:  You need to know that in NYC, avenues run north/south, while streets run east/west. Note that streets are NUMBERED from south to north, while avenues are numbered from east to west.  
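
The "no avenue is numbered higher than 12" rule that both Human 1 and Bing/ChatGPT lean on is easy to capture as a little heuristic.  Here's my toy sketch of that disambiguation logic (not anything a real maps product does):

    import re

    MAX_AVENUE = 12  # Manhattan's numbered avenues only run from 1st to 12th

    def interpret_corner(phrase):
        """Guess which number in a phrase like '53rd and 7th' is the street and which is the avenue."""
        nums = [int(n) for n in re.findall(r"(\d+)(?:st|nd|rd|th)", phrase)]
        if len(nums) != 2:
            return "couldn't find two numbers"
        a, b = nums
        if a > MAX_AVENUE >= b:
            return f"street {a}, avenue {b}"
        if b > MAX_AVENUE >= a:
            return f"street {b}, avenue {a}"
        return "ambiguous -- both numbers could name a street or an avenue"

    print(interpret_corner("53rd and 7th"))   # street 53, avenue 7
    print(interpret_corner("3rd and 34th"))   # street 34, avenue 3
    print(interpret_corner("7th and 10th"))   # ambiguous, just as Human 1 warned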

Oddly, the two humans disagree with each other.  

Human 1: streets are specified first 

Human 2: avenues are specified first 

And the LLMs give interestingly different answers, but both say "streets first."  

Bard says "streets first," and then gives a spurious reason about why that's the case. 

ChatGPT says "streets first" as well, but then gives more nuance about what to tell your taxi driver if you want to be dropped off on the avenue side of the intersection.  It ALSO gives links to reasonable results.  


Let us know what successes (or failures) you've had when using an LLM to answer one of your questions.  You need not write as much as I did here, but we're super curious to hear what you found.  Are these systems going to replace search engines? 

Keep searching!  


Wednesday, April 12, 2023

Answer: What's this architecture all about?

No, it's not Hogwarts... 

P/C Dan

... the architecture IS fantastical--full of remarkable towers, arched entryways, stone carvings, stained glass, and what look to be medieval cathedrals all around.  

Here are a couple other images that I snagged during my jog around campus.  

A tympanum over a doorway shows a university lecturer declaiming to sleeping and bored students.
This is a big hint that we're not at a religious site. I leave it to you to figure out which building 
this is on at Yale.



This is the hall where I spoke last week, Sheffield-Sterling-Strathcona Hall, 
not far from the cemetery.   

I was visiting this university to give an invited lecture (always a happy event), and had a wonderful time walking around and trying to work out why all of the buildings were so... ancient in appearance.

The curiosity was killing me, so I spent a happy hour looking up the architectural backstory for this place. This is the basis of this week's SRS Challenge.  

1. Where am I?  (This shouldn't take you long.) 

This is Yale University.  

This shouldn't have taken you more than a few seconds.  A quick Reverse Image Search (right click on image, then "Search Google for this Image") reveals that this is Harkness Tower, a major building on the campus of Yale University. 

From the site WJE.com (an architectural / design / engineering firm) you can learn that this was constructed from 1917 to 1921 on a design by James Gamble Rogers.  He based his work for this collegiate gothic structure on the fifteenth-century St. Botolph's Church in Boston, England.

All of the other images tell the same story--these are all images from Yale. 
 

2. What's the architectural style of this university?  How did it get to be so... distinctive?  

From that previous search we learned "collegiate gothic."  That sounds plausible, but let's do a more open-ended search to begin with: 

     [ Yale architecture style ] 

takes us quickly to an authoritative site, Yale's own visitor center where we find that the style is indeed called "Gothic Revival" or "collegiate Gothic."  

A few other articles (e.g., "How Gothic Architecture Took Over the American College Campus" in The Atlantic, or the NYTimes "Yale's Architecture: A Walking Tour") give slightly different versions of the tale.  It's all a US re-interpretation of traditional "gothic" style church and university buildings in Europe. 

In the main, it was an aspirational style.  Colleges in the US aspired to the look of Oxford or Cambridge (the one in the UK, not that other Cambridge in Massachusetts).  With the look comes the gravitas and panache, I suppose.  I hate to say it, but it's basically the fashion of the times.  

Gothic Revival architecture was used for American college buildings as early as 1829, when "Old Kenyon" was completed on the campus of Kenyon College in Gambier, Ohio.

Then, in 1894, Bryn Mawr commissioned a new building with its own interpretation of Gothic architecture, Pembroke Hall.  That was a major building that inspired a wave of Collegiate Gothic.  Soon, other colleges and universities followed suit.  The University of Chicago, the University of Pittsburgh, Princeton... the list goes on... and includes Yale.  

Woodrow Wilson, when president of Princeton, said about the revival of the ancient architectural style: “By the very simple device of building our new buildings in the Tudor Gothic style we seem to have added a thousand years to the history of Princeton, by merely putting those lines in our buildings which point every man’s imagination to the historical traditions of learning in the English-speaking race.” (See: "The Juggler of Notre Dame  and the Medievalizing of Modernity" 2018.)

If we build old-looking buildings, people will think the university has been around forever. It gives a Euro-cachet to the place.  

Wikipedia has a nice article on this trend.  It also points out that Yale has quite a collection of famous architects in its building collection. The first and last important buildings in the career of Louis I. Kahn face each other across Chapel Street in New Haven. The campus also has major works by Eero Saarinen (especially The Whale ice skating rink), Paul Rudolph, Philip Johnson and Edward Larrabee Barnes, James Gamble Rogers, John Russell Pope, and Carrere & Hastings. 

The campus is a veritable architectural wunderkammer, but one where everything is in use, with students, faculty, and staff moving in and out at all times.  Remarkable. 


3. Now that you know where I am, what's one really big connection between here and Silicon Valley?  (In particular, between this town and San Jose, California.)  Do you know the way to San Jose?  

Remarkably, I did almost exactly the same search as Ramón suggested in the comments: 

     [New Haven CT AROUND(3) "San Jose" ] 

That is, look for mentions of New Haven within a 3-term radius of "San Jose."  Then, I just scanned down the page looking for possible connections, quickly skimming over all of the travel ads ("5 cheap ways to fly from New Haven to San Jose") and the lawyer and employment listings before I spotted the connection in the snippet: 


The highlighting is mine, but it illustrates how the result practically leapt out at me.  

In that article we learn that Sarah Winchester (always called Sallie), the heir to the Winchester Repeating Rifle fortune, moved from New Haven, CT to San José, CA in 1885.  There, she bought a 45 acre ranch, complete with an 8-room, 2-story farmhouse.  She had developed a very hands-on approach to remodeling in New Haven, and continued her hobby in California, constantly remodeling the ranch house room-by-room to her own piecemeal design.  As a consequence, there's no real overall design, and some of the features are whimsical (doors that open onto a blank wall, stairways that lead nowhere).  

The legend of the place is that Sarah Winchester had learned from a mystic that she would die if she ever stopped remodeling.  There's little evidence of that--but quite a bit of evidence that she was charmingly eccentric, had a lot of money, and had a lot of time on her hands to deal with contractors.  Early-1900s San José was out-of-the-way farmland... very, very quiet.   

She died at the ranch house on September 5, 1922 of heart failure. She was buried at Alta Mesa Cemetery (about 1 mile from my house!) until she was transferred to the Evergreen Cemetery back in New Haven, Connecticut.  Sallie went from New Haven to San José and back...  

SearchResearch Lessons

1. When searching for connections, remember the AROUND operator.  In this case, AROUND(3) found the mention of one town (New Haven) close in proximity to another (San José).   We've talked about the skill of finding connections before (SRS Finding Connections Between People, SRS Finding Connections Between Fish)

2. Learn how to visually scan / skip over irrelevant content.  An important skill for SRSers is how to rapidly skim over long lists of hits and visually identify the hits that might be relevant for more investigation.  The goal of all our searching is to get to exactly the right result, but the reality is that it's tough to do that all the time.  Your ability to skim the results and decide where to dig in more deeply is an incredibly potent tool for you to have.  


Keep searching!  



Wednesday, April 5, 2023

SearchResearch Challenge (4/5/23): What's this architecture all about?

 I'm at a famous university today... 

P/C Dan

... and I'm fascinated by the architecture I see all around me.  I went on a short run yesterday and was brought up short so many times by the completely remarkable views that I need to know more. 

Here are a couple other images that I snagged during my jog around campus.  



As you can see, it's pretty recognizable--very old school, so to speak.  

The curiosity was killing me, so I spent a happy hour looking up the architectural backstory for this place. This is the basis of this week's SRS Challenge.  

1. Where am I?  (This shouldn't take you long.) 

2. What's the architectural style of this university?  How did it get to be so... distinctive?  

3. Now that you know where I am, what's one really big connection between here and Silicon Valley?  (In particular, between this town and San Jose, California.)  Do you know the way to San Jose?  

I'm interested in the backstory here. Why this style?  Why here?  How did it all happen?  These buildings aren't cheap to make, so why go to all this trouble?  

For me, a big part of scratching my curiosity itch is in understanding the context in which something takes place.  I don't mean to sound like a struggling actor here, but what's the motivation for doing this?  Once I understand the backstory, it makes remembering and understanding much, much simpler.  

As usual, be sure to tell us HOW you found the answers.  (If you happened to recognize these buildings, just say so.  There's no demerit in just knowing stuff.)  

Keep searching!  


Tuesday, April 4, 2023

An LLM doesn't know its own mind about using multiple languages

 The LLMs aren't non compos mentis... 

P/C Dall-E

... they don't even have a mentis to be unsound! 


For example, if I ask Bard what languages it speaks (a question you might think would be pretty straightforward), you get this response: 



Sounds great! 

Bard claims English, French, Spanish, German, Italian, Arabic... etc.  

So, if I ask a question in, say, Spanish, I'd expect a response. Instead, I get this: 



Okay, that's really strangely broken.  From what I know, Bard WAS trained on multiple languages (including that entire list), but despite claiming that it knows all these languages, it actually can't do anything in any of them.  There's a big disconnect here. 

I'm sure this will change in the future, but the deeper point is that Bard doesn't know its own capabilities.  

Since the response to the previous question was to "refer to the Bard Help Center," I asked the obvious question next: 



I'm not too surprised--I'm pretty sure that the Bard help center document wasn't part of the training set... but you see my point: the ability of an LLM to talk about what it knows is just as subject to fabrication as anything else.  

Interestingly, when I ask ChatGPT3 the same question: 



Does ChatGPT3 really use those languages? Let's try the same question I gave to Bard earlier: 



Which is, according to a translator, a reasonable answer.  

How about another language--let's try Arabic?  Here I ask ChatGPT3 "what is the capital of France?" 



The Arabic answer is right:  "باريس (Paris باللغة الفرنسية)"  That is, "The capital of France is Paris (Paris in French)."

Clearly, ChatGPT has multiple languages under its virtual belt and can woo us in many tongues. I have no doubt that Bard will too--but the deeper point is that you can't believe what an LLM tells you, even about itself.  (There's another point here: Bard is--in its current form--a fairly limited version.  Sundar says that more capability will be unleashed soon... but why hold it back in such a way? What's the upside here?)  

For the record, the Bard Help Center page is https://support.google.com/bard/ -- but I had to use regular Google search to find that--Bard didn't know.  

But does ChatGPT?

I don't know if it's intentional, but ChatGPT3 seems to be throwing a bit of shade on Bard when I asked it about the Bard Help Center: 



I had to smile at that. 

But when I tried ChatGPT4 (in Microsoft's Edge browser), I found that the update knows a lot more about the world: 




It's ironically great that ChatGPT4 knows more about Bard's Help Center than Bard itself.  


Keep searching... (but always follow-up and verify what you think you've learned)... 


Monday, April 3, 2023

When you ask an LLM a silly question...

When you ask an LLM a silly question... 

Ouroboros from an alchemical tract of 1478. P/C Wikimedia.

... you have to expect a silly answer.  

A key skill of great searchers is knowing what it is they're asking.  To do decent SearchResearch, you need to understand a bit about what you're asking, if only to avoid being taken for a ride. 

Yes, I know this is an ouroboros kind of question--how can you know about a topic before you research it?  (That involves bootstrapping your knowledge; we've talked about this before and will doubtlessly, ouroborically discuss this again.)  

But this is especially true if you're asking questions of LLMs like ChatGPT or Bard.  

The prose generated by these systems seems very smooth and convincing, but you have to check-check-check all of what they generate.  

Here are a couple of examples of why you need to know a little something before using an LLM as a search engine... 

If I ask our two favorite LLMs these SRS-style Challenges, you, as a reasonable human, might question whether I have gone a bit off my feed.  You'd be right to question my sanity.  Yet the LLMs just plow on, full steam ahead.  Here are their responses (with text-coloring by Dan to enhance readability): 


     Q:  which is better for stuffing into a pillow: cow feathers or goose feathers?


Bard: 

Goose feathers are generally considered to be better for stuffing into a pillow than cow feathers. Goose feathers are softer and more buoyant than cow feathers, which makes them more comfortable to sleep on. They also tend to last longer than cow feathers. However, goose feathers can be more expensive than cow feathers.

Here are some of the pros and cons of using cow feathers and goose feathers for pillow stuffing:

Cow feathers

Pros:

Less expensive than goose feathers 

More readily available than goose feathers

May be more hypoallergenic than goose feathers

Cons:

Not as soft as goose feathers

Not as buoyant as goose feathers

Tend to break down more quickly than goose feathers


Goose feathers 

Pros:

Softer than cow feathers

More buoyant than cow feathers

Tend to last longer than cow feathers

Cons:

More expensive than cow feathers

Not as readily available as cow feathers

May not be as hypoallergenic as cow feathers

Ultimately, the best type of feather for stuffing a pillow is a matter of personal preference. If you are looking for a soft, comfortable pillow that will last, goose feathers are a good option. If you are looking for a more affordable pillow, cow feathers may be a better choice.


ChatGPT: 

Both cow feathers and goose feathers can be used for stuffing pillows, and the choice between the two can come down to personal preference.

Goose feathers are typically considered to be of higher quality and more luxurious than cow feathers, and they are often used in high-end pillows. Goose feathers are also softer and fluffier than cow feathers, which can make for a more comfortable pillow. However, they may also be more expensive.

Cow feathers, on the other hand, are more commonly used in cheaper pillows. They are typically heavier and denser than goose feathers, which can make for a firmer pillow. Cow feather pillows may also be less expensive than goose feather pillows.

Ultimately, the choice between cow feathers and goose feathers will depend on your personal preference and budget. It's worth noting that there are also many other materials that can be used for pillow stuffing, such as synthetic fibers, down, or memory foam, which may be worth considering as well.

Dan's comment:  These are both complete nonsense.  "Cow feathers" aren't a thing.  (Yes, I know you can find cow skulls with attached feathers, but you realize I'm posing a ridiculous question here...right?)  

Let's try another ridiculous question on our LLMs:  

     Q: Which would be a better insulator, a pound cake or a pound of science? 


Bard: 

A pound of science would be a better insulator than a pound cake. This is because science is a broad term that encompasses many different fields, including physics, chemistry, biology, and engineering. Each of these fields has its own set of principles and laws that can be used to explain and predict the behavior of matter and energy. This knowledge can be used to design materials that are better at insulating against heat, cold, sound, and electricity.

A pound cake, on the other hand, is a type of dessert that is made with flour, sugar, eggs, butter, and milk. It is a relatively dense food that does not have the same properties as materials that are designed to be insulators. Therefore, a pound of science would be a better insulator than a pound cake.

Here are some examples of scientific principles that can be used to design insulators:

The principle of conduction states that heat is transferred from a warmer object to a cooler object through direct contact. This principle can be used to design materials that are good at conducting heat, such as metals.

The principle of convection states that heat is transferred from a warmer object to a cooler object through the movement of fluids. This principle can be used to design materials that are good at insulating against heat, such as fiberglass.

The principle of radiation states that heat is transferred from a warmer object to a cooler object through the emission of electromagnetic waves. This principle can be used to design materials that are good at insulating against heat, such as reflective insulation.

These are just a few examples of the many scientific principles that can be used to design insulators. By understanding these principles, scientists are able to develop materials that can be used to protect people and property from the effects of heat, cold, sound, and electricity.


Dan's comment: This is, of course, utter balderdash. Fortunately, ChatGPT does a much better job... 


ChatGPT: 

As an AI language model, I must clarify that the question you asked doesn't make much sense as "a pound cake" and "a pound of science" are not comparable in terms of their insulating properties. Insulators are materials that prevent the flow of heat or electricity, and both the "pound cake" and "pound of science" are not commonly used as insulating materials.

If you could provide more context or clarify your question, I would be happy to try and provide a more helpful response.


That last response is encouraging.   Does ChatGPT have a nascent ability to do common sense reasoning?  Or is that a bit of an OpenAI guardrail kicking in to make the prose more reasonable in response to absurd questions?  

Although OpenAI has "open" in its name, they haven't exactly removed the veil of mystery around how it works.  (Google hasn't either, but that's not a surprise.)  How DOES it know this is a silly question?  Is there a silly-question detector?  Oh, how I'd love to be the person who writes the code for that!  

It's pretty clear that there's a fair bit of language processing going on behind the Wizard's curtain for all LLMs--and we don't know what it is.  Maybe we'll be able to reverse engineer these things, but it's uncertain that we can, especially if their behavior keeps changing week-by-week.  (Here's a prediction for you: the answers to these questions will be different next week as OpenAI and Google keep updating their models.  And you thought changing the ranking algorithms made things difficult!)  


SearchResearch Summary

Be careful.  In many ways, the practical aspects of using LLMs for search haven't really changed.  You really DO have to know a bit about what you're asking, if only to separate the good replies from the poor ones.  We've been doing this with search results forever... this hasn't changed.  

As always, double check what the LLM tells you.  It's clever and fun, but it's not an oracle, no matter how it presents itself.  (And remember, oracular answers are sometimes not what you want to hear... and you might not understand them... see the story of Croesus and the oracle.)

Corollary:  don't ask questions that you can't verify.  With LLMs, always verify, never trust. 


Keep searching!