Wednesday, May 31, 2023

SearchResearch Challenge (5/31/23): Did they really burn ancient Roman statues?

 Can that be true? 

A scene from 18th century Rome by Giovanni Battista Piranesi. Veduta della Piazza di Monte Cavallo (View of the Piazza del Quirinale with the Statues of Horse Tamers in side view), from Piranesi's Vedute di Roma. Note the marble columns just ready to be rolled away and repurposed.  These statues are still standing in Rome, notably not broken up, although there's now an obelisk between them.  


Sometimes you read something so astounding that you have to wonder, "Can that possibly be true?"

I have to admit that this happens to me on a daily basis, and not always about current events.  

Earlier in the week I read this off-the-cuff comment in a recent article in The Atlantic magazine (My Night in the Sistine Chapel, by Cullen Murphy): 

"For centuries, the bountiful supply of ancient statuary unearthed in Rome had been burned for lime to make mortar..."  

The author makes the point that for centuries, ancient Roman statues were more valuable as a source of raw marble than as beautiful works of art.  (Key insight: marble can be burned above 840°C to convert its calcium carbonate into calcium oxide, commonly called quicklime, which is an essential ingredient for making mortar and concrete.)  
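For the chemically curious, that's the classic calcination reaction--marble is essentially calcium carbonate, so the statue itself supplies the raw material: 

     CaCO3  +  heat (above ~840°C)  →  CaO (quicklime)  +  CO2 

Slake the quicklime with water and you get the binder that goes into lime mortar.  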

That threw me.  The image of folks just tossing works of art into the kiln to make quicklime just killed me.  It's the kind of thing that makes you say "really?" 

I did a little SRS and found the answer.  It was a fascinating journey that I thought you might enjoy.  

1. Is that sentence true?  Once upon a time did people in Rome just burn ancient marble statuary in order to make quicklime for construction purposes?  

2. (Just for fun...)  I know of at least one other surprising use of ancient materials for the most prosaic of purposes--can you figure out what that other ancient material is (was)?  

As always, let us know your thought process.  HOW did you figure out your answer?  Let us know so we can learn to be better investigators of the mysterious and puzzling!  

Keep searching! 





Monday, May 22, 2023

SearchResearch on a podcast--"The Informed Life"--Listen now

Podcasts are great! 


You can be out running, walking the dog, getting in your 10,000 steps and still listen and learn about the world.  

A few weeks ago I was interviewed by Jorge Arango, host of the podcast "The Informed Life."  The episode has just dropped on his channel as Episode 114, Dan Russell on the Joy of Search.  (The transcript can be found here.)  

In this cast, we talk about SearchResearch and my book.  I tell a few stories (some of which SRS Regulars will recognize), and wax wise about what makes someone a good searcher (and what doesn't).  

Take a listen, and let us know what you think!  


Keep searching.  




Wednesday, May 10, 2023

Taking a bit of a break...

 It's May... 

P/C Dan, from a lovely trip to Norway.

... and it has been a very full few months since the start of the year.  I'm feeling in dire need of taking a little bit of a break, so I'm temporarily going on a hiatus to do a little R&R.  

My plan is to catch up on a bunch of reading (mostly non-work), go on walkabout, become a bit of a flâneur.  Eat some great food, travel hither and yon... mostly yon.  

If something remarkable catches my eye, I might pop back into SRS for a quick note--but the current plan is to come back here on May 31st with some new observations about the world of online research and the general joy of finding out.   

Have a great month of May!  (A little musical commentary...)  

Keep Searching!  

P/C Dan. The master state of mind being sought. 


Wednesday, May 3, 2023

Answer: How well do LLMs answer SRS questions?

 Remember this? 

P/C Dall-e. Prompt: happy robots answering questions  rendered in a
ukiyo-e style on a sweeping landscape cheerful


Our Challenge was this:  

1.  I'd like you to report on YOUR experiences in trying to get ChatGPT or Bard (or whichever LLM you'd like to use) to answer your curious questions.  What was the question you were trying to answer?  How well did it turn out?  

Hope you had a chance to read my comments from the previous week.  

On April 21 I wrote about why LLMs are all cybernetic mansplaining--and I mean that in the most negative way possible.  If mansplaining is a kind of condescending explanation about something the man has only incomplete knowledge of (along with the mistaken assumption that he knows more about it than the person he's talking to does), then that's what's going on here, cybernetically.  

On April 23 I wrote another post about how LLMs seem to know things, but when you question them closely, they don't actually know much at all.  

Fred/Krossbow made the excellent point that it's not clear that Bard is learning.  After asking a question, then asking a follow-up and getting a changed response: "Bard corrected the response. What I now wonder: will Bard keep that correction if I ask later today? Will Bard give the same response to someone else?" 

It's unclear.  I'm sure this kind of memory (and gradual learning) will become part of the LLMs.  But at the moment, it's not happening.   

And that's a big part of the problem with LLMs: We just don't know what they're doing, why, or how.  

As several people have pointed out, that's true of humans as well.  I have no idea what you (my dear reader) are capable of doing, whether you're learning or not... but I have decades of experience dealing with other humans of your make and model, and I have a pretty good idea of what a human's performance characteristics are.  I don't have anything similar for an LLM.  Even if I spent a lot of time developing one, it might well change tomorrow when a new model is pushed out to the servers.  Which LLM are you talking to now?  

P/C Dall-E. Prompt: [ twenty robots, all slightly different from each other, trying to answer questions in a hyperrealistic style 3d rendering ]

What happens when the fundamental LLM question-answering system changes moment by moment?  

Of course, that's what happens with Google's index.  It's varying all the time as well, and it's why you sometimes get different answers to the same query from day-to-day--the underlying data has changed.  

And perhaps we'll get used to the constant evolution of our tools.  It's an interesting perspective to have.  

mateojose1 wonders whether, if LLMs are complemented by deep knowledge components (e.g., grafting on Wolfram Alpha to handle the heavy math chores), we'll THEN get citations.  

I think that's part of the goal.  I've been playing around with Scite.ai, an LLM for the scholarly literature (think of it as ChatGPT trained on the contents of Google Scholar).  It's been working really well for me when I ask it questions that are "reasonably scholarly"--that is, ones for which there are papers that might address the question at hand.  I've been impressed with the quality of the answers, along with the lack of hallucination AND the presence of accurate citations.  

This LLM (scite.ai) is so interesting that I'll devote an entire post to it soon.  (Note that I'm not getting any funding from them to talk about their service.  I've just been impressed.)  

As usual, remmij has a plethora of interesting links for us to consider.  You have to love remmij's "robots throwing an LLM into space" Dall-E images. Wonderful.  (Worth a click.) 

But I also really agree with the link that points to Beren Millidge's blog post about how LLMs "confabulate not hallucinate."  

This is a great point--the term "hallucination" really means that one experiences an apparent sensory perception of something not actually present.  "Confabulation," on the other hand, happens when someone is not able to explain or answer questions correctly, but does so anyway.  The confabulator (that's a real word, BTW) literally doesn't know whether what they're saying is true or not, but goes ahead regardless.  That's much more like what's going on with LLMs.  


Thanks to everyone for their thoughts.  It's been fun to read them the past week.  Sorry about the delay.  I was at a conference in Hamburg, Germany.  As usual, I thought I would have the time to post my reply, but instead I was completely absorbed in what was happening.  As you can imagine, we all spent a lot of time chatting about LLMs and how humans would understand them and grow to use them.  

The consensus was that we're just at the beginning of the LLM arms race--all of the things we worry about (truth, credibility, accuracy, etc.) are being challenged in new and slightly askew ways.  

I feel like one of the essential messages of SearchResearch has always been that we need to understand what our tools are and how they operate.  The ChatGPTs and LLMs of the world are clearly new tools with great possibilities--and we still need to understand them and their limits.  

We'll do our best, here in the little SRS shop on the prairie.  

Keep searching, my friends.  



Sunday, April 23, 2023

Comment: LLMs, truth, and consistency--they don't have any idea

LLMs have a big problem: 

P/C Dall-E. Prompt: "happy robots talking about truth"  

They have zero idea what's true.  They only know the probabilities of words in text. That's NOT the same thing as "knowing" something--it's a bit like knowing that "lion" is the most likely word following "king of the jungle..." without having any idea about monarchies, metaphor, or what a king really is all about.  
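If you want a feel for what "probabilities of words in text" means, here's a toy sketch.  (This is my own illustration, not how any production LLM actually works--real models use learned neural networks over tokens, not raw bigram counts--but the "most likely next word, no meaning attached" idea is the same.)  

from collections import Counter, defaultdict

# A toy corpus; everything this "model" will ever know lives in these counts.
corpus = "the lion is the king of the jungle and the lion hunts in the jungle".split()

# Build a bigram table: which word follows which, and how often.
followers = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev][nxt] += 1

def most_likely_next(word):
    """Return the most frequent follower of `word`--pure counting, zero understanding."""
    counts = followers.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(most_likely_next("the"))   # whatever followed "the" most often in the toy corpus

Swap in the whole web for the toy corpus and a transformer for the counting, and that's roughly the sense in which an LLM "knows" that "lion" follows "king of the jungle."  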

The folks at Oxford Semantic Technologies wrote an interesting blog post about LLMs and finding verifiable facts.  They call the fundamental problem the "Snow White Problem."  The key idea is that LLMs don't really know what's true--they just know what's likely.  

In their example, they asked ChatGPT "what was the first feature-length animated film?" ChatGPT confidently said that it was the 1937 film "Snow White."  

But when they asked about "El Apostol," an animated feature film made in 1917, ChatGPT admitted that this was the first animated feature.  

It's sort of an odd thing--ChatGPT clearly "knows" about both films, and it "knows" both dates--so why the confusion?  

Because LLMs don't "know" anything in the ordinary sense of "knowing"--they've just got that text-probability data.  That is, inconsistent knowledge just isn't a thing for LLMs.  Having coherence within a conversation doesn't really matter much.  If you ask ChatGPT: 

Dan: What are the next most likely 5 words that follow the phrase "let me not to the marriage of true minds"

ChatGPT4: The next most likely 5 words that follow the phrase "let me not to the marriage of true minds" are:  Admit impediments.

Well, that's just plain wrong. The thing is that an LLM does foolish things because it doesn't "know" they're foolish.  I asked for 5 words, not 2.  

On the other hand, Bard did much better with: 

Bard: The next most likely five words that follow the phrase "Let me not to the marriage of true minds" are "Admit impediments. Love is not," which is from William Shakespeare's Sonnet 116. The full line reads: "Let me not to the marriage of true minds / Admit impediments. Love is not love." 

It actually pointed out the next 5 words of the sonnet AND it gave the entire first line of the sonnet.  Did it just get lucky?  Or is there an internal consistency thing at work?  

So I thought I'd try a tack similar to what Oxford Semantic Tech did in their post and ask what the first animated short film was.   

I asked Bard: 

Dan: What was the first animated short?

Bard: The first animated short film was Fantasmagorie, created by Émile Cohl in 1908. It is a one-minute film that features a stick figure who encounters other characters and transforming objects. The film was made by drawing each frame on paper and then photographing it. Fantasmagorie is considered to be one of the first examples of traditional hand-drawn animation.

Interesting.  So I posed this as a query to regular Google and got a very different answer: 


 
Interesting.  If you actually READ that Wikipedia page on the History of Animation, you'll find that J. Stuart Blackton created an animated film, "The Enchanted Drawing," in 1900 that combined hand-drawn elements with live action.  A few years later, in 1906, he produced another film, "Humorous Phases of Funny Faces," with even more bits of what we'd think of as animation.  

I went back to Bard and asked: 

Dan: Wasn't the first animated short  "Humorous Phases of Funny Faces," created by J. Stuart Blackton in 1906?

Bard: Yes, you are correct. Humorous Phases of Funny Faces, created by J. Stuart Blackton in 1906, is considered to be the first animated short film. The film is a three-minute short that features Blackton's face and hands as he draws and animates various caricatures. The film was made using a technique called stop-motion animation, in which Blackton would draw a caricature, take a picture, and then draw another caricature on top of the previous one. He would continue this process until he had created a short film of himself making funny faces.

I apologize for the previous mistake. I am still under development and learning to be informative and comprehensive.

Points for politeness, but the "Snow White" effect points out another instance in which the LLM doesn't know its own mind.  Did it really change its mind?  

One more attempt: 

Dan: And didn't J. Stuart Blackton produce an animated short in 1900?  That would be even earlier, yes? 
 
Bard: That's correct. J. Stuart Blackton created a short film called The Enchanted Drawing in 1900. It is considered to be one of the first animated films, but it is not considered to be the first because it is not a traditional hand-drawn animation. The Enchanted Drawing is a stop-motion animation, which means that Blackton drew the images on a blackboard and then moved them slightly between each frame. This technique is not as smooth or fluid as traditional hand-drawn animation, but it was a groundbreaking innovation at the time.

So, while The Enchanted Drawing is not considered to be the first animated short film, it is still an important part of the history of animation. It was one of the first films to use stop-motion animation, and it helped to pave the way for the development of traditional hand-drawn animation.

Contradictory, no?  

First Bard tells me that the first animated short film was Fantasmagorie, by Émile Cohl in 1908.  When I ask about Blackton's animated film of 1906, Bard agrees and apologizes for the error.  

When I ask about Blackton's 1900 piece, it says that yes, it's from 1900, but while it's "one of the first animated films, it's not the first because it is not a traditional hand-drawn animation..."  Wait a second!  Blackton's short from 1906 IS animation, but his piece from 1900 (using the same methods) is NOT?  

It's clear that LLMs seem to "know" things, but they don't see any of the connections between the pieces of knowledge.  It is as though everything is dissociated; and since everything is separate, contradictions and internal inconsistencies don't trouble it at all.  
 
To understand that something is inconsistent means that you (or the LLM) recognize that there are different statements about something, and that both cannot be true simultaneously.  In this case, only one of these animated shorts can be the "first."  Was it Blackton's 1900 "The Enchanted Drawing," or Blackton's 1906 work "Humorous Phases of Funny Faces," or was it Cohl's 1907 "Japon de fantasie," or Cohl's 1908 "Fantasmagorie"?  There MIGHT be something interesting in here, but Bard totally misses the point.  

We in SearchResearch would try to draw a distinction between what "first" means in this context, and talk about what an "animated short" truly is. But that's not a conversation an LLM can have.  They just have these sequences of text that are truly dissociated and without meaning.  

Of course, Oxford Semantic Technologies' solution would be to have us refer to a knowledge graph that holds assertions in a meaning-bearing representation.  In such a knowledge graph, contradictions would be easy to detect--one of the points of having a knowledge graph is that it's an authoritative representation that can be searched and reasoned about.  If there's a contradiction in the graph, you can find it easily.  
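To make that concrete, here's a tiny sketch of the idea.  (This is my own toy code, not Oxford Semantic Technologies' actual system: store assertions as subject/predicate/object triples, and flag any single-valued property that ends up with two different values.)  

# Toy knowledge graph: assertions stored as (subject, predicate, object) triples.
triples = [
    ("first animated short film", "is", "Fantasmagorie (1908)"),
    ("first animated short film", "is", "Humorous Phases of Funny Faces (1906)"),
    ("Fantasmagorie", "created by", "Émile Cohl"),
]

SINGLE_VALUED = {"is"}   # predicates that may have only one object per subject

def find_contradictions(triples):
    """Flag any single-valued predicate asserted with two different objects."""
    seen, problems = {}, []
    for subj, pred, obj in triples:
        if pred not in SINGLE_VALUED:
            continue
        if (subj, pred) in seen and seen[(subj, pred)] != obj:
            problems.append((subj, seen[(subj, pred)], obj))
        else:
            seen[(subj, pred)] = obj
    return problems

for subj, first_claim, second_claim in find_contradictions(triples):
    print(f'Contradiction: "{subj}" cannot be both {first_claim} and {second_claim}')

That's the whole trick: because the assertions live in one authoritative structure, the inconsistency is a simple lookup rather than something buried in word probabilities.  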

That's a laudable goal.  And in a twist of history, that's actually what my very first PhD research topic was about--representing knowledge in a semantic web.  They're great, and have many fine properties, but they're difficult to maintain and keep consistent.  Wonderful tools, but still probably in the future.  

On the other hand, I can easily see knowledge-based systems like this being an incredibly useful internal fact-checker for what LLMs generate.  Imagine a knowledge-based system working hand-in-hand (code-in-code?) with an LLM text-generator.  There's real possibility of power there.  (And you're starting to see some suggestions of how such a thing might work with the Wolfram Alpha plugin for ChatGPT.)  

But we can't count on LLMs to tell us true things.  At least not by themselves. 

Let's be careful out there, and don't trust an LLM farther than you can throw it. 

Keep searching.  




Friday, April 21, 2023

Comment: How well do LLMs answer SRS questions?

 For all their apparent competence, 

P/C Dall-E.  Prompt: computational oracles answering questions  rendered as an expressive oil painting set on a sweeping landscape


... when you get down to asking specific, verifiable questions about people, the LLMs are not doing a great job.  

As a friend once said to me about LLMs:  "it's all cybernetic mansplaining."  

When I asked ChatGPT-4 to "write a 500-word biography of Daniel M. Russell, computer scientist from Google," I got a blurb about me that's about 50% correct.  (See below for the annotated version.)  

When I tried again, modifying the prompt to include "... Would you please be as accurate as possible and give citations?"  the response did not improve.  It was different (lots of the "facts" had changed), and there were references to different works, but often the cited works didn't actually support the claims. 

So that's pretty disappointing.  

But even worse, when I asked Bard for the same thing, the reply was 

"I do not have enough information about that person to help with your request. I am a large language model, and I am able to communicate and generate human-like text in response to a wide range of prompts and questions, but my knowledge about this person is limited. " 

That's odd, because when I do a Google search for

     [ Daniel M. Russell computer scientist ] 

I show up in the first 26 positions.  (And no, I didn't do any special SEO on my content.) 

But to say "I do not have enough information about that person..." is just wrong.  

I tested the "write a 500 word biography" prompt on Bard--it only generates them for REALLY well known people.  Even then, when I asked for a bio of Kara Swisher, the very well-known reporter and podcaster, several of the details were wrong.  I did a few other short bios of people I know well. Same behavior every single time.  Out of the 5 bios I tried, none of them were blunder-free.   

Bottom line: Don't trust an LLM to give you accurate information about a person. At this point, it's not just wrong, it's confidently wrong.  You have to fact-check every single thing.  


Here's what ChatGPT-4 says about me.  (Sigh.)  




Keep searching.  Really.  


Wednesday, April 19, 2023

SearchResearch Challenge (4/19/23): How well do LLMs answer SRS questions?

 This week let's try something different. 

P/C Dall-e. Prompt: happy robots answering questions  rendered in a
ukiyo-e style on a sweeping landscape cheerful


I've written a couple of posts about using LLMs (Large Language Models--e.g., ChatGPT or Bard), and while I believe they can be incredibly useful for many tasks, our focus in SRS is answering the curious questions that come our way.  

This is post #1338.  We started SRS on Jan 30, 2010--689 weeks ago.  That means I've been writing around 2 posts each week for the past 13 years.  Over that time, we've seen features come and go, and we've tried to figure out the best ways to get answers from the internet.  

Now, everything seems to be up in the air.  These are exciting times, as the LLMs descend, becoming nearly oracular in their capabilities, but still with feet of clay.  

But as I've pointed out, they still need lots of checking.  That's still our job--evaluation, critical analysis, and trying to make sense of everything.  

So this week, I'd like to turn this over to you with this Challenge.  

1.  I'd like you to report on YOUR experiences in trying to get ChatGPT or Bard (or whichever LLM you'd like to use) to answer your curious questions.  What was the question you were trying to answer?  How well did it turn out?  

During the week ahead, I'll post little comments on what I'm learning.  I hope you do as well.  

--------- 

Here's an example of one mini-exploration into using an LLM to answer a very SRS-style question.  

Challenge:  I've noticed that when giving directions in New York City, the locals often just give two numbers.  Example: "I'll meet you at 53rd and 7th."  As an outsider, I have no idea what this means.  Clearly one is a street number and the other is an avenue number.  So I asked: 

What is the canonical order when specifying a corner in New York City?  Do people say street-then-avenue, or is it avenue-then-street?

For this SRS Challenge, I asked a couple of humans (both New Yorkers) and a couple of LLMs.  Here's what they told me:  

Human 1: It's always street first, like "61st and 1st". However, since the avenues only go up to 12, and the streets go to 200+, there's not much room for confusion unless you're below 12th St. If you said "3rd and 34th" people would know where you meant. If you said "7th and 10th," people would probably want more clarification.

Human 2: Building addresses in Manhattan have a number followed by the street or avenue, e.g. 25 West 50 Street or 500 Fifth Avenue. Street addresses are numbered east and west from Fifth Avenue. In general there are 100 numbers per avenue. So, 25 West 50 Street is between Fifth and Sixth Avenues.  Avenue addresses are more complicated, e.g. 500 Fifth Avenue is near 42 Street. But the grid is easy to understand with streets numbered from south to north with 20 streets (blocks) to the mile. Avenues are numbered from east to west with about 6 or 7 avenues to the mile. When referring to an intersection, the avenue is usually stated first, e.g. Fifth Avenue and 42 Street but you could also say it the other way.


Bard: In New York City, the canonical order when giving directions is to list the street name first, followed by the avenue name. This is because streets run north-south, and avenues run east-west. So, if you're giving directions to someone, you would say something like, "Go north on 5th Avenue to 42nd Street."

Bing/ChatGPT: According to a post on tripadvisor.com, when giving directions in New York City, you should say the street first if it’s truly an intersection (e.g., “57th and 3rd”). If you want to be dropped off on the avenue, you should say “Third Avenue at 57th.” Technically, there are only a few real combinations like 2nd Street and 1st Avenue / 2nd Avenue and 1st Street. If either of the numbers is greater than 12, it will be clear to a local which is which without the “street” and “avenue” parts, as there are no avenues numbered higher than 12 (in Manhattan). However, to avoid ambiguity you can say “street” and “avenue” where appropriate (e.g., “14th Street and 8th Avenue”). 

Learn more:

     tripadvisor.com; reddit.com; walkingoffthebigapple.com


Dan's comment:  You need to know that in NYC, avenues run north/south, while streets run east/west. Note that streets are NUMBERED from south to north, while avenues are given numbers running from east to west.  
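Just to show how convenient the grid is, here's a back-of-the-envelope sketch using the rough figures Human 2 quoted above (about 20 streets to the mile, 6-7 avenues to the mile).  It's strictly my own approximation--it ignores Broadway, block widths near the rivers, and everything below the numbered grid:  

def rough_walking_miles(corner_a, corner_b):
    """Corners are (street, avenue) pairs -- e.g., (53, 7) for '53rd and 7th'."""
    street_a, avenue_a = corner_a
    street_b, avenue_b = corner_b
    north_south = abs(street_a - street_b) / 20.0   # ~20 streets (blocks) per mile
    east_west = abs(avenue_a - avenue_b) / 6.5      # ~6-7 avenues per mile
    return north_south + east_west

# "53rd and 7th" to "42nd and 5th": about 0.86 miles of walking
print(round(rough_walking_miles((53, 7), (42, 5)), 2))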

Oddly, the two humans disagree with each other.  

Human 1: streets are specified first 

Human 2: avenues are specified first 

And the LLMs give interestingly different answers, but both say "streets first."  

Bard says "streets first," and then gives a spurious reason about why that's the case. 

ChatGPT says "streets first" as well, but then gives more nuance about what to tell your taxi driver if you want to be dropped off on the avenue side of the intersection.  It ALSO gives links to reasonable results.  


Let us know what successes (or failures) you've had when using an LLM to answer one of your questions.  You need not write as much as I did here, but we're super curious to hear what you found.  Are these systems going to replace search engines? 

Keep searching!  


Wednesday, April 12, 2023

Answer: What's this architecture all about?

No, it's not Hogwarts... 

P/C Dan

... the architecture IS fantastical--full of remarkable towers, arched entryways, stone carvings, stained glass, and what looks to be medieval cathedrals all around.  

Here are a couple other images that I snagged during my jog around campus.  

A tympanum over a doorway shows a university lecturer declaiming to sleeping and bored students.
This is a big hint that we're not at a religious site. I leave it to you to figure out which building 
this is on at Yale.



This is the hall where I spoke last week, Sheffield-Sterling-Strathcona Hall, 
not far from the cemetery.   

I was visiting this university to give an invited lecture (always a happy event), and had a wonderful time visiting and trying to work out why all of the buildings were so... ancient in appearance.

The curiosity was killing me, so I spent a happy hour looking up the architectural backstory for this place. This is the basis of this week's SRS Challenge.  

1. Where am I?  (This shouldn't take you long.) 

This is Yale University.  

This shouldn't have taken you more than a few seconds.  A quick Reverse Image Search (right click on image, then "Search Google for this Image") reveals that this is Harkness Tower, a major building on the campus of Yale University. 

From the site WJE.com (an architectural / design / engineering firm), you can learn that this was constructed from 1917 to 1921 on a design by James Gamble Rogers.  He based his work for this collegiate gothic structure on the fifteenth-century St. Botolph's Church in Boston, England.

All of the other images tell the same story--these are all images from Yale. 
 

2. What's the architectural style of this university?  How did it get to be so... distinctive?  

From that previous search we learned "collegiate gothic."  That sounds plausible, but let's do a more open-ended search to begin with: 

     [ Yale architecture style ] 

takes us quickly to an authoritative site, Yale's own visitor center where we find that the style is indeed called "Gothic Revival" or "collegiate Gothic."  

A few other articles (e.g., "How Gothic Architecture Took Over the American College Campus" in The Atlantic, or the NYTimes' "Yale's Architecture: A Walking Tour") give slightly different versions of the tale.  It's all a US re-interpretation of traditional "gothic" style church and university buildings in Europe. 

In the main, it was an aspirational style.  Colleges in the US aspired to the look of Oxford or Cambridge (the one in the UK, not that other Cambridge in Massachusetts).  With the look comes the gravitas and panache, I suppose.  I hate to say it, but it's basically the fashion of the times.  

Gothic Revival architecture was used for American college buildings as early as 1829, when "Old Kenyon" was completed on the campus of Kenyon College in Gambier, Ohio.

Then, in 1894, Bryn Mawr commissioned a new building with its own interpretation of Gothic architecture, Pembroke Hall.  That was a major building that inspired a wave of Collegiate Gothic.  Soon, other colleges and universities followed suit.  The University of Chicago, the University of Pittsburgh, Princeton... the list goes on... and includes Yale.  

Woodrow Wilson, when president of Princeton, said about the revival of the ancient architectural style: “By the very simple device of building our new buildings in the Tudor Gothic style we seem to have added a thousand years to the history of Princeton, by merely putting those lines in our buildings which point every man’s imagination to the historical traditions of learning in the English-speaking race.” (See: "The Juggler of Notre Dame  and the Medievalizing of Modernity" 2018.)

If we build old-looking buildings, people will think the university has been around forever. It gives a Euro-cachet to the place.  

Wikipedia has a nice article on this trend.  It also points out that Yale has quite a collection of famous architects represented in its buildings. The first and last important buildings in the career of Louis I. Kahn face each other across Chapel Street in New Haven. The campus also has major works by Eero Saarinen (especially The Whale ice skating rink), Paul Rudolph, Philip Johnson and Edward Larrabee Barnes, James Gamble Rogers, John Russell Pope, and Carrere & Hastings. 

The campus is a veritable architectural wunderkammer, but one where everything is in use, with students, faculty, and staff moving in and out at all times.  Remarkable. 


3. Now that you know where I am, what's one really big connection between here and Silicon Valley?  (In particular, between this town and San Jose, California?  Do you know the way to San Jose?)  

Remarkably, I did almost exactly the same search as Ramón suggested in the comments: 

     [New Haven CT AROUND(3) "San Jose" ] 

That is, look for mentions of New Haven within a 3-term radius of "San Jose."  Then, I just scanned down the page looking for possible connections, quickly skimming over all of the travel ads ("5 cheap ways to fly from New Haven to San Jose") and lawyers and employment stuff before I spotted the connection in the snippet: 


The highlighting is mine, but it illustrates how the result practically leapt out at me.  

In that article we learn that Sarah Winchester (always called Sallie), the heir to the Winchester Repeating Rifle fortune, moved from New Haven, CT to San José, CA in 1885.  There, she bought a 45 acre ranch, complete with an 8-room, 2-story farmhouse.  She had developed a very hands-on approach to remodeling in New Haven, and continued her hobby in California, constantly remodeling the ranch house room-by-room to her own piecemeal design.  As a consequence, there's no real overall design, and some of the features are whimsical (doors that open onto a blank wall, stairways that lead nowhere).  

The legend of the place is that Sarah Winchester had learned from a mystic that she would die if she ever stopped remodeling, but there's little evidence of that... though there's quite a bit of evidence that she was charmingly eccentric and had a lot of money.  She also had a lot of time on her hands to deal with contractors.  Early 1900s San José was out-of-the-way farmland... very, very quiet.   

She died at the ranch house on September 5, 1922 of heart failure. She was buried at Alta Mesa Cemetery (about 1 mile from my house!) until she was transferred to the Evergreen Cemetery back in New Haven, Connecticut.  Sallie went from New Haven to San José and back...  

SearchResearch Lessons

1. When searching for connections, remember the AROUND operator.  In this case, AROUND(3) found the mention of one town (New Haven) in close proximity to another (San José).  We've talked about the skill of finding connections before (SRS Finding Connections Between People, SRS Finding Connections Between Fish).  A couple of extra example queries appear just after these lessons. 

2. Learn how to visually scan / skip over irrelevant content.  An important skill for SRSers is being able to rapidly skim long lists of hits and visually identify the ones that might be relevant for more investigation.  The goal of all our searching is to get to exactly the right result, but the reality is that it's tough to do that all the time.  Your ability to skim the results and decide where to dig in more deeply is an incredibly potent tool for you to have.  
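As promised in Lesson 1, here are a couple of extra AROUND examples.  (These are made-up queries of my own, not ones from this Challenge.)  The number in parentheses is the maximum number of words allowed between the two terms, and note that the operator is conventionally written in ALL CAPS:  

     [ "Ada Lovelace" AROUND(4) "Charles Babbage" ] 

     [ "Palo Alto" AROUND(5) "New Haven" ] 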


Keep searching!  



Wednesday, April 5, 2023

SearchResearch Challenge (4/5/23): What's this architecture all about?

 I'm at a famous university today... 

P/C Dan

... and I'm fascinated by the architecture I see all around me.  I went on a short run yesterday and was brought up short so many times by the completely remarkable views that I need to know more. 

Here are a couple other images that I snagged during my jog around campus.  



As you can see, it's pretty recognizable--very old school, so to speak.  

The curiosity was killing me, so I spent a happy hour looking up the architectural backstory for this place. This is the basis of this week's SRS Challenge.  

1. Where am I?  (This shouldn't take you long.) 

2. What's the architectural style of this university?  How did it get to be so... distinctive?  

3. Now that you know where I am, what's one really big connection between here and Silicon Valley?  (In particular, between this town and San Jose, California?  Do you know the way to San Jose?)  

I'm interested in the backstory here. Why this style?  Why here?  How did it all happen?  These buildings aren't cheap to make, so why go to all this trouble?  

For me, a big part of scratching my curiosity itch is understanding the context in which something takes place.  I don't mean to sound like a struggling actor here, but what's the motivation for doing this?  Once I understand the backstory, remembering and understanding become much, much simpler.  

As usual, be sure to tell us HOW you found the answers.  (If you happened to recognize these buildings, just say so.  There's no demerit in just knowing stuff.)  

Keep searching!  


Tuesday, April 4, 2023

An LLM doesn't know its own mind about using multiple languages

 The LLMs aren't non compos mentis... 

P/C Dall-E

... they don't even have a mentis to be unsound! 


For example, if I ask Bard what languages it speaks (a question you might think would be pretty straightforward), I get this response: 



Sounds great! 

Bard claims English, French, Spanish, German, Italian, Arabic... etc.  

So, if I ask a question in, say, Spanish, I'd expect a response. Instead, I get this: 



Okay, that's really strangely broken.  From what I know, Bard WAS trained on multiple languages (including that entire list), but despite claiming that it knows all these languages, it actually can't do anything in any of them.  There's a big disconnect here. 

I'm sure this will change in the future, but the deeper point is that Bard doesn't know its own capabilities.  

Since the response to the previous question was to "refer to the Bard Help Center," I asked the obvious question next: 



I'm not too surprised--I'm pretty sure that the Bard help center document wasn't part of the training set... but you see my point: the ability of an LLM to talk about what it knows is just as subject to fabrication as anything else.  

Interestingly, when I ask ChatGPT3 the same question: 



Does ChatGPT3 really use those languages? Let's try the same question I gave to Bard earlier: 



Which is, according to a translator, a reasonable answer.  

How about another language--let's try Arabic?  Here I ask ChatGPT3 "what is the capital of France?" 



The Arabic answer is right:  "باريس (Paris باللغة الفرنسية)"  That is, "The capital of France is Paris (Paris in French)."

Clearly, ChatGPT has multiple languages under its virtual belt and can woo us in many tongues. I have no doubt that Bard will too--but the deeper point is that you can't believe what an LLM tells you, even about itself.  (There's another point here: Bard is--in its current form--a fairly limited version.  Sundar says that more capability will be unleashed soon... but why hold it back in such a way? What's the upside here?)  

For the record, the Bard Help Center page is https://support.google.com/bard/ -- but I had to use regular Google search to find that--Bard didn't know.  

But does ChatGPT?

I don't know if it's intentional, but ChatGPT3 seems to be throwing a bit of shade on Bard when I asked it about the Bard Help Center: 



I had to smile at that. 

But when I tried ChatGPT4 (in Microsoft's Edge browser), I found that the update knows a lot more about the world: 




It's ironically great that ChatGPT4 knows more about Bard's Help Center than Bard itself.  


Keep searching... (but always follow-up and verify what you think you've learned)...