Thursday, April 28, 2016

Additional hint: What are those smoke plumes?

Does it help if I tell you...

... that when I saw the smoke plumes, I was on AA 2820, around 1:20 PM Central?  

Thinking strategically... 

And searching on! 

Wednesday, April 27, 2016

Search Challenge (4/27/16): What are those smoke plumes?

Every so often... 

... (well, quite frequently for me!)... you'll see things that you just don't understand.  Maybe it's a funny insect, or an interesting flower, or an odd plant growing in a place you don't expect.  

I have these curiosity-inspiring moments fairly often.  Usually, I'll write myself a note to "look this up later."  I flag it in my notes with the prefix L/U (for "look-up"), and every so often when I can't figure them out, I pose them to you as a SearchResearch Challenge.  

Last February 18th, 2016, in the early afternoon I was flying just north of Mobile, Alabama, when I looked south out of my window and saw this: 

This is the best shot I could get with my camera.  I didn't have polarizing filters to help cut down on the glare.  And yes, that's my checked shirt you can see reflected in the window.

As you can see, there are 5 rather large billows of smoke rising from the ground.  This week's Challenge is to... 

1.  Can you identify WHY it looks like half of Alabama is on fire?  What's going on in this photo?  

As I said, this is really just one instance of the general search problem--how do you go from the information given to a clear explanation of what's going on.  

How would you search for something this dramatic, and this difficult-to-understand from 30,000 feet in the air?  

If you figure it out, let us know HOW you figured it out.  (If you just happened to know, that's fine--put your answer in the comment stream with a comment like "I live there" or "I just happen to know..."  Those kinds of answers are good too!)  

This week.... Search on... 

... for the fire!  

Monday, April 25, 2016

Answer: An architectural plant?

The SearchResearch question this week is... 

... to understand what would make me smile when I saw this plant (below) growing at the base of these columns (above on the left) ... 

I asked the Challenge this way:   

1.  What about seeing this plant at the base of these columns made me smile?  What's funny / odd / surprising about this little scene?  (In other words, what's the connection between the plants and the columns?  No, it's not that the plants are planted at the foot of the columns.  It's much more obvious than that.)  

For instance, if you start with the simple query:  

     [ plant column ] 

about halfway down the results you'll see a result about a plant stand, but it will include the word acanthus.  

Now, at this point in your research, you probably don't know what that word means, but as a researcher, you'll want to look it up.  

This is the easiest way:  [ define acanthus ] 

Note the second definition!  That's a connection to our search Challenge!  An image search for acanthus shows you: 

When looking for connections between ideas, a great way to start is with the most obvious query.  

But since we know we're searching in an architectural context, I would add in the term architecture, do a search for: 

        [ plants columns architecture ] 

But notice that I did NOT do a search for: 

     *  [ plants at base of columns ] 

(Note: by convention, a leading * on the query indicates that this is NOT a good query to use for this Challenge.) 

Why not?  

Because while there might be plants that are commonly planted at the bottoms of columns, that's not what's of interest.  Remember that the key observation was that this made me smile--that suggests that it's a little bit of a surprise, maybe something that's NOT common at the base of columns.  This starred search would find the most common plants that live at column bases.  

I chose that term because the Challenge explicitly mentioned that "I was thinking about architecture...," but more importantly, columns are architectural elements, and so I thought it would be a useful way to add context to the search.  Look at this search below: 

You can see that the first few results are for Egyptian columns.  If you read those results a bit, you'll find that the plants sculpted into Egyptian columns are papyrus plants, which look very different than the leaves in the photo.  Keep going a bit farther down, and ... 

Again, you see the Acanthus result, and the connection between Acanthus (the plant) and acanthus (the top of the column). 

IF, on the other hand, you recognize the columns as being Corinthian, your search could be: 

     [ Corinthian column plants ] 

By reading through the SERP, it doesn't take long to learn that the leaves at the top of a Corinthian column are called acanthus leaves.  Looking at the columns up close (which I assume you did!), you'll see that these carved leaves look a lot like the plants shown.

I confirmed this by searching for: 

     [ Corinthian column acanthus ] 

and found a great deal of information confirming that in fact the acanthus IS the basis for the design of the leafy parts of the column.  

The acanthus design dates to around 450 BCE.  The oldest known Corinthian column is in the Temple of Apollo Epicurius at Bassae in Arcadia (now Greece).  

Much later, the Roman writer Vitruvius (c. 75 BC – c. 15 BC) told the story that the Corinthian order may have been invented by Callimachus, a Greek architect and sculptor who was inspired by the sight of a votive basket left on a young girl's grave. A square tile had been placed over the basket, and an acanthus plant had grown through the woven basket, forcing its leaves up through the basket, just as the top of a column might be sculpted.  

The origin story of the Corinthian Order,
illustrated in Claude Perrault's translation of Vitruvius, 1684.
The basket with tablet on top and invasive acanthus is shown
near the bottom of the figure.  Link to Wikimedia.

So why did it make me smile?  Because it seemed like the acanthus plant was trying to grow to the top of the column--in essence, it was trying to get to the top... but it's not going to make it. It just seemed like an acanthus that was striving to reach the top, to where it thought it belonged. 

Search Lessons 

The key idea here is that you're exploring the concept connections between two different ideas.  For this Challenge, it's how this random plant connects with these columns.   So--how do you do this?  

1. Start with a query that includes both concepts, and start looking for the connections.  The trick here is to start with a pretty generic query that lists both concepts, and then start looking for links. As we saw above, it takes a bit of work, but can lead you to the results.   

2. A great way to focus your "connection search" is to add a context term.  In the solution above, we added the context term architecture.  Think about context terms as the extra search terms that describe the general topic area; it's a way to reduce the confusion about how the terms might be used in multiple ways. Context terms are incredibly useful--use them well! 

For Teachers

For teaching, finding connections is a great way to spark curiosity.  One task that I sometimes give to students is to find a click-chain from one topic to another in Wikipedia.  It's not quite the same as doing a search task, but it's a great deal of fun to try to find the shortest possible path from the Wikipedia entry on turquoise to the one on oranges. Or, if you're going to follow the ideas in this blogpost, start at column and find a path to acanthus.  (It's easy now, but if you haven't read this post, it's a bit tricky... and fun.)  

Or, you could just try to find connections between historical figures or events.  For a fun example, see my post about the connection between the phrase “The Myrtle of Venus with Bacchus's Vine” and....  defense by reason of insanity.

Finding connections is an important piece of making history come alive... and of making reading much more fun and interesting.  

Have fun making up your question--and let me know if you find some great connections that we might use in a future Search Challenge!  

Search on! 


Wednesday, April 20, 2016

Search Challenge (4/20/16): An architectural plant?

I've been thinking about architecture lately, 

... and so it was with some bemusement that I saw some columns (specifically, the ones shown above on the left) that made me smile when I saw them.  Why?  Because this plant was growing at the base of the columns... 

I've shown all three variants on the column theme to give you a sense for what's going on.

This week's Challenge is simply this:  

1.  What about seeing this plant at the base of these columns made me smile?  What's funny / odd / surprising about this little scene?  (In other words, what's the connection between the plants and the columns?  No, it's not that the plants are planted at the foot of the columns.  It's much more obvious than that.)  

Although I've seen lots of classical Greek columns--in architectural follies, on banks, and on distinguished federal buildings--seeing this particular juxtaposition was the first time I actually smiled.  Can you figure out what I found amusing?  

(Hint: I know this sounds slightly odd--but give it a go.  If you just think about what connections might exist, you'll figure it out. Think about what kind of columns these are...)  

Search on! 

The same plant growing nearby.... 

Monday, April 18, 2016

Answer: How well do medical results stand the test of time?

What we know to be true changes...

It's a big mistake to think that what you know to be true is a constant.  Certainly there are eternal verities, but in this post I want to focus on things we might think of as true that, as we'll see, turn out to be provisional.  This is particularly true for medical results:  At any one moment in time, we might think we have a lock on the truth, but this often turns out to be just our current best approximation.    
(To repeat my caveat from last week:  I should say that I'm not picking on Medicine as a field--it's just a nice illustration of how much our knowledge of the world shifts over time.  You could ask these same questions of physics, or chemistry, or biology.  Medicine is a bit simpler to study for SearchResearch purposes.) 

As you know, the Nobel Prize is awarded annually for outstanding contributions in a number of areas, including Medicine.

So I asked the question:  

1.  If we look at the Nobel Prizes awarded in the field of Medicine in the 1920s, how many of those highly acclaimed results from the 20s are still believed to be true?   
Let's start by finding the list of the Nobel Prizes for medicine that were awarded in the 1920s.  My query was: 

     [ nobel prizes medicine list  ] 

which handily takes me to the Nobel Prize organization's list of prizes awarded in Medicine.  

When you pull the awards in the 1920's, you get this list: 

1929  (2 awards)
Christiaan Eijkman "for his discovery of the antineuritic vitamin"
Sir Frederick Gowland Hopkins "for his discovery of the growth-stimulating vitamins"

1928
Charles Jules Henri Nicolle "for his work on typhus"

1927
Julius Wagner-Jauregg "for his discovery of the therapeutic value of malaria inoculation in the treatment of dementia paralytica"

1926
Johannes Andreas Grib Fibiger "for his discovery of the Spiroptera carcinoma"

1925  None awarded

1924
Willem Einthoven "for his discovery of the mechanism of the electrocardiogram"

1923
Frederick Grant Banting and John James Rickard Macleod "for the discovery of insulin"

1922  (2 awards)
Archibald Vivian Hill "for his discovery relating to the production of heat in the muscle"
Otto Fritz Meyerhof "for his discovery of the fixed relationship between the consumption of oxygen and the metabolism of lactic acid in the muscle"

1921  None awarded

1920
Schack August Steenberg Krogh "for his discovery of the capillary motor regulating mechanism"

At this point, it's just a matter of doing a bit of searching for each topic to discover which of these are currently held to be true.  Here's a quick summary of what I found... 

1929: Christiaan Eijkman found that when chickens (in Indonesia!) were fed polished rice, they got sick with beriberi, but when they were switched back to regular (unpolished) rice, they recovered.  Working with Frederick Hopkins, they determined that a particular chemical, a "vital amine," was responsible for beriberi.  That term was shortened by Casimir Funk to "vitamin," and this particular one was called "B," and later changed to B1.

Interestingly, Eijkman believed beriberi to be caused by a nerve poison in the endosperm of rice, from which the outer layers of the grain gave protection to the body. It was his collaborators who figured out the correct mechanism.  (And, even more interestingly, a decade earlier, in 1884, Kanehiro Takaki, a surgeon general in the Japanese navy, hypothesized that beriberi was due to insufficiencies in the sailors' diet. He discovered that substituting a diet of white rice with one that also had barley, meat, milk, bread, and vegetables nearly eliminated beriberi over a 9-month sea voyage. However, Takaki incorrectly attributed the benefit to increased nitrogen intake, as vitamins were unknown substances at the time.)

     Summary:  Still believed.  

1928:  Charles Nicolle found that while epidemic typhus patients were able to infect other patients, their clothes seemed to spread the disease.  Oddly, he observed that they were no longer infectious when they had had a hot bath and a change of clothes. Once he realized this, he reasoned that it was most likely that lice were the vector for epidemic typhus.

Nicolle tested this theory by infecting a chimpanzee with typhus, retrieving the lice from it, and placing them on a healthy chimpanzee. (Now that's an interesting experiment!)  Within 10 days the second chimpanzee had typhus as well. 

Further research showed that the major transmission method was not louse bites but excrement: lice infected with typhus turn red and die after a couple of weeks, but meanwhile they excrete a large number of microbes. When a small quantity of louse poop is rubbed on the skin or eye, an infection occurs. Kids, don't try to replicate this experiment at home.  

     Summary:  Still believed.  

1927:  Julius Wagner-Jauregg studied the effects of treating mental illness by inducing a fever, an approach known as pyrotherapy. Eventually he tried giving patients malaria, which proved to be very successful in the case of dementia paralytica (also called general paresis of the insane), which was caused by neurosyphilis, at that time a fatal disease. It had been known for some time that patients who developed high fevers could be cured of syphilis. As a consequence, giving doses of the malaria parasite Plasmodium vivax was used to induce prolonged and high fevers. This was considered an acceptable risk because the malaria could later be treated with quinine. The technique was known as malariotherapy; however, it was dangerous, killing about 15% of patients, so it is no longer in use.

     Summary:  Pyrotherapy really does work, but giving patients malaria is too risky;
     there are better methods.  No longer believed to be a good idea.    

1926:  Johannes Andreas Grib Fibiger claimed to have found an organism he called Spiroptera carcinoma that caused gastric cancers in mice and rats. He received a Nobel prize for this discovery. On the other hand, it was later shown that this specific organism was not the primary cause of the tumors. On the other, other hand, his was one of the first demonstrations that an infection could be a cause of tumor growth.  He was clearly the first person to induce cancer in laboratory animals--a major step forward for cancer research. However, Katsusaburo Yamagiwa, only two years later, successfully induced squamous cell carcinoma by painting crude coal tar on the inner surface of rabbits' ears. (Some think he should have also received a Nobel...)  

The worm is now named Gongylonema neoplasticum. Later research has since shown that while the worms can stimulate cancerous cells to form tumors, the worms themselves are not a direct cause of cancer, as they are not carcinogenic to healthy cells.  Oddly enough, Fibiger died of colon cancer... 

     Summary:  Not quite right.  

1924:  Willem Einthoven invented a version of the electrocardiogram, now a commonplace tool in medical offices everywhere.  Beginning in 1901, Einthoven completed a series of prototypes of a string galvanometer. This device used a very thin filament of conductive wire passing between very strong magnets. When a current passed through the filament, the magnetic field created by the current would cause the string to move. A light shining on the string would cast a shadow on a moving roll of photographic paper, thus forming a continuous curve showing the movement of the string. The original machine required water cooling for the powerful electromagnets, required 5 people to operate it and weighed some 270 kilograms. This device increased the sensitivity of the standard galvanometer so that the electrical activity of the heart could be measured despite the insulation of flesh and bones.

ECG device around 1880.
     Summary:  Still believed. (And has vastly improved since then.)  

1923:  As happens, Frederick Banting had to give a talk to students about the pancreas in 1920. While prepping for the class, he read that diabetes seems to result from a lack of a protein hormone secreted by the islets of Langerhans in the pancreas. This putative hormone was called "insulin" and was thought to control the metabolism of sugar. Missing insulin led to an increase of sugar in the blood which was then excreted in urine.

Banting then figured out a process to create insulin from the pancreas. He discussed this approach with J. J. R. Macleod, Professor of Physiology at the University of Toronto. Macleod provided experimental facilities and the assistance of one of his students, Charles Best. Banting and Best, with the assistance of biochemist James Collip, began the production of insulin.  

Banting and Macleod were jointly awarded the 1923 Nobel Prize in Physiology or Medicine. Banting flew into a rage that he would share the Prize with Macleod, whom he felt had not contributed enough to deserve the Prize. He eventually decided to split his half of the Prize money with Best. In response, Macleod split the other half of the Prize money with James Collip.

     Summary:  Still believed.  

1922:  Archibald Vivian Hill was an English physiologist, one of the founders of biophysics and operations research. His important work was on the problem of lactic acid in muscle, particularly in relation to the effect of oxygen upon its removal in recovery.  This in turn led him to study the dependence of heat production on the length of muscle fibre.

His work on muscle function, especially the observation and measurement of thermal changes associated with muscle function, was later extended to similar studies on the mechanism of the passage of nerve impulses. Very sensitive techniques had to be developed and he was eventually able to measure temperature changes of the order of 0.003°C over periods of only hundredths of a second. He was the discoverer of the effect that heat was produced as a result of the passage of nerve impulses. His insights gave rise to an enthusiastic following in the field of biophysics.

Otto Meyerhof showed that in isolated but otherwise intact frog muscle (that is, without most of the frog attached), the lactic acid formed is reconverted to carbohydrate in the presence of oxygen.  He also showed that his preparation of a KCl extract of muscle could carry out all the steps of glycolysis with added glycogen and hexose-diphosphate in the presence of hexokinase derived from yeast. (Sorry, but that's as simple as it gets.  Some things really are complicated.)  

In other studies, he also showed how glucose was also glycolysed, and this became the foundation of the Embden-Meyerhof theory of glycolysis (now called the Embden-Meyerhof-Parnas pathway). He was able to show that there is a fixed relationship between the consumption of oxygen and the metabolism of lactic acid in the muscle.

After his work on lactic acid and heart muscles, he went on to show that some phosphorylated compounds are rich in energy, which influenced not only our concepts of muscular contraction, but of the entire significance of cellular metabolism. A continuously increasing number of enzymatic reactions are becoming known in which the energy of adenosine triphosphate (ATP), the compound isolated by his associate Lohmann, provides the energy for endergonic synthesis reactions. The importance of this discovery for the understanding of cellular mechanisms is generally recognized and can hardly be overestimated.

     Summary:  Still believed.  

1920:  Schack August Steenberg Krogh worked on the mechanism of regulation of the capillaries in skeletal muscle. He was first to describe the adaptation of blood perfusion in muscle and other organs according to metabolic demands by opening and closing the arterioles and capillaries.  Interestingly, in a link to the Nobel Prize of 1923, together with his colleague Hagedorn in 1922, Krogh made contributions to create a method for producing insulin by ethanol extraction from the pancreatic glands of pigs, thereby reducing the cost and increasing its availability.  

     Summary:  Still believed.  

Search Lessons 

I'm not sure there's much to tell you here about clever search methods--just that sometimes when doing research you need to spend the time to actually read through your findings, and not just take it on faith that everything you see initially is correct.  

Or perhaps the real SearchResearch lesson is the realization that even things we hold to be true now are subject to updates and improvements as we learn more.  Sometimes they're tossed out altogether.  What you know now to be true... may no longer be true in a few years.  

(This was highlighted by Ramón in his comment about the Nobel Prize Search on the use of lobotomy as a psychiatric treatment, recognized with a Nobel Prize to António Egas Moniz in 1949.  In the context of the time it was considered revolutionary.  But now that we've learned more, that treatment is no longer considered acceptable.)   

I hope you enjoyed this Challenge as much as I did.  I read a good deal about each of these Nobel recipients, trying to understand their work in the context of their times, as well as trying to understand it WRT how we think about these things today.  

As always, stay curious. 

And Search On! 

Wednesday, April 13, 2016

Answer (part 2): When you want just the headlines...

As you might remember, 

... the second part of last week's Challenge was: 

2. (Harder) Can you find the top 100 LA City Council headlines on guns, and then extract the publication dates to create a week-by-week histogram of when these articles were published?  (This is a two-step challenge: (a) find and extract the dates, (b) put the dates into a spreadsheet and create a histogram showing the number of publications on this topic by week.)  

To solve a Challenge like this (which looks a little like something a data journalist might do), it's really useful to work backwards from the goal.  

Here we want the histogram--one that looks like this (I'm doing this by months rather than weeks to make it simpler to read): 

Each bar shows the number of headlines published on the topic of  LA City Council and guns, by month from March 2015 - March 2016.  

That's what we want to create.  Now, how do we get from a SERP full of headlines (see below) to the histogram above? 

This is what the SERP from our search looks like: 

Here's the plan:  

1.  Find all the articles published on our topic in our time period (3/15 - 3/16).  

2.  Extract the text from the SERP and put into a text file.

3.  Extract the publication dates from each of the headlines, put those into another file.

4.  Clean the data (to get rid of any errors). 

5.  Sort the dates and make the histogram.  

(Told you it was slightly harder than the average Challenge.)  

Let's work through this each step at a time: 

1. Find the articles 

Good news.  We've already done that.  You just use the Advanced Search features in Google News.  But how to get all of the top 100 results?  

Remember that you can use the Search Settings to change the number of results displayed on your SERP.  Note that you have to turn off "Instant predictions" before you can set the results per page to the maximum and show all 100. 

2.  Extract the text from the SERP and put into a text file.

Now, once you have all 100 results shown on the page, you can simply select all 100 results, copy (Control-C or CMD-C), and then paste into your favorite text editor.  (Paste as "text-only," you don't want the images or the HTML in there.) 

At this point, you should have a text file that looks like this: 

3.  Extract the publication dates from each of the headlines, put those into another file.

As you can see, each of the dates in the text file represents exactly one article that's on our topic.  Since we're trying to count the number of articles by week (or by month), all we need to do is to pull out each of the dates in this file and put them into a spreadsheet for making the histogram.  

Since this is a smallish data set, you could just do it by hand.  But what if we were going to analyze thousands of dates over many years?  How would we pull the dates then? 

There are a couple of options: 

(a) Write a small program to go line-by-line through the text, finding the dates, and writing them out to another data file. 

(b) Use a text editor that has a regular-expression pattern matcher built in.  (I sometimes use TextWrangler in this way.) You can do a search for the year, and extract all of the lines that have a date in them.  

(c) Use built-in Linux commands to extract the lines with dates.  As I've mentioned before, the Linux command grep is used to pull out lines from a text file that match a regular expression pattern. 

I opened a Terminal on my Mac, did the following grep: 

     grep -E '20[0-9][0-9]' results.txt > dates-file.txt 

The magic here is in the 20[0-9][0-9] -- that's the pattern I wanted to match.  It will match any 2 digits preceded by a 20 (e.g., 2014 or 2015).  (You'll often see this written as 20\d\d, but not every version of grep understands \d, and without the quotes the shell strips the backslashes before grep ever sees them.) 

The rest of that line says to do the matching on the results.txt file (where I put all of the text from the SERP) and put the results into dates-file.txt 
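
To see what this filtering step does, here is a minimal sketch on a few made-up lines.  The sample text, the source names, and the function name filter_dated_lines are all hypothetical; your pasted SERP text will look different.  It writes the digit class as [0-9], which is a spelling every grep understands.

```shell
# Hypothetical sample of pasted SERP text (not real headlines).
sample='Example headline about a council gun vote
Example Daily News - Oct 27, 2015
Another example headline about gun storage
Example Radio - Jul 28, 2015
Related coverage'

# Keep only the lines that contain a 20xx year.
# [0-9] is the portable way to say "one digit".
filter_dated_lines() {
  grep -E '20[0-9][0-9]'
}

printf '%s\n' "$sample" | filter_dated_lines
```

Only the two lines that end in a date survive; the headline lines and the "Related coverage" line are dropped.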

You can see that each line of text ends with a date.  That's handy because now I want to... 

3.  Extract the publication dates from each of the headlines, put those into another file.

Again, there are a number of ways to do this.  If you're a spreadsheet jockey, you can pull this into your favorite spreadsheet and then extract out the dates from each line.  

Since I know Linux command lines, I did something really fast and used the awk command.  (That link to awk is a pretty good tutorial on how to use it.)  Basically awk lets you change a pattern that's matched in each line; it's an incredibly handy tool to know if you do much of this kind of data transformation.  

     awk '{ print $(NF-2) " " $(NF-1) " "  $NF }' dates-file.txt > results2.txt

This looks complicated, but it's really not bad.  Let's break it down:  

    awk 'mini-program' input-file > output-file

All I'm doing is running the 'mini-program' on the input-file and sending the output to a new output-file.

The mini-program is also simple once you look into it.  It's just a print statement with a list of things to print.  The variable $NF means the "last item in the line of text."  Then the variable before that is $(NF-1), which means "the item BEFORE the last item in the line of text."  And $(NF-2) means, of course, the "one before that..."    The other things in there are just spaces to put in between the items on output.  

Make sense? 
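
If it helps, here is a small sketch that runs the same mini-program on a single made-up line.  The line itself and the function name last_three_fields are hypothetical, and it assumes the date is written as three space-separated items such as "Oct 27, 2015".

```shell
# Hypothetical line, in the shape of one row of dates-file.txt.
line='Example Daily News - Oct 27, 2015'

# awk splits each line on whitespace into fields; NF is the number of fields.
# Here the fields are: Example, Daily, News, -, Oct, "27,", 2015 (so NF is 7).
# $(NF-2), $(NF-1) and $NF are therefore "Oct", "27," and "2015".
last_three_fields() {
  awk '{ print $(NF-2) " " $(NF-1) " " $NF }'
}

printf '%s\n' "$line" | last_three_fields
```

This prints Oct 27, 2015.  A line with fewer than three items on it would not fit this pattern and would need to be handled separately.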

So NOW we've got a file that's just the dates from the SERP.  (And maybe a bit more...) Before we create our histogram, we need to go through the data and... 

4.  Clean the data (to get rid of any errors). 

Since this is such a small set of numbers, we can just look at it in our text editor and fix up whatever strange things might have slipped through.  Easy.  
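
For a larger file, eyeballing won't scale.  One possible sketch (not what I did here) is to keep only the lines that look exactly like a date.  This assumes the dates are written like "Oct 27, 2015"; the function name keep_clean_dates and the sample lines are made up for illustration.

```shell
# Keep only lines shaped like "Mon D, YYYY" or "Mon DD, YYYY".
# Anything else (leftover words, partial dates) is dropped.
keep_clean_dates() {
  grep -E '^[A-Z][a-z][a-z] [0-9][0-9]?, 20[0-9][0-9]$'
}

# Hypothetical input: two good dates and one stray line.
printf '%s\n' 'Oct 27, 2015' 'News - Mar 2016' 'Jul 8, 2015' | keep_clean_dates
```

The stray middle line is filtered out.  It's still worth looking at what got dropped, since a rejected line may be a real date in a format you didn't expect.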

And now we're ready to do the last step: 

5.  Sort the dates and make the histogram.  

As you can see, I just opened a Google Sheet and pasted the data there.  You'll quickly notice that they're out of order, so I sorted them, put a title on the top of the column, and then created a second column that is just the month + year (so I could make my histogram by month... if you want to do this analysis week-by-week, this is where you'd do that).  
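
If you'd rather stay on the command line than use a spreadsheet, the month-by-month count can be sketched with awk and sort as well.  This is an alternative to what I did, and it assumes every line looks like "Oct 27, 2015"; the function name count_by_month and the sample dates are hypothetical.

```shell
# Count dates per month and print a simple text histogram.
count_by_month() {
  awk '
    BEGIN {
      # Map month names to numbers so the output sorts chronologically.
      n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
      for (i = 1; i <= n; i++) month[names[i]] = i
    }
    # $1 is the month name and $3 is the year; build a YYYY-MM key.
    ($1 in month) { count[sprintf("%s-%02d", $3, month[$1])]++ }
    END {
      for (key in count) {
        bar = ""
        for (i = 0; i < count[key]; i++) bar = bar "#"
        print key, bar, count[key]
      }
    }
  ' | sort
}

# Hypothetical dates, in the shape produced by the awk step above.
printf '%s\n' 'Oct 27, 2015' 'Oct 3, 2015' 'Jul 28, 2015' 'Mar 1, 2016' | count_by_month
```

Each output line is a month, a bar of # marks, and the count.  Months with no articles simply don't appear, so a spreadsheet chart is still the better choice if you want the empty months shown.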

And... voila, we've got our histogram (see the top of this post).  

I know that was a lot of steps, but sometimes if you want to understand a topic, you really need to figure out how to find the data, and then process it to get what you really want.  

Search Lessons 

If you look at both parts of this Challenge (#1: Find the news articles; #2: count them and create the histogram), there are a number of lessons to learn.  

1.  Use the Advanced search UI for News when you want to zero in on something precise.  You can search different parts of the articles, by paper, by time, or by region.  

2.  If you're not getting enough results, generalize your query.  For instance, searching just in the "Los Angeles" region might not give you enough results--consider opening it up to "California" or even "United States."  

3. If you start doing data manipulation, learn a few tools to help out.  In this example, I used both grep and awk to help pick out just the parts of the data that you need to extract.  


When teaching critical analysis of news (or any kind of media genre), a time-based count is a useful way to get a handle on how much effort is being spent on a topic.  What we did here was a bit complicated, but there's no reason a class couldn't do much of this by hand and learn some pretty remarkable patterns of coverage.  As usual, it's best to go with topics that are important to your students--local news, or stories that heavily influence your students' lives.

As an illustration, I re-ran this headline extraction + count + histogram routine on the topic of Affluenza, the defense made by a teen in Texas to account for his DUI (which led to the death of 4 people, and now, I see, years in jail).  In the chart below, you can see the intense interest that's relatively short-lived in the topic.  Although I pulled the data using the methods described above, this kind of analysis could be used for all kinds of teachable moments.