This was harder than most...
... but everything is a bit harder in the City of Angels, as much hard-boiled detective fiction will bear out.
But there are multiple solutions. Today I'll just write about the simple Google News solution, and then later this week, I'll get around to showing you some other methods. (It's slightly complicated because I'm traveling this week, which is why this Challenge answer is delayed.)
Suppose I'm a reporter trying to understand how the Los Angeles City Council deals with gun-related issues. Can you (expert SearchResearchers) tell me how to do the following?
1. Can you search the major news outlets in the Los Angeles (LA) region for news articles over the past year that report on the City Council considering any kind of gun-related actions? (Be generous here--if the council heard a report about the use of guns, that would count.)
2. (Harder) Can you find the top 100 LA City Council headlines on guns, and then extract the publication dates to create a week-by-week histogram of when these articles were published? (This is a two-step challenge: (a) find and extract the dates, (b) put the dates into a spreadsheet and create a histogram showing the number of publications on this topic by week.)
As I mentioned last week, the problem here is to search JUST on the headlines. How can you do that on Google?
First of all, remember that there IS a Google News service. So what happens if we search for:
[ Los Angeles City Council gun ]
You'll get a fairly unfiltered set of news articles containing these words, and the results aren't very specific. I'd say this isn't a great way to go--it's too open-ended, and the search terms aren't always in the headlines. That's because the default News search looks at both the headlines and the body text.
The big trick about searching Google News is to realize that there's a hidden dropdown widget in the News search box. See below:
It's only when you roll over that downward-pointing triangle that you realize what it's for:
Ah ha! It's the apparently-missing-and-well-hidden Advanced News Search UI!
If you click on that, you'll get lots of options that would be handy for a Challenge like this:
I've filled out the form to search for city and council and gun within article headlines, limited to articles published between the specified dates.
Note that I also added a Location at the bottom--here I put in Los Angeles.
And the results come back:
You might wonder why the 4th article is considered "in" the Los Angeles area. It's because that's where the website, OpposingViews.com, is headquartered. The Los Angeles Times and KABC-TV are obviously correct, but the "Location" filter works off of the location of the organization that's writing, rather than the location that's being written about.
(Note that this is one of those things that will get better over time as entity identification improves. For the moment, we have to live with this as it is.)
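To make that Location behavior concrete, here's a minimal Python sketch of what a headline-plus-location filter seems to be doing. This is NOT Google's implementation. The Article record, the headline_search function, the publisher names, and every sample article below are invented for illustration, and I'm assuming simple whole-word matching on the headline.

```python
import re
from dataclasses import dataclass
from datetime import date


@dataclass
class Article:
    headline: str
    body: str
    published: date
    source: str
    source_hq: str  # where the publisher is based, NOT what the story is about


def headline_search(articles, terms, start, end, hq=None):
    """Return articles whose headline contains every term as a whole word
    (case-insensitive), published between start and end inclusive.
    If hq is given, keep only publishers headquartered there."""
    wanted = {t.lower() for t in terms}
    hits = []
    for a in articles:
        words = set(re.findall(r"[a-z0-9']+", a.headline.lower()))
        if not wanted <= words:
            continue  # a term is missing from the headline (body text is ignored)
        if not (start <= a.published <= end):
            continue
        if hq is not None and a.source_hq != hq:
            continue
        hits.append(a)
    return hits


# Hypothetical sample data, made up for this sketch.
articles = [
    Article("City Council votes on gun storage rule", "...",
            date(2013, 5, 1), "Example Times", "Los Angeles"),
    Article("Budget hearing runs long",
            "The city council also discussed a gun buyback.",
            date(2013, 5, 2), "Example Times", "Los Angeles"),
    Article("Chicago city council debates gun law", "...",
            date(2013, 5, 3), "Example Views", "Los Angeles"),
    Article("City council gun vote delayed", "...",
            date(2013, 5, 4), "Example Bee", "Sacramento"),
]

hits = headline_search(articles, ["city", "council", "gun"],
                       date(2013, 1, 1), date(2013, 12, 31),
                       hq="Los Angeles")
for a in hits:
    print(a.published.isoformat(), a.source, "-", a.headline)
```

Notice what happens with the sample data: the second article is dropped because the terms appear only in its body, the Sacramento article is dropped by the location filter, and the Chicago story is kept because its (made-up) publisher is based in Los Angeles. That last case is the same effect as the 4th result above.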
Now, if you do this same kind of "headline search" with a different query and it returns only a few results, consider which OTHER news organizations might well write about the topic even though they're headquartered somewhere else.
So let's run a query for [ county courthouse ]. If you search only in Los Angeles, there are only 5 results. But change the location to "California," and you'll get a superset of all the headlines written about the county courthouse from news organizations within California.
Now we see a LOT more results (around 60):
As you'd expect, there are more slightly off near-hits here. The "Knox County Courthouse..." article is from SFGate (in San Francisco), but it's not about California. We'd have to filter through these results as well, in order to get to the kinds of news stories that we really want.
But at least you've learned a new way to search JUST the headlines of News stories.
However, I have to zoom to the airport just now... I'll answer the question about how to extract the dates and create the histogram tomorrow, along with the Search Lesson.
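If you want a head start on part (b), here's a minimal Python sketch of the week-by-week counting step. It assumes you've already extracted the publication dates into a list (that's part (a), which this sketch doesn't cover). The function name weekly_histogram and the sample dates are made up, and I'm assuming weeks start on Monday.

```python
from collections import Counter
from datetime import date, timedelta


def weekly_histogram(dates):
    """Count dates per week. Each week is keyed by the Monday that starts it,
    and weeks with no articles are included with a count of zero."""
    counts = Counter(d - timedelta(days=d.weekday()) for d in dates)
    if not counts:
        return {}
    hist = {}
    week, last = min(counts), max(counts)
    while week <= last:
        hist[week] = counts.get(week, 0)
        week += timedelta(days=7)
    return hist


# Hypothetical publication dates, invented for this sketch.
sample_dates = [
    date(2013, 5, 1),
    date(2013, 5, 3),
    date(2013, 5, 6),
    date(2013, 5, 21),
]

hist = weekly_histogram(sample_dates)
for week_start, n in hist.items():
    # One '#' per article published in the week starting on week_start.
    print(week_start.isoformat(), "#" * n)
```

Filling in the empty weeks matters for a histogram: without them, a quiet stretch would vanish from the chart instead of showing up as a gap.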
Stay tuned, and search on!
Hello Dr. Russell. I hope your flight is good and you find new interesting things to create Challenges.
I tried searching in Google News but forgot about the big trick. When I found only a few results, I looked for something else.
I also thought the answer involved something different, and that didn't help me try harder with this approach.
I am looking forward to learning the other ways you will teach and, of course, the second part of the answer to this Challenge. I am sure your histogram will not be done by counting one by one, using only search tools and selecting different time periods.
As always very interesting. I am glad I tried. I didn't know how to answer but learned and practiced many things.
Have great trip and see you tomorrow for the new SearchResearch Challenge.