Monday, March 4, 2013

Why site: sometimes doesn't seem to work



Last week a friend wrote to me with a great question.  He wrote:  
When I use the site: operator, it sometimes doesn't do what I expect.  For example, I was looking for some JavaScript code and, in particular, did NOT want any results from the closure-library.googlecode.com site.  So I ran the following query:  
           [ -site:closure-library "closure" autocomplete ]


What's weird is that the results STILL include items from closure-library.googlecode.com.  
Doesn't the minus sign in front of site: prevent this from happening? 

I had to stop and think about this for a minute.  Why WAS he getting results from closure-library.googlecode.com? 

Why?  Because site: matches only complete domain names, or a complete trailing part of one such as a top-level domain.  It does not match a fragment from the front of a name.  To make this particular example work, you have to fully specify the domain.  That is, you'd have to write:  -site:closure-library.googlecode.com  

Versions of site: that work properly are...  

     site:.gov    -- searches all .GOV sites
     site:searchresearch1.blogspot.com inurl:2010   -- searches my blog for pages with 2010 in the URL
     site:www.nasa.gov/multimedia   -- searches all pages under www.nasa.gov/multimedia
     site:www.google.co.za news   -- searches www.google.co.za (Google South Africa) for pages with the word "news"

And of course, putting a minus sign in front of any of these will EXCLUDE those sites from the results.  

But you can't just put the leading part of a website's name into the site: operator and hope for it to work.  What you give it has to be a complete domain name, or a complete trailing portion of one (such as .gov or googlecode.com)! 
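As a rough illustration, the rule can be modeled in a few lines of Python. This is only a sketch of the behavior described in this post, not Google's actual implementation: the function name site_matches is hypothetical, and the assumption is that site: compares whole dot-separated labels from the right-hand end of the host name, plus an optional path prefix.

```python
from urllib.parse import urlsplit


def site_matches(site_value: str, url: str) -> bool:
    """Hypothetical model of site: matching (an assumption, not Google's code).

    The operator value must equal the trailing dot-separated labels of the
    URL's host name. An optional path after the first "/" must be a prefix
    of the URL's path. URLs are expected to include a scheme (http://...).
    """
    # Split the operator value into its domain part and optional path part.
    domain, _, path = site_value.lower().partition("/")
    domain_labels = [label for label in domain.split(".") if label]

    parts = urlsplit(url)
    host_labels = (parts.hostname or "").lower().split(".")

    # Whole labels only: "closure-library" is not a trailing label of
    # "closure-library.googlecode.com", so it cannot match.
    if not domain_labels or len(domain_labels) > len(host_labels):
        return False
    if host_labels[-len(domain_labels):] != domain_labels:
        return False

    # Simplified path check: a plain string-prefix comparison.
    if path:
        return parts.path.lower().lstrip("/").startswith(path)
    return True


url = "http://closure-library.googlecode.com/"
print(site_matches("closure-library", url))                 # False
print(site_matches("closure-library.googlecode.com", url))  # True
print(site_matches(".gov", "http://www.nasa.gov/"))         # True
```

Under this model, the leading fragment "closure-library" never lines up with the end of the host name, which is consistent with the -site:closure-library query failing to exclude anything.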

Hope this pro-tip helps some of you! 

Search on. 

 



8 comments:

  1. Would you write a post about Google Alerts, with query tips for it? I find this service a little hard to use, though maybe that's just me. The results for my queries aren't the same in Alerts as on Google.com.
    I think this service should keep growing, in the way of Google Now and IFTTT but wider: an alert when the weather will change, when a movie arrives at a nearby theater, etc.
    P.S. site:www.google.co.za news - searches all pages on 'www.google.co.za' for the word 'news'. Almost every Google navigation bar has it.
    site:news.google.co.za "query" - searches for the exact words in the News service of 'google.co.za'.

  2. Thanks for the tip.

    I thought it was like this example, related:simplyrecipes.com/recipes/perfect_guacamole/, where you need to add a space in the query to get the desired results: related: simplyrecipes.com/recipes/perfect_guacamole/

    And now I've learned something new!

    I'd like to know more about what Dmytro asks, if possible. It would be interesting to cover Google Alerts.

    Thank you and have a great month, Dr. Russell

  3. Thanks, Dan, for the useful tips. This particular tip struck a chord with me: I think I will find it useful, and perhaps it will help others. I don't have a webpage and have never built a website, and it struck me that that type of experience is probably very useful. Not that I want to build websites, but understanding the building blocks ought to give you insight into search techniques. I was just looking at Google Custom Search Engine for ideas. If you think we could benefit, could you give a tip on it in the future? Thanks.

  4. Daniel, I believe that you should have used a completely different Google search operator, namely inurl:. So, in your example, this would have given the result you first expected:
    -inurl:closure-library "closure" autocomplete

    This area has long been well covered here: http://www.googleguide.com/advanced_operators.html
    I also like this article: http://iosint.wordpress.com/category/osint-in-practice/2-collection/search-engines/

    Replies
    1. I completely agree with you. In fact, I meant to mention it in this post (but ran out of time). I previously covered inurl: in http://searchresearch1.blogspot.com/2010/10/around-has-always-been-around.html and http://searchresearch1.blogspot.com/2012/12/wednesday-search-challenge-121912.html and
      http://searchresearch1.blogspot.com/2011/10/wednesday-search-challenge-october-19.html and
      http://www.powersearchingwithgoogle.com/assets/textversions/6-24-Using_site_structure/624Usingsitestructure.html#h.eccuysiayehc

  5. It might be a bit nitpicky, but since this is a post on the workings of site:, I wanted to point out that searching for site:www.example.com would not search all the pages at example.com. The correct search to do that would be site:example.com. That is because the www is treated as a sub-domain, so including it excludes all other sub-domains.

    When I search the Canadian Forces website like this, site:www.forces.gc.ca, I miss pages on sub-domains like army.forces.gc.ca. In that particular case the vast majority of pages are found in sub-domains.

    In general I avoid using www in my site: searches, unless I have a good reason to leave it in.

    On a couple of rare occasions I have found the search without www useful as a way to get a sense of what sub-domains exist. If, after surveying the results, a particular sub-domain turns out to contain a particular type of information, you can then focus in on that sub-domain.

    I hope that is clear, and useful.

  6. The site: operator, I've just learned from Alex Chitu's blog (http://googlesystem.blogspot.pt/2013/03/advanced-uses-for-googles-site-operator.html), is one of the few Google operators, if not the only one, that works with a wildcard. So, if you search [ site:amazon.* ] you will find all of Amazon's international domains.

    The problem is that the results are not consistent, that is, this trick does not work every time. In fact, [ -site:closure-library.* "closure" autocomplete ] gives the same results as the search without the asterisk. Even if you add .com to the search — [ -site:closure-library.*.com "closure" autocomplete ] — the results are the same.

    What's more, [ site:closure-library.* "closure" autocomplete ] or [ site:closure-library.*.com "closure" autocomplete ] — without the minus, i.e., supposedly telling Google to search only in that — don't yield any results, contrary to [ inurl:closure-library "closure" autocomplete ].

    By the way, somehow if you add the asterisk to inurl:closure-library [ inurl:closure-library.* "closure" autocomplete ], the set of results is different and much smaller, not larger as could be expected.
