Friday, March 4, 2016

Are we making personal search WORSE?

Can it be that we're making search into a harder problem than it already is?

Could be.

Once upon a time, before Google Docs & Spreadsheets, if I knew I had put together a spreadsheet for some kind of analysis, I knew it was in Excel and it was on my desktop.  Search was simple, easy, and elegant.   


Now, perhaps that same document is a Google Spreadsheet OR an Excel file.  This is also true for just plain documents: it used to be either a .TXT file or a .DOC file (both on my desktop), but now maybe it’s a Google Doc.  OR it's a .DOC.  Or it's on my desktop, or in my cloud storage.  But under which login?

Uh oh.

You can see where this is heading—as tech companies continue to innovate, the number of places and kinds of stuff I have is just going to get more and more complicated.

Let’s take a slightly more realistic problem.  

Suppose you and I are planning a progressive dinner party together. You know, the kind of party where you start at one person's house, then walk to the next house for the next course, etc.  This party will take a bit of planning, so we start putting a few notes together.  But where are the notes kept?

Is it a .DOC file that we email back and forth?  Is it just an email thread? Maybe the notes are kept in the Details of a shared Google Calendar entry on the party date.  Or perhaps you started with a Google Spreadsheet to keep track of all the people and places involved, while I started up a “My Map” on Google Maps to plan a path for the partygoers.  Maybe you bookmarked the web site of a fabulous caterer.  Or to make things worse, maybe I used my work Gmail account to start the thread about the party, but at some point we switched the discussion thread to our personal Gmail accounts.

Dang.

Think about it.  You now have ways to create some kind of note or document… and you can’t really find your notes among all of them.  And this problem is getting worse.  Here’s MY list of places and kinds of documents where I keep notes of different kinds.

> plain txt file on desktop
> MS Word
> MS Excel
> MS PowerPoint
> Google Calendar “Details”
> Google Docs
> Google Spreadsheet
> Google Maps “My Map”
> Google Bookmarks
> browser bookmarks
> Google Keep notes
> Google PDF files in Drive
> Google Presentation
> Gmail (work), content in message body
> Gmail (work), content in attachment
> Gmail (personal), content in message body
> Gmail (personal), content in attachment
> Earthlink (personal backup email account), both body and attachments
> Facebook comment thread, status updates, etc.
> Twitter tweets
> Google Chats
> Google Site (my personal web site)
> Google Blog (my work blog, of which I own 3)
> Google Task list
> my non-work, personal blog (and the comments thereon)

(Are there others I’m just not remembering in the heat of writing this essay??)

So now I have to remember not just that I have some kind of note, but also which format it’s in, which content-containing system it’s in, how to search that system, and the particular limits and properties of that system.

Thank heavens for search!

Except… 

Oh that’s right, no one search tool unites them all.  Each is separate.  Each is different.

What’s worse, almost all of these systems have really different search properties.  I can find a substring in my MS Word document (on the desktop) by using my Desktop search tool, but I can’t find a substring in my blog postings because they're kept in the cloud.  I can’t even search all of my Google Documents for substrings.  (Example:  Suppose I can't remember how to spell Wojcicki?  I can't search for just "Woj"! That kind of search works within a doc, but when you search in Google Drive across all of your documents, it's only search-for-entire-token, no substring search allowed. "Wojcicki" works, "Woj" doesn't.)
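To make the "Wojcicki" versus "Woj" difference concrete, here is a minimal sketch in Python of the two search styles. It is only an illustration and says nothing about how Google Drive is actually implemented. The document names, their contents, and all function names are made up for the example. The point is simply that an index keyed on whole tokens can't answer a partial-word query, while a plain scan of the text can.

```python
import re

# Hypothetical mini-corpus standing in for documents in a cloud drive.
DOCS = {
    "budget.txt": "Notes from the meeting with Susan Wojcicki about video budgets",
    "party.txt": "Progressive dinner party: appetizers at house one",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_token_index(docs):
    """Map each whole token to the set of document names containing it."""
    index = {}
    for name, text in docs.items():
        for token in tokenize(text):
            index.setdefault(token, set()).add(name)
    return index


def token_search(index, query):
    """Whole-token lookup: the query must equal an indexed token exactly."""
    return sorted(index.get(query.lower(), set()))


def substring_search(docs, query):
    """Linear scan: the query may match anywhere inside the text."""
    q = query.lower()
    return sorted(name for name, text in docs.items() if q in text.lower())


INDEX = build_token_index(DOCS)
print(token_search(INDEX, "Wojcicki"))  # ['budget.txt']
print(token_search(INDEX, "Woj"))       # []
print(substring_search(DOCS, "Woj"))    # ['budget.txt']
```

The token index is fast because each lookup is a single dictionary access, which is presumably why large systems like it. The substring scan finds "Woj", but it has to read every document to do so.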

Faced with this immense wealth (and complexity) of places to stash notes, I feel a bit like a squirrel looking for my cache of winter nuts in an infinite forest without a map.  Which tree has my cache?  You mean I really have to check each and every tree in the forest?  



Of course the tech companies (and I include myself in this cohort) continue to create ever more ways to store our stuff.  

I heard about a new project to let you create a maps-based itinerary of your trip, with every interesting site and restaurant noted on the path.  Great idea, until I realize that this will be yet-another-place to put notes.  Another place that I’ll need to remember, another place with a disintegrated search system.  Great.

And if you have more than one email account then God Help You.  You’re doomed to searching manually multiple times in multiple places.  (Alas, I have 4; for various legal reasons I can't merge them all together.)

But people cope by segmenting their lives, developing a practice that works for them, putting notes in places where they “naturally” go.  I have all of my personal notes and presentations on my desktop in a plain text file.  And I only ever put notes about upcoming events into my online calendar.  I have one for work, and one for personal events, which means I only have to do two searches…

The rub comes when something slightly new happens, and you don’t already have a worked-out plan for where-to-put-this-note.  And the problem I see increasingly is “I read this somewhere online, but I don’t remember where…”  If you can’t recall some feature that lets you home in on a particular subsystem, some glimmer of an idea about what kind of thing it was stored in, then you’re in for a long period of looking around.

Question to ask: How many more electronic cubbyholes and clever new document kinds can you mentally support?  How many more should we be making?  It’s clear what my strategy needs to be—I just can’t take on too many more kinds of notes and places to lose my information.

A great policy for our online information suppliers to support might be “no new personal information content models without integrated search.”  (I need a snappier catchphrase than that, but you get my point.)

This is an opportunity.  Let’s not continue to make the personal information search problem harder. Let's build an integrated search for personal content AND let's figure out a way to stop producing ever more stuff that's not part of that integrated search story.


Or we’re doomed to be forever searching for our own stuff. 
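For the curious, the "integrated search" idea can be sketched in a few lines of Python. This is a toy under loud assumptions: every source here is an in-memory dictionary, and the names (`Hit`, `make_memory_searcher`, `integrated_search`, the three sources and their notes) are all hypothetical. A real version would need a connector per service, with whatever login and API each one requires. The shape of the idea is what matters: one query goes to every store, and each answer says where the note lives.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hit:
    source: str  # which store the note lives in
    title: str   # how the note is labeled within that store


# Each source is modeled as a function from a query string to matching titles.
Searcher = Callable[[str], List[str]]


def make_memory_searcher(notes: Dict[str, str]) -> Searcher:
    """Build a case-insensitive substring searcher over an in-memory store."""
    def search(query: str) -> List[str]:
        q = query.lower()
        return sorted(
            title for title, body in notes.items()
            if q in title.lower() or q in body.lower()
        )
    return search


def integrated_search(sources: Dict[str, Searcher], query: str) -> List[Hit]:
    """Send one query to every source and tag each result with its origin."""
    hits = []
    for name, search in sources.items():
        for title in search(query):
            hits.append(Hit(source=name, title=title))
    return hits


# Hypothetical stores standing in for a desktop, an email account, a calendar.
SOURCES = {
    "desktop": make_memory_searcher(
        {"party-notes.txt": "caterer shortlist for the dinner party"}),
    "work-email": make_memory_searcher(
        {"Re: party": "let's move this thread to personal email"}),
    "calendar": make_memory_searcher(
        {"Dentist": "bring insurance card"}),
}

for hit in integrated_search(SOURCES, "party"):
    print(f"{hit.source}: {hit.title}")
```

Adding a new place to keep notes then means adding one more entry to `SOURCES`, which is the "no new content models without integrated search" policy in miniature.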




5 comments:

  1. having a significantly simpler online life, much of this was like a pillow over my head, but as I attempted to noodle parts of it out it
    made me wonder about how the argument/rant was being framed…
    …did you ever notice that 'file' is an anagram for 'life'?
    and an image/remembrance from a hike I took in an Aspen grove in the fall of 2020 (also an election year) popped into my mind…
    not quite as dark as your winter squirrel/nut cache/aspen selection, but…
    …ohhh, bother, hmmm, nuts
    Framing,Fairhurst
    …that's Dr. F. to me…
    rainbow tree
    Wojcicki…S.D. or A.E.?
    seriously, on a mac, doesn't spotlight do much of this type of search?
    btw, thought there was a note that Director Comey wishes to speak with you… but I can't seem to locate it… may be on the other phone or server? ¯\_(ツ)_/¯

    speaking of squirrels…
    unrelated, related search topic —
    a search puzzle

  2. Dan, your complaint ( http://dl.acm.org/citation.cfm?doid=1107458.1107496 , http://people.csail.mit.edu/karger/Papers/pimchapter.pdf ) is exactly why we built the Haystack system 20 years ago (http://people.csail.mit.edu/karger/Papers/desktopchapter.pdf ). Unfortunately we weren't sufficiently convincing and applications continue to be built the wrong way. We're continuing to prototype tools that aim to overcome the problem.

    Replies
    1. Haystack was a wonderful system, thanks for the reminder. Unfortunately, that lesson (as with many of the lessons of academic hypertext systems) never got transferred to the corporate world. Or, rather, building systems in that way proved to be untenable (for a variety of reasons).

  3. Use hyperlinks from within your tool of choice, such as OneNote.
