Twitter Archives from the Library of Congress & Google: The Facts As We Know Them By Gary Price

Posted by Celia Walter | 22 Apr, 2010
...the new Twitter archives and, to be more specific, the announcement that the Library of Congress would be getting a copy. A few hours before LC began to get the word out (via a tweet, appropriately), Google announced they were already online with a searchable version of the Twitter archive. As of today, Google’s Twitter archive only goes back a few months to February 2010 but “eventually” the entire archive back to day one will be available and searchable.

What we would like to do in this post is go over the facts and, where we don’t have the exact info we need, take an educated guess at the answers. Keep in mind that things do change and, in some cases, further details need to be discussed and decisions need to be made.

We read all of the primary documents (links are available), used the Google service, and were fortunate enough to have a telephone chat with a spokesperson from LC. We also read some “way out” stuff (e.g., the Library of Congress bought Twitter) but most of the time, just a fact or two was either missing or a bit “off”.

So, with all of that out of the way, let’s get to the details.

The Library of Congress Twitter Archive

...Update:  The Library of Congress Twitter archive will not be accessible to and searchable by the general public on the Internet or at the Library of Congress in Washington D.C. However, the archive will be accessible to researchers on-site at LC. Details about researcher access will be developed and made public in the next few months, but it’s likely a researcher will have to certify his or her identity by at least signing a form. Again, exact details are forthcoming... [There's a good deal more]

The Google Twitter Archive

Like most things Google, historical searching of tweets has a name. It’s called Google Replay.

+ As of today, you CAN search using Google Replay only back to February 2010, with a minimal delay for new tweets. There is NO embargo/delay of tweets using Google Replay. “Eventually” (that term is not defined), the entire Twitter archive will be accessible and searchable using Google Replay by anyone from any computer that can access Google. BTW, this is what the Twitter home page looked like on September 30, 2006.

+ Google Replay uses the familiar Google timeline interface (as used with Google News for some time) where you can manipulate the timeline to narrow the focus down to the minute. (Note the bar that sits on the timeline; it moves.)

+ If you want to go directly to Google Replay, this link should get you there...

 

Summary

Both services are needed. Will others come into play?

The LC Archive is essential. It’s going to receive cutting-edge preservation; it will allow qualified researchers from LC and elsewhere to mine the data; it might even create a new exhibit at LC. However, it’s not a publicly accessible research tool. I do wonder if people will show up wanting to use the database and not be able to. I would imagine the same thing happens regularly with LC users wanting to exit the library with LC materials, or with people phoning LC to ask whether it has a particular book and whether they can get it sent to them.

Google Replay IS for the public. It IS searchable and it IS easily manipulated to assist in focusing a search query. As we said a moment ago, it IS accessible from any computer connected to the web that can reach Google.

Update: “Tweets: What We Might Learn From Mundane Details” (via AOTUS Blog from Archivist of the United States, David Ferriero). (Hat Tip: ArchivesNext)

From The Resourceshelf