All The Data Out There
2008 April 03
My AOL data project has gotten me both interested in and terribly frustrated with the challenges of working with massive amounts of data. Data manipulation and optimization is not my bag, but I’m afraid it’s going to have to be soon. In any case, I thought I’d share some of the great resources I’ve found.
Interesting Datasets
I would love to do projects with each of these.
- AOL Search Data
- Enron Company Emails
- The Entire Contents of Wikipedia
- Hillary Clinton’s Schedules as First Lady
- Public Transit Data Feeds
- World Database of Happiness
Compilations of Publicly Accessible Data
These sites all point to datasets that you can download and incorporate into your projects.
Pages tagged with “publicdata” on del.icio.us
Subscribe to the feed on this page to keep an eye on what interesting stuff others find.
A Meta-index of Data Sets
A great starting point, with more helpful links than this post.
Digitized Historical Collections from the Library of Congress
The central starting point for all records at the Library accessible via the Open Archives Initiative Protocol. I’m not 100% sure what that means, but it sounds like something I could use down the line.
Infochimps
A newly launched site with lots of potential to grow into something interesting.
Copyright Free and Public Domain Media Sources
Points mostly to photographs and other images, and mostly older ones, at that. Thanks, Copyright Term Extension Act of 1998.
