Finding data

I’ve assembled a long list of datasets that are ready to use. If you don’t find something there that works for you, however, you can look for other data to use for your final project.

Finding the right data can often be really challenging. There’s a lot of civic and scientific data out there, but that may not be what you want. For humanities projects, my favorite places to look for data are:

Data is Plural newsletter

Every week, Jeremy Singer-Vine, a data journalist, sends out a newsletter containing interesting datasets. They’re also gathered on this spreadsheet.


This is a subreddit (a category within the discussion forum Reddit) where people can ask for and offer datasets. It contains a very wide variety of data.

Humanities data repositories

Humanities data repositories exist, but none of them is comprehensive, and I’ve had very patchy luck finding anything useful in them.

Alan Liu’s list of datasets

The literature scholar Alan Liu has a very large list of datasets and corpora for the humanities. Some of them may need a bit of finagling before they’re fit for purpose (e.g., they may be in a format other than CSV).
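As one illustration of that kind of finagling, here is a minimal sketch of converting a dataset from JSON to CSV, assuming the data arrives as a JSON list of flat records. The function name `json_records_to_csv` and the sample records are made up for this example; a real dataset may be nested or use a different format entirely, and would need different handling.

```python
import csv
import io
import json


def json_records_to_csv(json_text: str) -> str:
    """Convert a JSON array of flat objects into CSV text (hypothetical helper)."""
    records = json.loads(json_text)

    # Collect column names in first-seen order, since records
    # may not all share exactly the same keys.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    # restval="" leaves a blank cell wherever a record lacks a column.
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Made-up sample data, for illustration only.
sample = (
    '[{"title": "Middlemarch", "year": 1871},'
    ' {"title": "Beloved", "year": 1987, "author": "Toni Morrison"}]'
)
csv_text = json_records_to_csv(sample)
print(csv_text)
```

To save the result, you could write `csv_text` to a file with a `.csv` extension and open it in whatever tool you plan to use for your project.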