Embarrassments of riches: Managing research assets

Last updated May 15, 2013

There’s research, there’s writing, and then there’s that netherworld in between: wrangling all the digital files you gather over the course of your work. Digital files are often easier to deal with than stacks of paper, but they can also proliferate frighteningly quickly.

I teach a workshop on this topic, catchily titled Managing Research Assets (better names welcome). Below is a digital version of the workshop handout, followed by a link dump of my favorite posts about developing and refining digital research workflows. You can also download a PDF version of my handout, or a Word version if you’d like to modify it.

Jump to links to other resources about research workflows. (And in case you’re curious, here’s my own research workflow.)

Preserving your digital assets

The Library of Congress offers guidelines for preserving digital material, including photographs, audio, video, email, digital records, and websites. For a more technical and specific discussion of digital formats, see the Library of Congress’s “Sustainability of Digital Formats.”

In general, the Library of Congress recommends that you:

  • Identify: Make an audit of what you have.
  • Decide: Choose which of your assets you want to keep and which you don’t need.
  • Organize: Give your assets descriptive filenames, sort them into a logical file structure, and write down your organizational scheme.
  • Make copies: It’s a good idea to have copies in a number of locations. Every few years, check your copies to see if you need to export them to a newer format.
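
If you're comfortable with a little scripting, the audit and the periodic check of your copies can be partly automated. What follows is only a minimal sketch in Python, not something the Library of Congress prescribes: the function names are my own, and it assumes your original files and your copy live in two folders with the same internal structure.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Audit a folder: map each file's path (relative to root) to its checksum."""
    return {
        path.relative_to(root).as_posix(): file_checksum(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_copies(original: Path, copy: Path) -> list[str]:
    """List files that are missing from the copy or whose contents differ."""
    source = build_manifest(original)
    backup = build_manifest(copy)
    return [name for name, digest in source.items() if backup.get(name) != digest]
```

Calling `compare_copies(Path("research"), Path("/Volumes/backup/research"))` (both paths are placeholders) returns an empty list when the copy is complete and intact. It does not tell you whether a file format is becoming obsolete; that still takes a human look every few years.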

Developing a digital research workflow

There’s no “right” research workflow. The practice that makes sense for you will depend on your own research habits and the kinds of material you work with. As you investigate tools, think about:

  • Capturing sources. Do you do most of your research online, in an archive, or at the library? You’ll need a tool (or tools) that’s appropriate for the way you really work and easily captures the data you need in a format that’s preservable — and preferably in a way that’s organized.
  • Metadata. Few things are more frustrating than locating just the information you need but not being able to determine its origin. That’s why it’s important to think about how you’re capturing information about each asset you gather, like its source and its importance to your research.
  • Searching and retrieving. None of this does you any good if you can’t get your hands on the data you need when you need it. Metadata will help you find the right stuff, but you may also want to think about tools for OCR (optical character recognition) and for “fuzzy” searching.

You should also be thinking about whether and how you can export your data. That may seem boring now, but it won’t when the tool you’re using becomes obsolete!
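
To make these three concerns concrete, here is a small Python sketch of what metadata, "fuzzy" searching, and export can look like at their simplest. Everything in it is hypothetical: the sample records, the field names, and the 0.8 similarity cutoff are assumptions for illustration, and the tools listed below do all of this far more capably.

```python
import csv
import difflib
import io

# Hypothetical records: one per asset, noting where it came from and why it matters.
ASSETS = [
    {
        "file": "1923-05-14_board-minutes_p01.jpg",
        "source": "City Archive, Box 12, Folder 3",
        "note": "first mention of the merger",
    },
    {
        "file": "1924-01-02_letter_smith-to-jones.jpg",
        "source": "City Archive, Box 14, Folder 1",
        "note": "Smith objects to the new bylaws",
    },
]


def best_word_score(query: str, text: str) -> float:
    """Highest similarity (0 to 1) between the query and any single word in text."""
    return max(
        (difflib.SequenceMatcher(None, query.lower(), word).ratio()
         for word in text.lower().split()),
        default=0.0,
    )


def fuzzy_find(query: str, assets: list[dict[str, str]], cutoff: float = 0.8) -> list[dict[str, str]]:
    """Return assets with at least one word that approximately matches the query."""
    return [
        asset for asset in assets
        if best_word_score(query, " ".join(asset.values())) >= cutoff
    ]


def export_csv(assets: list[dict[str, str]]) -> str:
    """Export the records as CSV text, a plain format most tools can read."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["file", "source", "note"])
    writer.writeheader()
    writer.writerows(assets)
    return buffer.getvalue()
```

With these sample records, `fuzzy_find("merjer", ASSETS)` still turns up the board minutes despite the misspelling, and `export_csv(ASSETS)` gives you something you can open in a spreadsheet long after any particular tool is gone.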

Tools to Consider

This list is not comprehensive. Instead, it reflects my understanding of the tools my colleagues are actually using at the moment. Prices reflect educational discounts, if applicable. Am I missing something important? Please let me know in the comments!

Backup Tools
Yes, you need to be doing this. Why are you not doing this? Do it right now! Think about whether you want to store your backup on a hard drive or in the cloud — or both!

  • Time Machine (Mac, already installed on your computer, automatically backs up your data to a hard drive at scheduled intervals)
  • Windows Backup and Restore (Windows, already installed on your computer, backs up your data to a hard drive at scheduled intervals)
  • Mozy (Mac and Windows, $5.99/month, backs up your data remotely at scheduled intervals)
  • BackBlaze (Mac and Windows, $5/month, backs up your data remotely at scheduled intervals)
  • SpiderOak (Mac and Windows, free or $100/year, backs up your data remotely)
  • Dropbox (Mac and Windows, free or $10–$20/month, backs up your data remotely)

Bibliographic Management
There are a lot of good options out there for saving, sorting, and citing your sources. The key point is that you really should be using some kind of bibliographic management system. You’ll regret it if you don’t.

File Renaming and Organization
If, for example, you take a lot of photos in an archive, you probably come home with tons of files with totally unintelligible names. Several tools can help you organize these assets and give them human-readable names.

  • NameDropper (Windows, $10, batch renamer that allows you to set patterns)
  • Belvedere (Windows, free, allows you to set rules to rename and organize files)
  • Dropbox Automator (Windows and Mac, free, allows you to automatically perform actions on files in a Dropbox folder)
  • Hazel (Mac, $21.95, allows you to set rules to rename and organize files)
  • Automator (Mac, already installed on your computer, allows you to perform many actions on your files)
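
If you'd rather script it yourself, the pattern-based renaming these tools offer can be sketched in a few lines of Python. This is an illustration, not a description of how any of the tools above work: the function names, the prefix, and the three-digit numbering are my own assumptions. It defaults to a dry run so you can inspect the plan before anything is renamed.

```python
from pathlib import Path


def plan_renames(folder: Path, prefix: str, pattern: str = "*.jpg") -> list[tuple[Path, Path]]:
    """Pair each matching file (in name order) with a descriptive, numbered name."""
    files = sorted(folder.glob(pattern))
    return [
        (old, old.with_name(f"{prefix}_{index:03d}{old.suffix.lower()}"))
        for index, old in enumerate(files, start=1)
    ]


def apply_renames(plan: list[tuple[Path, Path]], dry_run: bool = True) -> None:
    """Print the plan; rename only when dry_run is False, never overwriting a file."""
    for old, new in plan:
        print(f"{old.name} -> {new.name}")
        if dry_run:
            continue
        if new.exists():
            raise FileExistsError(f"Refusing to overwrite {new}")
        old.rename(new)
```

For example, `plan_renames(Path("photos"), "city-archive_box12", pattern="*.JPG")` would map the camera's files, in name order, to `city-archive_box12_001.jpg`, `city-archive_box12_002.jpg`, and so on. Note that on some systems the pattern is case-sensitive, so match it to the extension your camera actually writes.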

Indexers and “Everything Buckets”
Depending on how you work, you may find it important to grab and tag things — from the Internet or from “real life” — quickly and easily. There are some very good tools for this. Be careful, though: It’s not enough to grab something. You have to be able to find it again, too!

  • EverNote (Windows, Mac, Android, and iPhone; free or $45/year; captures and tags Web pages, photos, and other documents)
  • Yojimbo (Mac and iPhone, $38.99, capture and tag notes and documents)
  • VoodooPad (Mac and iPhone, $39.96, capture and tag notes and documents)
  • SOHO Notes (Mac and iPhone; $39.99; capture, tag, and organize notes and documents and create custom forms)
  • DEVONthink (Mac and iPhone; $49.95 for the personal edition; indexes your files, allows you to organize them and add notes and metadata, offers “fuzzy” searching)

Annotation Tools
This is one of the murkier categories, because many other kinds of tools have annotation capabilities built in: Zotero, EverNote, SOHO Notes, and VoodooPad, to name a few. But some of these solutions might be too much tool if you’re just in the market for annotation.

Tools to annotate websites:

  • AnnotateIt (take notes directly on any webpage and share those notes if you want; Windows and Mac, free)
  • Crocodoc (annotate and share webpages, PDFs, images, Word docs online; Windows and Mac, free)
  • Diigo (collect, highlight, annotate, and share websites; Windows, Mac, iPhone, iPad, and Android, free)
  • A.nnotate (annotate and share websites and PDFs; Windows, Mac, free for limited capabilities)

Tools to annotate PDFs and other documents:

  • GoodReader (annotate, highlight, comment on a wide range of files; iPhone and iPad, $4.99)
  • iAnnotate (annotate, highlight, comment on a wide range of files; iPhone, iPad, and Android, $9.99)
  • PDF Expert (annotate, highlight, comment on PDFs; iPad, $9.99)
  • Digitate (annotate images; iPad and iPhone, free)
  • ClipNotes (annotate video, designed by UCLA TFT prof Stephen Mamber; iPad, $1.99)

Optical Character Recognition (OCR)
Your sources become much more findable when you run OCR on them. Of course, depending on the kinds of sources you gather, OCR may be imperfect (or impossible), though you can sometimes improve your results by using OCR post-correction and enhancement tools. Find a more comprehensive list of open-source OCR tools here (thanks to Clemens Neudecker).

  • ABBYY FineReader (Windows and Mac, $49.99 and $99.99, respectively)
  • Adobe Acrobat Pro (Windows and Mac, $404.10)
  • OCRopus (Mac and Linux, free)
  • PDF Scanner (Mac, $14.99, scans documents and performs OCR)
  • EverNote (see above; EverNote automatically performs OCR on your documents)
  • DEVONthink (see above; DEVONthink automatically performs OCR on your documents)

Databases
An “everything bucket” is a database, of course, but sometimes you need a tool that structures your data, too. Structure is great, but you should also be honest with yourself about whether the tool will fit easily into your workflow.

If you are contemplating building a database for your research, I strongly recommend that you first read Mark Merry’s “Designing and Using Databases for Historical Research” (you’ll need to register, but it’s worth the trouble). Merry lays out some basic principles of database design that will serve you well as your research progresses and your database grows.

  • askSam (Windows, $149.95, designed to make database-creation quick and easy)
  • Microsoft Access (Windows, $139.99)
  • FileMaker Pro (Windows, Mac, $179.00)
  • Bento (a lighter-weight version of FileMaker, Mac, $49.00)
  • Base (free, part of the OpenOffice suite, Windows, Mac, Linux)
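
To give a sense of what "structure" means here, this is a minimal sketch using SQLite, which ships with Python. It is not drawn from Merry's guide or from any of the tools above; the two tables and their columns are hypothetical, chosen only to show sources and notes stored separately and linked by an ID.

```python
import sqlite3

# Hypothetical schema: each note points back to exactly one source.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sources (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    repository TEXT,
    year INTEGER
);
CREATE TABLE IF NOT EXISTS notes (
    id INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL REFERENCES sources(id),
    page TEXT,
    body TEXT NOT NULL
);
"""


def open_database(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure both tables exist."""
    connection = sqlite3.connect(path)
    connection.execute("PRAGMA foreign_keys = ON")  # reject notes without a source
    connection.executescript(SCHEMA)
    return connection


def add_source(connection: sqlite3.Connection, title: str, repository: str, year: int) -> int:
    """Insert a source and return its ID."""
    cursor = connection.execute(
        "INSERT INTO sources (title, repository, year) VALUES (?, ?, ?)",
        (title, repository, year),
    )
    return cursor.lastrowid


def add_note(connection: sqlite3.Connection, source_id: int, page: str, body: str) -> None:
    """Attach a note to an existing source."""
    connection.execute(
        "INSERT INTO notes (source_id, page, body) VALUES (?, ?, ?)",
        (source_id, page, body),
    )


def notes_for_year(connection: sqlite3.Connection, year: int) -> list[tuple]:
    """Return (title, page, note) for every note on sources from the given year."""
    rows = connection.execute(
        "SELECT sources.title, notes.page, notes.body "
        "FROM notes JOIN sources ON sources.id = notes.source_id "
        "WHERE sources.year = ? ORDER BY notes.id",
        (year,),
    )
    return rows.fetchall()
```

The payoff of the extra structure is the last function: once each note knows which source it belongs to, questions like "show me everything I noted from 1923" become one query. Whether that's worth the overhead depends, as ever, on how you actually work.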

Links on Developing a Digital Research Workflow

General Resources

I’m constantly adding links to research tools and methods here.

My starting point for finding digital research tools is the DiRT (Digital Research Tools) wiki.

William J. Turkel, at the University of Western Ontario, is the master of the digital research methodology, and his “Workflow for Digital Research Using Off-the-Shelf Tools” is an invaluable resource. If off-the-shelf tools are no longer enough for you, The Programming Historian is a wonderful, accessible way to learn programming techniques that will immediately enhance your research.

Profhacker, at the Chronicle of Higher Education, regularly publishes great advice on research tools.

I frequently check Lifehacker for excellent advice and tool recommendations.

The University of Amsterdam hosts the Digital Methods Initiative, which offers this extensive and useful tools wiki.

A number of scholars write frequently about their own research methods.

Preserving Assets

Smithsonian Institution: Born Digital Video Preservation: A Final Report (PDF)

Library of Congress Digital Preservation Program

DEVONthink

Scholars who use DEVONthink are often evangelical about it.

Automator

I’ve written about how I use Automator to batch-process research photos.

EverNote

Gina Hiatt has written about how she uses EverNote.

Kalani Craig has also written about EverNote for historical research.

Bibliographic Tools

Brian Croxall did us all a great service by writing this definitive comparison of Zotero and Endnote.

Tips on Taking Photos in Archives

An excellent guide from the University of Illinois, Urbana-Champaign.

5 thoughts on “Embarrassments of riches: Managing research assets”

  1. Miriam,
    This is a wonderful post. I’ve been working in parallel (without knowing it!) to you for several years. I created a workshop: Productivity Tools for Graduate Students (Companion LibGuide: http://libguides.gatech.edu/getresearchdone) for our students here at Georgia Tech and then began a Personal Knowledge Management blog with two other academic librarians (www.academicpkm.org). I found this post while searching “research workflow” and plan to include a link to it in an upcoming post on our blog for our week on “academic workflow” as a part of our year-long series on “A Year for Improving Productivity for Academics and Librarians”. Thanks for all your hard work!

  2. How cool! I need to refresh this post in preparation for doing a workshop this spring, and I plan to mine all the resources you’ve created. Thanks for posting, Crystal!

  3. I tried multiple times and it is not exporting a file at all, let alone the annotations. I tried to different locations and the problem is still the same. I tried by synchronizing but still did not work. Anybody facing a similar problem and having solution for that?
