Digital Scholarship

Personal Collections Overview

Most researchers' personal collections take two main forms: first, collections of physical materials (index cards, photocopied articles and chapters, transcripts, photographs, video and audio tapes); and second, "born digital" documents (Word documents, digital photos, audio recordings, scans, PDFs collected from online publications or scanned at a library or archive, data files, results of analyses) stored on hard drives, CDs and DVDs, diskettes, and cloud storage such as Dropbox or Box.net. Regardless of the form, organizing and preserving your collections is critical to increasing their value as a resource for your own research.

The established principles, best practices, and recommendations listed below can guide you in saving and organizing your research collections. Describing the items in your collection in a structured way (metadata) helps you locate them, group them by shared characteristics, and generally get the most value from the time you spent collecting them. Please see the metadata page for more information.

1. Principles and best practices

   A. File names

   B. File formats

   C. Backups

   D. The critical role of metadata

2. Resources for managing files and formats

1. Principles and best practices

A. File names

Each of your files should have a unique file name that is derived from some sort of system that will help you identify and locate the file. As you may already know, without unique file names you may end up with multiple files that share a generic name, such as "draft.doc" or "image001."

You may want to come up with a common identifier for all files related to a given project, followed by some sort of identification for each item. It's a good idea to set up a clear structure so that you can easily find what you're looking for. When constructing file names, consider including:

  • Project name or acronym
  • Date or date range, using a consistent format, such as YYYYMMDD
  • Brief description of document
  • Version of file

Other tips for file naming:

  • Try not to make file names too long, since long file names do not work well with all types of software
  • Avoid special characters such as  ~ ! @ # $ % ^ & * ( ) ` ; < > ? , [ ] { } ' " |
  • When using a sequential numbering system, use leading zeros for clarity and to make sure files sort in sequential order. For example, use "001, 002, ...010, 011 ... 100, 101, etc." instead of "1, 2, ...10, 11 ... 100, 101, etc."
  • Do not use spaces. Instead, use underscores between words, e.g. file_name.
  • In general, use lower case, as problems may occur with applications that are case-sensitive.
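The naming tips above can be sketched in a few lines of Python. This is only an illustration under assumed conventions: the function names `build_file_name` and `sequence_label`, the order of the parts, and the "v02" version style are hypothetical choices, not a standard.

```python
import re
from datetime import date


def build_file_name(project, when, description, version, extension):
    """Assemble project_YYYYMMDD_description_vNN.ext from its parts."""

    def clean(text):
        # Lower-case the text, turn runs of whitespace into underscores,
        # and drop anything other than letters, digits, "_" and "-".
        text = re.sub(r"\s+", "_", text.strip().lower())
        return re.sub(r"[^a-z0-9_-]", "", text)

    parts = [
        clean(project),
        when.strftime("%Y%m%d"),   # consistent YYYYMMDD date
        clean(description),
        f"v{version:02d}",         # zero-padded version number
    ]
    return "_".join(parts) + "." + extension.lower().lstrip(".")


def sequence_label(number, width=3):
    """Zero-pad a sequence number so that names sort in numeric order."""
    return f"{number:0{width}d}"


print(build_file_name("Oral History", date(2014, 3, 7),
                      "Interview transcript!", 2, ".DOCX"))
print(sequence_label(7))
```

With these sample inputs the first call prints oral_history_20140307_interview_transcript_v02.docx and the second prints 007.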

Resources:
http://acrl.ala.org/techconnect/?p=2607
http://bentley.umich.edu/dchome/resources/filenaming.php

B. File formats

Take care when choosing file formats for long-term preservation. Stable, non-proprietary formats are best, as there is no guarantee that a software provider will support a current proprietary format in the future. While you may wish to maintain your working documents in their current formats, you should be careful to back up your files and ensure that they will be readable with future versions of the software. This also includes migrating the files to new media. It is good practice to preserve key files in stable formats. Recommended long- and medium-term archival formats include:

Text: plain ASCII text, PDF/A
Images:
 uncompressed: tiff, jpeg2000, png
 compressed: jpeg, tiff, gif, bmp, jpeg2000
Audio: aiff, wav, aac, mp3
Video: avi, mov, mpeg-1, mpeg-2, mpeg-4
Spreadsheet: csv, txt, dbf, sxc, ods, xlsx
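As a rough way to spot files that may need migrating, the list above can be turned into a simple extension check. The sketch below is an assumption-laden illustration: the mapping from format names to file extensions (for example "jp2" for jpeg2000) and the function names are hypothetical, and an extension alone cannot confirm that a PDF is actually PDF/A.

```python
from pathlib import Path

# Extensions assumed to correspond to the recommended formats listed above.
RECOMMENDED_EXTENSIONS = {
    "txt", "pdf",                                     # text
    "tif", "tiff", "jp2", "png", "jpg", "jpeg",       # images
    "gif", "bmp",
    "aiff", "aif", "wav", "aac", "mp3",               # audio
    "avi", "mov", "mpg", "mpeg", "mp4",               # video
    "csv", "dbf", "sxc", "ods", "xlsx",               # spreadsheets
}


def has_recommended_extension(file_name):
    """Return True if the file's extension is in the recommended set."""
    extension = Path(file_name).suffix.lower().lstrip(".")
    return extension in RECOMMENDED_EXTENSIONS


def needs_review(file_names):
    """List the files whose formats are not in the recommended set."""
    return [name for name in file_names if not has_recommended_extension(name)]


print(needs_review(["notes.txt", "draft.doc", "scan_001.TIF", "talk.wma"]))
```

Here the check flags draft.doc and talk.wma as candidates for conversion to a more stable format.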

Resources:
http://fclaweb.fcla.edu/uploads/Lydia%20Motyka/FDA_documentation/recFormats.pdf

C. Backups

A good rule of thumb is "lots of copies keeps stuff safe." To preserve your important files properly, it is recommended that you:

  • Make three copies
  • Have at least two of the copies on two different types of media
  • Keep one copy in a different location from where you live / work
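The three recommendations above can be written as a small checklist. This is a minimal sketch, assuming each copy is described by a free-text media type and location; the `StoredCopy` class, the `check_backup_plan` function, and the sample labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StoredCopy:
    media: str     # e.g. "internal_disk", "external_drive", "cloud"
    location: str  # e.g. "home", "office", "offsite"


def check_backup_plan(copies, primary_location):
    """Return a list of problems; an empty list means all three rules hold."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than three copies")
    if len({copy.media for copy in copies}) < 2:
        problems.append("all copies are on one type of media")
    if all(copy.location == primary_location for copy in copies):
        problems.append("no copy is stored away from the primary location")
    return problems


plan = [
    StoredCopy("internal_disk", "home"),
    StoredCopy("external_drive", "home"),
    StoredCopy("cloud", "offsite"),
]
print(check_backup_plan(plan, primary_location="home"))
```

The sample plan satisfies all three rules, so the list of problems is empty. The check treats a single primary location as "where you live / work", which is a simplification.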

Source:
http://pages.vassar.edu/digitallibrary/resources-and-links/digital-archiving-and-preservation-information/

D. The critical role of metadata

The use of metadata in describing digital files is crucial to their long-term usefulness and preservation. Good metadata (i.e., descriptive fields that are thoughtfully and consistently applied) increase the current and future usefulness of digital files by:

  • Improving description of items
  • Improving reliable searching and retrieval of items within a collection
  • Enabling compatibility with other collections (allowing cross-database searching)
  • Making maintenance and potential migration of data easier over time
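To make the idea of consistently applied descriptive fields concrete, here is a small sketch that stores one record per file in CSV form and then searches it. The field names and the sample records are invented for illustration and are not a recommended schema.

```python
import csv
import io

# Hypothetical descriptive fields, applied the same way to every item.
FIELDS = ["file_name", "title", "creator", "date_created", "subject"]

# Made-up sample records for illustration only.
records = [
    {"file_name": "oh_20140307_interview_smith_v01.wav",
     "title": "Interview with J. Smith", "creator": "Researcher",
     "date_created": "2014-03-07", "subject": "interviews"},
    {"file_name": "oh_20140307_transcript_smith_v02.txt",
     "title": "Transcript of interview with J. Smith", "creator": "Researcher",
     "date_created": "2014-03-10", "subject": "interviews"},
    {"file_name": "oh_20140401_site_photo_001.tiff",
     "title": "Site photograph", "creator": "Researcher",
     "date_created": "2014-04-01", "subject": "photographs"},
]


def find_by_subject(rows, subject):
    """Return the file names of all records with the given subject."""
    return [row["file_name"] for row in rows if row["subject"] == subject]


# Write the records as CSV (to memory here; a real catalog would use a file).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(records)

# Read the CSV back and search it by a shared characteristic.
buffer.seek(0)
rows = list(csv.DictReader(buffer))
print(find_by_subject(rows, "interviews"))
```

Because every record uses the same fields and the same spelling of each subject, a simple exact-match search is enough to group related items.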

Please see the metadata page for information on creating good metadata.

2. Resources for managing files and formats

Images

Video

Audio

Personal files

Scanning your personal collection

Storage media longevity

For more information, visit the Library of Congress Personal Digital Archiving website, digitalpreservation.gov.