Archiving Your Research Data
Storing versus Archiving
Although the terms are often used interchangeably, storing and archiving research data are different activities. As the image below illustrates, storage is a necessary step towards archiving data; however, storing data (e.g., on an external drive) does not safeguard against media degradation (e.g., CD file corruption) or obsolescence of data formats (e.g., VisiCalc spreadsheets), nor does it provide easy access in the future (e.g., a search interface). Archiving encompasses both active preservation of the digital object and increased discoverability of and access to those data. Your data management plan should discuss how you will store your research data during the project and your preservation strategy for after the project, particularly for research data that will be reused and shared.
A data archive is a digital system, often web-accessible, that both provides an interface for discovering and downloading research data and manages the preservation of the digital objects deposited into it. Archives support the use of persistent electronic identifiers, such as a DOI® like those used for journal articles, which allow for easy citation and attribution of your shared dataset. Finally, archives pass responsibility for managing your research data to a third party, leaving you with more time to focus on conducting your research.
A data archive may go by a name that includes words such as center, repository, bank, or library, though the purpose is the same. However, not all data archives provide the same level of service. For example, archives may vary in how often they check files for corruption, the number of copies they retain of each file, and even the level of descriptive detail they provide for a given research project and its corresponding files. Finally, it is important to consider the financial sustainability of an archive when deciding where to archive your research data.
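To make "checking for corruptions of files" concrete, the sketch below shows one common approach: record a checksum for every file at deposit time, then recompute the checksums later and compare. This is only an illustration of the general technique, not a description of how any particular archive works. The function names (`sha256_of`, `build_manifest`, `find_changed`) and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file under root (by relative path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_changed(root: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or whose checksum no longer matches."""
    changed = []
    for rel_path, expected in manifest.items():
        candidate = root / rel_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            changed.append(rel_path)
    return changed
```

A manifest built once with `build_manifest` can be stored alongside the data. Running `find_changed` on a schedule then reports any file that has been altered or lost since deposit. How often such a check runs is one of the service differences between archives noted above.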
The JHU Data Archive
While some academic disciplines have established research data archives (e.g., ICPSR for social science data and the National Virtual Observatory for astronomy), many disciplines do not have archives available. At Johns Hopkins University, JHU PIs may deposit their research data into the JHU Data Archive. The Archive is currently in a testing phase, with new architecture being rolled out incrementally in 2013.
Unique characteristics of the JHU Data Archive
- Cross-disciplinary data (discipline agnostic)
- Data integration framework that facilitates cross-cutting queries
- Preservation-ready system
A JHU Data Management Services consultant can assist you with determining which archive, including the JHU Data Archive, may be suitable for your particular research data.
If you are interested in depositing your data into the JHU Data Archive, please contact us to discuss your research prior to submitting your proposal. It is important that JHU Data Management Services works with you to consider the specifics of your long-term archiving needs and JHU Data Archiving policies, including the cost of using the services. A fee of 2% of total direct costs on an NSF grant provides five years of archiving for the project’s data, with the option for an extension, along with our expert support in helping you prepare data for preservation and sharing.
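As a rough budgeting aid, the fee described above can be estimated with a short calculation. The grant amount below is a hypothetical figure chosen for illustration, and the function name `archiving_fee` is likewise an assumption. Only the 2% rate comes from the text above, and actual costs should be confirmed with JHU Data Management Services.

```python
from decimal import Decimal

# 2% of total direct costs, as stated in the text above.
FEE_RATE = Decimal("0.02")


def archiving_fee(total_direct_costs: Decimal) -> Decimal:
    """Estimate the archiving fee, rounded to the nearest cent."""
    return (total_direct_costs * FEE_RATE).quantize(Decimal("0.01"))


# Hypothetical grant with $500,000 in total direct costs.
example_costs = Decimal("500000")
print(archiving_fee(example_costs))  # prints 10000.00
```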
Text licensed under Creative Commons, unless otherwise noted.