Introduction to
Research Data Management


Nicholas Wolf & Vicky Rampin


The Problem

Funders see researchers working with a lot of data...

...but how should it be organized?

And how do we avoid this: "Most Scientific Research Data From the 1990s Is Lost Forever"

Article in The Atlantic

A new study has found that as much as 80 percent of the raw scientific data collected by researchers in the early 1990s is gone forever, mostly because no one knows where to find it.

Disappearing Data

...and this: human error


Washington Post: "An Alarming Number of Scientific Papers Contain Excel Errors"

The Solution

Spell out in detail how you will account for this in your grant proposal (Data Management Plan)


Managing the way data is collected, processed, analyzed, preserved, and published for greater reuse by the community and the original researcher.

What is Data?

"the recorded factual material commonly accepted in the scientific community as necessary to validate research findings." -Federal Office of Management & Budget Circular A-110

Federal Regulations

High-Level View of RDM

  • Data Type: format of data to be generated
  • Group Roles: who is primarily responsible for carrying out RDM? Set group norms.
  • Data Storage: where will you store your data and how will you back up your data?
  • Data Archiving: how will you preserve and make your data available to others?

Basically, think to yourself:

if I wanted to use this data in 10 years, what would I need to pack with it to make it useful?

Keep all those things

Managing Your Personal Research Archive

  • Master bulk file renaming and adapt bibliographic management tools like Zotero to document your files
  • Get into the practice of generating documentation/table of contents files (often called a README) in a sustainable format like .txt or .html
  • Sign up for the Data Services Managing a Personal Research Archive class
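
The first two habits above can be sketched in a few lines of Python. This is a minimal sketch rather than a prescribed workflow: the naming pattern (project, date, running number), the function names plan_renames, apply_renames, and write_readme, and the README layout are all assumptions made for illustration.

```python
from datetime import date
from pathlib import Path

README_NAME = "README.txt"  # assumed name for the table-of-contents file


def plan_renames(folder, project, day):
    """Return (old_path, new_path) pairs without renaming anything yet.

    Hypothetical pattern: <project>_<YYYY-MM-DD>_<NNN><lowercased extension>.
    """
    files = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.name != README_NAME
    )
    plan = []
    for number, old in enumerate(files, start=1):
        new_name = f"{project}_{day.isoformat()}_{number:03d}{old.suffix.lower()}"
        plan.append((old, old.with_name(new_name)))
    return plan


def apply_renames(plan):
    """Carry out a reviewed plan, refusing to overwrite existing files."""
    for old, new in plan:
        if old == new:
            continue  # already follows the pattern
        if new.exists():
            raise FileExistsError(f"{new} already exists")
        old.rename(new)


def write_readme(folder, description):
    """Write a plain-text README listing every file and its size."""
    folder = Path(folder)
    lines = [description, "", "Contents:"]
    for p in sorted(folder.iterdir()):
        if p.is_file() and p.name != README_NAME:
            lines.append(f"  {p.name}  ({p.stat().st_size} bytes)")
    readme = folder / README_NAME
    readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme
```

Printing the plan and reading it over before calling apply_renames is a cheap safeguard, since a bulk rename is hard to undo by hand.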

...or Document with the Open Science Framework

  • Wiki: document your lab procedures, standards, etc.
  • Collaborators: add collaborators of all levels, on different parts of your project
  • Components: sub-projects to organize your research
  • Version Control: upload files of the same name & OSF will track your versions!
  • Add-Ons: use OSF to bring together tools you use (e.g., GitHub)
  • Registrations: when you have an unchanging version of your project, register it & get a DOI!

Storage Rules!

Compare the file storage options at NYU

NYU Storage Resources

                                    | NYU Google Drive                           | NYU Box                                          | NYU Research Workspace
Intended use                        | General data use requiring password access | General data, including sensitive or secure data | High-capacity data storage
Storage size                        | Unlimited                                  | Unlimited                                        | 2 TB
Sharing and user control            | Yes                                        | Yes                                              | Yes
Versioning and file change tracking | Yes                                        | Some                                             | Snapshots of files
Funder requirements                 | Moderate risk security                     | High risk security                               | U.S. based data location

Long Term Storage

Choose what you want to preserve and be able to access in the long term, but no matter WHAT, make sure you keep:

  • documentation (lab/field notebooks, etc.)
  • tools & analysis

Put your data into an archival format!

  • open + accessible
  • software agnostic
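
As one illustration of an open, software-agnostic format, tabular data can be written out as UTF-8 CSV using only Python's standard library. This is a sketch under assumptions: the function names export_csv and read_csv and the sample column names are hypothetical, and CSV is only one of several reasonable archival choices.

```python
import csv
from pathlib import Path


def export_csv(records, path):
    """Write a list of dicts to a UTF-8 CSV file with a header row."""
    if not records:
        raise ValueError("no records to export")
    # Column order follows the first record (an assumption of this sketch).
    fieldnames = list(records[0].keys())
    with Path(path).open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)
    return fieldnames


def read_csv(path):
    """Read the file back to confirm it opens without the original software."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

CSV stores every value as text, so numbers and dates come back as strings. Recording the intended type and unit of each column in the accompanying documentation keeps the file interpretable later.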

Post-Project Move to a Repository

When you publish, you may make portions of the underlying data available in a repository. See our guide for selecting a repository.

Key archives in the discipline:

  • Qualitative Data Repository: https://qdr.syr.edu/
  • ICPSR

Data Management To-Do List

1. Create a Researcher Identity

Open Researcher & Contributor ID

  • free! persistent identifier for researchers (think DOI)
  • link all your publications to you rather than someone with your same name!
  • many journals are asking for an ORCID upon submission of materials

Do you have one? No? Let’s get you one at ORCID.org!
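
Part of what makes an ORCID iD dependable as an identifier is its built-in check character: the 16th character is computed from the first 15 digits using the ISO 7064 MOD 11-2 scheme, so most typos can be caught before an iD goes into a spreadsheet or README. The sketch below shows that check; the function names are assumptions for illustration, and it only tests the format, not whether the iD is actually registered.

```python
def orcid_check_character(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if the string is a well-formed iD with a matching check character."""
    compact = orcid.replace("-", "").upper()
    base = compact[:15]
    if len(compact) != 16 or not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_character(base) == compact[15]
```

For example, is_valid_orcid("0000-0002-1825-0097") returns True, while changing the final character to 8 makes it return False.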


2. Get a Home for Your Research

Open Science Framework

  • Wiki for documentation!
  • Collaborators of all levels, on different parts of your project!
  • Components: sub-projects to organize your research!
  • Add-Ons: use OSF to bring together tools you use!

3. Learn More about Data Management

(Free) Library Classes

  • Managing a Personal Research Archive
  • Extracting Text and Data from Files Using Optical Character Recognition (OCR)

Look at our calendar to find classes and sign up.

You can also refer to the AAA's presentation -- Cultural Anthropology: Principles and Practices of Digital Data Management

4. Know What Data Management Funders Want

From NSF CA-DDRIG

The DMP should address the following questions:

  • What kinds of data, software, and other materials will your research produce?
  • How will you manage them (e.g., standards for metadata, format, organization, etc.)?
  • How will you give other researchers access to your data, while preserving confidentiality, security, intellectual property, & other rights and requirements?
  • How will you archive data and preserve access in the short and the long term?

PIs are encouraged to consult the American Anthropological Association's Statement on Professional Ethics, Sections 5, "Make Your Results Accessible," and 6, "Protect and Preserve Your Records" (http://ethics.americananthro.org/category/statement).

DMPTool Online


Example Data Management Plans

Thank you! Questions?


Email us: vicky.rampin@nyu.edu or nicholas.wolf@nyu.edu

Learn more about RDM: guides.nyu.edu/data_management

Get this presentation: guides.nyu.edu/data_management/resources

Make an appointment: guides.nyu.edu/appointment