Research Data Management

“When we have all data online it will be great for humanity. It is a prerequisite to solving many problems that humankind faces.” – Robert Cailliau, Belgian informatics engineer and computer scientist

Data Mandates

To expand access to the results of federally funded research, the White House's Office of Science and Technology Policy issued a policy memorandum in February 2013. The memorandum directs Federal agencies with more than $100M in Research and Development expenditures to develop plans to make the published results of federally funded research freely available to the public within one year of publication. It also requires researchers to better account for and manage the digital data resulting from federally funded scientific research.

Directive to Expand Public Access

Federal agencies have begun releasing plans that outline the requirements for publicly funded research to be made public. These plans apply to both the publications and the scientific data used in the research. A major component of these plans is the requirement that researchers provide a data management plan as part of a grant application. The data management plan will become one of the criteria by which grants are evaluated.

Oregon State provides links to each agency's Federal Public Access Plan, and updates the list as new ones are released.

National Science Foundation (NSF) Data Management Plan

The National Science Foundation requires a two-page Data Management Plan to be submitted with every grant application.  Researchers are expected to share their primary research data in a timely and efficient manner.  The data management plan should facilitate sharing, and the plan will be considered as part of the overall merit of the grant.

The NSF states that:

"This supplement [data management plan] should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results (see AAG Chapter VI.D.4), and may include:

  1. the types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project;
  2. the standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies);
  3. policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements;
  4. policies and provisions for re-use, re-distribution, and the production of derivatives; and
  5. plans for archiving data, samples, and other research products, and for preservation of access to them."
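
As a rough aid when drafting a plan, the five elements NSF lists above can be kept as a simple checklist. The sketch below is only an illustration: the key names and the `missing_elements` helper are hypothetical and are not part of any NSF tool or template.

```python
# Hypothetical checklist keys, paraphrasing the five elements NSF says a
# data management plan "may include". These names are assumptions.
DMP_ELEMENTS = {
    "data_types": "Types of data, samples, software, and other materials produced",
    "standards": "Standards for data and metadata format and content",
    "access_and_sharing": "Policies for access and sharing, including privacy and IP protections",
    "reuse": "Policies and provisions for re-use, re-distribution, and derivatives",
    "archiving": "Plans for archiving and for preservation of access",
}


def missing_elements(draft: dict) -> list:
    """Return the checklist keys that the draft leaves empty or omits."""
    return [key for key in DMP_ELEMENTS if not str(draft.get(key, "")).strip()]


if __name__ == "__main__":
    draft = {
        "data_types": "Sensor readings in CSV; analysis scripts in Python.",
        "standards": "",
        "archiving": "Deposit in an institutional repository.",
    }
    for key in missing_elements(draft):
        print(f"Not yet addressed: {DMP_ELEMENTS[key]}")
```

A checklist like this only flags empty sections. Whether each section is adequate still depends on the norms of the discipline and the relevant directorate's guidance.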

Each directorate of the NSF is drafting its own data management guidelines; these guidelines are available here. The norms of each discipline will guide what standards and sharing mechanisms will be expected for NSF-funded data. The NSF FAQ page provides further guidance.

See also the Association of Research Libraries' Guide to the NSF Data Sharing Policy.

Public Access to Results of NSF-funded Research

The National Science Foundation has developed a plan to increase public access to scientific publications and digital scientific data resulting from research funded by NSF. This plan, entitled “Today’s Data, Tomorrow’s Discoveries,” is consistent with the objectives set forth in the Office of Science and Technology Policy's Feb. 22, 2013, memorandum, "Increasing Access to the Results of Federally Funded Research," and with long-standing policies encouraging data sharing and communication of research results.

NSF requires that either the version of record or the final accepted manuscript of articles in peer-reviewed scholarly journals and of papers in juried conference proceedings or transactions must:

  1. Be deposited in a public access compliant repository designated by NSF. NSF requires principal investigators who publish peer-reviewed journal articles or juried conference papers to deposit a copy of the items (either the final accepted version or the version of record) in the NSF public access repository hosted by the Department of Energy (DOE). The NSF public access repository is expected to be available for voluntary compliance by the end of the 2015 calendar year. At this time, NSF has not formally adopted ISO 16363, a recommended practice for assessing the trustworthiness of digital repositories. As outlined in NSF's public access plan (section 7.7), "DOE stores and preserves the information in a dark archive in a climate-controlled, appropriate environment in Oak Ridge, TN, with redundant, backup systems in geographically distinct locations. DOE accommodates both the widely used non-proprietary PDF and PDF/A formats and can convert material in PDF to PDF/A, should the need arise."
  2. Be available for download, reading and analysis free of charge no later than 12 months after initial publication;
  3. Possess a minimum set of machine-readable metadata elements in a metadata record to be made available free of charge upon initial publication;
  4. Be managed to ensure long-term preservation; and
  5. Be reported in annual and final reports during the period of the award with a persistent identifier that provides links to the full text of the publication as well as other metadata elements.
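
To make the 12-month availability requirement concrete, the sketch below computes the latest date by which a publication would need to be freely available. It assumes "12 months" means calendar months counted from the initial publication date, and that a day missing from the target month (for example Feb. 29) falls back to that month's last day. Both are assumptions for illustration, not NSF's official counting rule, and the function names are hypothetical.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    If the target month is shorter than `start.day`, the result is clamped
    to the last day of the target month (an assumption of this sketch).
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def public_access_deadline(published: date, embargo_months: int = 12) -> date:
    """Latest date by which the item should be freely available."""
    return add_months(published, embargo_months)


def is_overdue(published: date, today: date, embargo_months: int = 12) -> bool:
    """True if `today` is past the public access deadline."""
    return today > public_access_deadline(published, embargo_months)


if __name__ == "__main__":
    published = date(2016, 3, 15)
    print("Make publicly available by:", public_access_deadline(published))
```

The embargo length is left as a parameter because NSF's plan discusses a mechanism for stakeholders to petition for changing the embargo period.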

These requirements will apply to new awards resulting from proposals submitted, or due, on or after the effective date of the Proposal & Award Policies & Procedures Guide (PAPPG) that will be issued in January 2016.

NSF's current data management plan requirement and policies on costs of publication and data citation in biographical sketches will remain unchanged for the present while NSF undertakes activities to engage the research communities around data management in support of public access goals. Additional guidance at the Foundation, directorate, division, office or program levels may become available in the future. As stipulated in section 3.a.ii of the OSTP Feb. 22, 2013, memorandum, NSF's plan (section 7.5) discusses a "mechanism for stakeholders to petition for changing the embargo period."

See: Public Access To Results of NSF-funded Research for additional details. 

See also: Public Access: Frequently Asked Questions, NSF 1606

Legal and Ethical Considerations

Anyone creating or using data in their research must be aware of and abide by legal and ethical guidelines.

Legal Guidelines

Copyright law governs the expression of data. See the Libraries' Copyright Guide for additional information and links. Although raw data or "facts" are not copyrightable, an arrangement of data within a database, or a selection or expression of data such as a table, may be copyrighted.

License agreements often govern the use of data. Researchers must ensure that they abide by the terms of use of any data they access. You can share your own research data under specific licenses. Creative Commons offers a series of licenses, including the extremely open CC0 (CC Zero) license, which allows free use of the data for any purpose.

Ethical Guidelines

Data should be collected in an ethical manner, stored securely, and closely reviewed before distribution to avoid the disclosure of confidential information.  

Missouri S&T researchers should comply with the guidelines outlined by the Missouri S&T Institutional Review Board. Health research is also subject to HIPAA rules.

ICPSR has information on maintaining confidentiality when evaluating a public-release version of data, and on how to distribute sensitive data under restricted-use contracts.

The UK Data Archive's Data Security page contains guidance on storing and protecting confidential data.