Board on Research Data and Information
Policy and Global Affairs Division
The National Academies of Sciences, Engineering, and Medicine
500 Fifth Street, NW
Washington, DC 20001
Phone: (202) 334-2616
Past publications of the USNC-CODATA and of other predecessor NRC projects.
All publications are available free online or for purchase through the National Academies Press.
|International Coordination for Science Data Infrastructure: Proceedings of a Workshop—in Brief (February 2018)|
Advances in science and technology have led to the creation of large amounts of data—data that could be harnessed to improve productivity, cure disease, and address many other critical issues. Consensus in the scientific community is growing that the transition to truly data-driven and open science is best achieved by the establishment of a globally interoperable research infrastructure. A number of projects are looking to establish this infrastructure and exploit data to its fullest potential. Several projects in the United States, Europe, and China have made significant strides toward this effort. The goal of these projects is to make research data findable, accessible, interoperable, and reusable, or FAIR. The expected impact and benefits of FAIR data are substantial. To realize these benefits, there is a need to examine critical success factors for implementation, including training of a new generation of data experts to provide the necessary capacity. On November 1, 2017, the National Academies organized a symposium to explore these issues. This publication briefly summarizes the presentations and discussions from the symposium.
|Preparing the Workforce for Digital Curation (April 2015)|
The massive increase in digital information in the last decade has created new requirements for institutional and technological structures and workforce skills. Preparing the Workforce for Digital Curation focuses on education and training needs to meet the demands for access to and meaningful use of digital information, now and in the future. This study identifies the various practices and spectrum of skill sets that comprise digital curation, looking in particular at human versus automated tasks. Additionally, the report examines the possible career path demands and options for professionals working in digital curation activities, and analyzes the economic benefits and societal importance of digital curation for competitiveness, innovation, and scientific advancement. Preparing the Workforce for Digital Curation considers the evolving roles and models of digital curation functions in research organizations, and their effects on employment opportunities and requirements. The recommendations of this report will help to advance digital curation and meet the demand for a trained workforce.
|The Future of Scientific Knowledge Discovery in Open Networked Environments: Summary of a Workshop (December 2012)|
Digital technologies and networks are now part of everyday work in the sciences and have significantly enhanced access to and use of scientific data, information, and literature. They offer the promise of accelerating the discovery and communication of knowledge, both within the scientific community and in the broader society, as scientific data and information are made openly available online. The focus of this project was on computer-mediated or computational scientific knowledge discovery, taken broadly as any research process enabled by digital computing technologies. Such technologies may include data mining, information retrieval and extraction, artificial intelligence, distributed grid computing, and others. These technological capabilities support computer-mediated knowledge discovery, which some believe is a new paradigm in the conduct of research. The emphasis was primarily on digitally networked data, rather than on the scientific, technical, and medical literature. The meeting also focused mostly on the advantages of knowledge discovery in open networked environments, although some of the disadvantages were raised as well.
|The Case for International Sharing of Scientific Data: A Focus on Developing Countries|
The theme of this international symposium is the promotion of greater sharing of scientific data for the benefit of research and broader development, particularly in the developing world. This is an extraordinarily important topic. Indeed, I have devoted much of my own career to matters related to the concept of openness. I had the opportunity to promote and help build the OpenCourseWare program at the Massachusetts Institute of Technology (MIT). This program has made the teaching materials for all 2,000 subjects taught at MIT available on the Web for anyone, anywhere, to use anytime at no cost. In countries where basic broadband was not available, we shipped the materials in on hard drives and compact discs. Its impact has been worldwide, but it has surely had the greatest impact on the developing world. I am also a trustee of a nonprofit organization named Ithaka that operates Journal Storage (JSTOR) and other entities that make scholarly information available at very low cost.
|For Attribution – Developing Data Attribution and Citation Practices and Standards: Summary of an International Workshop (2012)|
The growth of electronic publishing of literature has created new challenges, such as the need for mechanisms for citing online references in ways that can assure discoverability and retrieval for many years into the future. The growth in online datasets presents related, yet more complex challenges. Meeting them depends upon the ability to reliably identify, locate, access, interpret, and verify the version, integrity, and provenance of digital datasets. Data citation standards and good practices can form the basis for increased incentives, recognition, and rewards for scientific data activities, which are currently lacking in many fields of research. The rapidly expanding universe of online digital data holds the promise of allowing peer examination and review of conclusions or analysis based on experimental or observational data, the integration of data into new forms of scholarly publishing, and the ability for subsequent users to make new and unforeseen uses and analyses of the same data, either in isolation or in combination with other datasets.
|Designing the Microbial Research Commons: Proceedings of an International Workshop (2011)|
Recent decades have witnessed an ever-increasing range and volume of digital data. All elements of the pillars of science (whether observation, experiment, or theory and modeling) are being transformed by the continuous cycle of generation, dissemination, and use of factual information. This is even more so in terms of the re-using and re-purposing of digital scientific data beyond the original intent of the data collectors, often with dramatic results.