The National Academies of Sciences, Engineering and Medicine
Board on Research Data and Information
Policy and Global Affairs
Contact Us
Board on Research Data and Information
Policy and Global Affairs Division
The National Academies of Sciences, Engineering, and Medicine
500 Fifth Street, NW
Washington, DC 20001

Finding the Needle in the Haystack:
A Symposium of the Board on Research Data and Information on
Strategies for Discovering Research Data Online
National Academy of Sciences Auditorium
2100 C Street NW, Washington, DC
February 26, 2013
3:00 pm - 5:30 pm
One of the problems recognized by experts and casual data users alike has been the inability to find the full array of research databases or factual compilations needed to support any given query. As data continue to proliferate and research becomes more data intensive, the discoverability of factual references also grows in importance. Research funders and policymakers need to better understand data productivity and trends in science, both quantitatively and qualitatively. Yet the deluge of information and the diversity of the datasets make the task that much more difficult for all users of data and facts.
The importance of the problem is not hard to grasp, but the solutions for making data more discoverable online are not necessarily straightforward or obvious. Libraries have traditionally grouped literature by topic, using tools such as controlled vocabularies and ontologies, and have also developed systems of other identifiers, such as call numbers and shelf lists, to help users locate items in the print context. However, physical collocation is not always a necessary or practical strategy for digital datasets. As with literature, the creation of registries, catalogs, and directories may be part of the solution, but we also need to consider online search engines, persistent identifiers and associated metadata, and better citation and reference practices in order to enable visibility of and access to digital data. Ontologies created for literature may not always adapt well for use with datasets, particularly if we are to tap the rich potential of unanticipated reuse of data by researchers in disciplines different from those of the original creators. Links to and between different information sources and objects may also be important mechanisms for improving the “findability” of data.
Despite the proliferation of models and solutions in various disciplines and sectors, there is a recognized need for a pervasive infrastructure and standardization of approaches, along with answers to the usual questions of who does what, where, and how. This symposium therefore seeks to highlight some of these different approaches, providing examples that are both broadly interdisciplinary and discipline-specific for finding the right data in the right place at the right time. Although we will not offer any common solutions to this set of problems, we do hope to shed some light on the underlying issues and provide an opportunity for experts working in this area to interact, both with one another and with the audience.
The co-chairs of the Board on Research Data and Information, Clifford Lynch of the Coalition for Networked Information and Francine Berman of Rensselaer Polytechnic Institute, will lead the symposium discussion, beginning at 3 p.m. on Tuesday, February 26. The event will continue for 2½ hours in a mix of short presentations and discussion. The entire proceedings will be recorded, and the audio recording will be archived on the Board’s website. The symposium is free and open to the public, but advance registration is requested. The meeting will be followed by a reception outside the main auditorium.
Preliminary Agenda            

3:00 pm   Framing the issue and introduction of the first panel
            - Clifford Lynch, CNI  [ MP3 (8.3 MB) ]
3:10 pm   Panel One: Interdisciplinary Approaches to Discovering Data Online
            Session Chair: Clifford Lynch, CNI
            - Data Cite and, Lorrie Johnson, OSTI  [ PPT ] [ MP3 (13 MB) ]
            - Data Citation Index: Unlocking Hidden Data, Michael Takats, Thomson Reuters  [ PPT ] [ MP3 (15.6 MB) ]
            - Information Types and Registries, Giridhar Manepalli, Corporation for National Research Initiatives  [ PPT ] [ MP3 (19.2 MB) ]
            - General Discussion  [ MP3 (9.8 MB) ]
4:20 pm   Panel Two: Discipline-Specific Examples of Discovering Data Online
            Session Chair: Francine Berman, RPI  [ MP3 (1.6 MB) ]
            - Data, Data Everywhere, But Not a Byte to Eat, Michael Huerta, National Library of Medicine, National Institutes of Health (NIH)  [ PPT ] [ MP3 (11.5 MB) ], and
              Finding Disease Data: The Autism Example, Gregory Farber, National Institute of Mental Health, NIH  [ PPT ] [ MP3 (11.5 MB) ]
            - Strategies for Finding Earth Observation Research Project Data, Suzie Allard, University of Tennessee and the DataONE Project  [ PPT ] [ MP3 (10.3 MB) ]
            - Developments in Data Discovery at ICPSR, George Alter, ICPSR  [ PPT ] [ MP3 (9.4 MB) ]
            - General Discussion  [ MP3 (21 MB) ]
5:30 pm   End of Symposium – Reception
Please accept our apologies for the audio issues experienced during the presentations. At some points, the microphone cut out and this is, unfortunately, reflected in the recordings.  

How to get there by Metro:
  • Take the Orange or Blue Line to the Foggy Bottom-GWU Metro stop.
  • Turn right when you exit the Metro.
  • Walk south down 23rd St, NW for approximately 7 blocks.
  • Turn left onto C St, NW (after the State Dept.).
  • Cross 22nd St.
  • The main entrance is at the front of the building
    (2101 Constitution Ave.).
  • A secondary entrance is located at 2100 C Street.

If you are driving, the parking lot entrance is at the intersection of 21st Street and C Street, NW. Parking at the Academy building is available on a first-come, first-served basis and cannot be reserved in advance.