The National Academies of Sciences, Engineering and Medicine
Board on Research Data and Information
Policy and Global Affairs


Infinite opportunities from open scientific information online

A. Three kinds of digital information: data, software and source code, scientific literature

Scientific inquiry produces, uses, and reuses various types of information. Digital scientific information may be broadly classified into three major categories: data; software (the analytical device) and source code; and scientific literature, that is, journal articles and other published analytical results such as grey literature and books. These are categories of convenience, so each is described in more detail below.

Data: This category is further divided into two kinds. The first comprises raw data, that is, readings collected by mechanical, automated, or near-automated methods through sensors or surveys. Examples include physical data such as meteorological or land cover parameters, demographic information collected via census surveys, and traffic or visitor counts collected using light or mechanical sensors or even by a person with a clicker. The second kind is interpreted data, that is, all data that are not raw data. These data have been analyzed and interpreted by the scientist. In many circumstances, interpreted data themselves serve as input for further analysis and interpretation.

Software and source code: While the focus here is on software, this category represents the “device” used to analyze and manipulate the data. This could be a mechanical device with or without software within it, or it could be software that performs the analysis.
Source code represents the opportunity to examine, modify, and test the very “device” of analysis. In the case of software, it is the source code that could be examined, while in the case of a physical instrument, access to that instrument would facilitate independent verification.

Scientific literature: The journal article or other written analytical research output is usually the culmination of the scientific process. It is the currency by which a researcher's progress and research capacity are measured. Journal articles are typically published by established scientific societies or private publishing companies, some of which publish many journals across many related or unrelated fields.

Kind of Information | Kind of IP Protection
Data | No or “thin” copyright, or trade secret and even patent
Analytical Device (software) | copyright, trade secret, patent
Source Code | copyright, trade secret, patent
Journal Articles |

B. Questions to consider in framing the project:
1. What are the different values of making public sector scientific information accessible to all?

- Research and education.
- Economic.
- Cultural Heritage.
- Social Value.
- Reputational Value: Academic credit for the researcher who first made a discovery or invention.

2. What indicators could be used to measure the socioeconomic effects of such information in an open access model?

3. Develop a methodology to measure the costs and benefits of open access. A few issues:
- How do you value contributions to the collective?
  - Should innovation that improves the quality of human life be rewarded?
  - Should information be open unless it needs to be closed?
  - Attribution is not the same as citation (see John Wilbanks's blog post).

C. Scope of the Project?
- Should the project cover only information produced by the government, or also information produced with government funding?
- Should the project be discipline-bound (for example, geospatial data or public health data)? We could pick examples from each discipline with different attributes, e.g., basic research vs. applied research.
- Focus on: a. federal agencies; b. state and local government; c. private organizations (use and reuse of public sector data by the private sector)?
- Geographic focus: United States.

Other questions:
1. What is the cost of the information produced?
2. What is the cost of not disseminating the information (e.g., Landsat privatization; see the Bits of Power analysis)?
3. What is the opportunity cost of losing the benefits of disseminating information?
4. Externalities: What are the positive and negative externalities of digital public sector scientific information, including network effects?

D. Preliminary References:
1. See the list of papers in "The Socioeconomic Effects of Public Sector Information on Digital Networks" report.,3343,en_2649_34223_40046832_1_1_1_1,00.html

2. Peter Weiss. Borders in Cyberspace: Conflicting Public Sector Information Policies and their Economic Impacts.

David Newbery, Lionel Bently, and Rufus Pollock. 2008. Models of Public Sector Information Provision via Trading Funds.

Stéphane Roche, et al. 2007. EcoGeo Project.

Thomas Rogers and Andrew Szamosszegi. 2007. Fair Use in the U.S. Economy: Economic Contribution of Industries Relying on Fair Use.

Ed Mayo and Tom Steinberg. 2007. The Power of Information: An Independent Review.

Pilar Garcia Almirall, Montse Moix Bergadà, and Pau Queraltó Ros. 2008. The Socio-Economic Impact of the Spatial Data Infrastructure of Catalonia. Edited by Max Craglia. 2007.

Irving Leveson. 2006. Benefits of the New GPS Civil Signal: The L2C Study.

Office of Fair Trading, United Kingdom. 2006. The Commercial Use of Public Information (CUPI).

Bastiaan Van Loenen. 2006. Developing Geographic Information Infrastructures: The Role of Information Policies.

Rishab Aiyer Ghosh, et al. 2006. Economic Impact of Open Source Software on Innovation and the Competitiveness of the Information and Communication Technologies (ICT) Sector in the EU.

E. Phone conversation with Graham Vickery (OECD)
- Mentioned Bob Cohen, an independent researcher, to discuss commercial applications.
- See the proposal sent by Graham Vickery (OECD).
- Mentioned a program in Canada: CANARIE.