Infinite opportunities from open scientific information online
A. Three kinds of digital information: data, software and source code, scientific literature
Scientific inquiry both produces and uses (and reuses) various types of information. Digital scientific information may be broadly classified into three major categories: data; software and source code (the analytical device); and scientific literature, that is, journal articles and other published analytical results such as grey literature and books. These are categories of convenience, so they merit more detail below:
Data: This category is further divided into two buckets. The first bucket comprises raw data, that is, readings collected by mechanical, automated, or near-automated methods through sensors or surveys. Examples include physical data such as meteorological or land cover parameters, demographic information collected via census surveys, and traffic or visitor counts collected using light or mechanical sensors or even by a person with a clicker. The second bucket comprises interpreted data, that is, all data that are not raw data. These data have been analyzed and interpreted by the scientist. In many circumstances, interpreted data themselves serve as input for further analysis and interpretation.
Software and source code: While the focus here is on software, this category represents the “device” used to analyze and manipulate the data. This could be a mechanical device with or without software within it, or it could be software that does the analysis.
Source code represents the opportunity to examine, modify and test the very “device” of analysis. In the case of software it would be the source code that could be examined, while in the case of a physical instrument, access to that instrument would facilitate independent verification.
Scientific literature: The journal article or other written analytical research output is usually the culmination of the scientific process. It is the currency by which a researcher's progress and research capacity are measured. Journal articles are typically published by established scientific societies or private publishing companies, some of which publish many journals in many related or unrelated fields.
Kind of Information | Kind of IP Protection
Data | No or “thin” copyright, or trade secret and even patent
Analytical Device (software) | Copyright, trade secret, patent
Scientific Literature | Copyright, trade secret, patent
B. Questions to consider in framing the project:
1. What are the different kinds of value in making public sector scientific information accessible to all?
- Research and education.
- Cultural Heritage.
- Social Value.
- Reputational Value: Academic merit to the researcher who first made the discovery or invention.
2. What indicators could be used to measure the socioeconomic effects of such information in an open access model?
3. Develop a methodology to measure the costs and benefits of open access. A few issues:
- How do you value contributions to the collective?
- How do you reward innovation that improves the quality of human life?
- Should information be open unless it needs to be closed?
- Attribution is not the same as citation (see John Wilbanks's blog post).
C. Scope of the Project?
- Should the project cover only information produced by the government, or also government-funded information?
- Should the project be discipline bound? For example: Geospatial data, public health data.
We can pick examples from each discipline with different attributes, e.g., basic research vs. applied research.
- Focus on a. Federal agencies? b. State and local government? c. Private organizations (use/reuse of public sector data by the private sector)?
- Geographic focus: United States.
1. What is the cost of information produced?
2. What will be the cost of not disseminating the information? (e.g., Landsat privatization; see Bits of Power)
3. Opportunity cost of losing the benefits of disseminating information.
4. Externalities: Both positive and negative externalities of digital public sector scientific information, including network effects.
D. Preliminary References
1. See the list of papers in "The Socioeconomic Effects of Public Sector Information on Digital Networks" report. http://www.nap.edu/catalog.php?record_id=12687 http://www.oecd.org/document/48/0,3343,en_2649_34223_40046832_1_1_1_1,00.html
2. Peter Weiss. Borders in Cyberspace: Conflicting Public Sector Information Policies and their Economic Impacts.
3. Thomas Rogers and Andrew Szamosszegi. 2007. Fair Use in the U.S. Economy: Economic Contribution of Industries Relying on Fair Use. Found at