"If people put data on the web – government data, scientific data, community data – it will be used by other people to do wonderful things in ways that they could never have imagined." – Sir Tim Berners-Lee
- 28 April 2014
See also Data glossary | Data management portal | Finding medical / health care statistics online | Metadata | Open source | Semantic web
"Access to data is fundamental if researchers are to reproduce, verify and build on results that are reported in the literature … The presumption must be that, unless there is a strong reason otherwise, data should be fully disclosed and made publicly available. In line with this principle, where possible, data associated with all publicly funded research should be made widely and freely available... — Walport, 2011 in the Lancet
Open data is a 21st-century concept referring to data that is open and free to (re)use and examine without barriers or restrictions due to copyright, patents or controls of any kind. According to OpenDefinition.org, open data is data that can be freely used, reused and redistributed by anyone, subject at most to the requirement to attribute and share alike, as in the licences of the Creative Commons movement. (See the Open Data Commons Attribution License.) The Open Knowledge Foundation has produced a helpful Open Data Handbook, which introduces the legal, social and technical aspects of open data.
Open data is similar to other information trends that have emerged in the Internet era, and is related philosophically to open source and open access. Some health librarians have started to participate in discussions about data because of their involvement in e-Science and data curation. An associated discourse within the open data movement is the notion of transparency and accountability inherent in making clinical datasets widely available; this is particularly important in government-funded research (e.g., CIHR, NIH). A vast amount of data produced by governments (see also civic media), researchers and universities is not shared or made easily accessible on the web. Where does this data go? There is a movement to make public institutions' data more widely available and to allow researchers to remix, reuse and repurpose it. For a more open exchange of data to occur, data warehouses need to be made more widely accessible.
In 2013, it was announced that Wikidata, an offshoot of Wikipedia and a centralized repository for data and facts, now feeds information to Wikipedia. For clinicians interested in tracking down "missing data", see Missing Data UK.
What is Open Data?
Open data is both a philosophical orientation and a practice that seeks to make data available for free (in machine-readable formats) to all without restriction. Open data ranges from MS Excel spreadsheets to extremely large bioinformatics datasets. In being free and open, open data is conceptually similar to open access and other open movements. Open data is concerned with making data more freely accessible, especially where it might lead to innovation but also to more transparency in governments; it is linked to social technologies in the cloud (e.g. social media) and to the semantic web. To be truly open, data needs to be present on platforms with no restrictions, controls or barriers. Academic libraries can perform some archival functions by curating data and by making the metadata in their catalogues free and open. Future roles for librarians include explaining these trends. When speaking of open research data, we refer to information that might result from conducting research such as clinical trials, which may include datasets, microarray data, numerical data, textual records, images or multimedia. Such data is critical for researchers because it helps to validate findings, observations and hypotheses. By making data open you encourage knowledge production, which is recognized as critical to solving society's most difficult problems.
Creative Commons plays a key role in promoting openness in science. Events such as one held in Auckland demonstrate the concern for open science that the community shares with Creative Commons.
Pros & cons
- open data may be complex, hard to understand, and packaged in a way that makes it inaccessible
- there are few good apps to find and assemble data
- there are no incentives and ecosystems to make data usage practical and sustainable
- some questions, such as "who updates the data?" and "what can they really use it for?", go unanswered
- open data is a great trend in open access; visuals are cool, statements are bold, but what's in it for information people?
- how does open data help us do our work more efficiently?
Select open data projects & initiatives
- Databib is a tool for helping people identify and locate online repositories of research data
- DataCite Canada's services are offered in cooperation with DataCite, an international consortium of national-scale libraries and research organizations committed to increasing access to research data on the Internet
- DataCite Canada is DataCite's DOI allocation agent for Canada
- DataCite Canada promotes the value of data archiving, citation and discoverability within Canada
- Research Data Canada is a collaborative effort to address the challenges and issues surrounding the access and preservation of data arising from Canadian research. This multi-disciplinary group of universities, institutes, libraries, granting agencies, and individual researchers has a shared recognition of the pressing need to deal with Canadian data management issues from a national perspective.
Select examples of other open data
See also Open access
"Health Data: Pushing the Boundaries" was the theme of the Health Data Users conference, which was sponsored by Statistics Canada and the Canadian Institute for Health Information (CIHI). The conference showcased the breadth and depth of health data for researchers, planners, academics and decision-makers, and provided opportunities to share information and expand knowledge. The program consisted of two tracks (Methods and techniques / Data informing decisions), each targeting a different perspective on using health data.
References
- ARL Data curation as next-generation librarianship: a new leadership role for libraries; 2010.
- Anokwa Y, Hartung C. Open source data collection in the developing world. Computer. 2009;42(10):97-99.
- Boulton G, Rawlins M, Vallance P, Walport M. Science as a public enterprise: the case for open data. Lancet. 2011 May 14;377(9778):1633-5.
- Butler D. Data sharing: the next generation. Nature. 2007;446:10-11.
- Chalmers I, Altman DG, McHaffie H, Owens N, Cooke RWI. Data sharing among data monitoring committees and responsibilities to patients and science. Trials. 2013;14:102.
- Conway PH, VanLare JM. Improving access to health care data: the open government strategy. JAMA. 2010;304(9):1007-8.
- Charbonneau DH. Strategies for data management engagement. Med Ref Serv Q. 2013;32(3):365-374.
- Cox A, Verbaan E, Sen B. Upskilling liaison librarians for research data management. Ariadne. 2012;70.
- Davies A, Lithwick D. Government 2.0 and access to information: recent developments in proactive disclosure and open data in Canada; 2010.
- Giarlo MJ. Academic libraries as data quality hubs. J Libr Scholarly Commun. 2013;1(3):eP1059.
- Glick J. British Columbia leading on open data and open government. Google Public Policy Blog. 29 June 2009.
- Groves T. BMJ policy on data sharing. BMJ. 2010;340:c564.
- Haeussler C. Information-sharing in academia and the industry: a comparative study. 2010.
- Hrynaszkiewicz I. The need and drive for open data in biomedical publishing. Serials. 2011;24(1).
- LIBER Working Group. Ten recommendations for libraries to get started with research data management. Final Report on E-Science, 2012.
- Miller HE. Big-data in cloud computing: a taxonomy of risks. Info Res. 2013;18(1):paper 571.
- Molloy JC. The Open Knowledge Foundation: open data means better science. PLoS Biol. 2011 Dec;9(12):e1001195.
- Murray-Rust P. Open data in science. Nature Precedings. 2008.
- Neylon C. Science publishing: Open access must enable open use. Nature. 2012 Dec 20;492(7429):348-9.
- Ohno-Machado L. A hybrid open-access model to bridge the publishing divide and reach out to a broader community. JAMIA. 2011;18(3):210-1.
- Ouellette D. Subject guides in academic libraries: a user-centred study of uses and perceptions/Les guides par sujets dans les bibliothèques académiques: une étude des utilisations et des perceptions centrée sur l'utilisateur. Can J Info Libr Science. 2011;35(4):436-451.
- Rani M, Buckley BS. Systematic archiving and access to health research data: rationale, current status and way forward. Bull World Health Organ. 2012;90:932-939.
- Reimer P. City of Vancouver embraces open data, standards and source: city videos could be more widely available soon. CBC News, May 2009.
- Ross JS, Krumholz HM. Ushering in a new era of open science through data sharing. JAMA. 2013:1-2.
- Schofield PN. Post-publication sharing of data and tools. Nature. 2009;461(7261):171-173.
- Vogel L. The secret's in: open data is a foreign concept in Canada. CMAJ. 2011 Apr 19;183(7):E375-6.
- Walport M, Brest P. Sharing research data to improve public health. Lancet. 2011 Feb 12;377(9765):537-9.