Data management portal
Data management (also research data management) refers to the effective capture, storage and preservation of data by computers and assorted machines. In practice, the "data" captured is often a flood of information that must be recorded, curated and preserved. Typically, this calls for a set of skills outside the academic librarian's education and training, but LIS programs are working to remedy this deficit. Some examples of data managed in medicine are clinical research data projects, electronic patient records, images, video, audio, logbooks and simulations. In medicine, a range of data is generated in clinical research and in medical learning and teaching at medical schools and universities around the world. Understanding the impact of this data on practice, researchers, libraries and institutions is critical to the discourse in 2014, especially as issues around data preservation, sharing among communities, and enabling e-science are bound up with it.
All researchers should take steps to understand how data is used in real applications, and look at examples of collaborative efforts between institutions, groups or individuals that address the collection, use, access, preservation and overall management of data. A major trend at the moment is data management, including big data and data science. Data science refers to the collection, preparation, analysis, visualization, management and preservation of data (e.g., from experiments, research projects, clinical trials and even the use of social media). Data science is an emerging area that lies at the intersection of computer science, information science, applied mathematics, data visualization and data production. Some related terms are: big data, data management, data visualization, metadata, open data, e-science.
Implications for librarians
Data (and datasets) are increasing in importance around the world, especially in academia and scientific communities. Funding agencies are starting to adopt policies that require both research articles and the data associated with them to be publicly accessible once results are published. Additionally, at the end of research funding periods, entire datasets must be available for remix, repurposing or auditing. To ensure proper identification, access and preservation of datasets, the principles of library and archival sciences must be applied to help researchers comply with these policy requirements and establish new best practices within their disciplines.
Data visualization is one part of the data management trend. Librarians are starting to immerse themselves in data science and visualization in collaboration with their academic peers. A range of knowledge, skills and abilities is required to communicate effectively with faculty and students about data and to provide consultancy on data visualization. Several open source tools and widely used commercial tools are available. Sharing of practices and experiences across institutions should be encouraged. Some topics in the area include: data exploration and statistical analysis; bibliometrics; data visualization; data description, sharing and reuse; messy data; analyzing textual and multimedia data; open data and open science.