Data literacy (also data information literacy; closely related to numeracy and statistical literacy) is the ability to find, assess, manage and synthesize data in either analogue or digital formats. Some academic librarians believe that data literacy is connected to open data and to the related topics of media literacy and transliteracy. Being literate in using data and statistics is critical to functioning as an academic librarian in the 21st century. Evaluating data and statistics requires a broad understanding of the major terms and concepts in data curation and management. The current drive to develop strong information and data management skills among librarians is laudable. In fact, the ability to understand information trends and tools of all types for accessing, converting and manipulating information and data is critical in many fields. Data skills may include an understanding of Structured Query Language (SQL), relational databases (e.g., MS Access), data mining techniques, statistical software (e.g., SPSS, Stata, Minitab and MS Excel), presentation software (e.g., MS Excel and MS PowerPoint), and so on. These tools, of course, are only a starting point on the way to data literacy.
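As a rough illustration of the SQL and relational database skills mentioned above, the following minimal sketch uses Python's built-in sqlite3 module rather than MS Access. The table name, column names and attendance figures are all invented for this example and do not come from any real library.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding a small, hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE workshops (topic TEXT, term TEXT, attendees INTEGER)"
    )
    # Invented sample rows, for illustration only.
    rows = [
        ("Data literacy", "Fall", 18),
        ("Data literacy", "Spring", 24),
        ("Citation management", "Fall", 30),
        ("Citation management", "Spring", 15),
    ]
    conn.executemany("INSERT INTO workshops VALUES (?, ?, ?)", rows)
    return conn


def attendance_by_topic(conn):
    """Total attendance per workshop topic, largest total first."""
    query = """
        SELECT topic, SUM(attendees) AS total
        FROM workshops
        GROUP BY topic
        ORDER BY total DESC, topic ASC
    """
    return conn.execute(query).fetchall()


conn = build_demo_db()
for topic, total in attendance_by_topic(conn):
    print(f"{topic}: {total}")
```

The query groups rows by topic and sums the attendance for each group, the kind of basic aggregation that underlies many library reports.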
Why is data literacy important? The role of the librarian in offering data literacy services continues to evolve, as does the discussion about the skills that students and researchers need for data literacy and the development of content, methods, and formats. Data literacy is an opportunity to connect librarians to research within institutions, and to expand information literacy to include numerical literacy and statistics. Expanding data literacy to include statistical literacy will help those same user groups handle the problem of inferring causation from associations. It has been said that some organizations are "data rich and information poor". Our work in libraries consists of spreadsheets, reports, surveys and databases of metadata that are critical to how we provide services to our users. But how do we know whether we are doing enough to understand the data we make available? Further, how can librarians become data literate enough to teach others the basics of data literacy? In 2013, it was announced that Wikidata, an offshoot of Wikipedia and a centralized repository for data and facts, had begun to feed information to Wikipedia.
Data information literacy (DIL)
Given the increasing attention to managing, publishing, and preserving research datasets as scholarly assets, what competencies are academic librarians expected to demonstrate in this emerging area? What roles can librarians play in helping students attain these competencies? One response is to develop educational programs that introduce graduate students to the knowledge and skills needed to work with research data. The term “data information literacy” has been adopted with the deliberate intent of tying two emerging roles for librarians together. By viewing information literacy and data services as complementary rather than separate activities, its proponents seek to leverage the progress made and the lessons learned in each service area.
Academic librarians can cultivate strategies and approaches in their data information literacy programs. The concepts and ideas behind data information literacy, such as the twelve data competencies, must be examined. Case studies from several institutions (Cornell, Purdue, Minnesota, Oregon), each focused on a different disciplinary area in science and engineering, detail the approaches taken, how programs were implemented, and the assessment metrics used to evaluate impact. The “DIL Toolkit,” a distillation of the lessons learned, is presented as a handbook for librarians interested in developing their own programs. Recommendations for the future direction and growth of data information literacy are also needed, so that programs are not completely sidetracked by the push for data.
Teaching 'data literacy'
Decision-making based on data is a common activity in 21st century life, whether the data comes from weather reports, prices at the grocery store, or a discussion of your blood test with your doctor. Innumeracy (low literacy with numbers) costs the economy hundreds of millions of dollars in errors and lost productivity every year. The discourse around numerical literacy has been under way for some time, and is related to data literacy. Some data experts believe that the need for multiple representations of numbers explains why data visualization is so popular at the moment. The thought is that by designing data for visualization, and building interactive experiences, large amounts of information are made visual and more understandable. Carlson (2011) articulates the need for a data information literacy (DIL) program to prepare students to engage in such an "e-research" environment. As far as libraries are concerned, the data movement presents some real opportunities for academic librarians. For example, in 2005, Humphrey created a "...collaborative training program introduced to develop baseline competencies in Canadian academic libraries to support data services...". In conjunction with an initiative between Statistics Canada and sixty-six Canadian universities, a data literacy program was developed to provide workshops over a seven-year period for librarians who provide data services. A cost-sharing arrangement kept the cost of these data training courses to a minimum for individual universities.
For most scientists, research data is broadly defined as information (e.g., data sets, microarray data, numerical data, clinical trial information, textual records, images and sound) that is generated or used as quantitative evidence in primary biomedical research. Research data is distinguished by the fact that it is accepted by the research community as a means to validate research findings, observations and hypotheses. Qin (2010) found that metadata had a central role in how scientists understood data management in e-Science.
Questions for librarians
The use, curation and management of data are of growing importance in academic libraries. However, rather than wait for your library organization to hire a data librarian or to create a data repository, why not try to introduce some data literacy skills (or exercises) into your library workshops?
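One possible workshop exercise, sketched here purely as an assumption about what such an exercise might look like, asks participants to compare the mean and the median of a small dataset containing an outlier. The download counts below are invented for illustration; the code uses only Python's standard statistics module.

```python
from statistics import mean, median

# Hypothetical daily download counts for one dataset (invented numbers).
# The last value is a deliberate outlier.
downloads = [4, 5, 5, 6, 7, 8, 70]


def summarize(values):
    """Return the mean and median of a list of numbers."""
    return {"mean": mean(values), "median": median(values)}


summary = summarize(downloads)
print(f"mean:   {summary['mean']}")
print(f"median: {summary['median']}")
```

With these numbers the mean is 15 and the median is 6. Asking participants which figure better describes a "typical" day is a simple way to open a discussion about how a single summary statistic can mislead.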