Data literacy

From HLWIKI Canada
Revision as of 17:57, 25 October 2016 by Dean (Talk | contribs)

Data information literacy comprises a range of metaskills



Last Update

  • 25 October 2016


See also Bioinformatics | Data management | Data management portal | e-Science | Open data | Information literacy | Media literacy | Web 2.0

Data is defined as "...factual information, especially information organized for analysis or used to reason or make decisions..." — American Heritage Dictionary

"Data literacy refers to the ability to do something with raw data and information – to process them in some way. In an era where spreadsheets help us to make the grandest of decisions, we must have basic statistical literacy and fluency in the tools that allow us to make sense out of numerical data, not just words and ideas." — Johnson, "The Information Diet"

Data literacy (also data information literacy, synonymous with numeracy and statistical literacy) is the ability to find, assess, manage and synthesize data, in either analogue or digital formats. Some academic librarians believe that data literacy is connected to open data and to related topics such as media literacy and transliteracy. Being literate in using data and statistics is critical to functioning as an academic librarian in the 21st century. To evaluate data and statistics, a broad understanding of the major terms and concepts in data curation and management is needed. The current drive to develop strong information and data management skills among librarians is laudable. In fact, the ability to understand information trends, and to use tools of all types for accessing, converting and manipulating information and data, is critical in many fields. Data skills may include an understanding of structured query language (SQL), relational databases (e.g., MS Access), data mining techniques, statistical software (e.g., SPSS, Stata, Minitab and MS Excel), presentation software (e.g., MS Excel and MS PowerPoint), and so on. These tools, of course, are simply a starting point on the way to data literacy.
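As a rough illustration of the SQL and basic statistical skills mentioned above, the following sketch loads a small table into an in-memory SQLite database, summarizes it with a GROUP BY query, and then computes a median and standard deviation. The table name, column names and attendance figures are all hypothetical, invented only for this example; it is a sketch of the kind of task involved, not a recommended workflow.

```python
import sqlite3
import statistics

# Hypothetical sample data (invented for illustration):
# monthly attendance at two library workshops.
ROWS = [
    ("2016-01", "Data literacy", 12),
    ("2016-02", "Data literacy", 18),
    ("2016-03", "Data literacy", 15),
    ("2016-01", "Citation tools", 30),
    ("2016-02", "Citation tools", 22),
]


def load(rows):
    """Create an in-memory relational table and insert the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE attendance (month TEXT, workshop TEXT, attendees INTEGER)"
    )
    conn.executemany("INSERT INTO attendance VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def summarize(conn):
    """Return {workshop: (sessions, total, mean)} using a SQL GROUP BY."""
    cur = conn.execute(
        "SELECT workshop, COUNT(*), SUM(attendees), AVG(attendees) "
        "FROM attendance GROUP BY workshop ORDER BY workshop"
    )
    return {w: (n, total, mean) for w, n, total, mean in cur}


def spread(conn, workshop):
    """Return (median, sample standard deviation) for one workshop."""
    cur = conn.execute(
        "SELECT attendees FROM attendance WHERE workshop = ?", (workshop,)
    )
    values = [row[0] for row in cur]
    return statistics.median(values), statistics.stdev(values)


conn = load(ROWS)
summary = summarize(conn)
for workshop, (sessions, total, mean) in summary.items():
    print(f"{workshop}: {sessions} sessions, {total} attendees, mean {mean:.1f}")
```

The SQL query does the counting and averaging inside the database, while the statistics module handles measures that SQLite does not provide out of the box, such as the median.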

Why is data literacy important? The role of the librarian in offering data literacy services continues to evolve, as does the discussion about the skills that students and researchers need for data literacy and about the development of content, methods, and formats. Data literacy is an opportunity to connect librarians to research within institutions, and to expand information literacy to include numerical literacy and statistics. Expanding data literacy to include statistical literacy will help those same user groups deal with the problem of inferring causation from associations. It's been said that some organizations are "data rich and information poor". Our work in libraries relies on spreadsheets, reports, surveys and databases of metadata that are critical to how we provide services to our users. But how do we know whether we are doing enough to understand the data we make available? Further, how can librarians become more data literate so that they can teach others the basics of data literacy? In 2013, it was announced that Wikidata, an offshoot of Wikipedia and a centralized repository for data and facts, now feeds information to Wikipedia.
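The point about inferring causation from associations can be made concrete with a small worked example. The sketch below uses invented figures (hypothetical library visits and exam scores for two student groups) to show how a strong overall correlation can come entirely from a third factor, here the student group, while the association within each group points the other way. The data and group names are assumptions made purely for illustration.

```python
from math import sqrt

# Hypothetical (library visits per month, exam score) pairs,
# invented for illustration only.
DATA = {
    "undergraduate": [(1, 62), (2, 60), (3, 61)],
    "graduate": [(8, 82), (9, 80), (10, 81)],
}


def pearson_r(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)


# Correlation when both groups are pooled together.
pooled = [pair for group in DATA.values() for pair in group]
overall = pearson_r(pooled)

# Correlation computed separately inside each group.
within = {name: pearson_r(pairs) for name, pairs in DATA.items()}

print(f"overall r = {overall:.2f}")
for name, r in within.items():
    print(f"{name} r = {r:.2f}")
```

With these made-up numbers the pooled correlation is strongly positive (about 0.96), yet within each group it is negative (-0.5). A reader who sees only the pooled figure might conclude that visiting the library raises scores, when the group difference accounts for the pattern.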

Data information literacy (DIL)

Given the increasing attention to managing, publishing, and preserving research datasets as scholarly assets, what competencies are academic librarians expected to demonstrate in this emerging area? What roles can librarians play in helping students attain these competencies? One response is to develop educational programs that introduce graduate students to the knowledge and skills needed to work with research data. The term “data information literacy” has been adopted with the deliberate intent of tying two emerging roles for librarians together. By viewing information literacy and data services as complementary rather than separate activities, the contributors seek to leverage the progress made and the lessons learned in each service area.

Academic librarians can cultivate strategies and approaches in their data information literacy programs. The concepts and ideas behind data information literacy, such as the twelve data competencies, must be examined. Case studies in data information literacy are available from several institutions (Cornell, Purdue, Minnesota, Oregon), each focused on a different disciplinary area in science and engineering. They detail the approaches taken, how programs were implemented, and the assessment metrics used to evaluate impact. The “DIL Toolkit,” a distillation of the lessons learned, is presented as a handbook for librarians interested in developing their own programs. Recommendations for the future direction and growth of data information literacy are needed so that programs are not sidetracked completely by the push for data.

Teaching 'data literacy'

"...we use data every day—to choose medications or health practices, to decide on a place to live, or to make judgments about education policy and practice. The newspapers and TV news are full of data about nutrition, side effects of popular drugs, and polls for current elections. Surely there is valuable information here, but how do you judge the reliability of what you read, see, or hear? This is no trivial skill—and we are not preparing students to make these critical and subtle distinctions..." — Rubin, 2005

Decision-making based on data is a common activity in 21st-century life, be it data in weather reports, prices at the grocery store, or discussions about your blood test with your doctor. Innumeracy (low literacy with numbers) costs the economy hundreds of millions of dollars in errors and lost productivity every year. The discourse around numerical literacy has been around for some time, and is related to data literacy. Some data experts believe that the need for multiple representations of numbers explains why data visualization is so popular at the moment. The thought is that by designing data for visualization, and building interactive experiences, large amounts of information are made visual and more understandable. Carlson (2011) articulates the need for a data information literacy (DIL) program to prepare students to engage in such an "e-research" environment. As far as libraries are concerned, the data movement presents some real opportunities for academic librarians. For example, in 2005, Humphrey created a "...collaborative training program introduced to develop baseline competencies in Canadian academic libraries to support data services...". In conjunction with an initiative between Statistics Canada and sixty-six Canadian universities, a data literacy program was developed to provide workshops over a seven-year period for librarians who provide services in handling data. A cost-sharing arrangement kept the cost of these data training courses to a minimum for individual universities.

For most scientists, research data is broadly defined as information (e.g., data sets, microarray data, numerical data, clinical trial information, textual records, images and sound) that is generated or used as quantitative evidence in primary biomedical research. This research data is distinguished by the fact that it is accepted by the research community as a means to validate research findings, observations and hypotheses. More recently, Qin (2010) found that metadata had a central role in how scientists understood data management in e-Science.

Questions for librarians

The rise of data, its use, curation and management, is a growing trend in academic libraries. However, rather than wait for your library organization to hire a data librarian or to create a data repository, why not try to introduce some data literacy skills (or exercises) into your library workshops?

  • First, how might you start to incorporate data literacy concepts into your information literacy programs?
      • Brainstorm and (re)write definitions, models and standards for our programs to include data
      • Develop discipline-based frameworks for information and data literacy
  • How should academic libraries provide data literacy education?
      • Should workshops be designed as standalone or integrated into courses?
      • Should they be part of research methods, theory courses or integrated across curricula?
  • Who should teach and support data literacy?
      • Data librarians, academic domain experts, LIS academics
      • Other subject experts

Web resources

4 stars denote librarian-selected, high-quality information. Starred sites are great places to begin your research.
  • Research project which aims to understand how people make sense of big data visualizations

