"...the application of library science principles and methodologies, such as cataloging, classification, and resource sharing, can be reinterpreted to meet the specific needs of scientific digital data management and described in terms that are more expansive and expressive of today’s challenges, such as metadata, taxonomy, and open source. In this way a network of well-documented data sets can be built that will facilitate the retrieval of data by researchers, today and into the future. Who else is better qualified than we librarians to bring this about?..."— Mullins, 2009
E-Science (also e-science, eScience, data science and science 2.0) is defined as the use of computational tools and resources to analyze large scientific datasets (see also bioinformatics entry). It has also been defined as "...computationally intensive science carried out in highly distributed network environments...and science that uses immense data sets and grid computing; the term sometimes includes technologies that enable distributed collaboration, such as the 'access grid'."
In their article about the implications of e-science for librarians, Hey and Hey (2006) define e-science as a type of "networked, data-driven science." E-science is also known as e-research or cyberscience, and its supporting systems as cyber-infrastructure or scientific collaboratories. The idea of e-science originated in the 1950s, when scientists first applied computer power to carry out their research. Since then, scientific research has evolved from traditional laboratory-focused work to the application of digital technologies such as computer models, simulation programs and sensors. The major feature of e-science (in addition to generating large datasets) is that such data can be rapidly disseminated and assessed by other researchers via the internet. According to health librarian Neil Rambo (2009), e-science "...alters the types of problems that scientists address, the tools that they use, and the nature of the publication that results from their research. Instead of conducting research to collect and analyze data, a typical e-science scenario mines existing data in search of patterns or correlations."
Emergence of a fourth research paradigm
The Fourth Paradigm is a term connected to e-science and data curation. It is associated with pioneering computer scientist Jim Gray, and a monograph of essays discusses the fourth paradigm of discovery based on data-intensive science. In 2007, Gray and his boat went missing after a short trip out of the San Francisco Bay Area. After five years missing at sea, Gray was legally presumed dead.
E-science roles for librarians
Williams (2009) identifies roles in the following areas for librarians in the developing world of e-science:
Science 2.0 is considered to be a shift from the publication of final results by well-defined collaborative groups towards a more open approach, which includes publicly sharing data, preliminary experimental results, and related information. To facilitate the shift, Science 2.0 focuses on providing tools that simplify communication, cooperation and collaboration between interested parties. Such an approach has the potential to speed up the process of scientific discovery, overcome problems associated with academic publishing and peer review and remove time and cost barriers limiting the process of generating new knowledge.
The Lamar Soutter Library at the University of Massachusetts Medical School and the Northeast Regional Medical Library NN/LM have developed an e-Science web portal for librarians. The portal includes educational resources for specific tools, as well as subject and discipline tutorials and modules to assist librarians new to e-Science.