From HLWIKI Canada



Last Update: 14 February 2017


See also Altmetrics | Bibliometrics | Citation analysis | Google scholar metrics | Grey data ("hard to find" data) | ImpactStory | Scientometrics

"...The application of bibliometric and informetric approaches to study the web, its information resources, structures and technologies, is known as webometrics. Since the name was coined in 1997, the value of webometrics quickly became established through the Web Impact Factor, the key metric for measuring and analyzing website hyperlinks..." (Thelwall, 2013)

Webometrics refers to a set of quantitative techniques for tracking and evaluating the impact of web sites and other online sources of information. According to Wikipedia, "...webometrics (also cybermetrics) tries to measure the World Wide Web to get knowledge about the number and types of hyperlinks, structure of the World Wide Web and usage patterns...". Webometrics is an information-age discipline that measures web phenomena in numeric terms.

The term webometrics was coined by Almind and Ingwersen (1997) in recognition that informetric analyses could be applied to the web to reveal vital information. The authors refer to the number of hyperlinks on websites, their structure and patterns of use, and reciprocal links from other websites. This data can be mined for analysis, but experts recommend the concomitant use of data cleansing heuristics to ensure reliability. The web is not unlike a library catalogue in its infinite variety of connections; as librarians are aware, it is challenging to examine these connections meaningfully and to determine impact.

According to Björneborn and Ingwersen (2004), webometrics is defined as "the study of the quantitative aspects of the construction and use of information resources, structures and technologies on the Web drawing on bibliometric and informetric approaches." According to Thelwall, "...webometrics is (a) a set of quantitative techniques for tracking and evaluating the impact of web sites and online ideas and (b) the information science research field that developed these ideas. Webometric techniques include link analysis, web mention analysis, blog analysis and search engine evaluation, but from the perspective of digital library evaluation the main method is link analysis..."

Can the analysis of web "metrics" help to evaluate digital resources? Yes: links to a site reveal useful information about how popular the site is, which pages or resources are most popular, why they are popular and where they are popular (with whom). While this information can also be retrieved from server log file analysis, that analysis can normally only be conducted with the permission of the site’s webmaster. In contrast, link analysis can be applied to any site, and can be used to evaluate a site by comparing it with its competitors or with similar sites. The web server can also reveal information about missed audiences.
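As a rough illustration of the link analysis described above, the following Python sketch counts links from other sites to the pages of one site, showing which pages attract links and which hosts they come from. The function name `inlink_summary`, the `(source, target)` pair format and the example.com-style URLs are all assumptions made for this illustration; a real study would take its link data from a crawler or search engine and apply data cleansing first.

```python
from collections import Counter
from urllib.parse import urlparse


def inlink_summary(links, site_host):
    """Count external inlinks to the pages of site_host.

    links: iterable of (source_url, target_url) pairs (hypothetical crawl output).
    Returns two Counters: inlinks per target path, and inlinks per source host.
    """
    pages = Counter()
    sources = Counter()
    for source_url, target_url in links:
        source_host = urlparse(source_url).netloc.lower()
        target = urlparse(target_url)
        if target.netloc.lower() != site_host:
            continue  # the link points at a different site
        if source_host == site_host:
            continue  # internal links say nothing about outside interest
        pages[target.path or "/"] += 1
        sources[source_host] += 1
    return pages, sources


# Invented sample data, for illustration only.
sample_links = [
    ("http://uni-a.example/readings", "http://library.example/collections/maps"),
    ("http://blog-b.example/post1", "http://library.example/collections/maps"),
    ("http://uni-a.example/links", "http://library.example/"),
    ("http://library.example/about", "http://library.example/collections/maps"),
    ("http://blog-b.example/post2", "http://other.example/page"),
]

pages, sources = inlink_summary(sample_links, "library.example")
print(pages.most_common())    # most-linked pages first
print(sources.most_common())  # hosts that link most often first
```

Running the same summary for a competitor or a similar site gives the kind of comparison mentioned above, without needing access to either site's server logs.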

What is webometrics?

Web impact factors

In 1998, Ingwersen defined a web impact factor (see Noruzi, 2006) as the number of web pages on a web site receiving links from other web sites, divided by the number of web pages published on the site that are accessible to a web crawler such as Google's. Webometricians have used this measure to gauge a website's impact for well over a decade. A 2002 medical paper found that the web impact factor of a medical informatics society's site was statistically correlated with national productivity. Soualmia et al. also found that discrepancies could be used to indicate countries where medical informatics associations were weak or did not make optimal use of the web.
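The definition above is a simple ratio, which can be sketched in Python as follows. The function name `web_impact_factor` and the page counts are invented for illustration; in practice both counts depend heavily on the search engine or crawler used to obtain them.

```python
def web_impact_factor(linked_pages, crawlable_pages):
    """Web impact factor as defined in the text above.

    linked_pages: pages on the site that receive links from other web sites.
    crawlable_pages: pages on the site that are accessible to a web crawler.
    """
    if crawlable_pages <= 0:
        raise ValueError("crawlable_pages must be positive")
    return linked_pages / crawlable_pages


# Two hypothetical sites: the smaller one has the higher impact factor.
wif_a = web_impact_factor(linked_pages=120, crawlable_pages=400)
wif_b = web_impact_factor(linked_pages=45, crawlable_pages=90)
print(f"Site A: {wif_a:.2f}  Site B: {wif_b:.2f}")
```

Because the measure is a ratio, a small site with many linked pages can outscore a large one, which is one reason to compare only sites of a similar type.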

Invisible College

The concept of the invisible college was developed in the sociology of science by Diana Crane, building on Derek J. de Solla Price's work on citation networks. Though related, it is different from other concepts of expert communities such as epistemic communities (Haas, 1992) or communities of practice (Wenger, 1998). The concept was applied to the global network of communications among scientists by Caroline S. Wagner in "The new invisible college: science for development" (Brookings, 2008). It was also referred to in Clay Shirky's book "Cognitive surplus". The invisible college also refers to the vast, unfiltered, informal communications networks produced by communities of people who share an interest in a common passion or subject discipline. These communications can take place via e-mail, personal conversations, conference papers, unpublished diaries, meeting minutes, phone calls, newsletters, memoranda, and other sources that may not pass through the usual publishing, broadcasting, and distribution channels. This idea is linked to grey literature.

The strengths of the invisible college are as follows: historically inaccessible information is made available over the Internet; it is available sooner than conventional literature; and it may allow readers to "listen in" on active debate of current issues. Keep in mind, however, that discourse in the invisible college is of variable quality. This information can be hard to identify, search for, and access, and it may require validation, especially when it comes from Internet sources. Moreover, it assumes a high level of familiarity with the issues and is not a good source of background information. The advent of the Web and the shift from big science to networked science create unprecedented opportunities. Rather than investing resources to mimic or duplicate the scientific institutions of previous centuries, policymakers can use networks and create incentives for scientists to focus on research that mobilizes knowledge for local problem solving. As network accessibility continues to grow, it is important that researchers take full advantage of networking tools and collaborative opportunities to address local issues as well as to attain international research opportunities. Caroline Wagner called it the “New Invisible College”, a global networked college based on mutual interests and open sharing of knowledge. This highly distributed college is the foundation for a new knowledge commons where there are few barriers to participation.

Toolkit for the Impact of Digitized Scholarly Resources

The webometrics toolkit was originally developed by the Oxford Internet Institute in 2008 as part of a JISC-funded usage and impact study that explored the following questions: Are digital resources succeeding at reaching their intended users? Are they having an impact on their community of users? How can impact be measured? During the project, researchers carried out an impact analysis of five resources funded as part of the JISC Phase One Digitisation Programme (2004-2007), using the different methodologies listed in the toolkit.

Bibliometrics and scientometrics

Bibliometrics and scientometrics are two closely related fields that aim to measure scientific publications and science in general. Much of the research in these fields involves citation analysis, which examines how scholars cite one another in publications. Author citation data can reveal a great deal about scholarly networks and communication, linkages between scholars, and the development of areas of knowledge over time. Modern scientometrics is based on the work of Derek J. de Solla Price and Eugene Garfield.

The field of scientometrics, the science of measuring and analyzing science, took off in 1947 when mathematician Derek J. de Solla Price was asked to store a complete set of the Philosophical Transactions of the Royal Society temporarily in his house. He stacked them in order and noticed that the heights of the stacks fit an exponential curve. Price went on to analyze many other kinds of scientific data and concluded in 1960 that scientific knowledge had been growing steadily at a rate of 4.7 percent annually since the 17th century. The upshot was that scientific knowledge was doubling every 15 years.
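The link between a 4.7 percent annual growth rate and a doubling time of about 15 years follows from compound growth: a quantity growing at rate r per year doubles after ln(2) / ln(1 + r) years. The short Python check below is only an illustration of that arithmetic, not a reconstruction of Price's own method; the function name `doubling_time` is introduced here for the example.

```python
import math


def doubling_time(annual_rate):
    """Years needed to double under steady compound growth at annual_rate."""
    if annual_rate <= 0:
        raise ValueError("annual_rate must be positive")
    return math.log(2) / math.log(1 + annual_rate)


years = doubling_time(0.047)  # 4.7 percent per year, as quoted in the text
print(f"Doubling time: {years:.1f} years")  # roughly 15 years
```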

As with other scientific approaches, scientometrics and bibliometrics have their limitations. Recent criticism has pointed to deficiencies in the way the journal impact factor (JIF), which is based on the Web of Science, is calculated: journal citation distributions may be highly skewed towards established journals; JIF properties are field-specific; and the JIF can be easily manipulated by editors, or even by changes in editorial policy, which makes the entire process essentially nontransparent. Even for the more objective journal metrics, there is a growing view that, for greater accuracy, they must be supplemented with article-level metrics and peer review. Thomson Reuters replied to the criticism in general terms by stating that "no one metric can fully capture the complex contributions scholars make to their disciplines, and many forms of scholarly achievement should be considered."


According to Acharya et al. ("Rise of the rest: the growing impact of non-elite journals", arXiv, 16 October 2014), articles in non-elite journals (traditionally, those that have not been cited much) have started to be cited more in the last ten years because of Google Scholar.
