Google scholar

From HLWIKI Canada
Google scholar provides advanced search capability which is only marginally better than its main search page

Last Update

  • 9 November 2014

Introduction

See also Bibliometrics | EBooks | Google Drive | Google scholar metrics | Google scholar bibliography | Microsoft Academic Search | Web 2.0

Google scholar has its own e-alerting feature for articles added to the database

Released in 2004, Google scholar is now widely viewed as an important search tool that students and scholars worldwide use to find citations to articles, books, preprints, abstracts and technical reports. However, in terms of its overall functionality, Google scholar may be best described as a browsing tool. Most academic librarians view it as a scholarly channel of Google content and not as a proper literature search database or interface.

Gray et al (2012) said that "Google scholar's value" may lie in how "useful [it is] for initial & supplemental information gathering". Compared to curated databases such as Medline, Google scholar offers part of what librarians expect to see in a search tool or interface but is, on its own, inadequate for most searching in medicine (Giustini, 2013).

Google scholar's ease of use and convenience are obvious, and of real benefit to researchers with few search skills. However, as an index and search tool, it is essentially a dumbed-down version of what an abstracting and indexing (A&I) service should be, and equal only to some tasks. Google scholar draws on the search power of Google, and that is a very good reason to use it. GS is especially useful in locating known items. Google's spiders crawl password-protected and subscription-based websites every day, reaching material that was previously uncrawled and identifying authors and titles of papers in the grey literature. For those doing systematic review searching, it would be inconceivable to search the web without Google scholar. Google's spiders also help to establish linkages between two otherwise disconnected or unknown articles. Consequently, GS hits the search sweet spot for many researchers trying to locate a few good articles for journal club or continuing education. Google is also adding articles and content from Google Books, which is another reason users find it indispensable in their research.

In early 2013, BioMedCentral published a paper entitled "Is the coverage of Google Scholar enough to be used alone for systematic reviews?" The article is timely and asks a good, relevant, focussed question, and the abstract is clear about the authors' aims. Beyond the abstract, however, there are semantic and methodological problems. First, keyword searches for "erythropoietin" and "darbepoetin" combined with cancer yield 100% recall but a precision of only 0.1% (36,630 articles retrieved to find 36). Are the authors suggesting that researchers consider a precision of 0.1% acceptable for a systematic review? Second, they don't extend their logic to other searches (generalizability). In other words, the articles are "indexed" in GS but the authors don't address the problem of finding them; just because articles are in a database doesn't mean you will find them. Third, the search itself isn't reproducible.
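The recall and precision figures quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the 36 articles are the full set of relevant records and that all of them were retrieved (which is what 100% recall implies); the function and variable names are illustrative and do not come from the paper.

```python
def precision(relevant_retrieved: int, total_retrieved: int) -> float:
    """Share of retrieved records that are relevant."""
    if total_retrieved <= 0:
        raise ValueError("total_retrieved must be positive")
    return relevant_retrieved / total_retrieved


def recall(relevant_retrieved: int, total_relevant: int) -> float:
    """Share of all relevant records that the search retrieved."""
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    return relevant_retrieved / total_relevant


# Figures quoted in the text: 36,630 records retrieved, 36 relevant articles.
# Assumption: all 36 relevant articles were among the retrieved records.
RETRIEVED = 36_630
RELEVANT = 36

p = precision(RELEVANT, RETRIEVED)
r = recall(RELEVANT, RELEVANT)
records_per_hit = RETRIEVED / RELEVANT  # records screened per relevant article

print(f"recall = {r:.0%}, precision = {p:.1%}")
print(f"records screened per relevant article: {records_per_hit:.1f}")
```

Put another way, a searcher would need to screen roughly a thousand records for every relevant article, which is the practical cost behind the 0.1% figure.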

Some background

"...For to all those who have, more will be given, and they will have an abundance..." — Matthew, 25:29 (See Matthew effect vis a vis Google scholar ranking algorithm)

Many years after its beta release and debates among librarians about its usefulness, Google scholar (GS) remains controversial. Its ease of access and simple interface make it extremely popular among academics. What explains this popularity, and what can librarians learn from it? For one, the interface is familiar, classic Google. GS links to 20 million books and hard-to-locate materials. Its size is probably more than one billion documents, which would make it the largest search tool ever conceived for scholarly searching.

GS is useful for locating peer-reviewed content and grey literature produced by government and other agencies. GS includes Elsevier journals, JSTOR and many other publishers. GS is comparable to CiteSeer and has most content found in other open search tools. GS points to more websites, journals and languages than any competitor. As such, it is useful (some say ideal) for browsing and pre-searching. Its tagline - "Stand on the shoulders of giants" - is a nod to scientists who have contributed to the scholarly literature. American LIS professor Peter Jacso is the most cited academic writing about Google scholar. Because Google scholar weights citation counts highly in its ranking algorithm, it is criticized for strengthening the so-called Matthew effect. In other words, highly cited papers appear first in search results and thus gain more citations, while new papers don't rank highly and therefore get less attention.
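The feedback loop behind the Matthew effect can be illustrated with a toy simulation. This is only a sketch under a stated assumption: each new citation goes to a paper with probability proportional to the citations it already has, plus one. It is not Google scholar's ranking algorithm, which is unpublished, and the function name, parameters and seed are hypothetical.

```python
import random


def simulate_citations(n_papers: int, n_new_citations: int, seed: int = 0) -> list[int]:
    """Toy 'rich get richer' model of citation growth.

    Assumption: the chance that a paper receives the next citation is
    proportional to (its citations so far + 1).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = [0] * n_papers
    for _ in range(n_new_citations):
        weights = [c + 1 for c in counts]
        winner = rng.choices(range(n_papers), weights=weights, k=1)[0]
        counts[winner] += 1
    return counts


counts = simulate_citations(n_papers=20, n_new_citations=2000)
ranked = sorted(counts, reverse=True)
top_share = sum(ranked[:4]) / sum(ranked)  # share held by the top 4 papers

print("citations per paper, highest first:", ranked)
print(f"top 4 of 20 papers hold {top_share:.0%} of all citations")
```

All papers start equal in this model, so any concentration in the output comes from the feedback loop alone.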

In 2010, Anurag Acharya created the Google scholar blog and wrote its first post. His team at Google scholar posts periodically about changes to the search tool.

Search features

GS provides easy searchability in classic Google fashion: pop in a few keywords and start browsing. GS locates born-digital content and points to articles online and on library shelves (see library links). Its "All versions" feature provides access to fee-free full text (preprints and early drafts) and its handy "Cited by" feature links to articles that cite the one being viewed. "Related articles" presents ranked lists of closely related articles. In response to the release of Microsoft's Academic Search tool, an importing feature was included for RefWorks, Reference Manager, EndNote and BibTeX (see citing). GS also added a pull-down menu in the search display to limit searches to a range of years. US legal opinions and cases were made searchable in 2010 using the appropriate radio buttons on the front page; e-mail alerts are also available from the results display.

Pluses & minuses

  • same search operators and "limits" as in Mother Google (see Google commands)
  • easy to search for academic journals and grey literature
  • patents, legal opinions and law journals are searchable; e-mail alerts can be set up
  • some papers are only available to subscribers (unless open access)
  • citations are determined to be scholarly by Google (not by scholars or librarians)
  • total # of journals and coverage is unknown; many scholarly journals are not indexed
  • cannot do proper, structured literature reviews because there is no search "history"
  • more coverage in sciences than humanities
  • older material ranked higher, usually
  • "Cited by" (citation searching) is useful but numbers may be inflated compared to Scopus or Web of Science
  • sorting, browsing and exporting all citations at the same time are not available

Searching for Canadian content

  • Limiting to Canadian materials only is not possible; using the site:ca command limits results to Canadian sites, which means that relevant results may be missed.
  • Another way to retrieve Canadian articles is by using the journal name to limit results. In Advanced Search, add can OR canada OR canadian to the publication box. However, this string will return not just journal articles but other materials as well. Moreover, the search will omit articles from Canadian journals that do not have “Canada” or “Canadian” in their name and from journals where Canadian is abbreviated (e.g. CBLJ).
  • In searching for articles in specific subject domains, limit to specific journals; ensure you include all ways the journal may be cited. Google does not allow you to truncate search terms.
  • The intitle: operator limits results to those with “Canada” or “Canadian” in the title (e.g. intitle:canada OR intitle:canadian) and is not recommended. Scholarly material in Google scholar is international but has an American bias; even so, there are enough Canadian resources to make it worthwhile.
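The strategies in the list above can be written out as query strings. This is a minimal sketch: the helper names are hypothetical, the search URL is assumed to be the public endpoint with a plain q parameter, and the journal name forms are illustrative examples of "all ways the journal may be cited".

```python
from urllib.parse import urlencode

# Assumed public search endpoint; only the plain "q" parameter is used here.
SCHOLAR_SEARCH = "https://scholar.google.com/scholar"

# String for the Advanced Search publication box, as given in the text.
PUBLICATION_BOX = "can OR canada OR canadian"


def canadian_variants(topic: str) -> dict[str, str]:
    """Query strings for the site: and intitle: strategies described above."""
    return {
        # Restricts to .ca sites, so Canadian work hosted elsewhere is missed.
        "site_ca": f"{topic} site:ca",
        # Matches titles only; the text does not recommend this approach.
        "intitle": f"{topic} intitle:canada OR intitle:canadian",
    }


def journal_forms(forms: list[str]) -> str:
    """OR together every form of a journal name, since truncation is unavailable."""
    return " OR ".join(f'"{form}"' for form in forms)


def search_url(query: str) -> str:
    """Encode a query string into a search URL."""
    params = urlencode({"q": query})
    return f"{SCHOLAR_SEARCH}?{params}"


variants = canadian_variants("diabetes")
cmaj = journal_forms(["Canadian Medical Association Journal", "CMAJ", "Can Med Assoc J"])

for name, query in variants.items():
    print(name, "->", search_url(query))
print("publication box:", PUBLICATION_BOX)
print("journal forms:", cmaj)
```

Running each variant and comparing the results is one way to see which relevant records a given restriction drops.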

Quality, content

Some searchers consider GS of comparable value to commercial databases despite its remaining in beta, while others say it is hampered by poor design and quality control. When searching for items based on publication dates, for example, results are inflated and unreliable. Citation tracking numbers are also inflated, sometimes by as much as 50%. The number of articles found in some searches increases instead of decreasing when limiting to specific year ranges (e.g. 2000-2006). Some critics say that GS has a counterintuitive presentation of results. The longstanding problem is the secrecy about GS coverage and Google's refusal to publish the names of the journals it crawls. Consequently, it is impossible to know how current or exhaustive your searches are. Where searches do not have to achieve high levels of recall, GS finds favour with those who need a few articles and who want their searching to be on the open web and as simple as possible.

User-friendly as a browsing tool

GS is an ideal vehicle for browsing the academic side of the web. Several librarians have studied its potential as a research tool with the general conclusion that it cannot compete with the power and flexibility of interfaces such as OvidSP or EBSCO. Some librarians ask what content can be retrieved from GS but Google has never released this information - librarians are therefore left to make educated guesses about its coverage. The consensus seems to be that GS is not as current or comprehensive as other tools such as PubMed but it offers a quick and complementary view of findable materials. Being aware of its strengths and weaknesses and advanced search capabilities will help health librarians learn when to recommend it to users.

Search basics

  • Step I: type in http://scholar.google.com
  • Step II: enter two specific words describing what you want to find, then hit search
  • Step III: browse results
    • Enter up to 32 words (used to be 10).
    • Remember default is AND e.g. heart failure retrieves heart AND failure (no need to type AND)
    • type in "heart failure" to force a phrase search
    • Use OR to find related or synonymous terms
    • e.g. surgery AND dental OR dentistry OR dentist
    • Exclude words causing problems with a minus sign e.g. clinton hospital -president
    • Add terms to searches e.g. clinton hospital umass OR massachusetts
    • Google retrieves results based on your terms, then ranks them according to:
      • word frequency (how many times query words appear in each document),
      • word order (first terms in query are given greater weight),
      • word proximity (how close terms are to each other),
      • word location (e.g. in title or heading)
      • PageRank (Google's link popularity algorithm, based on how many other pages link to this one, similar to cited reference searching), among other complex statistical algorithms
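The operators listed above (implicit AND, quoted phrases, OR and the minus sign) can be combined programmatically. The helper below is a hypothetical sketch, not part of any Google tool; it assumes the 32-word limit counts search words but not the OR operator itself.

```python
def build_query(all_of=(), phrases=(), any_of=(), exclude=(), max_words=32):
    """Assemble a query string from the operators described above.

    all_of  -- words that must all appear (AND is the default, so not typed)
    phrases -- exact phrases, wrapped in double quotes
    any_of  -- synonyms or related terms, joined with OR
    exclude -- words to remove from results, prefixed with a minus sign
    """
    terms = [*all_of, *phrases, *any_of, *exclude]
    n_words = sum(len(term.split()) for term in terms)
    # Assumption: the limit applies to search words, not to the OR operator.
    if n_words > max_words:
        raise ValueError(f"query has {n_words} words; the limit is {max_words}")

    parts = list(all_of)
    parts += [f'"{phrase}"' for phrase in phrases]
    if any_of:
        parts.append(" OR ".join(any_of))
    parts += [f"-{word}" for word in exclude]
    return " ".join(parts)


# The examples from the list above, rebuilt with the helper.
phrase_query = build_query(phrases=["heart failure"])
or_query = build_query(all_of=["surgery"], any_of=["dental", "dentistry", "dentist"])
minus_query = build_query(all_of=["clinton", "hospital"], exclude=["president"])
added_query = build_query(all_of=["clinton", "hospital"], any_of=["umass", "massachusetts"])

for query in (phrase_query, or_query, minus_query, added_query):
    print(query)
```

Keeping the pieces separate until the last step makes it easy to record exactly which terms were searched, which partly compensates for the lack of a search history.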

See also NNLM Super Searcher: Enhancing Your Online Search Super Powers

FindZebra

In head-to-head comparisons, a search tool for rare diseases called FindZebra outperforms Google. The developers at FindZebra have created specialized functionalities that exploit medical ontological information and UMLS medical concepts, and have demonstrated different ways of displaying the retrieved results to medical experts. FindZebra is available at http://www.findzebra.com/

References

See Google scholar bibliography (as of 2014, 170 articles in total)
