Google, Bing and Yahoo are three of the top Internet search engines in 2014. As such, they have a direct impact on everyone's ability to find things. Reference librarians and information specialists are affected by search engines, which are linked to other trends such as open access and web 2.0. Search engines are popular because they offer a quick search of the web. However, they may introduce various problems in information retrieval, especially in subject-based searching. Some health librarians say that poor recall, consistency and authority control in search engine content make them unacceptable for some queries, but acknowledge that for known items they facilitate quick access and point to popular content.

In 2014, in an attempt to improve the relevance of its search results, Google is exploring a combination of personalized results, localization, social networking and semantic search. In other words, two people can search for the same topic using identical keywords and come up with different results.

Searching for information

Medical information is of interest to a variety of user groups including patients and families, researchers, general practitioners and clinicians, as well as specialists. Despite the popularity of search engines, the development of search and access technologies in health and medicine remains particularly challenging. One of the central issues is the diversity of the user groups of these services. In particular, they have varying information needs, levels of medical knowledge and language skills. The format, reliability and quality of information vary considerably in the area. A single health record may contain clinical notes, technical pathology data, images and patient-contributed histories, and may be linked to research papers. Because health and medical topics matter so much to the everyday lives of patients, efficient and effective retrieval of the best evidence is especially important. Determining the reliability of information is challenging. As with information retrieval in general, evaluating medical search technologies is a perennial challenge.

Browsing & precision trends

It was inevitable that searching would take on some of the features of web 2.0. Many of the earliest algorithms were based on link popularity, a kind of wisdom of the crowds. Now, information needs in workplaces arise in the context of collaboration and participation. At one end, health teams write collaboratively and participate in trials where comprehensive (high-recall) searches are needed, though they may tolerate low precision to begin with. At the other end, medical students and nurses work together to retrieve a few good articles. These health professionals require high precision rather than high recall, and algorithms coupled with other forms of recommended websites produce acceptable results.

As the web scales in size, new requirements emerge. Customized social search, offered by Google health, is likely to become more important as a means of offering targeted searching among a group of sites. These websites will be recommended or compiled by other workers, collaborators or other experts. However, the rise of search tools and social search has been accompanied by a decline in the use of traditional databases. This may be because high recall and precision are not required for most queries. Health library users, for example, have different requirements for literature reviews and for basic information.

Precision tolerance of health professionals is directly related to recall. In the Internet age, the notion of complete recall as an indicator of success seems outdated and unrealistic. Exhaustive searching is not always needed. The idea of leading users to an acceptable number of papers has led some librarians to suggest proportional recall (or relative recall), where success is expressed as the number of relevant documents retrieved divided by the number of relevant documents required. For example, a pharmacist may need five relevant documents, but her search retrieves only three. Proportional recall is therefore three-fifths, or 60%. This measure, while appealing, is artificial in that few health library users can specify what they really need before searching, let alone how many documents they will need.
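
The proportional recall arithmetic can be sketched in a few lines of Python. This is only an illustration of the ratio described above: the function name, the input checks and the decision to leave values above 100% uncapped are assumptions, not part of any established definition.

```python
def proportional_recall(relevant_retrieved: int, relevant_required: int) -> float:
    """Relevant documents retrieved divided by relevant documents required.

    Hypothetical helper: the name and the validation rules are assumptions.
    """
    if relevant_required <= 0:
        raise ValueError("relevant_required must be a positive number")
    if relevant_retrieved < 0:
        raise ValueError("relevant_retrieved cannot be negative")
    # Assumption: the ratio is not capped, so retrieving more relevant
    # documents than required gives a value above 1.0.
    return relevant_retrieved / relevant_required


# The pharmacist example from the text: five documents needed, three found.
pharmacist = proportional_recall(3, 5)
print(f"{pharmacist:.0%}")  # prints 60%
```

As the text notes, the hard part in practice is the denominator, since few users can state in advance how many documents they need.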

Challenges faced by search engines

  • The web is growing faster than present technology can index it. According to anecdotal evidence, major search engines are slow to index new web pages. The sites indexed by the top three search engines overlap by less than 10%.
  • Many web pages are updated frequently, which makes it necessary to revisit them daily.
  • Queries are limited to keyword searching, which may result in false drops, especially with the default page-wide search. Better results might be achieved with a proximity-search option that uses a search bracket to limit matches to a paragraph or phrase, rather than matching words scattered at random across large pages. Another alternative is to have human operators do the research for the user, using organic search engines.
  • Dynamically generated sites may be slow or difficult to index, or may produce excessive results, perhaps generating 500 times more web pages than average. For example, for a dynamic web page that changes its content based on entries from a database, a search engine might be asked to index 50,000 static web pages, one for each of 50,000 different parameter values passed to that dynamic page.
  • Many dynamically generated sites are not indexable by search engines; this phenomenon is known as the deep, "dark" or invisible web.
  • Some search engines rank results not by relevance but by the amount of money the matching websites pay.
  • In the past year, search engine optimization (SEO) has become big business, and some techniques undermine organic results. This leads to linkspam or bait-and-switch pages that contain little information about the matched phrases. Some observers suggest that relevant web pages are being pushed down in results due to SEO.
  • Results are increasingly personalized. Because Google is exploring a combination of personalized results, localization, social networking and semantic search, two people who search for the same topic using identical keywords may come up with different results.
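
The difference between page-wide keyword matching and the proximity search mentioned in the list above can be illustrated with a minimal Python sketch. It does not describe how any real search engine works: the function names, the single-word query terms, the simple tokenizer and the ten-word window are all assumptions chosen for the example.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def page_wide_match(text: str, terms: list[str]) -> bool:
    """True if every query term appears anywhere on the page."""
    words = set(tokenize(text))
    return all(term.lower() in words for term in terms)


def proximity_match(text: str, terms: list[str], window: int = 10) -> bool:
    """True if all terms occur within `window` consecutive words.

    Assumption: each term is a single word; phrases are not supported.
    """
    wanted = {term.lower() for term in terms}
    if not wanted:
        return True
    # Keep only the positions of words that are query terms.
    hits = [(i, w) for i, w in enumerate(tokenize(text)) if w in wanted]
    for first, (start, _) in enumerate(hits):
        seen = set()
        for position, word in hits[first:]:
            if position - start >= window:
                break  # outside the window that begins at `start`
            seen.add(word)
            if seen == wanted:
                return True
    return False


# A long page where the two terms are far apart: a likely false drop.
scattered = ("Aspirin is sold widely. "
             + "Filler text about unrelated topics. " * 20
             + "Chronic headache affects many adults.")
focused = "Aspirin may relieve a tension headache."
query = ["aspirin", "headache"]

print(page_wide_match(scattered, query))   # True
print(proximity_match(scattered, query))   # False
print(proximity_match(focused, query))     # True
```

The page-wide test accepts the long page even though the two words are unrelated there, while the proximity test accepts only the page where they appear close together.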

