Semantic search

From HLWIKI Canada
Semantic search is linked to web 2.0 & web 3.0


Last Update

  • This entry is out of date and will not be updated (February 2018).


See also Data science portal | FOAF - Friend of a Friend | Google scholar | PubMed Alternative Interfaces | PubReMiner | Semantic web | Semiotics and the web | Web 3.0

Semantic search tools seek to improve the accuracy of web searching by considering the context (and/or meaning) of search terms, both as they are entered and as they occur in the many possible "desirable web documents". Rather than relying on well-indexed documents in a database or on algorithms that predict document relevancy, semantic search tools use semantics, the science of meaning, to produce a relevant list of search results. The goal is to deliver information in a meaningful way, so that searchers do not have to sort through lists of documents bound together by loosely related keywords (or by a lack of specificity in indexing). Some authors view semantic search as a set of techniques for retrieving knowledge from richly structured databases, techniques that enable web technologies to define domains at a sophisticated level.

Semantic tools on the web rely on metadata to describe and bring together documents during information retrieval, much as human-mediated (curated) indexing does. This metadata helps to describe and retrieve documents in much the same way as the indexing found in medical databases. Metadata is defined as 'data about data', and the major standard in the field is Dublin Core.
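
As a rough illustration of retrieval by metadata, the sketch below describes two documents with a handful of Dublin Core elements (title, creator, subject, date) and looks them up by subject rather than by full text. The records and the function name `find_by_subject` are hypothetical and exist only for this example.

```python
# Hypothetical records described with a few Dublin Core elements.
RECORDS = [
    {"title": "Managing hypertension in adults", "creator": "Lee, A.",
     "subject": ["hypertension", "cardiology"], "date": "2012"},
    {"title": "Pressure washers: a buyer's guide", "creator": "Ng, B.",
     "subject": ["tools", "cleaning"], "date": "2011"},
]

def find_by_subject(records, term):
    """Return titles of records whose 'subject' element contains the term."""
    term = term.lower()
    return [r["title"] for r in records
            if term in (s.lower() for s in r["subject"])]

print(find_by_subject(RECORDS, "hypertension"))
```

A search on the subject element ignores the word "pressure" in the second title, which is the kind of precision that curated indexing aims for.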

For more background, see the Semantic MediaWiki. In 2013, JMIR (the Journal of Medical Internet Research) published a relevant research article entitled "Social Semantic Web Techniques Foster Collaborative Curriculum Mapping In Medicine".

Cognition Search closed 2014

(now closed) Cognition Search used natural language mapping technology and a blend of linguistics and mathematical algorithms to locate and collate content. In effect, this helped computers find meaning (or related concepts) in the words we use in searches. Cognition Search understood relationships between words and phrases (meaning), paraphrases (a "finger" or a "digit") and taxonomies (a "finger" is part of a "hand"; a "cow" is a "bovine" and a "mammal"). It permitted searching across four domains:

  • Law - (1,858 volumes; 675,704 files of federal case law). US Supreme Court Decisions and Court of Appeals decisions from 1950 onward
  • Medicine - MEDLINE abstracts of biomedical information from the international literature; the database covers medicine, nursing, pharmacy, dentistry and veterinary medicine, as well as fields with no direct medical connection, such as molecular evolution (~19 million files).
  • Wiki - English version of Wikipedia
  • Bible - New English Translation of Gospels of Matthew, Luke, John and Mark
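
The word relationships described above (paraphrase, part-of and is-a) can be sketched as a small query-expansion routine. This is a minimal illustration under stated assumptions, not Cognition Search's actual method: the tiny vocabulary and the names `SYNONYMS`, `IS_A`, `PART_OF`, `broader_terms` and `expand_query` are all invented for the example.

```python
# Toy vocabulary; a real system would draw on a much larger lexical resource.
SYNONYMS = {"finger": {"digit"}, "digit": {"finger"}}   # paraphrases
IS_A = {"cow": "bovine", "bovine": "mammal"}            # child -> parent
PART_OF = {"finger": "hand"}                            # part -> whole

def broader_terms(term):
    """Follow is-a links upward, e.g. cow -> bovine -> mammal."""
    chain = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain

def expand_query(term):
    """Return the term plus its paraphrases, broader terms and containing whole."""
    expanded = {term} | SYNONYMS.get(term, set()) | set(broader_terms(term))
    if term in PART_OF:
        expanded.add(PART_OF[term])
    return sorted(expanded)

print(expand_query("cow"))     # ['bovine', 'cow', 'mammal']
print(expand_query("finger"))  # ['digit', 'finger', 'hand']
```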

Duck Duck Go

Duck Duck Go (DDG) brings up the most 'official' page first. If the search terms are linked to a Wikipedia page, a short blurb appears at the top along with related search terms. DDG also features special category pages, and recognizes calculations, phone numbers, zip codes, ISBNs and product codes, as well as street and IP addresses.


Exalead closed 2016

(now closed) Exalead offers a host of enterprise 2.0 options to narrow searches based on image size, color and content. These features are appearing in other search engines.

Evri

Evri is a technology company developing products that change the way consumers discover and engage with content on the Web. Publishers that have used Evri’s semantic platform on their sites include prominent news organizations such as the Washington Post, Hearst Publishing, Yahoo! and the Times of London. With over 2 million pages across 500 categories, several content recommendation applications and a feature-rich API platform, Evri aims to improve access to information.


Factbites

The aim of the engine is to return meaningful sentences for the search query. Factbites offers real, meaningful sentences that are right on topic, a technique that lies between a site summary and a summary of results.

Hakia -

Hakia's semantic search technology is based on a computerized system that understands content and queries in a way similar to how the human brain processes natural language. Instead of matching occurrences of words or symbols (as indexing systems do), semantic search systems match concepts and their meaningful variations.


JANE

To search JANE, insert your text sample (a title, abstract or simply keywords) in the search box and choose one of the three basic search methods: by journal, by article or by author. Choosing to search by journal (title) finds journals that have published articles on similar topics to yours. The author search finds authors who have previously published on your topic. An article search finds published articles with similar content.
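
How JANE measures similarity is not described here. As a loose illustration of the journal search, the sketch below assumes a plain bag-of-words cosine similarity between the text sample and previously published articles; the article records and the names `bag_of_words`, `cosine` and `rank_journals` are hypothetical.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Count lowercase alphabetic tokens in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Hypothetical published articles.
ARTICLES = [
    {"journal": "Journal A", "text": "semantic search in biomedical literature"},
    {"journal": "Journal B", "text": "surgical outcomes after knee replacement"},
]

def rank_journals(sample, articles):
    """Return journals whose articles share vocabulary with the sample, best first."""
    query = bag_of_words(sample)
    scored = [(cosine(query, bag_of_words(a["text"])), a["journal"]) for a in articles]
    return [journal for score, journal in sorted(scored, reverse=True) if score > 0]

print(rank_journals("semantic search tools for literature", ARTICLES))
```

The same scoring could be grouped by author or left at the article level to mimic the other two search methods.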


Leksi

The name Leksi (now closed) is derived from the linguistic term "lexical", meaning "related to words". The tool emphasized language processing at the level of words and the meanings associated with them, and explored more intelligent and meaningful ways for users to find information. Its method was intended to bring more accurate and relevant search results than the search tools of the time.

NEPOMUK - The Social Semantic Desktop

NEPOMUK (Networked Environment for Personalized, Ontology-based Management of Unified Knowledge) brings together researchers, industrial software developers and representative industrial users to develop a comprehensive solution for extending the personal desktop into a collaboration environment. The environment supports both personal information management and the sharing and exchange of information across social and organizational relations.

NLM Plus

NLMplus is an innovative semantic search and discovery application, developed by WebLib LLC, a small business in Maryland, in response to a challenge contest by the National Library of Medicine (NLM) to make use of NLM’s vast collection of biomedical data and services for the benefit of the Library’s diverse worldwide user communities.


Powerset

Powerset applies natural language processing to search, aiming to improve how we find information by unlocking the meaning in ordinary human language. Powerset is a search and discovery experience for Wikipedia. In the search box, you can express yourself in keywords, phrases or simple questions. On the results page, questions are answered directly, with answers aggregated from across multiple articles.


PubReMiner

PubReMiner performs a statistical analysis of the words in a PubMed (MEDLINE) search and generates hyperlinked frequency tables that summarize the retrieved MEDLINE records by year, journal, author, keyword, MeSH term, substance and country. The tool provides an interesting view into queries and helps in optimizing search results.
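
A frequency table of this kind can be approximated in a few lines. The sketch below is an assumption-laden illustration rather than PubReMiner's implementation: the three records are made up, and `frequency_table` is a hypothetical helper that counts the values of one field across a result set.

```python
from collections import Counter

# Hypothetical search results; real records would come from PubMed.
RECORDS = [
    {"year": 2015, "journal": "Journal A", "mesh": ["Semantics", "Internet"]},
    {"year": 2015, "journal": "Journal B", "mesh": ["Semantics"]},
    {"year": 2016, "journal": "Journal A", "mesh": ["Information Storage and Retrieval"]},
]

def frequency_table(records, field):
    """Count how often each value of `field` occurs, most frequent first."""
    counts = Counter()
    for record in records:
        value = record[field]
        # Multi-valued fields (such as MeSH terms) contribute one count per value.
        counts.update(value if isinstance(value, list) else [value])
    return counts.most_common()

print(frequency_table(RECORDS, "year"))     # [(2015, 2), (2016, 1)]
print(frequency_table(RECORDS, "journal"))  # [('Journal A', 2), ('Journal B', 1)]
```

Frequent MeSH terms or journals in such a table suggest terms to add to, or exclude from, the next version of the query.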

PureDiscovery closed 2014

(now closed) When it comes to search, “Meaning Matters.” It’s time for a search engine that thinks like we do, learns like we can, and interacts with us in a human-like way. PureDiscovery KnowledgeGraph is a leap forward for search that allows users to interact with a search engine conversationally, as opposed to using a cryptic search language. Armed with this approach, your organization will both maximize on-target results and minimize unproductive search time. PureDiscovery is leading a radical reinvention of search, based on a core set of beliefs.

Quetzal -


A semantic search engine from Germany. Its analysis page for a website is worth a look for the interesting data it presents. Its list of related terms lets you tag-surf other aspects of an inquiry's meaning.

Semantic Scholar

Semantic Scholar is a project developed at the Allen Institute for Artificial Intelligence and released in November 2015. It is designed to be a "smart" search service for journal articles. The project uses a combination of machine learning, natural language processing and machine vision to add a layer of semantic analysis to the traditional methods of citation analysis. In comparison to Google Scholar and PubMed, it is designed to quickly highlight the most important papers and identify the connections between them.

As of November 2016, the corpus includes 10 million papers from computer science and neuroscience, of which 25% fall into the latter category. The near-term goal is to expand to include all the biomedical sciences by 2017, a total of 20 million papers.

Semantic Web Search Engine


SenseBot

SenseBot (Beta) is a semantic search engine that generates a text summary of web pages on the topic of your search. It uses text mining and multidocument summarization to extract sense from Web pages and present it to the user in a coherent manner. A "Semantic Cloud" of concepts is displayed above the summary, allowing users to steer the focus of the results.

Sindice, semantic web index

Billions of pieces of reusable information can already be found across hundreds of millions of web pages that embed RDF and Microformats. Start consuming this data today with Sindice Data Web services.


TipTop

TipTop Technologies, Inc. is an emerging Silicon Valley-based company founded during the summer of 2008, whose first consumer-facing product on the Internet was launched in June 2009. Having built some unique and powerful technology at the outset, TipTop is well positioned to take a leadership position in the growing market of semantic-driven products, in both the consumer and the enterprise space.


The world's first AI question-answering platform. We are using our unique semantic technology to build the first internet-scale platform for directly answering the world's questions. As knowledge is added to the platform we understand and answer more and more.


Watson

This is the Watson Web interface for searching ontologies and semantic documents using keywords. The interface is subject to frequent evolutions and improvements. At the moment, you enter a set of keywords (e.g. "cat dog old_lady") and a list of URIs of semantic documents appears in which the keywords occur as identifiers or within classes, properties, and individuals. You can use "wildcards" in the keywords (e.g., "ca? dog*"). You can restrict the search to particular types of entities (classes, properties or individuals) and to elements within entities (local name, label, comment or any literal). For example, you can express queries like "give me the classes or the individuals using the term car in the name or in the label".
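
The keyword and wildcard behaviour described above can be imitated with Python's standard `fnmatch` module. This is only a sketch and not Watson's code: the two documents, their URIs and the `search` function are hypothetical, and the sketch assumes that every keyword must match at least one entity name in a document.

```python
from fnmatch import fnmatchcase

# Hypothetical semantic documents, keyed by URI, with their named entities.
DOCUMENTS = {
    "http://example.org/pets.owl": {
        "classes": ["Cat", "Dog"], "properties": ["hasOwner"], "individuals": ["old_lady"]},
    "http://example.org/vehicles.owl": {
        "classes": ["Car"], "properties": ["hasWheel"], "individuals": []},
}

def search(documents, keywords, entity_types=("classes", "properties", "individuals")):
    """Return URIs of documents in which every keyword pattern matches an entity name.

    '?' matches exactly one character and '*' matches any run of characters.
    Matching is case-insensitive and limited to the given entity types.
    """
    hits = []
    for uri, entities in documents.items():
        names = [name.lower() for kind in entity_types for name in entities[kind]]
        if all(any(fnmatchcase(name, kw.lower()) for name in names) for kw in keywords):
            hits.append(uri)
    return hits

print(search(DOCUMENTS, ["ca?", "dog*"]))                 # only the pets document
print(search(DOCUMENTS, ["ca?"], entity_types=("classes",)))  # both documents
```

Restricting `entity_types` plays the role of limiting a query to classes, properties or individuals; limiting to labels or comments would need richer records than this toy data holds.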


Yummly

Yummly is the world’s first semantic recipe search and recommendation platform. Yummly enables you to find and customize recipes based on your personal taste, nutritional and dietary preferences. The site aggregates recipes from cooking websites, and is fully integrated with Facebook.


Use the Browser Application to browse data sources of the Semantic Web. Its UMBEL (Upper-level Mapping and Binding Exchange Layer) is a reference structure for placing content and data in context with other data. It comprises 20,000 subject concepts and their relationships, both with one another and with external vocabularies and named entities.

