As the digital era advances, the volume and complexity of information continue to grow. Finding your needle in this ever-expanding haystack is more challenging yet more critical than ever. Our lives and work often hinge on discovering that one crucial piece of information. This is especially true in fields like health, where finding the right answer at the right time can make all the difference. Search technology has made impressive strides to keep up. We’ve journeyed from basic keyword search to more advanced semantic search powered by natural language processing, knowledge graphs and AI. Then there’s hybrid search, which provides the benefits of both. But what exactly is hybrid search, and why has it emerged? We will explore the differences between these approaches and help you determine which one is best suited to your needs.
What’s Inside:
Understanding a query - the powerhouse of great search
Keyword Search: Matching the exact words
Semantic Search: More than just words
Hybrid Search: The best of both worlds
The core of any search: Understanding a query
Search typically follows a common structure: understand and process the query, match it against all relevant results, and then rank those results by relevance to the original query, as well as by any other predefined rules.
Understanding is the first step and also the most critical one: if the query is misunderstood, no amount of matching or ranking will produce the right response. When it comes to understanding and processing the original query, there are two main types of underlying technology: keyword search and semantic search.
Traditional Keyword Search: Matching the words
Traditional keyword search is centred around the simple idea of finding text in your resources that is identical to the query, similar to looking up words in a book index to find exactly where the information is located.
💡 A real-life example of how keyword search works in health
🔍 Search for “diabetes history” in EMR
The system looks for literal text matches of “diabetes history”. Depending on the configuration, it looks for the exact phrase “diabetes history”, for records containing both “diabetes” and “history” (an AND operator), or for records containing either “diabetes” or “history” (an OR operator). The results can differ vastly depending on which operator is used, and none of these modes considers the full intent of the query.
The system ranks these results based on relevance factors, like frequency. For instance, results where "diabetes history" is mentioned frequently and prominently might be ranked higher.
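To make the three matching modes and the frequency-based ranking concrete, here is a minimal sketch in Python. The note contents, the `tokenize` and `keyword_search` helpers, and the scoring rule (raw term counts) are all assumptions made for illustration; production search engines use more refined relevance formulas.

```python
from collections import Counter

# Hypothetical toy records standing in for EMR notes (not real data).
NOTES = {
    "note-1": "Patient has a diabetes history and a family diabetes history.",
    "note-2": "History of hypertension. No diabetes.",
    "note-3": "Type 2 diabetes diagnosed in 2019.",
    "note-4": "Seasonal allergies only.",
}


def tokenize(text):
    """Lowercase the text and split on anything that is not a letter or digit."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return cleaned.split()


def keyword_search(query, notes, mode="phrase"):
    """Return (note_id, score) pairs, highest score first.

    mode="phrase": score = number of times the terms appear consecutively.
    mode="and":    every term must appear; score = total term occurrences.
    mode="or":     any term may appear; score = total term occurrences.
    """
    terms = tokenize(query)
    scored = []
    for note_id, text in notes.items():
        tokens = tokenize(text)
        counts = Counter(tokens)
        if mode == "phrase":
            n = len(terms)
            score = sum(tokens[i:i + n] == terms for i in range(len(tokens) - n + 1))
        elif mode == "and":
            has_all = all(counts[t] > 0 for t in terms)
            score = sum(counts[t] for t in terms) if has_all else 0
        elif mode == "or":
            score = sum(counts[t] for t in terms)
        else:
            raise ValueError(f"unknown mode: {mode}")
        if score > 0:
            scored.append((note_id, score))
    # Rank by frequency (descending), breaking ties by note id.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


for mode in ("phrase", "and", "or"):
    print(mode, keyword_search("diabetes history", NOTES, mode))
```

On this toy data the phrase mode returns a single note, the AND mode returns two, and the OR mode returns three, which illustrates how strongly the operator choice shapes the result set.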
Keyword Search in action: pros and cons
Keyword search is great for quickly retrieving consistent results that exactly match the words used in the query. Typically, this works best for searching on properties like a name, phone number, or a highly specialized term. When you know exactly what you're looking for - and have the specific keywords - you can find the results very quickly.
However, this exact text-matching method also has limitations.
It depends heavily on the user's ability to specify the right keywords, and on its own it does not account for different words with a similar meaning (synonyms) or for grammatical variations (like plurality or tense). This means a number of important and highly relevant results might be left out entirely.
Particularly in health, users may not always know the precise terminology or might miss important information if it is not captured by the specific keywords they use.
This can happen due to differences in language, levels of expertise, or specialization - but there is limited flexibility to understand and interpret these regular variances in our natural language with only a keyword search.
Semantic Search: More than just your words
Semantic search, as the name suggests, considers a query's meaning and uses that understanding to find the most relevant match, rather than only matching with the exact text of the keywords used.
Inside Semantic Search: NLP, Knowledge Graphs, and Vector Embeddings
Different underlying technologies provide different levels of semantic capability to interpret a keyword or string of keywords. There are typically two main approaches to providing semantic understanding: manually curated knowledge graphs, and AI-powered vector search.
Traditional Semantic Search
Compared to keyword search, this approach is more intelligent in understanding the intent behind a query. It can handle longer and more complex text, and match words with similar or related meanings, to better understand the specific query topic.
This is achieved primarily by manually curating a list of related words and concepts, organized through a Knowledge Graph. This allows the system to connect associated words to the original query, matching the expanded list of relevant keywords against the potential results in the index. While it allows for more nuanced and comprehensive result matching, it requires significant manual work to curate the list, and typically covers only the most common queries and concepts, at the expense of more unique or specialized long tail queries.
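The query-expansion idea described above can be sketched as follows. This is a simplified illustration, not a description of any particular product: the `KNOWLEDGE_GRAPH` dictionary, the `expand_query` and `search` helpers, and the documents are hypothetical, and a real knowledge graph would carry typed relationships rather than a flat set of related terms.

```python
import re

# Hypothetical hand-curated graph: concept -> related terms.
KNOWLEDGE_GRAPH = {
    "diabetes": {"diabetes mellitus", "hyperglycemia"},
    "heart attack": {"myocardial infarction"},
}

# Hypothetical documents in the index.
DOCS = {
    "d1": "Patient admitted with myocardial infarction.",
    "d2": "Heart attack in 2015, now stable.",
    "d3": "Presents with atrial fibrillation.",
}


def contains_term(term, text):
    """True if `term` appears in `text` as a whole word or phrase."""
    return re.search(rf"\b{re.escape(term)}\b", text) is not None


def expand_query(query, graph):
    """Return the query plus the related terms of every concept it mentions."""
    q = query.lower().strip()
    terms = {q}
    for concept, related in graph.items():
        if contains_term(concept, q):
            terms |= related
    return terms


def search(query, docs, graph):
    """Return ids of documents matching any term in the expanded query."""
    terms = expand_query(query, graph)
    return sorted(
        doc_id
        for doc_id, text in docs.items()
        if any(contains_term(t, text.lower()) for t in terms)
    )


print(search("heart attack", DOCS, {}))               # plain keyword match only
print(search("heart attack", DOCS, KNOWLEDGE_GRAPH))  # with curated expansion
print(search("afib", DOCS, KNOWLEDGE_GRAPH))          # long-tail term, not curated
```

With the curated graph, a search for "heart attack" also surfaces the note that says "myocardial infarction". The "afib" query finds nothing because nobody has added that abbreviation to the graph, which is the long-tail limitation mentioned above.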
Vector Search
While traditional semantic search methods layer additional context on top of keyword searches, they still fall short of the rich understanding that a human would have. Vector similarity search achieves a significantly higher level of query understanding through vector embeddings - numeric representations of the meaning and relationships of a sentence, word or token. These representations can be understood as points in a multidimensional space, where the location of each data point is semantically meaningful: items with similar meanings sit close together.
This allows machines to retrieve more accurate and contextually relevant results, which is particularly helpful for highly vertical fields like health, where the precise understanding of terminology and relationships can significantly impact decision-making and outcomes.
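As a rough illustration of "points in multidimensional space", the sketch below ranks entries by cosine similarity, one commonly used closeness measure for embeddings. The three-dimensional vectors are hand-made for this example and are purely an assumption; real embeddings come from a trained model and usually have hundreds of dimensions. The `cosine_similarity` and `nearest` helpers are likewise hypothetical names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made toy vectors (assumption): a real system would obtain these
# from an embedding model rather than writing them by hand.
EMBEDDINGS = {
    "high blood sugar": [0.9, 0.1, 0.0],
    "hyperglycemia": [0.8, 0.2, 0.1],
    "broken arm": [0.0, 0.1, 0.9],
}


def nearest(query_vector, embeddings, k=2):
    """Return the k entries whose vectors are most similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), name)
        for name, vector in embeddings.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Toy vector standing in for the query "elevated glucose".
query_vector = [0.85, 0.15, 0.05]
print(nearest(query_vector, EMBEDDINGS, k=2))
```

The query shares no words with "hyperglycemia" or "high blood sugar", yet both land near it in the toy space, while "broken arm" sits far away. That is the property that lets vector search match on meaning rather than on exact text.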
Understanding the use case and workflow for every search experience is key. For applications where the user is searching for an exact text match on data like a name, phone number, or reference to a specific condition, typically a keyword search is the best option.
However, for situations where a patient may not use the exact keyword, or a provider would like to ensure all relevant results are included to give a more complete context with synonymous or related symptoms, conditions, or medications, a semantic search is a better option.
Hybrid Search: The best of both
At this point, you may already realize that both keyword and vector searches have their strengths and limitations, depending on the type of search experiences and use case at hand. Keyword searches are fast and precise for well-defined queries but struggle with context and semantics. On the other hand, semantic searches excel at understanding context and relationships but can be more resource-intensive or require significant upfront work.
How do you get the benefits of both sides? Enter Hybrid Search.
Hybrid Search allows you to combine the best of keyword and vector search in a single API. This approach enhances the overall search experience, providing a powerful semantic search when relevance and context matter, and performant keyword search for queries where the exact keyword is known.
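The article does not specify how the two result sets are combined. One widely used option is reciprocal rank fusion (RRF), sketched below under that assumption: each document earns 1 / (k + rank) from every ranked list it appears in, and the sums decide the final order. The function name, the example rankings, and the choice of k = 60 (a conventional default) are illustrative, not a description of any specific API.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one ranked list.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1 for the top result.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of a keyword engine and a vector engine for one query.
keyword_ranking = ["note-1", "note-2"]
vector_ranking = ["note-3", "note-1", "note-5"]

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
print(fused)
```

Here "note-1" rises to the top because both engines returned it, while documents found by only one engine still appear further down. RRF works on ranks rather than raw scores, so the keyword and vector scores do not need to be on the same scale.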
Key Features of Hybrid Search:
Precise and contextually relevant results. Whether the query uses highly specialized medical terms or a patient's natural language description, hybrid search can return accurate and contextually relevant results across both structured and unstructured data. That covers lab results, provider listings, and patient demographics as well as health content, text-heavy notes, and referral letters. Users can pull results from a wider range of relevant, trusted sources, with a higher certainty of surfacing the right ones.
Speed to answer. Speed is crucial in healthcare. Providing a hybrid search allows your users to rapidly get to the right answer they need without rewriting their search query to find that needle and cover all bases.
Optimized for each use case. By offering a single API for both keyword and semantic search, organizations can ensure performant search experiences optimized for each use case or workflow, across their different applications and all from a single managed data store.
Getting Hybrid Search Right for Health
You now have a grasp of how hybrid search works.
But there’s much more to it. Implementing a hybrid search system effectively requires significant effort and consideration. Each industry has unique requirements and challenges that must be addressed to ensure the technology functions well.
This is particularly true when it comes to the health industry where so much is at stake. In our next article, we will take a closer look at hybrid search in the context of health. We'll explore the specific considerations necessary to make it work seamlessly in this critical field, and how Clinia is approaching hybrid search to deliver precise and reliable search capabilities.
Stay tuned.