Constructor Blog | Ecommerce Search Industry and Product Information

Why Keyword Relevance Testing Is the Wrong Way to Audit Your Ecommerce Search Algorithm

Written by Nate Roy | Oct 28, 2024 3:46:38 PM

Keyword relevance testing is a popular way for ecommerce teams to audit the quality of the search experience on their site. While well-intentioned, this approach is outdated in a modern era of search and product discovery where personalization and Generative AI (GenAI) are becoming increasingly prevalent. It’s not just about matching keywords anymore — it’s about understanding what products will resonate most with your customers.

If you’re still using keyword relevance testing to evaluate your search algorithm, it’s time to rethink your approach. Find out more below. 

What Is Keyword Relevance Testing?

Once a popular way to audit the on-site search experience, keyword relevance testing focuses on ensuring that products appear in the search results based on relevant keywords.

Essentially, this type of testing helps retailers verify that search results match customer intent and expectations, and that product listings use relevant keywords that align with how customers search.

How is keyword relevance testing traditionally performed?

Often, the process relies heavily on manual assessments, where you review your ecommerce site’s search data to identify common terms your customers are using. Then, you assess how well your search function is performing by following these key steps:

  • On-site search analysis. Perform searches using various terms related to your products, such as exact product names, generic product categories, specific attributes (color, size, brand, etc.), misspellings, typos, synonyms, and related terms.
  • Query variations. Test different types of queries, including long-tail keywords. For example, a grocery site might test "organic gluten-free pasta," while a furniture store could use "modern walnut dining table," and an apparel retailer might try "women's waterproof winter boots."  
  • Autocomplete and predictive search. Verify that autocomplete and predictive search suggestions are relevant and helpful, accurately reflecting your product catalog.
  • Search result accuracy. Evaluate whether the most relevant products appear at the top of search results. Confirm that the search results align with customer intent and check that faceted search and filtering options are working as expected.

While these steps cover a range of query types, they remain focused on keywords alone. This fails to account for the more dynamic aspects of user search behavior, such as personalization or evolving customer intents.

Why This Approach No Longer Works

At face value, keyword relevance testing appears useful. But as ecommerce evolves, so do customer expectations, making this traditional approach increasingly ineffective for various reasons:

Lack of personalization

Keyword relevance testing lacks personalization, as it doesn’t consider user-specific factors like search history, preferences, or behavior patterns that advanced solutions like Constructor take into account.

Additionally, keyword testing faces several limitations:

  • Difficulty with synonyms and related terms. It often misses the relevance of results that use synonyms rather than exact keyword matches.
  • Challenges with multilingual or localized searches. It struggles to account for language variations or regional and cultural differences in search behavior (e.g., “wellies” for rain boots or “jumper” for sweaters).
  • Inability to capture query reformulation. Users often refine their searches based on initial results, a behavior that keyword testing doesn't typically address. Constructor handles this with its search insights functionality.
  • Overemphasis on textual content. This approach often overlooks the relevance of non-textual content, like image or voice search.
  • Handling misspellings, typos, and their nuances. Though advanced algorithms can manage this, keyword testing may fail to evaluate nuanced differences, such as "pepper" vs. "peppers."
  • Inability to assess search quality over time. As search relevance changes with updates or seasonal trends, keyword testing only captures a snapshot at one moment, missing the dynamic nature of search relevance.

Overreliance on “The Eye Test”

The manual method of keyword relevance testing, often referred to as "The Eye Test," can lead to inconsistent results. It’s highly subjective, relying on individual interpretations of what’s relevant. What one team member considers relevant may differ from another’s perspective, leading to unreliable outcomes.
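One way to see this subjectivity in numbers is to have two team members score the same results and measure how much they agree beyond chance. The sketch below uses Cohen's kappa, a standard agreement statistic. The ratings are invented for illustration (1 = relevant, 0 = not relevant), and nothing here is a method prescribed by this article.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Agreement between two raters, corrected for chance agreement."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both raters must score the same non-empty set of results.")
    n = len(ratings_a)
    # Share of results where both raters gave the same label.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # Both raters used a single identical label throughout.
    return (observed - expected) / (1 - expected)

# Hypothetical "eye test" scores from two team members for the same 8 results.
rater_1 = [1, 1, 0, 1, 0, 0, 1, 0]
rater_2 = [1, 0, 0, 1, 1, 0, 1, 1]
kappa = cohens_kappa(rater_1, rater_2)
print(f"Cohen's kappa: {kappa:.2f}")
```

A kappa near 1 means the raters largely agree; a value near 0 means their agreement is about what chance alone would produce, which suggests the eye test is measuring individual opinion more than search quality.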

Beyond the inherent flaws of subjective human scoring, there are several additional issues:

  • Context. A single word can carry multiple meanings depending on the context. A small variation in wording — such as “pepper” vs. “peppers” — can completely change the search intent, which the eye test may fail to catch.
  • Long-tail queries. “The Eye Test” often struggles with long-tail queries, which are more detailed and specific. These nuanced searches require precise intent-matching, and subjective judgment alone often fails to align with what the customer is actually seeking.
  • Trends and seasons. Keyword testing doesn’t account for temporal, trending, or seasonal shifts in customer behavior, as it relies on static, one-time keyword evaluations. This means retailers may miss opportunities to capitalize on current trends or seasonal interest — something savvy retailers now address with GenAI-driven solutions.
  • Product catalog familiarity. Keyword testing assumes that first-time visitors understand your product catalog and metadata, which is rarely the case. New shoppers often use broad or unfamiliar terms, making it difficult for keyword-based testing to deliver relevant results, ultimately missing chances to engage with these potential customers.

In short, one person’s definition of relevance is no longer enough to meet modern customer expectations.

Overemphasis on basic keyword matching

Many ecommerce sites still rely too heavily on simple keyword-based search functionality. 

Keyword testing overlooks deeper patterns in customer behavior, such as evolving search habits or contextual needs. Even when teams test a range of queries and keywords, the tests typically focus on individual words or short phrases, which may not capture the nuances of natural language queries or complex search intents.

These more complex queries will become more common as more people adopt LLMs and AI tools such as ChatGPT, Gemini, Apple Intelligence, and more advanced Siri capabilities. As mainstream audiences become familiar with these technologies, they will expect more complex queries to work on ecommerce sites as well.

Ignoring customer search data

Despite the initial step of reviewing search terms, many ecommerce sites fail to fully utilize customer search data. 

They often overlook deeper insights such as search patterns, user intent, and trends over time. This means they miss valuable opportunities to understand and respond to customer needs beyond surface-level keyword matching.

Poor navigation and browsing experience

Clunky navigation can quickly turn potential customers away, especially when they are faced with confusing category structures, limited filtering options, and non-intuitive menus or layouts.

Additionally, when evaluating faceted search, keyword testing often falls short, as it fails to capture the true impact of filtered search results on the customer experience.

Neglecting product descriptions

Many ecommerce sites make the mistake of using thin or non-existent product descriptions, often relying on manufacturer descriptions verbatim and failing to optimize for relevant keywords. This not only limits the ecommerce site's SEO potential, but also renders keyword relevance testing ineffective. 

Without well-developed, unique descriptions, there’s a lack of keyword depth and variety needed to match diverse customer queries. This leads to poor search results and missed opportunities for discoverability.

What to Do Instead

Here’s how you can modernize your search auditing process:

Focus on attractiveness, not relevance

Search relevance alone isn’t sufficient for ecommerce success, as it often fails to deliver the most attractive results for customers. Rather than clinging to outdated methods, it’s time to shift your focus from relevance to attractiveness.

Attractiveness takes into account more than just keywords. AI-native search technology built around attractiveness leverages clickstream data and user interactions to drive search results. This means user-entered search terms and subsequent clicks and conversions are pivotal in dictating what should be returned for a given search query. 

This allows for a dynamic product discovery experience that adapts to user behavior and preferences over time. It also drives the key business metrics that matter most for large ecommerce companies, like revenue, conversion rates, and customer satisfaction.
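As a rough illustration of the idea, the sketch below orders products for a query by weighted engagement taken from clickstream events. The event data, the event names, and the weights are all assumptions made up for this example; a production system such as Constructor's would use far richer signals than this.

```python
from collections import defaultdict

# Hypothetical clickstream events: (query, product_id, event_type).
EVENTS = [
    ("pepper", "black-pepper-grinder", "click"),
    ("pepper", "black-pepper-grinder", "purchase"),
    ("pepper", "red-bell-pepper", "click"),
    ("pepper", "black-pepper-grinder", "click"),
    ("pepper", "pepper-spray", "click"),
]

# Assumed weights: stronger purchase intent counts for more than a click.
WEIGHTS = {"click": 1.0, "add_to_cart": 3.0, "purchase": 5.0}

def attractiveness_ranking(events, query):
    """Return product ids for `query`, ordered by weighted engagement."""
    scores = defaultdict(float)
    for event_query, product_id, event_type in events:
        if event_query == query:
            scores[product_id] += WEIGHTS.get(event_type, 0.0)
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda pid: (-scores[pid], pid))

ranking = attractiveness_ranking(EVENTS, "pepper")
print(ranking)
```

All three products match the keyword "pepper" equally well, so keyword relevance cannot separate them. The engagement data is what moves the pepper grinder to the top.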

Analyze internal search data

Dive deeper into your internal search data beyond just a list of common search terms. Identify trends such as which search queries consistently lead to conversions and which fail to engage users. Look for patterns in user behavior — like queries that frequently result in ‘no results’ or those that trigger high bounce rates. 

Merchandisers of this apparel brand are able to see top searches in the back-end, and then searchandise on top of those for optimal business results. 

By analyzing not just what customers search for but how they interact with the results, you can pinpoint gaps in your search experience and prioritize areas for improvement.
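A minimal sketch of this kind of analysis, assuming a search log with one row per search and hypothetical field names (`query`, `results`, `clicked`, `converted`), might look like this:

```python
from collections import defaultdict

# Hypothetical search log: one row per search performed on the site.
SEARCH_LOG = [
    {"query": "rain boots", "results": 42, "clicked": True, "converted": True},
    {"query": "rain boots", "results": 42, "clicked": True, "converted": False},
    {"query": "wellies", "results": 0, "clicked": False, "converted": False},
    {"query": "wellies", "results": 0, "clicked": False, "converted": False},
    {"query": "jumper", "results": 15, "clicked": False, "converted": False},
]

def summarize_queries(log):
    """Compute per-query zero-result rate, click-through rate, and conversion rate."""
    totals = defaultdict(lambda: {"searches": 0, "zero": 0, "clicks": 0, "conversions": 0})
    for row in log:
        t = totals[row["query"]]
        t["searches"] += 1
        t["zero"] += row["results"] == 0   # booleans add as 0 or 1
        t["clicks"] += row["clicked"]
        t["conversions"] += row["converted"]
    return {
        query: {
            "searches": t["searches"],
            "zero_result_rate": t["zero"] / t["searches"],
            "ctr": t["clicks"] / t["searches"],
            "conversion_rate": t["conversions"] / t["searches"],
        }
        for query, t in totals.items()
    }

summary = summarize_queries(SEARCH_LOG)
# Flag queries that return nothing or that never earn a click.
problem_queries = sorted(
    q for q, s in summary.items() if s["zero_result_rate"] > 0 or s["ctr"] == 0
)
print(problem_queries)
```

In this made-up log, "wellies" returns no results and "jumper" returns results that nobody clicks, so both are flagged for follow-up while "rain boots" is not.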

Evaluate product listings

Audit product titles, descriptions, and metadata to ensure they contain relevant keywords. Also, check for consistency in terminology across similar products. 

Have a large, complex catalog? Leverage GenAI-powered solutions, which automatically enrich product information at scale.

Attribute Enrichment is a GenAI-powered tool that automatically enriches product attributes and categories, eliminating the headaches caused by incomplete, inaccurate, or non-existent product data and giving shoppers the best possible experience. 

Refine site search functionality & experiment with search algorithms

Experiment with different search ranking algorithms to find out which provides the best results for your customers. Regularly update your site's search algorithm based on user behavior and new product additions, and don’t forget to implement features like autocomplete, synonym matching, and spell-check to improve search accuracy.

To measure success, look at click-through rate (CTR), conversion rate, and other engagement metrics.
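When two ranking algorithms are compared in an A/B test, it helps to check whether a difference in conversion rate is larger than random noise would explain. The sketch below applies a standard two-proportion z-test to invented traffic numbers. It is one common way to read such an experiment, not a description of any particular vendor's tooling.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates B - A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the assumption that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical experiment: 10,000 searches served by each ranking algorithm.
z, p_value = two_proportion_z_test(conv_a=200, n_a=10_000, conv_b=260, n_b=10_000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the lift from the second algorithm is unlikely to be chance. The same function works for CTR if clicks are passed in place of conversions.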

Test error tolerance

Try searches with common misspellings and typos to see if the algorithm can still return relevant results. 

On Sephora’s website, the mixed language query of “dry champu” (champu being Spanish for shampoo) produces attractive results for dry shampoo products.  

You can also check how the search handles plural vs. singular terms.

Pick n Pay returns mainly spice options for the search query “pepper” and vegetables for the search query “peppers” for a new shopper with no previous search history.
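These checks can be partly automated by comparing the results for a correctly spelled query against the results for its variants. The sketch below measures overlap with the Jaccard index. The `search` function and its canned results are stand-ins for a call to your own search API, and the 0.5 threshold is an arbitrary assumption.

```python
def jaccard(a, b):
    """Overlap between two result lists: shared items divided by all distinct items."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)

# Canned results standing in for a real search backend (hypothetical product ids).
FAKE_INDEX = {
    "dry shampoo": ["p1", "p2", "p3", "p4"],
    "dry shampo": ["p1", "p2", "p3", "p5"],
    "dri shampoo": [],
}

def search(query, k=4):
    """Placeholder for a real search call; returns the top-k product ids."""
    return FAKE_INDEX.get(query, [])[:k]

def error_tolerance_report(reference, variants, threshold=0.5):
    """Compare each variant's results with the reference query's results."""
    expected = search(reference)
    report = {}
    for variant in variants:
        overlap = jaccard(expected, search(variant))
        report[variant] = {"overlap": overlap, "passes": overlap >= threshold}
    return report

report = error_tolerance_report("dry shampoo", ["dry shampo", "dri shampoo"])
print(report)
```

High overlap is the goal for misspellings, but not always for singular and plural forms. As the "pepper" and "peppers" example shows, low overlap can be exactly what shoppers want, so a score like this should be read alongside engagement data and not treated as a pass or fail on its own.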

Assess natural language processing

Test searches using natural language queries (e.g., "blue pajamas for women"). Verify that the algorithm understands context and intent.

Target AU’s search solution shows contextually relevant results for natural language queries, like “blue pajamas for women.”

Evaluate personalization

If your algorithm includes personalization features, test how search results vary for different user profiles or browsing histories.

See how Sephora's search engine personalizes the shopper's online experience based on subtle brand affinity cues.  
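To make this test repeatable, you can run the same query under several test profiles and measure how much the rankings differ. The sketch below is one simple way to do that. The profile names and result lists are invented, and in practice the results would come from your search API called with each profile's session.

```python
from itertools import combinations

# Hypothetical top results for the same query under different shopper profiles.
RESULTS_BY_PROFILE = {
    "anonymous": ["a", "b", "c", "d"],
    "brand_x_fan": ["c", "a", "b", "d"],
    "brand_y_fan": ["a", "b", "c", "d"],
}

def positional_difference(list_a, list_b):
    """Fraction of ranking positions where the two lists show different items."""
    k = min(len(list_a), len(list_b))
    if k == 0:
        return 0.0
    differing = sum(1 for x, y in zip(list_a[:k], list_b[:k]) if x != y)
    return differing / k

def personalization_matrix(results_by_profile):
    """Positional difference for every pair of profiles."""
    return {
        (p, q): positional_difference(results_by_profile[p], results_by_profile[q])
        for p, q in combinations(sorted(results_by_profile), 2)
    }

matrix = personalization_matrix(RESULTS_BY_PROFILE)
for pair, difference in matrix.items():
    print(pair, difference)
```

In this made-up data, the "brand_y_fan" profile sees exactly what an anonymous shopper sees, which would be worth investigating if that profile has a clear brand affinity.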

Use customer feedback

Engage with customers through surveys, interviews, or social media to understand the language they use when describing or searching for products.

Optimize product listings

Incorporate relevant keywords naturally into product titles, descriptions, and metadata. Use long-tail keywords and related terms to improve specificity and match varied search queries.

Time to Take Action

If you’re feeling concerned that your search hasn't been as effective as it could be, you’re not alone. But the good news is that there’s a path forward. By focusing on attractiveness over relevance and embracing modern search capabilities, you can meet and exceed customer expectations — and ultimately drive more revenue.

Ready to take the next step? Download our free audit checklist (no email required) or request a complimentary audit from an ecommerce expert today.

See how your site measures up and receive actionable advice on how to improve it for optimal KPIs this holiday season.