Adapting Ranking Models to Short-Term Sales Behavior

Written by Nate Roy | Apr 28, 2025 10:20:14 AM

This post was written in collaboration with Valery Bezrukova, VP Product at Constructor.

When sales season hits, it’s a win for shoppers — but it can throw a wrench into even the best-performing product ranking algorithms. In this experiment, we set out to improve our models’ ability to adapt to the unique, fast-shifting dynamics of sale periods across several customers and verticals.

The Problem: Short-Term Sales Behavior Is More Challenging to Predict

A few of our customers flagged a trend: when they ran short-term sales, the quality of their product results would sometimes drop. That was surprising. Our ML models are typically quite strong at learning from historical behavioral patterns. So, why the change?

The answer lies in how differently shoppers behave during sales:

  • Pricing changes rapidly and dramatically. Algorithms trained on regular pricing have a harder time adjusting to short-term, steep discounts, especially when they are unprecedented (in other words, there are no historical examples to model them after)

  • Query distribution changes. Sales-specific terms and behaviors spike (think: things like “clearance,” “last chance,” or “under $20”)

  • Item popularity shifts quickly. Products that previously received little attention can become highly attractive overnight, then go back to normal once the sale is over

Even a human would struggle to predict behavior based on out-of-distribution pricing. For instance, say car Model A ($30k) historically outsells Model B ($50k) two to one. If the prices of both cars suddenly drop by 50%, would you expect Model A to keep outselling Model B two to one? That's uncharted territory and would be difficult for anyone to guess.

The same is true for an ML model. It can interpolate from past data, but it has a much harder time extrapolating when product prices suddenly drop below anything it has seen historically.

The Hypothesis: Can We Use Weighted CTR to Adapt Faster? 

We hypothesized that we could mitigate this problem by using Clickthrough Rate (CTR) instead of raw action counters (e.g., clicks, purchases) as a primary signal. Why?

  • CTR is a normalized metric. It adjusts for the number of impressions, which helps surface items that are performing well relative to how often they’re seen

  • CTR is more stable for new or rarely-seen items. A previously unpopular product might suddenly be shown more during a sale. It might still have fewer total clicks than a perennial bestseller, but a higher CTR

To improve signal quality, we experimented with a weighted CTR:

Weighted CTR = Weighted actions / Unweighted impressions

This approach blends recency sensitivity (giving more weight to recent user actions) with volume smoothing (by using stable impression counts), enabling the model to adapt to trends without overfitting to momentary noise.
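To make the idea concrete, here is a minimal sketch of how such a signal could be computed. The names (DailyStats, weighted_ctr), the exponential decay factor, and the daily granularity are our own illustrative assumptions, not Constructor's production implementation.

```python
# Illustrative sketch of a weighted CTR signal (hypothetical names and
# parameters, not the production implementation). Assumes per-day action
# and impression counts for an item and an exponential recency decay.

from dataclasses import dataclass

@dataclass
class DailyStats:
    actions: int       # e.g., clicks or purchases on a given day
    impressions: int   # times the item was shown that day

def weighted_ctr(history: list[DailyStats], decay: float = 0.8) -> float:
    """Weighted CTR = recency-weighted actions / unweighted impressions.

    `history` is ordered from most recent day to oldest; `decay` < 1 gives
    more weight to recent actions, while impressions stay unweighted to
    smooth out volume swings.
    """
    weighted_actions = sum(
        day.actions * (decay ** age) for age, day in enumerate(history)
    )
    total_impressions = sum(day.impressions for day in history)
    if total_impressions == 0:
        return 0.0
    return weighted_actions / total_impressions

# Example: an item whose clicks spiked during a sale in the last two days
recent_first = [
    DailyStats(actions=40, impressions=500),   # today
    DailyStats(actions=35, impressions=450),   # yesterday
    DailyStats(actions=2, impressions=400),    # before the sale
    DailyStats(actions=1, impressions=380),
]
print(f"weighted CTR: {weighted_ctr(recent_first):.3f}")
```

In this toy example, the pre-sale days contribute little to the numerator, so the item's weighted CTR reflects its new, sale-driven popularity almost immediately, while the unweighted impression total keeps a single high-traffic day from dominating the estimate.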

The Test: Running Weighted CTR Across Verticals and Experiences

We ran the experiment across several interested customers and categories. Tests spanned both Search and Browse, over a minimum three-day period (sometimes longer depending on the customer’s specific circumstances). Each test introduced the weighted CTR-based signal into the ranking logic and compared it to the original counter-based model.

Here’s a sample of the results:

Browse Tests

  • Jewelry:
    +1.31% Add-to-Cart (ATC)
    +1.23% Checkout Revenue
    +2.49% Purchased Items

  • Furniture & Home:
    +1.53% ATC
    +2.93% Purchased Items

  • Toys:
    +7.93% Checkout Revenue
    +0.69% Purchased Items

Search Tests

  • Apparel:
    +2.05% ATC

  • General Merchandise:
    +2.1% ATC
    +0.05% Checkout Revenue
    +0.56% Purchased Items

  • Pets:
    +0.01% Checkout Revenue
    +0.89% Purchased Items

Note: Each test above is a standalone example and should not be seen as representative of the entire vertical. Customer behavior can vary widely within the same industry depending on the specific audience, test duration, and business strategies.

The Outcome: Early Signals on Faster Adaptation

While the results varied across verticals, a couple of clear themes emerged:

  • Browse experiences generally benefited more from weighted CTR. This makes sense. Browse is often where shoppers encounter new items, and CTR helps highlight what's trending in near-real time.

  • Weighted CTR helped the model respond faster to shifts in behavior, especially where previously unpopular or newer items began trending.

For retailers considering introducing this type of variable, it is worth noting that acting quickly on trends might not always be what matters most to shoppers (if they value your brand for its bestsellers, for example).

While promoting new trends might improve short-term sales metrics, for some types of businesses, it may ultimately hurt overall performance by making it harder for shoppers to find the products they truly want.

Retailers should consider what their brand is known for and whether this type of strategy makes sense in the context of their business before testing it.

What's Next: Continuous Optimization 

This test confirmed that faster model adaptation to user behavior shifts is both possible and valuable in certain instances. Weighted CTR isn’t a silver bullet, but it’s a powerful signal that helps us enhance real-time trend responsiveness.

And it's just one step. We’re continuing to explore:

  • Further development of features that improve trend detection (e.g., slope features, new iterations with real-time signals); a simple slope sketch follows this list

  • Smarter training window selection to match periods of user behavior, and enhancements to the training dataset to better account for period differences

  • Manual override rules as fallback mechanisms for extreme scenarios
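As one illustration of the slope features mentioned in the first bullet, here is a minimal sketch assuming a simple least-squares slope over an item's recent daily CTR. The function name and the daily-CTR input are hypothetical; the features actually under development may look quite different.

```python
# Purely illustrative trend feature: the least-squares slope of an item's
# daily CTR, so a ranker can see whether interest is rising or falling.
# Hypothetical sketch, not the production feature set.

def ctr_slope(daily_ctr: list[float]) -> float:
    """Least-squares slope of CTR over the last N days (oldest first)."""
    n = len(daily_ctr)
    if n < 2:
        return 0.0
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(daily_ctr) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, daily_ctr))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# A positive slope flags an item that is trending upward, e.g., during a sale.
print(ctr_slope([0.010, 0.012, 0.030, 0.060]))  # positive => trending up
```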

All of this aligns with our broader goal: to create adaptive ranking systems that don’t just react to change, but anticipate it.