AI Personalization Engine for D2C E-Commerce — Case Study

The Challenge

A fast-growing D2C e-commerce platform with over 15,000 SKUs and 4 million monthly active users was struggling with a fundamental problem: every customer saw the same product recommendations regardless of their preferences, browsing behavior, or purchase history. Their existing rules-based recommendation engine relied on simple heuristics — "customers who bought X also bought Y" — and static merchandising rules that were updated manually by a team of three.

The results were predictable. Conversion rates had plateaued at 2.1%, well below the industry benchmark of 3.5% for their category. Cart abandonment hovered at 74%. Customers were landing on the site, scrolling through pages of irrelevant products, and leaving without purchasing. The marketing team was spending aggressively on acquisition, but the funnel was leaking badly at the consideration and conversion stages.

The platform had rich behavioral data — clickstreams, search queries, wish lists, reviews — but none of it was being used to personalize the shopping experience. They needed a recommendation system that could learn individual preferences in real time and serve relevant products across every touchpoint: homepage, category pages, product detail pages, cart, and email.

Our Approach

Weeks 1-2: Data Discovery & Feature Engineering

We began with a deep dive into 14 months of behavioral data spanning 48 million sessions. The goal was to understand what signals actually predicted purchase intent versus casual browsing. We identified 63 behavioral features across four categories: explicit signals (purchases, ratings, wish list additions), implicit signals (dwell time, scroll depth, product image zoom), contextual signals (time of day, device type, geographic region), and sequential patterns (browsing paths that led to conversion versus abandonment).
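To make the four feature categories concrete, here is a minimal sketch of how one session's raw events might be summarized into features. The event schema (the keys type, dwell_seconds, scroll_depth, device, and timestamp) and the function name session_features are assumptions made for illustration, not the schema used in the project.

```python
from collections import Counter
from datetime import datetime


def session_features(events):
    """Summarize one session (a non-empty, time-ordered list of event dicts).

    Every key read from the events is a hypothetical field name.
    """
    counts = Counter(e["type"] for e in events)
    views = [e for e in events if e["type"] == "view"]
    first_seen = datetime.fromisoformat(events[0]["timestamp"])
    return {
        # Explicit signals
        "purchases": counts["purchase"],
        "wishlist_adds": counts["wishlist_add"],
        # Implicit signals
        "avg_dwell_seconds": (
            sum(e["dwell_seconds"] for e in views) / len(views) if views else 0.0
        ),
        "max_scroll_depth": max((e["scroll_depth"] for e in views), default=0.0),
        "image_zooms": counts["image_zoom"],
        # Contextual signals
        "hour_of_day": first_seen.hour,
        "device": events[0]["device"],
        # Sequential pattern: the ordered path of event types
        "path": tuple(e["type"] for e in events),
    }
```

Keeping the ordered path alongside the aggregate counts is what lets a downstream model compare browsing paths that ended in conversion with those that ended in abandonment.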

We also audited the product catalog metadata and found significant gaps — inconsistent categorization, missing attributes, and thin descriptions for 30% of SKUs. We built an automated enrichment pipeline using NLP to extract and standardize product attributes from titles, descriptions, and customer reviews, creating a dense feature space for content-based filtering.
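As a rough illustration of attribute enrichment, the sketch below pulls standardized attributes out of free text with a small vocabulary and whole-word matching. This is a deliberately simple stand-in for the NLP pipeline described above, not a reproduction of it; the VOCAB contents and the function name extract_attributes are hypothetical.

```python
import re

# Hypothetical attribute vocabulary; a production pipeline would curate or learn this.
VOCAB = {
    "color": ["black", "navy", "white", "red"],
    "material": ["cotton", "leather", "linen", "wool"],
}


def extract_attributes(*texts):
    """Return {attribute: sorted values} found in any of the given texts."""
    joined = " ".join(texts).lower()
    found = {}
    for attribute, values in VOCAB.items():
        # Whole-word match so "red" does not fire inside "tailored"
        hits = [v for v in values if re.search(rf"\b{re.escape(v)}\b", joined)]
        if hits:
            found[attribute] = sorted(hits)
    return found
```

Passing the title, description, and review text together means an attribute missing from the catalog entry can still be recovered when customers mention it.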

Weeks 3-4: Model Architecture & Training

We designed a hybrid recommendation architecture combining collaborative filtering with content-based methods. The core model was a two-tower neural network — one tower encoding user behavior sequences using a transformer encoder, the other encoding product features through a multi-layer embedding network. The dot product of the two tower outputs produced relevance scores.

The user tower processed each user's last 50 interactions as a sequence, capturing temporal patterns and preference evolution
The product tower combined categorical embeddings (brand, category, price range) with learned visual embeddings from product images
A contextual layer incorporated real-time signals (current session behavior, time of day, device) to adjust recommendations dynamically
Candidate generation used approximate nearest neighbor search (FAISS) to retrieve 500 candidates in under 5 ms, followed by a ranking model that scored and reordered the top 50
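The retrieve-then-rank flow can be sketched in a few lines. This example assumes the two towers have already produced embedding vectors, and it uses an exact brute-force scan in place of FAISS, so it shows the shape of the pipeline rather than its performance. The function names and the rank_score callback are illustrative assumptions.

```python
import heapq


def dot(u, v):
    """Dot product of two equal-length vectors (the two-tower relevance score)."""
    return sum(a * b for a, b in zip(u, v))


def retrieve(user_vec, item_vecs, k):
    """Stage 1: top-k item ids by dot product.

    An exact scan stands in here for approximate nearest neighbor search.
    """
    scored = ((dot(user_vec, vec), item_id) for item_id, vec in item_vecs.items())
    return [item_id for _, item_id in heapq.nlargest(k, scored)]


def rerank(candidates, rank_score, n):
    """Stage 2: reorder candidates with a richer scoring function, keep the top n."""
    return sorted(candidates, key=rank_score, reverse=True)[:n]


item_vecs = {"a": [0.9, 0.1], "b": [0.8, 0.3], "c": [0.1, 0.9], "d": [0.5, 0.5]}
user_vec = [1.0, 0.0]
context_boost = {"b": 0.2}  # hypothetical real-time session signal

candidates = retrieve(user_vec, item_vecs, k=3)
top = rerank(
    candidates,
    lambda i: dot(user_vec, item_vecs[i]) + context_boost.get(i, 0.0),
    n=2,
)
print(candidates, top)  # ['a', 'b', 'd'] ['b', 'a']
```

Splitting the work this way keeps the expensive ranking model off the full catalog: the cheap similarity search narrows 15,000 SKUs to a few hundred, and only those are scored with session context.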

Weeks 5-6: A/B Testing & Optimization

We deployed the new system alongside the existing rules engine in a rigorous A/B test framework. Traffic was split 50/50 with stratification across user segments (new vs. returning, mobile vs. desktop, high-value vs. casual). We tracked conversion rate, revenue per session, click-through rate on recommendations, and downstream metrics like return rate and customer lifetime value.
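One way to get a 50/50 split that stays balanced inside every segment is to randomize within each stratum instead of across the whole population. The sketch below is an assumption about how such an assignment could work, not the framework used in the test; the function name and the stratum tuples are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(users, seed=0):
    """Assign users to arms, balanced within each stratum.

    users: iterable of (user_id, stratum), where stratum is any sortable,
    hashable label such as ("returning", "mobile", "high_value").
    Returns {user_id: "treatment" | "control"}.
    """
    by_stratum = defaultdict(list)
    for user_id, stratum in users:
        by_stratum[stratum].append(user_id)

    rng = random.Random(seed)  # seeded so the assignment is reproducible
    assignment = {}
    for stratum in sorted(by_stratum):
        ids = sorted(by_stratum[stratum])
        rng.shuffle(ids)
        # Alternate arms after shuffling: arm sizes differ by at most one
        for i, user_id in enumerate(ids):
            assignment[user_id] = "treatment" if i % 2 == 0 else "control"
    return assignment
```

Balancing within strata keeps a segment such as high-value returning customers from landing disproportionately in one arm and distorting the comparison.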

During the first week of testing, the AI system outperformed the control on every metric. We used the second week to run multi-armed bandit experiments on recommendation placement, carousel length, and explanation copy ("Recommended for you" vs. "Based on your recent browsing" vs. "Customers with your taste also loved"). The personalized explanation copy drove an additional 8% lift in click-through rate.
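The case study does not say which bandit algorithm was used. As one common choice, the sketch below uses Thompson sampling with a Beta-Bernoulli model to pick among the three explanation copy variants; the class name and its methods are illustrative assumptions.

```python
import random


class ThompsonBandit:
    """Beta-Bernoulli Thompson sampling over a fixed set of arms."""

    def __init__(self, arms, seed=None):
        self.rng = random.Random(seed)
        self.clicks = {arm: 0 for arm in arms}
        self.misses = {arm: 0 for arm in arms}

    def choose(self):
        # Draw a plausible click-through rate per arm from its posterior,
        # Beta(1 + clicks, 1 + misses), and play the arm with the highest draw
        draws = {
            arm: self.rng.betavariate(1 + self.clicks[arm], 1 + self.misses[arm])
            for arm in self.clicks
        }
        return max(draws, key=draws.get)

    def update(self, arm, clicked):
        if clicked:
            self.clicks[arm] += 1
        else:
            self.misses[arm] += 1


bandit = ThompsonBandit(
    [
        "Recommended for you",
        "Based on your recent browsing",
        "Customers with your taste also loved",
    ],
    seed=42,
)
variant = bandit.choose()            # copy to show on this impression
bandit.update(variant, clicked=True)  # record whether the user clicked
```

Compared with a fixed split, a bandit shifts traffic toward the better-performing variant while the experiment is still running, which suits low-risk choices like carousel copy.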

Week 7: Full Rollout & Handoff

We ramped to 100% traffic with automated monitoring for recommendation quality metrics, latency, and coverage (ensuring long-tail products still received exposure). We built a real-time dashboard for the merchandising team showing recommendation performance by category, and trained the team on the feedback loop — how to use business rules as guardrails without overriding the model.
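Coverage can be monitored with a simple ratio: the share of catalog SKUs that appeared in at least one recommendation list during a time window. The sketch below assumes that definition, which the case study does not spell out; the function name is hypothetical.

```python
def catalog_coverage(recommendation_lists, catalog):
    """Fraction of catalog SKUs shown in at least one recommendation list."""
    catalog = set(catalog)
    if not catalog:
        return 0.0
    shown = set()
    for recs in recommendation_lists:
        shown.update(recs)
    # Intersect with the catalog so discontinued SKUs do not inflate the ratio
    return len(shown & catalog) / len(catalog)


coverage = catalog_coverage([["a", "b"], ["b", "c"]], catalog=["a", "b", "c", "d"])
print(coverage)  # 0.75
```

A falling value over successive windows would suggest the model is concentrating on popular items and starving the long tail.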

Tech Stack

Python, PyTorch, FastAPI, Redis, AWS, PostgreSQL, Snowflake

Results

Within 90 days of full production deployment, the personalization engine delivered the following results:

34% conversion lift — conversion rate increased from 2.1% to 2.8%, with returning customers seeing even stronger gains at 3.4%
$18M in additional annual revenue — driven by higher conversion and increased average order value across all product categories
2.1x customer engagement: average pages per session roughly doubled, and the recommendation click-through rate rose to 12.3% from the previous 4.1%
40% increase in average order value — the cross-sell and upsell recommendations surfaced complementary products that customers genuinely wanted
Catalog coverage improved to 89% — long-tail products that previously received zero visibility now appeared in relevant recommendation slots
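For readers who want to check the percentages, relative lift is the change divided by the baseline. The short sketch below applies that formula to the rounded figures quoted above. The rounded conversion rates give about 33%, so the reported 34% was presumably computed from unrounded rates; that is an assumption, not something the source states.

```python
def relative_lift(baseline, new):
    """Relative change from baseline to new, as a fraction (0.25 means +25%)."""
    return (new - baseline) / baseline


# Conversion rate, 2.1% to 2.8%, using the rounded figures
print(f"{relative_lift(2.1, 2.8):.1%}")  # 33.3%

# Recommendation click-through rate, 4.1% to 12.3%, expressed as a multiple
print(f"{12.3 / 4.1:.1f}x")  # 3.0x
```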

We tried three different recommendation vendors before Arkyon. The difference is they actually understood our data and our customers. The system they built doesn't just recommend popular products — it understands individual taste and adapts in real time. It's our single biggest revenue driver this year.

M.K. — Head of Product, D2C E-Commerce Platform

What Made This Work

Hybrid architecture — combining collaborative filtering with content-based methods solved the cold-start problem for new products and new users simultaneously
Real-time contextual adaptation — recommendations shifted based on current session behavior, not just historical preferences, making the experience feel genuinely responsive
Rigorous A/B testing — we didn't just measure clicks; we tracked downstream metrics like return rates and 90-day LTV to ensure recommendations drove genuine value, not just impulse purchases
Merchandising team empowerment — the system augmented rather than replaced human judgment, giving the team controls to set business rules while letting the model optimize within those constraints
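The guardrail idea can be sketched as a filter applied to the model's ranked output: business rules remove or cap items, but the model's ordering of whatever remains is left untouched. The item fields (sku, brand, in_stock), the rule functions, and the per-brand cap are hypothetical examples rather than the team's actual rules.

```python
def apply_guardrails(ranked_items, rules, max_per_brand, limit):
    """Filter a model-ranked list with business rules, preserving model order.

    ranked_items: list of dicts, best first (field names are hypothetical).
    rules: list of predicates; an item must pass all of them.
    max_per_brand: cap on items from any one brand in the final list.
    """
    result, per_brand = [], {}
    for item in ranked_items:
        if not all(rule(item) for rule in rules):
            continue  # blocked by a business rule
        brand = item["brand"]
        if per_brand.get(brand, 0) >= max_per_brand:
            continue  # brand cap reached
        per_brand[brand] = per_brand.get(brand, 0) + 1
        result.append(item)
        if len(result) == limit:
            break
    return result


ranked = [
    {"sku": "s1", "brand": "x", "in_stock": True},
    {"sku": "s2", "brand": "x", "in_stock": True},
    {"sku": "s3", "brand": "y", "in_stock": False},
    {"sku": "s4", "brand": "y", "in_stock": True},
]
shown = apply_guardrails(ranked, [lambda i: i["in_stock"]], max_per_brand=1, limit=3)
print([i["sku"] for i in shown])  # ['s1', 's4']
```

Because the rules only exclude items and never reorder them, the merchandising team can tighten or loosen constraints without retraining the model.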

Facing a Similar Challenge?
Let's Talk.
