Search and Recommendation System

Multimodal search and recommendation platform with full CI/CD pipeline, monitoring, and A/B experiments

One-liner: I increased search conversion by 54% and cut request cost by a third by building a multimodal search and recommendation platform for 10M+ SKUs in 90 days.

What the system does in simple terms

Problem: searching 10M+ products in e-commerce is slow (620 ms), and multimodal search and recommendations are expensive ($0.28 per QPS). Clients lost up to $0.7M monthly due to slow search, and the GPT-4 token bill reached $45k per month.

Solution: the system understands text and images through a CLIP encoder, finds similar products via FAISS HNSW in 175 ms, and ranks them accurately with a cross-encoder. An LLM cascade (Claude 3 → Mistral-7B) saves on expensive models, and TensorRT INT8 speeds up inference 3.6x.
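To make the cascade idea concrete, here is a minimal sketch. The source does not say how requests are routed between the two models, so the routing rule is an assumption: tiers are tried in a fixed order and the first answer whose confidence clears a threshold is returned. The names `Tier`, `run_cascade`, `small_model`, and `large_model`, the confidence values, and the cost units are all hypothetical stand-ins, not the real model clients or real pricing.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A model call returns (answer, confidence in [0, 1]). Hypothetical interface.
ModelFn = Callable[[str], Tuple[str, float]]


@dataclass
class Tier:
    name: str
    cost_per_call: float  # illustrative units, not real pricing
    call: ModelFn


def run_cascade(query: str, tiers: List[Tier], min_confidence: float = 0.8):
    """Try tiers in order and stop at the first sufficiently confident answer.

    Returns (answer, tier_name, total_cost). The last tier's answer is
    always accepted, whatever its confidence.
    """
    total_cost = 0.0
    for i, tier in enumerate(tiers):
        answer, confidence = tier.call(query)
        total_cost += tier.cost_per_call
        is_last = i == len(tiers) - 1
        if confidence >= min_confidence or is_last:
            return answer, tier.name, total_cost
    raise ValueError("tiers must not be empty")


# Hypothetical stubs standing in for real model clients.
def small_model(query: str) -> Tuple[str, float]:
    # Assumption for the demo: the small model is confident only on short queries.
    confidence = 0.9 if len(query.split()) <= 3 else 0.4
    return f"small:{query}", confidence


def large_model(query: str) -> Tuple[str, float]:
    return f"large:{query}", 0.95


TIERS = [Tier("small", 1.0, small_model), Tier("large", 10.0, large_model)]

if __name__ == "__main__":
    print(run_cascade("red shoes", TIERS))
    print(run_cascade("waterproof hiking boots for wide feet", TIERS))
```

In this sketch a short query is answered by the first tier alone, while a longer one pays for both calls. The saving therefore depends on how often the first tier is confident enough.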

Savings: latency dropped from 620 ms to 175 ms (−72%), and cost decreased by 33% (from $0.28 to $0.19 per QPS). CTR increased by 54%, and empty results fell from 28% to 4%. GMV increased by $8.4M, and support tickets dropped by 52%.
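The percentages can be checked from the before and after values with a simple relative-change calculation. The helper name `relative_change_pct` is illustrative. With the rounded inputs quoted here, the cost change comes out at about 32%, which is consistent with "a third"; the quoted 33% presumably comes from unrounded figures.

```python
def relative_change_pct(before: float, after: float) -> float:
    """Relative change from before to after, in percent (negative means a drop)."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before * 100.0


# Before/after values as quoted in the text.
METRICS = {
    "p95 latency (ms)": (620, 175),
    "cost ($ per QPS)": (0.28, 0.19),
    "zero-result rate (%)": (28, 4),
}

if __name__ == "__main__":
    for name, (before, after) in METRICS.items():
        print(f"{name}: {relative_change_pct(before, after):+.1f}%")
    # p95 latency (ms): -71.8%
    # cost ($ per QPS): -32.1%
    # zero-result rate (%): -85.7%
```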

ML part: the platform uses CLIP ViT-L/14 fine-tuned on 42M pairs for multimodal search, FAISS HNSW with category sharding, TensorRT optimization, and a cross-encoder for reranking. The system automatically retrains on drift detected via Evidently and supports 300 QPS with a 99.97% SLA.
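The retrieval flow described above (category-sharded vector search followed by a rerank) can be sketched without external dependencies. This is not the production code: exact cosine search stands in for a FAISS HNSW index, a lookup table stands in for the cross-encoder, and the two-dimensional vectors stand in for CLIP embeddings. `ShardedIndex`, `search_and_rerank`, and all the data are hypothetical.

```python
import math
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ShardedIndex:
    """One shard per category. Exact search stands in for an ANN index."""

    def __init__(self) -> None:
        self.shards: Dict[str, List[Tuple[str, Vector]]] = defaultdict(list)

    def add(self, category: str, item_id: str, embedding: Vector) -> None:
        self.shards[category].append((item_id, embedding))

    def search(self, category: str, query: Vector, k: int) -> List[Tuple[str, float]]:
        # Only the shard for the requested category is scanned.
        shard = self.shards.get(category, [])
        scored = [(item_id, cosine(query, emb)) for item_id, emb in shard]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def search_and_rerank(
    index: ShardedIndex,
    category: str,
    query: Vector,
    rerank_score: Callable[[str], float],
    k_candidates: int = 2,
    k_final: int = 2,
) -> List[str]:
    """Stage 1: vector retrieval in one shard. Stage 2: reorder by rerank_score."""
    candidates = index.search(category, query, k_candidates)
    reranked = sorted(candidates, key=lambda pair: rerank_score(pair[0]), reverse=True)
    return [item_id for item_id, _ in reranked[:k_final]]


# Toy data: hypothetical embeddings and rerank scores.
index = ShardedIndex()
index.add("shoes", "sneaker", [1.0, 0.0])
index.add("shoes", "boot", [0.8, 0.6])
index.add("shoes", "sandal", [0.0, 1.0])
index.add("bags", "tote", [1.0, 0.0])
RERANK_SCORES = {"sneaker": 0.2, "boot": 0.9, "sandal": 0.5, "tote": 0.7}

if __name__ == "__main__":
    print(index.search("shoes", [1.0, 0.1], k=2))
    print(search_and_rerank(index, "shoes", [1.0, 0.1], RERANK_SCORES.__getitem__))
```

Sharding by category keeps each index small and lets a query skip unrelated products entirely. The cost is that the category has to be known or predicted before retrieval.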


TL;DR

Before                 After             What changed
p95 latency: 620 ms    175 ms            −72% latency
Cost: $0.28/QPS        $0.19/QPS         −33% cost
CTR: baseline          +54%              +54% CTR
Zero results: 28%      4%                −86% zero results
Manual deployment      CI/CD + canary    Zero-downtime deployment

Contact

I am ready to discuss ML projects and implementations, and I respond personally.