Search and Recommendation System
Multimodal search and recommendation platform with full CI/CD pipeline, monitoring, and A/B experiments
One-liner: I increased search conversion by 54% and cut request cost by a third by building a multimodal search and recommendation platform for 10M+ SKUs in 90 days.
What the system does in simple terms
Problem: searching 10M+ products in e-commerce is slow (620 ms), and multimodal search and recommendations are expensive ($0.28 per QPS). Slow search cost an estimated $0.7M per month, and the GPT-4 token bill reached $45k per month.
Solution: the system understands text and images through a CLIP encoder, finds similar products via FAISS HNSW in 175 ms, and ranks them accurately with a cross-encoder. An LLM cascade (Claude 3 → Mistral-7B) saves on expensive models, and TensorRT INT8 speeds up inference 3.6x.
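The text names the LLM cascade but not its routing rule. The sketch below shows one common arrangement, assumed here rather than taken from the project: try the cheaper tier first and escalate only when its confidence is low. The `Tier` type, the confidence signal, the stub models, and the relative costs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A model call returns (answer, confidence in [0, 1]). Both the signature and
# the confidence signal are assumptions made for this sketch.
ModelFn = Callable[[str], Tuple[str, float]]


@dataclass
class Tier:
    name: str
    call: ModelFn
    cost_per_call: float  # hypothetical relative cost, not a real price


def run_cascade(query: str, tiers: List[Tier], min_confidence: float = 0.8):
    """Try tiers in order; return the first sufficiently confident answer.

    Returns (answer, name of the tier that answered, total cost spent).
    The last tier always answers, whatever its confidence.
    """
    if not tiers:
        raise ValueError("at least one tier is required")
    spent = 0.0
    for i, tier in enumerate(tiers):
        answer, confidence = tier.call(query)
        spent += tier.cost_per_call
        is_last = i == len(tiers) - 1
        if confidence >= min_confidence or is_last:
            return answer, tier.name, spent


# Stub models standing in for real LLM clients.
def cheap_model(query: str) -> Tuple[str, float]:
    # Pretend the small model is only confident on short queries.
    confidence = 0.9 if len(query.split()) <= 3 else 0.4
    return f"cheap:{query}", confidence


def strong_model(query: str) -> Tuple[str, float]:
    return f"strong:{query}", 0.95


TIERS = [
    Tier("small-model", cheap_model, cost_per_call=1.0),
    Tier("large-model", strong_model, cost_per_call=20.0),
]
```

With these stubs, `run_cascade("red shoes", TIERS)` is answered by the small tier at a cost of 1.0, while a longer query falls through to the large tier and pays for both calls. Savings depend on how often the cheap tier is confident enough to answer alone.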
Savings: latency dropped from 620 ms to 175 ms (−72%), and cost decreased by 33% (from $0.28 to $0.19 per QPS). CTR increased by 54%, and empty results fell from 28% to 4%. GMV increased by $8.4M, and support tickets dropped by 52%.
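The percentage figures are relative changes between the before and after values. A small helper makes the arithmetic explicit, using the before and after values from the TL;DR table; `pct_change` is a name chosen for this sketch.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before * 100.0


latency = pct_change(620, 175)     # about -71.8, reported as -72%
zero_results = pct_change(28, 4)   # about -85.7, reported as -86%
cost = pct_change(0.28, 0.19)      # about -32.1, roughly a third
```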
ML part: the platform uses CLIP ViT-L/14 fine-tuned on 42M pairs for multimodal search, FAISS HNSW with category sharding, TensorRT optimization, and a cross-encoder for reranking. The system retrains automatically when Evidently detects drift and supports 300 QPS with a 99.97% SLA.
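Category sharding and two-stage ranking can be illustrated without any ML libraries. The sketch below is a simplification under stated assumptions: exact cosine search stands in for a FAISS HNSW index per shard, a lookup table stands in for the cross-encoder, and `ShardedIndex` and `retrieve_and_rerank` are hypothetical names, not the project's code.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
Item = Tuple[str, Vector]  # (sku_id, embedding)


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ShardedIndex:
    """One exact-search shard per category (stand-in for one ANN index each)."""

    def __init__(self) -> None:
        self.shards: Dict[str, List[Item]] = {}

    def add(self, category: str, sku_id: str, embedding: Vector) -> None:
        self.shards.setdefault(category, []).append((sku_id, embedding))

    def search(
        self, query: Vector, k: int, categories: Sequence[str] = ()
    ) -> List[Tuple[str, float]]:
        # Search only the requested shards, or fan out to all of them.
        targets = categories or list(self.shards)
        hits = [
            (sku_id, cosine(query, emb))
            for cat in targets
            for sku_id, emb in self.shards.get(cat, [])
        ]
        hits.sort(key=lambda h: h[1], reverse=True)
        return hits[:k]


def retrieve_and_rerank(
    index: ShardedIndex,
    query_vec: Vector,
    rerank_score: Callable[[str], float],  # assumed to close over the query
    k_candidates: int = 50,
    k_final: int = 10,
    categories: Sequence[str] = (),
) -> List[str]:
    """Stage 1: cheap vector recall. Stage 2: costlier rerank of the short list."""
    candidates = index.search(query_vec, k_candidates, categories)
    reranked = sorted(candidates, key=lambda h: rerank_score(h[0]), reverse=True)
    return [sku_id for sku_id, _ in reranked[:k_final]]


index = ShardedIndex()
index.add("shoes", "sku-1", [1.0, 0.0])
index.add("shoes", "sku-2", [0.8, 0.6])
index.add("bags", "sku-3", [0.0, 1.0])

# Hypothetical reranker: a lookup table in place of a cross-encoder score.
relevance = {"sku-1": 0.2, "sku-2": 0.9, "sku-3": 0.5}
top = retrieve_and_rerank(
    index, [1.0, 0.0], relevance.__getitem__,
    k_candidates=2, k_final=2, categories=["shoes"],
)
```

Restricting the search to one shard keeps the candidate set small, and the reranker only scores the short list, which is where a slower but more accurate model is affordable. Here the reranker reverses the vector order, so `top` begins with `sku-2`.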
TL;DR
| Metric | Before | After | Change |
|---|---|---|---|
| p95 latency | 620 ms | 175 ms | −72% |
| Cost per QPS | $0.28 | $0.19 | −33% |
| CTR | baseline | +54% | +54% |
| Zero-result rate | 28% | 4% | −86% |
| Deployment | Manual | CI/CD + canary | Zero-downtime deployment |