# Igor Yakushev

> Senior ML Engineer for high-traffic systems (10M+ req/day): MLOps, GenAI/RAG, and Recsys. Results: +14% conversion, −40% latency, −42% cost/req.

- Locale: en
- Canonical: https://igor-ya.com/
- Alternate locale: https://igor-ya.ru/
- Generated: 2026-02-19T22:13:30.069Z

## Core pages

- [Home](https://igor-ya.com/) - Senior ML Engineer for high-traffic systems (10M+ req/day): MLOps, GenAI/RAG, and Recsys. Results: +14% conversion, −40% latency, −42% cost/req.
- [About](https://igor-ya.com/about/) - Senior ML Engineer for search and recommendations at 10M+ req/day. Experience at Google, ViSenze, Ozon. Focus: production ML architecture.
- [Articles](https://igor-ya.com/posts/) - ML engineering articles from production: architecture, MLOps, and case studies with patterns, anti-patterns, and metrics.
- [Projects](https://igor-ya.com/projects/) - Production ML case studies: GenAI/RAG, MLOps, and recommender systems. Each case shows the problem, architecture, and measurable results.
- [Hiring](https://igor-ya.com/formats/) - Full-time Senior/Staff ML Engineer. System design and end-to-end ownership for Search, Ranking, Recsys, GenAI/RAG, and MLOps.

## Latest posts

- [Agent or Workflow: How to Choose Architecture Without Hype](https://igor-ya.com/posts/agent-vs-workflow-architecture-framework/) 2026-02-18 | tags: LLM, Agents, Workflow, System Design, AgentOps, Evals, AI Security, FinOps | category: Agents | reading: 12 min - A practical engineering framework for choosing between workflow and agent: criteria, architecture patterns, evals, security, cost, and rollout plan.
- [MLOps for a Support RAG Agent in 2026: Releases, Security, and Cost](https://igor-ya.com/posts/mlops-rag-agent-support-release-gates-security-cost-2026/) 2026-02-10 | tags: MLOps, RAG, AgentOps, LLMOps, AI Security, Observability, FinOps | category: MLOps | reading: 32 min - A practical guide to shipping a support RAG agent with tool-calls: architecture contract, release gates, policy enforcement, observability, and FinOps.
- [MLOps for Production ML: 7 Release Gates for Controlled Rollouts](https://igor-ya.com/posts/mlops-release-gates-production-ml/) 2025-12-26 | tags: MLOps, Model Registry, CI/CD, Observability, Drift Detection, FinOps, SRE, AI Security | category: MLOps | reading: 13 min - A practical MLOps framework for model releases: which gates are mandatory before rollout, and how to keep quality, SLO, and cost under control.
- [Built: igorOS - A Browser PC with an Agent](https://igor-ya.com/posts/igoros-alternative-site/) 2025-12-15 | tags: Web OS, Tool calling, UX, React, TypeScript | category: Agents | reading: 6 min - A second mode of the site: a browser OS with an AI assistant that controls windows and apps through tool calls.
- [Training a Hybrid LLM and Recommender System with Semantic IDs](https://igor-ya.com/posts/semantic-ids-llm-recsys/) 2025-01-20 | tags: LLM, Recommendations, Semantic IDs, RQ-VAE, Qwen3, Retrieval, Ranking, SASRec | category: RecSys | reading: 25 min - How to teach a language model to understand a catalog through semantic IDs and produce controllable recommendations with explanations

## Latest projects

- [ML Inference Latency and Cost Evaluation Platform](https://igor-ya.com/projects/ml-cost/) 2025-12-15 | tags: MLOps, Torch, ONNX, Profiling, Cost Optimization - Internal tool for profiling latency, throughput, and $/req of models in production
- [Voice AI Operator for Call Center](https://igor-ya.com/projects/voice-ai-contact-center/) 2025-09-28 | tags: Voice AI, Realtime GenAI, Agentic Workflow, MLOps, Call Center, RAG - On-prem voice AI operator handles 72% of calls without a human in 0.96s with 58% cost reduction.
- [RAG Assistant for Catalog](https://igor-ya.com/projects/rag-search/) 2025-07-15 | tags: RAG, Vector Search, LLM Serving, KServe, MLOps - MVP chat search with deployment automation, experiments, and quality monitoring
- [Telegram Antifraud Analytics for Media Plans](https://igor-ya.com/projects/telegram-antifraud/) 2025-05-15 | tags: Telegram, Ad Verification, Anomaly Detection, Fraud, MLOps, Python, FastAPI - Fraud detection system reduces inefficient spending by 24% and automates verification of 100 channels in 12 minutes
- [Search and Recommendation System](https://igor-ya.com/projects/search-recommend/) 2025-04-30 | tags: Multimodal, Vector Search, MLOps, Kubernetes, Cost Optimization - Multimodal search and recommendation platform with full CI/CD pipeline, monitoring, and A/B experiments

## Full posts index

- [Agent or Workflow: How to Choose Architecture Without Hype](https://igor-ya.com/posts/agent-vs-workflow-architecture-framework/) 2026-02-18 | tags: LLM, Agents, Workflow, System Design, AgentOps, Evals, AI Security, FinOps | category: Agents | reading: 12 min - A practical engineering framework for choosing between workflow and agent: criteria, architecture patterns, evals, security, cost, and rollout plan.
- [MLOps for a Support RAG Agent in 2026: Releases, Security, and Cost](https://igor-ya.com/posts/mlops-rag-agent-support-release-gates-security-cost-2026/) 2026-02-10 | tags: MLOps, RAG, AgentOps, LLMOps, AI Security, Observability, FinOps | category: MLOps | reading: 32 min - A practical guide to shipping a support RAG agent with tool-calls: architecture contract, release gates, policy enforcement, observability, and FinOps.
- [MLOps for Production ML: 7 Release Gates for Controlled Rollouts](https://igor-ya.com/posts/mlops-release-gates-production-ml/) 2025-12-26 | tags: MLOps, Model Registry, CI/CD, Observability, Drift Detection, FinOps, SRE, AI Security | category: MLOps | reading: 13 min - A practical MLOps framework for model releases: which gates are mandatory before rollout, and how to keep quality, SLO, and cost under control.
- [Built: igorOS - A Browser PC with an Agent](https://igor-ya.com/posts/igoros-alternative-site/) 2025-12-15 | tags: Web OS, Tool calling, UX, React, TypeScript | category: Agents | reading: 6 min - A second mode of the site: a browser OS with an AI assistant that controls windows and apps through tool calls.
- [Training a Hybrid LLM and Recommender System with Semantic IDs](https://igor-ya.com/posts/semantic-ids-llm-recsys/) 2025-01-20 | tags: LLM, Recommendations, Semantic IDs, RQ-VAE, Qwen3, Retrieval, Ranking, SASRec | category: RecSys | reading: 25 min - How to teach a language model to understand a catalog through semantic IDs and produce controllable recommendations with explanations

## Full projects index

- [ML Inference Latency and Cost Evaluation Platform](https://igor-ya.com/projects/ml-cost/) 2025-12-15 | tags: MLOps, Torch, ONNX, Profiling, Cost Optimization - Internal tool for profiling latency, throughput, and $/req of models in production
- [Voice AI Operator for Call Center](https://igor-ya.com/projects/voice-ai-contact-center/) 2025-09-28 | tags: Voice AI, Realtime GenAI, Agentic Workflow, MLOps, Call Center, RAG - On-prem voice AI operator handles 72% of calls without a human in 0.96s with 58% cost reduction.
- [RAG Assistant for Catalog](https://igor-ya.com/projects/rag-search/) 2025-07-15 | tags: RAG, Vector Search, LLM Serving, KServe, MLOps - MVP chat search with deployment automation, experiments, and quality monitoring
- [Telegram Antifraud Analytics for Media Plans](https://igor-ya.com/projects/telegram-antifraud/) 2025-05-15 | tags: Telegram, Ad Verification, Anomaly Detection, Fraud, MLOps, Python, FastAPI - Fraud detection system reduces inefficient spending by 24% and automates verification of 100 channels in 12 minutes
- [Search and Recommendation System](https://igor-ya.com/projects/search-recommend/) 2025-04-30 | tags: Multimodal, Vector Search, MLOps, Kubernetes, Cost Optimization - Multimodal search and recommendation platform with full CI/CD pipeline, monitoring, and A/B experiments

## Machine-readable endpoints

- [LLM JSON Index](https://igor-ya.com/api/llm-index.json)
- [Sitemap](https://igor-ya.com/sitemap-index.xml)
- [Robots](https://igor-ya.com/robots.txt)