# Igor Yakushev
> Senior ML Engineer for high-traffic systems (10M+ req/day): MLOps, GenAI/RAG, and Recsys. Results: +14% conversion, −40% latency, −42% cost/req.

- Locale: en
- Canonical: https://igor-ya.com/
- Alternate locale: https://igor-ya.ru/
- Generated: 2026-02-19T22:13:30.069Z

## Core pages
- [Home](https://igor-ya.com/) - Senior ML Engineer for high-traffic systems (10M+ req/day): MLOps, GenAI/RAG, and Recsys. Results: +14% conversion, −40% latency, −42% cost/req.
- [About](https://igor-ya.com/about/) - Senior ML Engineer for search and recommendations at 10M+ req/day. Experience at Google, ViSenze, Ozon. Focus: production ML architecture.
- [Articles](https://igor-ya.com/posts/) - ML engineering articles from production: architecture, MLOps, and case studies with patterns, anti-patterns, and metrics.
- [Projects](https://igor-ya.com/projects/) - Production ML case studies: GenAI/RAG, MLOps, and recommender systems. Each case shows the problem, architecture, and measurable results.
- [Hiring](https://igor-ya.com/formats/) - Full-time Senior/Staff ML Engineer. System design and end-to-end ownership for Search, Ranking, Recsys, GenAI/RAG, and MLOps.

## Latest posts
- [Agent or Workflow: How to Choose Architecture Without Hype](https://igor-ya.com/posts/agent-vs-workflow-architecture-framework/) 2026-02-18 | tags: LLM, Agents, Workflow, System Design, AgentOps, Evals, AI Security, FinOps | category: Agents | reading: 12 min
  - A practical engineering framework for choosing between a workflow and an agent: criteria, architecture patterns, evals, security, cost, and a rollout plan.
- [MLOps for a Support RAG Agent in 2026: Releases, Security, and Cost](https://igor-ya.com/posts/mlops-rag-agent-support-release-gates-security-cost-2026/) 2026-02-10 | tags: MLOps, RAG, AgentOps, LLMOps, AI Security, Observability, FinOps | category: MLOps | reading: 32 min
  - A practical guide to shipping a support RAG agent with tool calls: architecture contract, release gates, policy enforcement, observability, and FinOps.
- [MLOps for Production ML: 7 Release Gates for Controlled Rollouts](https://igor-ya.com/posts/mlops-release-gates-production-ml/) 2025-12-26 | tags: MLOps, Model Registry, CI/CD, Observability, Drift Detection, FinOps, SRE, AI Security | category: MLOps | reading: 13 min
  - A practical MLOps framework for model releases: which gates are mandatory before rollout, and how to keep quality, SLO, and cost under control.
- [Built: igorOS - A Browser PC with an Agent](https://igor-ya.com/posts/igoros-alternative-site/) 2025-12-15 | tags: Web OS, Tool calling, UX, React, TypeScript | category: Agents | reading: 6 min
  - A second mode of the site: a browser OS with an AI assistant that controls windows and apps through tool calls.
- [Training a Hybrid LLM and Recommender System with Semantic IDs](https://igor-ya.com/posts/semantic-ids-llm-recsys/) 2025-01-20 | tags: LLM, Recommendations, Semantic IDs, RQ-VAE, Qwen3, Retrieval, Ranking, SASRec | category: RecSys | reading: 25 min
  - How to teach a language model to understand a catalog through semantic IDs and produce controllable recommendations with explanations.

## Latest projects
- [ML Inference Latency and Cost Evaluation Platform](https://igor-ya.com/projects/ml-cost/) 2025-12-15 | tags: MLOps, Torch, ONNX, Profiling, Cost Optimization
  - Internal tool for profiling latency, throughput, and $/req of models in production.
- [Voice AI Operator for Call Center](https://igor-ya.com/projects/voice-ai-contact-center/) 2025-09-28 | tags: Voice AI, Realtime GenAI, Agentic Workflow, MLOps, Call Center, RAG
  - On-prem voice AI operator that handles 72% of calls without a human, in 0.96 s, with a 58% cost reduction.
- [RAG Assistant for Catalog](https://igor-ya.com/projects/rag-search/) 2025-07-15 | tags: RAG, Vector Search, LLM Serving, KServe, MLOps
  - MVP chat search with deployment automation, experiments, and quality monitoring.
- [Telegram Antifraud Analytics for Media Plans](https://igor-ya.com/projects/telegram-antifraud/) 2025-05-15 | tags: Telegram, Ad Verification, Anomaly Detection, Fraud, MLOps, Python, FastAPI
  - Fraud detection system that reduces inefficient spending by 24% and automates verification of 100 channels in 12 minutes.
- [Search and Recommendation System](https://igor-ya.com/projects/search-recommend/) 2025-04-30 | tags: Multimodal, Vector Search, MLOps, Kubernetes, Cost Optimization
  - Multimodal search and recommendation platform with a full CI/CD pipeline, monitoring, and A/B experiments.

## Full posts index
- [Agent or Workflow: How to Choose Architecture Without Hype](https://igor-ya.com/posts/agent-vs-workflow-architecture-framework/) 2026-02-18 | tags: LLM, Agents, Workflow, System Design, AgentOps, Evals, AI Security, FinOps | category: Agents | reading: 12 min
  - A practical engineering framework for choosing between a workflow and an agent: criteria, architecture patterns, evals, security, cost, and a rollout plan.
- [MLOps for a Support RAG Agent in 2026: Releases, Security, and Cost](https://igor-ya.com/posts/mlops-rag-agent-support-release-gates-security-cost-2026/) 2026-02-10 | tags: MLOps, RAG, AgentOps, LLMOps, AI Security, Observability, FinOps | category: MLOps | reading: 32 min
  - A practical guide to shipping a support RAG agent with tool calls: architecture contract, release gates, policy enforcement, observability, and FinOps.
- [MLOps for Production ML: 7 Release Gates for Controlled Rollouts](https://igor-ya.com/posts/mlops-release-gates-production-ml/) 2025-12-26 | tags: MLOps, Model Registry, CI/CD, Observability, Drift Detection, FinOps, SRE, AI Security | category: MLOps | reading: 13 min
  - A practical MLOps framework for model releases: which gates are mandatory before rollout, and how to keep quality, SLO, and cost under control.
- [Built: igorOS - A Browser PC with an Agent](https://igor-ya.com/posts/igoros-alternative-site/) 2025-12-15 | tags: Web OS, Tool calling, UX, React, TypeScript | category: Agents | reading: 6 min
  - A second mode of the site: a browser OS with an AI assistant that controls windows and apps through tool calls.
- [Training a Hybrid LLM and Recommender System with Semantic IDs](https://igor-ya.com/posts/semantic-ids-llm-recsys/) 2025-01-20 | tags: LLM, Recommendations, Semantic IDs, RQ-VAE, Qwen3, Retrieval, Ranking, SASRec | category: RecSys | reading: 25 min
  - How to teach a language model to understand a catalog through semantic IDs and produce controllable recommendations with explanations.

## Full projects index
- [ML Inference Latency and Cost Evaluation Platform](https://igor-ya.com/projects/ml-cost/) 2025-12-15 | tags: MLOps, Torch, ONNX, Profiling, Cost Optimization
  - Internal tool for profiling latency, throughput, and $/req of models in production.
- [Voice AI Operator for Call Center](https://igor-ya.com/projects/voice-ai-contact-center/) 2025-09-28 | tags: Voice AI, Realtime GenAI, Agentic Workflow, MLOps, Call Center, RAG
  - On-prem voice AI operator that handles 72% of calls without a human, in 0.96 s, with a 58% cost reduction.
- [RAG Assistant for Catalog](https://igor-ya.com/projects/rag-search/) 2025-07-15 | tags: RAG, Vector Search, LLM Serving, KServe, MLOps
  - MVP chat search with deployment automation, experiments, and quality monitoring.
- [Telegram Antifraud Analytics for Media Plans](https://igor-ya.com/projects/telegram-antifraud/) 2025-05-15 | tags: Telegram, Ad Verification, Anomaly Detection, Fraud, MLOps, Python, FastAPI
  - Fraud detection system that reduces inefficient spending by 24% and automates verification of 100 channels in 12 minutes.
- [Search and Recommendation System](https://igor-ya.com/projects/search-recommend/) 2025-04-30 | tags: Multimodal, Vector Search, MLOps, Kubernetes, Cost Optimization
  - Multimodal search and recommendation platform with a full CI/CD pipeline, monitoring, and A/B experiments.

## Machine-readable endpoints
- [LLM JSON Index](https://igor-ya.com/api/llm-index.json)
- [Sitemap](https://igor-ya.com/sitemap-index.xml)
- [Robots](https://igor-ya.com/robots.txt)