MLOps Infrastructure
When ML grows faster than infrastructure
- Python
- Kubernetes
- MLflow
- GitHub Actions
- Docker
- Terraform
Designing ML systems that work on real traffic: search, generation, recommendations, pricing. Here's my path from business to ML development.
I finished my engineering degree in 2012. It didn't bring great revelations, but it taught me the main thing: break complex problems into parts and look for patterns.
At 20, I started working inside a large government machine and saw how decisions depend not on people but on regulations, approvals, and piles of paperwork.
At the start, everything was done manually through chats: ten rounds of edits, deadlines in Excel. No tracking; the rule was "whoever remembers is right".
When the team grew from 5 to 30+ people, I realized that half my time went to putting out fires and relaying tasks between people.
I had to redesign everything: I broke business processes into clear stages, implemented a CRM, and replaced manual reports with scripts. To test my hypotheses faster, I integrated APIs myself and wrote scripts for employees.
With the shift to a systematic approach, profitability grew by 22%, and overdue project tasks dropped to almost zero.
I realized: structure is power. It scales; people don't.
I started writing code more and more often. IT had what I lacked in marketing: transparent logic. There's input, there's code, there's a result.
Realizing that scaling the agency model wouldn't work, I started building my own IT projects: parsers and bots in Python, and services that tested how SQL data, business logic, and automation fit together into a working system.
My launches were engineering startup experiments: how the system works, where it breaks, how to simplify and scale it. Only one thing mattered to me: that everything worked not once, but every time.
Guides existed, but they were scattered: Jupyter, Colab, TensorFlow articles. I tried to build infrastructure around them and realized that without systematic knowledge I couldn't move forward.
I was interested in the engineering side of ML and Data Science: how infrastructure works, how models go from training to production.
To understand this in practice, I ran mini-services on Heroku, but hit their limit: I wanted to know how high-traffic systems hold up when a failure costs money. I decided to go after Big Tech knowledge. There were no technical vacancies at that moment, but my marketing experience opened a way in through Google, where I was responsible for marketing analytics and digital products.
Inside, I went through ML programs: first theory, then production tools. That's where I got my first experience with BigQuery and Airflow pipelines, and tested TFX. I understood how real systems work: deployment, logging, stability requirements. This gave me a foundation for a systematic approach to ML infrastructure.
Since May 2022 I've been responsible for the ML chain of a B2B platform. I built modules for pricing, description generation, and demand forecasting on XGBoost and scikit-learn, designed an end-to-end pipeline with auto-retraining, fallbacks, and full-chain monitoring, and ensured a 99.9% SLA at ~1 million predictions per day.
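The fallback logic in that pipeline comes down to a simple contract: the freshest model serves until it misbehaves, then a known-good baseline takes over. A minimal sketch; the class name and wiring are illustrative, not the production code:

```python
import logging

log = logging.getLogger("pricing")

class FallbackPredictor:
    """Serve the latest auto-retrained model; degrade to a known-good baseline on failure."""

    def __init__(self, primary, baseline):
        self.primary = primary    # freshest model from the retrain pipeline
        self.baseline = baseline  # last validated model, or a simple heuristic

    def predict(self, features):
        try:
            return self.primary.predict(features)
        except Exception:
            # any runtime failure serves the fallback instead of an error,
            # which is what keeps the SLA intact during a bad retrain
            log.exception("primary model failed; serving fallback")
            return self.baseline.predict(features)
```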
In 2023 I joined the engineering team of an AI platform for e-commerce, working on the architecture of multimodal search and recommendations: from training embeddings to online ranking.
Under the hood, CLIP models and LLMs convert text and image queries into unified vectors, a fast FAISS index pulls up candidates, and on top a hybrid of BM25 and a neural network re-ranks them.
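Roughly how that retrieve-then-re-rank stage fits together, as a sketch: the real CLIP encoder is replaced here by deterministic pseudo-random vectors, and the neural re-ranker by a simple linear blend of dense and BM25 scores (requires faiss-cpu and rank-bm25):

```python
import numpy as np
import faiss                     # pip install faiss-cpu
from rank_bm25 import BM25Okapi  # pip install rank-bm25

docs = ["red running shoes", "wireless noise-cancelling headphones", "leather wallet"]

def embed(texts):
    # stand-in for the real CLIP/LLM encoder: per-text pseudo-random vectors,
    # L2-normalized so that inner product equals cosine similarity
    out = []
    for t in texts:
        rng = np.random.default_rng(abs(hash(t)) % 2**32)
        v = rng.standard_normal(512).astype("float32")
        out.append(v / np.linalg.norm(v))
    return np.stack(out)

index = faiss.IndexFlatIP(512)   # exact inner-product index; prod would use IVF/HNSW
index.add(embed(docs))

bm25 = BM25Okapi([d.split() for d in docs])

def search(query, k=3, alpha=0.7):
    # stage 1: dense candidate retrieval from FAISS
    dense, ids = index.search(embed([query]), k)
    # stage 2: re-rank by blending dense and BM25 scores
    # (the production system uses a neural re-ranker at this step)
    lex = bm25.get_scores(query.split())
    scored = [(alpha * d + (1 - alpha) * lex[i], docs[i]) for d, i in zip(dense[0], ids[0])]
    return sorted(scored, reverse=True)

print(search("running shoes"))
```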
I implemented online fine-tuning on clicks: CTR grew by 14% and infrastructure costs fell by 30%.
Now my focus is platform architecture for ML products: auto-updating pipelines, observability, fault tolerance. I build systems so that engineers aren't stuck patching bugs.
The final test of an architecture is that it keeps working even while you're on vacation.
Where my experience was applied
Services
Solving engineering bottlenecks in ML production
When ML grows faster than infrastructure
CI/CD for ML: auto-deploy models with versioning, zero-downtime deployment, metrics via Prometheus + OTel
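One building block of that auto-deploy flow, sketched against the MLflow model registry. The metric name, threshold, and stage-based promotion are illustrative; newer MLflow versions express the same gate with model aliases:

```python
from mlflow.tracking import MlflowClient

client = MlflowClient()  # assumes MLFLOW_TRACKING_URI is set in the environment

def promote_if_better(name: str, version: str, min_auc: float = 0.85) -> bool:
    """Quality gate: move a registered model version to Production only if it clears the bar."""
    mv = client.get_model_version(name, version)
    metrics = client.get_run(mv.run_id).data.metrics
    if metrics.get("val_auc", 0.0) < min_auc:   # metric name is an assumption
        return False
    # the serving layer resolves the Production stage per request, so the old
    # version keeps answering until this call completes: a zero-downtime swap
    client.transition_model_version_stage(name, version, stage="Production")
    return True
```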
LLM inference without latency spikes or budget overruns
–42% cost/req through async RAG and fallback routing with cache, latency ~1.2s (Qdrant, FastAPI)
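A stripped-down sketch of that request path: exact-match caching in front, vector retrieval from a Qdrant collection (assumed here to be named "docs" with a "text" payload), and routing to a cheaper model when retrieval is weak. The encoder, the LLM call, and the in-process dict standing in for Redis are placeholders:

```python
import hashlib

from fastapi import FastAPI
from qdrant_client import AsyncQdrantClient

app = FastAPI()
qdrant = AsyncQdrantClient(url="http://localhost:6333")  # placeholder address
cache: dict[str, str] = {}                               # stand-in for Redis

def embed(text: str) -> list[float]:
    return [0.0] * 384          # placeholder: a real encoder goes here

async def generate(prompt: str, model: str) -> str:
    return f"[{model}] ..."     # placeholder: the real LLM call goes here

@app.get("/answer")
async def answer(q: str) -> dict:
    key = hashlib.sha256(q.encode()).hexdigest()
    if key in cache:            # cache hit skips retrieval and the LLM entirely
        return {"answer": cache[key], "cached": True}
    hits = await qdrant.search(collection_name="docs", query_vector=embed(q), limit=5)
    context = "\n".join(h.payload["text"] for h in hits)  # assumes a "text" payload
    # fallback routing: weak retrieval -> cheaper model (threshold is illustrative)
    model = "large" if hits and hits[0].score > 0.3 else "small"
    result = await generate(f"{context}\n\nQ: {q}", model)
    cache[key] = result
    return {"answer": result, "cached": False}
```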
Behavioral personalization with real-time response
Real-time recsys: embeddings + GBDT, caches (Kafka, Redis), feature pipeline on an in-house feature store
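Roughly how the read path of such a system can look: candidate features come from Redis (kept fresh by Kafka consumers, omitted here) and a GBDT scores the join. Feature names and key schema are illustrative:

```python
import json

import numpy as np
import redis
import xgboost as xgb

r = redis.Redis(host="localhost", port=6379)  # feature cache, fed by Kafka consumers

def rank(user_id: str, candidates: list[str], model: xgb.Booster) -> list[tuple[str, float]]:
    """Join cached user/item features and rank candidates with a GBDT."""
    user = json.loads(r.get(f"user:{user_id}") or "{}")
    rows = []
    for item_id in candidates:
        item = json.loads(r.get(f"item:{item_id}") or "{}")
        # illustrative feature vector; the real pipeline has far more features
        rows.append([user.get("ctr_7d", 0.0),
                     item.get("popularity", 0.0),
                     item.get("price", 0.0)])
    scores = model.predict(xgb.DMatrix(np.asarray(rows, dtype=np.float32)))
    return sorted(zip(candidates, scores.tolist()), key=lambda p: p[1], reverse=True)
```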
LLM inference without budget drain on every request
up to –50% cost/req through a scoring classifier (prompt length + token count) and a semantic answer cache
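The semantic-cache half of that, as a minimal sketch. The encoder is injected, and the 0.92 similarity threshold is a made-up starting point that would need tuning against false hits:

```python
import numpy as np

class SemanticCache:
    """Reuse a stored answer when a new prompt is close enough to a previous one."""

    def __init__(self, embed, threshold: float = 0.92):
        self.embed = embed          # any encoder returning L2-normalized vectors
        self.threshold = threshold  # cosine-similarity cutoff; tune on real traffic
        self.keys: list[np.ndarray] = []
        self.values: list[str] = []

    def get(self, prompt: str) -> str | None:
        if not self.keys:
            return None
        sims = np.stack(self.keys) @ self.embed(prompt)  # cosine on unit vectors
        best = int(np.argmax(sims))
        return self.values[best] if sims[best] >= self.threshold else None

    def put(self, prompt: str, answer: str) -> None:
        self.keys.append(self.embed(prompt))
        self.values.append(answer)
```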
Contact
Ready to discuss ML projects and implementations. I respond personally.
About me
Senior/Staff ML Engineer. System design, ownership, high-traffic ML systems.
Fastest way
Write in Telegram