Operationalising Machine Learning: An Interview Study

Shankar, Garcia, Hellerstein, and Parameswaran, 2022.

Organisations rely on machine learning engineers (MLEs) to operationalise ML, i.e., to deploy and maintain ML pipelines in production. The process of operationalising ML, or MLOps, consists of a continual loop of (i) data collection and labelling, (ii) experimentation to improve ML performance, (iii) evaluation throughout a multi-staged deployment process, and (iv) monitoring for performance drops in production. Taken together, these responsibilities seem staggering: how does anyone do MLOps, what are the unaddressed challenges, and what are the implications for tool builders?
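The four-stage loop above can be sketched as a single pass of an iteration; all function names here are illustrative stand-ins, not anything from the paper:

```python
# Hypothetical sketch of one pass of the MLOps loop; stage functions are
# invented for illustration.

def mlops_iteration(collect, experiment, evaluate, monitor):
    """Run one pass: data -> experiment -> evaluate -> monitor."""
    data = collect()                  # (i) data collection and labelling
    candidate = experiment(data)      # (ii) experimentation to improve performance
    if evaluate(candidate):           # (iii) multi-staged evaluation gate
        return monitor(candidate)     # (iv) monitor for performance drops in production
    return None                      # failed candidates never reach production

# Toy usage with stub stages:
result = mlops_iteration(
    collect=lambda: [1, 2, 3],
    experiment=lambda data: {"model": "v1", "score": 0.9},
    evaluate=lambda cand: cand["score"] > 0.8,
    monitor=lambda cand: cand["model"],
)
print(result)  # -> v1
```

The point of the gate between (iii) and (iv) is that most candidates are expected to stop there, which is the framing the authors give to the failure-rate statistic discussed below.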

Successful machine learning operations centre around three critical variables:

  • Velocity (rapid prototyping and iteration)
  • Validation (early testing and error detection)
  • Versioning (maintaining multiple model versions to minimise downtime)

Rather than viewing the oft-cited 90% failure rate of ML models as problematic, the authors reframe it as a natural consequence of experimentation: most attempts shouldn't reach production, and effective teams excel at quickly identifying promising approaches.

In practice, successful ML teams focus on data quality over model complexity, collaborating closely with domain experts to validate ideas early. They implement multi-stage deployment processes (test, dev, canary, shadow) and tie ML metrics directly to business outcomes. Small, incremental changes using configuration files rather than code changes help maintain stability while enabling experimentation.
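A hedged sketch of the promotion pattern: a change lands as a small configuration edit, and the resulting candidate is walked through the deployment stages in order, failing fast at the first gate it misses. The stage names follow the paper; the gating thresholds and metric values are invented for illustration:

```python
import json

# The deployment stages named in the study, in promotion order.
STAGES = ["test", "dev", "canary", "shadow"]

def promote(candidate_metrics, stage_thresholds):
    """Walk a candidate through each stage; stop at the first failed gate.

    candidate_metrics: stage -> observed metric for the candidate.
    stage_thresholds: stage -> minimum metric required to pass that stage.
    Returns the list of stages the candidate cleared.
    """
    cleared = []
    for stage in STAGES:
        if candidate_metrics.get(stage, 0.0) < stage_thresholds[stage]:
            break  # fail fast: later stages are never reached
        cleared.append(stage)
    return cleared

# A change is a small, easily reverted config edit rather than new code:
config = json.loads('{"learning_rate": 0.01, "features": ["clicks", "dwell"]}')
config["learning_rate"] = 0.005  # the incremental change under review

thresholds = {"test": 0.7, "dev": 0.75, "canary": 0.8, "shadow": 0.8}
metrics = {"test": 0.90, "dev": 0.85, "canary": 0.78, "shadow": 0.82}
print(promote(metrics, thresholds))  # canary gate fails -> ['test', 'dev']
```

Keeping the change in configuration means a failed canary can be rolled back by reverting one file, which is how small increments preserve stability while still allowing experimentation.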

For production reliability, organisations establish regular retraining cadences, maintain fallback model versions, add rule-based guardrails to prevent obvious errors, and implement automated validation checks. Organisational practices like on-call rotations, centralised bug tracking, and defined service level objectives further support sustainable MLOps. Throughout all stages, the focus remains on high-value experiments rather than running many in parallel for the sake of keeping resources busy.
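The guardrail-plus-fallback idea can be sketched in a few lines. This is an illustrative pattern, not the authors' implementation: the sanity predicate, the models, and the price cap are all assumptions made up for the example:

```python
# Illustrative sketch of a rule-based guardrail with a fallback model version.

def guarded_predict(primary, fallback, features, is_sane):
    """Serve the primary model's prediction, falling back when the model
    errors out or its output violates a rule-based sanity check."""
    try:
        prediction = primary(features)
    except Exception:
        return fallback(features)  # previous version kept warm to minimise downtime
    if not is_sane(prediction):
        return fallback(features)  # guardrail catches obviously wrong outputs
    return prediction

# Toy usage: predicted prices must be positive and under an assumed cap.
primary = lambda f: -4.2     # a buggy new model emitting nonsense
fallback = lambda f: 19.99   # the previous, known-good version
sane = lambda p: 0 < p < 10_000
print(guarded_predict(primary, fallback, [1.0, 2.0], sane))  # -> 19.99
```

The guardrail is deliberately dumb: it encodes a business invariant (prices are positive and bounded) rather than model logic, so it keeps working even when the model itself is the thing that broke.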