What’s wrong with MLOps?


With startups cheering their inclusion in the next top-X-hundred ML companies list on social media, and the announcement of yet another decamillion-dollar VC investment into a stealth-mode startup, I thought I would take stock of the state of MLOps today.

Why does Machine Learning matter?

Clearly, machine learning is here to stay. There are enough use cases and success stories to justify every corporation investigating whether it can increase its productivity and exploit otherwise unexploitable opportunities. The road to adoption also creates a race in today’s winner-takes-all environment, and the first to succeed in each space will be disproportionately rewarded. If your enterprise struggles with, or hasn’t yet started, machine learning adoption, your mindset should be “Trouble” rather than “Even Keel” (terms from the book “New Strategic Selling”). Time is running out for you. But to decide what to do and what to invest in, you need to see the field clearly and formulate an action plan grounded in reality. This is hindered by the factors I list below:

Uninformative Clichés

Machine Learning is hard; Machine Learning is biased; 90/87/85% of all Machine Learning never makes it to production. How many times have you seen these and many other clichés? The space is filled with audience builders trying to stay relevant and gaming the social networks’ recommender systems by reiterating old thoughts and content. Serendipity is dead, and the signal-to-noise ratio is extremely low. The MLOps scene desperately needs new voices and new content.

Hype Driven Magical Thinking

Not a day passes without a thought piece on the effect of AI or robots, or some unobtainable and very convoluted piece of technology touted as commonplace. This primes the audience to believe that these technologies are easy to obtain and are “must-haves”. The fact is that chatbots don’t work unless you define “work” as “hardcoded menu selection”. No one wants to write a “natural language question” to get an answer achievable by a couple of lines of SQL or a few clicks on a dashboard. Sentiment-based decision making is impossible until you have a deep understanding of what sentiment means. And don’t get me started on generating stock buy/sell signals from market news or analysing Fed statements...
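To make the dashboard point concrete: the answer such a “natural language question” interface promises is often just an aggregate query. A minimal sketch using Python’s built-in sqlite3, with a hypothetical orders table (the table name, columns and figures are invented for illustration):

```python
import sqlite3

# Hypothetical question: "What was the average order value per region?"
# It reduces to a couple of lines of SQL over a toy in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("EU", 120.0), ("EU", 80.0), ("US", 200.0)],
)

rows = conn.execute(
    "SELECT region, AVG(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('EU', 100.0), ('US', 200.0)]
```

In practice the same question is usually a saved dashboard tile, which is exactly why wrapping it in an NLP layer adds little.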

Yet these and many more topics are echoed by executives who cannot obtain a coherent view of where the industry is heading but also don’t want to be seen as out of the loop. The side effect of these moonshot projects is that the company may commit to a large amount of capability building needed to achieve said moonshot. They do not realise that they have built the wrong infrastructure until the moonshot fails and the decision-makers shut down the project. Starting big when you are uncertain about your direction is a recipe for disaster (more on this later).

No standard form

It is essential to understand that machine learning is in its infancy and hasn’t converged to a standardised solution. This is well summarised in “The problem with AI developer tools for enterprises”. It tells you that there are no turnkey solutions yet, and it is entirely possible that there never will be. Any solution is too business-specific and tailored to be easily covered by standard products. You can think of machine learning solutions as websites. You cannot buy a website; you can hire frontend engineers who will make you one based on your specification. They will use production-grade tools to accelerate their work, but the gist of the work will be low-level and custom to the enterprise.

“You are not Google”-effect

Because of the lack of a standard form for ML, audiences turn towards the Big Ones succeeding in the field. One of my favourite blog posts of all time is “You are not Google”, which I highly recommend you read if you haven’t done so. Machine Learning is dominated by a couple of giant actors like Google, Facebook and OpenAI. These companies can fund work and exploit opportunities that are non-existent for others. They use their position to drive thought leadership, demonstrate their excellence and attract future talent.

Suppose you are not working, or planning to work, for them. How relevant their methods and workflows are to you is questionable. Their solutions are usually highly sophisticated, with multiple abstraction levels requiring many highly aligned but still agile teams. These kinds of resources are generally not available to “normal” companies, even multinationals just beginning their journey into machine learning. Another worrying trend is the use of their analyses, and their justifications for doing machine learning the way they do it, as forceful arguments to dismiss alternative approaches. Don’t forget that machine learning doesn’t have a standard form.

Logo fetishism

Well, if you can’t get the real thing, how about the second best thing: someone who worked there? This is similar to the “appeal to authority” logical fallacy and partly follows from the previous section. Because the only companies that appear to be publicly and successfully doing machine learning are well-funded logo companies (FAANG plus other decacorns), their (ex-)employees have an advantage in dominating the conversation about machine learning and its solutions. They are assumed to possess high-value secret knowledge that is otherwise inaccessible, despite all of these companies having a highly public research and engineering (often open-source) presence.

Too much VC money

VCs have been incredibly successful in recent years, which was further exacerbated by the rising stock market. Their investment paradigm determines which ideas gain momentum and which don’t. A few well-funded startups have an outsized effect when a market is as crowded as the ML/MLOps space. One of the recent paradigms is: “Replicate an internal tool that a Big One created and sell it to other large companies; think Hadoop.” This is a form of logo fetishism and relates to the previous two sections. The fact is, there is no free lunch in finding the next big thing. Many internal tools are specific to the circumstances of the Big Ones and might not be transferable to a different context. The VC mindset also requires your product to be laser-focused (quite rightly so), which in an ecosystem play leads to many well-funded, incompatible, hard-to-harmonise tools. Enterprise products are convoluted and fiddly because large enterprises are convoluted and fiddly; laser-focused products cannot be fiddly. The size of the investments into these startups is so large that they almost certainly commit a VC to backing the company at any cost. They will be too big to fail, and they distort the space with their marketing, networking and sales efforts. Again, there is no standard form; it would probably be a good idea to start small.

Capability building: the fallacy of “Buy vs Build”

The market is very product-driven as a result of all the points above, and because that’s what you can get funding for. This has an unfortunate effect on bottom-up capability building. Companies think of machine learning (they call it AI) as a capability: a resource that can be taken off the shelf and used to produce value. What they are building this capability for is questionable. Anecdotally, even large companies struggle to come up with enough use cases to justify the upfront investment. A capability also lacks “ownership”; its maintainers are not the people who will add the business value, which means their priorities will not be aligned.

That doesn’t deter MLOps vendors from framing the whole question as “buy vs build”, suggesting that “buy” is the only reasonable option. The fact is, most MLOps problems are engineering problems, and pretty trivial ones at that. Despite the famous chart in “Hidden Technical Debt in Machine Learning Systems”, the only hard part is the actual model building. The rest can be resolved with a well-organised engineering and statistics effort. Even modest-sized organisations have many more engineers than data scientists, and those engineers regularly solve the constituent parts of machine learning in their daily jobs under different names: model deployment is A/B testing, model monitoring is business intelligence, feature engineering is microservices, ML CI/CD is, well, just CI/CD. The fact is, for most places (especially very large ones), “build” is the better solution (at the moment). It just won’t be built by the DSes but by the engineers.
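As an illustration of the “model monitoring is business intelligence” point: a first-pass monitor is often just an aggregate-and-compare check, the same shape as any BI metric. A minimal sketch, stdlib only, with an invented threshold and invented prediction data:

```python
from statistics import mean

def drift_alert(baseline_preds, live_preds, threshold=0.1):
    """Flag drift when the live positive-rate moves more than
    `threshold` away from the training baseline. This is the same
    aggregate-and-compare query a BI dashboard runs every day."""
    baseline_rate = mean(baseline_preds)
    live_rate = mean(live_preds)
    return abs(live_rate - baseline_rate) > threshold

# Hypothetical binary predictions (1 = positive class)
baseline = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]  # 30% positive
live     = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]  # 70% positive
print(drift_alert(baseline, live))  # True: the positive rate shifted
```

A real monitor would add more metrics and alert routing, but none of that requires a specialised MLOps product; it is ordinary engineering.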

Too little adoption, too early

Bottom-up capability building is generally a bad idea. It is an especially bad idea if you are coupling yourself to new technology in an area without best practices (see also “Choose Boring Technology”). With hundreds of MLOps products on the market, the adoption of each of them is marginal. You are very likely to couple your stack to a company that either goes bust or is absorbed by its competition, and that will definitely go through several cycles of change.

Data scientists going alone

If it is so risky to decide on tools, why do most teams struggle with it instead of leaving it to the engineers? Often Data Scientists are left alone “in the lab” and are expected to productionise solutions on their own, even though they are not trained to do that level of engineering.

This is partly because they describe their own work as “experimental and POC”. They don’t connect to the business as naturally as, for example, a frontend engineering team. This is despite the fact that the vast majority of the work is procedural: a standardised workflow with plenty of examples online and business understanding inside the company. Of course, this will not be as interesting as the latest deep learning R&D project, but a machine learning project’s job is not to be interesting; it is to make money. Data Scientists should shed research thinking, build an ownership mentality and workflow discipline, and find fulfilment in that.

What is the way forward?

Machine Learning is a cross-functional business problem that happens to be solvable by mathematical models. Treat it like any other product:

  • Start from the business problem, not a research problem

  • Identify the team structure and necessary skills to deliver a solution

  • Implement a productionised MVP solution

  • Identify bottleneck issues

  • Decouple them into standalone services

  • Decide whether buying is a good decision or whether to invest in an in-house solution built by engineers

Each step allows you to frame the problem as a value decision: is it economically worth doing or not? That should be the only driving force, not hype or logos.