On Feb 27, part of our Product Data Science team (Nikoleta Lindenau and I) travelled to Warsaw to moderate a roundtable discussion at the Big Data Technology Warsaw Summit. We had the pleasure of hosting a session titled “MLOps: how to shorten the ML service development pipeline?”
As this was a technology-oriented conference, we expected most of the discussion to revolve around technologies that speed up and automate the training, validation and deployment of ML services. However, the participants highlighted that the tools a team uses aren’t the only thing that determines whether a machine learning project succeeds.
They stressed that the other part of the equation is how the people responsible for different aspects and stages of the project work together. For example, in some companies the data scientists responsible for model development only interact with production engineers when they need to hand off a prototype to be productionised. Such a lack of co-operation can lead to situations in which a prototype written in Python is too costly to productionise as is, and engineers need to rewrite it in another technology (e.g. Spark), which unnecessarily complicates the deployment pipeline. Our discussion showed a growing trend towards closer collaboration between those two groups throughout the project lifecycle. In our example, this would mean that the engineers work with the data scientists from day one, so that both groups share the same tech stack and code base, and the engineers can help optimise the prototype with production in mind as early as possible.
We’re grateful to the organisers of the Big Data Technology Warsaw Summit for inviting us to host the roundtable. It was a great opportunity to exchange knowledge and experiences with other industry experts.