Laszlo’s Newsletter


Article Review: Rendezvous Architecture for Data Science in Production by Jan Teichmann

2022-03-22

Laszlo Sragner

Hypergolic (our ML consulting company) is working on its own ML maturity model and ML assessment framework. In the next phase, I will review three more articles:

  • Machine Learning operations maturity model

  • Towards CRISP-ML(Q): A Machine Learning Process Model with Quality Assurance Methodology

  • Rendezvous Architecture for Data Science in Production

Our focus with this work is helping companies at an early to moderately advanced stage of ML adoption, and my comments will reflect this. Please subscribe if you would like to be notified of the rest of the series.


Rendezvous Architecture for Data Science in Production

This is another framework rooted in CRISP-DM, which I discussed yesterday [link]. The author identifies that CRISP-DM lacks model management in production.

The author also claims that A/B testing is inadequate for managing model updates and selecting a new model from the challengers. That’s because A/B testing requires you to expose potentially inferior models to at least a small portion of your customers.

Rendezvous architecture

Summarising the problem statement, we are looking for an architecture that can:

  • Evaluate a large number of incumbent and challenger models in parallel

  • Manage the model life cycle

  • Handle an increasing heterogeneity of data science toolkits

  • Allow experimentation in production without impacting the user experience and decouple the business objectives from the data science objectives

  • Decouple enterprise requirements like SLAs and GDPR from the data science models

  • Scale this to peaks of 50+k page loads per minute without hiring armies of DevOps engineers

The architecture is from the Machine Learning Logistics book by Ted Dunning and Ellen Friedman [link].

In the architecture, the models receive their inputs through a pub/sub streaming pipeline and send their outputs to the rendezvous service through another stream. The rendezvous service is also subscribed to the input stream, so it knows about every request. If the models fail to respond (or respond too slowly), it can still return a response within the SLA.

The main idea of the architecture is to separate the business objective from the modelling objective, so the business side does not depend on the concrete implementation of the models or on experimentation. A user is only exposed to the rendezvous service, which is the point where enterprise SLAs can be enforced.
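
To make the selection logic concrete, here is a minimal sketch in Python. It is not the implementation from the article or the book: the names (ModelResponse, rendezvous, preference, fallback_score) are hypothetical, and the streams are replaced by a plain list of responses that have already arrived. It only illustrates the idea that the service returns the most preferred answer that arrived before the deadline, and falls back to a default when nothing arrived in time.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ModelResponse:
    request_id: str
    model: str          # hypothetical model name
    latency_ms: float   # time between the input message and the model output
    score: float        # the model's answer for this request


def rendezvous(
    request_id: str,
    responses: Sequence[ModelResponse],
    preference: Sequence[str],
    deadline_ms: float,
    fallback_score: float,
) -> ModelResponse:
    """Choose the response served to the user for a single request.

    Models that are not listed in `preference` (for example, challengers)
    are still evaluated, but their outputs are never served.
    """
    # Keep only the responses for this request that met the deadline.
    on_time = {
        r.model: r
        for r in responses
        if r.request_id == request_id and r.latency_ms <= deadline_ms
    }
    # Serve the most preferred model that answered in time.
    for model in preference:
        if model in on_time:
            return on_time[model]
    # Nothing usable arrived before the deadline: honour the SLA with a default.
    return ModelResponse(request_id, "fallback", deadline_ms, fallback_score)


responses = [
    ModelResponse("r1", "incumbent", 120.0, 0.7),
    ModelResponse("r1", "challenger", 40.0, 0.9),
    ModelResponse("r1", "baseline", 5.0, 0.5),
]
served = rendezvous("r1", responses, ["incumbent", "baseline"], 50.0, 0.0)
print(served.model)
```

In this sketch the challenger responds quickly, but it is never served because it is not in the preference list; its output could still be logged for offline comparison, which is how experiments stay hidden from users.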

The Model Data Enricher acts as a feature store/feature engineering implementation. The article doesn’t say what happens if different models need different inputs, especially if those inputs affect the SLA differently (for example, some models need a more complex feature whose computation slows down all models).

Implementing the Rendezvous Service

I will skip this part as it does not discuss methodology.

I would mention, though, that the rendezvous service is stateful. I am not sure if this is a problem, but it immediately raises a concern about scalability: you can either run exactly one instance, or you need to shard the load somehow.
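
One common way to shard a stateful service is to route by a stable hash of the request ID, so that the input message and all model outputs for the same request land on the same instance. The sketch below is my own illustration under that assumption, not something the article describes; shard_for and num_shards are hypothetical names.

```python
import hashlib


def shard_for(request_id: str, num_shards: int) -> int:
    """Map a request ID to a shard index in the range [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # Use a cryptographic digest rather than the built-in hash(), which is
    # randomised per process for strings and would route inconsistently.
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Every message that carries the same request ID goes to the same shard.
print(shard_for("request-42", 4) == shard_for("request-42", 4))
```

Note that plain modulo sharding remaps most request IDs when the number of shards changes, so resizing would need extra care for requests that are still in flight.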

Summary

This was a shorter summary than the previous ones. I included the article in the reviews because of the idea of decoupling the serving of model results from the computation of model outputs, which in turn allows you to hide experiments from users. The decoupling also enables you to manage enterprise-level SLAs and business objectives.

I hope you enjoyed this summary, and please subscribe to be notified about future parts of the series.
