Characters in play:
Data Science Manager
Senior Machine Learning Engineer
Data Scientists, Managers, Executives, ML consultants, AI influencers (all nonspeaking)
Thunder and Lightning. Enter DS Manager and Senior MLE.
Act 1 - Setup
First, there were Google and BigTable, and we had Big Data and Hadoop.
But we could only write SQL, so Facebook gave us Hive.
Then, we wanted to see the query results, so we had Tableau, Looker and QlikView.
Then we wanted to do this repeatedly, so Airbnb gave us Airflow.
Someone needed to sort this out, so the Data Scientist was born.

Act 2 - Confrontation
Some questions can't be answered with queries, so the Data Scientist made models. Sklearn was promoted from academia to industry.
They also used everything under the sun, like notebooks, matplotlib, and pandas.
But there was a problem:
"99% of models don't make it into production." (This is a play, so I can exaggerate.)
So deployment services were born.
"Actually, we need to feed all this data to the models."
So feature stores were born.
"Wait, what's going on with my models?"
So model monitoring was born.
"What did you say you trained model Untitled023_updated2b_final.pkl on six months ago?"
So experiment tracking was born.
"This makes no sense. It worked on my machine."
So they started using containerisation.
"Wait, we need to do this every month? Why can't we just use Airflow for this? It worked so well in Act 1."
"Just because. It's not cool anymore. This is MLOps."
So more DAG tools were born, like Prefect, Flyte, Metaflow, DagsHub (and so on. Seriously, how many of these are there???)
Someone needed to sort this out, so the Machine Learning Engineer was born.
Act 2.5 - Confrontation II
(You thought this was it? We are just getting started)
But some questions can't be solved with Sklearn, so XGBoost was born.
But some questions can't be solved with XGBoost, so TensorFlow was born.
But no one outside of Google Brain understood that, so PyTorch was born.
"Wait, what do you mean I need a different computer for this?"
And NVIDIA shareholders were very happy.
"I thought I had enough data. I already have Big Data!"
"Yes, but not the right kind of data."
So labelling services were born.
"How much????"
So automated labelling was born.
"Surely I have enough data by now." (Narrator: They didn't...)
So synthetic data generators were born.
"What did you say you trained model Untitled023_updated3c_final.pkl on six months ago?"
So data version control was born.
"Ok, so I have the data. I have it versioned. I have the different kind of computer. What now?"
"What do you mean by 'You need to train these for three weeks straight on 156 GPUs'?"
So distributed compute-as-a-service frameworks were born.
And the Machine Learning Engineer dutifully worked day and night to connect all these services. Sometimes they succeeded, sometimes not.
And one day, a new company, up and coming, full of famous people, came up with something so revolutionary that everyone wanted it immediately.
So the Machine Learning Engineer was asked to redo all of the above. Once more, with feeling.
Act 3 - Solution
(to be continued)