Deliberate Machine Learning

Unpopular Opinion: Agile is not only suitable for Data Science projects, but it is the only way to run one

Published on LinkedIn 2022-01-24

Laszlo Sragner
Jan 24, 2022

You can read many, many articles about how Agile/Scrum is unsuitable for Data Science projects. Of course, as a born contrarian, I must argue for the opposite.

At my previous startup, we built high-end NLP solutions with human-in-the-loop and active learning components and sold them to hedge funds and investment banks. Our team of six (including me) had no NLP experience whatsoever, so we needed to get organised to overcome this. 

Key to our success was figuring out how to make Agile work for us.

Reframe agility through John Boyd's OODA-loop

The term "Agile" carries a lot of baggage, so one should rather focus on "agility". Instead of arguing about ceremonies, let's see how the OODA process applies to typical DS projects.

At the same time, I will point out why typical DS projects might struggle with building up agility.

OODA: Observe - Orient - Decide - Act

I like the steps of the OODA loop better than Agile's because of their relentless, nowhere-to-hide names, detached from software engineering terminology. It's difficult to obscure such simple terms with flowery language and pretend.

Or, in the famous words of Master Yoda: "Do. Or do not. There is no try."

Photo by Dorian Mongel on Unsplash

Observe - Agile terms: Feedback / Learn

Typical DS projects spend ages in EDA/POC state "researching" without real feedback. This is a wasteful state in any dynamic environment, especially when you try to build a product from user interaction (which most DS projects do).

OODA's Observe step, by definition, only applies to solutions in production, and if you want agility, you need to get there.

You need to expose your proposed solution outside of your team ASAP! They can be internal users, design partners or selected customers with a major pain point. You need someone outside of your team, or you will never know if you are solving the right problem.

Orient - Agile terms: Analyse / Design

Typical DS projects struggle with designing high-quality options because of a lack of software architecture skills. At least, since DS is a quantitative field, analysis is usually not a problem.

We invested in a robust architecture that always enabled us to come up with a list of potential solutions while not compromising the overall product. A critical aspect of this was training everyone to use the right language and concepts when proposing ideas. This increased productivity because people suggested "cheap", compatible changes rather than full-stack rewrites.

Decide - Agile terms: Develop / Test

Our team spent a significant amount of energy training every team member to write good code and maintain a good codebase.

This enabled us to create new features and react to newly observed aspects of the product at production-grade quality. Given that the Observe phase requires you to be in production early on, this is crucial to success.

Act - Agile terms: Deploy / Release

Making a release decision in an abstract environment is extremely difficult. Without real impact, all your choices are theoretical. Typical DS projects usually fall into "Analysis Paralysis" at this point. To reduce risk (or just to postpone the decision), they require more studies that are increasingly difficult to do in an artificial environment. 

A productionised MVP and high-quality options make this step an easy one. Just show them to your customers. Are you worried about risks? Design a release cadence with an internal -> design partner -> everyone schedule, using canary or A/B-test deployments. Rather than worrying about risks, build a process to mitigate them.
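To make the staged release idea concrete, here is a minimal Python sketch of a rollout gate. It is an illustration under stated assumptions, not what we ran in production: the stage names, the `bucket` and `use_new_model` functions, the `salt` value, and the canary fraction are all hypothetical.

```python
import hashlib

# Hypothetical rollout stages, mirroring the
# internal -> design partner -> everyone schedule.
STAGES = ["internal", "design_partner", "everyone"]


def bucket(user_id: str, salt: str = "new-model-v2") -> float:
    """Map a user id to a stable number in [0, 1).

    Hashing (rather than random sampling) keeps each user's assignment
    the same across requests. The salt is an assumed experiment name.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    return int(digest[:8], 16) / 2**32


def use_new_model(user_id: str, user_group: str,
                  current_stage: str, canary_fraction: float) -> bool:
    """Return True if this user should be served the new model."""
    # Groups later in the schedule than the current stage stay on the old model.
    if STAGES.index(user_group) > STAGES.index(current_stage):
        return False
    # Within the unlocked groups, only a canary fraction sees the new model.
    return bucket(user_id) < canary_fraction


if __name__ == "__main__":
    # During the design partner stage, expose roughly 20% of unlocked users.
    for uid, group in [("alice", "internal"), ("bob", "design_partner"),
                       ("carol", "everyone")]:
        print(uid, group, use_new_model(uid, group, "design_partner", 0.2))
```

Widening the release then means changing two values (`current_stage` and `canary_fraction`) rather than redeploying, and rolling back means setting the fraction to zero.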

Conclusion

You cannot adopt a paradigm from another industry without checking whether you have the right ingredients. Agile/Scrum is often used in DS teams without addressing two key preconditions:

  • Training your team in software craft and building the right architecture

  • Enabling early productionisation (by whatever means)

DS's lack of productivity is not rooted in project management and cannot be resolved by cargo-culting ceremonies. But with the right preparation, an Agile process is the only true way forward.
