Unpopular Opinion: Agile is not only suitable for Data Science projects, but it is the only way to run one
Published on LinkedIn 2022-01-24
You can read many, many articles about how Agile/Scrum is unsuitable for Data Science projects. Of course, as a born contrarian, I must argue the opposite.
At my previous startup, we built high-end NLP solutions with human-in-the-loop and active learning components and sold them to hedge funds and investment banks. Our team of six (including me) had no NLP experience whatsoever, so we needed to get organised to overcome this.
Key to our success was figuring out how to make Agile work for us.
Reframe agility through John Boyd's OODA loop
The term "Agile" carries a lot of baggage, so one should rather focus on "agility". Instead of arguing about ceremonies, let's see how the OODA process applies to typical DS projects.
At the same time, I will point out why typical DS projects might struggle with building up agility.
OODA: Observe - Orient - Decide - Act
I like the steps of the OODA loop better than Agile's because of their relentless, nowhere-to-hide names, detached from software engineering terminology. It's difficult to obscure such simple terms with flowery language and pretend.
Or, in the famous words of Master Yoda: "Do. Or do not. There is no try."
Observe - Agile terms: Feedback / Learn
Typical DS projects spend ages in EDA/POC state "researching" without real feedback. This is a wasteful state in any dynamic environment, especially when you try to build a product from user interaction (which most DS projects do).
OODA's Observe step by definition only applies to solutions in production, and if you want agility, you need to get there.
You need to expose your proposed solution outside of your team ASAP! The audience can be internal users, design partners or selected customers with a major pain point. Without someone outside of your team, you will never know if you are solving the right problem.
Orient - Agile terms: Analyse / Design
Typical DS projects struggle with designing high-quality options because of a lack of software architecture skills. At least analysis is usually not a problem, since DS is a quantitative field.
We invested in a robust architecture that always enabled us to come up with a list of potential solutions while not compromising the overall product. A critical aspect of this was training everyone to use the right language and concepts when proposing ideas, and to suggest "cheap" and compatible changes rather than full-stack rewrites, which increased productivity.
Decide - Agile terms: Develop / Test
Our team spent a significant amount of energy training each team member to write good code and maintain a good codebase.
This enabled us to create new features and react to newly observed aspects of the product at production-grade quality. Given that the Observe phase requires you to be in production early on, this is crucial to success.
Act - Agile terms: Deploy / Release
Making a release decision in an abstract environment is extremely difficult. Without real impact, all your choices are theoretical. Typical DS projects usually fall into "Analysis Paralysis" at this point. To reduce risk (or just to postpone the decision), they require more studies that are increasingly difficult to do in an artificial environment.
A productionised MVP and high-quality options make this step an easy one. Just show them to your customers. Are you worried about risks? Design a release cadence with an internal -> design partner -> everyone schedule and canary/A/B-test deployments. Rather than worrying about risks, build a process to mitigate them.
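To make the internal -> design partner -> everyone cadence with a canary slice concrete, here is a minimal Python sketch. It does not describe any particular production setup: the stage names, the `User` type, the salt and the hash-based bucketing are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical stage names mirroring the internal -> design partner -> everyone cadence.
STAGES = ["internal", "design_partner", "everyone"]


@dataclass(frozen=True)
class User:
    user_id: str
    group: str  # one of STAGES; "everyone" stands for a regular customer


def bucket(user_id: str, salt: str = "release-1") -> float:
    """Map a user id to a stable number in [0, 1).

    Hashing (rather than random sampling) keeps a user's canary
    membership fixed across requests. The salt is an assumed
    per-release string so each release draws a different slice.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    return int(digest[:8], 16) / 16**8


def sees_new_model(user: User, current_stage: str, canary_fraction: float) -> bool:
    """Return True if this user should be served the new model."""
    user_rank = STAGES.index(user.group)
    stage_rank = STAGES.index(current_stage)
    if user_rank < stage_rank:
        # Earlier groups in the cadence are already fully rolled out.
        return True
    if user_rank > stage_rank:
        # Later groups have not been reached yet.
        return False
    # Within the current stage, expose only the canary slice.
    return bucket(user.user_id) < canary_fraction
```

Widening the rollout then becomes a matter of raising `canary_fraction` and, once it reaches 1.0, moving `current_stage` to the next group. Rolling back is the same operation in reverse.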
You cannot adopt a paradigm from another industry without checking whether you have the right ingredients. Agile/Scrum is often used in DS teams without addressing two key preconditions:
Training your team in software craft and building the right architecture
Enabling early productionisation (by whatever means)
DS's lack of productivity is not rooted in project management and cannot be resolved by cargo-culting ceremonies. But with the right preparation, an Agile process is the only true way forward.