Introduction
A couple of years ago, I changed teams at work, moving from the team most experienced in building Machine Learning solutions to one eager to ship its first wave of models.
We, the “Data Science people”, wrote a brief document about the model and explained it to the team in a meeting. Soon, we started grabbing the data and fitting some models. After we shared the first results, still only as model metrics, someone cheered in a team retrospective: “We are close to having ML models!” I had to adjust the expectations, and we agreed to clarify the project scope for the team.
The Product Manager set up a meeting with everybody so we could write a PRD (Product Requirements Document) to define the project scope. That is when I had a revelation: I was amazed by how different people’s ideas were about what we would build. It was one of the best lessons I could have had on the importance of shared understanding.
After a confusing hour, I approached an Agilist to get some help. He recommended Jeff Patton’s User Story Mapping book [1]. The book is excellent. As I applied it to Data Science projects, I identified the parts most relevant to the field and developed others with the team.
In this material, I’ll share my current approach to Project Management for Data Science. I’ll keep updating, extending, and reviewing it.
Content summary
Why are we building it?
provides a couple of checks to make sure you really need to build it.
The team
comments on the roles of the people involved.
Relevant ML concepts
highlights what usually has an impact on project management.
Story Mapping
contains the core material: a group exercise that aligns the team’s expectations about what needs to be developed and helps the developers uncover its most relevant parts, making it possible to slice a large project into smaller deliveries.
Metrics
distinguishes model metrics from business metrics and provides guidance on how to use them during project development.
Research & Optimization
provides management tooling to avoid getting lost in the rabbit hole of improving ML models.
Communication
offers principles on how information should flow in the project and a suggestion on how to make it happen.
Strategic decisions
makes clear the critical moments when the project leader needs to push the team to make decisions.
Dysfunctionalities
contains a list of common anti-patterns.
Recurrent example
To illustrate the ideas, the recurring example uses a delivery company that wants to boost its referral program by offering discount coupons to customers and nudging them to invite someone to join the platform.
References
1. Patton, J., & Economy, P. (2014). User story mapping: Discover the whole story, build the right product. O’Reilly Media, Inc.