Is a “JIRA” mindset for Data Science the way to execute Data Science projects? Can some structure be put in place for Data Science projects? Is there an Agile Manifesto for Data Science projects?
The Agile Manifesto for Data Science
Well… there isn’t an official manifesto for Data Science, but Russell Jurney published a good one in his book Agile Data Science 2.0. The seven principles captured in the book are:
- Iterate, iterate, iterate: tables, charts, reports, predictions.
- Ship intermediate output. Even failed experiments have output.
- Prototype experiments over implementing tasks.
- Integrate the tyrannical opinion of data in product management.
- Climb up and down the data-value pyramid as we work.
- Discover and pursue the critical path to a killer product.
- Get meta. Describe the process, not just the end-state.
Project Management for Data Science
Does this mean Agile is suited for Data Science? There are two sides to the argument. An interesting quote on this states:
The best you can do is experiment, as rapidly as possible. And then use the results of your experiment to decide what to experiment on next, and then gather the next set of results and so on. In other words you need to iterate.
We will get into more details about how Data Science projects can be managed in subsequent articles.
Are you using Agile frameworks for Data Science projects? Do share your thoughts on how it has been implemented in your organisation.
We do bring in a flavor of many things: JIRA complemented by customised fields such as risk level for stories, a risk profiling app, big picture and, of course, RAID logs. Even the workflow is tweaked, and the definition of DONE-DONE has mandatory artefacts and acceptance criteria. Data science ideas, if they are going to fail, have to “fail fast”, paving the way for a new set of iterations.
When developing a data science based app or product, the emphasis is on curating stories with developers and keeping things as simple and crisp as possible. Rule of thumb: don’t create a story with more than 5 story points. Each story should lead to a go/no-go decision for other stories in the backlog.
These are good points, Yogesh! Thanks for sharing.
I know many people argue that agile is not suited for Data Science, but having worked on Data Science projects with a lot of moving parts and tasks, here are some of the things that worked in my previous job:
1. We made sure that planning and prioritization were done very carefully at the start of each sprint. Even though we had so many technologies at our disposal, we used a tracker board with Post-its to create a prioritized list and moved them along the board in our stand-ups. We needed to ensure that the data science resources knew what they had to do to assist the devops resources, since they are enablers rather than doers and their tasks depend heavily on devops.
2. Clearly defining tasks with deliverables and timelines – a no-brainer.
3. Retrospectives and demos at the end of each sprint – This was important for us to ensure that there were no missing pieces, and it helped keep WIP to a minimum and reduce rework. This also helped us greatly in our requirements validation.