Scope Drift in AI Projects: How AI Agents Prevent Misalignment and Scope Creep
AI teams outgrew notebooks and generic workflows. Learn why AI delivery breaks in production and how Umaku was built to fix it.

Over the last few years, AI teams crossed a quiet threshold. We're no longer just experimenting. We are shipping systems that have to run in production, integrate with real products, and serve real users.
But the workflows AI teams rely on were never designed for this reality. This is the story of how we overcame that wall, and why we built Umaku.

Journey from Omdena to Umaku
Omdena started in 2019 as a collaborative platform to build AI models for social good. Hundreds of contributors explored models and ideas together, collaborating with leading organizations such as WFP, UNHCR, UNICEF, and Save the Children.
That approach worked extremely well, for research.
But as expectations shifted, organizations didn't just want AI models for research. They wanted demonstrable MVPs: systems with real architecture, frontends, and backends.
To deliver MVPs, on top of the collaborative platform, we formed smaller teams from the top 1–2% of Omdena contributors, people who combined data science depth with engineering judgment.
This approach worked well for the next few years until we had to deliver production-ready, scalable AI products.

AI Product Development Workflow
We were managing federated, remote AI teams spread across countries, time zones, and skill levels (data scientists, ML engineers, product thinkers, domain experts) at scale, while our clients were asking us to deliver production-ready, scalable AI products.
This transition happened fast for us, often without new tools or processes to support it. We learned that AI delivery introduces failure modes that traditional software and data science workflows were never designed to handle.
These weren't beginner mistakes. They were symptoms of a workflow that hadn't caught up with what AI teams were now expected to deliver.

Software Development vs AI Development
We're no longer experimenting in isolation; we're delivering real, production-grade systems.
But most of our tools still assumed we were running isolated experiments, not shipping products.
That mismatch is where everything starts to break.

Representation of Copilot Reviewing Code
We decided to use AI copilots and automated code reviewers, but their feedback, generated without any project context, created more noise than signal.
The insight: an AI reviewing a PR without knowing the project's maturity is like reviewing a book by reading one random paragraph. Technically impressive. Practically useless.

Kanban Boards and Code Live in Parallel Universes
Another daily frustration: project management tools don't speak code. Kanban boards know which task is "in progress", who is assigned, and when something is "done", but they don't know what the code actually does or whether it matches the ticket's intent.
Meanwhile, GitHub knows the code, but has no idea why it exists.
So teams spend their time translating between the two worlds: ticket to code, and code back to ticket.
That translation tax compounds fast in federated teams.
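One common (if crude) way teams pay down that translation tax is to reference ticket IDs in commit messages and index them afterwards. A minimal sketch, assuming a hypothetical "UM-123"-style ticket key (our naming for illustration, not a real Umaku convention):

```python
import re

# Matches ticket keys like "UM-101" or "DATA-42" (hypothetical format).
TICKET_RE = re.compile(r"\b[A-Z]{2,}-\d+\b")

def tickets_for_commits(commit_messages):
    """Map each referenced ticket ID to the commit messages that mention it."""
    index = {}
    for msg in commit_messages:
        for ticket in TICKET_RE.findall(msg):
            index.setdefault(ticket, []).append(msg)
    return index

commits = [
    "UM-101: add baseline model notebook",
    "fix data loader path (UM-101)",
    "UM-207: wire up FastAPI backend",
]
index = tickets_for_commits(commits)
print(sorted(index))         # ['UM-101', 'UM-207']
print(len(index["UM-101"]))  # 2
```

The limits of this approach are exactly the point of the paragraph above: the mapping is only as good as people's commit discipline, and it still captures *which* ticket a change touches, not *why* the change was made.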
Scrum and Kanban weren't the enemy. They just weren't built for AI work. Scrum assumes "known-knowns," while AI is 80% "known-unknowns." You can't "sprint" toward model accuracy; you can only "explore."
As AI engineers, we found that most Scrum ceremonies added friction, while only a few elements actually helped.

Fragile Jupyter Notebooks
And then there's Jupyter: the backbone of AI work, and also one of the hardest artifacts to reason about automatically. We repeatedly ran into notebooks whose results depended on hidden state and execution order, and that no automated reviewer could reason about reliably.
Most systems either oversimplify notebooks or avoid them altogether. Our experience says: "The Ticket" and "The Notebook" is where AI projects go to die.
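One concrete example of why notebooks are hard to reason about: the `.ipynb` file records an `execution_count` per code cell, so a notebook that was not run top to bottom is detectable from the raw JSON alone. A minimal sketch of such a check (our own illustration, not an Umaku feature):

```python
def out_of_order_cells(notebook_json):
    """Return indices of code cells whose execution_count is lower than an
    earlier cell's, i.e. evidence the notebook was not run top to bottom."""
    suspects, highest_seen = [], 0
    for i, cell in enumerate(notebook_json["cells"]):
        if cell.get("cell_type") != "code":
            continue
        count = cell.get("execution_count")
        if count is None:  # cell was never executed
            continue
        if count < highest_seen:
            suspects.append(i)
        highest_seen = max(highest_seen, count)
    return suspects

# A minimal fake notebook (nbformat-4-shaped): cell 0 was run *after* cell 1.
nb = {"cells": [
    {"cell_type": "code", "execution_count": 5, "source": "x = 1"},
    {"cell_type": "code", "execution_count": 2, "source": "print(x)"},
]}
print(out_of_order_cells(nb))  # [1]
```

Even this simple signal is lossy: a clean, monotonic execution order says nothing about hidden state left over from deleted cells, which is part of why most systems give up on notebooks entirely.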

Project Overview in Umaku
Our goal was to keep only what truly benefited us as AI engineers, and we dropped the rest. No performative planning. No story-point theater. No boards that look busy but explain nothing. We didn't recreate Scrum boards or Kanban flows. Instead, we needed something that understands the project charter, the sprint goal, the intent behind each ticket, and how code and notebooks change against them.
Context-aware agentic feedback means the agent does not operate on artifacts in isolation. It reasons over the project charter, the explicit business objectives, and the current execution phase: model exploration, validation, hardening, or packaging for production. It ingests tickets, ticket comments, design decisions, and historical discussion to reconstruct why the work exists and what constraints shaped it.
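In data terms, the context bundle described above might look like the following. This is a sketch under our own naming assumptions, not Umaku's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class ReviewContext:
    """Illustrative bundle of everything the post says the agent reasons over,
    instead of seeing the diff alone. All names here are hypothetical."""
    charter: str                 # project charter summary
    objectives: list[str]        # explicit business objectives
    phase: str                   # "exploration" | "validation" | "hardening" | "packaging"
    tickets: list[dict] = field(default_factory=list)  # tickets, comments, decisions

ctx = ReviewContext(
    charter="Crop-yield prediction MVP for an NGO partner",
    objectives=["ship a demo by Q3", "run on commodity hardware"],
    phase="exploration",
    tickets=[{"id": "UM-101", "comments": ["baseline first, tune later"]}],
)
print(ctx.phase)  # exploration
```

The design point is that a reviewer receiving a `ReviewContext` alongside the diff can ask "does this change serve the charter at this phase?", which a diff-only copilot structurally cannot.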
Code Quality Feedback in Umaku

Code Snippet Comparison in Umaku

Bug Finder Feedback in Umaku

Overall Agentic Feedback Dashboard in Umaku
This changes the nature of feedback entirely. Instead of flagging patterns blindly, the agent evaluates decisions relative to project goals and the delivery stage. A modeling shortcut during early experimentation is treated differently from the same shortcut during packaging. A hard-coded path is understood as a prototype artifact, or identified as a release-blocking risk, based on context, not heuristics.
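The phase-dependent judgment described above can be pictured as a severity table keyed on delivery phase. This is an illustrative rule of our own, not Umaku's real policy:

```python
# Hypothetical finding names and phase names, chosen for illustration.
RISKY_FINDINGS = {"hardcoded_path", "modeling_shortcut"}

def severity(finding: str, phase: str) -> str:
    """Map the same finding to different severities depending on phase:
    early phases tolerate prototype artifacts, late phases block on them."""
    if finding not in RISKY_FINDINGS:
        return "info"
    return {
        "exploration": "info",      # expected prototype artifact
        "validation":  "warning",   # worth noting before results are trusted
        "hardening":   "error",     # should be fixed now
        "packaging":   "blocker",   # release-blocking risk
    }[phase]

print(severity("hardcoded_path", "exploration"))  # info
print(severity("hardcoded_path", "packaging"))    # blocker
```

A static linter encodes only the left-hand column (the finding); the claim of this section is that the right-hand column (the phase) is what makes the feedback useful.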
Most teams today assemble this workflow from disconnected tools: sprint boards to track tasks, ticketing systems to capture intent, bug trackers to log failures, and AI copilots that analyze code without access to any of that context. Each handoff strips meaning away. By the time feedback is generated, the agent knows what changed, but not why.
Context is the primary object, not an afterthought. The agent is charter-aware, ticket-aware, and discussion-aware. It understands how decisions evolve over time and how expectations shift as a project moves from research-grade notebooks to production-ready systems.
We named our platform Umaku. It comes from the Japanese word うまく (written 上手く, 巧く, or 旨く), meaning "skillful". Not generic. Not stylistic. But grounded in the intent of the system being built.
Umaku exists because we lived this pain ourselves.
Umaku sits between your task board and your Jupyter notebooks, ensuring the AI agent reviewing your PR knows whether you're in "research mode" or "production mode".
In short, Umaku is a single platform where planning, code, and notebooks share the same context.
We didn't design it in theory. We built it to survive reality.
And now, we're opening it up, because we know we're not the only ones who felt this gap, and because we're moving beyond the era of the "Experimental Notebook" and into the era of the "AI Product."
It's time our workflows caught up. Welcome to Umaku.
Please sign up for a trial account.