Learn how Umaku uses RAG, vector databases, and sequential AI agents to deliver contextual intelligence across code, tickets, and teams.

In the current landscape of AI, building a “Chat with your PDF” tool is the “Hello World” of Large Language Models (LLMs). But applying AI to Project Management (PM) requires solving a much harder problem: Contextual Intelligence.
A standard LLM can tell you what a piece of code does. But it cannot tell you if that code satisfies the Acceptance Criteria written three weeks ago by a Product Manager, or if it conflicts with the DevOps compliance rules defined in the Project Charter.
At Umaku, we realized that to build a truly “Agentic” PM platform—one that doesn’t just store tickets but evaluates them—we had to move beyond simple keyword search. We had to build a system that understands the relationships between three distinct data planes: the code, the tickets, and the team’s communication.
This blog explores the engineering architecture behind Umaku, specifically how we built a Massive Vector Database (VDB) and a Sequential Multi-Agent System to give our AI the “Context” it needs to act as a Project Manager, Tech Lead, and QA Engineer simultaneously.
The core challenge of an AI Project Manager is Data Fragmentation. Your code is in GitHub, your requirements are in the Board, and your discussions are in Slack. To an LLM, these are disconnected silos.
To solve this, Umaku uses a unified Context Ingestion Pipeline. We do not query these services in real-time for every request (which would be slow and context-limited). Instead, we ingest, chunk, and embed this data into a centralized Vector Database.
We treat the project as a living organism where every action is a “memory.”

Figure 1: The Umaku Context Ingestion Pipeline.
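A minimal sketch of this kind of ingestion pipeline, in plain Python with a stubbed embedding function (the names and the character-frequency "embedding" are illustrative assumptions, not Umaku's actual implementation — a real pipeline would call an embedding model and split on semantic boundaries):

```python
from dataclasses import dataclass, field

# Hypothetical sketch: chunk a document, embed each chunk (stubbed),
# and store it with source metadata so later retrieval can filter
# by origin (GitHub, Board, Slack).

@dataclass
class Chunk:
    text: str
    source: str               # e.g. "github", "board", "slack"
    embedding: list = field(default_factory=list)

def chunk_text(text: str, size: int = 200) -> list[str]:
    """Naive fixed-size chunking; real systems split on semantic boundaries."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str) -> list[float]:
    """Stand-in embedding: a character-frequency vector. A real pipeline
    would call an embedding model here."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - 97] += 1.0
    return vec

def ingest(doc: str, source: str, store: list[Chunk]) -> None:
    for piece in chunk_text(doc):
        store.append(Chunk(piece, source, embed(piece)))

store: list[Chunk] = []
ingest("def charge(card): ...", "github", store)
ingest("AC: user can pay by card", "board", store)
print(len(store), store[0].source)  # 2 github
```

The key design point is that every chunk carries its source metadata, so the retrieval layer can treat code, tickets, and chat as one searchable memory while still knowing where each memory came from.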
Ingesting data is easy; retrieving the right data is hard. If a user asks, “Why is the payment module failing?”, a naive RAG system might retrieve every document containing the word “payment.”
Umaku instead employs a Hierarchical RAG Strategy: a query is first scoped to the relevant level of the project hierarchy (Charter, Sprint, Ticket, Code) using metadata, and only then ranked by semantic similarity.
This allows our agents to “see” that Ticket #305 (Implement Stripe) is linked to src/payments/stripe_adapter.py, bridging the gap between plan and reality.
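The filter-then-rank idea can be sketched in a few lines (the document records and module tags below are made up for illustration):

```python
import math

# Sketch of hierarchical retrieval: first narrow candidates by metadata
# (here, a module tag), then rank only the survivors by cosine similarity
# instead of searching the whole corpus.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    {"id": "ticket-305",        "module": "payments", "vec": [1.0, 0.0]},
    {"id": "stripe_adapter.py", "module": "payments", "vec": [0.9, 0.1]},
    {"id": "login.py",          "module": "auth",     "vec": [0.0, 1.0]},
]

def hierarchical_search(query_vec, module, k=2):
    scoped = [d for d in docs if d["module"] == module]   # metadata filter first
    ranked = sorted(scoped, key=lambda d: cosine(query_vec, d["vec"]),
                    reverse=True)                          # semantic rank second
    return [d["id"] for d in ranked[:k]]

print(hierarchical_search([1.0, 0.0], "payments"))
```

Because the metadata filter runs first, a question about the payment module can never surface an unrelated document that merely mentions the word “payment.”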
Most PM tools rely on users to report status. If a developer moves a ticket to “Done,” the tool accepts it as truth. Umaku flips this model. We treat a ticket status as a claim that needs verification.
When a developer works on a ticket, they attach their Git commits or PRs. When the “Evaluate” agent is triggered, it performs a Cross-Modal Verification, comparing the ticket’s Acceptance Criteria (natural language) against the attached code changes (the diff).
For example, if the ticket requires “User must be able to reset password via email,” but the code only contains a UI button without the backend logic, the Agent flags this discrepancy.

Figure 2: The Cross-Modal Truth Verification Loop.
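In Umaku the comparison is performed by an LLM; in the sketch below a crude keyword heuristic stands in as the judge so the control flow (claim, evidence, flag) stays visible. All names and data are illustrative:

```python
# Stand-in for cross-modal verification: a criterion is flagged when any
# of its keywords has no supporting evidence in the diff. A real system
# would ask an LLM whether the diff satisfies each criterion.

def unmet_criteria(acceptance_criteria: list[str], diff: str) -> list[str]:
    """Return the acceptance criteria not evidenced by the code change."""
    text = diff.lower()
    return [ac for ac in acceptance_criteria
            if not all(word in text for word in ac.lower().split())]

criteria = ["reset button", "email reset link"]
diff = "+ <Button onClick={openResetModal}>Reset</Button>"

flags = unmet_criteria(criteria, diff)
print(flags)  # ['email reset link']
```

The diff contains the UI button but nothing about email delivery, so only the email criterion is flagged — the same discrepancy described above, where a ticket’s status claim is not backed by the code.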
This architecture serves as the foundation for our most advanced feature: The Automated Sprint Retrospective, where multiple agents work in sequence to audit the entire project.
Ingesting data into a Vector Database is only the first step. The real power of Umaku lies in how we act on that data.
In traditional RAG applications, you have one agent answering one user query. But evaluating a completed Agile Sprint—which contains dozens of tickets, hundreds of commits, and thousands of lines of code—is too complex for a single prompt context window. It requires a Sequential Multi-Agent Architecture.
We designed the Sprint Report mechanism not as a single “analysis” step, but as a relay race between specialized agents. We run these agents sequentially because the output of one agent often provides necessary context for the next.
When a Project Manager clicks “Complete Sprint,” the following pipeline is triggered:

Figure 3: The Sequential Multi-Agent Relay Architecture.
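The relay pattern itself is simple to express: each stage receives the accumulated context of the stages before it. The agent names and the ticket data below are illustrative, not Umaku’s actual stages:

```python
# Sketch of a sequential multi-agent relay: the output of one agent is
# the input of the next, so order matters and no single prompt has to
# hold the entire sprint in its context window.

def collect_tickets(ctx):
    ctx["tickets"] = [{"id": 305, "commits": 3},
                      {"id": 306, "commits": 0}]
    return ctx

def verify_code(ctx):
    # Consumes the previous agent's output: a ticket counts as verified
    # only if it has attached code evidence.
    ctx["verified"] = [t for t in ctx["tickets"] if t["commits"] > 0]
    return ctx

def write_report(ctx):
    # Consumes both earlier outputs to produce the final summary.
    ctx["report"] = f"{len(ctx['verified'])} of {len(ctx['tickets'])} tickets verified"
    return ctx

def run_pipeline(stages, ctx=None):
    ctx = ctx or {}
    for stage in stages:          # strictly sequential by design
        ctx = stage(ctx)
    return ctx

result = run_pipeline([collect_tickets, verify_code, write_report])
print(result["report"])  # 1 of 2 tickets verified
```

Running the stages sequentially rather than in parallel is the point: the verifier cannot run before the collector, and the report writer depends on both.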
One of the hardest parts of Agile management is deciding what to do with a ticket that is mostly done but not fully done. Usually, this results in the entire ticket being dragged into the next sprint, messing up velocity metrics.
Umaku introduces an AI-Driven Descoping Agent that automates this granularity.
Developers often mark a ticket as “Done” when the coding is finished, even if testing or documentation is missing. If the system accepts this, the Sprint Report becomes inaccurate.
When the agents detect a discrepancy between the User Story Requirements and the delivered Code, the Descoping Agent intervenes. It doesn’t just reject the ticket; it proposes a constructive split.
This ensures that the Velocity Chart reflects actual value delivered, rather than a binary “All or Nothing.”

Figure 4: The AI Descoping Logic Flow.
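The split itself can be sketched as a pure function over a ticket’s acceptance criteria (the `Ticket` shape and field names here are assumptions for illustration, not Umaku’s schema):

```python
from dataclasses import dataclass

# Illustrative descoping logic: given which acceptance criteria were met,
# split the ticket so the delivered work is credited this sprint and the
# remainder rolls over as a new, smaller ticket.

@dataclass
class Ticket:
    title: str
    criteria: dict  # criterion -> met?

def propose_split(t: Ticket):
    done = [c for c, met in t.criteria.items() if met]
    rest = [c for c, met in t.criteria.items() if not met]
    if not done or not rest:
        return None  # fully done or fully undone: nothing to split
    return (Ticket(f"{t.title} (delivered)",  {c: True for c in done}),
            Ticket(f"{t.title} (carry-over)", {c: False for c in rest}))

ticket = Ticket("Password reset", {"UI button": True, "Email backend": False})
delivered, carry = propose_split(ticket)
print(delivered.title, "|", carry.title)
```

Because the split credits only the delivered criteria, the velocity chart reflects partial value instead of the binary “all or nothing.”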
While code is deterministic, software development is fundamentally a human activity; a project management tool that understands code but ignores the team is incomplete. A key differentiator of Umaku is its ability to understand the “Human Context” of a project, applying Contextual Intelligence to the people building the software.
If a Project Manager asks, “How is Ahmed performing this month?”, a standard RAG system might just look for tickets assigned to Ahmed. Umaku goes deeper by utilizing a Long-Term Memory module for team evaluation.
Traditional metrics like “lines of code written” or “tickets closed” are notoriously bad at measuring productivity. They encourage quantity over quality.
Umaku employs a Team Evaluation Agent that runs weekly to build a comprehensive profile of each contributor. This isn’t a simple counter; it is a multi-modal analysis engine.
The agent aggregates data from two primary streams: code activity from Git (commits, PRs, refactors) and communication activity from Slack.
The agent synthesizes this into a narrative report. Instead of saying “Ahmed closed 5 tickets,” it might report:
“Ahmed focused on high-complexity backend tasks this week (Tickets #201, #204). While his ticket volume was lower, his code contribution involved significant refactoring of the authentication module. He was also highly active on Slack, assisting junior developers with environment setup.”
This allows Project Managers to evaluate performance based on value, not just velocity.

Figure 5: The Multi-Modal Collaborator Profiling Engine.
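A toy version of that aggregation step, merging the two event streams into one profile per contributor (the event shapes and metric names are assumptions; the narrative-writing LLM step is omitted):

```python
from collections import defaultdict

# Sketch of multi-stream profiling: fold Git and Slack events into a
# per-contributor profile that a reporting agent could then turn into
# a narrative instead of a raw ticket count.

git_events = [
    {"who": "ahmed", "kind": "refactor", "lines": 400},
    {"who": "ahmed", "kind": "feature",  "lines": 120},
]
slack_events = [
    {"who": "ahmed", "kind": "help_thread"},
]

def build_profiles(git, slack):
    profiles = defaultdict(lambda: {"code_lines": 0, "refactors": 0, "assists": 0})
    for e in git:
        p = profiles[e["who"]]
        p["code_lines"] += e["lines"]
        if e["kind"] == "refactor":
            p["refactors"] += 1
    for e in slack:
        if e["kind"] == "help_thread":
            profiles[e["who"]]["assists"] += 1
    return dict(profiles)

profiles = build_profiles(git_events, slack_events)
print(profiles["ahmed"])
```

Note that “assists” and “refactors” carry the qualitative signal the narrative report draws on — a contributor with few closed tickets can still show heavy refactoring and mentoring activity.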
The interface for all this intelligence is the Umaku Chatbot. But this is not a standard “Chat with your Data” bot. It is Entity-Aware.
When a user asks: “What did Ahmed work on last week, and did he use React Native?”, the bot must perform a complex chain of operations: resolve the entity “Ahmed” to a collaborator, filter that collaborator’s contributions to last week using time metadata, and only then search the content of those contributions for React Native usage.
This level of detail is only possible because we index the meta-data (Who, When, What) alongside the content (The Code).

Figure 6: The Context-Aware Query Resolution Path.
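A compressed sketch of that resolution path, with a hand-rolled index whose layout is an assumption for illustration:

```python
from datetime import date

# Entity-aware resolution: filter by metadata (who, when) first, then
# inspect the content (the code itself) for the technology in question.

index = [
    {"author": "ahmed", "day": date(2024, 5, 6),
     "path": "app/Screen.tsx", "content": "import { View } from 'react-native'"},
    {"author": "sara", "day": date(2024, 5, 7),
     "path": "api/auth.py", "content": "def login(): ..."},
]

def answer(author, start, end, keyword):
    hits = [e for e in index
            if e["author"] == author and start <= e["day"] <= end]  # metadata pass
    used = any(keyword in e["content"] for e in hits)               # content pass
    return [e["path"] for e in hits], used

paths, used_rn = answer("ahmed", date(2024, 5, 5), date(2024, 5, 11), "react-native")
print(paths, used_rn)
```

The two-pass structure mirrors the point above: the who/when lives in metadata, the what lives in the indexed content, and the question is only answerable because both are stored side by side.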
The shift from standard project management tools to Umaku represents a fundamental change in how we view software delivery. We are moving from Passive Data Entry—where humans have to constantly update the tool—to Active Intelligence, where the tool observes, analyzes, and guides the humans.
By building a system that understands the semantic relationships between a Project Charter, a Sprint Goal, a User Story, and a Git Commit, we have created more than just a dashboard. We have created a Context Engine that ensures every line of code written is a step towards the project’s actual goal.
Umaku doesn’t just manage the project; it understands it.