Designing Agent-Native Systems: A Deep Dive into StudyPal
The transition from traditional LLM applications (like basic chat interfaces) to agent-native systems represents a shift from static prompt-response templates to platforms built around autonomous, goal-driven execution loops.
In this article, I want to take a technical look under the hood of StudyPal (a project in active development), analyzing how we built a two-layer agent model designed for education: balancing light, single-function Tools with orchestrator-driven Capabilities.
The Paradigm: Two-Layer Architecture
An agent-native system requires flexibility. If you equip an LLM with fifty distinct tools, the context window fills up with schema definitions, the reasoning overhead increases, and the likelihood of tool-selection hallucination spikes.
To solve this, we partitioned StudyPal into two distinct layers:
- Level 1: Tools — Lightweight, stateless, single-function tools that the LLM calls on-demand during a chat session (e.g., knowledge retrieval, web searches, paper retrieval).
- Level 2: Capabilities — Multi-step, stateful pipelines that take control of the execution loop (e.g., deep math reasoning, curriculum creation, and question generation).
             ┌──────────────────────┐
             │   ChatOrchestrator   │
             └──────────┬───────────┘
                        │
        ┌───────────────┴───────────────┐
        ▼                               ▼
┌────────────────┐            ┌────────────────────┐
│  ToolRegistry  │            │ CapabilityRegistry │
│  (Stateless)   │            │     (Stateful)     │
└────────────────┘            └────────────────────┘
  - RAG KB                      - deep_solve
  - Web Search                  - deep_question
  - Paper Search                - math_animator
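To make the split concrete, here is a minimal sketch of how an orchestrator might route a request between the two layers. The class shape, the `route` method, and its `capability` and `tool_calls` arguments are assumptions made for illustration; they are not StudyPal's actual API.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List, Optional

# Hypothetical type aliases: a tool is a stateless async function,
# a capability is an async pipeline that owns the whole turn.
ToolFn = Callable[[Dict[str, Any]], Awaitable[Any]]
CapabilityFn = Callable[[str], Awaitable[str]]


class ChatOrchestrator:
    """Illustrative router: capabilities take over, tools are called on demand."""

    def __init__(self) -> None:
        self.tools: Dict[str, ToolFn] = {}
        self.capabilities: Dict[str, CapabilityFn] = {}

    async def route(self, message: str, capability: Optional[str] = None,
                    tool_calls: Optional[List[str]] = None) -> str:
        # Level 2: a selected capability takes control of the execution loop.
        if capability is not None:
            return await self.capabilities[capability](message)
        # Level 1: otherwise run the requested stateless tools and
        # fold their output into an ordinary chat reply.
        results = [await self.tools[name]({"query": message})
                   for name in (tool_calls or [])]
        return f"chat reply using {len(results)} tool result(s)"


async def web_search(args: Dict[str, Any]) -> str:
    return f"results for {args['query']}"  # stand-in for a real search call


async def deep_solve(message: str) -> str:
    return f"deep_solve handled: {message}"  # stand-in for a multi-step pipeline


orchestrator = ChatOrchestrator()
orchestrator.tools["web_search"] = web_search
orchestrator.capabilities["deep_solve"] = deep_solve
```

Calling `asyncio.run(orchestrator.route("x^2 = 4", capability="deep_solve"))` hands the whole turn to the capability, while a call without `capability` stays in the chat path and only touches the requested tools.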
Level 1: The Stateless Tool Registry
Stateless tools are lightweight functions registered with standard JSON schemas. When the orchestrator detects that a user request needs factual context, it selects the relevant tools and executes them.
A prime example is the RAG Knowledge Base Tool. It connects to a vectorized knowledge base (using document chunking and vector search), pulls the most relevant fragments, and appends them to the LLM's context.
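The retrieval flow can be illustrated with a toy, dependency-free sketch. It assumes fixed-size word chunks and a bag-of-words cosine similarity standing in for a real embedding model and vector store. All function names here are hypothetical and do not reflect StudyPal's implementation.

```python
import math
import re
from collections import Counter
from typing import List


def chunk_text(text: str, chunk_size: int = 8) -> List[str]:
    """Split a document into chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a real system would use a model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: List[str], top_k: int = 2) -> List[str]:
    """Return the `top_k` chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, fragments: List[str]) -> str:
    """Append the retrieved fragments to the context sent to the LLM."""
    context = "\n".join(f"- {f}" for f in fragments)
    return f"Context:\n{context}\n\nQuestion: {query}"
```

In a production setup the `embed` and `retrieve` steps would typically be replaced by a learned embedding model and an approximate nearest-neighbor index, but the chunk, rank, and append shape stays the same.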
Here is a simplified Python representation of our tool interface protocol:
from abc import ABC, abstractmethod
from typing import Any, Dict


class BaseTool(ABC):
    @property
    @abstractmethod
    def name(self) -> str:
        pass

    @property
    @abstractmethod
    def description(self) -> str:
        pass

    @abstractmethod
    async def execute(self, args: Dict[str, Any]) -> Any:
        pass
By enforcing a strict interface, adding a new tool, such as one that searches arXiv for papers or runs sandboxed Python code, is simply a matter of implementing BaseTool and registering it in the ToolRegistry.
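As a rough illustration of that workflow, the sketch below repeats the interface, implements a toy `WordCountTool`, and registers it. The `ToolRegistry` methods shown here (`register`, `schemas`, `call`) are assumptions made for this example; StudyPal's real registry may look different.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseTool(ABC):
    """Same interface as shown earlier, repeated so the sketch runs on its own."""

    @property
    @abstractmethod
    def name(self) -> str: ...

    @property
    @abstractmethod
    def description(self) -> str: ...

    @abstractmethod
    async def execute(self, args: Dict[str, Any]) -> Any: ...


class WordCountTool(BaseTool):
    """Hypothetical toy tool: counts the words in a piece of text."""

    @property
    def name(self) -> str:
        return "word_count"

    @property
    def description(self) -> str:
        return "Count the words in the given text."

    async def execute(self, args: Dict[str, Any]) -> Any:
        return {"count": len(args["text"].split())}


class ToolRegistry:
    """Assumed registry shape: a name -> tool map plus a schema export."""

    def __init__(self) -> None:
        self._tools: Dict[str, BaseTool] = {}

    def register(self, tool: BaseTool) -> None:
        if tool.name in self._tools:
            raise ValueError(f"duplicate tool name: {tool.name}")
        self._tools[tool.name] = tool

    def schemas(self) -> List[Dict[str, str]]:
        # Name/description pairs the orchestrator could expose to the LLM.
        return [{"name": t.name, "description": t.description}
                for t in self._tools.values()]

    async def call(self, name: str, args: Dict[str, Any]) -> Any:
        return await self._tools[name].execute(args)


registry = ToolRegistry()
registry.register(WordCountTool())
```

Because the abstract base class refuses to instantiate a subclass that is missing `name`, `description`, or `execute`, an incomplete tool fails at construction time rather than in the middle of a chat session.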
Level 2: Stateful Capabilities and Orchestration
When a task requires multiple steps of reasoning, validation, and layout, a stateless tool isn't enough. We need a state machine. This is where Capabilities come in.
For example, our deep_solve capability handles mathematical equations through a structured loop:
- Plan: Analyze the equation and break down steps.
- Reason: Perform calculations step-by-step using tools (like SymPy or code execution).
- Verify: Double-check the calculations against edge cases.
- Animate (Optional): Generate visual mathematical explanations using a Manim-based renderer.
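The stages above can be read as a small state machine. The sketch below is one possible encoding, not StudyPal's implementation: the `Stage` enum, the retry edge from Verify back to Reason, and the `max_steps` cap are all assumptions added for illustration.

```python
from enum import Enum, auto
from typing import List


class Stage(Enum):
    PLAN = auto()
    REASON = auto()
    VERIFY = auto()
    ANIMATE = auto()
    DONE = auto()


def next_stage(stage: Stage, verified: bool, animate: bool) -> Stage:
    """Transition table for the loop; the retry edge is an assumption."""
    if stage is Stage.PLAN:
        return Stage.REASON
    if stage is Stage.REASON:
        return Stage.VERIFY
    if stage is Stage.VERIFY:
        if not verified:
            return Stage.REASON  # assumed: a failed check triggers another pass
        return Stage.ANIMATE if animate else Stage.DONE
    return Stage.DONE  # ANIMATE (or DONE) always ends the run


def run_stages(verify_results: List[bool], animate: bool = False,
               max_steps: int = 10) -> List[Stage]:
    """Walk the state machine, consuming one verification outcome per VERIFY."""
    outcomes = iter(verify_results)
    stage, trace = Stage.PLAN, [Stage.PLAN]
    # The step cap bounds the loop even if verification never succeeds.
    while stage is not Stage.DONE and len(trace) < max_steps:
        verified = next(outcomes, True) if stage is Stage.VERIFY else True
        stage = next_stage(stage, verified, animate)
        trace.append(stage)
    return trace
```

Keeping the transitions in one pure function makes the control flow easy to test separately from the LLM calls that would run inside each stage.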
Because this flow is complex, the ChatOrchestrator hands over control to the specific capability class:
class DeepSolveCapability(BaseCapability):
    async def run(self, context: UnifiedContext, stream: StreamBus) -> None:
        # Step 1: Planning stage
        async with stream.stage("planning", source=self.name):
            plan = await self.generate_execution_plan(context)
            await stream.content("Created step-by-step solution plan...", source=self.name)

        # Step 2: Reasoning & Execution
        async with stream.stage("reasoning", source=self.name):
            solution = await self.execute_reasoning_steps(plan, context)
            await stream.content("Calculated solution steps.", source=self.name)

        # Step 3: Verification
        async with stream.stage("validation", source=self.name):
            validation = await self.validate_solution(solution)
            await stream.result({"solution": solution, "valid": validation}, source=self.name)
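To show how the `async with stream.stage(...)` pattern could work, here is a minimal stand-in for `StreamBus`. It is a guess at the shape implied by the calls above: the event dictionaries, the in-memory `events` list, and the `demo` function are hypothetical, and the real class presumably streams to a client instead of collecting events.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import Any, AsyncIterator, Dict, List


class StreamBus:
    """Hypothetical event bus: collects events in a list instead of streaming them."""

    def __init__(self) -> None:
        self.events: List[Dict[str, Any]] = []

    @asynccontextmanager
    async def stage(self, name: str, source: str) -> AsyncIterator[None]:
        # Emit paired start/end markers so a UI could show stage progress.
        self.events.append({"type": "stage_start", "stage": name, "source": source})
        try:
            yield
        finally:
            # The end marker is emitted even if the stage body raises.
            self.events.append({"type": "stage_end", "stage": name, "source": source})

    async def content(self, text: str, source: str) -> None:
        self.events.append({"type": "content", "text": text, "source": source})

    async def result(self, payload: Dict[str, Any], source: str) -> None:
        self.events.append({"type": "result", "payload": payload, "source": source})


async def demo() -> List[str]:
    """Run one stage and return the sequence of emitted event types."""
    stream = StreamBus()
    async with stream.stage("planning", source="deep_solve"):
        await stream.content("Created step-by-step solution plan...", source="deep_solve")
    await stream.result({"solution": "x = ±2", "valid": True}, source="deep_solve")
    return [event["type"] for event in stream.events]
```

Wrapping each stage in a context manager means the closing marker cannot be forgotten, so a front end always sees a balanced start and end for every stage.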
Developing for the Future
Building agent-native software is about managing complexity and state. By keeping the tool interface clean and separating simple, stateless tools from complex, stateful capabilities, we keep each execution loop explicit and easier to reason about, which makes the system far less likely to get lost in infinite loops.
If you are interested in trying this out, run:
deeptutor run deep_solve "Solve x^2 - 4 = 0"
Or clone the repository and dive into our code at deeptutor/core/.