Designing Agent-Native Systems: A Deep Dive into StudyPal

The transition from traditional LLM applications, such as basic chat interfaces, to agent-native systems is a shift from static prompt-response templates to autonomous, goal-driven execution loops.

In this article, I take a technical look under the hood of StudyPal (a project in active development) and the two-layer agent model we built for education, which balances lightweight, single-function Tools against orchestrator-driven Capabilities.

The Paradigm: Two-Layer Architecture

An agent-native system requires flexibility, but flexibility has a cost. If you equip an LLM with fifty distinct tools, the context window fills up with schema definitions, reasoning overhead grows, and the model becomes more likely to select the wrong tool or call one that does not exist.

To solve this, we partitioned StudyPal into two distinct layers:

Level 1: Tools. Lightweight, stateless, single-function calls that the LLM invokes on demand during a chat session (e.g., knowledge retrieval, web search, paper retrieval).
Level 2: Capabilities. Multi-step, stateful pipelines that take control of the execution loop (e.g., deep math reasoning, curriculum creation, question generation).

                      ┌──────────────────────┐
                      │   ChatOrchestrator   │
                      └──────────┬───────────┘
                                 │
                 ┌───────────────┴───────────────┐
                 ▼                               ▼
        ┌────────────────┐             ┌────────────────────┐
        │  ToolRegistry  │             │ CapabilityRegistry │
        │   (Stateless)  │             │     (Stateful)     │
        └────────────────┘             └────────────────────┘
          - RAG KB                       - deep_solve
          - Web Search                   - deep_question
          - Paper Search                 - math_animator
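
The routing in the diagram can be sketched roughly as follows. This is a toy illustration, not StudyPal's actual orchestrator: the handle method, its parameters, and the plain dictionaries standing in for the two registries are all assumptions made for the example. The tool_calls argument stands in for the tool selections an LLM would normally emit.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Optional, Sequence, Tuple

# Hypothetical signatures; the real registries may look different.
ToolFn = Callable[[Dict[str, Any]], Awaitable[Any]]
CapabilityFn = Callable[[str], Awaitable[Any]]


class ChatOrchestrator:
    """Toy orchestrator: capabilities own the whole turn, tools are called on demand."""

    def __init__(self) -> None:
        self.tools: Dict[str, ToolFn] = {}                # stands in for ToolRegistry
        self.capabilities: Dict[str, CapabilityFn] = {}   # stands in for CapabilityRegistry

    async def handle(
        self,
        message: str,
        capability: Optional[str] = None,
        tool_calls: Sequence[Tuple[str, Dict[str, Any]]] = (),
    ) -> Any:
        # A requested capability takes control of the execution loop.
        if capability is not None:
            if capability not in self.capabilities:
                raise KeyError(f"unknown capability: {capability}")
            return await self.capabilities[capability](message)

        # Otherwise this is an ordinary chat turn: run each requested tool
        # independently and collect the results by tool name.
        results: Dict[str, Any] = {}
        for name, args in tool_calls:
            if name not in self.tools:
                raise KeyError(f"unknown tool: {name}")
            results[name] = await self.tools[name](args)
        return results
```

The point of the split is visible in handle: a capability receives the message and decides everything that happens next, while a tool call is a single awaited function with no memory of earlier calls.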

Level 1: The Stateless Tool Registry

Stateless tools are lightweight functions registered with standard JSON schemas. When the orchestrator detects that a user request needs factual context, it selects the relevant tools and executes them.

A prime example is the RAG Knowledge Base Tool. It connects to a vectorized knowledge base (using document chunking and vector search), pulls the most relevant fragments, and appends them to the LLM's context.
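
To make the chunk, search, and append flow concrete, here is a minimal sketch using only the standard library. It is not the StudyPal implementation: the function names are invented for illustration, and the "embedding" is a plain word-count vector, where a real knowledge base would use a learned embedding model and a vector store.

```python
import math
import re
from collections import Counter
from typing import List


def chunk_text(text: str, size: int = 40) -> List[str]:
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (an assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query: str, chunks: List[str], k: int = 2) -> str:
    """Append the retrieved fragments to the context sent to the LLM."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Swapping embed for a real embedding model and the sorted call for a vector index changes the quality and speed of retrieval, but the shape of the tool stays the same: text in, ranked fragments out, no state kept between calls.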

Here is a simplified Python representation of our tool interface protocol:

from abc import ABC, abstractmethod
from typing import Any, Dict

class BaseTool(ABC):
    @property
    @abstractmethod
    def name(self) -> str:
        pass

    @property
    @abstractmethod
    def description(self) -> str:
        pass

    @abstractmethod
    async def execute(self, args: Dict[str, Any]) -> Any:
        pass

By enforcing a strict interface, adding a new tool (such as searching arXiv for papers or running sandboxed Python code) is simply a matter of implementing BaseTool and registering it in the ToolRegistry.
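
As a rough illustration of that workflow, the sketch below repeats the BaseTool interface so the block runs on its own, then adds a made-up WordCountTool and a minimal registry. The ToolRegistry methods shown here (register, describe, call) are assumptions for the example and may not match the real class.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseTool(ABC):
    """Same interface as shown above, repeated to keep this block self-contained."""

    @property
    @abstractmethod
    def name(self) -> str: ...

    @property
    @abstractmethod
    def description(self) -> str: ...

    @abstractmethod
    async def execute(self, args: Dict[str, Any]) -> Any: ...


class WordCountTool(BaseTool):
    """Hypothetical example tool: counts the words in args['text']."""

    @property
    def name(self) -> str:
        return "word_count"

    @property
    def description(self) -> str:
        return "Count the words in a piece of text."

    async def execute(self, args: Dict[str, Any]) -> Any:
        return len(str(args.get("text", "")).split())


class ToolRegistry:
    """Assumed shape of a registry: a simple name -> tool mapping."""

    def __init__(self) -> None:
        self._tools: Dict[str, BaseTool] = {}

    def register(self, tool: BaseTool) -> None:
        if tool.name in self._tools:
            raise ValueError(f"tool already registered: {tool.name}")
        self._tools[tool.name] = tool

    def describe(self) -> List[Dict[str, str]]:
        # The name/description pairs an LLM would see when choosing a tool.
        return [{"name": t.name, "description": t.description} for t in self._tools.values()]

    async def call(self, name: str, args: Dict[str, Any]) -> Any:
        return await self._tools[name].execute(args)
```

Because BaseTool is abstract, a tool that forgets to implement name, description, or execute fails at instantiation time rather than in the middle of a chat session.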

Level 2: Stateful Capabilities and Orchestration

When a task requires multiple steps of reasoning, validation, and layout, a stateless tool isn't enough. We need a state machine. This is where Capabilities come in.

For example, our deep_solve capability handles mathematical equations through a structured sequence of stages:

Plan: Analyze the equation and break down steps.
Reason: Perform calculations step-by-step using tools (like SymPy or code execution).
Verify: Double-check the calculations against edge cases.
Animate (Optional): Generate visual mathematical explanations using a Manim-based renderer.
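
The first three stages above can be sketched as a small state machine. This is a deliberately simplified stand-in, not the deep_solve code: it only handles real-rooted quadratics, uses the quadratic formula where the real capability would call tools such as SymPy, and skips the optional Animate stage. All names in it are hypothetical.

```python
import math
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Tuple


class Stage(Enum):
    PLAN = auto()
    REASON = auto()
    VERIFY = auto()
    DONE = auto()


@dataclass
class SolveState:
    coeffs: Tuple[float, float, float]  # a, b, c in a*x^2 + b*x + c = 0
    stage: Stage = Stage.PLAN
    plan: List[str] = field(default_factory=list)
    roots: List[float] = field(default_factory=list)
    verified: bool = False


def step(state: SolveState) -> SolveState:
    """Advance the state machine by exactly one stage."""
    a, b, c = state.coeffs
    if state.stage is Stage.PLAN:
        if a == 0:
            raise ValueError("not a quadratic: a must be non-zero")
        state.plan = ["compute discriminant", "apply quadratic formula", "substitute roots back"]
        state.stage = Stage.REASON
    elif state.stage is Stage.REASON:
        disc = b * b - 4 * a * c
        if disc < 0:
            raise ValueError("this sketch only handles real roots")
        root = math.sqrt(disc)
        state.roots = sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})
        state.stage = Stage.VERIFY
    elif state.stage is Stage.VERIFY:
        # Verification: substitute every root back into the equation.
        state.verified = all(
            math.isclose(a * r * r + b * r + c, 0.0, abs_tol=1e-9) for r in state.roots
        )
        state.stage = Stage.DONE
    return state


def solve(coeffs: Tuple[float, float, float]) -> SolveState:
    """Run the stages in order until the machine reaches DONE."""
    state = SolveState(coeffs)
    while state.stage is not Stage.DONE:
        state = step(state)
    return state
```

Calling solve((1, 0, -4)) walks the equation x^2 - 4 = 0 through all three stages. Keeping the intermediate plan and roots on the state object is what separates this from a stateless tool call: each stage reads what the previous one produced.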

Because this flow is complex, the ChatOrchestrator hands over control to the specific capability class:

class DeepSolveCapability(BaseCapability):
    async def run(self, context: UnifiedContext, stream: StreamBus) -> None:
        # Step 1: Planning stage
        async with stream.stage("planning", source=self.name):
            plan = await self.generate_execution_plan(context)
            await stream.content("Created step-by-step solution plan...", source=self.name)

        # Step 2: Reasoning & Execution
        async with stream.stage("reasoning", source=self.name):
            solution = await self.execute_reasoning_steps(plan, context)
            await stream.content("Calculated solution steps.", source=self.name)

        # Step 3: Verification (streamed under the "validation" stage name)
        async with stream.stage("validation", source=self.name):
            validation = await self.validate_solution(solution)

        # Emit the final payload once all stages have closed
        await stream.result({"solution": solution, "valid": validation}, source=self.name)
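
The capability above relies on a StreamBus that exposes stage, content, and result. One way such an interface could work is sketched below. This is an in-memory stand-in written for illustration, with invented event shapes: it appends events to a list, where a real bus would presumably push them to the client as they happen.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import Any, AsyncIterator, Dict, List


class StreamBus:
    """Hypothetical stand-in: records events in a list instead of streaming them."""

    def __init__(self) -> None:
        self.events: List[Dict[str, Any]] = []

    @asynccontextmanager
    async def stage(self, name: str, source: str) -> AsyncIterator[None]:
        self.events.append({"type": "stage_start", "stage": name, "source": source})
        try:
            yield
        finally:
            # Emitted even if the stage body raises, so a consumer can
            # always close the stage it opened.
            self.events.append({"type": "stage_end", "stage": name, "source": source})

    async def content(self, text: str, source: str) -> None:
        self.events.append({"type": "content", "text": text, "source": source})

    async def result(self, payload: Dict[str, Any], source: str) -> None:
        self.events.append({"type": "result", "payload": payload, "source": source})


async def demo(stream: StreamBus) -> None:
    """Mimics the call pattern used by the capability above."""
    async with stream.stage("planning", source="demo"):
        await stream.content("Created plan.", source="demo")
    await stream.result({"ok": True}, source="demo")
```

Wrapping each stage in an async context manager means the start and end markers are always paired, which lets a frontend render progress for a stage without tracking the capability's internal state.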

Developing for the Future

Building agent-native software is largely about managing complexity and state. Keeping the tool interface clean and separating simple tools from complex capabilities gives each task a bounded, well-defined execution path, so the system is less likely to get lost in an infinite loop.

If you are interested in trying this out, run:

deeptutor run deep_solve "Solve x^2 - 4 = 0"

Or clone the repository and dive into our code at deeptutor/core/.