
Architecture

OpenSpider uses a hierarchical multi-agent architecture where a Manager agent orchestrates specialized Worker agents to fulfill complex requests.

System Overview

┌─────────────────┐     ┌───────────────┐     ┌──────────────┐
│   WhatsApp      │────▸│               │────▸│  Manager     │
│   (Baileys)     │     │    Server     │     │  Agent 🧠    │
└─────────────────┘     │  (Express +   │     └──────┬───────┘
                        │   WebSocket)  │            │
┌─────────────────┐     │               │     ┌──────▼───────┐
│   Dashboard     │◂───▸│               │     │  Workers     │
│   (React/Vite)  │     │               │     │  ⚡ Coder    │
└─────────────────┘     └───────┬───────┘     │  🔮 Researcher│
                                │             └──────────────┘
┌─────────────────┐     ┌───────▼───────┐
│   CLI / TUI     │────▸│  Scheduler    │
│   (Commander)   │     │  (60s loop)   │
└─────────────────┘     └───────────────┘

Key Files & Their Roles

| File | Purpose |
| --- | --- |
| src/server.ts | Express + WebSocket server; serves the dashboard and API routes; console.log → WS broadcast |
| src/whatsapp.ts | Baileys WhatsApp connection, security firewall, message routing |
| src/agents/ManagerAgent.ts | Orchestrator: plans, delegates to workers, emits agent_flow events |
| src/agents/WorkerAgent.ts | Task executor; tools: search_web, browse_web, schedule_task, send_email |
| src/agents/PersonaShell.ts | Reads agent identity/soul/capabilities from the workspace filesystem |
| src/scheduler.ts | 60-second heartbeat loop; executes cron jobs from workspace/cron_jobs.json |
| src/llm/index.ts | Provider factory: returns the configured LLM provider |
| src/llm/BaseProvider.ts | Abstract base class for all LLM providers |
| src/cli.ts | Commander.js CLI with all commands |
| src/setup.ts | Interactive onboarding wizard (Clack prompts) |
| src/memory.ts | Persistent memory / context management |
| src/usage.ts | Token usage tracking and analytics |
| skills/send_email.py | Gmail OAuth email sender with markdown→HTML converter |
| dashboard/src/App.tsx | Main dashboard (~2050 lines), all tabs/views |

Agent Architecture

Manager Agent (🧠 Ananta)

The Manager is the orchestrator. When it receives a user request, it:

  1. Analyzes the request and creates a plan
  2. Delegates tasks to Worker agents by role (Coder or Researcher)
  3. Coordinates parallel and sequential task execution
  4. Aggregates results into a final response

The Manager's decision format (the // comment is an annotation, not part of the JSON):

json
{
  "direct_response": "...",    // If the Manager can answer directly
  "plan": [
    {
      "type": "task",
      "role": "researcher",
      "instruction": "Search for the latest AI news"
    },
    {
      "type": "parallel",
      "subtasks": [
        { "role": "coder", "instruction": "Write a Python script" },
        { "role": "researcher", "instruction": "Find documentation" }
      ]
    }
  ]
}
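
As a rough illustration, the decision format above could be modelled with TypeScript types like the following. The type names and the toWaves helper are hypothetical, inferred only from the JSON example; the actual definitions in ManagerAgent.ts may differ.

```typescript
// Hypothetical types inferred from the JSON example; not taken from ManagerAgent.ts.
type Role = "coder" | "researcher";

interface Subtask {
  role: Role;
  instruction: string;
}

interface TaskStep extends Subtask {
  type: "task";
}

interface ParallelStep {
  type: "parallel";
  subtasks: Subtask[];
}

type PlanStep = TaskStep | ParallelStep;

interface ManagerDecision {
  direct_response?: string; // present when the Manager answers directly
  plan?: PlanStep[];        // present when work is delegated
}

// Groups a plan into sequential "waves"; subtasks within one wave may run concurrently.
function toWaves(decision: ManagerDecision): Subtask[][] {
  return (decision.plan ?? []).map((step) =>
    step.type === "parallel"
      ? step.subtasks
      : [{ role: step.role, instruction: step.instruction }]
  );
}

const decision: ManagerDecision = {
  plan: [
    { type: "task", role: "researcher", instruction: "Search for the latest AI news" },
    {
      type: "parallel",
      subtasks: [
        { role: "coder", instruction: "Write a Python script" },
        { role: "researcher", instruction: "Find documentation" },
      ],
    },
  ],
};

const waves = toWaves(decision);
console.log(waves.map((wave) => wave.length)); // one task, then two parallel subtasks
```

A discriminated union on the type field lets the compiler check that a "parallel" step is never treated as a single task.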

Worker Agents

Workers are specialized executors. Each worker:

  1. Receives an instruction from the Manager
  2. Uses an action loop to reason and use tools iteratively
  3. Returns a result summary to the Manager

Available actions:

| Action | Description |
| --- | --- |
| search_web | Search the internet via a web search API |
| browse_web | Navigate to a URL and extract content (Playwright) |
| read_file | Read a file from the workspace |
| write_file | Write or update a file in the workspace |
| run_command | Execute a shell command |
| send_email | Send an email via Gmail OAuth |
| schedule_task | Create a recurring cron job |
| wait_for_user | Pause and wait for user input |
| final_answer | Return the final result to the Manager |
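
The action loop can be pictured with a minimal sketch like the one below. Everything here is an assumption for illustration: the Action shape, the decide callback (a stand-in for the LLM call), the args.result field, and the step cap are not taken from WorkerAgent.ts.

```typescript
// Hypothetical shapes; the real WorkerAgent.ts types and LLM call are not shown in this document.
interface Action {
  action: string;               // e.g. "read_file" or "final_answer"
  args: Record<string, string>;
}

type Tool = (args: Record<string, string>) => Promise<string>;
type Decide = (history: string[]) => Promise<Action>;

async function runWorker(
  instruction: string,
  decide: Decide,               // stand-in for the LLM choosing the next action
  tools: Record<string, Tool>,
  maxSteps = 10                 // assumed safety cap on iterations
): Promise<string> {
  const history: string[] = [`instruction: ${instruction}`];
  for (let step = 0; step < maxSteps; step++) {
    const next = await decide(history);
    if (next.action === "final_answer") {
      return next.args.result ?? ""; // hand the summary back to the Manager
    }
    const tool = tools[next.action];
    const observation = tool
      ? await tool(next.args)
      : `unknown action: ${next.action}`;
    // Each observation is fed back so the next decision can build on it.
    history.push(`${next.action} -> ${observation}`);
  }
  return "stopped: step limit reached";
}
```

Passing the tools in as a map keeps the loop itself independent of which actions a given worker is allowed to use.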

PersonaShell

Each agent's persona is loaded from the filesystem at workspace/agents/<name>/:

  • IDENTITY.md: Who the agent is (personality, speaking style)
  • SOUL.md: Behavioral directives and guardrails
  • CAPABILITIES.json: Name, role, emoji, allowed tools, token budget
  • USER.md: Learned context about the user (evolves over time)
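
A loader for these four files might look roughly like this. The Persona interface, the loadPersona and readOrEmpty names, and the choice to treat missing files as empty are assumptions for illustration, not the actual PersonaShell.ts API.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical shape; the fields PersonaShell.ts actually exposes may differ.
interface Persona {
  identity: string;
  soul: string;
  capabilities: Record<string, unknown>;
  user: string;
}

// Returns the file's text, or an empty string when the file does not exist.
function readOrEmpty(file: string): string {
  return fs.existsSync(file) ? fs.readFileSync(file, "utf8") : "";
}

function loadPersona(workspaceDir: string, name: string): Persona {
  const dir = path.join(workspaceDir, "agents", name);
  const rawCapabilities = readOrEmpty(path.join(dir, "CAPABILITIES.json"));
  return {
    identity: readOrEmpty(path.join(dir, "IDENTITY.md")),
    soul: readOrEmpty(path.join(dir, "SOUL.md")),
    capabilities: rawCapabilities ? JSON.parse(rawCapabilities) : {},
    user: readOrEmpty(path.join(dir, "USER.md")),
  };
}
```

Reading from disk on each load means edits to these files take effect without a rebuild, which fits the note that USER.md evolves over time.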

Event Flow

OpenSpider broadcasts agent events to the dashboard by routing them through console.log:

Manager/Worker → console.log(JSON) → server.ts intercepts → WebSocket broadcast → Dashboard UI

  1. Agent events: ManagerAgent.ts emits structured JSON via console.log() with types like task_start, task_complete, agent_flow
  2. Server intercept: server.ts overrides console.log to detect JSON events and broadcasts them via WebSocket
  3. Dashboard render: The React dashboard receives events and updates the Agent Flow graph, Agent Chat, and System Logs in real time
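
The intercept step can be sketched as follows. This is an illustration under assumptions, not the code in server.ts: the interceptConsoleLog name, the broadcast callback (standing in for a send to all WebSocket clients), and the rule that an event is a single JSON string argument with a string type field are all hypothetical.

```typescript
// Hypothetical sketch: `broadcast` stands in for sending to every connected WebSocket client.
type Broadcast = (event: { type: string }) => void;

function interceptConsoleLog(broadcast: Broadcast): () => void {
  const original = console.log;
  console.log = (...args: unknown[]): void => {
    original.apply(console, args); // keep normal terminal logging
    const first = args[0];
    if (args.length === 1 && typeof first === "string") {
      try {
        const parsed: unknown = JSON.parse(first);
        if (
          parsed !== null &&
          typeof parsed === "object" &&
          typeof (parsed as { type?: unknown }).type === "string"
        ) {
          broadcast(parsed as { type: string });
        }
      } catch {
        // Not JSON: an ordinary log line, nothing to broadcast.
      }
    }
  };
  // Returns a function that restores the original console.log.
  return () => {
    console.log = original;
  };
}
```

With this pattern an agent only needs to call console.log(JSON.stringify({ type: "task_start" })) and never has to hold a reference to the WebSocket server.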

Cron Isolation

When cron jobs execute, they use the same Manager→Worker pipeline. To prevent cron-triggered events from interfering with the dashboard:

  • scheduler.ts maintains an activeCronJobs counter
  • server.ts checks this counter and suppresses WebSocket broadcast of agent_flow events when activeCronJobs > 0
  • This ensures the dashboard only shows user-initiated agent activity
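
A minimal sketch of this counter-based suppression is shown below. Only the activeCronJobs name comes from the text; the runAsCron and shouldBroadcast helpers are hypothetical and show just one way the check could be wired up.

```typescript
// Hypothetical module-level state; the text only says scheduler.ts keeps an activeCronJobs counter.
let activeCronJobs = 0;

// Wraps a cron run so the counter is raised for exactly the duration of the job.
async function runAsCron<T>(job: () => Promise<T>): Promise<T> {
  activeCronJobs++;
  try {
    return await job();
  } finally {
    activeCronJobs--; // decrement even if the job throws
  }
}

// Server-side check: agent_flow events are dropped while any cron job is running.
function shouldBroadcast(event: { type: string }): boolean {
  if (event.type === "agent_flow" && activeCronJobs > 0) return false;
  return true;
}
```

A counter rather than a boolean keeps the check correct when several cron jobs overlap. One consequence of this design is that user-initiated agent_flow events emitted while a cron job is running are suppressed as well.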

Server Architecture

The server (src/server.ts) combines:

  • Express HTTP server on port 4001
    • Serves the built dashboard (dashboard/dist/)
    • REST API endpoints for agent management, config, cron jobs, usage
  • WebSocket server on the same port
    • Real-time event streaming to dashboard
    • Agent chat message relay
  • Static file serving for the production dashboard build

Scheduler

The scheduler (src/scheduler.ts) provides autonomous task execution:

  1. Initialization: Creates workspace/cron_jobs.json if missing, starts 60-second check loop
  2. Heartbeat: Every 60 seconds, scans all jobs and executes any that are due
  3. Execution: Creates a fresh ManagerAgent instance and sends the job's prompt as a system cron trigger
  4. Safety: Updates lastRunTimestamp before execution to prevent rapid-fire on crash
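
The ordering in the safety step can be sketched like this. The CronJob shape and the tick, isDue, save, and execute names are hypothetical; only lastRunTimestamp and the 60-second interval come from the text above.

```typescript
// Hypothetical job shape; only lastRunTimestamp is named in the text.
interface CronJob {
  id: string;
  prompt: string;
  lastRunTimestamp: number; // epoch milliseconds
}

async function tick(
  jobs: CronJob[],
  isDue: (job: CronJob, now: number) => boolean,
  save: (jobs: CronJob[]) => void,          // e.g. write workspace/cron_jobs.json
  execute: (job: CronJob) => Promise<void>, // e.g. hand the prompt to a fresh ManagerAgent
  now: number = Date.now()
): Promise<void> {
  for (const job of jobs) {
    if (!isDue(job, now)) continue;
    job.lastRunTimestamp = now; // stamp first...
    save(jobs);                 // ...and persist, so a crash below cannot cause an immediate rerun
    try {
      await execute(job);
    } catch (err) {
      console.error(`cron job ${job.id} failed:`, err);
    }
  }
}

// The heartbeat itself could then be a timer such as (helper names hypothetical):
// setInterval(() => void tick(loadJobs(), isDue, saveJobs, runJob), 60_000);
```

Stamping before execution trades a possibly missed run for protection against a job that crashes the process and would otherwise be retried on every restart.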

Scheduling Modes

| Mode | Trigger Condition | Use Case |
| --- | --- | --- |
| Interval-based | timeSinceLastRun >= intervalHours * 3600000 | Recurring tasks (every 1h, 2h, etc.) |
| Time-of-day | Current time within 5 min of preferredTime AND hasn't run today | Daily reports at specific times (e.g. 7:00 AM weather) |

When preferredTime is set on a job (e.g. "07:00"), the scheduler ignores intervalHours and instead checks if the current local time matches the preferred time window and the job hasn't already run today.

Jobs can also be triggered manually via the dashboard or API using runJobForcefully().
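
The two trigger conditions can be combined into a single due-check, sketched below. The ScheduledJob shape and the isDue and sameLocalDay names are hypothetical, and the sketch assumes the 5-minute window is symmetric around preferredTime, which the text does not specify. It also ignores windows that straddle midnight.

```typescript
// Hypothetical job shape built from the field names used in this section.
interface ScheduledJob {
  intervalHours: number;
  preferredTime?: string;   // "HH:MM" in local time
  lastRunTimestamp: number; // epoch milliseconds, 0 if the job has never run
}

const WINDOW_MS = 5 * 60 * 1000; // assumed symmetric 5-minute window

function sameLocalDay(a: Date, b: Date): boolean {
  return (
    a.getFullYear() === b.getFullYear() &&
    a.getMonth() === b.getMonth() &&
    a.getDate() === b.getDate()
  );
}

function isDue(job: ScheduledJob, now: Date): boolean {
  if (job.preferredTime) {
    // Time-of-day mode: intervalHours is ignored.
    const [hours = 0, minutes = 0] = job.preferredTime.split(":").map(Number);
    const target = new Date(now);
    target.setHours(hours, minutes, 0, 0);
    const inWindow = Math.abs(now.getTime() - target.getTime()) <= WINDOW_MS;
    const ranToday =
      job.lastRunTimestamp > 0 && sameLocalDay(new Date(job.lastRunTimestamp), now);
    return inWindow && !ranToday;
  }
  // Interval mode: due once enough time has passed since the last run.
  return now.getTime() - job.lastRunTimestamp >= job.intervalHours * 3_600_000;
}
```

Because the heartbeat fires every 60 seconds, a 5-minute window gives the scheduler several chances to catch the preferred time, while the ran-today check keeps it from firing more than once.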

Workspace Defaults & First Run

On first run, initWorkspace() detects that no workspace/ directory exists and automatically copies the shipped workspace-defaults/ template into workspace/. This ensures all 3 default agents (Manager, Coder, Researcher) with their full SOUL.md, IDENTITY.md, and CAPABILITIES.json are ready out of the box.

The copy operation never overwrites existing files, so user customizations are preserved across upgrades.
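
One way to get a copy that never overwrites is Node's fs.cpSync with force set to false, as in the sketch below. The copyDefaults name and its parameters are hypothetical; the document does not show how initWorkspace() performs the copy.

```typescript
import * as fs from "node:fs";

// Hypothetical helper; not the actual initWorkspace() implementation.
function copyDefaults(defaultsDir: string, workspaceDir: string): void {
  // recursive: copy the whole template tree.
  // force: false leaves any file that already exists in workspaceDir untouched,
  // so only missing files are filled in.
  fs.cpSync(defaultsDir, workspaceDir, { recursive: true, force: false });
}
```

Under this approach, files added to the template in a later release would appear in an existing workspace, while files the user has edited stay as they are.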

Technology Stack

| Layer | Technology |
| --- | --- |
| Runtime | Node.js 22+ (TypeScript) |
| Web Server | Express.js |
| Real-time | WebSocket (ws) |
| WhatsApp | Baileys (@whiskeysockets/baileys) |
| Dashboard | React + Vite |
| Browser Tool | Playwright Core |
| Email | Python 3 + Gmail API (OAuth 2.0) |
| Process Manager | PM2 |
| CLI | Commander.js + Clack Prompts |
| Build | TypeScript Compiler (tsc) |

Built with 🕷️ by the OpenSpider team