Architecture
OpenSpider uses a hierarchical multi-agent architecture where a Manager agent orchestrates specialized Worker agents to fulfill complex requests.
System Overview
    ┌─────────────────┐     ┌───────────────┐     ┌──────────────┐
    │   WhatsApp      │────▸│               │────▸│   Manager    │
    │   (Baileys)     │     │    Server     │     │   Agent 🧠   │
    └─────────────────┘     │  (Express +   │     └──────┬───────┘
                            │   WebSocket)  │            │
    ┌─────────────────┐     │               │     ┌──────▼───────┐
    │   Dashboard     │────▸│               │     │   Workers    │
    │   (React/Vite)  │     │               │     │  ⚡ Coder     │
    └─────────────────┘     └───────┬───────┘     │  🔮 Researcher│
                                    │             └──────────────┘
    ┌─────────────────┐     ┌───────▼───────┐
    │   CLI / TUI     │────▸│   Scheduler   │
    │   (Commander)   │     │  (60s loop)   │
    └─────────────────┘     └───────────────┘

Key Files & Their Roles
| File | Purpose |
|---|---|
| src/server.ts | Express + WebSocket server, serves dashboard, API routes, console.log → WS broadcast |
| src/whatsapp.ts | Baileys WhatsApp connection, security firewall, message routing |
| src/agents/ManagerAgent.ts | Orchestrator: plans, delegates to workers, emits agent_flow events |
| src/agents/WorkerAgent.ts | Task executor; tools: search_web, browse_web, schedule_task, send_email |
| src/agents/PersonaShell.ts | Reads agent identity/soul/capabilities from workspace filesystem |
| src/scheduler.ts | 60-second heartbeat loop, executes cron jobs from workspace/cron_jobs.json |
| src/llm/index.ts | Provider factory: returns the configured LLM provider |
| src/llm/BaseProvider.ts | Abstract base class for all LLM providers |
| src/cli.ts | Commander.js CLI with all commands |
| src/setup.ts | Interactive onboarding wizard (Clack prompts) |
| src/memory.ts | Persistent memory / context management |
| src/usage.ts | Token usage tracking and analytics |
| skills/send_email.py | Gmail OAuth email sender with markdown→HTML converter |
| dashboard/src/App.tsx | Main dashboard (~2050 lines), all tabs/views |
Agent Architecture
Manager Agent (🧠 Ananta)
The Manager is the orchestrator. When it receives a user request, it:
- Analyzes the request and creates a plan
- Delegates tasks to Worker agents by role (Coder or Researcher)
- Coordinates parallel and sequential task execution
- Aggregates results into a final response
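The coordination and aggregation steps above can be sketched roughly as follows. This is a minimal illustration, not the code in ManagerAgent.ts: the type names, `executePlan`, and the `RunWorker` callback are hypothetical, and the step shapes simply mirror the decision format documented in this section.

```typescript
// Hypothetical types mirroring the documented decision format.
type Role = "coder" | "researcher";

interface Subtask {
  role: Role;
  instruction: string;
}

type PlanStep =
  | { type: "task"; role: Role; instruction: string }
  | { type: "parallel"; subtasks: Subtask[] };

// Stand-in for delegating to a Worker agent; the real call is not documented here.
type RunWorker = (role: Role, instruction: string) => Promise<string>;

// Runs plan steps in order; subtasks of a "parallel" step run concurrently.
async function executePlan(plan: PlanStep[], runWorker: RunWorker): Promise<string[]> {
  const results: string[] = [];
  for (const step of plan) {
    if (step.type === "task") {
      // Sequential step: wait for the worker before moving on.
      results.push(await runWorker(step.role, step.instruction));
    } else {
      // Parallel step: start every subtask, then wait for all of them.
      const parallelResults = await Promise.all(
        step.subtasks.map((subtask) => runWorker(subtask.role, subtask.instruction)),
      );
      results.push(...parallelResults);
    }
  }
  return results;
}
```

The returned array keeps results in plan order, which is one simple way to hand them to an aggregation step.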
The Manager's decision format:
    {
      "direct_response": "...", // If the Manager can answer directly
      "plan": [
        {
          "type": "task",
          "role": "researcher",
          "instruction": "Search for the latest AI news"
        },
        {
          "type": "parallel",
          "subtasks": [
            { "role": "coder", "instruction": "Write a Python script" },
            { "role": "researcher", "instruction": "Find documentation" }
          ]
        }
      ]
    }

Worker Agents
Workers are specialized executors. Each worker:
- Receives an instruction from the Manager
- Uses an action loop to reason and use tools iteratively
- Returns a result summary to the Manager
Available actions:
| Action | Description |
|---|---|
| search_web | Search the internet via web search API |
| browse_web | Navigate to a URL and extract content (Playwright) |
| read_file | Read a file from the workspace |
| write_file | Write or update a file in the workspace |
| run_command | Execute a shell command |
| send_email | Send an email via Gmail OAuth |
| schedule_task | Create a recurring cron job |
| wait_for_user | Pause and wait for user input |
| final_answer | Return the final result to the Manager |
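A worker's action loop could look something like the sketch below. Everything here is an assumption made for illustration: the `WorkerAction` shape, the `DecideNext` callback (standing in for the LLM call), the tool map, and the step limit are hypothetical, and only the action names come from the table above.

```typescript
// Hypothetical shape of one step chosen by the model.
interface WorkerAction {
  action: string; // e.g. "search_web", "read_file", "final_answer"
  input: string;
}

// Stand-in for the LLM call that picks the next action from the history so far.
type DecideNext = (instruction: string, history: string[]) => Promise<WorkerAction>;
type ActionTool = (input: string) => Promise<string>;

async function runActionLoop(
  instruction: string,
  decide: DecideNext,
  tools: Map<string, ActionTool>,
  maxSteps = 10, // assumed safety limit, not taken from the source
): Promise<string> {
  const history: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const next = await decide(instruction, history);
    // final_answer ends the loop and returns the summary to the Manager.
    if (next.action === "final_answer") return next.input;
    const tool = tools.get(next.action);
    const observation = tool
      ? await tool(next.input)
      : `Unknown action: ${next.action}`;
    // Each observation is fed back so the next decision can build on it.
    history.push(`${next.action} -> ${observation}`);
  }
  return "Stopped: step limit reached without a final_answer";
}
```

Feeding unknown actions back as observations, rather than throwing, gives the model a chance to correct itself on the next step.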
PersonaShell
Each agent's persona is loaded from the filesystem at workspace/agents/<name>/:
- IDENTITY.md: Who the agent is (personality, speaking style)
- SOUL.md: Behavioral directives and guardrails
- CAPABILITIES.json: Name, role, emoji, allowed tools, token budget
- USER.md: Learned context about the user (evolves over time)
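As a rough illustration of how these four files might be assembled into one persona object, consider the sketch below. The key names inside `AgentCapabilities` are guesses (the source lists what the file contains, not its exact keys), `loadPersona` and `ReadText` are hypothetical, and treating USER.md as optional is an assumption.

```typescript
// Hypothetical key names; only the list of contents comes from the documentation.
interface AgentCapabilities {
  name: string;
  role: string;
  emoji: string;
  allowedTools: string[];
  tokenBudget: number;
}

interface AgentPersona {
  identity: string;
  soul: string;
  capabilities: AgentCapabilities;
  user: string;
}

// The reader is injected so the sketch runs without touching the disk.
type ReadText = (path: string) => string | undefined;

function loadPersona(agentName: string, readText: ReadText): AgentPersona {
  const dir = `workspace/agents/${agentName}`;
  const required = (file: string): string => {
    const text = readText(`${dir}/${file}`);
    if (text === undefined) throw new Error(`Missing persona file: ${dir}/${file}`);
    return text;
  };
  return {
    identity: required("IDENTITY.md"),
    soul: required("SOUL.md"),
    capabilities: JSON.parse(required("CAPABILITIES.json")) as AgentCapabilities,
    // Assumption: USER.md may not exist yet, since it is built up over time.
    user: readText(`${dir}/USER.md`) ?? "",
  };
}
```

In a real setup the reader would wrap a filesystem call; passing it in keeps the loading logic easy to test.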
Event Flow
OpenSpider broadcasts agent events to the dashboard by intercepting console output:

Manager/Worker → console.log(JSON) → server.ts intercepts → WebSocket broadcast → Dashboard UI

- Agent events: ManagerAgent.ts emits structured JSON via console.log() with types like task_start, task_complete, agent_flow
- Server intercept: server.ts overrides console.log to detect JSON events and broadcasts them via WebSocket
- Dashboard render: The React dashboard receives events and updates the Agent Flow graph, Agent Chat, and System Logs in real time
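The interception step might be sketched like this. It is not the code in server.ts: `interceptConsoleLog` and `BroadcastFn` are hypothetical names, the broadcast function stands in for the WebSocket send, and treating "a JSON object with a string `type` field" as an event is an assumption about how events are detected.

```typescript
// Stand-in for sending a payload to every connected WebSocket client.
type BroadcastFn = (payload: string) => void;

// Wraps console.log and returns a function that restores the original.
function interceptConsoleLog(broadcast: BroadcastFn): () => void {
  const original = console.log;
  console.log = (...args: unknown[]): void => {
    // Always keep normal logging behaviour.
    original.apply(console, args);
    const first = args[0];
    if (args.length === 1 && typeof first === "string") {
      try {
        const event = JSON.parse(first) as { type?: unknown } | null;
        // Assumed event test: a JSON object carrying a string "type".
        if (typeof event === "object" && event !== null && typeof event.type === "string") {
          broadcast(first);
        }
      } catch {
        // Not JSON: an ordinary log line, nothing to broadcast.
      }
    }
  };
  return () => {
    console.log = original;
  };
}
```

With this in place, an agent can emit an event with `console.log(JSON.stringify({ type: "task_start" }))` and the same line still appears in the regular logs.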
Cron Isolation
When cron jobs execute, they use the same Manager→Worker pipeline. To prevent cron-triggered events from interfering with the dashboard:
- scheduler.ts maintains an activeCronJobs counter
- server.ts checks this counter and suppresses WebSocket broadcast of agent_flow events when activeCronJobs > 0
- This ensures the dashboard only shows user-initiated agent activity
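A minimal sketch of the counter-based suppression is shown below. Only the `activeCronJobs` name and the `agent_flow` event type come from the documentation; `runAsCronJob` and `shouldBroadcast` are hypothetical helpers, and keeping both in one module is a simplification.

```typescript
// Stand-in for the counter kept by scheduler.ts.
let activeCronJobs = 0;

// Wraps a cron-triggered run so the counter is always restored afterwards.
async function runAsCronJob<T>(job: () => Promise<T>): Promise<T> {
  activeCronJobs++;
  try {
    return await job();
  } finally {
    // Decrement even if the job throws, so suppression cannot get stuck on.
    activeCronJobs--;
  }
}

// Server-side check: agent_flow events are dropped while any cron job is running.
function shouldBroadcast(eventType: string): boolean {
  return !(eventType === "agent_flow" && activeCronJobs > 0);
}
```

A counter rather than a boolean flag keeps the check correct when several cron jobs overlap.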
Server Architecture
The server (src/server.ts) combines:
- Express HTTP server on port 4001
  - Serves the built dashboard (dashboard/dist/)
  - REST API endpoints for agent management, config, cron jobs, usage
- WebSocket server on the same port
  - Real-time event streaming to dashboard
  - Agent chat message relay
- Static file serving for the production dashboard build
Scheduler
The scheduler (src/scheduler.ts) provides autonomous task execution:
- Initialization: Creates workspace/cron_jobs.json if missing, starts 60-second check loop
- Heartbeat: Every 60 seconds, scans all jobs and executes any that are due
- Execution: Creates a fresh ManagerAgent instance and sends the job's prompt as a system cron trigger
- Safety: Updates lastRunTimestamp before execution to prevent rapid-fire on crash
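The heartbeat and safety steps above might be sketched as follows. This is an illustration under assumptions: `HeartbeatJob`, `runDueJobs`, and the injected `isDue`, `save`, and `execute` callbacks are hypothetical, and only `lastRunTimestamp` and the idea of a job prompt come from the documentation.

```typescript
// Hypothetical job shape; only lastRunTimestamp is a documented field name.
interface HeartbeatJob {
  id: string;
  prompt: string;
  lastRunTimestamp: number; // epoch milliseconds
}

async function runDueJobs(
  jobs: HeartbeatJob[],
  isDue: (job: HeartbeatJob, now: number) => boolean,
  save: (jobs: HeartbeatJob[]) => void,       // e.g. rewrite cron_jobs.json
  execute: (prompt: string) => Promise<void>, // e.g. hand the prompt to a fresh Manager
  now: number = Date.now(),
): Promise<void> {
  for (const job of jobs) {
    if (!isDue(job, now)) continue;
    // Record and persist the run time first, so a crash during execution
    // does not leave the job looking "due" on every following heartbeat.
    job.lastRunTimestamp = now;
    save(jobs);
    try {
      await execute(job.prompt);
    } catch (err) {
      console.error(`Cron job ${job.id} failed:`, err);
    }
  }
}
```

A 60-second loop could then call this with something like `setInterval(() => void runDueJobs(jobs, isDue, save, execute), 60_000)`.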
Scheduling Modes
| Mode | Trigger Condition | Use Case |
|---|---|---|
| Interval-based | timeSinceLastRun >= intervalHours * 3600000 | Recurring tasks (every 1h, 2h, etc.) |
| Time-of-day | Current time within 5 min of preferredTime AND hasn't run today | Daily reports at specific times (7:00 AM weather) |
When preferredTime is set on a job (e.g. "07:00"), the scheduler ignores intervalHours and instead checks if the current local time matches the preferred time window and the job hasn't already run today.
Jobs can also be triggered manually via the dashboard or API using runJobForcefully().
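The two trigger conditions in the table above can be expressed as one due-check. The sketch below is hypothetical: `ScheduledJob` and `isJobDue` are illustrative names, "within 5 min" is read as five minutes either side of preferredTime, and a window that straddles midnight is not handled.

```typescript
interface ScheduledJob {
  intervalHours: number;
  preferredTime?: string;   // "HH:MM" in local time, e.g. "07:00"
  lastRunTimestamp: number; // epoch milliseconds; 0 if the job has never run
}

const HOUR_MS = 3_600_000;
const WINDOW_MINUTES = 5;

function isJobDue(job: ScheduledJob, now: Date): boolean {
  if (job.preferredTime) {
    // Time-of-day mode: intervalHours is ignored.
    const [hours = 0, minutes = 0] = job.preferredTime.split(":").map(Number);
    const nowMinutes = now.getHours() * 60 + now.getMinutes();
    const inWindow = Math.abs(nowMinutes - (hours * 60 + minutes)) <= WINDOW_MINUTES;
    // "Hasn't run today" is checked by comparing local calendar dates.
    const ranToday = new Date(job.lastRunTimestamp).toDateString() === now.toDateString();
    return inWindow && !ranToday;
  }
  // Interval mode: timeSinceLastRun >= intervalHours * 3600000.
  return now.getTime() - job.lastRunTimestamp >= job.intervalHours * HOUR_MS;
}
```

Because the heartbeat fires every 60 seconds, the "hasn't run today" check is what stops a daily job from running several times inside its window.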
Workspace Defaults & First Run
On first run, initWorkspace() detects that no workspace/ directory exists and automatically copies the shipped workspace-defaults/ template into workspace/. This ensures all 3 default agents (Manager, Coder, Researcher) with their full SOUL.md, IDENTITY.md, and CAPABILITIES.json are ready out of the box.
The copy operation never overwrites existing files, so user customizations are preserved across upgrades.
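The never-overwrite rule can be illustrated with a small in-memory model. This is not how initWorkspace() is implemented: files are represented as a path-to-content map so the example runs without disk access, and `FileTree` and `copyMissingDefaults` are hypothetical names.

```typescript
// Files modelled as relative path -> content, in place of real directories.
type FileTree = Map<string, string>;

// Copies every default file that the workspace lacks; returns the paths added.
function copyMissingDefaults(defaults: FileTree, workspace: FileTree): string[] {
  const copied: string[] = [];
  defaults.forEach((content, path) => {
    // Existing files are left alone so user customizations survive upgrades.
    if (workspace.has(path)) return;
    workspace.set(path, content);
    copied.push(path);
  });
  return copied;
}
```

Running the same merge again after an upgrade only adds files that are new in the template and leaves everything else as the user left it.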
Technology Stack
| Layer | Technology |
|---|---|
| Runtime | Node.js 22+ (TypeScript) |
| Web Server | Express.js |
| Real-time | WebSocket (ws) |
| WhatsApp | Baileys (@whiskeysockets/baileys) |
| Dashboard | React + Vite |
| Browser Tool | Playwright Core |
| Email | Python 3 + Gmail API (OAuth 2.0) |
| Process Manager | PM2 |
| CLI | Commander.js + Clack Prompts |
| Build | TypeScript Compiler (tsc) |