Evolution agent that bridges Obsidian (task/knowledge management) with Claude Code (execution) and Telegram (human-in-the-loop Q&A). Learns from its own behavior — detects repeated LLM patterns and generates tools to replace them, like a human automating repetitive work.
| Loop | What it does |
|---|---|
| Task Loop | Adaptive polling → preflight splits into sub-tasks → parallel fast agents + full pipeline agents → critic → eval → audit |
| Evolution | Records structured action sequences → fingerprints → detects repetition → generates replacement scripts → skips LLM next time |
| Knowledge | Watches knowledge/sources/ → ingests into persistent wiki → periodic lint → queryable via Telegram |
| Skill | Reads conversation history → extracts reusable skills → auto-organizes hierarchy → feeds into future runs |
| Feedback | Checks outcomes of completed tasks → learns from merged/rejected/reverted PRs → updates strategy |
Every task goes through a multi-stage pipeline. The orchestrator structurally enforces each stage — the agent can't skip them.
Pre-flight (split work into sub-tasks)
├── Investigate current state (PRs, branches, CI, issues)
├── Learn from past eval findings, outcome feedback, cross-connector patterns
├── Split into discrete sub-tasks (one per PR, one per action)
└── Output: structured sub-task list with priority and fast/full classification
Orchestrator (structural parallelism — not prompt-dependent)
├── Fast sub-tasks → parallel spawns, skip critique/eval
│ (e.g., comment "run buildall" on 10 PRs simultaneously)
└── Full sub-tasks → one per poll, full pipeline:
Execute → Critic → Fix pass → Eval → Learn
Critic (adversarial challenge)
├── Claimed vs actually done?
├── What was missed or assumed?
└── If needs_fix → resume agent to fix, then re-check
Eval (quality review + learning)
├── Score quality (1-10)
└── Route findings:
SKILL → saved for future runs
CODE → task added to thinkops connector
CRITICAL → Telegram alert
Key principles:
- Structural task splitting — orchestrator splits work at the system level, not via prompt instructions
- Parallel by default — fast sub-tasks run concurrently, full sub-tasks run one per poll
- Evolution — repeated LLM patterns are detected and replaced by generated scripts
- Adaptive polling — interval grows when idle (10m → 15m → 22m → ... → 1h), resets on activity
- Token-efficient — `--bare` mode for analysis-only spawns, context caps on audit/skills/preflight
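The adaptive polling schedule can be sketched as follows. This is a minimal illustration, not the actual orchestrator code; the 1.5× growth factor is an assumption inferred from the 10m → 15m → 22m progression.

```typescript
// Adaptive poll interval: grows geometrically while idle, resets on activity.
// GROWTH = 1.5 is an assumption inferred from the documented 10m -> 15m -> 22m steps.
const BASE_INTERVAL_S = 600;  // 10 minutes (TASK_POLL_INTERVAL default)
const MAX_INTERVAL_S = 3600;  // 1 hour cap
const GROWTH = 1.5;

function nextInterval(currentS: number, hadActivity: boolean): number {
  if (hadActivity) return BASE_INTERVAL_S;                    // reset on activity
  return Math.min(Math.round(currentS * GROWTH), MAX_INTERVAL_S); // grow when idle
}
```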
ThinkOps observes its own behavior and learns to skip the LLM when possible:
1. Record: After each spawn, save structured action sequence
(tool names + structural inputs + outcome class)
2. Detect: Fingerprint sequences via SHA-256. When the same
pattern appears 3+ times → flag as repeated.
3. Generate: Spawn LLM once to produce a replacement bash script
that does the same work without any LLM call.
4. Use: Before each spawn, check for a generated tool.
If it succeeds → skip LLM entirely.
If it fails → fall back to LLM (self-healing).
This is general-purpose — it works for any template, any connector, any action pattern. The system doesn't know what "checking for tasks" means; it just notices "the LLM keeps doing the same sequence with the same outcome" and generates a cheap replacement.
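The record/detect steps can be sketched like this. The `RecordedAction` shape and the `recordAndCheck` helper are illustrative assumptions, not the actual ThinkOps types; only the SHA-256 fingerprinting and the 3+ threshold come from the description above.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of one recorded action (tool name + structural input,
// not raw content) — an assumption for illustration.
interface RecordedAction {
  tool: string;       // e.g. "Bash", "Edit"
  inputShape: string; // structural description of the input
}

// Fingerprint a spawn's action sequence plus its outcome class; identical
// sequences with identical outcomes hash to the same value.
function fingerprint(actions: RecordedAction[], outcome: string): string {
  const canonical =
    actions.map(a => `${a.tool}:${a.inputShape}`).join("|") + `=>${outcome}`;
  return createHash("sha256").update(canonical).digest("hex");
}

// Count occurrences of each fingerprint; 3+ repeats flags the pattern
// as a candidate for replacement-script generation.
const seen = new Map<string, number>();
function recordAndCheck(fp: string, threshold = 3): boolean {
  const n = (seen.get(fp) ?? 0) + 1;
  seen.set(fp, n);
  return n >= threshold;
}
```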
Orchestrator (TypeScript — thin plumbing, no intelligence)
├── Task Loops (parallel, one per connector)
│ ├── Cheap check (## Check command — zero LLM cost)
│ ├── Generated tool check (learned scripts)
│ ├── Pre-flight (split into sub-tasks)
│ ├── Fast sub-tasks (parallel, no critique/eval)
│ └── Full sub-task (execute → critic → eval)
├── Action Tracker (fingerprint, detect repeats, trigger tool-gen)
├── Knowledge Loop (watch sources/, ingest, lint)
├── Skill Loop (extract from history, organize)
└── Feedback Loop (check outcomes, learn from results)
│
▼
Subagent Spawner
(each op = isolated CLI session with prompt template + context)
│
├── CLI Adapters (claude --bare / claude full / opencode)
├── Telegram Bot (Telegraf — Q&A bridge)
├── Web Dashboard (real-time SSE, sidebar navigation)
├── Run Logger (→ thinkops/_run_log.md)
└── Prompt Templates (prompts/*.md — THE BRAIN)
All intelligence lives in prompt templates (`prompts/`). TypeScript handles only the plumbing: config, CLI spawning, task splitting, action tracking, and logging.
```sh
# Install
npm install

# Configure
cp .env.example .env
# Edit .env with your Telegram bot token, chat ID, and vault path

# Install CLI globally
npm link

# Run
thinkops
```

You can also run without a global install via `npm run dev`.
| Variable | Description | Default |
|---|---|---|
| `TELEGRAM_BOT_TOKEN` | Telegram bot token from @BotFather | required |
| `TELEGRAM_CHAT_ID` | Your Telegram chat ID | required |
| `VAULT_PATH` | Path to Obsidian vault | `~/Documents/Obsidian Vault` |
| `AGENT_CLI` | CLI agent to use (`claude` or `opencode`) | `claude` |
| `AGENT_MODEL` | Model name passed to the CLI | `sonnet` |
| `AGENT_MAX_TIME` | Hard ceiling per agent run (seconds) | 7200 (2h) |
| `AGENT_IDLE_TIME` | Kill agent if no output for this long (seconds) | 300 (5min) |
| `TASK_CONCURRENCY` | Max parallel connector agent runs | 1 |
| `TASK_POLL_INTERVAL` | Initial poll interval in seconds (adaptive: 10m–1h) | 600 (10min) |
| `SKILL_EXTRACT_INTERVAL` | Seconds between skill extractions | 3600 |
| `SKILL_ORGANIZE_INTERVAL` | Seconds between skill reorganizations | 86400 |
| `KNOWLEDGE_LINT_INTERVAL` | Seconds between knowledge lint runs | 86400 |
| `TOOL_REVIEW_INTERVAL` | Seconds between tool reviews | 2592000 (30d) |
| `FEEDBACK_CHECK_INTERVAL` | Seconds between feedback learning cycles | 86400 |
| `DASHBOARD_PORT` | Web dashboard port | 3120 |
Real-time web dashboard at http://localhost:3120 with left sidebar navigation:
- Overview: uptime, completed tasks, active agents, concurrency gauge
- Tasks: all handled tasks (done/attempted) with quality scores, filterable by connector and status
- Connectors: per-connector stats, poll counts, audit logs
- Tools: generated and extracted tool scripts
- Skills: learned skills with hierarchy
- Loops: background loop stats with run history (adaptive intervals visible)
- Activity Log: real-time SSE stream, filterable by connector
API endpoints:
- `GET /api/status` — JSON status snapshot
- `GET /api/agents` — active agents list
- `GET /api/connectors` — connector stats
- `GET /api/audit/:name` — parsed audit log entries
- `GET /api/tools` — generated/extracted tools
- `GET /api/skills` — learned skills
- `GET /api/loops` — background loop stats with history
- `GET /api/events` — SSE stream for real-time updates
ThinkOps is designed to minimize token consumption:
| Optimization | Savings |
|---|---|
| `--bare` mode for analysis-only spawns (preflight, critique, eval) | ~15–20K tokens/spawn |
| Adaptive polling (10m–1h, grows when idle) | Fewer idle polls |
| Generated tools (replace repeated LLM patterns with scripts) | 100% for learned patterns |
| `## Check` change detection (connector-level) | Skip poll when nothing changed |
| Context caps: audit log (50 lines), skills (1.5K/file, 8K total), preflight (5K) | Bounded prompt growth |
| Sub-task splitting with `fast: true` | Skip critique/eval for mechanical tasks |
When the agent CLI returns a rate limit error (429, "hit your limit", etc.), ThinkOps applies exponential backoff for that connector (5min initial, doubling up to 1hr max). Backoff resets on the next successful run.
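The backoff policy above can be sketched as a small pure function. The function name and the `null` convention for "no backoff active" are assumptions for illustration; the 5-minute start, doubling, and 1-hour cap come from the description.

```typescript
// Exponential backoff after a rate-limit error: 5 min initial, doubling
// up to a 1 h cap. A `null` current value means no backoff is active
// (i.e. the last run succeeded and the backoff was reset).
function nextBackoff(currentS: number | null): number {
  const INITIAL_S = 5 * 60;  // 5 minutes
  const MAX_S = 60 * 60;     // 1 hour cap
  if (currentS === null) return INITIAL_S;
  return Math.min(currentS * 2, MAX_S);
}
```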
~/Documents/Obsidian Vault/
connectors/ # Task sources (Jira, GitHub, manual lists)
thinkops.md # Self-improvement connector
knowledge/
_schema.md # Wiki conventions
_index.md # Content catalog
sources/ # Raw source material (immutable)
entities/ # People, projects, technologies
topics/ # Concepts, patterns, comparisons
skills/
_schema.md # Skill format instructions
_tree.md # Auto-maintained hierarchy
coding/ # Domain folders (auto-created)
tools/
_gen_<fingerprint>.sh # Auto-generated replacement scripts
thinkops/
_run_log.md # All agent activity (append-only)
audit/ # Per-connector audit logs
actions/ # Action sequence history (for pattern detection)
A connector is an endless task source. Each .md file in connectors/ describes where to fetch tasks and how to work on them.
```md
code directory: /path/to/project
using git worktree from upstream/master to isolate tasks.
create pr to apache/doris

## Tasks

### 1. Trigger CI builds (check every poll)
List open PRs: `gh pr list --repo apache/doris --author me --state open`
For each PR, check if latest commit has a "run buildall" comment.
If not, comment `run buildall` on the PR.

### 2. Resolve PR feedback (one PR per run)
Pick ONE PR with unresolved review comments or CI failures. Fix it fully.

## Check
gh pr list --repo apache/doris --author me --state open --json number,updatedAt

## Context
Additional context for the agent (auth tokens, conventions, etc.)
```

How it works:
- Each connector gets its own independent polling loop with adaptive intervals.
- Pre-flight analyzes state and splits work into discrete sub-tasks.
- Fast sub-tasks (commenting, labeling) run in parallel without critique/eval.
- Full sub-tasks (code changes) go through the complete pipeline.
- `## Check` provides cheap change detection — if output is unchanged, skip the poll entirely.
- Completed tasks are tracked in `thinkops/audit/<connector>.md`.
- Concurrency is controlled by `TASK_CONCURRENCY` (default 1).
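The `## Check` change detection can be sketched by hashing the check command's output and comparing against the previous poll. The function and map names are assumptions for illustration, not the actual orchestrator internals.

```typescript
import { execSync } from "node:child_process";
import { createHash } from "node:crypto";

// Last-seen hash of each connector's `## Check` output.
const lastCheckHash = new Map<string, string>();

// Run the connector's check command; return true only if its output
// changed since the last poll (meaning a real poll is warranted).
function shouldPoll(connector: string, checkCmd: string): boolean {
  const out = execSync(checkCmd, { encoding: "utf8" });
  const hash = createHash("sha256").update(out).digest("hex");
  if (lastCheckHash.get(connector) === hash) return false; // unchanged — skip
  lastCheckHash.set(connector, hash);
  return true;
}
```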
After each task, an eval agent reviews the result and routes findings:
Task completed → Critic challenges → Eval reviews
├── SKILL: behavioral pattern → saved as skill for future runs
├── CODE: prompt/code fix → task added to thinkops connector
└── CRITICAL: serious bug → Telegram alert for human review
Past eval findings and outcome feedback feed back into future pre-flight analyses:
Eval finding → skill file / audit log
→ loaded by pre-flight for next task
→ agent avoids repeating the same mistake
| Command | Description |
|---|---|
| `/status` | Show ThinkOps status |
| `/connectors` | List all connectors with completed task counts |
| `/audit <name>` | Show audit log for a connector |
| `/query <question>` | Query the knowledge wiki |
| `/lint` | Run knowledge wiki audit |
| `/skills` | Show skill tree |
| `/ingest <path>` | Ingest a source into the wiki |
When an agent needs input or has low confidence, it outputs `HUMAN_INPUT_NEEDED: <question>`. The orchestrator sends the question to Telegram, waits for your reply, then resumes the agent session.
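Detecting the marker in agent output can be sketched with a one-line scan. The function name and the exact regex are assumptions; only the `HUMAN_INPUT_NEEDED:` marker itself comes from the description above.

```typescript
// Scan agent output for the HUMAN_INPUT_NEEDED marker and extract the
// question to forward to Telegram. Returns null when no input is needed.
function extractHumanQuestion(output: string): string | null {
  const m = output.match(/^HUMAN_INPUT_NEEDED:\s*(.+)$/m);
  return m ? m[1].trim() : null;
}
```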
Implement the `AgentCLI` interface in `src/agent/`:

```ts
interface AgentCLI {
  name: string;
  execute(prompt: string, opts?: { cwd?: string; model?: string; bare?: boolean }): Promise<CLIResult>;
  resume(sessionId: string, prompt: string, opts?: { cwd?: string }): Promise<CLIResult>;
}
```

Then add it to `src/agent/spawner.ts`.
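A toy adapter might look like the sketch below. It uses `echo` as a stand-in CLI and synchronous calls so it runs anywhere; a real adapter returns `Promise<CLIResult>` as the interface requires and spawns the actual agent binary with its real flags. The `CLIResult` field names here are assumptions.

```typescript
import { execFileSync } from "node:child_process";

// Assumed minimal result shape — check src/agent/types.ts for the real one.
interface CLIResult {
  output: string;
  exitCode: number;
  sessionId?: string;
}

// Hypothetical adapter: `echo` stands in for a real agent CLI so the
// sketch is runnable. A real adapter would map opts.model / opts.bare
// to the CLI's actual flags and wrap calls in Promises.
class EchoCLI {
  name = "echo-cli";

  execute(prompt: string, opts?: { cwd?: string; model?: string; bare?: boolean }): CLIResult {
    const out = execFileSync("echo", [prompt], { cwd: opts?.cwd, encoding: "utf8" });
    return { output: out.trim(), exitCode: 0 };
  }

  resume(sessionId: string, prompt: string): CLIResult {
    // A real adapter would pass sessionId to the CLI's resume flag.
    return this.execute(`[resume ${sessionId}] ${prompt}`);
  }
}
```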
src/
index.ts # Entry point + --check flag
config.ts # Zod-validated config from .env
check.ts # Health check (vault, CLI, Telegram)
orchestrator.ts # Task splitting, parallel execution, adaptive polling
agent/
types.ts # AgentCLI + CLIResult + ToolAction interfaces
claude-cli.ts # Claude Code adapter (stream-json, action capture)
opencode-cli.ts # OpenCode adapter
spawner.ts # Template loading + CLI dispatch + bare mode
telegram/
bot.ts # Telegraf bot (Q&A bridge)
web/
server.ts # HTTP server + SSE + JSON API
dashboard.html # Single-file real-time dashboard (sidebar navigation)
utils/
run-logger.ts # Append to thinkops/_run_log.md
file-watcher.ts # chokidar wrapper
action-tracker.ts # Record actions, fingerprint, detect repeats
prompts/ # Prompt templates (THE BRAIN)
task-preflight.md # Split work into sub-tasks
connector-run.md # Execute a single sub-task
task-critique.md # Adversarial review
task-recover.md # Analyze failure, decide retry/escalate/abandon
eval-run.md # Quality review → SKILL / CODE / CRITICAL
tool-gen.md # Generate replacement scripts for repeated patterns
tool-extract.md # Extract reusable tools from agent output
tool-review.md # Review and maintain tool quality
feedback-check.md # Check outcomes of completed tasks
feedback-learn.md # Learn from outcome signals
knowledge-*.md # Ingest, query, lint
skill-*.md # Extract, organize, select
templates/ # Vault setup examples