Shadman's AI Operating System

Agent Playbook

How each agent captures, synthesizes, and delivers · 4 agents · 18 flows

May 2026 · Architecture map
Legend: Source · Process · Storage · Decision · Output
Chief of Staff
Manages work ops, surfaces what matters each morning, and acts as an always-on note-taker and knowledge base. It is the first thing I check every day and the last thing I consult before a meeting.
Morning brief
How the daily digest is assembled and delivered before 09:00.
Pre-cache scripts fire
calendar + work queue fetched by 08:25
Cached JSON snapshots
calendar-YYYY-MM-DD.json, queue-YYYY-MM-DD.json
Most recent meeting notes
Granola transcripts from last 48h
08:30 CEST weekdays
Sonnet synthesis
brief prompt + all three sources, < 90s
Telegram delivery
plain text, max 4000 chars
Brief archived
briefs/nabila/YYYY-MM-DD.md
Why cache first? MCP tools that talk to external services can take 30-60s each. Fetching the snapshots by 08:25 means the 08:30 brief fires on time even if the calendar or queue service is slow.
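A minimal sketch of the cache-first read, assuming the snapshots follow the file naming shown above. The `cache_dir` location, the function names, and the empty-dict fallback are illustrative assumptions, not the actual scripts.

```python
import json
from datetime import date
from pathlib import Path


def snapshot_path(cache_dir: Path, kind: str, day: date) -> Path:
    """Build the snapshot path, e.g. <cache_dir>/calendar-2026-05-04.json."""
    return cache_dir / f"{kind}-{day.isoformat()}.json"


def load_snapshot(cache_dir: Path, kind: str, day: date) -> dict:
    """Read the cached snapshot for `day`; return an empty dict if it is missing.

    The synthesis step only ever reads files, so a slow external
    service cannot delay the brief.
    """
    path = snapshot_path(cache_dir, kind, day)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))
```

With this shape, a failed pre-cache run degrades the brief (one source is empty) instead of blocking it.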
Capture a thought
Sending any note via Telegram gets it classified, filed, and confirmed within ~30 seconds.
Telegram message
any free-form note, idea, or status update
every 60s
Inbox listener
appends to _scratch/inbox/YYYY-MM-DD.md
Router hashes the entry
skips duplicates, processes new entries only
Haiku classifier
Intent: capture
vs. recall, action, feedback, diagnose
Project match found
filed to project notes
No project match
filed to ideas folder
Mirrored to Obsidian vault
Keystone/Captures/YYYY-MM-DD title.md
ROOT indexes within 1h
Semantic index updated
searchable by ask/recall intents
Telegram confirmation
"Captured: [title]. Filed at [path]."
The pipeline gap this fixed: Before May 2026, captures landed in project notes but never reached the semantic index or the weekly action scanner. Now every capture is mirrored to the vault, so the Sunday action scanner can match it against open items and close them automatically.
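The duplicate-skipping step in the router could look roughly like this. It is a sketch under assumptions: SHA-256 as the hash, whitespace and case normalization before hashing, and an in-memory `seen` set are all illustrative choices, not details from the actual router.

```python
import hashlib


def entry_hash(text: str) -> str:
    """Hash a normalized entry so resends and whitespace changes collapse."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def new_entries(entries: list[str], seen: set[str]) -> list[str]:
    """Return entries not processed before, recording their hashes in `seen`."""
    fresh = []
    for entry in entries:
        digest = entry_hash(entry)
        if digest in seen:
            continue  # duplicate: already classified and filed
        seen.add(digest)
        fresh.append(entry)
    return fresh
```

Because the listener appends to the same daily inbox file on every poll, hashing each entry lets the router re-read the whole file safely and act only on what is new.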
Ask anything
Recall from the knowledge base using a natural language question via Telegram.
Telegram: "ask [question]"
or "recall [question]"
Router extracts question
strips the "ask " or "recall " prefix
Workspace scan
reads project notes, memory, handoffs, briefs
Sonnet, 180s timeout
Synthesis
grounded answer from workspace files only
Telegram reply
split across multiple messages if > 4000 chars
What it reads: project notes, memory shards, session handoffs, brief archives. It does not make external API calls, so it answers fast but only knows what has been indexed.
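The reply-splitting step for answers over 4000 characters can be sketched as below. The preference for cutting at line breaks is an assumption; the source only states that long replies are split across messages.

```python
def split_message(text: str, limit: int = 4000) -> list[str]:
    """Split text into chunks of at most `limit` chars, preferring line breaks."""
    chunks = []
    remaining = text
    while len(remaining) > limit:
        # Look for the last newline that keeps the chunk within the limit.
        cut = remaining.rfind("\n", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable newline: hard cut at the limit
        chunks.append(remaining[:cut])
        remaining = remaining[cut:].lstrip("\n")
    if remaining:
        chunks.append(remaining)
    return chunks
```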
Leave feedback
How feedback on a brief changes what the next brief looks like.
Telegram: "fb: [feedback]"
on any brief, any persona
Router classifies as brief_feedback
extracts target persona from context
Appended to feedback log
briefs/nabila/feedback.log with timestamp
next run reads the log
Brief prompt includes feedback
last 5 entries injected into synthesis prompt
Next brief reflects it
no manual prompt editing required
Example: "fb: stop showing me cancelled meetings" gets logged and injected into the next morning's synthesis prompt. The brief adjusts without touching any config file.
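A sketch of how the last five log entries might be injected into the synthesis prompt. The function name and the wording of the injected header are hypothetical; only the "last 5 entries" rule comes from the flow above.

```python
def build_brief_prompt(base_prompt: str, feedback_lines: list[str], keep: int = 5) -> str:
    """Append the most recent `keep` feedback entries to the synthesis prompt."""
    recent = [line.strip() for line in feedback_lines if line.strip()][-keep:]
    if not recent:
        return base_prompt
    bullets = "\n".join(f"- {line}" for line in recent)
    return f"{base_prompt}\n\nStanding feedback from the reader:\n{bullets}"
```

Keeping only the newest entries bounds the prompt size, at the cost of older feedback eventually dropping out of the window.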
Midday ping
Weekdays at 13:00: the two or three things that still need to close before end of day. Reuses the morning's cached data, so it never re-fetches.
Morning cache (already fetched)
calendar + queue snapshots from 08:25 run
Today's inbox
captures and updates since morning — new signal only
13:00 CEST weekdays, Haiku 90s
Priority narrowing
cross-ref: unblocked + high impact + not waiting on someone else; skips anything inbox shows as done
Telegram delivery
2-3 bullets + one line on what's left on the calendar today
Brief archived
briefs/nabila/YYYY-MM-DD-midday-ping.md
Design choice: always fires, no conditional. Even on quiet days, a "nothing new, here's what's left on the calendar" message is useful. The constraint is Haiku at 90s — fast enough to be in the flow, cheap enough to run daily.
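The priority-narrowing rule can be expressed as a plain filter. In the flow above the narrowing is done by Haiku; the `QueueItem` fields and the title-based "done" check below are hypothetical stand-ins used only to make the rule concrete.

```python
from dataclasses import dataclass


@dataclass
class QueueItem:
    title: str
    blocked: bool
    high_impact: bool
    waiting_on_other: bool


def narrow(items: list[QueueItem], done_titles: set[str], limit: int = 3) -> list[QueueItem]:
    """Keep unblocked, high-impact items I can act on myself, minus anything done.

    `done_titles` holds lowercase titles the inbox already shows as finished.
    """
    picks = [
        item for item in items
        if not item.blocked
        and item.high_impact
        and not item.waiting_on_other
        and item.title.lower() not in done_titles
    ]
    return picks[:limit]
```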
Block scan
Mondays at 08:15: looks at the last 7 days of captured notes for anything that keeps appearing without a completion signal. Stuck work, surfaced early in the week.
Last 7 days of inbox files
_scratch/inbox/YYYY-MM-DD.md, newest last
Ideas folder (14-day window)
slugs captured 3+ times flagged as recurring
Haiku, 90s
Recurring topic detection
topic on 3+ different days, no done/shipped/merged/closed near it
Blocks found?
NO_BLOCKS: silent exit, no Telegram
Diagnoses delivered
block type (cognitive/emotional/capability), root cause, next physical action, reframe question
Why Mondays: The week starts with a clean read on what didn't resolve last week. Up to 4 blocks, each under 60 words. The reframe question is the sharpest part: "Is this actually stuck, or just unscheduled?"
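A deterministic approximation of the recurring-topic rule, for illustration only. The real detection is done by Haiku; here "near it" is assumed to mean "on the same line", and the topic list is passed in rather than discovered.

```python
import re

DONE = re.compile(r"\b(done|shipped|merged|closed)\b")


def recurring_blocks(notes_by_day: dict[str, list[str]], topics: list[str], min_days: int = 3) -> list[str]:
    """Return topics seen on `min_days`+ different days with no completion word
    on any line that mentions them."""
    blocks = []
    for topic in topics:
        days = set()
        completed = False
        for day, lines in notes_by_day.items():
            for line in lines:
                lowered = line.lower()
                if topic.lower() not in lowered:
                    continue
                days.add(day)
                if DONE.search(lowered):
                    completed = True
        if len(days) >= min_days and not completed:
            blocks.append(topic)
    return blocks
```

Returning an empty list corresponds to the NO_BLOCKS case, where the scan exits without sending anything.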
1:1 prep
Thursdays at 09:30: cross-references open items from the last 1:1 against this week's changelogs and meeting transcripts. Only asks questions when there are genuine data gaps.
Last 1:1 brief
open decisions, commitments, RED/YELLOW OKR items
PPP changelogs this week
git log last 7 days on product-brain; changelog sections only
Granola meeting transcripts
PM, sprint, OKR, and planning meetings only; standups excluded
Sonnet, Thursdays 09:30
Gap detection
for each open item: is it covered by PPP or Granola this week? If not, it's a gap.
Gaps found?
no gaps: "Brief looks solid. Nothing needed." Telegram fires either way.
Targeted questions or clear
max 4 questions, each names a specific project or commitment
The constraint that makes it useful: every question must name a specific project or commitment, not a general topic. "What's the status of the onboarding OKR?" is banned. "Has the FoS 3-combo cost decision been made?" is the target.
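The gap check reduces to "which open items have no evidence this week". The sketch below uses a plain substring match as a stand-in for the Sonnet comparison, and the question wording is hypothetical; the four-question cap and the "Brief looks solid" message come from the flow above.

```python
def find_gaps(open_items: list[str], evidence_texts: list[str], max_questions: int = 4) -> list[str]:
    """An item is a gap when no changelog or transcript text mentions it."""
    corpus = " ".join(evidence_texts).lower()
    gaps = [item for item in open_items if item.lower() not in corpus]
    return gaps[:max_questions]


def prep_message(gaps: list[str]) -> str:
    """Telegram fires either way: targeted questions, or an all-clear."""
    if not gaps:
        return "Brief looks solid. Nothing needed."
    return "\n".join(f"Any update on: {item}?" for item in gaps)
```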
Meeting coaching
Every weekday at 18:00, every meeting that day gets scanned for behavioral patterns. Telegram only fires if there's something worth saying.
18:00 weekdays
end-of-day trigger; not tied to any specific meeting
All today's Granola transcripts
every meeting, not just 1:1s; standups and pure-listening sessions filtered out
Active performance themes
4 behavioral patterns: credit-sharing, backbencher identity, mid-spiral commits, conflict avoidance
Sonnet, up to 5 meetings
Transcript scan
exact quote + timestamp + which theme it matches
Coaching archive written
pm-workspace/coaching/YYYY-MM-DD-coaching.md, every run
Any incidents found?
+ repetition watch if same theme fires 3+ times
Telegram: coaching block
exact quote, theme, 2-sentence coaching note. Max 3 incidents.
Silent: clean day
archive written, Telegram skipped. No noise when nothing happened.
What makes it honest: The coaching is grounded in exact transcript quotes, not impressions. If the model can't find a line that matches a pattern, it says so. Clean days stay silent. The archive captures everything regardless.
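The delivery rule (max 3 incidents, repetition watch at 3+ hits, silence on clean days) can be sketched as follows. The incident dict keys are hypothetical, and counting repeats within a single run is an assumption; the source does not say whether the count spans days.

```python
from collections import Counter
from typing import Optional


def coaching_message(incidents: list[dict], max_incidents: int = 3, repeat_threshold: int = 3) -> Optional[str]:
    """Format the Telegram block, or return None on a clean day."""
    if not incidents:
        return None  # archive is still written elsewhere; Telegram is skipped
    lines = [
        f'[{i["timestamp"]}] "{i["quote"]}" ({i["theme"]})'
        for i in incidents[:max_incidents]
    ]
    counts = Counter(i["theme"] for i in incidents)
    repeats = sorted(theme for theme, n in counts.items() if n >= repeat_threshold)
    if repeats:
        lines.append("Repetition watch: " + ", ".join(repeats))
    return "\n".join(lines)
```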
Weekly action scanner
Sunday at 10:00: closes out the past week's commitments from completion signals, then layers in next week's calendar, OKRs, and project themes to produce a forward-looking actions doc.
Obsidian vault scanned
Keystone/, Captures/, Side Projects/ — action lines extracted
Completion signal matching
keyword overlap (2+ words, 5+ chars) against "completed, shipped, merged, sent..."
State file updated
matched items closed with reason; unmatched items persist; stale after 14 days
M365 calendar next Mon-Fri
via MCP; checked against a Swedish public holidays table
Squad OKRs
Q2 OKR CSV; RED/YELLOW status surfaced
Active project themes
_product-brain/ context for each open item
Sunday 10:00
Week actions doc
_work/actions/YYYY-MM-DD-week-actions.md — backward closed + forward context
Not just a checklist: the backward pass closes what's done, the forward pass adds why the remaining items matter next week. OKR status and calendar context mean the doc tells you what to prioritise, not just what's open.
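One reading of the completion-signal rule, as a sketch: an action closes when a line containing a completion word shares at least two keywords of five or more characters with it. The source's completion-word list is elided, so only the four named words are used here, and the tokenization is an assumption.

```python
import re

# The source list continues past these four; only the named words are used.
COMPLETION_WORDS = {"completed", "shipped", "merged", "sent"}


def keywords(text: str, min_len: int = 5) -> set[str]:
    """Lowercase words of `min_len`+ chars, excluding the completion words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if len(w) >= min_len} - COMPLETION_WORDS


def is_closed(action: str, signal_lines: list[str], min_overlap: int = 2) -> bool:
    """True when some completion line shares `min_overlap`+ keywords with the action."""
    action_words = keywords(action)
    for line in signal_lines:
        line_words = set(re.findall(r"[a-z0-9]+", line.lower()))
        if not line_words & COMPLETION_WORDS:
            continue  # not a completion signal
        if len(action_words & keywords(line)) >= min_overlap:
            return True
    return False
```

The five-character floor filters out filler words, which is why an action made only of short words never auto-closes under this rule.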
Open loops drift
On the 1st of each month at 21:00: scans every memory shard for unresolved markers. Silent unless something is hot or going stale.
All memory shard files
~/.claude/projects/.../memory/*.md
Open marker count
grep: Open: / TODO: / Resume: / unchecked - [ ] across all shards
Hot or stale file?
hot: >5 open markers in one file; stale: file idle 60+ days with any open markers
Telegram alert (if drift)
lists hot and stale files by name; silent on clean months
Why monthly: the memory store accumulates cognitive debt slowly. Daily or weekly scans would produce noise. Monthly on the 1st is a natural review cadence — and silent months mean the store is healthy.
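The hot/stale classification can be written down directly from the thresholds above. The function name and the "ok" label are illustrative; how `idle_days` is measured (for example from file modification time) is left to the caller and is an assumption.

```python
import re

# Same markers as the grep step: Open: / TODO: / Resume: / unchecked "- [ ]".
OPEN_MARKER = re.compile(r"Open:|TODO:|Resume:|- \[ \]")


def classify_shard(text: str, idle_days: int, hot_over: int = 5, stale_after: int = 60) -> str:
    """Return "hot", "stale", or "ok" for one memory shard."""
    open_count = len(OPEN_MARKER.findall(text))
    if open_count > hot_over:
        return "hot"
    if open_count > 0 and idle_days >= stale_after:
        return "stale"
    return "ok"
```

A month where every shard returns "ok" produces no Telegram message.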
Revenue and Content
Tracks side project performance, monitors content reach, and surfaces the signal in a weekly revenue pulse. Built to keep moonlight work visible without needing a dashboard open.
Weekly revenue pulse
How side project metrics are gathered and delivered twice a week (Mon 08:00 and Thu 17:00).
Analytics scripts fire
GA4 for two properties, Sunday 07:30
28-day snapshots saved
traffic, affiliate CTR, search rankings
Ad platform pulse
daily 08:00, flight performance + CTR
Mon 08:00 + Thu 17:00
Sonnet synthesis
revenue tracking against $5K/mo target
Telegram delivery
numbers, delta, and action signal if off-track
Content idea pipeline
How a rough thought becomes a filed draft ready to publish.
Idea via Telegram capture
intent: capture, topic: content
Filed to ideas folder
_scratch/content/ideas/YYYY-MM-DD-slug.md
Outbound pulse picks it up
ideas surface in Mon/Thu synthesis
Draft expanded in session
Substack + LinkedIn article + post variants
Campaign bundle filed
_scratch/content/posts/YYYY-MM-DD-slug/
Published + performance tracked
next pulse surfaces reach and engagement
SEO position watch
How keyword ranking changes surface as actionable signals.
Search Console snapshot
impressions, clicks, average position
Delta calculated
vs. 28-day prior snapshot
Position moved?
target keywords watched explicitly
Alert in Telegram pulse
rank + delta + suggested action if drop detected
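A sketch of the delta step, assuming each snapshot maps a watched keyword to its average position. The `min_drop` threshold and the example keywords in the usage are hypothetical; the source does not state how large a move triggers an alert.

```python
def position_alerts(current: dict[str, float], prior: dict[str, float], min_drop: float = 1.0) -> list[str]:
    """Compare average positions against the prior snapshot.

    Lower position numbers rank higher, so a positive delta is a drop.
    """
    alerts = []
    for keyword in sorted(current):
        if keyword not in prior:
            continue  # no baseline yet for this keyword
        delta = current[keyword] - prior[keyword]
        if delta >= min_drop:
            alerts.append(f"{keyword}: position {current[keyword]:.1f} (down {delta:.1f})")
    return alerts
```

For example, a keyword moving from position 4.0 to 7.5 yields one alert line, while a move from 3.2 to 3.0 is an improvement and stays silent.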
Catalog health check
Verifies every Amazon affiliate link in the PrintPick catalog against live ASINs. Alerts only when new links break — silent on stable runs.
Full product catalog
all printer records with ASINs
Node.js ASIN verifier
checks each ASIN against Amazon; marks broken vs valid
State file diffed
current broken list vs prior run; first run saves baseline silently
New broken ASINs?
count grew OR new slugs appeared in broken list
Telegram drift alert
lists newly broken slugs + recovery commands to re-discover replacements
Revenue protection: a broken Amazon link loses the affiliate commission silently. This catches drift before it compounds. The alert includes the exact Node.js commands to run discovery and apply fixes.
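The state diff behind the alert is a set difference. This is a minimal sketch, assuming the state file stores the set of broken slugs; `None` stands for "no prior state file", which is the silent first-run baseline.

```python
from typing import Optional


def newly_broken(current: set[str], prior: Optional[set[str]]) -> set[str]:
    """Return slugs broken now but not in the prior run.

    A missing prior state (first run) only records the baseline, so nothing alerts.
    """
    if prior is None:
        return set()
    return current - prior
```

The set difference covers both trigger conditions: if the broken count grew, at least one slug must be new, and a new slug is caught even when another link recovered and the count stayed flat.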
Code and PR Review
Monitors code repos, surfaces PR queue and CI status at end of day, and flags anything that needs attention before the next standup.
Daily PR digest
How PR queue and CI status arrive before end of day.
GitHub API
via authenticated CLI, work repos
16:00 weekdays
PR queue fetched
open PRs, review requests, CI status
Sonnet synthesis
terse format: what needs action vs. waiting
Telegram delivery
PRs needing review first, CI failures flagged
Brief archived
briefs/kit/YYYY-MM-DD.md
Triage rule: PRs waiting on me appear first. PRs waiting on others appear as context. CI failures always surface even if nothing is blocking.
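The triage rule can be sketched as a stable sort over the PR queue. The dict keys (`number`, `title`, `requested_reviewers`, `ci`) are a hypothetical simplified shape, not the GitHub API response format.

```python
def triage(prs: list[dict], me: str) -> list[str]:
    """Order the digest: PRs waiting on me, then CI failures, then context."""
    labels = ("needs my review", "CI failing", "waiting on others")

    def rank(pr: dict) -> int:
        if me in pr["requested_reviewers"]:
            return 0
        return 1 if pr["ci"] == "failure" else 2

    lines = []
    # sorted() is stable, so the input order is kept within each rank.
    for pr in sorted(prs, key=rank):
        r = rank(pr)
        flag = " [CI failing]" if r == 0 and pr["ci"] == "failure" else ""
        lines.append(f"#{pr['number']} {pr['title']}: {labels[r]}{flag}")
    return lines
```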
CI failure alert
How a failing build surfaces immediately, not just at 16:00.
GitHub CI run
on push or PR update
Pulse script checks status
looks for conclusion: failure on recent runs
Failure on my branch?
filters by author + branch ownership
Immediate Telegram alert
repo, branch, failed job, link to logs
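A sketch of the filter the pulse script applies to recent runs. The run dict keys are a hypothetical simplified shape, and the `alerted_ids` set (so a repeated poll does not re-send the same failure) is an added assumption, not something the source describes.

```python
def failures_to_alert(runs: list[dict], me: str, alerted_ids: set) -> list[str]:
    """Return alert lines for failed runs on my branches not yet reported."""
    alerts = []
    for run in runs:
        if run["conclusion"] != "failure" or run["author"] != me:
            continue
        if run["id"] in alerted_ids:
            continue  # already sent on an earlier poll
        alerted_ids.add(run["id"])
        alerts.append(f"{run['repo']} {run['branch']}: {run['job']} failed {run['url']}")
    return alerts
```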
Hobby and Side Projects
Keeps personal and pro-bono projects moving without letting them disappear for weeks. Low-pressure check-ins, no deadlines enforced, just visibility.
Hobby project check-in
How personal projects surface a brief weekly status without requiring manual updates.
Project files scanned
_hobby/* notes and progress files
weekly, low priority
Last-touched and open items
what was last updated, what is stuck
Haiku synthesis
terse: one line per project, no fluff
Telegram delivery
status + one suggested next step per project
Design intent: Musa never nags. If a hobby project went cold for three weeks, it shows that as a fact, not a failure. The goal is awareness, not accountability.
Go-live tracking
How Musa tracks deployment health and open issues on hobby projects.
Deployment platforms checked
Vercel status, last deploy timestamp
Open GitHub issues scanned
hobby repos, labelled bugs and features
Anything broken or stale?
deploy older than 30d or open P0 bug
Flagged in check-in
surfaced at top of Musa weekly brief
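The "broken or stale" decision is a two-condition check. In this sketch the label names `P0` and `bug` and the way issue labels are passed in are assumptions; the 30-day threshold comes from the flow above.

```python
from datetime import datetime, timedelta


def needs_flag(last_deploy: datetime, open_issue_labels: list[list[str]], now: datetime, max_age_days: int = 30) -> bool:
    """Flag a project when its deploy is older than 30 days or it has an open P0 bug."""
    stale = now - last_deploy > timedelta(days=max_age_days)
    p0_bug = any("P0" in labels and "bug" in labels for labels in open_issue_labels)
    return stale or p0_bug
```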
Building something similar?
This OS is a work in progress. I share what I learn on Substack and LinkedIn. If you're building personal AI infrastructure and want to compare notes, reach out.