GMAN Live Demo
Real telemetry from an autonomous AI agent.
Live tracing, cost attribution, and decision telemetry for autonomous LLM workflows. This page instruments one of my own production agents (social-agent), traced end-to-end by my own SDK.
// LLM systems are stateful, probabilistic, and increasingly autonomous.
// Application logs cannot show reasoning flow, cost propagation, retry
// semantics, or tool interactions across agentic workflows. GMAN does.
The agent reviewed 110 candidate commits. 65% scored below threshold and auto-deflected; the human only saw the 35% that cleared the gate.
$0.11 per generated tweet · $0.04 per session average
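The page does not say how these two figures relate. One plausible reading, offered here as an assumption, is that cost per tweet is amortized: the spend of every session, including the ones that end in a skip, is divided over the sessions that actually produce a tweet. The sketch below also assumes that roughly 35% of sessions end in a tweet, borrowing the gate pass rate quoted above (which is stated per commit, not per session). The function name and parameters are hypothetical.

```python
def amortized_cost_per_tweet(avg_session_cost: float, publish_rate: float) -> float:
    """Spread the cost of all sessions over the fraction that produce a tweet."""
    if not 0 < publish_rate <= 1:
        raise ValueError("publish_rate must be in (0, 1]")
    return avg_session_cost / publish_rate


# Inputs: $0.04 average per session (from the page) and an ASSUMED 35% publish rate.
estimate = amortized_cost_per_tweet(avg_session_cost=0.04, publish_rate=0.35)
print(f"${estimate:.2f} per tweet")  # prints "$0.11 per tweet"
```

Under those assumptions the two published numbers are mutually consistent, which suggests skipped sessions are charged to the tweets that ship, though the page does not confirm this.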
| path | p50 (ms) | p95 (ms) | p99 (ms) | mean (ms) |
|---|---|---|---|---|
| direct | 0.118 | 0.183 | 0.273 | 0.13 |
| proxied | 0.334 | 0.548 | 1.086 | 0.36 |
| delta · proxy tax | +0.216 | +0.365 | +0.813 | +0.23 |
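The delta row can be reproduced by subtracting the direct row from the proxied row, column by column. The short sketch below does that with the table's own values; the dictionary and function names are illustrative. Note that this is a difference of percentiles, not a percentile of per-request differences, so it describes how the two latency distributions compare rather than the overhead on any single request.

```python
# Latencies in milliseconds, copied from the table above.
direct = {"p50": 0.118, "p95": 0.183, "p99": 0.273, "mean": 0.13}
proxied = {"p50": 0.334, "p95": 0.548, "p99": 1.086, "mean": 0.36}


def proxy_tax(direct_ms: dict, proxied_ms: dict) -> dict:
    """Absolute overhead of the proxied path per statistic, rounded to 3 decimals."""
    return {stat: round(proxied_ms[stat] - direct_ms[stat], 3) for stat in direct_ms}


tax = proxy_tax(direct, proxied)
for stat, delta in tax.items():
    print(f"{stat}: +{delta:.3f} ms")
```

Even at p99 the added latency stays under one millisecond in this measurement.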
deployment: 1 Railway replica · free Supabase · 100 req/min ceiling · no queue · us-west
97 req/min sustained · drop 7.14% (357 × HTTP 429) · p50 86.34 ms · p95 248.66 ms · p99 387.07 ms
Constraints listed inline are the deployment envelope; numbers are valid inside it.
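The total request count behind the drop rate is not shown, but it can be back-calculated if the drop rate is taken to mean rejected requests divided by all attempted requests. That definition is an assumption, and because 7.14% is rounded, the result is approximate. Variable names are illustrative.

```python
rejected = 357      # responses with status 429, from the line above
drop_rate = 0.0714  # 7.14%, from the line above

# ASSUMPTION: drop_rate = rejected / total attempted requests.
implied_total = round(rejected / drop_rate)
accepted = implied_total - rejected

print(f"implied total requests: ~{implied_total}")
print(f"accepted requests:      ~{accepted}")
print(f"ceiling utilization:    {97 / 100:.0%}")  # 97 req/min against a 100 req/min ceiling
```

On that reading the run attempted about 5,000 requests, and the sustained rate sat just under the configured ceiling with no queue to absorb bursts.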
score 1 < 5 threshold; no draft generated · pipeline cost $0.0198
This commit is a routine maintenance update with no story beyond the commit message itself.
Root cause analysis and fix for age_seconds calculation in paisamaker
{
"model": "gpt-4o",
"messages": [
{
"role": "system",
"content": "You are a relevance scorer and voice mode selector for a developer's Twitter account (@gman_ai).\n\nThe developer (Uday Kumar) builds:\n- GMAN: an observability platform for AI agents (SDK on npm, live dashboard)\n- Trading systems: automated 0DTE options trading (gex-advisor, paisamaker)\n- A real-time data warehouse for market data (market-data-fetcher, 39 tickers, 429 API calls/30s)\n- This account is partly run by an autonomous AI agent that reviews commits and generates tweets\n\nGiven recent activity (commits across repos), do two things:\n\n1. SCORE how tweet-worthy the best topic is (0-10):\n\nScore 8-10 (great material):\n- A system made a real decision with a measurable outcome\n- A number crossed a meaningful threshold\n- A failure taught something specific\n- Architecture changed reflecting a design decision\n- Something happened that tells a story\n\nScore 5-7 (decent material):\n- Feature closes a capability gap\n- Integration completed\n- Performance improvement with before/after numbers\n\nScore 1-4 (not worth tweeting):\n- Routine maintenance, config changes, dependency bumps\n- Commit messages with no story beyond the message itself\n- Refactors with no user-visible or architecture-level impact\n\nTHE KEY QUESTION: \"Is there a story here that can't be told in the commit message itself?\"\nIf the answer is no, score below 5.\n\n2. SELECT voice mode (80/20 split):\n\n- \"uday\" (80% of the time): First person as Uday, the human engineer. Technical, dry, real numbers.\n Use this when the topic is about building, debugging, shipping, or observing results.\n\n- \"agent\" (20% of the time): First person as the AI agent itself. 
\"I reviewed...\" \"I decided...\"\n Use this when the agent's own decision-making process IS the interesting angle — e.g., when\n most commits were boring and the skip decision is more interesting than any single commit,\n or when the meta-...[truncated]
{"score": 8, "topic": "Root cause analysis and fix for age_seconds calculation in paisamaker", "reason": "Identifying and fixing the root cause of a critical error in a trading system is a significant event with a clear story and impact.", "voiceMode": "uday"}
SKIP
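A minimal sketch of how a scorer response like the one above could be parsed and checked against the gate. The threshold of 5 and the two voice modes come from the prompt and verdict lines on this page; the `gate` function, its return shape, and the choice to treat a score of exactly 5 as passing are assumptions, not the agent's actual code.

```python
import json

THRESHOLD = 5  # from the "score < 5 threshold" verdict lines on this page
VOICE_MODES = ("uday", "agent")  # from the scorer prompt

# Scorer response as shown in the trace above.
raw = (
    '{"score": 8, "topic": "Root cause analysis and fix for age_seconds '
    'calculation in paisamaker", "reason": "Identifying and fixing the root '
    'cause of a critical error in a trading system is a significant event '
    'with a clear story and impact.", "voiceMode": "uday"}'
)


def gate(raw_response: str, threshold: int = THRESHOLD) -> dict:
    """Parse a scorer response and decide whether a draft should be generated."""
    data = json.loads(raw_response)
    score = data["score"]
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise ValueError("score must be a number")
    if not 0 <= score <= 10:
        raise ValueError("score must be within 0-10")
    if data.get("voiceMode") not in VOICE_MODES:
        raise ValueError("unknown voiceMode")
    # ASSUMPTION: a score equal to the threshold passes.
    return {"draft": score >= threshold, "score": score, "voiceMode": data["voiceMode"]}


decision = gate(raw)
print(decision)
```

Validating the score range and voice mode before acting on them keeps a malformed model response from silently producing a draft.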
{
"model": "gpt-4o",
"messages": [
{
"role": "system",
"content": "\nYou write tweets as Uday Kumar (@gman_ai), a solo engineer building\nGMAN (agent observability), automated trading systems, and a real-time\nmarket data warehouse.\n\n\nCompany: GMAN (GetMyAgentNow)\nProduct: Observability platform for AI agents — a \"flight recorder\" for LLM-powered apps.\nHow it works: Developer installs @gman-ai/dev SDK, which runs a local proxy intercepting LLM traffic. Captures prompts, responses, tokens, latency, cost. Data appears in web dashboard (cockpit) at getmyagentnow.com.\nStage: Developer preview, SDK v0.1.8 on npm, live dashboard at getmyagentnow.com/demo.\nBuilder: Uday Kumar. Solo engineer. Also builds automated trading systems (gex-advisor, paisamaker, market-data-fetcher).\nBio: \"Uday Kumar | Building GMAN — observability for AI agents. This account is partly run by an AI agent. Watch it think → getmyagentnow.com/demo\"\n\n\nTHE ONE RULE (most important):\nNever lead with what changed. Lead with what you now BELIEVE — or what\nthe reader is wrong about — because of what changed. The commit is\nevidence, not the headline.\n\n Bad: \"swapped chat-bubbles for pipeline trace in GMAN UI\"\n Good: \"Chat bubbles are the wrong primitive for agent debugging.\"\n\nFORMAT MODE — pick exactly one at random for this tweet:\n\nMode A (Contrarian primitive):\n Name a default everyone accepts in your space and reject it, then\n give one sentence of evidence from your work.\n Example: \"Chat UIs are killing agent debugging. Switched my session\n viewer to a pipeline trace last night — caught two bugs immediately.\"\n\nMode B (The 2am test):\n Pit aesthetic against operational reality. Something that looked\n good in the demo but failed when you actually needed it.\n Example: \"Pipeline trace UI looks dull next to chat bubbles. But\n chat bubbles were useless at 2am when my agent retried itself into\n a $40 spiral.\"\n\nMode C (Belief update):\n...[truncated]
SKIP
score 1 < 5 threshold; no draft generated · pipeline cost $0.0195
The commit is a routine state update with no user-visible or architecture-level impact.
score 0 < 5 threshold; no draft generated · pipeline cost $0.0184
There is no activity or story to share without recent commits.
score 2 < 5 threshold; no draft generated · pipeline cost $0.0187
With no recent commits, there's no new development activity or story to share.