Anthropic
Published Claude Code as a thin client over their own MCP servers
Direct peer in the agent-tooling space — the public posture is 'use the thing you ship'. Useful read for what eat-your-dogfood looks like in writing.
Source: anthropic.com
Your stack ∩ mine
MCP-first server design — same primitive I run today across research/dev/architecture/marketing crews.
Source: docs/mcp
Your changelog calls out 'observability for tool-calls' as a Q3 priority — Langfuse + structured tracing is part of my default stack.
Source: changelog
OSS repos lean Python + TypeScript with explicit MCP servers — I ship in both, MCP-native (no wrappers).
Source: github.com/example-stackbeam
What you're already building toward
You're not a skeptic of this thesis — you're shipping the tooling that enables it. The question is whether your own platform team is run the same way: know-how documented as it's produced, infra agent-ready so the next iteration is a prompt instead of a rebuild. That's the operating mode I run today, and it's the one I'd bring to your team on day one.
Documented know-how
Agent-driven runbooks + decision logs. The platform that ships your product also documents it.
Agent-ready by default
MCP-native, composable internal tools. Pivots are one prompt away.
Eat your own dogfood
I run a generic 4-crew agent platform on hardware I bought myself. I know how it breaks.
Who's already moving this way in your space
You build agent tooling. Below are three peer-tier teams that have publicly described running their own platform internally — useful as a reference posture, not as a benchmark you have to match.
Anthropic
Published Claude Code as a thin client over their own MCP servers
Direct peer in the agent-tooling space — the public posture is 'use the thing you ship'. Useful read for what eat-your-dogfood looks like in writing.
Source: anthropic.com
Replit
Released Replit Agent and have written about using it internally
Adjacent posture: build the tool, live inside it, write about what breaks. A clean reference for the same operating mode.
Source: blog.replit.com
Vercel
v0 + AI SDK shipped alongside public posts on agent-assisted internal workflows
Same audience. The recurring theme in their writing is that the team that dogfoods first tends to write the abstractions everyone else copies.
Source: vercel.com/blog
I run a generic 4-crew agent platform end-to-end on my own hardware — happy to walk through what works and what doesn't on a short call.
What I'd do in the first 90 days
Day one I'd land an MCP-native test harness for the agent platform — your changelog flagged tool-call observability as Q3 and that's the boring infra that makes everything else shippable. Then I'd own the multi-crew dispatch path end-to-end: research → architect → dev → marketing, with Langfuse-style tracing on every hop. The third wedge is the one most platform candidates dodge — actually run a real end-to-end company on the platform and feed back the rough edges. I've been doing exactly this with my own fork.
Closest match — agent platform
If you're building agent tooling, you want someone who has burned hands operating a real one. I do.
Generic 4-crew agent platform (research/architecture/development/marketing) running real workloads — including this microsite — with zero per-customer code. MCP servers, Hermes director, Paperclip approvals.
Speed is the moat
The companies shipping agentic features in weeks are the ones where one engineer can rebuild infra, agent loop, and the demo app inside the same week. That's the move I make — and it's cheaper for you than a separate platform team because the same person who lands the abstraction also sees how it breaks in a real workload.
If after a short call it isn't the right fit, no pressure. Either way you get the analysis I already wrote about your stack.
Fidel Perez · Agent-platform engineer · 11+ yrs · MCP-native