gtram20 2 hours ago
I’ve consistently found when running agentic cloud/infra workflows that the first bottleneck is almost always context size from excessive tool use: models need to know enough about your cloud environment to answer architecture, Terraform, migration, and audit questions effectively. Inject too much every turn and the run gets expensive and noisy; inject too little and the agent wastes time rediscovering the system.

Which solutions have worked for others in this regard? Especially interested in:

- concise persistence between runs
- compressing state into a stable outline
- which tools to expose every turn vs. only on request
- avoiding stale context as environments change

I’m one of the devs working on CloudGo.ai, so this is a problem I've spent a lot of time thinking about, but effective solutions change rapidly, so I'm interested in concrete patterns from people building MCP servers or similar agents with lots of available tools.