Workflow args are currently untyped: they arrive as a plain object, and every skill that reads them has to trust that the caller got them right. There's no schema validation at the boundary, so a missing or misspelled arg silently propagates until something breaks deep inside a skill execution. It also means workflow authors get no autocomplete or type errors when defining their args, and skill authors have to defensively check every field. Should we consider letting workflows declare an args schema up front, so the args object is fully typed and validated before the workflow runs?
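A minimal sketch of what validation at the boundary could look like. `ArgSpec` and `validateArgs` are illustrative names, not flue's real API:

```typescript
// Hypothetical args schema checked at the workflow boundary.
type ArgSpec = { type: "string" | "number" | "boolean"; required?: boolean };

function validateArgs(
  schema: Record<string, ArgSpec>,
  raw: Record<string, unknown>,
): string[] {
  const errors: string[] = [];
  for (const [name, spec] of Object.entries(schema)) {
    const value = raw[name];
    if (value === undefined) {
      // catch missing required args here, not deep inside a skill
      if (spec.required) errors.push(`missing required arg "${name}"`);
      continue;
    }
    if (typeof value !== spec.type) {
      errors.push(`arg "${name}" should be ${spec.type}, got ${typeof value}`);
    }
  }
  for (const name of Object.keys(raw)) {
    // flag misspelled args instead of silently dropping them
    if (!(name in schema)) errors.push(`unknown arg "${name}"`);
  }
  return errors;
}
```

Run before the workflow starts, a non-empty error list would abort the run instead of letting a bad arg surface mid-skill.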
Repository: withastro/flue. Description: The sandbox agent framework. Stars: 9, Forks: 0. Primary language: TypeScript. Languages: TypeScript (48.5%), JavaScript (29.2%), Astro (22%), CSS (0.3%). License: Apache-2.0. Homepage: https://www.flueframework.com. Open PRs: 0, open issues: 3. Last activity: 26m ago. Community health: 62%. Top contributors: FredKSchott.
The sandbox instructions file currently has to be manually copied into the Docker image at a magic path by every consumer's Dockerfile. This leaks implementation details (flue sets a path that OpenCode then reads from) and means changing the instructions requires an image rebuild. Since flue already injects content into the prompt, it could just read the file at runtime and include it directly. The agent gets the same instructions either way, the copy line disappears from every Dockerfile, and the instructions always match the current workspace instead of whatever was baked into the image.
Every repo that uses flue workflows with a sandbox currently needs to copy-paste the same ~40 lines of GitHub Actions YAML: checkout, docker login, buildx setup, build-push with caching and GHCR tagging. It's entirely generic except for the Dockerfile path and image name. When we improve the build (e.g. add multi-platform support, change the caching strategy, or update pinned action versions), every repo has to be updated independently. Should we consider publishing a reusable composite GitHub Action that encapsulates this? The action would enforce conventions by default (a conventional Dockerfile path and GHCR image name) but allow overrides via inputs for repos that need to customize. The path trigger has to stay in the consumer's workflow, since that's workflow-level config GitHub Actions can't encapsulate, but everything inside the job becomes a single line.
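A rough sketch of what a consumer's workflow might shrink to. The action name (`withastro/flue-sandbox-build`), input names, paths, and trigger are all illustrative, not a published API:

```yaml
# Hypothetical consumer workflow after adopting the reusable action.
name: Build sandbox image
on:
  push:
    branches: [main]
    # the path trigger stays here: workflow-level config that a
    # composite action cannot encapsulate
    paths: ["sandbox/**"]

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - uses: withastro/flue-sandbox-build@v1 # illustrative action name
        with:
          # overrides are optional; conventions apply by default
          dockerfile: sandbox/Dockerfile
          image: ghcr.io/${{ github.repository }}/sandbox
```

Pinned action versions and caching strategy would then be upgraded in one place instead of in every consuming repo.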
Skills running inside the sandbox keep hitting the same environmental pitfalls: commands failing without required setup, agents reaching for tools that aren't installed in the image, using the wrong package manager, and fixed waits instead of readiness polling. These are environment mismatches that waste 30-60 seconds per skill execution and appear in nearly every run. We could hardcode each "use this, not that" rule into every skill file, but that scatters environment knowledge across dozens of files. Should we consider a sandbox environment manifest that flue automatically injects into the agent's context when running in sandbox mode? Where should it live?
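One possible shape for the manifest, sketched in TypeScript. The field names and example values are illustrative; what the real manifest contains (and where it lives) is the open question:

```typescript
// Hypothetical manifest describing the sandbox environment, which flue
// could render into the agent's context when sandbox mode is enabled.
interface SandboxManifest {
  packageManager: string;        // the one actually installed in the image
  unavailableCommands: string[]; // tools agents keep reaching for that aren't there
  notes: string[];               // one-line environment rules
}

function renderManifest(m: SandboxManifest): string {
  return [
    "## Sandbox environment",
    `- Package manager: ${m.packageManager}`,
    `- Not installed: ${m.unavailableCommands.join(", ")}`,
    ...m.notes.map((note) => `- ${note}`),
  ].join("\n");
}
```

Injecting this once per run keeps the environment knowledge in one place instead of scattered across every skill file.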
In the Astro repo, we still need to run privileged CLI calls from inside the restricted sandbox (with sandbox mode enabled). Today that's impossible by design: the sandbox is untrusted, so we can't pass privileged tokens into it, and the sandbox cannot call back out to the host. Should we consider exposing an optional MCP tool to the container via a host-side MCP server? Workflow authors would declare a policy of allowed commands, and the LLM inside the container could then call an MCP tool that tunnels the command back to the host for execution. The host enforces the allowlist, rate limits, and per-command environment variables; the container never sees the actual tokens. This keeps the security model honest: the policy is set by the workflow, and anything not explicitly allowed is denied.

Alternatively, should we consider a proxy system that lets the LLM speak to services without ever seeing the tokens that access them? This would bring more flexibility and fewer restrictions to how the LLM talks to those services, but also greater risk that it could be misused or exploited. A surprise benefit: it would unlock supporting more LLM providers than just Anthropic, and our current Anthropic proxy could move to this public API.
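A sketch of the host-side, default-deny policy check for the tunnel approach. The types, function, and example command are illustrative; flue has no such API today:

```typescript
// Hypothetical policy declared by the workflow author.
interface CommandPolicy {
  allow: string[];                // exact commands the workflow permits
  env?: Record<string, string[]>; // env var names injected host-side, per command
}

// Host-side check for a command request tunneled out of the container.
// Default-deny: anything not explicitly allowed is rejected.
function authorize(
  policy: CommandPolicy,
  command: string,
): { ok: boolean; env: string[] } {
  if (!policy.allow.includes(command)) return { ok: false, env: [] };
  // tokens are resolved on the host; the container only ever sees output
  return { ok: true, env: policy.env?.[command] ?? [] };
}
```

The choice between this and the broader proxy comes down to how much surface area we are willing to expose for the extra flexibility.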