This appendix provides a curated list of tools and resources for
building agent infrastructure. The landscape changes rapidly - tools
listed here may have been superseded by the time you read this. The
categories and evaluation criteria are more durable than the specific
tools.
When evaluating tools, consider five criteria. Maturity: Is it
production-ready or experimental? Check the GitHub stars, the release
cadence, and whether anyone is using it in production. Standards
compliance: Does it support MCP, OpenTelemetry, and other standards?
Non-standard tools create lock-in. Community: Is there an
active community? Can you get help when you’re stuck?
Maintenance: Is it actively maintained? Check the last commit
date and the issue response time. License: Is the license
compatible with your use case? Some tools are open-source for
development but require a commercial license for production.
Agent frameworks
Choosing an agent framework is one of the most consequential early
decisions in agent adoption. The framework determines your development
velocity, your operational capabilities, and your migration cost if you
need to switch later.
For teams starting out: Use the simplest framework that meets
your needs. If you’re building a single-agent system for coding tasks,
you may not need a framework at all - the model provider’s SDK
(Anthropic’s Python SDK, OpenAI’s Python SDK) plus a simple loop is
sufficient. Adding a framework adds complexity, and complexity you don’t
need is complexity that slows you down.
For teams building multi-agent systems: LangGraph or CrewAI
provide the orchestration primitives you need - agent definitions, tool
routing, state management, and handoff protocols. LangGraph is more
flexible (graph-based, supports arbitrary workflows). CrewAI is more
opinionated (role-based, supports team-of-agents patterns). Choose based
on whether you want flexibility or convention.
For teams building production infrastructure: Consider whether
you need a framework or a platform. A framework gives you building
blocks - you assemble them into a system. A platform gives you a
complete system - you configure it for your use case. Frameworks are
more flexible but require more engineering investment. Platforms are
less flexible but get you to production faster.
| Tool | Type | Language | License | Best For |
|---|
| LangChain | Framework | Python/JS | MIT | Complex chains and workflows |
| LangGraph | Orchestration | Python/JS | MIT | Graph-based agent workflows |
| CrewAI | Framework | Python | MIT | Role-based multi-agent teams |
| AutoGen | Framework | Python | MIT | Multi-agent conversations |
| OpenAI Agents SDK | SDK | Python | MIT | OpenAI-native agent building |
| Goose | Agent | Rust/Python | Apache 2.0 | MCP-native production agent |
| Mastra | Framework | TypeScript | MIT | TypeScript-first agent building |
Coding agents
| Tool | Type | Pricing | Best For |
|---|
| Ona | Autonomous | Usage-based | Production-grade fully autonomous engineering with full environment access, secure sandboxed execution, multi-repo migrations at scale, enterprise (VPC) |
| Claude Code | CLI + IDE | Usage-based | Deep codebase reasoning |
| OpenAI Codex | Cloud | Usage-based | Parallel task execution |
| Cursor | IDE | $20/month | Real-time code generation |
| Windsurf | IDE | Free/$15/month | Multi-file cascade editing |
| GitHub Copilot | IDE + CLI | $10-39/month | Workspace-aware completion |
| Devin | Autonomous | Usage-based | End-to-end task completion |
Context engineering
| Tool | Purpose | Language | License |
|---|
| Distill | Context deduplication and compression | Python | MIT |
| AnythingLLM | Local RAG and document management | JavaScript | MIT |
| Chroma | Vector database for embeddings | Python | Apache 2.0 |
| Weaviate | Vector database | Go | BSD-3 |
| Qdrant | Vector database | Rust | Apache 2.0 |
| skills.sh | Open agent skills ecosystem (reusable SKILL.md files) | TypeScript | MIT |
Authorization & security
| Tool | Purpose | Language | License |
|---|
| OpenFGA | Zanzibar-style authorization | Go | Apache 2.0 |
| sandbox-runtime | Agent sandboxing | Python | MIT |
| Bubblewrap | Unprivileged sandboxing | C | LGPL |
| gVisor | Application kernel for containers | Go | Apache 2.0 |
Observability
| Tool | Purpose | Language | License |
|---|
| OpenTelemetry | Distributed tracing and metrics | Multi | Apache 2.0 |
| Langfuse | LLM observability platform | TypeScript | MIT |
| LangSmith | LangChain observability | Python | Commercial |
| Arize Phoenix | ML observability | Python | Apache 2.0 |
MCP ecosystem
Local inference
| Tool | Purpose | Best For |
|---|
| Ollama | Local model runner | Easy setup, many models |
| llama.cpp | Optimized local inference | Performance-critical |
| vLLM | High-throughput serving | Production serving |
| LocalAI | OpenAI-compatible local API | Drop-in replacement |