I've been thinking about AI agents wrong. We keep treating them as this special thing that needs special infrastructure. But what if agent skills are just... functions you compose?
# Specify what skills the agent has - like importing modules
async with boxlite.SkillBox(skills=["anthropics/skills"]) as agent:
    # Skills loaded: pdf, docx, pptx, xlsx handling
    await agent.call("Read quarterly_report.pdf and summarize key metrics")
    await agent.call("Export that to a PowerPoint with charts")  # remembers context
That's SkillBox. You declare what skills your agent has upfront. Then call it like any other async function. State persists between calls.
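Since a SkillBox call is just something you await, you can wrap it in an ordinary async function and compose it like anything else. A minimal sketch of that idea (summarize_report is my own helper, and I'm assuming call() returns the agent's reply as text):

import boxlite

# Wrap an agent call in a plain async function - the "skills as functions" idea.
# Assumes call() returns the agent's reply as a string.
async def summarize_report(path: str) -> str:
    async with boxlite.SkillBox(skills=["anthropics/skills"]) as agent:
        return await agent.call(f"Read {path} and summarize key metrics")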
The same skill format works across Claude Code, OpenAI Codex CLI, and Gemini CLI. Skills are becoming the package manager for agent capabilities.
# Install skills dynamically - like pip install at runtime
await agent.install_skill("anthropics/skills")      # PDF, Word, Excel, PowerPoint
await agent.install_skill("example/web-scraper")    # Custom scraping skill
await agent.install_skill("your-company/internal")  # Your private skills
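A sketch of what using a freshly installed skill might look like in the same session; the prompts here are mine, and I'm assuming a skill is available to any call made after install_skill():

# Sketch: use a just-installed skill in the same session (prompts are illustrative).
await agent.install_skill("example/web-scraper")
prices = await agent.call("Scrape current prices for these products and tabulate them")
await agent.call("Put that table into an Excel file")  # state still carries between calls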
Orchestrating multiple agents:
async with boxlite.SkillBox(skills=["anthropics/skills"]) as doc_agent:
    async with boxlite.SkillBox(skills=["example/web-research"]) as research_agent:
        research = await research_agent.call("Find Q3 earnings for top 5 tech companies")
        analysis = await doc_agent.call(f"Analyze this data: {research}")
Just async Python. No YAML. No special DSL.
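Because it really is just async Python, the standard library handles the orchestration too. A sketch using asyncio.gather to fan out agents in parallel; the prompts are illustrative, and I'm assuming separate SkillBox instances can run concurrently:

import asyncio
import boxlite

async def main():
    # Fan out independent agents with plain asyncio - no framework required.
    async with boxlite.SkillBox(skills=["example/web-research"]) as earnings_agent, \
               boxlite.SkillBox(skills=["example/web-research"]) as news_agent, \
               boxlite.SkillBox(skills=["anthropics/skills"]) as doc_agent:
        earnings, news = await asyncio.gather(
            earnings_agent.call("Find Q3 earnings for top 5 tech companies"),
            news_agent.call("Summarize this week's major tech headlines"),
        )
        await doc_agent.call(f"Write a one-page brief from: {earnings}\n{news}")

asyncio.run(main())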
The fun demo - Starbucks: one function call, and Claude navigates the website, handles popups, and adds a coffee to the cart. You can watch it live via noVNC at localhost:3000.
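Roughly the shape of that call, sketched from the description above; the example/browser skill name and the prompt are placeholders, not the actual demo code:

# Placeholder skill name and prompt - a sketch of the demo, not its real code.
async with boxlite.SkillBox(skills=["example/browser"]) as barista:
    await barista.call("Go to starbucks.com, dismiss any popups, and add a grande latte to the cart")
    # Watch the browser session live via noVNC at http://localhost:3000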
Yeah, it runs in a micro-VM (hardware isolation). But that's a feature, not the point. The point is the programming model.