What I build
Hands-on building and implementation. Engage me for a single workstream or as your embedded AI engineer. Every one of these ends in working software in your stack, owned by your team.
RAG & knowledge systems
Grounded answers over your own data, without the hallucinations that get a project shut down.
You get: An ingestion and retrieval pipeline, an eval set that measures answer quality, and a working assistant or API your team can extend.
Agents that take real action
Multi-step, tool-using agents that do work, not just chat about it.
You get: An agent wired into your APIs and tools with guardrails, retries, error handling, and an audit trail. The kind of rigor you need once it's touching real systems or money (I've built exactly this over email, Brex, and enrichment APIs).
AI workflow automation
The unglamorous internal work that quietly saves hours every week.
You get: Automations for triage, drafting, research, and back-office flows, measured in hours saved and integrated into the tools you already use.
Production hardening / LLMOps
Turn the promising prototype into something you can actually rely on.
You get: Eval harness, cost + latency controls, observability, prompt/version management, and the fixes that get a stalled pilot to production.
Model selection & fine-tuning
The right model and the right technique, decided by testing, not vendor decks.
You get: A model/approach recommendation backed by evals on your data, plus fine-tuning or distillation where prompting genuinely isn't enough.
Honest strategy (when you need it)
A short, real roadmap: what's worth doing, what isn't, and what it costs.
You get: A prioritized, costed plan grounded in what's actually buildable. A few tight pages you'll use, not a 50-page deck you'll file.