I build AI systems that ship
Erick Paniagua. I design and ship production agentic systems end to end: voice agents, LLM pipelines, and the eval and safety layers around them. AI-native from day one.
Most teams have AI tools. Few have AI leverage.
The gap is not access to models. It is getting them into production: wired into real workflows, measured, and trusted. That is the work I do.
- Prototype with no path forward
- No evals, no trust
- Not wired into real data
- Works in demo, fails in prod
- No observability
- Brittle integrations
- Unmeasured outputs
- Never handed off to the team
I close the gap: wire the model into real workflows, add eval gates, and hand it off as a system the team can trust.
WHAT I BUILD
From prototype to production.
Voice Agents
Production conversational voice systems: real-time, multi-tenant, with consent and call-outcome classification.
LLM Pipelines
RAG, structured outputs, tool use, and multi-step agent workflows wired into real data.
Eval Harnesses
LLM-as-judge gates with versioned rubrics and deterministic pass/fail in code, not vibes.
Agentic Automation
Agents that own end-to-end workflows: ingest, decide, act, on a schedule.
Multi-tenant SaaS
Row-level security, atomic billing, webhook integrity, built to run itself.
Data Pipelines
Scraping, enrichment, storage, and analysis with per-run cost budgets as a design constraint.
Forward-Deployed Delivery
Embed with a team, translate messy operations into shipped AI systems.
MCP and Integrations
Connect agents to the tools and systems a business already runs on.
Prompt Engineering
Systematic prompt design, model selection, and structured output tuning for production reliability.
Observability and Monitoring
Structured logging, cost tracking, and latency dashboards across deployed AI systems.
Each engagement is scoped to one system. Ship it, measure it, then expand.
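As a rough illustration of the Eval Harnesses item above, here is a minimal sketch of a gate with a versioned rubric and a deterministic pass/fail decision made in code. Everything named here (Rubric, GateResult, run_gate, stub_judge, the criteria and thresholds) is hypothetical and chosen for illustration; it is not taken from any client system, and the judge is a stub where a real harness would call a model.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Rubric:
    """A versioned rubric: criterion name -> minimum passing score (0.0 to 1.0)."""
    version: str
    thresholds: Dict[str, float]


@dataclass(frozen=True)
class GateResult:
    rubric_version: str
    passed: bool
    failures: Dict[str, float]  # criteria that scored below their threshold


def run_gate(output: str, rubric: Rubric,
             judge: Callable[[str, str], float]) -> GateResult:
    """Score `output` on every criterion, then apply fixed thresholds.

    `judge(output, criterion)` stands in for an LLM-as-judge call (assumed,
    not a real API). The verdict is computed in plain code, so the same
    scores always produce the same pass/fail result.
    """
    failures: Dict[str, float] = {}
    for criterion, minimum in sorted(rubric.thresholds.items()):
        score = judge(output, criterion)
        if score < minimum:
            failures[criterion] = score
    return GateResult(rubric.version, passed=not failures, failures=failures)


def stub_judge(output: str, criterion: str) -> float:
    """Illustrative stand-in for a model-backed judge."""
    return 0.9 if criterion == "grounded" else 0.4


RUBRIC_V1 = Rubric(version="v1", thresholds={"grounded": 0.8, "concise": 0.7})
result = run_gate("example model output", RUBRIC_V1, stub_judge)
print(result.passed, result.failures)  # False {'concise': 0.4}
```

Keeping the rubric version on the result means a failed gate can be traced to the exact thresholds that produced it, which is one way to keep the decision out of "vibes" territory.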
RESULTS
Systems that moved the number.
Real client engagements.
HOW I WORK
Forward-deployed, end to end.
From first conversation to live system, inside the operation, not at arm's length.
Embed
Sit with the team, learn the real workflow and where it breaks.
Scope
Define the outcome and the smallest system that delivers it.
Build
Ship it AI-native: agents, pipelines, evals, in days, not quarters.
Deploy
Put it into the live operation with guardrails and monitoring.
Iterate
Measure against the goal, feed back, compound the gains.
Where I've Shipped
Across industries, one pattern: get AI into production.
Different domains, same discipline: find the highest-leverage workflow, build the agent, ship it.
Healthcare
Logistics & Fulfillment
Legal
Retail & E-Commerce
Sales & GTM
Professional Services
Trading & Fintech
Startups & AI-Native Teams
From regulated healthcare to fast-moving startups. The work translates.
My Stack
Four layers I build across.
Voice, browser, MCP, and coding. Each layer targets a distinct class of problem. Together they cover most of what a modern AI build requires.
Voice
Real-time conversational agents.
Consent, routing, call-outcome classification.
Browser
Agents that operate the tools a business already uses.
MCP
Model Context Protocol integrations connecting agents to live systems.
Coding
Agentic coding workflows across the full build cycle.
- ✓ Each layer can be engaged independently or together.
- ✓ Most projects start with one layer and expand from there.
- ✓ All four share the same underlying agent architecture.
BUILT WITH
The modern AI engineering stack.
The tools I reach for to ship agentic systems: typed, fast, and production ready.

