Building semantic memory for AI assistants

Cesar Moreno

Full Stack · AI Engineer

AI Agents · Developer Tools · Multi-LLM Systems

I build AI developer tools and multi-agent systems in production — from orchestration platforms to semantic memory, TypeScript to Rust.

About

Over 13 years I've gone from real-time reservation platforms to building AI infrastructure — multi-agent orchestration systems, developer tools, and now persistent semantic memory for coding assistants.

At Apprecio, I lead a cross-functional team of 5 engineers: building AI agents that cut delivery time by 40%, achieving 97.5% token savings through intelligent caching, and optimizing microservice performance by 25% across a multi-tenant platform serving 500K+ users in 6 LATAM countries.

Outside work, I build tools for the AI developer ecosystem — Agentes Hub for multi-LLM code review orchestration, Claude Statusline for developer workflow visibility, and ContextForge, an MCP server in Rust for persistent semantic memory. I believe AI assistants should remember what they learn.

TypeScript for product, Rust for systems, production-tested always. I don't prototype — I ship.

13+
Years Building
97.5%
Token Reduction
500K+
Users Served

Experience

2020 — Present

Full Stack Engineer · AI · Apprecio

Santiago, Chile (Remote)

Lead a cross-functional team of 5 engineers, driving AI agent development and full-stack automation across a multi-tenant rewards platform serving 500K+ users in 6 LATAM countries. Built code review agents with multi-LLM orchestration (Claude, OpenAI, Gemini), achieving 97.5% token cost reduction through intelligent caching. Mentor junior and mid-level developers on AI-native workflows. Drove 40% faster product delivery by automating QA workflows and PR analysis. Optimized microservice performance by 25% across 13+ services handling peak loads during reward campaigns.

TypeScript · Node.js · NestJS · React · PostgreSQL · AWS · LangChain · Claude API

2016 — 2020

Senior Full Stack Developer · Ae Online Solutions

Lima, Peru

Led a team of 3 developers building and integrating a customer management system used to identify business opportunities and analyze data. Scaled the platform to handle 10K+ daily transactions with high availability. Built RESTful APIs with Node.js and PHP, integrated SQL and MongoDB databases, and developed frontends with React and Vue.js.

JavaScript · Node.js · PHP · Laravel · React · Vue.js · MongoDB · MySQL

2016

Full Stack Developer · Publicidad y Sistemas OPER

Lima, Peru

Developed a real-time reservation management system for web and mobile platforms. Built user interfaces with Angular and Ionic, backend services with Laravel and Node.js, and real-time features with Socket.io.

Angular · Ionic · Laravel · Node.js · Socket.io · JavaScript

Projects

Agentes Hub

Multi-agent system that automates PR code reviews with security-focused analysis. Supports Claude, OpenAI, and Gemini with 97.5% token savings through intelligent caching.

TypeScript · LangChain · Claude API · Node.js

  • 97.5% token savings through intelligent caching
  • Multi-model support: Claude, OpenAI, Gemini
  • Security-focused code analysis pipeline
  • Configurable review templates per repository
  • Parallel agent execution with conflict resolution
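
To illustrate the caching idea, here is a minimal sketch of one way review results could be reused so that an unchanged diff never triggers a second model call. It is not Agentes Hub's actual implementation: the names `ReviewRequest`, `Reviewer`, and `withCache` are hypothetical, and the in-memory `Map` stands in for whatever store the real system uses.

```typescript
// Hypothetical types for illustration; none of these names come from Agentes Hub.
type Model = "claude" | "openai" | "gemini";

interface ReviewRequest {
  model: Model;
  template: string; // review template configured for the repository
  diff: string;     // unified diff of one changed file
}

type Reviewer = (req: ReviewRequest) => Promise<string>;

interface CachedReviewer {
  review: Reviewer;
  stats: () => { hits: number; misses: number };
}

// Wraps a reviewer so that identical requests are answered from memory.
function withCache(review: Reviewer): CachedReviewer {
  const cache = new Map<string, Promise<string>>();
  let hits = 0;
  let misses = 0;

  const cached: Reviewer = (req) => {
    // The key covers every input that can change the model's answer.
    const key = JSON.stringify([req.model, req.template, req.diff]);
    const found = cache.get(key);
    if (found !== undefined) {
      hits += 1;
      return found;
    }
    misses += 1;
    const pending = review(req);
    cache.set(key, pending);
    // Forget failed calls so a later retry can reach the model again.
    pending.catch(() => cache.delete(key));
    return pending;
  };

  return { review: cached, stats: () => ({ hits, misses }) };
}
```

Caching the promise rather than the resolved text means two agents that request the same review in parallel share a single in-flight call.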

ContextForge (WIP)

MCP server in Rust for persistent semantic memory — AI coding assistants that remember decisions, understand codebases, and recover context across sessions.

Rust · MCP Protocol · libSQL · Vector Search · tree-sitter

  • Hybrid semantic + keyword search (FTS5 + vector embeddings)
  • Automatic codebase analysis with tree-sitter (50+ languages)
  • Git commit parsing for architectural decision extraction
  • MCP native: plugs into Claude Code, Cursor, Copilot
  • Single Rust binary, ~4–8 MB, zero external dependencies
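
The hybrid search bullet does not say how keyword and vector results are combined. One common option is reciprocal rank fusion, sketched below as an assumption rather than a description of ContextForge's code. It is written in TypeScript instead of Rust purely for brevity, and `fuseRankings`, the sample file names, and the constant `k = 60` are all illustrative.

```typescript
// Reciprocal rank fusion: each ranked list contributes 1 / (k + rank)
// to a document's score, where rank starts at 1 for the top hit.
function fuseRankings(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Hypothetical result lists, best match first.
const keywordHits = ["auth.rs", "session.rs", "db.rs"];   // e.g. from a full-text query
const vectorHits = ["session.rs", "token.rs", "auth.rs"]; // e.g. from embedding similarity

const fused = fuseRankings([keywordHits, vectorHits]);
// fused is ["session.rs", "auth.rs", "token.rs", "db.rs"]
```

Because only ranks are used, the keyword and vector scores never need to be normalized onto a shared scale, and documents found by both searches rise above those found by only one.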

Claude Statusline

Rich two-line statusline for Claude Code CLI. Shows context usage, API rate limits with visual bars, git branch, session duration, and lines changed. Published on npm, optimized with batched JSON parsing for <50ms render.

Bash · Shell · Node.js · npm

  • Real-time context window usage visualization
  • API rate limit bars with color-coded thresholds
  • Git branch, lines changed, and session duration
  • Batched JSON parsing for <50 ms render time
  • Published on npm as @cmorenogit/claude-statusline

Apprecio Rewards Platform

Multi-tenant rewards platform serving 500K+ users across 6 LATAM countries. 13+ microservices, 25% performance improvement, and 40% faster delivery through AI-powered automation.

TypeScript · NestJS · React · PostgreSQL · AWS

  • 500K+ active users across 6 LATAM countries
  • 13+ microservices architecture
  • 25% performance improvement in response times
  • 40% faster feature delivery with AI automation
  • Multi-tenant with per-country customization