The problem
Industrial operators receive thousands of raw anomaly logs every week — most are noise, but a few threaten safety, availability, or equipment integrity. Engineers waste hours triaging by hand and still miss critical issues.
What I built
A full-stack platform that ingests anomaly reports from multiple data sources and uses a multi-agent AI system to triage, classify, and resolve them.
Multi-agent classification pipeline
A LangGraph state machine chains specialized agents:
gather_context → classify → suggest_actions → generate_plans → finalize
- Context Gatherer Agent: builds a four-layer dossier per anomaly (general precedents, similar-equipment precedents, equipment maintenance history, and technical manual excerpts), all retrieved via semantic search.
- Classifier Agent: scores each anomaly on three industrial-safety dimensions (integrity, availability, and process safety, 0-5 each), computes a criticality score (0-15), and maps that score to an urgency tier. It uses few-shot examples drawn dynamically from past validated classifications, plus a secondary LLM critique loop.
- Action Suggester Agent: recommends maintenance actions and matches anomalies to upcoming maintenance windows by proximity, location, and equipment.
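To make the Classifier Agent's scoring concrete, here is a minimal sketch. It assumes the criticality score is the plain sum of the three 0-5 dimension scores (which matches the stated 0-15 range). The tier names and cut-offs in `URGENCY_TIERS` are hypothetical, since the real thresholds are not given here, and `Scores` and `urgency_tier` are illustrative names rather than the platform's actual code.

```python
from dataclasses import dataclass

# Hypothetical tier floors and labels: the real thresholds are not stated
# in this write-up. Ordered from highest floor to lowest.
URGENCY_TIERS = [(12, "critical"), (8, "high"), (4, "medium"), (0, "low")]


@dataclass(frozen=True)
class Scores:
    """Per-dimension scores for one anomaly, each in the range 0..5."""
    integrity: int
    availability: int
    process_safety: int

    def __post_init__(self):
        # Reject out-of-range values early, e.g. a malformed LLM response.
        for name, value in vars(self).items():
            if not 0 <= value <= 5:
                raise ValueError(f"{name} must be in 0..5, got {value}")

    @property
    def criticality(self) -> int:
        # Assumption: criticality is the plain sum, giving a 0..15 range.
        return self.integrity + self.availability + self.process_safety


def urgency_tier(criticality: int) -> str:
    """Return the label of the first tier whose floor the score reaches."""
    for floor, label in URGENCY_TIERS:
        if criticality >= floor:
            return label
    raise ValueError(f"criticality must be >= 0, got {criticality}")


scores = Scores(integrity=4, availability=3, process_safety=5)
print(scores.criticality, urgency_tier(scores.criticality))
```

Keeping the arithmetic and the tier mapping in plain code, outside the LLM call, means the model only has to produce the three dimension scores, and the rest stays deterministic and easy to test.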
RAG over the company knowledge base
PDFs, URLs, and technical manuals are chunked (recursive, semantic, fixed-size strategies), embedded with Nomic Embed Text (768-dim) via Ollama, and stored in PostgreSQL with pgvector + HNSW indexes for cosine-similarity search.
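As a rough illustration of two pieces of that pipeline, the sketch below shows fixed-size chunking with overlap and a cosine-similarity ranking in plain Python. In the platform itself the ranking runs inside PostgreSQL through pgvector (whose `<=>` operator returns cosine distance), so this is only a stand-in for the idea. The function names, default sizes, and the toy 2-dimensional vectors are assumptions for illustration, not the real 768-dimensional embeddings.

```python
import math


def chunk_fixed(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of `size` characters sharing `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
        start += size - overlap
    return chunks


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], rows: list[tuple[str, list[float]]], k: int = 3):
    """Return the k (score, chunk) pairs most similar to the query, best first."""
    scored = [(cosine_similarity(query, vec), chunk) for chunk, vec in rows]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Toy 2-dimensional "embeddings", purely for illustration.
rows = [("pump seal manual", [0.9, 0.1]), ("cafeteria menu", [0.1, 0.9])]
print(top_k([1.0, 0.0], rows, k=1))
```

The overlap is what keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some text twice.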
Streaming chat agent
A FastAPI service exposes 11 tools to the LLM via OpenAI function calling, including equipment search, KB retrieval, statistics, and anomaly creation. Real-time progress events (such as "Searching knowledge base…") stream to the UI while the tools run.
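A minimal sketch of how one such tool could be declared and dispatched follows. The `tools` entry uses the OpenAI function-calling schema shape, and the model's arguments arrive as a JSON string. Everything else is a hypothetical stand-in for the real service: the `search_equipment` tool, its canned result, the `REGISTRY` mapping, and the event dictionaries.

```python
import json
from typing import Callable, Iterator


def search_equipment(query: str) -> list[dict]:
    """Hypothetical tool body: returns a canned result instead of querying a DB."""
    return [{"tag": "P-101", "name": f"match for {query}"}]


# Tool declaration in the OpenAI function-calling format (passed as `tools=`).
TOOLS = [{
    "type": "function",
    "function": {
        "name": "search_equipment",
        "description": "Find equipment by free-text query.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

# Hypothetical registry: tool name -> (progress message, implementation).
REGISTRY: dict[str, tuple[str, Callable[..., object]]] = {
    "search_equipment": ("Searching equipment…", search_equipment),
}


def run_tool_call(name: str, arguments_json: str) -> Iterator[dict]:
    """Yield a progress event, then the tool result, for one model tool call."""
    if name not in REGISTRY:
        yield {"event": "error", "message": f"unknown tool: {name}"}
        return
    progress, fn = REGISTRY[name]
    yield {"event": "progress", "message": progress}
    result = fn(**json.loads(arguments_json))  # arguments arrive as a JSON string
    yield {"event": "tool_result", "name": name, "content": json.dumps(result)}


for event in run_tool_call("search_equipment", '{"query": "pump"}'):
    print(event)
```

Because `run_tool_call` is a generator, the progress event can reach the client before the tool finishes. In a FastAPI service the events could be serialized and sent with something like `StreamingResponse`, though the transport the platform actually uses is not specified here.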
Architecture decisions
✏️ TODO:
- Why LangGraph over plain LangChain agents
- Closed-loop learning: validated outcomes feed back as few-shot examples
- Provider-agnostic LLM layer (llm_factory) for OpenAI / OpenRouter / Ollama
- Why split into three independent FastAPI services
What I'd do differently
✏️ TODO
Tech stack
LangGraph · LangChain · FastAPI · OpenAI · Ollama · pgvector · PostgreSQL · Next.js