Enterprise RAG System
AI-powered knowledge management for enterprise document processing
Overview
Built a production RAG (Retrieval-Augmented Generation) system that processes over 10,000 documents to power an enterprise knowledge management chatbot. The system handles diverse document formats including PDFs, Word documents, and complex Excel spreadsheets.
Beyond the core chatbot, developed internal LLM tooling including a Slack bot for quick queries, a knowledge base management UI, and automated deployment pipelines.
Challenge
The client's knowledge base was scattered across thousands of documents in various formats. Employees spent significant time searching for information, and institutional knowledge was often locked in the heads of senior staff.
Specific challenges included:
- Excel files with complex structures and embedded business logic
- Legacy documents with inconsistent formatting
- Need for accurate citations and source references
- Multi-language support (Japanese and English)
Solution
1. Semantic Chunking Pipeline
Developed custom chunking strategies for different document types. For Excel files, built a tool that understands table structures, preserves relationships between cells, and extracts meaningful chunks that maintain context.
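A minimal sketch of the header-preserving idea, assuming openpyxl and a simple first-row-is-header layout; the chunk size and helper names are illustrative, and the handling of embedded business logic mentioned above is omitted:

```python
from openpyxl import load_workbook

ROWS_PER_CHUNK = 20  # illustrative size, not the production value

def _fmt(v) -> str:
    return "" if v is None else str(v)

def chunk_workbook(path: str) -> list[str]:
    """Split each worksheet into row groups that repeat the header row."""
    chunks = []
    wb = load_workbook(path, data_only=True)  # data_only: read cached formula results
    for ws in wb.worksheets:
        rows = list(ws.iter_rows(values_only=True))
        if len(rows) < 2:
            continue  # empty sheet, or nothing beyond a header
        header = " | ".join(_fmt(v) for v in rows[0])
        for start in range(1, len(rows), ROWS_PER_CHUNK):
            body = rows[start:start + ROWS_PER_CHUNK]
            # Prefix every chunk with the sheet name and header row so each
            # chunk remains interpretable on its own after retrieval.
            lines = [f"Sheet: {ws.title}", header]
            lines += [" | ".join(_fmt(v) for v in row) for row in body]
            chunks.append("\n".join(lines))
    return chunks
```

Repeating the sheet name and header in every chunk is what keeps cell relationships legible once a chunk is retrieved out of its table context.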
2. RAG Architecture
Implemented a hybrid search approach combining dense vector embeddings with BM25 keyword search. The retrieval pipeline uses re-ranking to improve relevance, especially for domain-specific queries.
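Since Azure AI Search is the vector store (see Infrastructure below) and runs BM25 for keyword queries by default, the hybrid query path can be sketched with the azure-search-documents SDK (11.4+). The write-up doesn't name the re-ranker, so the sketch uses Azure's built-in semantic ranker as one plausible option; the index name, field names, and semantic configuration name are all illustrative:

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery

client = SearchClient(
    endpoint="https://<search-service>.search.windows.net",  # placeholder
    index_name="knowledge-base",
    credential=AzureKeyCredential("<api-key>"),
)

def hybrid_search(query: str, query_vector: list[float], k: int = 5):
    # Passing both search_text and vector_queries makes Azure AI Search
    # fuse BM25 keyword scores with vector similarity; the semantic query
    # type then re-ranks the fused results with a deeper model, which is
    # where domain-specific queries gain the most.
    return client.search(
        search_text=query,
        vector_queries=[VectorizedQuery(
            vector=query_vector, k_nearest_neighbors=k, fields="embedding")],
        query_type="semantic",
        semantic_configuration_name="default",
        top=k,
    )
```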
3. Chatbot Interface
Built a conversational interface using LangChain that maintains context across turns, provides source citations, and gracefully handles queries outside its knowledge base.
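An illustrative LangChain sketch of those three behaviors, not the production chain: it assumes the langchain-core and langchain-openai packages, a retriever like the one above, and made-up field names on the retrieved documents; the model choice is also an assumption.

```python
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", temperature=0)  # assumed model

prompt = ChatPromptTemplate.from_messages([
    ("system",
     "Answer only from the numbered sources below. Cite them as [1], [2], ... "
     "If the sources do not contain the answer, say you don't know.\n\n{sources}"),
    MessagesPlaceholder("history"),  # prior turns carry conversational context
    ("human", "{question}"),
])
chain = prompt | llm

def answer(question: str, history: list, docs: list[dict]) -> str:
    # Number the retrieved chunks so the model's [n] citations map back
    # to concrete source documents.
    sources = "\n".join(
        f"[{i + 1}] {d['title']}: {d['content']}" for i, d in enumerate(docs))
    reply = chain.invoke(
        {"sources": sources, "history": history, "question": question})
    return reply.content
```

Grounding the system prompt in numbered sources is also what lets the bot decline gracefully: with no relevant source to cite, it is instructed to say so rather than guess.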
4. Internal Tooling
Beyond the main chatbot, created:
- Slack Bot: Quick Q&A directly in team channels (see the sketch after this list)
- Knowledge Base UI: Document upload, tagging, and management
- Deploy Pipeline: Automated testing and deployment for model updates
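A hedged sketch of the Slack integration, assuming slack-bolt in Socket Mode; it composes the illustrative `hybrid_search()` and `answer()` helpers from the earlier sketches, and `embed()` is an assumed helper that turns text into an embedding vector:

```python
import os

from slack_bolt import App
from slack_bolt.adapter.socket_mode import SocketModeHandler

app = App(token=os.environ["SLACK_BOT_TOKEN"])

@app.event("app_mention")
def handle_mention(event, say):
    question = event["text"]
    # hybrid_search(), answer(), and embed() come from the sketches above;
    # embed() is an assumed embedding helper, not a real library call.
    docs = list(hybrid_search(question, embed(question)))
    # Reply in a thread so each Q&A stays attached to its originating message.
    say(text=answer(question, history=[], docs=docs), thread_ts=event["ts"])

if __name__ == "__main__":
    SocketModeHandler(app, os.environ["SLACK_APP_TOKEN"]).start()
```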
Infrastructure
Cloud-native architecture on Azure with infrastructure as code:
- Compute: Azure Container Apps for scalable API hosting
- Vector Store: Azure AI Search for embedding storage and retrieval
- Storage: Azure Blob Storage for document management
- IaC: Terraform modules for consistent environment provisioning
Results
The system has become a critical part of the client's daily operations:
- Average time to find information reduced from 15 minutes to under 30 seconds
- 85% of queries answered accurately without human intervention
- Onboarding time for new employees reduced significantly
- Knowledge documentation improved as gaps were identified
Technology Stack
Backend
- Python
- FastAPI
- LangChain
AI/ML
- OpenAI API
- Azure AI Search
- Embeddings
Infrastructure
- Azure
- Terraform
- Docker
DevOps
- GitHub Actions
- Container Apps
- Blob Storage