# LangChain Learning Kit - Architecture

## Overview

The LangChain Learning Kit is a learning platform with an enterprise-style architecture, built with FastAPI, LangChain, MySQL, and FAISS. It provides:

- **Model Management**: Configure and manage multiple LLMs and embedding models
- **Knowledge Bases**: Vector-based document storage and retrieval using FAISS
- **Multi-turn Conversations**: Context-aware chat with RAG support
- **Agent Orchestration**: LangChain agents with custom tools

## Architecture Layers

### 1. API Layer (`app/api/`)

FastAPI routers handling HTTP requests:

- **models.py**: Model CRUD operations
- **kb.py**: Knowledge base management and querying
- **conv.py**: Conversation and message endpoints
- **agent.py**: Agent execution and logging

### 2. Service Layer (`app/services/`)

Business logic and orchestration:

- **ModelManager**: LLM and embedding model management with environment variable substitution
- **KnowledgeBaseManager**: Document ingestion, chunking, and vector search
- **ConversationManager**: Multi-turn conversations with RAG integration
- **AgentOrchestrator**: Agent execution with tool calling
- **AsyncJobManager**: Background job processing using asyncio

### 3. Tools Layer (`app/tools/`)

LangChain tools for agent use:

- **KnowledgeBaseRetriever**: Vector similarity search tool
- **CalculatorTool**: Mathematical expression evaluation

### 4. Data Layer (`app/db/`)

Database models and management:

- **SQLAlchemy ORM Models**: Models, KnowledgeBase, Document, Conversation, Message, ToolCall
- **Alembic Migrations**: Schema versioning and migration scripts
- **DatabaseManager**: Connection pooling and session management

### 5. Utilities (`app/utils/`)

Supporting utilities:

- **TextSplitter**: Document chunking with configurable size/overlap
- **FAISSHelper**: FAISS index creation, persistence, and querying
- **Logger**: Structured JSON logging with structlog
- **Exceptions**: Custom exception hierarchy

## Data Flow

Each flow below is annotated step by step; code-level sketches of the chat and ingestion paths follow the diagrams.

### RAG-Enhanced Chat

```
User Request
  ↓
[API Layer] conv.py → chat()
  ↓
[Service] ConversationManager.chat()
  ↓
1. Save user message to DB
2. Retrieve conversation history
3. [If RAG enabled] Query KB → FAISS → Top-K docs
4. Build prompt: context + history + user input
5. LLM.predict()
6. Save assistant message
  ↓
Response with sources
```

### Document Ingestion

```
Document Upload
  ↓
[API Layer] kb.py → ingest_documents()
  ↓
[Service] KnowledgeBaseManager.ingest_documents()
  ↓
[Background Job] _ingest_documents_sync()
  ↓
1. Get embedding model
2. Chunk documents (TextSplitter)
3. Create/load FAISS index
4. Add document vectors
5. Save index to disk
6. Save metadata to DB
  ↓
Job Complete
```

### Agent Execution

```
Agent Task
  ↓
[API Layer] agent.py → execute_agent()
  ↓
[Service] AgentOrchestrator.execute_agent()
  ↓
1. Create tools (Calculator, Retriever)
2. Create LangChain agent
3. Execute with AgentExecutor
4. Log tool calls to DB
  ↓
Response with tool call history
```
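To make the chat flow concrete, here is a minimal sketch of the RAG path in plain Python. The function signature, the `HistoryStore` protocol, and the dependency wiring are illustrative assumptions, not the actual `ConversationManager` implementation; the retrieval and LLM calls follow standard LangChain conventions (`similarity_search`, `invoke`).

```python
# Hypothetical sketch of the RAG-enhanced chat path; names are illustrative.
from typing import Protocol


class HistoryStore(Protocol):
    """Assumed persistence interface; the real kit uses SQLAlchemy sessions."""
    def append(self, conv_id: int, role: str, content: str) -> None: ...
    def recent(self, conv_id: int, limit: int) -> list[tuple[str, str]]: ...


def chat(conv_id: int, user_input: str, history_store: HistoryStore,
         kb_index, llm, use_rag: bool = True, top_k: int = 4) -> dict:
    # 1. Persist the user message.
    history_store.append(conv_id, "user", user_input)

    # 2. Pull recent turns for context.
    history = history_store.recent(conv_id, limit=10)

    # 3. Optionally retrieve supporting chunks from the FAISS-backed KB.
    sources = []
    if use_rag:
        sources = kb_index.similarity_search(user_input, k=top_k)

    # 4. Assemble the prompt: retrieved context, then history, then the question.
    context = "\n\n".join(doc.page_content for doc in sources)
    turns = "\n".join(f"{role}: {content}" for role, content in history)
    prompt = f"Context:\n{context}\n\nHistory:\n{turns}\n\nUser: {user_input}"

    # 5. Call the model and 6. persist the assistant reply.
    answer = llm.invoke(prompt).content
    history_store.append(conv_id, "assistant", answer)
    return {"answer": answer, "sources": [d.metadata for d in sources]}
```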
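Likewise, a sketch of the chunk-and-index step from the ingestion flow, assuming recent `langchain-community`, `langchain-openai`, and `langchain-text-splitters` packages; the kit's own `TextSplitter` and `FAISSHelper` wrap equivalent logic, so treat this as an outline rather than the shipped code.

```python
# Hypothetical sketch of ingestion steps 1-5 using LangChain's FAISS wrapper.
import os

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter


def ingest(texts: list[str], index_path: str) -> None:
    # 2. Chunk documents with the configured size/overlap defaults.
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    docs = splitter.create_documents(texts)

    # 1. Embedding model; API key and base URL come from configuration.
    embeddings = OpenAIEmbeddings()

    # 3/4. Create the FAISS index, or load an existing one and add vectors.
    if os.path.isdir(index_path):
        index = FAISS.load_local(index_path, embeddings,
                                 allow_dangerous_deserialization=True)
        index.add_documents(docs)
    else:
        index = FAISS.from_documents(docs, embeddings)

    # 5. Persist the index to disk (step 6, metadata to MySQL, is omitted here).
    index.save_local(index_path)
```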
## Database Schema

### Models Table
- Stores LLM and embedding model configurations
- Config stored as JSON with ${VAR} substitution support

### Knowledge Bases
- KnowledgeBase: Metadata (name, description)
- Document: Content, source, metadata, embedding_id

### Conversations
- Conversation: User ID, title
- Message: Role, content, metadata (sources, model)

### Tool Calls
- Logs all agent tool invocations
- Stores input/output for debugging

## Configuration

Environment-based configuration using Pydantic Settings:

```
DATABASE_URL      # MySQL connection string
OPENAI_API_KEY    # OpenAI API key
OPENAI_BASE_URL   # Optional proxy/custom endpoint
FAISS_BASE_PATH   # FAISS index storage directory
CHUNK_SIZE        # Document chunk size (default 1000)
CHUNK_OVERLAP     # Chunk overlap (default 200)
API_HOST          # Server host (default 0.0.0.0)
API_PORT          # Server port (default 8000)
ENVIRONMENT       # dev/staging/production
DEBUG             # Enable debug mode
```

## Key Design Decisions

### 1. Asyncio over Celery
- Uses Python's native asyncio for background jobs
- Simpler deployment; no external message broker required
- Suitable for medium-scale workloads

### 2. FAISS CPU Version
- Uses faiss-cpu for broader compatibility
- Easier deployment without GPU requirements
- Sufficient for learning/development purposes

### 3. JSON Metadata Storage
- Uses MySQL JSON columns for flexible metadata
- Columns renamed from `metadata` to `doc_metadata`/`msg_metadata` to avoid SQLAlchemy's reserved `metadata` attribute

### 4. OpenAI Proxy Support
- Configurable `base_url` for API routing
- Supports both LLM and embedding endpoints
- Applied globally or per model

### 5. Structured Logging
- JSON-formatted logs via structlog
- Easy parsing for log aggregation systems
- Contextual information for debugging

## Deployment

### Development

```bash
# Local development
conda activate pyth-311
python scripts/init_db.py
python -m uvicorn app.main:app --reload
```

### Production with Docker

```bash
# Build and run with Docker Compose
docker-compose up -d

# Check status
docker-compose ps

# View logs
docker-compose logs -f api
```

## Scalability Considerations

### Current Architecture
- Single FastAPI instance
- MySQL for persistence
- FAISS indexes on local disk
- Asyncio for background jobs

### Future Enhancements
- **Horizontal Scaling**: Add a load balancer and shared FAISS storage (S3/NFS)
- **Distributed Jobs**: Replace asyncio with Celery + Redis/RabbitMQ
- **Vector Database**: Migrate from FAISS to Pinecone/Weaviate/Milvus
- **Caching**: Add Redis for conversation history and KB query results
- **Monitoring**: Prometheus metrics, Grafana dashboards
- **Authentication**: JWT-based auth with user management

## Security

The current implementation is designed for learning and development:

- ⚠️ No authentication/authorization
- ⚠️ CORS configured for `*` (all origins)
- ⚠️ Database credentials in .env file

**Production recommendations:**

- Implement JWT authentication
- Add role-based access control (RBAC)
- Use secrets management (Vault, AWS Secrets Manager)
- Restrict CORS to specific origins (see the sketch after this list)
- Enable HTTPS/TLS
- Add rate limiting
- Implement input validation and sanitization
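As one concrete example of the recommendations above, restricting CORS in FastAPI is a single middleware change. The origin, method, and header lists below are placeholders, not the project's actual configuration.

```python
# Minimal sketch: replacing the wide-open "*" CORS policy with an explicit
# allow-list. The origin shown is hypothetical.
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://app.example.com"],  # known frontends only, not "*"
    allow_credentials=True,
    allow_methods=["GET", "POST", "PUT", "DELETE"],
    allow_headers=["Authorization", "Content-Type"],
)
```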