# LangChain Learning Kit - Architecture

## Overview

The LangChain Learning Kit is a full-stack learning platform built with FastAPI, LangChain, MySQL, and FAISS. It provides:
- Model Management: Configure and manage multiple LLMs and embedding models
- Knowledge Bases: Vector-based document storage and retrieval using FAISS
- Multi-turn Conversations: Context-aware chat with RAG support
- Agent Orchestration: LangChain agents with custom tools
## Architecture Layers
### 1. API Layer (`app/api/`)

FastAPI routers handling HTTP requests:

- `models.py`: Model CRUD operations
- `kb.py`: Knowledge base management and querying
- `conv.py`: Conversation and message endpoints
- `agent.py`: Agent execution and logging
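As a concrete illustration, a router in this layer might look like the sketch below. The endpoint path and request schema are hypothetical, not the kit's actual API:

```python
# Hypothetical conv.py-style router; paths and schemas are illustrative only.
from fastapi import APIRouter
from pydantic import BaseModel

router = APIRouter(prefix="/conversations", tags=["conversations"])

class ChatRequest(BaseModel):
    message: str
    use_rag: bool = True

@router.post("/{conversation_id}/chat")
async def chat(conversation_id: int, request: ChatRequest) -> dict:
    # The router stays thin: it validates input and delegates to the
    # service layer (ConversationManager.chat) described below.
    ...
```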
### 2. Service Layer (`app/services/`)
Business logic and orchestration:
- ModelManager: LLM and embedding model management with environment variable substitution
- KnowledgeBaseManager: Document ingestion, chunking, and vector search
- ConversationManager: Multi-turn conversations with RAG integration
- AgentOrchestrator: Agent execution with tool calling
- AsyncJobManager: Background job processing using asyncio
### 3. Tools Layer (`app/tools/`)
LangChain tools for agent use:
- KnowledgeBaseRetriever: Vector similarity search tool
- CalculatorTool: Mathematical expression evaluation
### 4. Data Layer (`app/db/`)
Database models and management:
- SQLAlchemy ORM Models: Models, KnowledgeBase, Document, Conversation, Message, ToolCall
- Alembic Migrations: Schema versioning and migration scripts
- DatabaseManager: Connection pooling and session management
### 5. Utilities (`app/utils/`)
Supporting utilities:
- TextSplitter: Document chunking with configurable size/overlap
- FAISSHelper: FAISS index creation, persistence, and querying
- Logger: Structured JSON logging with structlog
- Exceptions: Custom exception hierarchy
## Data Flow

### RAG-Enhanced Chat

```
User Request
    ↓
[API Layer] conv.py → chat()
    ↓
[Service] ConversationManager.chat()
    ↓
1. Save user message to DB
2. Retrieve conversation history
3. [If RAG enabled] Query KB → FAISS → Top-K docs
4. Build prompt: context + history + user input
5. LLM.predict()
6. Save assistant message
    ↓
Response with sources
```
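In code, the flow condenses to roughly the following sketch. The helper callables (`save_message`, `get_history`, `search_kb`, `llm`) are stand-ins for the real DB, FAISS, and model layers, so treat this as an outline rather than the actual ConversationManager implementation:

```python
def chat(save_message, get_history, search_kb, llm,
         conversation_id: int, user_input: str, use_rag: bool = True) -> dict:
    save_message(conversation_id, "user", user_input)      # 1. persist user turn
    history = get_history(conversation_id)                 # 2. prior messages as text

    context, sources = "", []
    if use_rag:                                            # 3. KB -> FAISS -> top-k
        docs = search_kb(user_input, top_k=4)              #    returns (text, source) pairs
        context = "\n\n".join(text for text, _src in docs)
        sources = [src for _text, src in docs]

    prompt = (f"Context:\n{context}\n\n"                   # 4. assemble the prompt
              f"History:\n{history}\n\n"
              f"User: {user_input}")
    answer = llm(prompt)                                   # 5. call the model
    save_message(conversation_id, "assistant", answer)     # 6. persist the reply
    return {"answer": answer, "sources": sources}
```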
### Document Ingestion

```
Document Upload
    ↓
[API Layer] kb.py → ingest_documents()
    ↓
[Service] KnowledgeBaseManager.ingest_documents()
    ↓
[Background Job] _ingest_documents_sync()
    ↓
1. Get embedding model
2. Chunk documents (TextSplitter)
3. Create/load FAISS index
4. Add document vectors
5. Save index to disk
6. Save metadata to DB
    ↓
Job Complete
```
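Steps 1–5 map closely onto LangChain's FAISS wrapper, as in this sketch. Import paths vary across LangChain versions, and the index path and embedding model here are assumptions based on the configuration described below:

```python
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter

embeddings = OpenAIEmbeddings()                                  # 1. embedding model
splitter = RecursiveCharacterTextSplitter(chunk_size=1000,       # 2. chunk with the
                                          chunk_overlap=200)     #    documented defaults
chunks = splitter.create_documents(["...raw document text..."])

index = FAISS.from_documents(chunks, embeddings)                 # 3-4. build index, add vectors
index.save_local("faiss_indexes/kb_1")                           # 5. persist to disk
# 6. Chunk/document metadata would then be written to MySQL.
```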
### Agent Execution

```
Agent Task
    ↓
[API Layer] agent.py → execute_agent()
    ↓
[Service] AgentOrchestrator.execute_agent()
    ↓
1. Create tools (Calculator, Retriever)
2. Create LangChain agent
3. Execute with AgentExecutor
4. Log tool calls to DB
    ↓
Response with tool call history
```
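A minimal version of steps 1–3, using LangChain's classic `initialize_agent` API (deprecated in newer releases in favor of `create_react_agent` plus `AgentExecutor`); the actual tool wiring in AgentOrchestrator may differ:

```python
from langchain.agents import AgentType, Tool, initialize_agent
from langchain_openai import ChatOpenAI

def calculator(expression: str) -> str:
    # Toy stand-in for CalculatorTool; a production tool needs safe evaluation.
    return str(eval(expression, {"__builtins__": {}}, {}))

tools = [                                                # 1. create tools
    Tool(name="calculator", func=calculator,
         description="Evaluate a mathematical expression."),
    # A KnowledgeBaseRetriever tool would be registered the same way.
]
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)     # model name is an example
agent = initialize_agent(                                # 2. create LangChain agent
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)
result = agent.invoke({"input": "What is 17 * 24?"})     # 3. run via AgentExecutor
# 4. Tool calls can be captured with callbacks and logged to the DB.
```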
## Database Schema

### Models Table

- Stores LLM and embedding model configurations
- Config stored as JSON with `${VAR}` substitution support
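For example, `${VAR}` references can be expanded at load time with the standard library; this is a sketch of the general technique, not necessarily ModelManager's exact code:

```python
import json
import os

# Stored model config with environment variable placeholders.
stored = '{"api_key": "${OPENAI_API_KEY}", "base_url": "${OPENAI_BASE_URL}"}'

# os.path.expandvars substitutes ${VAR} from the environment and leaves
# unknown variables untouched.
config = {key: os.path.expandvars(value)
          for key, value in json.loads(stored).items()}
```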
### Knowledge Bases
- KnowledgeBase: Metadata (name, description)
- Document: Content, source, metadata, embedding_id
### Conversations
- Conversation: User ID, title
- Message: Role, content, metadata (sources, model)
### Tool Calls
- Logs all agent tool invocations
- Stores input/output for debugging
## Configuration

Environment-based configuration using Pydantic Settings:

```bash
DATABASE_URL      # MySQL connection string
OPENAI_API_KEY    # OpenAI API key
OPENAI_BASE_URL   # Optional proxy/custom endpoint
FAISS_BASE_PATH   # FAISS index storage directory
CHUNK_SIZE        # Document chunk size (default 1000)
CHUNK_OVERLAP     # Chunk overlap (default 200)
API_HOST          # Server host (default 0.0.0.0)
API_PORT          # Server port (default 8000)
ENVIRONMENT       # dev/staging/production
DEBUG             # Enable debug mode
```
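A sketch of the corresponding Settings class with pydantic-settings; field names mirror the variables above, defaults are the ones documented here, and the FAISS_BASE_PATH default is an assumption:

```python
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    database_url: str
    openai_api_key: str
    openai_base_url: str | None = None
    faiss_base_path: str = "faiss_indexes"  # assumed default
    chunk_size: int = 1000
    chunk_overlap: int = 200
    api_host: str = "0.0.0.0"
    api_port: int = 8000
    environment: str = "dev"
    debug: bool = False

settings = Settings()  # reads the environment and .env at import time
```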
## Key Design Decisions

### 1. Asyncio over Celery
- Uses Python's native asyncio for background jobs
- Simpler deployment, no external message broker required
- Suitable for medium-scale workloads
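A minimal sketch of this pattern: jobs are fired as asyncio tasks inside the running event loop and tracked in an in-memory registry. Field names and statuses here are illustrative, not AsyncJobManager's actual interface:

```python
import asyncio
import uuid

jobs: dict[str, dict] = {}  # in-memory job registry

async def submit(coro) -> str:
    """Run a coroutine in the background and return a pollable job id."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "running"}

    async def runner():
        try:
            jobs[job_id].update(status="done", result=await coro)
        except Exception as exc:
            jobs[job_id].update(status="failed", error=str(exc))

    # Keep a reference to the task so it is not garbage-collected mid-flight.
    jobs[job_id]["task"] = asyncio.create_task(runner())
    return job_id  # the caller polls a job-status endpoint with this id
```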
### 2. FAISS CPU Version
- Uses faiss-cpu for broader compatibility
- Easier deployment without GPU requirements
- Sufficient for learning/development purposes
### 3. JSON Metadata Storage

- Uses MySQL JSON columns for flexible metadata
- Renamed from `metadata` to `doc_metadata`/`msg_metadata` to avoid SQLAlchemy reserved words
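For instance, with SQLAlchemy 2.0-style declarative models (the real Document model will carry more columns):

```python
from sqlalchemy import JSON
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class Base(DeclarativeBase):
    pass

class Document(Base):
    __tablename__ = "documents"

    id: Mapped[int] = mapped_column(primary_key=True)
    # "metadata" is reserved on declarative classes (Base.metadata), so the
    # attribute is named doc_metadata instead.
    doc_metadata: Mapped[dict] = mapped_column(JSON, default=dict)
```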
### 4. OpenAI Proxy Support

- Configurable `base_url` for API routing
- Supports both LLM and embedding endpoints
- Applied globally or per-model
### 5. Structured Logging
- JSON-formatted logs via structlog
- Easy parsing for log aggregation systems
- Contextual information for debugging
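A minimal structlog configuration along these lines (the kit's actual processor chain may differ):

```python
import structlog

structlog.configure(
    processors=[
        structlog.contextvars.merge_contextvars,   # request-scoped context
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer(),       # one JSON object per line
    ]
)

log = structlog.get_logger()
log.info("kb_query", kb_id=1, top_k=4)
```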
## Deployment

### Development

```bash
# Local development
conda activate pyth-311
python scripts/init_db.py
python -m uvicorn app.main:app --reload
```
### Production with Docker

```bash
# Build and run with Docker Compose
docker-compose up -d

# Check status
docker-compose ps

# View logs
docker-compose logs -f api
```
## Scalability Considerations

### Current Architecture
- Single FastAPI instance
- MySQL for persistence
- FAISS indexes on local disk
- Asyncio for background jobs
### Future Enhancements
- Horizontal Scaling: Add load balancer, shared FAISS storage (S3/NFS)
- Distributed Jobs: Replace asyncio with Celery + Redis/RabbitMQ
- Vector Database: Migrate from FAISS to Pinecone/Weaviate/Milvus
- Caching: Add Redis for conversation history and KB query results
- Monitoring: Prometheus metrics, Grafana dashboards
- Authentication: JWT-based auth with user management
## Security

The current implementation is designed for learning and development:

- ⚠️ No authentication/authorization
- ⚠️ CORS configured for `*` (all origins)
- ⚠️ Database credentials stored in a plain `.env` file
Production recommendations:
- Implement JWT authentication
- Add role-based access control (RBAC)
- Use secrets management (Vault, AWS Secrets Manager)
- Restrict CORS to specific origins
- Enable HTTPS/TLS
- Add rate limiting
- Implement input validation and sanitization
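As one concrete example from this list, CORS can be restricted with FastAPI's built-in middleware; the origin list below is a placeholder:

```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://app.example.com"],  # instead of "*"
    allow_credentials=True,
    allow_methods=["GET", "POST", "PUT", "DELETE"],
    allow_headers=["Authorization", "Content-Type"],
)
```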