# LangChain Learning Kit - Architecture
## Overview
The LangChain Learning Kit is a full-stack learning platform built with FastAPI, LangChain, MySQL, and FAISS. It provides:
- **Model Management**: Configure and manage multiple LLMs and embedding models
- **Knowledge Bases**: Vector-based document storage and retrieval using FAISS
- **Multi-turn Conversations**: Context-aware chat with RAG support
- **Agent Orchestration**: LangChain agents with custom tools
## Architecture Layers
### 1. API Layer (`app/api/`)
FastAPI routers handling HTTP requests:
- **models.py**: Model CRUD operations
- **kb.py**: Knowledge base management and querying
- **conv.py**: Conversation and message endpoints
- **agent.py**: Agent execution and logging
### 2. Service Layer (`app/services/`)
Business logic and orchestration:
- **ModelManager**: LLM and embedding model management with environment variable substitution
- **KnowledgeBaseManager**: Document ingestion, chunking, and vector search
- **ConversationManager**: Multi-turn conversations with RAG integration
- **AgentOrchestrator**: Agent execution with tool calling
- **AsyncJobManager**: Background job processing using asyncio
### 3. Tools Layer (`app/tools/`)
LangChain tools for agent use:
- **KnowledgeBaseRetriever**: Vector similarity search tool
- **CalculatorTool**: Mathematical expression evaluation
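As a sketch of what a safe calculator tool can look like (the project's actual `CalculatorTool` implementation may differ), arithmetic expressions can be evaluated over a whitelisted AST rather than with `eval`:

```python
import ast
import operator

# Whitelist of permitted AST operator nodes → Python functions.
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def safe_eval(expr: str) -> float:
    """Evaluate a pure-arithmetic expression; reject anything else."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in OPS:
            return OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported expression: {expr!r}")
    return _eval(ast.parse(expr, mode="eval"))
```

Anything outside the whitelist (function calls, attribute access, names) raises `ValueError`, which keeps LLM-provided input from reaching arbitrary code.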
### 4. Data Layer (`app/db/`)
Database models and management:
- **SQLAlchemy ORM Models**: Models, KnowledgeBase, Document, Conversation, Message, ToolCall
- **Alembic Migrations**: Schema versioning and migration scripts
- **DatabaseManager**: Connection pooling and session management
### 5. Utilities (`app/utils/`)
Supporting utilities:
- **TextSplitter**: Document chunking with configurable size/overlap
- **FAISSHelper**: FAISS index creation, persistence, and querying
- **Logger**: Structured JSON logging with structlog
- **Exceptions**: Custom exception hierarchy
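The chunking behind `TextSplitter` can be sketched in a few lines. This is a minimal illustration of size/overlap chunking, not the project's actual implementation (which may split on sentence or token boundaries):

```python
def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks where each chunk repeats the
    last `chunk_overlap` characters of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - chunk_overlap  # step forward, keeping overlap
    return chunks
```

The overlap preserves context across chunk boundaries so a retrieval hit near the edge of one chunk still carries its surrounding sentences.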
## Data Flow
### RAG-Enhanced Chat
```
User Request
  ↓
[API Layer] conv.py → chat()
  ↓
[Service] ConversationManager.chat()
  1. Save user message to DB
  2. Retrieve conversation history
  3. [If RAG enabled] Query KB → FAISS → Top-K docs
  4. Build prompt: context + history + user input
  5. LLM.predict()
  6. Save assistant message
  ↓
Response with sources
```
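Step 4 of the flow above, prompt assembly, can be sketched as follows. The function name and prompt wording are illustrative, not the project's exact template:

```python
def build_rag_prompt(context_docs: list[str], history: list[dict], user_input: str) -> str:
    """Assemble a RAG prompt: retrieved context, then the conversation
    transcript, then the new user turn."""
    context = "\n\n".join(context_docs)
    turns = "\n".join(f"{m['role']}: {m['content']}" for m in history)
    return (
        "Answer using the context below when relevant.\n\n"
        f"Context:\n{context}\n\n"
        f"Conversation:\n{turns}\n"
        f"user: {user_input}\n"
        "assistant:"
    )
```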
### Document Ingestion
```
Document Upload
  ↓
[API Layer] kb.py → ingest_documents()
  ↓
[Service] KnowledgeBaseManager.ingest_documents()
  ↓
[Background Job] _ingest_documents_sync()
  1. Get embedding model
  2. Chunk documents (TextSplitter)
  3. Create/load FAISS index
  4. Add document vectors
  5. Save index to disk
  6. Save metadata to DB
  ↓
Job Complete
```
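Steps 3-4 boil down to adding vectors to an index and searching by distance. A FAISS `IndexFlatL2` does this brute-force; the NumPy stand-in below shows the same contract (add vectors, get top-k nearest by L2 distance) without the FAISS dependency:

```python
import numpy as np

class FlatIndex:
    """Minimal stand-in for a FAISS IndexFlatL2: brute-force L2 search."""

    def __init__(self, dim: int):
        self.dim = dim
        self.vectors = np.empty((0, dim), dtype="float32")

    def add(self, vecs) -> None:
        """Append embedding vectors to the index."""
        self.vectors = np.vstack([self.vectors, np.asarray(vecs, dtype="float32")])

    def search(self, query, k: int):
        """Return (distances, indices) of the k nearest stored vectors."""
        dists = np.linalg.norm(self.vectors - np.asarray(query, dtype="float32"), axis=1)
        order = np.argsort(dists)[:k]
        return dists[order], order
```

In the real pipeline, the returned indices map back to `embedding_id` on the `Document` rows, which is how search results are joined with their source text.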
### Agent Execution
```
Agent Task
  ↓
[API Layer] agent.py → execute_agent()
  ↓
[Service] AgentOrchestrator.execute_agent()
  1. Create tools (Calculator, Retriever)
  2. Create LangChain agent
  3. Execute with AgentExecutor
  4. Log tool calls to DB
  ↓
Response with tool call history
```
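The core of steps 3-4 is a loop: ask the model for an action, run the chosen tool, record the observation, repeat until a final answer. The sketch below is a dependency-free caricature of what `AgentExecutor` does, with the tool-call log (`observations`) standing in for the DB writes:

```python
def run_agent(llm_step, tools: dict, task: str, max_steps: int = 5):
    """Tiny ReAct-style loop. `llm_step(task, observations)` must return
    either ("final", answer) or (tool_name, tool_input)."""
    observations = []
    for _ in range(max_steps):
        name, payload = llm_step(task, observations)
        if name == "final":
            return payload, observations
        result = tools[name](payload)
        # In the real service this record is persisted to the ToolCall table.
        observations.append({"tool": name, "input": payload, "output": result})
    raise RuntimeError("agent exceeded max_steps without a final answer")
```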
## Database Schema
### Models Table
- Stores LLM and embedding model configurations
- Config stored as JSON with ${VAR} substitution support
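The `${VAR}` substitution can be sketched as a recursive walk over the stored config. This is an illustrative implementation, not the project's exact code; unresolved variables are left as-is here, though the real service might raise instead:

```python
import os
import re

_VAR = re.compile(r"\$\{([A-Z0-9_]+)\}")

def substitute_env(config):
    """Recursively replace ${VAR} placeholders in string values with
    values from the process environment."""
    if isinstance(config, str):
        return _VAR.sub(lambda m: os.environ.get(m.group(1), m.group(0)), config)
    if isinstance(config, dict):
        return {k: substitute_env(v) for k, v in config.items()}
    if isinstance(config, list):
        return [substitute_env(v) for v in config]
    return config  # numbers, booleans, None pass through unchanged
```

This keeps secrets out of the database: the Models table stores `"api_key": "${OPENAI_API_KEY}"` and the real value is injected only at load time.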
### Knowledge Bases
- KnowledgeBase: Metadata (name, description)
- Document: Content, source, metadata, embedding_id
### Conversations
- Conversation: User ID, title
- Message: Role, content, metadata (sources, model)
### Tool Calls
- Logs all agent tool invocations
- Stores input/output for debugging
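A minimal ORM model for this table might look like the following. Column names and types are hypothetical, inferred from the description above, not copied from the project:

```python
from sqlalchemy import JSON, Column, Integer, String, Text, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class ToolCall(Base):
    """Hypothetical log row for one agent tool invocation."""
    __tablename__ = "tool_calls"

    id = Column(Integer, primary_key=True)
    tool_name = Column(String(100), nullable=False)
    tool_input = Column(JSON)    # arguments the agent passed to the tool
    tool_output = Column(Text)   # result returned to the agent, for debugging
```

MySQL's native JSON column type lets the input payload vary per tool without schema changes, mirroring the `doc_metadata`/`msg_metadata` pattern noted under Key Design Decisions.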
## Configuration
Environment-based configuration using Pydantic Settings:
```
DATABASE_URL # MySQL connection string
OPENAI_API_KEY # OpenAI API key
OPENAI_BASE_URL # Optional proxy/custom endpoint
FAISS_BASE_PATH # FAISS index storage directory
CHUNK_SIZE # Document chunk size (default 1000)
CHUNK_OVERLAP # Chunk overlap (default 200)
API_HOST # Server host (default 0.0.0.0)
API_PORT # Server port (default 8000)
ENVIRONMENT # dev/staging/production
DEBUG # Enable debug mode
```
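The project reads these via Pydantic Settings; the stdlib dataclass below approximates the same behavior (env var wins, otherwise the documented default) for a subset of the variables, so the pattern is visible without the `pydantic-settings` dependency:

```python
import os
from dataclasses import dataclass, field

def _env(name: str, default: str) -> str:
    return os.environ.get(name, default)

@dataclass
class Settings:
    """Stdlib sketch of env-driven configuration (the project uses Pydantic Settings)."""
    database_url: str = field(default_factory=lambda: _env("DATABASE_URL", "mysql://localhost/kit"))
    chunk_size: int = field(default_factory=lambda: int(_env("CHUNK_SIZE", "1000")))
    chunk_overlap: int = field(default_factory=lambda: int(_env("CHUNK_OVERLAP", "200")))
    api_port: int = field(default_factory=lambda: int(_env("API_PORT", "8000")))
    debug: bool = field(default_factory=lambda: _env("DEBUG", "false").lower() == "true")
```

Pydantic Settings adds validation and error reporting on top of this (e.g. rejecting a non-integer `CHUNK_SIZE` at startup rather than at first use).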
## Key Design Decisions
### 1. Asyncio over Celery
- Uses Python's native asyncio for background jobs
- Simpler deployment, no external message broker required
- Suitable for medium-scale workloads
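The shape of an asyncio-based job manager can be sketched as below. This is an illustrative minimum (job IDs plus a status dict), not the project's `AsyncJobManager`, which presumably also persists job state:

```python
import asyncio
import uuid

class JobManager:
    """Track fire-and-forget asyncio tasks by job ID."""

    def __init__(self):
        self.jobs: dict[str, str] = {}

    def submit(self, coro) -> str:
        """Schedule a coroutine on the running loop and return its job ID."""
        job_id = str(uuid.uuid4())
        self.jobs[job_id] = "running"
        task = asyncio.ensure_future(coro)
        task.add_done_callback(
            lambda t: self.jobs.__setitem__(job_id, "failed" if t.exception() else "done")
        )
        return job_id
```

The trade-off named above is visible here: jobs live inside the API process, so a restart loses in-flight work; Celery would move them to a broker at the cost of extra infrastructure.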
### 2. FAISS CPU Version
- Uses faiss-cpu for broader compatibility
- Easier deployment without GPU requirements
- Sufficient for learning/development purposes
### 3. JSON Metadata Storage
- Uses MySQL JSON columns for flexible metadata
- Renamed from `metadata` to `doc_metadata`/`msg_metadata` to avoid SQLAlchemy reserved words
### 4. OpenAI Proxy Support
- Configurable `base_url` for API routing
- Supports both LLM and embedding endpoints
- Applied globally or per-model
### 5. Structured Logging
- JSON-formatted logs via structlog
- Easy parsing for log aggregation systems
- Contextual information for debugging
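What structlog produces can be approximated with a stdlib formatter, shown here only to make the "easy parsing" point concrete; the project's actual logger is configured through structlog, not like this:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log record, with optional context fields."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname.lower(),
            "logger": record.name,
            "event": record.getMessage(),
        }
        # Context attached via logger.info(..., extra={"ctx": {...}})
        payload.update(getattr(record, "ctx", {}))
        return json.dumps(payload)
```

Each line being a self-contained JSON object is what makes ingestion into aggregators (ELK, Loki, CloudWatch) a matter of pointing them at stdout.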
## Deployment
### Development
```bash
# Local development
conda activate pyth-311
python scripts/init_db.py
python -m uvicorn app.main:app --reload
```
### Production with Docker
```bash
# Build and run with Docker Compose
docker-compose up -d
# Check status
docker-compose ps
# View logs
docker-compose logs -f api
```
## Scalability Considerations
### Current Architecture
- Single FastAPI instance
- MySQL for persistence
- FAISS indexes on local disk
- Asyncio for background jobs
### Future Enhancements
- **Horizontal Scaling**: Add load balancer, shared FAISS storage (S3/NFS)
- **Distributed Jobs**: Replace asyncio with Celery + Redis/RabbitMQ
- **Vector Database**: Migrate from FAISS to Pinecone/Weaviate/Milvus
- **Caching**: Add Redis for conversation history and KB query results
- **Monitoring**: Prometheus metrics, Grafana dashboards
- **Authentication**: JWT-based auth with user management
## Security
Current implementation is designed for learning/development:
- ⚠️ No authentication/authorization
- ⚠️ CORS configured for `*` (all origins)
- ⚠️ Database credentials in .env file
**Production recommendations:**
- Implement JWT authentication
- Add role-based access control (RBAC)
- Use secrets management (Vault, AWS Secrets Manager)
- Restrict CORS to specific origins
- Enable HTTPS/TLS
- Add rate limiting
- Implement input validation and sanitization