# LangChain Learning Kit - Architecture

## Overview

The LangChain Learning Kit is an enterprise-grade learning platform built with FastAPI, LangChain, MySQL, and FAISS. It provides:

- **Model Management**: Configure and manage multiple LLMs and embedding models
- **Knowledge Bases**: Vector-based document storage and retrieval using FAISS
- **Multi-turn Conversations**: Context-aware chat with RAG support
- **Agent Orchestration**: LangChain agents with custom tools
## Architecture Layers

### 1. API Layer (`app/api/`)

FastAPI routers handling HTTP requests:

- **models.py**: Model CRUD operations
- **kb.py**: Knowledge base management and querying
- **conv.py**: Conversation and message endpoints
- **agent.py**: Agent execution and logging

### 2. Service Layer (`app/services/`)

Business logic and orchestration:

- **ModelManager**: LLM and embedding model management with environment-variable substitution
- **KnowledgeBaseManager**: Document ingestion, chunking, and vector search
- **ConversationManager**: Multi-turn conversations with RAG integration
- **AgentOrchestrator**: Agent execution with tool calling
- **AsyncJobManager**: Background job processing using asyncio
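The `${VAR}` substitution that ModelManager performs on model configs can be sketched with the standard library alone. The function name `resolve_env_vars` and the config shape below are illustrative, not the project's actual API:

```python
import os
import re

def resolve_env_vars(config: dict) -> dict:
    """Recursively replace ${VAR} placeholders in string values with
    the corresponding environment variable (empty string if unset)."""
    pattern = re.compile(r"\$\{(\w+)\}")

    def substitute(value):
        if isinstance(value, str):
            return pattern.sub(lambda m: os.environ.get(m.group(1), ""), value)
        if isinstance(value, dict):
            return {k: substitute(v) for k, v in value.items()}
        if isinstance(value, list):
            return [substitute(v) for v in value]
        return value  # numbers, booleans, None pass through unchanged

    return substitute(config)

# Example: an LLM config whose API key comes from the environment
os.environ["OPENAI_API_KEY"] = "sk-demo"
cfg = resolve_env_vars({"model": "gpt-4", "api_key": "${OPENAI_API_KEY}"})
print(cfg["api_key"])  # sk-demo
```

Keeping secrets out of the stored JSON config and resolving them at load time is what allows the same config rows to work across environments.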
### 3. Tools Layer (`app/tools/`)

LangChain tools for agent use:

- **KnowledgeBaseRetriever**: Vector similarity search tool
- **CalculatorTool**: Mathematical expression evaluation
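A calculator tool must evaluate expressions without the security risks of raw `eval`. A minimal sketch of the safe-evaluation core (the actual CalculatorTool would wrap something like this in a LangChain tool; the name `safe_eval` is illustrative):

```python
import ast
import operator

# Whitelisted arithmetic operators; anything else is rejected
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}

def safe_eval(expr: str) -> float:
    """Evaluate an arithmetic expression via the AST, rejecting
    names, calls, and any operator not in the whitelist."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"Unsupported expression: {expr!r}")
    return walk(ast.parse(expr, mode="eval"))

print(safe_eval("2 * (3 + 4)"))  # 14
```

Because only constants and whitelisted operators are walked, inputs like `__import__('os')` raise `ValueError` instead of executing.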
### 4. Data Layer (`app/db/`)

Database models and management:

- **SQLAlchemy ORM Models**: Models, KnowledgeBase, Document, Conversation, Message, ToolCall
- **Alembic Migrations**: Schema versioning and migration scripts
- **DatabaseManager**: Connection pooling and session management

### 5. Utilities (`app/utils/`)

Supporting utilities:

- **TextSplitter**: Document chunking with configurable size/overlap
- **FAISSHelper**: FAISS index creation, persistence, and querying
- **Logger**: Structured JSON logging with structlog
- **Exceptions**: Custom exception hierarchy
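Chunking with a configurable size/overlap is the core of TextSplitter. A simplified character-based sketch using the documented defaults (the real splitter may also respect sentence or paragraph boundaries):

```python
def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into chunks of at most chunk_size characters; each
    chunk starts chunk_size - chunk_overlap after the previous one,
    so consecutive chunks share chunk_overlap characters."""
    if not text:
        return []
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    stop = max(len(text) - chunk_overlap, 1)  # avoid a redundant final chunk
    return [text[i:i + chunk_size] for i in range(0, stop, step)]

print(split_text("abcdefghij", chunk_size=4, chunk_overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
```

The overlap ensures that a sentence cut at a chunk boundary still appears whole in at least one chunk, which improves retrieval recall.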
## Data Flow

### RAG-Enhanced Chat

```
User Request
    ↓
[API Layer] conv.py → chat()
    ↓
[Service] ConversationManager.chat()
    ↓
1. Save user message to DB
2. Retrieve conversation history
3. [If RAG enabled] Query KB → FAISS → Top-K docs
4. Build prompt: context + history + user input
5. LLM.predict()
6. Save assistant message
    ↓
Response with sources
```
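Steps 3 and 4 above, injecting retrieved context and conversation history into the prompt, can be sketched as follows; the template and the name `build_prompt` are illustrative, not the project's exact prompt format:

```python
def build_prompt(context_docs: list[str], history: list[dict], user_input: str) -> str:
    """Assemble the final LLM prompt: retrieved context first,
    then the conversation so far, then the new user message."""
    parts = []
    if context_docs:  # RAG enabled and documents were retrieved
        joined = "\n---\n".join(context_docs)
        parts.append(f"Answer using the following context:\n{joined}")
    for msg in history:  # e.g. {"role": "user", "content": "..."}
        parts.append(f"{msg['role']}: {msg['content']}")
    parts.append(f"user: {user_input}")
    return "\n\n".join(parts)

history = [{"role": "user", "content": "hi"}, {"role": "assistant", "content": "hello"}]
print(build_prompt(["FAISS is a vector index."], history, "What is FAISS?"))
```

When RAG is disabled the context block is simply omitted, so the same function serves both plain and retrieval-augmented chats.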
### Document Ingestion

```
Document Upload
    ↓
[API Layer] kb.py → ingest_documents()
    ↓
[Service] KnowledgeBaseManager.ingest_documents()
    ↓
[Background Job] _ingest_documents_sync()
    ↓
1. Get embedding model
2. Chunk documents (TextSplitter)
3. Create/load FAISS index
4. Add document vectors
5. Save index to disk
6. Save metadata to DB
    ↓
Job Complete
```
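Steps 3 and 4 map onto FAISS index creation and vector insertion. To keep the sketch dependency-free, here is the same idea, storing vectors by ID and returning the top-K nearest by L2 distance, in plain Python; this is an illustrative stand-in, not the project's FAISSHelper:

```python
import math

class TinyVectorIndex:
    """In-memory stand-in for a FAISS index: stores (id, vector)
    pairs and returns the k nearest neighbors by L2 distance."""

    def __init__(self) -> None:
        self.vectors: list[tuple[str, list[float]]] = []

    def add(self, doc_id: str, vec: list[float]) -> None:
        self.vectors.append((doc_id, vec))

    def search(self, query: list[float], k: int = 3) -> list[tuple[str, float]]:
        scored = sorted(
            ((doc_id, math.dist(query, v)) for doc_id, v in self.vectors),
            key=lambda t: t[1],  # smaller distance = more similar
        )
        return scored[:k]

index = TinyVectorIndex()
index.add("doc1", [0.0, 1.0])
index.add("doc2", [1.0, 0.0])
print(index.search([0.0, 0.9], k=1))  # doc1 is closest
```

FAISS does the same work with optimized, persistable index structures; step 5 corresponds to writing that index to `FAISS_BASE_PATH`.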
### Agent Execution

```
Agent Task
    ↓
[API Layer] agent.py → execute_agent()
    ↓
[Service] AgentOrchestrator.execute_agent()
    ↓
1. Create tools (Calculator, Retriever)
2. Create LangChain agent
3. Execute with AgentExecutor
4. Log tool calls to DB
    ↓
Response with tool call history
```
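Steps 3 and 4, executing tools and logging each invocation, reduce to a dispatch-and-record pattern. A dependency-free sketch (the real orchestrator delegates execution to LangChain's AgentExecutor; the names here are illustrative):

```python
from typing import Any, Callable

def run_tool(tools: dict[str, Callable[[str], Any]],
             name: str, tool_input: str, log: list[dict]) -> Any:
    """Invoke a named tool and append an input/output record,
    mirroring how each call is persisted to the tool-calls table."""
    if name not in tools:
        raise KeyError(f"Unknown tool: {name}")
    output = tools[name](tool_input)
    log.append({"tool": name, "input": tool_input, "output": output})
    return output

tool_log: list[dict] = []
tools = {"echo": lambda s: s.upper()}  # demo tool standing in for Calculator/Retriever
result = run_tool(tools, "echo", "hello", tool_log)
print(result, tool_log[0]["tool"])  # HELLO echo
```

Logging every input/output pair is what makes the "response with tool call history" at the end of the flow possible, and gives a debugging trail for agent runs.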
## Database Schema

### Models Table
- Stores LLM and embedding model configurations
- Config stored as JSON with `${VAR}` substitution support

### Knowledge Bases
- KnowledgeBase: Metadata (name, description)
- Document: Content, source, metadata, embedding_id

### Conversations
- Conversation: User ID, title
- Message: Role, content, metadata (sources, model)

### Tool Calls
- Logs all agent tool invocations
- Stores input/output for debugging
## Configuration

Environment-based configuration using Pydantic Settings:

```
DATABASE_URL     # MySQL connection string
OPENAI_API_KEY   # OpenAI API key
OPENAI_BASE_URL  # Optional proxy/custom endpoint
FAISS_BASE_PATH  # FAISS index storage directory
CHUNK_SIZE       # Document chunk size (default 1000)
CHUNK_OVERLAP    # Chunk overlap (default 200)
API_HOST         # Server host (default 0.0.0.0)
API_PORT         # Server port (default 8000)
ENVIRONMENT      # dev/staging/production
DEBUG            # Enable debug mode
```
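The environment-first lookup with defaults that Pydantic Settings provides can be illustrated with the standard library alone; the app itself uses a `BaseSettings` class, and the field names below simply mirror the variables listed above:

```python
import os
from dataclasses import dataclass, field

def _env(name: str, default: str) -> str:
    """Return the environment variable if set, else the default."""
    return os.environ.get(name, default)

@dataclass
class Settings:
    """Environment-first configuration mirroring the variables above."""
    api_host: str = field(default_factory=lambda: _env("API_HOST", "0.0.0.0"))
    api_port: int = field(default_factory=lambda: int(_env("API_PORT", "8000")))
    chunk_size: int = field(default_factory=lambda: int(_env("CHUNK_SIZE", "1000")))
    chunk_overlap: int = field(default_factory=lambda: int(_env("CHUNK_OVERLAP", "200")))
    environment: str = field(default_factory=lambda: _env("ENVIRONMENT", "dev"))

settings = Settings()
print(settings.api_port)  # 8000 unless API_PORT is set
```

Pydantic Settings adds validation and `.env`-file loading on top of this pattern, which is why the project uses it rather than raw `os.environ` lookups.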
## Key Design Decisions

### 1. Asyncio over Celery
- Uses Python's native asyncio for background jobs
- Simpler deployment; no external message broker required
- Suitable for medium-scale workloads
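A minimal sketch of this approach: background jobs are `asyncio.Task` objects tracked in a dict keyed by job ID, with no broker involved. The class and method names are illustrative, not the project's actual AsyncJobManager API:

```python
import asyncio
import uuid

class JobManager:
    """Track fire-and-forget background jobs by ID, broker-free."""

    def __init__(self) -> None:
        self.jobs: dict[str, asyncio.Task] = {}

    def submit(self, coro) -> str:
        """Schedule a coroutine as a background task; return its job ID."""
        job_id = str(uuid.uuid4())
        self.jobs[job_id] = asyncio.create_task(coro)
        return job_id

    def status(self, job_id: str) -> str:
        return "done" if self.jobs[job_id].done() else "running"

async def ingest(n: int) -> int:
    await asyncio.sleep(0)  # stand-in for real chunk/embed/index work
    return n * 2

async def main():
    mgr = JobManager()
    job_id = mgr.submit(ingest(21))
    result = await mgr.jobs[job_id]
    print(mgr.status(job_id), result)  # done 42

asyncio.run(main())
```

The trade-off versus Celery: jobs die with the process and cannot be distributed across workers, which is acceptable at the scale this kit targets.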
### 2. FAISS CPU Version
- Uses faiss-cpu for broader compatibility
- Easier deployment without GPU requirements
- Sufficient for learning/development purposes

### 3. JSON Metadata Storage
- Uses MySQL JSON columns for flexible metadata
- Columns renamed from `metadata` to `doc_metadata`/`msg_metadata` to avoid SQLAlchemy's reserved `metadata` attribute

### 4. OpenAI Proxy Support
- Configurable `base_url` for API routing
- Supports both LLM and embedding endpoints
- Applied globally or per-model

### 5. Structured Logging
- JSON-formatted logs via structlog
- Easy parsing by log-aggregation systems
- Contextual information for debugging
## Deployment

### Development
```bash
# Local development
conda activate pyth-311
python scripts/init_db.py
python -m uvicorn app.main:app --reload
```

### Production with Docker
```bash
# Build and run with Docker Compose
docker-compose up -d

# Check status
docker-compose ps

# View logs
docker-compose logs -f api
```
## Scalability Considerations

### Current Architecture
- Single FastAPI instance
- MySQL for persistence
- FAISS indexes on local disk
- Asyncio for background jobs

### Future Enhancements
- **Horizontal Scaling**: Add a load balancer and shared FAISS storage (S3/NFS)
- **Distributed Jobs**: Replace asyncio with Celery + Redis/RabbitMQ
- **Vector Database**: Migrate from FAISS to Pinecone/Weaviate/Milvus
- **Caching**: Add Redis for conversation history and KB query results
- **Monitoring**: Prometheus metrics, Grafana dashboards
- **Authentication**: JWT-based auth with user management
## Security

The current implementation is designed for learning and development:

- ⚠️ No authentication/authorization
- ⚠️ CORS configured for `*` (all origins)
- ⚠️ Database credentials stored in a `.env` file

**Production recommendations:**
- Implement JWT authentication
- Add role-based access control (RBAC)
- Use secrets management (Vault, AWS Secrets Manager)
- Restrict CORS to specific origins
- Enable HTTPS/TLS
- Add rate limiting
- Implement input validation and sanitization