
LangChain Learning Kit - Architecture

Overview

The LangChain Learning Kit is a learning platform built with FastAPI, LangChain, MySQL, and FAISS, organized like a production service. It provides:

  • Model Management: Configure and manage multiple LLMs and embedding models
  • Knowledge Bases: Vector-based document storage and retrieval using FAISS
  • Multi-turn Conversations: Context-aware chat with RAG support
  • Agent Orchestration: LangChain agents with custom tools

Architecture Layers

1. API Layer (app/api/)

FastAPI routers handling HTTP requests:

  • models.py: Model CRUD operations
  • kb.py: Knowledge base management and querying
  • conv.py: Conversation and message endpoints
  • agent.py: Agent execution and logging
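
As a rough sketch of how one of these routers is wired (route path and schema names are illustrative, not the project's actual endpoints):

# Hypothetical excerpt in the style of app/api/models.py -- names are illustrative
from fastapi import APIRouter
from pydantic import BaseModel

router = APIRouter(prefix="/models", tags=["models"])

class ModelCreate(BaseModel):   # assumed request schema
    name: str
    provider: str
    config: dict = {}

@router.post("/")
def create_model(payload: ModelCreate):
    # delegate to the service layer (ModelManager) instead of touching the DB here
    ...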

2. Service Layer (app/services/)

Business logic and orchestration:

  • ModelManager: LLM and embedding model management with environment variable substitution
  • KnowledgeBaseManager: Document ingestion, chunking, and vector search
  • ConversationManager: Multi-turn conversations with RAG integration
  • AgentOrchestrator: Agent execution with tool calling
  • AsyncJobManager: Background job processing using asyncio

3. Tools Layer (app/tools/)

LangChain tools for agent use:

  • KnowledgeBaseRetriever: Vector similarity search tool
  • CalculatorTool: Mathematical expression evaluation
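
To illustrate how a tool like CalculatorTool could be exposed to agents, here is a hedged sketch using the LangChain @tool decorator with a safe expression evaluator (the actual implementation in app/tools/ may differ):

# Hypothetical calculator tool; evaluates arithmetic without calling eval()
import ast
import operator
from langchain_core.tools import tool

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv, ast.USub: operator.neg}

def _eval(node):
    # recursively evaluate a parsed arithmetic expression
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp):
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp):
        return _OPS[type(node.op)](_eval(node.operand))
    raise ValueError("unsupported expression")

@tool
def calculator(expression: str) -> str:
    """Evaluate a basic arithmetic expression, e.g. '2 * (3 + 4)'."""
    return str(_eval(ast.parse(expression, mode="eval").body))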

4. Data Layer (app/db/)

Database models and management:

  • SQLAlchemy ORM Models: Models, KnowledgeBase, Document, Conversation, Message, ToolCall
  • Alembic Migrations: Schema versioning and migration scripts
  • DatabaseManager: Connection pooling and session management
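
A plausible shape for DatabaseManager's pooling and session handling (engine URL and pool sizes are assumptions, not the project's values):

# Hypothetical sketch of connection pooling and session management with SQLAlchemy
from contextlib import contextmanager
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

engine = create_engine(
    "mysql+pymysql://user:pass@localhost:3306/langchain_kit",  # DATABASE_URL in practice
    pool_size=10, max_overflow=20, pool_pre_ping=True,
)
SessionLocal = sessionmaker(bind=engine, autoflush=False)

@contextmanager
def get_session():
    # yield a session with commit/rollback/close handled in one place
    session = SessionLocal()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()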

5. Utilities (app/utils/)

Supporting utilities:

  • TextSplitter: Document chunking with configurable size/overlap
  • FAISSHelper: FAISS index creation, persistence, and querying
  • Logger: Structured JSON logging with structlog
  • Exceptions: Custom exception hierarchy

Data Flow

RAG-Enhanced Chat

User Request
    ↓
[API Layer] conv.py → chat()
    ↓
[Service] ConversationManager.chat()
    ↓
1. Save user message to DB
2. Retrieve conversation history
3. [If RAG enabled] Query KB → FAISS → Top-K docs
4. Build prompt: context + history + user input
5. LLM.predict()
6. Save assistant message
    ↓
Response with sources
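
Condensed as code, the flow could look roughly like the following (the persistence helpers and model name are hypothetical stand-ins; the real ConversationManager works against the Message table):

# Hypothetical condensation of ConversationManager.chat(); helpers are stand-ins, not project code
from langchain_openai import ChatOpenAI

_MESSAGES: dict[int, list[tuple[str, str]]] = {}              # stand-in for the Message table

def save_message(cid: int, role: str, content: str) -> None:
    _MESSAGES.setdefault(cid, []).append((role, content))

def load_history(cid: int) -> str:
    return "\n".join(f"{r}: {c}" for r, c in _MESSAGES.get(cid, []))

def chat(conversation_id: int, user_input: str, kb=None) -> dict:
    save_message(conversation_id, "user", user_input)          # 1. persist the user turn
    history = load_history(conversation_id)                    # 2. prior messages
    context, sources = "", []
    if kb is not None:                                         # 3. optional RAG step
        docs = kb.similarity_search(user_input, k=4)
        context = "\n\n".join(d.page_content for d in docs)
        sources = [d.metadata.get("source") for d in docs]
    prompt = f"Context:\n{context}\n\nHistory:\n{history}\n\nUser: {user_input}"   # 4. build prompt
    answer = ChatOpenAI(model="gpt-4o-mini").invoke(prompt).content                # 5. call the LLM
    save_message(conversation_id, "assistant", answer)         # 6. persist the reply
    return {"answer": answer, "sources": sources}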

Document Ingestion

Document Upload
    ↓
[API Layer] kb.py → ingest_documents()
    ↓
[Service] KnowledgeBaseManager.ingest_documents()
    ↓
[Background Job] _ingest_documents_sync()
    ↓
1. Get embedding model
2. Chunk documents (TextSplitter)
3. Create/load FAISS index
4. Add document vectors
5. Save index to disk
6. Save metadata to DB
    ↓
Job Complete
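
A minimal sketch of the indexing steps using LangChain's FAISS wrapper (paths and model names are assumptions):

# Hypothetical sketch of the ingestion steps; paths and model names are illustrative
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")        # 1. embedding model
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.create_documents(["...raw document text..."])      # 2. chunk documents (TextSplitter)
index = FAISS.from_documents(chunks, embeddings)                     # 3-4. create index and add vectors
index.save_local("./faiss_indexes/kb_1")                             # 5. persist under FAISS_BASE_PATH
# 6. chunk metadata and embedding ids would then be written to the Document table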

Agent Execution

Agent Task
    ↓
[API Layer] agent.py → execute_agent()
    ↓
[Service] AgentOrchestrator.execute_agent()
    ↓
1. Create tools (Calculator, Retriever)
2. Create LangChain agent
3. Execute with AgentExecutor
4. Log tool calls to DB
    ↓
Response with tool call history
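
A hedged sketch of these steps with LangChain's tool-calling agent API, reusing the calculator tool sketched in the Tools Layer section (the actual prompt and tool set in AgentOrchestrator may differ):

# Hypothetical sketch of agent execution; prompt wording and tool list are illustrative
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")
tools = [calculator]                               # e.g. CalculatorTool / KnowledgeBaseRetriever
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),         # slot for intermediate tool-calling steps
])
agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, return_intermediate_steps=True)
result = executor.invoke({"input": "What is 17 * 24?"})
# result["intermediate_steps"] holds (action, output) pairs that can be logged to the ToolCall table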

Database Schema

Models Table

  • Stores LLM and embedding model configurations
  • Config stored as JSON with ${VAR} substitution support
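
A small sketch of how ${VAR} placeholders in a stored config could be expanded at load time (the helper name is hypothetical):

# Hypothetical ${VAR} substitution when a model config is loaded from the Models table
import os
import re

_PATTERN = re.compile(r"\$\{(\w+)\}")

def resolve_env_vars(config: dict) -> dict:
    # replace ${VAR} tokens in string values with the matching environment variable
    return {
        key: _PATTERN.sub(lambda m: os.environ.get(m.group(1), ""), value)
        if isinstance(value, str) else value
        for key, value in config.items()
    }

stored = {"model": "gpt-4o-mini", "api_key": "${OPENAI_API_KEY}", "base_url": "${OPENAI_BASE_URL}"}
resolved = resolve_env_vars(stored)     # placeholders replaced with values from the environment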

Knowledge Bases

  • KnowledgeBase: Metadata (name, description)
  • Document: Content, source, metadata, embedding_id

Conversations

  • Conversation: User ID, title
  • Message: Role, content, metadata (sources, model)

Tool Calls

  • Logs all agent tool invocations
  • Stores input/output for debugging
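
As a compressed sketch, two of these tables might be declared as SQLAlchemy models roughly as follows (column names beyond those listed above are assumptions):

# Hypothetical ORM sketch; note the doc_metadata / msg_metadata naming (see Key Design Decisions)
from sqlalchemy import JSON, Column, ForeignKey, Integer, String, Text
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Document(Base):
    __tablename__ = "documents"
    id = Column(Integer, primary_key=True)
    kb_id = Column(Integer, ForeignKey("knowledge_bases.id"))
    content = Column(Text)
    source = Column(String(255))
    embedding_id = Column(String(64))
    doc_metadata = Column(JSON)        # "metadata" is reserved on SQLAlchemy declarative models

class Message(Base):
    __tablename__ = "messages"
    id = Column(Integer, primary_key=True)
    conversation_id = Column(Integer, ForeignKey("conversations.id"))
    role = Column(String(16))
    content = Column(Text)
    msg_metadata = Column(JSON)        # sources, model name, etc.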

Configuration

Environment-based configuration using Pydantic Settings:

DATABASE_URL         # MySQL connection string
OPENAI_API_KEY       # OpenAI API key
OPENAI_BASE_URL      # Optional proxy/custom endpoint
FAISS_BASE_PATH      # FAISS index storage directory
CHUNK_SIZE           # Document chunk size (default 1000)
CHUNK_OVERLAP        # Chunk overlap (default 200)
API_HOST             # Server host (default 0.0.0.0)
API_PORT             # Server port (default 8000)
ENVIRONMENT          # dev/staging/production
DEBUG                # Enable debug mode
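
A sketch of the corresponding settings object (field defaults mirror the listing above; the exact class in the project may differ):

# Hypothetical Pydantic Settings class matching the variables above
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    database_url: str
    openai_api_key: str
    openai_base_url: str | None = None
    faiss_base_path: str = "./faiss_indexes"
    chunk_size: int = 1000
    chunk_overlap: int = 200
    api_host: str = "0.0.0.0"
    api_port: int = 8000
    environment: str = "dev"
    debug: bool = False

settings = Settings()   # values are read from the environment / .env at import time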

Key Design Decisions

1. Asyncio over Celery

  • Uses Python's native asyncio for background jobs
  • Simpler deployment, no external message broker required
  • Suitable for medium-scale workloads
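
A minimal sketch of this pattern (the real AsyncJobManager presumably also records job status in the database; this only shows the scheduling idea):

# Hypothetical asyncio job pattern; bookkeeping is simplified to an in-memory dict
import asyncio
import uuid

jobs: dict[str, str] = {}                      # job_id -> status

async def _run_job(job_id: str, work) -> None:
    jobs[job_id] = "running"
    try:
        await asyncio.to_thread(work)          # run blocking work (e.g. ingestion) off the event loop
        jobs[job_id] = "completed"
    except Exception:
        jobs[job_id] = "failed"

def submit(work) -> str:
    # called from a request handler (an event loop is already running); returns a job id immediately
    job_id = uuid.uuid4().hex
    asyncio.create_task(_run_job(job_id, work))
    return job_id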

2. FAISS CPU Version

  • Uses faiss-cpu for broader compatibility
  • Easier deployment without GPU requirements
  • Sufficient for learning/development purposes

3. JSON Metadata Storage

  • Uses MySQL JSON columns for flexible metadata
  • Columns renamed from metadata to doc_metadata/msg_metadata because metadata is a reserved attribute on SQLAlchemy declarative models

4. OpenAI Proxy Support

  • Configurable base_url for API routing
  • Supports both LLM and embedding endpoints
  • Applied globally or per-model
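
For example, a per-model override might pass the endpoint straight to the LangChain clients (the URL below is a placeholder; when no override is set, the project's OPENAI_BASE_URL setting would be used):

# Hypothetical per-model proxy routing; the URL is a placeholder
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

llm = ChatOpenAI(
    model="gpt-4o-mini",
    base_url="https://my-proxy.example.com/v1",
)
embeddings = OpenAIEmbeddings(
    model="text-embedding-3-small",
    base_url="https://my-proxy.example.com/v1",
)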

5. Structured Logging

  • JSON-formatted logs via structlog
  • Easy parsing for log aggregation systems
  • Contextual information for debugging
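
A minimal structlog configuration along these lines (the processor choice is an assumption):

# Hypothetical structlog setup emitting one JSON object per log line
import structlog

structlog.configure(
    processors=[
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer(),
    ]
)

log = structlog.get_logger()
log.info("kb_query", kb_id=1, top_k=4, latency_ms=42)   # fields become keys in the JSON line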

Deployment

Development

# Local development
conda activate pyth-311
python scripts/init_db.py
python -m uvicorn app.main:app --reload

Production with Docker

# Build and run with Docker Compose
docker-compose up -d

# Check status
docker-compose ps

# View logs
docker-compose logs -f api

Scalability Considerations

Current Architecture

  • Single FastAPI instance
  • MySQL for persistence
  • FAISS indexes on local disk
  • Asyncio for background jobs

Future Enhancements

  • Horizontal Scaling: Add load balancer, shared FAISS storage (S3/NFS)
  • Distributed Jobs: Replace asyncio with Celery + Redis/RabbitMQ
  • Vector Database: Migrate from FAISS to Pinecone/Weaviate/Milvus
  • Caching: Add Redis for conversation history and KB query results
  • Monitoring: Prometheus metrics, Grafana dashboards
  • Authentication: JWT-based auth with user management

Security

The current implementation is designed for learning and development:

  • ⚠️ No authentication/authorization
  • ⚠️ CORS configured for * (all origins)
  • ⚠️ Database credentials in .env file

Production recommendations:

  • Implement JWT authentication
  • Add role-based access control (RBAC)
  • Use secrets management (Vault, AWS Secrets Manager)
  • Restrict CORS to specific origins (see the sketch after this list)
  • Enable HTTPS/TLS
  • Add rate limiting
  • Implement input validation and sanitization
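
A minimal sketch of the CORS restriction in FastAPI (origins, methods, and headers are placeholders to adapt):

# Hypothetical CORS restriction replacing the current "*" wildcard
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://app.example.com"],
    allow_credentials=True,
    allow_methods=["GET", "POST", "PUT", "DELETE"],
    allow_headers=["Authorization", "Content-Type"],
)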