
LangChain Learning Kit - Architecture

Overview

The LangChain Learning Kit is a learning platform built with FastAPI, LangChain, MySQL, and FAISS, organized like a production service. It provides:

  • Model Management: Configure and manage multiple LLMs and embedding models
  • Knowledge Bases: Vector-based document storage and retrieval using FAISS
  • Multi-turn Conversations: Context-aware chat with RAG support
  • Agent Orchestration: LangChain agents with custom tools

Architecture Layers

1. API Layer (app/api/)

FastAPI routers handling HTTP requests:

  • models.py: Model CRUD operations
  • kb.py: Knowledge base management and querying
  • conv.py: Conversation and message endpoints
  • agent.py: Agent execution and logging
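
As a rough sketch of how one of these routers is wired (route path and schema names are illustrative, not the project's actual endpoints):

# Hypothetical excerpt in the style of app/api/models.py -- names are illustrative
from fastapi import APIRouter
from pydantic import BaseModel

router = APIRouter(prefix="/models", tags=["models"])

class ModelCreate(BaseModel):   # assumed request schema
    name: str
    provider: str
    config: dict = {}

@router.post("/")
def create_model(payload: ModelCreate):
    # delegate to the service layer (ModelManager) instead of touching the DB here
    ...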

2. Service Layer (app/services/)

Business logic and orchestration:

  • ModelManager: LLM and embedding model management with environment variable substitution
  • KnowledgeBaseManager: Document ingestion, chunking, and vector search
  • ConversationManager: Multi-turn conversations with RAG integration
  • AgentOrchestrator: Agent execution with tool calling
  • AsyncJobManager: Background job processing using asyncio

3. Tools Layer (app/tools/)

LangChain tools for agent use:

  • KnowledgeBaseRetriever: Vector similarity search tool
  • CalculatorTool: Mathematical expression evaluation
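
To illustrate how a tool like CalculatorTool could be exposed to agents, here is a hedged sketch using the LangChain @tool decorator with a safe expression evaluator (the actual implementation in app/tools/ may differ):

# Hypothetical calculator tool; evaluates arithmetic without calling eval()
import ast
import operator
from langchain_core.tools import tool

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv, ast.USub: operator.neg}

def _eval(node):
    # recursively evaluate a parsed arithmetic expression
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp):
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp):
        return _OPS[type(node.op)](_eval(node.operand))
    raise ValueError("unsupported expression")

@tool
def calculator(expression: str) -> str:
    """Evaluate a basic arithmetic expression, e.g. '2 * (3 + 4)'."""
    return str(_eval(ast.parse(expression, mode="eval").body))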

4. Data Layer (app/db/)

Database models and management:

  • SQLAlchemy ORM Models: Models, KnowledgeBase, Document, Conversation, Message, ToolCall
  • Alembic Migrations: Schema versioning and migration scripts
  • DatabaseManager: Connection pooling and session management
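
A plausible shape for DatabaseManager's pooling and session handling (engine URL and pool sizes are assumptions, not the project's values):

# Hypothetical sketch of connection pooling and session management with SQLAlchemy
from contextlib import contextmanager
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

engine = create_engine(
    "mysql+pymysql://user:pass@localhost:3306/langchain_kit",  # DATABASE_URL in practice
    pool_size=10, max_overflow=20, pool_pre_ping=True,
)
SessionLocal = sessionmaker(bind=engine, autoflush=False)

@contextmanager
def get_session():
    # yield a session with commit/rollback/close handled in one place
    session = SessionLocal()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()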

5. Utilities (app/utils/)

Supporting utilities:

  • TextSplitter: Document chunking with configurable size/overlap
  • FAISSHelper: FAISS index creation, persistence, and querying
  • Logger: Structured JSON logging with structlog
  • Exceptions: Custom exception hierarchy

Data Flow

RAG-Enhanced Chat

User Request
    ↓
[API Layer] conv.py → chat()
    ↓
[Service] ConversationManager.chat()
    ↓
1. Save user message to DB
2. Retrieve conversation history
3. [If RAG enabled] Query KB → FAISS → Top-K docs
4. Build prompt: context + history + user input
5. LLM.predict()
6. Save assistant message
    ↓
Response with sources
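
Condensed as code, the flow could look roughly like the following (the persistence helpers and model name are hypothetical stand-ins; the real ConversationManager works against the Message table):

# Hypothetical condensation of ConversationManager.chat(); helpers are stand-ins, not project code
from langchain_openai import ChatOpenAI

_MESSAGES: dict[int, list[tuple[str, str]]] = {}              # stand-in for the Message table

def save_message(cid: int, role: str, content: str) -> None:
    _MESSAGES.setdefault(cid, []).append((role, content))

def load_history(cid: int) -> str:
    return "\n".join(f"{r}: {c}" for r, c in _MESSAGES.get(cid, []))

def chat(conversation_id: int, user_input: str, kb=None) -> dict:
    save_message(conversation_id, "user", user_input)          # 1. persist the user turn
    history = load_history(conversation_id)                    # 2. prior messages
    context, sources = "", []
    if kb is not None:                                         # 3. optional RAG step
        docs = kb.similarity_search(user_input, k=4)
        context = "\n\n".join(d.page_content for d in docs)
        sources = [d.metadata.get("source") for d in docs]
    prompt = f"Context:\n{context}\n\nHistory:\n{history}\n\nUser: {user_input}"   # 4. build prompt
    answer = ChatOpenAI(model="gpt-4o-mini").invoke(prompt).content                # 5. call the LLM
    save_message(conversation_id, "assistant", answer)         # 6. persist the reply
    return {"answer": answer, "sources": sources}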

Document Ingestion

Document Upload
    ↓
[API Layer] kb.py → ingest_documents()
    ↓
[Service] KnowledgeBaseManager.ingest_documents()
    ↓
[Background Job] _ingest_documents_sync()
    ↓
1. Get embedding model
2. Chunk documents (TextSplitter)
3. Create/load FAISS index
4. Add document vectors
5. Save index to disk
6. Save metadata to DB
    ↓
Job Complete
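
A minimal sketch of the indexing steps using LangChain's FAISS wrapper (paths and model names are assumptions):

# Hypothetical sketch of the ingestion steps; paths and model names are illustrative
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")        # 1. embedding model
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.create_documents(["...raw document text..."])      # 2. chunk documents (TextSplitter)
index = FAISS.from_documents(chunks, embeddings)                     # 3-4. create index and add vectors
index.save_local("./faiss_indexes/kb_1")                             # 5. persist under FAISS_BASE_PATH
# 6. chunk metadata and embedding ids would then be written to the Document table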

Agent Execution

Agent Task
    ↓
[API Layer] agent.py → execute_agent()
    ↓
[Service] AgentOrchestrator.execute_agent()
    ↓
1. Create tools (Calculator, Retriever)
2. Create LangChain agent
3. Execute with AgentExecutor
4. Log tool calls to DB
    ↓
Response with tool call history
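
A hedged sketch of these steps with LangChain's tool-calling agent API, reusing the calculator tool sketched in the Tools Layer section (the actual prompt and tool set in AgentOrchestrator may differ):

# Hypothetical sketch of agent execution; prompt wording and tool list are illustrative
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")
tools = [calculator]                               # e.g. CalculatorTool / KnowledgeBaseRetriever
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),         # slot for intermediate tool-calling steps
])
agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, return_intermediate_steps=True)
result = executor.invoke({"input": "What is 17 * 24?"})
# result["intermediate_steps"] holds (action, output) pairs that can be logged to the ToolCall table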

Database Schema

Models Table

  • Stores LLM and embedding model configurations
  • Config stored as JSON with ${VAR} substitution support
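
A small sketch of how ${VAR} placeholders in a stored config could be expanded at load time (the helper name is hypothetical):

# Hypothetical ${VAR} substitution when a model config is loaded from the Models table
import os
import re

_PATTERN = re.compile(r"\$\{(\w+)\}")

def resolve_env_vars(config: dict) -> dict:
    # replace ${VAR} tokens in string values with the matching environment variable
    return {
        key: _PATTERN.sub(lambda m: os.environ.get(m.group(1), ""), value)
        if isinstance(value, str) else value
        for key, value in config.items()
    }

stored = {"model": "gpt-4o-mini", "api_key": "${OPENAI_API_KEY}", "base_url": "${OPENAI_BASE_URL}"}
resolved = resolve_env_vars(stored)     # placeholders replaced with values from the environment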

Knowledge Bases

  • KnowledgeBase: Metadata (name, description)
  • Document: Content, source, metadata, embedding_id

Conversations

  • Conversation: User ID, title
  • Message: Role, content, metadata (sources, model)

Tool Calls

  • Logs all agent tool invocations
  • Stores input/output for debugging
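
As a compressed sketch, two of these tables might be declared as SQLAlchemy models roughly as follows (column names beyond those listed above are assumptions):

# Hypothetical ORM sketch; note the doc_metadata / msg_metadata naming (see Key Design Decisions)
from sqlalchemy import JSON, Column, ForeignKey, Integer, String, Text
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Document(Base):
    __tablename__ = "documents"
    id = Column(Integer, primary_key=True)
    kb_id = Column(Integer, ForeignKey("knowledge_bases.id"))
    content = Column(Text)
    source = Column(String(255))
    embedding_id = Column(String(64))
    doc_metadata = Column(JSON)        # "metadata" is reserved on SQLAlchemy declarative models

class Message(Base):
    __tablename__ = "messages"
    id = Column(Integer, primary_key=True)
    conversation_id = Column(Integer, ForeignKey("conversations.id"))
    role = Column(String(16))
    content = Column(Text)
    msg_metadata = Column(JSON)        # sources, model name, etc.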

Configuration

Environment-based configuration using Pydantic Settings:

DATABASE_URL         # MySQL connection string
OPENAI_API_KEY       # OpenAI API key
OPENAI_BASE_URL      # Optional proxy/custom endpoint
FAISS_BASE_PATH      # FAISS index storage directory
CHUNK_SIZE           # Document chunk size (default 1000)
CHUNK_OVERLAP        # Chunk overlap (default 200)
API_HOST             # Server host (default 0.0.0.0)
API_PORT             # Server port (default 8000)
ENVIRONMENT          # dev/staging/production
DEBUG                # Enable debug mode
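
A sketch of the corresponding settings object (field defaults mirror the listing above; the exact class in the project may differ):

# Hypothetical Pydantic Settings class matching the variables above
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    database_url: str
    openai_api_key: str
    openai_base_url: str | None = None
    faiss_base_path: str = "./faiss_indexes"
    chunk_size: int = 1000
    chunk_overlap: int = 200
    api_host: str = "0.0.0.0"
    api_port: int = 8000
    environment: str = "dev"
    debug: bool = False

settings = Settings()   # values are read from the environment / .env at import time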

Key Design Decisions

1. Asyncio over Celery

  • Uses Python's native asyncio for background jobs
  • Simpler deployment, no external message broker required
  • Suitable for medium-scale workloads
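
A minimal sketch of this pattern (the real AsyncJobManager presumably also records job status in the database; this only shows the scheduling idea):

# Hypothetical asyncio job pattern; bookkeeping is simplified to an in-memory dict
import asyncio
import uuid

jobs: dict[str, str] = {}                      # job_id -> status

async def _run_job(job_id: str, work) -> None:
    jobs[job_id] = "running"
    try:
        await asyncio.to_thread(work)          # run blocking work (e.g. ingestion) off the event loop
        jobs[job_id] = "completed"
    except Exception:
        jobs[job_id] = "failed"

def submit(work) -> str:
    # called from a request handler (an event loop is already running); returns a job id immediately
    job_id = uuid.uuid4().hex
    asyncio.create_task(_run_job(job_id, work))
    return job_id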

2. FAISS CPU Version

  • Uses faiss-cpu for broader compatibility
  • Easier deployment without GPU requirements
  • Sufficient for learning/development purposes

3. JSON Metadata Storage

  • Uses MySQL JSON columns for flexible metadata
  • Columns renamed from metadata to doc_metadata/msg_metadata because metadata is a reserved attribute on SQLAlchemy declarative models

4. OpenAI Proxy Support

  • Configurable base_url for API routing
  • Supports both LLM and embedding endpoints
  • Applied globally or per-model
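
For example, a per-model override might pass the endpoint straight to the LangChain clients (the URL below is a placeholder; when no override is set, the project's OPENAI_BASE_URL setting would be used):

# Hypothetical per-model proxy routing; the URL is a placeholder
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

llm = ChatOpenAI(
    model="gpt-4o-mini",
    base_url="https://my-proxy.example.com/v1",
)
embeddings = OpenAIEmbeddings(
    model="text-embedding-3-small",
    base_url="https://my-proxy.example.com/v1",
)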

5. Structured Logging

  • JSON-formatted logs via structlog
  • Easy parsing for log aggregation systems
  • Contextual information for debugging
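
A minimal structlog configuration along these lines (the processor choice is an assumption):

# Hypothetical structlog setup emitting one JSON object per log line
import structlog

structlog.configure(
    processors=[
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer(),
    ]
)

log = structlog.get_logger()
log.info("kb_query", kb_id=1, top_k=4, latency_ms=42)   # fields become keys in the JSON line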

Deployment

Development

# Local development
conda activate pyth-311
python scripts/init_db.py
python -m uvicorn app.main:app --reload

Production with Docker

# Build and run with Docker Compose
docker-compose up -d

# Check status
docker-compose ps

# View logs
docker-compose logs -f api

Scalability Considerations

Current Architecture

  • Single FastAPI instance
  • MySQL for persistence
  • FAISS indexes on local disk
  • Asyncio for background jobs

Future Enhancements

  • Horizontal Scaling: Add load balancer, shared FAISS storage (S3/NFS)
  • Distributed Jobs: Replace asyncio with Celery + Redis/RabbitMQ
  • Vector Database: Migrate from FAISS to Pinecone/Weaviate/Milvus
  • Caching: Add Redis for conversation history and KB query results
  • Monitoring: Prometheus metrics, Grafana dashboards
  • Authentication: JWT-based auth with user management

Security

The current implementation is designed for learning and development:

  • ⚠️ No authentication/authorization
  • ⚠️ CORS configured for * (all origins)
  • ⚠️ Database credentials in .env file

Production recommendations:

  • Implement JWT authentication
  • Add role-based access control (RBAC)
  • Use secrets management (Vault, AWS Secrets Manager)
  • Restrict CORS to specific origins (see the sketch after this list)
  • Enable HTTPS/TLS
  • Add rate limiting
  • Implement input validation and sanitization
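
A minimal sketch of the CORS restriction in FastAPI (origins, methods, and headers are placeholders to adapt):

# Hypothetical CORS restriction replacing the current "*" wildcard
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://app.example.com"],
    allow_credentials=True,
    allow_methods=["GET", "POST", "PUT", "DELETE"],
    allow_headers=["Authorization", "Content-Type"],
)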