Context: SmartDoc AI - Intelligent Document Q&A System
Project Scope: OSSD Course - Spring 2026
1. Tech Stack & Environment
Bạn là một AI Coding Assistant hỗ trợ phát triển hệ thống RAG local theo kiến trúc OOP chuẩn. Hãy tuân thủ nghiêm ngặt các công nghệ sau:
- Frontend: Streamlit 1.41.1 (hoặc cao hơn)
- Application Framework: LangChain 0.3.16 & LangChain Community 0.3.16
- LLM Engine: Ollama chạy mô hình
qwen2.5:7b(được tối ưu cho tiếng Việt) - Embedding Model: HuggingFace
sentence-transformers/paraphrase-multilingual-mpnet-base-v2(768-dimensional, chạy trên CPU) - Vector Database: FAISS 1.9.0
- Document Loader: PDFPlumber (dùng
PDFPlumberLoader) - Python Version: 3.8+
2. Software Architecture: Clean Architecture & MVC Pattern
2.1. Architecture Overview
Dự án tuân theo Clean Architecture với 4 layer độc lập:
┌─────────────────────────────────────────┐
│ Presentation Layer │ ← Views (Streamlit UI)
├─────────────────────────────────────────┤
│ Application Layer │ ← Controllers (orchestration)
├─────────────────────────────────────────┤
│ Domain/Business Layer │ ← Services (business logic)
├─────────────────────────────────────────┤
│ Infrastructure Layer │ ← Models, Utils, External APIs
└─────────────────────────────────────────┘
Dependency Rule: Dependencies chỉ được hướng từ ngoài vào trong (Views → Controllers → Services → Models).
2.2. Layer Responsibilities
Models Layer (src/models/)
- Trách nhiệm: Định nghĩa data structures, domain entities, và business rules cơ bản
- Không được: Chứa logic UI, gọi external services, hoặc biết về LangChain
- Ví dụ class:
@dataclass class Document: """Domain entity representing a document""" id: str content: str metadata: Dict[str, Any] created_at: datetime @dataclass class ChatMessage: """Value object for chat messages""" role: Literal["user", "assistant", "system"] content: str timestamp: datetime metadata: Optional[Dict] = None
Services Layer (src/services/)
- Trách nhiệm: Business logic, RAG pipeline, vector store operations, LLM interactions
- Design Pattern: Repository Pattern, Strategy Pattern
- Ví dụ interface:
class AbstractDocumentService(ABC): @abstractmethod def load_document(self, file_path: str) -> List[Document]: pass @abstractmethod def chunk_document(self, document: Document, chunk_size: int, chunk_overlap: int) -> List[Document]: pass class AbstractVectorStoreService(ABC): @abstractmethod def add_documents(self, documents: List[Document]) -> None: pass @abstractmethod def similarity_search(self, query: str, k: int) -> List[Document]: pass @abstractmethod def clear_store(self) -> None: pass
Controllers Layer (src/controllers/)
- Trách nhiệm: Orchestration, input validation, error handling, logging
- Pattern: Facade Pattern, Command Pattern
- Nguyên tắc: Mỗi controller method phải:
- Validate input
- Log action
- Delegate to services
- Handle errors
- Return standardized response
Views Layer (src/views/)
- Trách nhiệm: Streamlit UI components, user interactions, state management
- Pattern: Observer Pattern (Streamlit session state), Component Pattern
- Nguyên tắc:
- Components phải reusable và stateless (dùng session_state)
- Không được chứa business logic
- Chỉ gọi controllers, không gọi trực tiếp services
2.3. Folder Structure Chi Tiết
src/
├── models/
│ ├── __init__.py
│ ├── document_model.py # Document, Chunk entities
│ ├── chat_model.py # ChatMessage, ChatHistory
│ └── config_model.py # AppConfig, RAGConfig
├── services/
│ ├── __init__.py
│ ├── document_service.py # Document loading & chunking
│ ├── vector_store_service.py # FAISS operations
│ ├── llm_service.py # Ollama LLM wrapper
│ ├── embedding_service.py # HuggingFace embeddings
│ └── rag_service.py # Complete RAG pipeline
├── controllers/
│ ├── __init__.py
│ ├── document_controller.py # Upload, process, manage docs
│ ├── chat_controller.py # Handle queries, get answers
│ └── config_controller.py # Manage settings
├── views/
│ ├── __init__.py
│ ├── document_screen.py # Document upload UI
│ ├── chat_screen.py # Chat interface
│ ├── settings_screen.py # Configuration UI
│ └── components.py # Reusable UI components
└── utils/
├── __init__.py
├── logger.py # Logging configuration
├── validators.py # Input validation
├── exceptions.py # Custom exceptions
└── constants.py # App-wide constants
3. Design Patterns & Best Practices
3.1. Essential Design Patterns
Repository Pattern (Vector Store)
class VectorStoreRepository(ABC):
"""Abstract repository for vector store operations"""
@abstractmethod
def save(self, embeddings: List[np.ndarray],
metadata: List[Dict]) -> None:
pass
@abstractmethod
def search(self, query_embedding: np.ndarray,
k: int) -> List[Tuple[Document, float]]:
pass
class FAISSRepository(VectorStoreRepository):
"""Concrete implementation using FAISS"""
pass
Factory Pattern (Document Loaders)
class DocumentLoaderFactory:
"""Factory to create appropriate loader based on file type"""
@staticmethod
def create_loader(file_path: str) -> AbstractDocumentLoader:
extension = Path(file_path).suffix.lower()
if extension == '.pdf':
return PDFPlumberLoader()
elif extension == '.docx':
return DocxLoader()
else:
raise UnsupportedFileTypeError(f"Unsupported: {extension}")
Strategy Pattern (Text Splitting)
class TextSplittingStrategy(ABC):
@abstractmethod
def split(self, text: str) -> List[str]:
pass
class RecursiveCharacterSplitter(TextSplittingStrategy):
def __init__(self, chunk_size: int = 1000, chunk_overlap: int = 100):
self.chunk_size = chunk_size
self.chunk_overlap = chunk_overlap
Singleton Pattern (Configuration)
class ConfigManager:
_instance = None
def __new__(cls):
if cls._instance is None:
cls._instance = super().__new__(cls)
cls._instance._initialize()
return cls._instance
def _initialize(self):
self.rag_config = RAGConfig(
chunk_size=1000,
chunk_overlap=100,
temperature=0.7,
top_p=0.9,
repeat_penalty=1.1
)
3.2. SOLID Principles Implementation
S - Single Responsibility Principle
Mỗi class chỉ có một lý do để thay đổi:
# ✅ GOOD: Separation of concerns
class DocumentLoader:
def load(self, path: str) -> str:
pass
class DocumentChunker:
def chunk(self, text: str) -> List[str]:
pass
# ❌ BAD: Multiple responsibilities
class DocumentProcessor:
def load_and_chunk(self, path: str) -> List[str]:
pass
O - Open/Closed Principle
Mở cho mở rộng, đóng cho sửa đổi:
# ✅ GOOD: Extensible through inheritance
class AbstractEmbeddingService(ABC):
@abstractmethod
def embed(self, texts: List[str]) -> List[np.ndarray]:
pass
class HuggingFaceEmbedding(AbstractEmbeddingService):
pass
class OpenAIEmbedding(AbstractEmbeddingService): # Easy to add new
pass
L - Liskov Substitution Principle
Subclass phải thay thế được base class:
# ✅ GOOD: Consistent interface
class BaseVectorStore(ABC):
@abstractmethod
def add_vectors(self, vectors: np.ndarray) -> None:
pass
class FAISSStore(BaseVectorStore):
def add_vectors(self, vectors: np.ndarray) -> None:
# Implementation that honors the contract
pass
I - Interface Segregation Principle
Không ép client implement methods không dùng:
# ✅ GOOD: Segregated interfaces
class Searchable(ABC):
@abstractmethod
def search(self, query: str) -> List[Document]:
pass
class Clearable(ABC):
@abstractmethod
def clear(self) -> None:
pass
class VectorStore(Searchable, Clearable): # Implement what's needed
pass
D - Dependency Inversion Principle
Depend on abstractions, not concretions:
# ✅ GOOD: Depend on abstraction
class ChatController:
def __init__(self, llm_service: AbstractLLMService,
vector_service: AbstractVectorStoreService):
self.llm = llm_service
self.vector_store = vector_service
# ❌ BAD: Depend on concrete class
class ChatController:
def __init__(self):
self.llm = OllamaLLM() # Tightly coupled
3.3. Code Quality Standards
Type Hints & Docstrings (Bắt buộc)
from typing import List, Optional, Dict, Any
from dataclasses import dataclass
def process_query(
query: str,
k: int = 3,
filters: Optional[Dict[str, Any]] = None
) -> List[Document]:
"""
Process user query and retrieve relevant documents.
Args:
query: User's question in natural language
k: Number of documents to retrieve (default: 3)
filters: Optional metadata filters
Returns:
List of relevant Document objects sorted by relevance
Raises:
ValueError: If query is empty
VectorStoreError: If vector store is not initialized
Example:
>>> docs = process_query("What is RAG?", k=5)
>>> print(len(docs))
5
"""
if not query.strip():
raise ValueError("Query cannot be empty")
# Implementation...
Error Handling Strategy
# src/utils/exceptions.py
class SmartDocError(Exception):
"""Base exception for SmartDoc application"""
pass
class DocumentLoadError(SmartDocError):
"""Raised when document loading fails"""
pass
class VectorStoreError(SmartDocError):
"""Raised when vector store operations fail"""
pass
class LLMConnectionError(SmartDocError):
"""Raised when cannot connect to Ollama"""
pass
# Usage in service
def load_document(self, path: str) -> Document:
try:
with open(path, 'rb') as f:
content = self.loader.load(f)
return Document(content=content, metadata={'source': path})
except FileNotFoundError:
logger.error(f"File not found: {path}")
raise DocumentLoadError(f"Cannot find file: {path}")
except Exception as e:
logger.exception(f"Unexpected error loading {path}")
raise DocumentLoadError(f"Failed to load document: {str(e)}")
Logging Standards
# src/utils/logger.py
import logging
from typing import Optional
def setup_logger(name: str, level: int = logging.INFO) -> logging.Logger:
"""Configure and return a logger instance"""
logger = logging.getLogger(name)
logger.setLevel(level)
if not logger.handlers:
handler = logging.StreamHandler()
formatter = logging.Formatter(
'%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
handler.setFormatter(formatter)
logger.addHandler(handler)
return logger
# Usage in all modules
logger = setup_logger(__name__)
class DocumentService:
def chunk_document(self, doc: Document, chunk_size: int) -> List[Document]:
logger.info(f"Chunking document {doc.id} with size={chunk_size}")
chunks = self._split(doc.content, chunk_size)
logger.info(f"Created {len(chunks)} chunks from document {doc.id}")
return chunks
4. Core Architecture & Pipeline Parameters
4.1. Default RAG Configuration
@dataclass
class RAGConfig:
"""Default configuration for RAG pipeline"""
# Text Splitting
chunk_size: int = 1000
chunk_overlap: int = 100
# Retrieval
search_type: str = "similarity"
retrieval_k: int = 3
retrieval_fetch_k: int = 20
# LLM Parameters
temperature: float = 0.7
top_p: float = 0.9
repeat_penalty: float = 1.1
max_tokens: int = 512
# Models
llm_model: str = "qwen2.5:7b"
embedding_model: str = "sentence-transformers/paraphrase-multilingual-mpnet-base-v2"
# Prompt Template
system_prompt: str = """You are a helpful AI assistant. Answer the question based on the provided context.
If the context doesn't contain relevant information, say "I don't have enough information to answer this question."
Detect the language of the question and respond in the same language.
Keep your answer concise (3-4 sentences maximum).
"""
4.2. Prompt Engineering Template
class PromptTemplate:
"""Professional prompt templates with language detection"""
SYSTEM_PROMPT = """You are SmartDoc AI, an intelligent document assistant.
Instructions:
1. Analyze the CONTEXT provided below
2. Answer the QUESTION based ONLY on the context
3. If context is insufficient, clearly state: "I don't have enough information"
4. Detect question language (Vietnamese/English) and respond in SAME language
5. Keep answers concise: 3-4 sentences maximum
6. Be factual and precise
CONTEXT:
{context}
QUESTION:
{question}
ANSWER:"""
@staticmethod
def format_prompt(context: str, question: str) -> str:
return PromptTemplate.SYSTEM_PROMPT.format(
context=context,
question=question
)
5. Development Roadmap & Task Implementation
Task 1: DOCX Support (OOP Implementation)
# src/services/document_service.py
from langchain_community.document_loaders import Docx2txtLoader
from pathlib import Path
class DocumentLoaderFactory:
"""Factory pattern for document loaders"""
LOADERS = {
'.pdf': PDFPlumberLoader,
'.docx': Docx2txtLoader,
'.txt': TextLoader,
}
@classmethod
def create_loader(cls, file_path: str):
extension = Path(file_path).suffix.lower()
loader_class = cls.LOADERS.get(extension)
if not loader_class:
raise UnsupportedFileTypeError(
f"File type {extension} not supported. "
f"Supported types: {list(cls.LOADERS.keys())}"
)
logger.info(f"Creating {loader_class.__name__} for {file_path}")
return loader_class(file_path)
Task 2 & 3: Chat History & Vector Store Management
# src/models/chat_model.py
@dataclass
class ChatHistory:
"""Manages conversation history"""
messages: List[ChatMessage] = field(default_factory=list)
max_history: int = 50
def add_message(self, role: str, content: str) -> None:
self.messages.append(ChatMessage(
role=role,
content=content,
timestamp=datetime.now()
))
if len(self.messages) > self.max_history:
self.messages.pop(0)
def clear(self) -> None:
logger.info("Clearing chat history")
self.messages.clear()
def to_dict(self) -> List[Dict]:
return [asdict(msg) for msg in self.messages]
# src/controllers/chat_controller.py
class ChatController:
def clear_history(self) -> None:
"""Clear chat history with confirmation"""
if st.session_state.get('chat_history'):
st.session_state.chat_history.clear()
logger.info("Chat history cleared by user")
st.success("✅ Chat history cleared successfully")
def clear_vector_store(self) -> None:
"""Clear vector store with confirmation"""
try:
self.vector_service.clear_store()
logger.warning("Vector store cleared by user")
st.success("✅ Vector store cleared successfully")
except Exception as e:
logger.error(f"Error clearing vector store: {e}")
st.error(f"❌ Error: {str(e)}")
Task 4: Chunk Tuning UI
# src/views/settings_screen.py
class SettingsScreen:
"""Settings configuration UI"""
def render_chunk_settings(self):
st.subheader("⚙️ Chunk Configuration")
chunk_size = st.slider(
"Chunk Size",
min_value=500,
max_value=2000,
value=1000,
step=100,
help="Size of text chunks for processing"
)
chunk_overlap = st.slider(
"Chunk Overlap",
min_value=50,
max_value=200,
value=100,
step=10,
help="Overlap between consecutive chunks"
)
if st.button("Apply Settings"):
self.controller.update_chunk_config(chunk_size, chunk_overlap)
st.success("✅ Settings updated successfully")
Task 5: Citations & Metadata Display
# src/models/document_model.py
@dataclass
class DocumentChunk:
"""Document chunk with metadata for citations"""
content: str
metadata: Dict[str, Any]
@property
def page_number(self) -> Optional[int]:
return self.metadata.get('page')
@property
def source_file(self) -> str:
return self.metadata.get('source', 'Unknown')
def get_citation(self) -> str:
"""Format citation string"""
page = self.page_number
source = Path(self.source_file).name
if page:
return f"[{source}, page {page}]"
return f"[{source}]"
# src/views/chat_screen.py
def display_answer_with_citations(self, answer: str, sources: List[DocumentChunk]):
"""Display answer with clickable citations"""
st.markdown(answer)
with st.expander("📚 View Sources"):
for idx, source in enumerate(sources, 1):
st.markdown(f"**Source {idx}:** {source.get_citation()}")
st.text_area(
f"Context {idx}",
value=source.content,
height=100,
key=f"source_{idx}",
disabled=True
)
6. Coding Principles & Standards
6.1. Local-First Architecture
# ✅ GOOD: Local inference only
class OllamaLLMService(AbstractLLMService):
def __init__(self, model: str = "qwen2.5:7b"):
self.llm = Ollama(
model=model,
base_url="http://localhost:11434" # Local Ollama
)
def generate(self, prompt: str) -> str:
try:
return self.llm.invoke(prompt)
except Exception as e:
raise LLMConnectionError("Cannot connect to Ollama. Is it running?")
# ❌ BAD: External API calls
class OpenAIService: # Don't use this!
def __init__(self, api_key: str):
self.client = OpenAI(api_key=api_key) # ❌ Costs money
6.2. Comprehensive Logging
# Bắt buộc log ở các điểm sau:
logger = setup_logger(__name__)
class RAGService:
def process_query(self, query: str) -> str:
# 1. Log input
logger.info(f"Received query: {query[:50]}...")
# 2. Log retrieval
docs = self.vector_store.search(query, k=3)
logger.info(f"Retrieved {len(docs)} documents")
# 3. Log chunks
context = "\n".join([doc.content for doc in docs])
logger.debug(f"Context length: {len(context)} characters")
# 4. Log LLM call
logger.info("Calling LLM for generation")
answer = self.llm.generate(self.prompt.format(context, query))
# 5. Log output
logger.info(f"Generated answer: {answer[:50]}...")
return answer
6.3. Robust Error Handling
# src/controllers/document_controller.py
class DocumentController:
def upload_and_process(self, uploaded_file) -> bool:
"""Upload and process document with comprehensive error handling"""
try:
# Validation
if not self._validate_file(uploaded_file):
st.error("❌ Invalid file format")
return False
# Save file
file_path = self._save_uploaded_file(uploaded_file)
logger.info(f"Saved file to {file_path}")
# Load document
try:
document = self.doc_service.load_document(file_path)
except DocumentLoadError as e:
st.error(f"❌ Cannot load document: {str(e)}")
return False
# Process chunks
try:
chunks = self.doc_service.chunk_document(document)
logger.info(f"Created {len(chunks)} chunks")
except Exception as e:
st.error(f"❌ Chunking failed: {str(e)}")
return False
# Store vectors
try:
self.vector_service.add_documents(chunks)
st.success(f"✅ Processed {len(chunks)} chunks successfully")
return True
except VectorStoreError as e:
st.error(f"❌ Vector store error: {str(e)}")
return False
except Exception as e:
logger.exception("Unexpected error in upload_and_process")
st.error(f"❌ Unexpected error: {str(e)}")
return False
7. Testing Strategy
7.1. Unit Tests Structure
# tests/test_document_service.py
import pytest
from src.services.document_service import DocumentService
from src.utils.exceptions import DocumentLoadError
class TestDocumentService:
@pytest.fixture
def doc_service(self):
return DocumentService()
def test_load_pdf_success(self, doc_service, tmp_path):
# Arrange
pdf_path = tmp_path / "test.pdf"
# Create dummy PDF...
# Act
document = doc_service.load_document(str(pdf_path))
# Assert
assert document is not None
assert document.content
assert document.metadata['source'] == str(pdf_path)
def test_load_invalid_file_raises_error(self, doc_service):
# Act & Assert
with pytest.raises(DocumentLoadError):
doc_service.load_document("nonexistent.pdf")
def test_chunk_document(self, doc_service):
# Arrange
doc = Document(content="A" * 5000, metadata={})
# Act
chunks = doc_service.chunk_document(doc, chunk_size=1000, chunk_overlap=100)
# Assert
assert len(chunks) > 1
assert all(len(chunk.content) <= 1000 for chunk in chunks)
7.2. Integration Tests
# tests/test_rag_pipeline.py
class TestRAGPipeline:
def test_end_to_end_query(self, rag_service, sample_document):
# Arrange
rag_service.add_document(sample_document)
query = "What is the main topic?"
# Act
answer = rag_service.query(query)
# Assert
assert answer
assert len(answer) > 0
assert isinstance(answer, str)
8. Performance Optimization Guidelines
8.1. Caching Strategy
from functools import lru_cache
class EmbeddingService:
@lru_cache(maxsize=1000)
def embed_text(self, text: str) -> np.ndarray:
"""Cache embeddings for frequently used texts"""
return self.model.encode(text)
8.2. Batch Processing
def add_documents_batch(self, documents: List[Document], batch_size: int = 32):
"""Process documents in batches for efficiency"""
for i in range(0, len(documents), batch_size):
batch = documents[i:i+batch_size]
embeddings = self.embedding_service.embed_batch([d.content for d in batch])
self.vector_store.add(embeddings, batch)
9. Streamlit Best Practices
9.1. Session State Management
# src/utils/session_state.py
class SessionStateManager:
"""Centralized session state management"""
@staticmethod
def initialize():
"""Initialize all session state variables"""
if 'chat_history' not in st.session_state:
st.session_state.chat_history = ChatHistory()
if 'vector_store_initialized' not in st.session_state:
st.session_state.vector_store_initialized = False
if 'current_document' not in st.session_state:
st.session_state.current_document = None
if 'rag_config' not in st.session_state:
st.session_state.rag_config = RAGConfig()
@staticmethod
def get(key: str, default: Any = None) -> Any:
"""Safely get session state value"""
return st.session_state.get(key, default)
@staticmethod
def set(key: str, value: Any) -> None:
"""Set session state value"""
st.session_state[key] = value
logger.debug(f"Session state updated: {key}")
9.2. Component Architecture
# src/views/components.py
class UIComponents:
"""Reusable UI components following Component Pattern"""
@staticmethod
def file_uploader(
label: str = "Upload Document",
accepted_types: List[str] = ['.pdf', '.docx']
) -> Optional[Any]:
"""Reusable file uploader component"""
return st.file_uploader(
label,
type=accepted_types,
help=f"Supported formats: {', '.join(accepted_types)}"
)
@staticmethod
def chat_message(role: str, content: str, avatar: str = None):
"""Render a chat message bubble"""
with st.chat_message(role, avatar=avatar):
st.markdown(content)
@staticmethod
def loading_spinner(message: str = "Processing..."):
"""Context manager for loading spinner"""
return st.spinner(message)
@staticmethod
def error_alert(message: str, details: Optional[str] = None):
"""Styled error alert"""
st.error(f"❌ {message}")
if details:
with st.expander("Error Details"):
st.code(details)
@staticmethod
def success_alert(message: str):
"""Styled success alert"""
st.success(f"✅ {message}")
@staticmethod
def sidebar_section(title: str, icon: str = "📌"):
"""Create sidebar section with title"""
st.sidebar.markdown(f"### {icon} {title}")
9.3. Main App Structure (app.py)
# app.py
import streamlit as st
from src.controllers.chat_controller import ChatController
from src.controllers.document_controller import DocumentController
from src.views.chat_screen import ChatScreen
from src.views.document_screen import DocumentScreen
from src.views.settings_screen import SettingsScreen
from src.utils.session_state import SessionStateManager
from src.utils.logger import setup_logger
logger = setup_logger(__name__)
def main():
"""Main application entry point"""
# Page configuration
st.set_page_config(
page_title="SmartDoc AI",
page_icon="📚",
layout="wide",
initial_sidebar_state="expanded"
)
# Initialize session state
SessionStateManager.initialize()
# Initialize controllers (dependency injection)
chat_controller = ChatController()
document_controller = DocumentController()
# Sidebar navigation
with st.sidebar:
st.title("📚 SmartDoc AI")
st.markdown("---")
page = st.radio(
"Navigation",
["💬 Chat", "📄 Documents", "⚙️ Settings"],
label_visibility="collapsed"
)
# Route to appropriate screen
if page == "💬 Chat":
chat_screen = ChatScreen(chat_controller)
chat_screen.render()
elif page == "📄 Documents":
doc_screen = DocumentScreen(document_controller)
doc_screen.render()
elif page == "⚙️ Settings":
settings_screen = SettingsScreen()
settings_screen.render()
if __name__ == "__main__":
main()
9.4. Screen Implementation Example
# src/views/chat_screen.py
from typing import Optional
import streamlit as st
from src.controllers.chat_controller import ChatController
from src.views.components import UIComponents
from src.utils.logger import setup_logger
logger = setup_logger(__name__)
class ChatScreen:
"""Chat interface screen following MVC pattern"""
def __init__(self, controller: ChatController):
self.controller = controller
self.components = UIComponents()
def render(self):
"""Render the chat screen"""
st.title("💬 Chat with Your Documents")
# Check if vector store is ready
if not st.session_state.get('vector_store_initialized'):
self._render_empty_state()
return
# Render chat interface
self._render_chat_history()
self._render_chat_input()
# Sidebar actions
self._render_sidebar_actions()
def _render_empty_state(self):
"""Show empty state when no documents loaded"""
st.info("📄 Please upload documents first in the Documents tab")
col1, col2, col3 = st.columns([1, 2, 1])
with col2:
st.image("assets/empty_state.png", use_container_width=True)
def _render_chat_history(self):
"""Display chat history"""
chat_history = st.session_state.get('chat_history')
if not chat_history or not chat_history.messages:
st.info("👋 Start a conversation by asking a question below")
return
for message in chat_history.messages:
avatar = "🧑" if message.role == "user" else "🤖"
self.components.chat_message(
role=message.role,
content=message.content,
avatar=avatar
)
def _render_chat_input(self):
"""Render chat input box"""
if prompt := st.chat_input("Ask a question about your documents..."):
# Add user message
self._add_user_message(prompt)
# Get AI response
with self.components.loading_spinner("Thinking..."):
try:
response = self.controller.process_query(prompt)
self._add_assistant_message(response)
except Exception as e:
logger.error(f"Error processing query: {e}")
self.components.error_alert(
"Failed to process your question",
details=str(e)
)
def _add_user_message(self, content: str):
"""Add user message to chat"""
st.session_state.chat_history.add_message("user", content)
self.components.chat_message("user", content, avatar="🧑")
def _add_assistant_message(self, content: str):
"""Add assistant message to chat"""
st.session_state.chat_history.add_message("assistant", content)
self.components.chat_message("assistant", content, avatar="🤖")
def _render_sidebar_actions(self):
"""Render sidebar action buttons"""
st.sidebar.markdown("---")
self.components.sidebar_section("Actions", "🔧")
if st.sidebar.button("🗑️ Clear Chat History", use_container_width=True):
if st.sidebar.checkbox("Confirm clear history?"):
self.controller.clear_history()
st.rerun()
10. Quick Start Implementation Guide
Step 1: Create Base Models
# src/models/document_model.py
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, Any, Optional
import uuid
@dataclass
class Document:
"""Domain entity for documents"""
content: str
metadata: Dict[str, Any] = field(default_factory=dict)
id: str = field(default_factory=lambda: str(uuid.uuid4()))
created_at: datetime = field(default_factory=datetime.now)
def __post_init__(self):
"""Validation after initialization"""
if not self.content:
raise ValueError("Document content cannot be empty")
# src/models/chat_model.py
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Literal, Optional, Dict
from src.utils.logger import setup_logger
logger = setup_logger(__name__)
@dataclass
class ChatMessage:
"""Value object for chat messages"""
role: Literal["user", "assistant", "system"]
content: str
timestamp: datetime = field(default_factory=datetime.now)
metadata: Optional[Dict] = None
@dataclass
class ChatHistory:
"""Conversation history management"""
messages: List[ChatMessage] = field(default_factory=list)
max_history: int = 50
def add_message(self, role: str, content: str) -> None:
"""Add a new message to history"""
self.messages.append(ChatMessage(role=role, content=content))
# Trim if exceeds max
if len(self.messages) > self.max_history:
removed = self.messages.pop(0)
logger.debug(f"Removed old message from history: {removed.timestamp}")
def clear(self) -> None:
"""Clear all messages"""
logger.info(f"Clearing {len(self.messages)} messages from history")
self.messages.clear()
def get_recent(self, n: int = 10) -> List[ChatMessage]:
"""Get n most recent messages"""
return self.messages[-n:] if len(self.messages) > n else self.messages
Step 2: Implement Services
# src/services/llm_service.py
from abc import ABC, abstractmethod
from typing import Optional
from langchain_community.llms import Ollama
from src.utils.exceptions import LLMConnectionError
from src.utils.logger import setup_logger
logger = setup_logger(__name__)
class AbstractLLMService(ABC):
"""Abstract interface for LLM services"""
@abstractmethod
def generate(self, prompt: str, **kwargs) -> str:
"""Generate response from prompt"""
pass
class OllamaLLMService(AbstractLLMService):
"""Ollama LLM service implementation"""
def __init__(
self,
model: str = "qwen2.5:7b",
base_url: str = "http://localhost:11434",
temperature: float = 0.7,
top_p: float = 0.9,
repeat_penalty: float = 1.1
):
logger.info(f"Initializing Ollama with model: {model}")
try:
self.llm = Ollama(
model=model,
base_url=base_url,
temperature=temperature,
top_p=top_p,
repeat_penalty=repeat_penalty
)
logger.info("Ollama LLM initialized successfully")
except Exception as e:
logger.error(f"Failed to initialize Ollama: {e}")
raise LLMConnectionError(
"Cannot connect to Ollama. Ensure Ollama is running on localhost:11434"
)
def generate(self, prompt: str, **kwargs) -> str:
"""Generate response from prompt"""
try:
logger.info(f"Generating response for prompt length: {len(prompt)}")
response = self.llm.invoke(prompt, **kwargs)
logger.info(f"Generated response length: {len(response)}")
return response
except Exception as e:
logger.error(f"Generation failed: {e}")
raise LLMConnectionError(f"Failed to generate response: {str(e)}")
# src/services/vector_store_service.py
from abc import ABC, abstractmethod
from typing import List, Optional
import numpy as np
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings
from src.models.document_model import Document
from src.utils.logger import setup_logger
logger = setup_logger(__name__)
class AbstractVectorStoreService(ABC):
"""Abstract interface for vector store operations"""
@abstractmethod
def add_documents(self, documents: List[Document]) -> None:
pass
@abstractmethod
def similarity_search(self, query: str, k: int = 3) -> List[Document]:
pass
@abstractmethod
def clear_store(self) -> None:
pass
class FAISSVectorStoreService(AbstractVectorStoreService):
"""FAISS vector store implementation"""
def __init__(self, embedding_model: str = "sentence-transformers/paraphrase-multilingual-mpnet-base-v2"):
logger.info(f"Initializing FAISS with embedding model: {embedding_model}")
self.embeddings = HuggingFaceEmbeddings(
model_name=embedding_model,
model_kwargs={'device': 'cpu'},
encode_kwargs={'normalize_embeddings': True}
)
self.vector_store: Optional[FAISS] = None
logger.info("FAISS vector store initialized")
def add_documents(self, documents: List[Document]) -> None:
"""Add documents to vector store"""
if not documents:
logger.warning("No documents to add")
return
logger.info(f"Adding {len(documents)} documents to vector store")
# Convert to LangChain document format
from langchain.schema import Document as LCDocument
lc_docs = [
LCDocument(page_content=doc.content, metadata=doc.metadata)
for doc in documents
]
if self.vector_store is None:
self.vector_store = FAISS.from_documents(lc_docs, self.embeddings)
logger.info("Created new FAISS index")
else:
self.vector_store.add_documents(lc_docs)
logger.info("Added documents to existing FAISS index")
def similarity_search(self, query: str, k: int = 3) -> List[Document]:
"""Search for similar documents"""
if self.vector_store is None:
logger.warning("Vector store not initialized, returning empty results")
return []
logger.info(f"Searching for {k} similar documents")
results = self.vector_store.similarity_search(query, k=k)
# Convert back to domain Document
docs = [
Document(content=doc.page_content, metadata=doc.metadata)
for doc in results
]
logger.info(f"Found {len(docs)} similar documents")
return docs
def clear_store(self) -> None:
"""Clear the vector store"""
logger.warning("Clearing vector store")
self.vector_store = None
Step 3: Create Controllers
# src/controllers/chat_controller.py
from typing import List, Optional
import streamlit as st
from src.services.llm_service import AbstractLLMService, OllamaLLMService
from src.services.vector_store_service import AbstractVectorStoreService
from src.models.chat_model import ChatHistory
from src.utils.logger import setup_logger
logger = setup_logger(__name__)
class ChatController:
"""Controller for chat operations"""
def __init__(
self,
llm_service: Optional[AbstractLLMService] = None,
vector_service: Optional[AbstractVectorStoreService] = None
):
# Dependency injection with defaults
self.llm_service = llm_service or OllamaLLMService()
self.vector_service = vector_service or st.session_state.get('vector_service')
logger.info("ChatController initialized")
def process_query(self, query: str) -> str:
"""Process user query and return response"""
# Validation
if not query or not query.strip():
logger.warning("Empty query received")
raise ValueError("Query cannot be empty")
logger.info(f"Processing query: {query[:50]}...")
# Retrieve relevant documents
try:
relevant_docs = self.vector_service.similarity_search(query, k=3)
logger.info(f"Retrieved {len(relevant_docs)} relevant documents")
except Exception as e:
logger.error(f"Document retrieval failed: {e}")
raise
# Build context
context = "\n\n".join([doc.content for doc in relevant_docs])
logger.debug(f"Context length: {len(context)} characters")
# Create prompt
prompt = self._build_prompt(context, query)
# Generate response
try:
response = self.llm_service.generate(prompt)
logger.info("Response generated successfully")
return response
except Exception as e:
logger.error(f"Response generation failed: {e}")
raise
def _build_prompt(self, context: str, question: str) -> str:
"""Build prompt for LLM"""
return f"""You are SmartDoc AI, an intelligent document assistant.
Answer the QUESTION based ONLY on the CONTEXT below.
If the context doesn't contain enough information, say "I don't have enough information to answer this question."
Detect the question language and respond in the SAME language.
Keep your answer concise (3-4 sentences maximum).
CONTEXT:
{context}
QUESTION:
{question}
ANSWER:"""
def clear_history(self) -> None:
"""Clear chat history"""
if 'chat_history' in st.session_state:
st.session_state.chat_history.clear()
logger.info("Chat history cleared")
11. Common Pitfalls & Solutions
Pitfall 1: Tight Coupling
❌ Bad:
class ChatScreen:
def __init__(self):
self.llm = Ollama(model="qwen2.5:7b") # Tightly coupled
self.vector_store = FAISS()
✅ Good:
class ChatScreen:
def __init__(self, controller: ChatController):
self.controller = controller # Depend on abstraction
Pitfall 2: Business Logic in Views
❌ Bad:
# In chat_screen.py
def render(self):
query = st.text_input("Question")
docs = vector_store.search(query) # Business logic in view!
answer = llm.generate(docs)
st.write(answer)
✅ Good:
# In chat_screen.py
def render(self):
query = st.text_input("Question")
answer = self.controller.process_query(query) # Delegate to controller
st.write(answer)
Pitfall 3: No Error Handling
❌ Bad:
def load_document(self, path: str):
return PDFPlumberLoader(path).load() # No error handling
✅ Good:
def load_document(self, path: str) -> Document:
try:
loader = PDFPlumberLoader(path)
content = loader.load()
return Document(content=content)
except FileNotFoundError:
logger.error(f"File not found: {path}")
raise DocumentLoadError(f"Cannot find file: {path}")
except Exception as e:
logger.exception(f"Unexpected error loading {path}")
raise DocumentLoadError(f"Failed to load: {str(e)}")
Pitfall 4: Missing Type Hints
❌ Bad:
def process_query(self, query):
docs = self.vector_store.search(query)
return self.llm.generate(docs)
✅ Good:
def process_query(self, query: str) -> str:
"""Process user query and return AI response"""
docs: List[Document] = self.vector_store.search(query, k=3)
response: str = self.llm.generate(self._build_prompt(docs, query))
return response
Pitfall 5: Streamlit State Mismanagement
❌ Bad:
# Direct manipulation
st.session_state['chat_history'] = []
✅ Good:
# Centralized management
class SessionStateManager:
@staticmethod
def initialize():
if 'chat_history' not in st.session_state:
st.session_state.chat_history = ChatHistory()
12. Deployment Checklist
Pre-Deployment
- All type hints added
- Comprehensive docstrings
- Error handling in all services
- Logging configured properly
- Unit tests passing
- Integration tests passing
- No hardcoded secrets/API keys
- requirements.txt updated
Ollama Setup
- Ollama installed and running
- Model
qwen2.5:7bdownloaded - Base URL configured correctly
- Connection test successful
Performance
- Embedding caching enabled
- Batch processing for documents
- FAISS index optimized
- Memory usage monitored
Security
- No external API calls
- All data processed locally
- Input validation on all endpoints
- File upload size limits set