Production RAG Chatbot on AWS: Development Plan (Concise)

Objective: Build and deploy a scalable, production-ready RAG chatbot on AWS using proprietary data, supporting multiple LLMs, with VTION branding, developed component-by-component.

Core Principles: Modular, User Control, Iterative Development (especially for prompts), Comprehensive (yet concise) Documentation, Detailed Logging, Best Practices, Security, VTION Branding (#00afaf, #0c0e2d, light theme).

Phase 0: Project Setup & Foundational Configuration

Goal: Establish environment, AWS services, GitHub, LLM access, and project structure.

Task 0.1: Environment & Tooling
- Confirm local dev setup (IDE, Git).
- Standardize: Python 3.11.
- List/confirm library versions (FastAPI/Flask, Boto3, Frontend framework).
- AI Action: Verify with user.
Task 0.2: AWS Setup/Verification
- Confirm AWS account & IAM permissions for: S3, Bedrock, Lambda, API Gateway, Athena/Redshift Spectrum, IAM, CloudWatch, (Optional: ECS, EKS, Amplify).
- AI Action: Prompt user for AWS config/entitlement confirmation. Document services.
Task 0.3: GitHub & CI/CD Foundation
- Create private GitHub repo.
- Define branching strategy (e.g., main, develop, feature/*).
- Basic GitHub Actions: linting, placeholder for build/test/deploy.
- AI Action: Guide repo creation & initial workflow.
Task 0.4: LLM Access & Key Management
- Confirm Bedrock access: Claude 3 Sonnet (default).
- Confirm access/keys for optional LLMs (OpenAI, Gemini).
- Secure API key management (AWS Secrets Manager or env vars). No hardcoded keys.
- AI Action: Confirm LLM access; discuss key management.
Task 0.5: Define Core Technologies (Revisit)
- Backend: FastAPI (recommended) or Flask.
- Frontend: React with Vite (recommended) or HTML/JS/Tailwind.
- AWS SDK: Boto3 (latest).
- AI Action: Discuss choices; document.

Task 0.6: Project Directory Structure

Define structure (backend, frontend, data, scripts, docs, .github).

/project-root
├── /backend (app, Dockerfile, requirements.txt)
├── /frontend (public, src, package.json, Dockerfile)
├── /data (sample data, S3 upload scripts, generated_context/)
├── /scripts (helpers, deployment)
├── /docs
├── .github/workflows
├── .gitignore
└── README.md

AI Action: Propose; get approval.

Phase 1: Data Understanding & Context Generation

Goal: Analyze data, build modular data loader, implement "first-time context & dictionary builder."

Task 1.1: Analyze Sample Data (S3)
- User provides sample data/schema.
- Identify types, patterns, PII. Storage format in S3 (CSV, JSON, Parquet).
- AI Action: Request sample/schema. Discuss S3 format.
Task 1.2: Design Modular S3 Data Access (Athena/Spectrum)
- Python module for S3 data interaction (via Athena/Spectrum).
- Configurable for S3 paths & data formats.
- AI Action: Propose design.
Task 1.3: Implement Initial Data Profiling
- Script: list columns, infer types, basic stats (via Athena/Spectrum).
- AI Action: Code utility.
Task 1.4: Develop "First-Time Context & Dictionary Builder"
- Concept: One-time LLM process (Claude 3 Sonnet) on data subset (schema, sample rows) for:
  - ~500-word domain context: Overview of data, purpose, common query types, business rules.
  - Data dictionary: Column names, detailed descriptions, data types (Athena/Presto specific), table relationships, value distributions/enums where applicable, sample valid values.
- Implementation: Script to fetch schema/sample, prompt LLM, store context/dictionary (e.g., JSON/YAML in S3 /generated_context/ or DynamoDB). Ensure efficient loading/caching by backend.
- AI Action: Design prompt, write script, discuss storage & efficient access.

Phase 2: Backend Skeleton - API & Core Logic

Goal: Setup backend app, API endpoints, LLM module, text-to-SQL, data querying.

Task 2.1: Design Core API Endpoints
- /query (POST), /capabilities (GET), /config/llm (GET/POST, Admin), /admin/context/regenerate (POST, Admin).
- AI Action: Define OpenAPI/Swagger spec.
Task 2.2: Setup Basic Backend App (FastAPI/Flask)
- Init framework, "hello world" endpoint, requirements.txt, Dockerfile.
- AI Action: Provide boilerplate.
Task 2.3: Implement LLM Interaction Module (llm_clients.py)
- Unified interface for LLMs.
- Sub-tasks: Bedrock (Claude 3 Sonnet), optional OpenAI, optional Gemini. Secure key handling.
- Easy model switching via config.
- AI Action: Code module, start with Bedrock.
Task 2.4: Develop Text-to-SQL Component
- Takes user query, loads comprehensive context/dictionary (Task 1.4).
- Advanced Prompt Engineering:
  - Use detailed context (schema, descriptions, relationships, sample values/queries, business rules).
  - Provide few-shot examples of good text-to-SQL pairs if necessary.
  - Clear instructions on SQL dialect (Athena/Presto), functions to use/avoid.
  - Security constraints (e.g., generate read-only queries, avoid specific clauses if risky).
- Prompts LLM (Claude 3 Sonnet) for SQL.
- Robust SQL validation/sanitization: e.g., allowlisting commands/patterns, parsing query structure, checking for DML or harmful operations, regex checks.
- AI Action: Design prompt, implement component with validation.
Task 2.5: Implement AWS Data Querying (Athena/Redshift Spectrum)
- Executes SQL, retrieves results.
- Crucial: Limit to 50 rows for LLM. Aggregate if >50 rows. No raw DB info to next LLM.
- Ensure query execution IAM roles have least privilege access to data.
- AI Action: Code module.
Task 2.6: Implement Result Processing & Formatting (LLM-based)
- Takes query data (max 50 rows), original query.
- Prompts LLM (Claude 3 Sonnet) for natural language answer & graphical representation suggestion (e.g., {type: 'bar', data: {...}}).
- Strictly no raw table data in final response.
- AI Action: Design prompt, implement component.
Task 2.7: Implement Robust Logging Framework
- Python logging module, structured JSON logs (for CloudWatch).
- Logs: timestamp, level, component, correlation ID, clear messages (mind PII).
- Strictly maintain detailed logging.
- AI Action: Setup config, integrate into components.

Phase 3: Frontend Skeleton - User Interface

Goal: Develop basic UI with VTION branding.

Task 3.1: Choose Frontend Framework (Confirm 0.5)
- Confirm (React, Vue, or HTML/JS/Tailwind). Setup project.
- AI Action: Provide boilerplate.
Task 3.2: Design Basic UI Layout
- Chat interface: input, conversation history, graphics area.
- VTION branding: colors (#00afaf, #0c0e2d), light theme (e.g., #f8f9fa), logo. Responsive.
- AI Action: Create HTML/CSS or React components.
Task 3.3: Implement "Capabilities Screen/Initial Message"
- On load/help: explain data, example questions, limitations. Static or dynamic.
- AI Action: Implement.
Task 3.4: Implement Chat Input & Display Components
- User query input. Display user/bot messages (text & graphics placeholder).
- AI Action: Develop UI components.
Task 3.5: Integrate Frontend with Backend API (/query)
- JS/TS: send query, receive response, display text, pass graphics data to stub.
- Handle API states (loading, success, error).
- AI Action: Implement API integration.
Task 3.6: Apply VTION Branding & Styling
- Ensure UI adheres to VTION colors/theme. Clean font.
- AI Action: Refine CSS/styling.

Phase 4: Connecting Pipeline & Initial End-to-End Testing

Goal: Integrate backend components, connect frontend for basic E2E pipeline.

Task 4.1: Backend: Integrate Context with Text-to-SQL
- Task 2.4 uses context/dictionary from Task 1.4. Test.
- AI Action: Verify, debug.
Task 4.2: Backend: Integrate Text-to-SQL with Data Querying
- Task 2.4 SQL output to Task 2.5. Verify aggregation/50-row limit.
- AI Action: Verify, debug.
Task 4.3: Backend: Integrate Query Results with LLM for Final Response
- Task 2.5 data to Task 2.6. Verify answer/graphic suggestion.
- AI Action: Verify, debug.
Task 4.4: Full End-to-End Test (Manual)
- UI query -> Backend (Text-to-SQL -> Data Query -> Result Processing) -> UI response.
- Log extensively.
- AI Action: Guide testing, analyze logs.
Task 4.5: Implement Graphical Representation Stubs (Frontend)
- Placeholders for charts (log or display raw data).
- AI Action: Implement stubs.

Phase 5: Enhancements & Production Readiness

Goal: Add graphics, error handling, admin console, prepare for production (security, scalability).

Task 5.1: Develop Graphical Representation Component (Frontend)
- Charting library (Chart.js, D3.js, Recharts). Render chart from backend data. Responsive, VTION colors.
- AI Action: Implement.
Task 5.2: Implement Robust Error Handling & User Feedback
- Backend: Global error handlers, meaningful errors/HTTP codes (no sensitive details to user).
- Frontend: Graceful API error handling, user-friendly messages.
- AI Action: Review, enhance.
Task 5.3: Build Basic Admin Console (Optional)
- Secured interface: view logs, trigger context regen, config LLM, view usage stats.
- Secure appropriately.
- AI Action: Design, implement.
Task 5.4: Security Hardening
- Input validation, SQLi prevention (validate LLM SQL per Task 2.4), API auth, least privilege IAM for all AWS services (including Bedrock model invocation).
- Secure data handling, dependency scanning. Review Bedrock security best practices.
- AI Action: Security review.
Task 5.5: Scalability & Performance
- Backend: Async (FastAPI), Lambda/API Gateway or ECS for scaling. Efficient loading/caching of context (Task 1.4).
- Data Querying: Optimize Athena/Spectrum. Partition S3 data.
- LLM Calls: Retries, timeouts.
- Frontend: CDN for static assets.
- AI Action: Review, suggest scaling strategies.
Task 5.6: Comprehensive CI/CD Pipeline (GitHub Actions)
- Backend/Frontend: Lint, test, build, deploy to staging/prod (Docker, ECR, Lambda/ECS, S3/Amplify).
- AI Action: Implement CI/CD.

Phase 6: Documentation & Logging Refinement

Goal: Finalize comprehensive documentation and perfect logging.

Task 6.1: Create User Manual
- Access, usage, examples, graphics, troubleshooting.
- AI Action: Draft.
Task 6.2: Create Developer & Operations Documentation
- Architecture, setup, component details (including prompt strategies), data schema/context, deployment, troubleshooting, LLM config.
- Detailed for non-programmers.
- AI Action: Compile/refine throughout project.
Task 6.3: Finalize and Test Logging
- Review logs (CloudWatch). Ensure clarity, consistency, correlation IDs. No sensitive PII.
- AI Action: Review logging.

Recurring Tasks (For Each Component/Major Step)

R.1: Code: Write. Self-Reflect: Correct, consistent, aligned? Prompts effective?
R.2: Unit Test: Write, pass.
R.3: Push to GitHub: Feature branch, PR.
R.4: CI Checks: Pass.
R.5: Peer Review: (Optional).
R.6: Deploy to Staging (AWS): CI/CD or manual.
R.7: Test on Staging: Integration, UAT. Verify, check logs, test various prompts.
R.8: User Approval: On staging.
R.9: Merge: To develop/main.
R.10: Deploy to Production (AWS).
R.11: Update Documentation.

This task list is a living document.

ナビゲーション

Skillsとは？

リンク

Production RAG Chatbot on AWS: Development Plan (Concise)

Production RAG Chatbot on AWS: Development Plan (Concise)

Phase 0: Project Setup & Foundational Configuration

Phase 1: Data Understanding & Context Generation

Phase 2: Backend Skeleton - API & Core Logic

Phase 3: Frontend Skeleton - User Interface

Phase 4: Connecting Pipeline & Initial End-to-End Testing

Phase 5: Enhancements & Production Readiness

Phase 6: Documentation & Logging Refinement

Recurring Tasks (For Each Component/Major Step)

関連スキル(🌐 Web開発)