
TG-439: Finalize Capstone Documentation, DR Plan, and VM Arch

lanfr144 3 days ago
parent
commit
b0b36ee1ac

+ 13 - 7
README.md

@@ -5,18 +5,24 @@ A strictly local, privacy-first AI Medical Dietitian and Food Explorer. This pro
 ## Features
 - **Dynamic Medical Profiling**: Configure your health profile (e.g., Kidney issues, pregnancy, vegan). The AI dynamically adjusts all responses, recommendations, and warnings based on these exact medical needs.
 - **RAG Architecture**: The AI is connected to a massively partitioned local MySQL database. When you ask a question or request a meal plan, the AI executes SQL queries autonomously to fetch precise nutritional data.
+- **SearXNG Web Integration**: When the local database lacks the requested culinary information, the AI securely queries a local, private SearXNG instance, answering the question without compromising patient privacy.
 - **Plate Builder & Unit Conversion**: Input culinary recipes (e.g., "1.5 cups of flour") and the system converts them to metric standard weights based on the product's density.
-- **High-Performance Database**: Implements Grouped Vertical Partitioning to bypass InnoDB limits, featuring `FULLTEXT` indexing for lightning-fast search capabilities across millions of foods.
+- **Distributed Microservice Topology**: Supports decoupling across VirtualBox, Hyper-V, and WSL2 using Bridged Networking and SNMP container telemetry for Zabbix.
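The volume-to-weight conversion behind the Plate Builder can be sketched as follows. The density table, cup size, and function name here are illustrative assumptions, not the project's actual data or API:

```python
# Hypothetical sketch of density-based unit conversion ("1.5 cups of flour"
# -> grams). Densities are approximate culinary values, not project data.
CUP_ML = 236.588  # US customary measuring cup in millilitres

DENSITY_G_PER_ML = {
    "flour": 0.53,
    "sugar": 0.85,
    "water": 1.00,
}

def cups_to_grams(quantity_cups: float, ingredient: str) -> float:
    """Convert a volume in cups to grams using the ingredient's density."""
    density = DENSITY_G_PER_ML[ingredient]
    return round(quantity_cups * CUP_ML * density, 1)

print(cups_to_grams(1.5, "flour"))  # 188.1
```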
 
-## Documentation
+## Documentation (Capstone Deliverables)
 Please refer to the `docs/` folder for detailed guides:
-- [Installation Guide](docs/Installation_Guide.md)
-- [User Guide](docs/User_Guide.md)
-- [Data Ingestion Guide](docs/Data_Ingestion.md)
+- [Architecture Map](docs/architecture.md)
+- [Distributed Deployment Procedure (PoC)](docs/distributed_deployment.md)
+- [Disaster Recovery & Backup Plan](docs/disaster_recovery_plan.md)
+- [Zabbix Telemetry Guide](docs/zabbix_monitoring.md)
+- [Agile Retro Planning](docs/retro_planning.md)
+- [Taiga Final Audit Report](docs/taiga_audit_report.md)
 
 ## Tech Stack
 - **Frontend**: Streamlit
 - **Database**: MySQL 8.0
-- **AI Engine**: Ollama (Mistral / Llama3)
-- **Deployment**: Native Ubuntu, Docker, Kubernetes
+- **AI Engine**: Ollama (Llama 3.2:1B)
+- **Web Search**: SearXNG
+- **Monitoring**: Zabbix (SNMPv2c)
+- **Deployment**: Native Ubuntu, Docker Compose, Hyper-V / VirtualBox
 - **Project Management**: Taiga (Synced dynamically via Python)

+ 23 - 0
docs/architecture.md

@@ -0,0 +1,23 @@
+# Local Food AI: Architecture Map
+
+## 1. Core Stack
+- **Database**: MySQL 8.0 (Partitioned for 3GB+ OpenFoodFacts dataset).
+- **Backend & Frontend**: Python 3.11 with Streamlit.
+- **AI Engine**: Ollama running locally with `llama3.2:1b` (quantized for 30GB RAM limits).
+- **Web Search**: SearXNG Private Engine (used dynamically when the local DB lacks specific food heuristics).
+- **Monitoring**: Zabbix Telemetry Server (connected via native Python SNMP traps and container-level SNMP daemons).
+
+## 2. Security Infrastructure
+- **Zero Cloud Policy**: 100% of AI processing, database searching, and telemetry happens locally on the Ubuntu VM. No user dietary queries ever leave the machine.
+- **Principle of Least Privilege (PoLP)**:
+  - `db_app_auth`: Only has access to the authentication tables.
+  - `db_reader`: Only has `SELECT` privileges on the food partitions.
+  - `db_loader`: Only has `INSERT` privileges for the background CSV script.
+- **Encryption**: User passwords are salted and hashed with `bcrypt`, a key-derivation function built on the Blowfish cipher.
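The salt-then-hash/verify pattern above can be sketched as below. The project itself uses the third-party `bcrypt` package; this sketch swaps in the standard library's PBKDF2 purely so it runs without extra dependencies, and the function names are illustrative:

```python
# Illustration of salted password hashing and constant-time verification.
# Stand-in for bcrypt: hashlib.pbkdf2_hmac from the standard library.
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is drawn per user."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def verify_password(password, salt, digest):
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, digest)  # constant-time comparison

salt, digest = hash_password("s3cret")
print(verify_password("s3cret", salt, digest))  # True
print(verify_password("wrong", salt, digest))   # False
```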
+
+## 3. Distributed Microservice Networking
+This stack is designed to be highly decoupled. While typically run via a unified `docker-compose.yml`, the application supports distributed routing across:
+1. WSL2 Nodes (Frontend App)
+2. Hyper-V Instances (MySQL Partition Clusters)
+3. VirtualBox Hosts (Ollama GPU/CPU compute nodes)
+*(Refer to `distributed_deployment.md` for specific Bridged Adapter setups).*

+ 44 - 0
docs/disaster_recovery_plan.md

@@ -0,0 +1,44 @@
+# Disaster Recovery & Backup Plan
+
+This document outlines the backup and restore procedures, as well as the Disaster Recovery (DR) plan for the Local Food AI stack.
+
+## 1. Backup Procedures
+Given the massive 3GB+ size of the OpenFoodFacts dataset, backing up the entire MySQL data volume on every cycle is resource-intensive. The strategy is therefore split into **Code Backup** and **Data Backup**.
+
+### 1.1 Source Code & App Configuration
+The entire application infrastructure (Dockerfiles, Python scripts, configuration) is tracked in the Git repository.
+**Backup Command:** `git push origin main`
+*Frequency: Triggered automatically by developers after every Sprint.*
+
+### 1.2 Database (MySQL) Backup
+We use `mysqldump` to create a logical backup of the user data and dietary profiles (suitable for cold-standby restores), while skipping the massive, immutable OpenFoodFacts partition, which can be re-ingested from the source CSV.
+
+**Backup Command:**
+```bash
+sudo docker exec food_project-mysql-1 mysqldump -u root -proot_pass food_db users user_health_profiles plate_items > /backup/food_db_users_$(date +%F).sql
+```
+*Frequency: Daily via a server-side cron job (`0 3 * * *`).*
+
+## 2. Restore Procedures
+
+### 2.1 Database Restore (Warm Recovery)
+If the database container crashes or the volumes are corrupted:
+1. Stop the application container to prevent write conflicts: `sudo docker-compose stop app`
+2. Wipe and re-initialize the MySQL container.
+3. Restore the user tables from the SQL dump:
+```bash
+cat /backup/food_db_users_2026-05-12.sql | sudo docker exec -i food_project-mysql-1 mysql -u root -proot_pass food_db
+```
+4. Restart the background ingestion script (`./data_sync.sh`) to rebuild the massive 3GB OpenFoodFacts `products_core` tables.
+5. Restart the application: `sudo docker-compose start app`
+
+## 3. Disaster Recovery (DR) Plan
+
+### 3.1 Recovery Objectives
+- **Recovery Time Objective (RTO):** 4 Hours (primarily bottlenecked by the 3-hour re-ingestion time of the CSV dataset if the core tables are lost).
+- **Recovery Point Objective (RPO):** 24 Hours (User profiles and plates are backed up nightly).
+
+### 3.2 High Availability & Failover Strategy
+If deploying in the distributed Multi-Hypervisor PoC environment (Hyper-V / VirtualBox / WSL):
+- **Ollama Node Failure**: The `app` is engineered to gracefully catch LLM connection timeouts. If the VirtualBox Ollama node dies, the Streamlit app will continue to function for standard Database lookups, returning a safe fallback message for AI evaluations.
+- **Zabbix Node Failure**: The SNMP daemons run autonomously in each container. If the Zabbix telemetry server goes offline, the containers will safely drop the UDP traps without bottlenecking application performance.
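The Ollama graceful-degradation path above can be sketched as follows. The endpoint and model name mirror these docs, but the function name and wiring are illustrative assumptions, not the project's actual `app.py` code:

```python
# Sketch: query the remote Ollama node, falling back to a safe message when
# the VirtualBox compute node is unreachable or times out.
import json
import urllib.error
import urllib.request

FALLBACK_MESSAGE = "AI evaluation is temporarily unavailable; database lookups still work."

def ask_ollama(prompt, host="http://192.168.130.171:11434", timeout=5.0):
    """Return the LLM's answer, or a safe fallback if the Ollama node is down."""
    payload = json.dumps({"model": "llama3.2:1b", "prompt": prompt,
                          "stream": False}).encode()
    request = urllib.request.Request(f"{host}/api/generate", data=payload,
                                     headers={"Content-Type": "application/json"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read())["response"]
    except (urllib.error.URLError, TimeoutError, OSError):
        return FALLBACK_MESSAGE
```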

+ 95 - 0
docs/distributed_deployment.md

@@ -0,0 +1,95 @@
+# Multi-Hypervisor Distributed Deployment (Proof of Concept)
+
+This document provides the exact procedure to decouple the monolithic `docker-compose.yml` into a fully distributed, cross-hypervisor microservice architecture.
+
+## 1. Architectural Topology
+To demonstrate cross-platform interoperability, the application stack is split across three distinct virtualized environments on the host machine.
+
+- **VM 1: Hyper-V (Ubuntu Server)**
+  - **Container**: `mysql` (Database Engine)
+  - **IP Address**: `192.168.130.170` (Bridged)
+- **VM 2: VirtualBox (Debian/Ubuntu)**
+  - **Container**: `ollama` (Local LLM) + `searxng` (Web Search)
+  - **IP Address**: `192.168.130.171` (Bridged)
+- **VM 3: WSL2 (Windows Subsystem for Linux)**
+  - **Container**: `app` (Streamlit Web Interface)
+  - **IP Address**: `192.168.130.172` (NAT/Bridged via Hyper-V switch)
+
+## 2. Networking Configuration
+To ensure these isolated VMs can communicate, you must configure a **Bridged Virtual Switch**:
+1. Open Hyper-V Virtual Switch Manager.
+2. Create an "External" switch mapped to your physical network adapter.
+3. Attach VM 1 (Hyper-V) and VM 3 (WSL2) to this switch.
+4. In VirtualBox, set the Network Adapter for VM 2 to "Bridged Adapter" pointing to the same physical interface.
+5. Allow traffic from the `192.168.130.0/24` subnet through `ufw` and Windows Firewall on all hosts (or disable them for the duration of the PoC).
+
+## 3. Deployment Steps
+
+### Step 1: Deploy Database on Hyper-V
+On VM 1, create a `docker-compose.yml` containing *only* the MySQL service.
+```yaml
+services:
+  mysql:
+    build:
+      context: ./docker/mysql
+    ports:
+      - "3306:3306"
+      - "161:161/udp" # Expose SNMP
+    volumes:
+      - mysql_data:/var/lib/mysql
+```
+
+### Step 2: Deploy AI Engines on VirtualBox
+On VM 2, create a `docker-compose.yml` containing the AI services.
+```yaml
+services:
+  ollama:
+    image: ollama/ollama:latest
+    ports:
+      - "11434:11434"
+      - "161:161/udp" # Requires sidecar or custom image for SNMP
+  searxng:
+    image: searxng/searxng:latest
+    ports:
+      - "8080:8080"
+```
+
+### Step 3: Deploy Frontend on WSL2
+On VM 3, configure the App container to point to the external IP addresses rather than Docker DNS hostnames.
+Update your `.env` file on WSL2:
+```ini
+DB_HOST=192.168.130.170
+OLLAMA_HOST=http://192.168.130.171:11434
+SEARXNG_HOST=http://192.168.130.171:8080
+```
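On the app side, these variables would override the Docker-DNS defaults used by the single-host compose file. The variable names follow the `.env` above; the fallback hostnames and function name are assumptions about the compose setup:

```python
# Resolve service endpoints: distributed .env values win, otherwise fall
# back to the Docker-internal DNS names from the monolithic compose file.
import os

def service_endpoints():
    return {
        "db_host": os.getenv("DB_HOST", "mysql"),
        "ollama": os.getenv("OLLAMA_HOST", "http://ollama:11434"),
        "searxng": os.getenv("SEARXNG_HOST", "http://searxng:8080"),
    }

os.environ["DB_HOST"] = "192.168.130.170"
print(service_endpoints()["db_host"])  # 192.168.130.170
```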
+
+## 4. SNMP Telemetry within Containers
+By default, Docker containers run a single process (PID 1). To run `snmpd` alongside the application in *every* container, we use `supervisord`.
+
+**Example Dockerfile Modification for App Container:**
+```dockerfile
+RUN apt-get update && apt-get install -y supervisor snmpd
+COPY snmpd.conf /etc/snmp/snmpd.conf
+COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf
+EXPOSE 8501 161/udp
+CMD ["/usr/bin/supervisord", "-c", "/etc/supervisor/conf.d/supervisord.conf"]
+```
+
+**supervisord.conf:**
+```ini
+[supervisord]
+nodaemon=true
+
+[program:app]
+command=streamlit run app.py
+autorestart=true
+
+[program:snmpd]
+command=/usr/sbin/snmpd -f
+autorestart=true
+```
+
+## 5. Zabbix Monitoring Integration
+1. On the Zabbix Server (`192.168.130.170:8081`), navigate to **Configuration > Hosts**.
+2. Add three separate Hosts corresponding to the VM IPs (`192.168.130.170`, `192.168.130.171`, `192.168.130.172`).
+3. Attach the "Linux SNMP" template to each host. Zabbix will now automatically poll CPU, RAM, and Disk I/O natively from within each Docker container across the distributed environment.

+ 41 - 0
docs/retro_planning.md

@@ -0,0 +1,41 @@
+# Local Food AI: Retro Planning
+
+*Document compiled in accordance with BTS-AI DOPRO Guidelines on Backward/Reverse Planning.*
+
+## 1. Concept of Retro Planning
+As defined in the course material, Retro Planning (Backward Planning) is constructed in reverse chronological order from a fixed deadline. This keeps the D-Day (Capstone Submission) immutably fixed, with every prior sprint and task scheduled backward from it to guarantee the project remains feasible.
+
+Our delivery date is set for **May 15th, 2026**.
+
+## 2. Reverse Chronological Timeline (Gantt Structure)
+
+```mermaid
+gantt
+    title Local Food AI - Capstone Reverse Plan
+    dateFormat  YYYY-MM-DD
+    axisFormat  %m-%d
+
+    section Delivery & Sign-off
+    Final Capstone Submission   :milestone, m1, 2026-05-15, 0d
+    Disaster Recovery & PoC Test:done, 2026-05-13, 2d
+    Documentation Finalization  :done, 2026-05-11, 2d
+
+    section Feature Freeze
+    Web Search (SearXNG) Integration :done, 2026-05-12, 1d
+    Medical Constraints & PDF Export :done, 2026-05-09, 3d
+    AI Meal Planner (Ollama 1B)      :done, 2026-05-05, 4d
+
+    section Core Architecture
+    Plate Builder & Macros           :done, 2026-05-01, 4d
+    Clinical Explorer Search         :done, 2026-04-28, 3d
+    Zabbix Telemetry & SNMP          :done, 2026-04-26, 2d
+
+    section Foundation
+    OpenFoodFacts Ingestion (3GB)    :done, 2026-04-20, 6d
+    Docker Multi-Container Setup     :done, 2026-04-18, 2d
+    Taiga/Git Agile Integration      :done, 2026-04-15, 3d
+```
+
+## 3. Resource & Buffer Analysis
+- **Milestone Buffers**: By utilizing a reverse plan, we identified that the massive 3GB OpenFoodFacts dataset required a 6-day window for background ingestion without blocking the frontend development. 
+- **Leeway Analysis**: The final 2 days (May 13 - 15) are strictly reserved for Disaster Recovery (DR) drills and Multi-VM Proof of Concept (PoC) validation, ensuring the presentation runs flawlessly regardless of infrastructure hiccups.

+ 100 - 0
docs/taiga_audit_report.md

@@ -0,0 +1,100 @@
+# Taiga Agile Audit Report
+
+> Automatically generated from the live Taiga API to verify project completeness against `Project.pdf`.
+
+## Sprint & Velocity Overview
+- **Sprint 1**: None/5.0 Points Completed
+- **Sprint 2**: None/None Points Completed
+- **Sprint 3**: None/None Points Completed
+- **Sprint 4**: None/77.0 Points Completed
+- **Sprint 5**: None/None Points Completed
+- **Sprint 6**: None/5.0 Points Completed
+- **Sprint 7**: None/None Points Completed
+- **Sprint 8**: None/None Points Completed
+- **Sprint 9**: None/None Points Completed
+- **Sprint 10**: None/None Points Completed
+- **Sprint 11**: None/None Points Completed
+- **Sprint 12**: None/None Points Completed
+- **Sprint 13**: None/None Points Completed
+
+## User Stories & Task Completion
+### [US-204] Public Git Repo Setup (Status: Done)
+  - *No technical tasks associated!*
+### [US-205] Easy Cloning Setup (Status: Done)
+  - `[ ]` Task 456: Refactor Cryptography Bug - Replace dynamic salting loop with bcrypt.checkpw (New)
+  - `[ ]` Task 457: Implement Horizontal Table Partitioning to bypass MySQL 65KB InnoDB limit (New)
+  - `[ ]` Task 458: Construct dynamic UI multiselect for mapping 200 CSV columns seamlessly (New)
+  - `[ ]` Task 459: Bind Pandas dataframes tightly to Memory logic preventing UI crashes (New)
+  - `[ ]` Task 460: Overwrite LLM system prompts strictly for native Markdown gram output (New)
+  - `[ ]` Task 461: Configure native mail throttle limits to block .pt.lu bounce delays (New)
+### [US-207] 100% Local Data Privacy (Status: Done)
+  - `[ ]` Task 462: Refactor Cryptography Bug - Replace dynamic salting loop with bcrypt.checkpw (New)
+  - `[ ]` Task 463: Implement Horizontal Table Partitioning to bypass MySQL 65KB InnoDB limit (New)
+  - `[ ]` Task 464: Construct dynamic UI multiselect for mapping 200 CSV columns seamlessly (New)
+  - `[ ]` Task 465: Bind Pandas dataframes tightly to Memory logic preventing UI crashes (New)
+### [US-206] User Account Creation & Login (Status: Done)
+  - *No technical tasks associated!*
+### [US-208] View Complete Nutritional Info (Status: In progress)
+  - `[ ]` Task 442: Why: Applying the global CSS architecture is the direct prerequisite to making the visual information actually look premium and readable when the user views the data. (New)
+### [US-209] Search for Nutrients (Status: In progress)
+  - `[ ]` Task 443: Why: Building the numerical filtering sliders logically completes the "Advanced Search" capabilities explicitly defined by this story. (New)
+### [US-211] Store and Edit Food Combinations (Status: New)
+  - `[ ]` Task 446: Why: The core of this story is storing data, which is entirely solved by creating the explicit relational plates and plate_items MySQL database tables. (New)
+### [US-212] Lightweight Local AI Models (Status: Done)
+  - *No technical tasks associated!*
+### [US-210] Combined Nutritional Value Overview (Status: New)
+  - `[ ]` Task 445: Why: Generating the Pandas calculation logic that mathematically adds up the macros is what delivers the final "Combined Value Overview" to the user! (New)
+### [US-213] Chat About Nutrition (Status: Done)
+  - *No technical tasks associated!*
+### [US-214] AI Menu Proposals (Status: New)
+  - *No technical tasks associated!*
+### [US-215] Anonymous Web Search Tool (Status: Done)
+  - *No technical tasks associated!*
+### [US-246] Database Schema Dynamic Rebuild & Background Loader (Status: Done)
+  - `[ ]` Task 435: Rebuild setup_db.py to allow dynamic Pandas table generation. (New)
+  - `[ ]` Task 436: Update ingest_csv.py with to_sql and post-load index generating. (New)
+  - `[ ]` Task 437: Create start_batch_ingest.sh wrapper for disconnected execution. (New)
+  - `[ ]` Task 438: Configure server .forward mail protocols for centralized admin support. (New)
+### [US-247] Deploy SearXNG Docker API (Status: Done)
+  - `[ ]` Task 439: Create setup_searxng.sh to install Docker and bind anonymous SearXNG to localhost:8080. (New)
+  - `[ ]` Task 440: Update deploy.sh to include requests connectivity dependency. (New)
+  - `[ ]` Task 441: Rework app.py LLM inference loop to support native Mistral Tool/Function calling integrations. (New)
+### [US-216] Zero Confidential Data Leakage (Status: Done)
+  - *No technical tasks associated!*
+### [US-248] Clinical Medical Profiler (Status: New)
+  - `[x]` Task 447: Implement EAV Mapping Database Architecture (Closed)
+  - `[x]` Task 448: Fix Windows Encodings in Pandas Ingestion Engine (Closed)
+  - `[x]` Task 449: Build Dynamic 'Medical Profile' CRUD Interface (Closed)
+  - `[x]` Task 450: Deploy Clinical Health-Warning Alert Engine (Closed)
+  - `[x]` Task 451: Deploy Email Resets and Persistent Query Limits (Closed)
+### [US-249] Sprint 4: Operations & Migrations (Status: New)
+  - `[ ]` Task 452: Create unified PDF presentation for review (New)
+  - `[ ]` Task 453: Execute Alembic Database Migration scripting (New)
+  - `[ ]` Task 454: Sanitize Ollama Mistral LLM endpoints on .170 (New)
+  - `[ ]` Task 455: Perform Green Recommendation Engine Demo (New)
+### [US-250] Zabbix Server Docker Setup (Status: New)
+  - *No technical tasks associated!*
+### [US-251] SNMPv3 Integration (Status: New)
+  - *No technical tasks associated!*
+### [US-252] Application Component Traps (Status: New)
+  - *No technical tasks associated!*
+### [US-253] Clinical Explorer Verification Testing (Status: New)
+  - *No technical tasks associated!*
+### [US-254] Zabbix Application Monitoring Checks (Status: New)
+  - *No technical tasks associated!*
+### [US-255] Zabbix Email Integration (Status: New)
+  - *No technical tasks associated!*
+### [US-256] Zabbix Live Alert Testing (Status: New)
+  - *No technical tasks associated!*
+### [US-257] Server Backup Procedures (Status: New)
+  - *No technical tasks associated!*
+### [US-258] WSL Deployment Playbook (Status: New)
+  - *No technical tasks associated!*
+### [US-259] Agile Scrum Rituals Wiki (Status: New)
+  - *No technical tasks associated!*
+### [US-260] Sprint 8 Final Bug Fixes & Polish (Status: New)
+  - *No technical tasks associated!*
+### [US-261] Deep System Overhaul Phase 3 (Status: New)
+  - *No technical tasks associated!*
+### [US-262] Deep Containerization and Zabbix Telemetry Overhaul (Status: New)
+  - *No technical tasks associated!*

+ 21 - 0
docs/zabbix_monitoring.md

@@ -0,0 +1,21 @@
+# Zabbix Telemetry & Monitoring Guide
+
+## Overview
+The Local Food AI project enforces strict DevSecOps observability by streaming live hardware and database telemetry to an external Zabbix server (`192.168.130.170:8081`).
+
+## Accessing the Dashboard
+1. Open your browser and navigate to `http://192.168.130.170:8081`.
+2. Log in using your Zabbix credentials (default: `Admin` / `zabbix`).
+3. On the left sidebar, click **Monitoring > Dashboards**.
+4. Select the **Food AI RAG Telemetry (Live)** dashboard.
+
+## Key Metrics Monitored
+The dashboard automatically queries the SNMP daemons running inside the Docker containers to monitor:
+- **Memory Consumption**: Tracks the heavy RAM usage of the `llama3.2:1b` model during clinical evaluations.
+- **CPU Spikes**: Identifies processing bottlenecks during the 3GB OpenFoodFacts `MATCH AGAINST` queries.
+- **Database Row Count Check**: Displays the real-time record count of `food_db.products_core` to monitor the background CSV ingestion progress.
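The row-count probe behind the last item can be sketched as a small helper that accepts any DB-API connection. The function name is an assumption; the demo below substitutes an in-memory SQLite database for the real MySQL container so the sketch stays self-contained:

```python
# Sketch: poll the current record count of products_core to track
# background CSV ingestion progress.
import sqlite3

def products_core_row_count(conn):
    """Return the current record count of the products_core table."""
    cur = conn.cursor()
    cur.execute("SELECT COUNT(*) FROM products_core")
    (count,) = cur.fetchone()
    return count

# Demo against SQLite standing in for food_db on MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products_core (code TEXT, product_name TEXT)")
conn.executemany("INSERT INTO products_core VALUES (?, ?)",
                 [("001", "Oat Milk"), ("002", "Rye Bread")])
print(products_core_row_count(conn))  # 2
```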
+
+## Verifying Alerts
+1. Click **Monitoring > Problems**.
+2. If `snmpd` inside a container crashes or is unreachable, Zabbix will trigger an `Agent Unreachable` High-Severity Alert.
+3. If the Database Server container crashes, Zabbix triggers an alert via the application's Python `snmp_notifier.py` wrapper, which sends asynchronous trap payloads flagging critical RAG failures.
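A hypothetical sketch of what such an `snmp_notifier.py` wrapper could look like (the real file is not part of this diff, and the OID is a placeholder). It builds the trap payload and hands it to an injected sender; in production the sender would shell out to `snmptrap` or use an SNMP library, but here it is stubbed so the sketch stays self-contained:

```python
# Hypothetical snmp_notifier.py sketch: compose a critical-failure trap
# payload and pass it to a pluggable sender.
import json
import time

ENTERPRISE_OID = "1.3.6.1.4.1.99999"  # placeholder enterprise OID, not real

def build_rag_failure_trap(component, detail):
    """Describe a critical RAG failure as a trap payload dictionary."""
    return {
        "oid": f"{ENTERPRISE_OID}.1",
        "severity": "critical",
        "component": component,
        "detail": detail,
        "timestamp": int(time.time()),
    }

def send_trap(payload, sender=print):
    # The sender is injected so the payload path can be exercised offline;
    # a real deployment would forward this to the Zabbix trap receiver.
    sender(json.dumps(payload))

trap = build_rag_failure_trap("mysql", "container unreachable")
send_trap(trap)
```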