
TG-97: Replace llama3.1:8b with qwen3.5:4b

FerRo988 1 week ago
parent
commit
9a4985b476

+ 1 - 1
DOCUMENTATION_CHECKLIST.md

@@ -5,7 +5,7 @@ This file tracks information that Antigravity (the AI) must provide to the Tech
 ## Technical Document Requirements
 - [ ] **Installation & Configuration:** Step-by-step guide for a clean Ubuntu 24.04 VM.
 - [ ] **Tech Stack Rationale:** Why FastAPI, SQLite/PostgreSQL, etc.
-- [ ] **LLM Selection:** Explain which local model is used (e.g., Llama 3.1 8B), why it was chosen, and its quantization level.
+- [ ] **LLM Selection:** Explain which local model is used (e.g., Qwen 3.5:4B), why it was chosen, and its quantization level.
 - [ ] **Agent Permissions:** Explain how and why Antigravity model permissions were configured.
 - [ ] **Infrastructure Diagram:** Description/code for a diagram showing how app components communicate locally.
 - [ ] **Privacy Proof:** Explanation of how we verified that no user data leaves the server.

+ 1 - 1
documentation/Sprint5_Architecture_Decisions.md

@@ -9,7 +9,7 @@ Sprint 5 established the core backend infrastructure required to operate a fully
 *   **Database**: **SQLite**
     *   *Reasoning*: Selected over heavy databases like PostgreSQL because this is a local application running on a constrained VM. SQLite is incredibly fast for read-heavy operations, requires zero setup, and naturally aligns with the "local file" privacy philosophy.
     *   *Optimization*: Enabled `WAL` (Write-Ahead Logging) mode and `check_same_thread=False` to allow FastAPI's asynchronous workers to read/write concurrently without immediate locking issues.
-*   **AI Engine**: **Llama 3.1 8B via Ollama**
+*   **AI Engine**: **Qwen 3.5:4B via Ollama** (Upgraded from Llama 3.1)
     *   *Reasoning*: Selected because it provides exceptional natural language understanding while fitting within the RAM constraints of the local VM. Ollama handles the model serving locally on port `11434`.
 *   **Authentication**: **Custom Token-Based Auth**
     *   *Reasoning*: Rather than relying on external OAuth providers (Google/Auth0), we built a local token system (`secrets.token_urlsafe`) stored in the SQLite database to ensure zero data leaves the server.

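The local token scheme mentioned in the same file can be illustrated with the standard library alone (a sketch — the `tokens` table layout and the `issue_token` helper are hypothetical):

```python
import secrets
import sqlite3

def issue_token(conn: sqlite3.Connection, user_id: int) -> str:
    # secrets.token_urlsafe(32) yields a 43-char cryptographically
    # random, URL-safe string; stored locally so no credential ever
    # leaves the server.
    token = secrets.token_urlsafe(32)
    conn.execute("INSERT INTO tokens (user_id, token) VALUES (?, ?)",
                 (user_id, token))
    return token

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tokens (user_id INTEGER, token TEXT)")
token = issue_token(conn, 1)
```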
+ 4 - 2
main.py

@@ -51,7 +51,7 @@ async def get_current_user(authorization: Optional[str] = Header(None)):
     return user
 
 OLLAMA_URL = "http://localhost:11434/api/chat"
-MODEL_NAME = "llama3.1:8b"
+MODEL_NAME = "qwen3.5:4b"
 
 # Common stopwords to strip before searching the food database
 _STOPWORDS = {
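The stopword stripping hinted at by the `_STOPWORDS` comment might look like this (the word set shown is illustrative, not the file's actual contents, and `clean_query` is a hypothetical helper):

```python
# Illustrative stopword set; the real _STOPWORDS in main.py differs.
_STOPWORDS = {"the", "a", "of", "in", "per", "how", "many", "much"}

def clean_query(text: str) -> str:
    # Drop filler words so a query like "How many calories in rice"
    # searches the food database for the meaningful tokens only.
    words = [w for w in text.lower().split() if w not in _STOPWORDS]
    return " ".join(words)

print(clean_query("How many calories in rice"))  # calories rice
```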
@@ -104,6 +104,7 @@ def extract_food_context(messages: list) -> str | None:
     lines = [
         "[SYSTEM: NUTRITIONAL ANALYST MODE]",
         "You are the LocalFoodAI Analyst. Use ONLY verified local data for values.",
+        "CRITICAL: Provide direct, concise answers. Skip all internal monologues, <thought> tags, or reasoning steps.",
         "For each food discussed, you MUST follow this structure:",
         "1. Header: ### 🥗 [Name] (per 100g)",
         "2. Macros: A markdown table for Cal, P, F, C, Fib, Sug, Chol.",
@@ -231,7 +232,8 @@ async def chat_endpoint(request: ChatRequest, current_user: dict = Depends(get_c
     payload = {
         "model": MODEL_NAME,
         "messages": messages,
-        "stream": True
+        "stream": True,
+        "think": False  # Disable reasoning/thinking mode for faster responses
     }
     
     async def generate_response():
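For reference, the amended payload matches Ollama's `/api/chat` request body shape; `build_payload` is a hypothetical helper for illustration, and the top-level `think` flag is only honored by Ollama versions that support thinking models:

```python
OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL_NAME = "qwen3.5:4b"

def build_payload(messages: list) -> dict:
    # Mirrors the payload built in chat_endpoint above.
    return {
        "model": MODEL_NAME,
        "messages": messages,
        "stream": True,   # stream tokens back as they are generated
        "think": False,   # skip the model's reasoning trace entirely
    }

payload = build_payload([{"role": "user", "content": "Calories in 100 g of rice?"}])
```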

+ 1 - 1
static/index.html

@@ -124,7 +124,7 @@
                     <svg width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><line x1="22" y1="2" x2="11" y2="13"></line><polygon points="22 2 15 22 11 13 2 9 22 2"></polygon></svg>
                 </button>
             </form>
-            <div class="footer-note">Powered by Llama 3.1 8B running locally on Ubuntu 24.04 via Ollama</div>
+            <div class="footer-note">Powered by Qwen 3.5:4B running locally on Ubuntu 24.04 via Ollama</div>
         </footer>
     </div>
     

+ 1 - 1
user_stories.md

@@ -34,7 +34,7 @@ _Note: Sprints 1–3 covered initial VM setup, Ollama framework installation, Go
   - **[Back] (2 pts):** Create a fast fuzzy-matching API endpoint `GET /api/food/search`.
   - **[Front] (2 pts):** Implement the search bar component with real-time fetch autocomplete.
 - **[US-06]** As a developer, I want to connect the AI logic to the local database securely.
-  - **[Back] (2 pts):** Equip the Llama 3.1 8B agent with local SQL lookup tools so it can query the DB.
+  - **[Back] (2 pts):** Equip the Qwen 3.5:4B agent with local SQL lookup tools so it can query the DB.
 
 ## 🏃 Sprint 6: Comprehensive Nutritional Information
 **Total Points: 10** | **Goal:** Expose deep nutritional data (macros, minerals, vitamins, amino acids) to the user.