GROOT FORCE - Functional Requirements Document (FRD)

Core AI System

Document Version: 1.0
Date: November 2025
Status: Active Development
Owner: AI Engineering Team

Document Control

Version	Date	Author	Changes
1.0	Nov 2025	AI Team	Initial FRD for Core AI System

Approval:

AI/ML Lead
Software Architect
Security Lead

1. Introduction

1.1 Purpose

This Functional Requirements Document (FRD) defines the detailed functional requirements for the Core AI System of GROOT FORCE. The Core AI System is the "brain" that makes GROOT FORCE unique: a human-bound, privacy-first, emotionally intelligent AI that runs on-device by default, with cloud inference available only as an optional, opt-in tier.

1.2 Scope

This FRD covers:

Local LLM Engine - On-device language model inference
RAG System - Retrieval-Augmented Generation for knowledge grounding
Critical Reasoning Kernel (CRK) - Anti-hallucination and logic verification
Emotional Engine - User emotional state tracking and adaptive behavior
Executive Function Framework (EFF) - Task decomposition and cognitive support
Memory Architecture - Multi-tier, domain-isolated memory
Tool Calling System - Function calling and skill integration
Anticipation Layer - Predictive assistance

Out of Scope:

Vision AI (separate FRD)
Speech processing (separate FRD)
Sensor fusion (separate FRD)
UI/UX implementation (separate document)

Traces To:

Master PRD - Product requirements
System SRS - System-level software requirements
Hardware Requirements - Hardware specs

Related FRDs:

FRD: Sensor & Safety Systems (to be created)
FRD: User Experience & Interface (to be created)
FRD: Vision AI & Computer Vision (to be created)

2. System Architecture Overview

2.1 AI System Components

┌─────────────────────────────────────────────────────┐
│              CORE AI SYSTEM ARCHITECTURE            │
├─────────────────────────────────────────────────────┤
│                                                     │
│  ┌───────────────────────────────────────────────┐ │
│  │         INPUT PROCESSING LAYER                │ │
│  │  - Intent Parser                              │ │
│  │  - Context Builder                            │ │
│  │  - Entity Extraction                          │ │
│  └───────────────────────────────────────────────┘ │
│                       ↓                             │
│  ┌───────────────────────────────────────────────┐ │
│  │         EMOTIONAL ENGINE                      │ │
│  │  - State Tracker (Valence/Arousal/Control)   │ │
│  │  - Trigger Bank                               │ │
│  │  - Tone Adapter                               │ │
│  └───────────────────────────────────────────────┘ │
│                       ↓                             │
│  ┌───────────────────────────────────────────────┐ │
│  │    CRITICAL REASONING KERNEL (CRK)            │ │
│  │  - Micro-Reasoning (per reply)                │ │
│  │  - Meso-Reasoning (session)                   │ │
│  │  - Macro-Reasoning (life goals)               │ │
│  │  - Evidence Tagger                            │ │
│  │  - PFC Load Estimator                         │ │
│  └───────────────────────────────────────────────┘ │
│            ↓                      ↓                 │
│  ┌──────────────────┐   ┌──────────────────┐      │
│  │  RAG ENGINE      │   │  LOCAL LLM       │      │
│  │  - Vector Store  │   │  - 3-8B Models   │      │
│  │  - Retrieval     │   │  - Inference     │      │
│  │  - Domain Filter │   │  - Context Mgmt  │      │
│  └──────────────────┘   └──────────────────┘      │
│                       ↓                             │
│  ┌───────────────────────────────────────────────┐ │
│  │   EXECUTIVE FUNCTION FRAMEWORK (EFF)          │ │
│  │  - Task Decomposition                         │ │
│  │  - Prioritization Engine                      │ │
│  │  - Gating & Protection                        │ │
│  │  - Habit Builder                              │ │
│  └───────────────────────────────────────────────┘ │
│                       ↓                             │
│  ┌───────────────────────────────────────────────┐ │
│  │         TOOL CALLING & SKILLS                 │ │
│  │  - Function Router                            │ │
│  │  - Skill Manager                              │ │
│  │  - Action Executor                            │ │
│  └───────────────────────────────────────────────┘ │
│                       ↓                             │
│  ┌───────────────────────────────────────────────┐ │
│  │         OUTPUT GENERATION                     │ │
│  │  - Response Composer                          │ │
│  │  - Tone Adjustment                            │ │
│  │  - Confidence Tagging                         │ │
│  └───────────────────────────────────────────────┘ │
│                                                     │
└─────────────────────────────────────────────────────┘

2.2 Data Flow

Typical Interaction Flow:

User speaks → speech-to-text (STT) converts speech to text
Intent Parser extracts intent and entities
Emotional Engine updates user state
CRK estimates cognitive load and reasoning requirements
RAG retrieves relevant memories/documents
Local LLM generates response (with CRK supervision)
EFF applies task decomposition if needed
Tool Calling executes actions if required
Output Generation composes final response with appropriate tone
Text-to-speech (TTS) converts response to speech → User hears response

3. Local LLM Engine

3.1 Model Requirements

FRD-AI-LLM-001 [P0]
System shall support loading and running quantized LLMs from 3B to 8B parameters.

Functional Specification:

Supported Models:
- Llama family (3B and 8B variants, e.g. Llama 3.2 3B and Llama 3.1 8B)
- Mistral (7B variant)
- Phi-3 (3.8B variant)
- Other compatible architectures
Quantization Formats:
- Primary: Q4_K_M (4-bit, k-quant, medium)
- Optional: Q8_0 (8-bit, block-wise scale only, no offset)
- Optional: Q5_K_M (5-bit, k-quant, medium)
Model Loading:
- Load time: < 10 seconds for 3-4B models, < 20 seconds for 7-8B models
- Memory footprint: 2-5 GB at Q4_K_M depending on model size (higher for Q5_K_M and Q8_0)
- Persistent loading: Model remains in memory until user switches
Context Window:
- Minimum: 4096 tokens
- Target: 8192 tokens (if RAM permits)
- Context management: Sliding window with importance-based pruning
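
As a rough cross-check of the memory footprint figures above, weight storage can be estimated from parameter count and bits per weight. The sketch below is illustrative only: the bits-per-weight values are approximate averages for llama.cpp-style quantization formats (an assumption, not part of this FRD), the function name is hypothetical, and the estimate excludes the KV cache and runtime buffers.

```python
# Approximate average bits per weight for each quantization format.
# These are assumed, rounded figures for illustration, not measured values.
APPROX_BITS_PER_WEIGHT = {
    "Q4_K_M": 4.85,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
}


def estimate_weight_memory_gb(params_billions: float, quant: str) -> float:
    """Estimate weight storage in GB (1 GB = 1e9 bytes), excluding KV cache."""
    bits = APPROX_BITS_PER_WEIGHT[quant]
    total_bytes = params_billions * 1e9 * bits / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    for size in (3, 8):
        for quant in APPROX_BITS_PER_WEIGHT:
            gb = estimate_weight_memory_gb(size, quant)
            print(f"{size}B {quant}: ~{gb:.1f} GB")
```

Under these assumptions, only the Q4_K_M format keeps both 3B and 8B models inside the 2-5 GB range; the optional formats need more headroom for the larger models.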

Validation:

Load test models of each supported size
Verify memory usage within limits
Confirm context window functionality

Traces To: REQ-AI-001 (System SRS)

FRD-AI-LLM-002 [P0]
System shall implement multi-tier inference routing based on task complexity.

Functional Specification:

Tier 1: Local Low-Power (3-4B models)

Trigger Conditions:
- Token estimate < 100 tokens
- Simple queries (definitions, facts, quick replies)
- User explicitly requests fast mode
- Battery < 15%
Use Cases:
- "What time is it?"
- "Turn on flashlight"
- "Navigate to home"
- Object labeling
- Quick translations

Tier 2: Local High-Performance (7-8B models)

Trigger Conditions:
- Token estimate 100-500 tokens
- Complex reasoning required
- Multi-step tasks
- NDIS documentation
- Battery ≥ 15%
Use Cases:
- Email drafting
- Meeting notes summarization
- Complex reasoning (math, logic)
- Long-form writing
- Code generation

Tier 3: Cloud Boost (70B+ models) - Optional

Trigger Conditions:
- Token estimate > 500 tokens
- Web search required
- Multi-document RAG synthesis
- User explicitly requests "deep think"
- Cloud tier subscription active
Use Cases:
- Research synthesis
- Complex multi-document analysis
- Specialized domain knowledge
- Code review and debugging

Routing Algorithm:

def select_inference_tier(query, context):
    # Estimate token count
    token_estimate = estimate_tokens(query, context)
    
    # Low battery always forces the low-power tier
    if battery_level < 15:
        return TIER_LOCAL_LOW
    
    # Analyze complexity
    complexity = analyze_complexity(query)
    
    # Short, simple queries stay on the low-power tier,
    # including in strict privacy mode
    if token_estimate < 100 and complexity == "simple":
        return TIER_LOCAL_LOW
    
    # Cloud is permitted only when privacy mode allows it, the service
    # is reachable, and the user has an active subscription
    cloud_allowed = (
        user.privacy_mode != "strict"
        and cloud_available
        and user.subscribed
    )
    
    # Route decision
    if cloud_allowed and token_estimate > 500 and requires_web_search(query):
        return TIER_CLOUD_BOOST
    return TIER_LOCAL_HIGH

Validation:

Test routing logic with diverse query set
Verify correct tier selection for > 95% of test queries
Confirm privacy preferences respected

Traces To: REQ-AI-003 (System SRS)

FRD-AI-LLM-003 [P0]
System shall implement thermal-aware throttling for AI inference.

Functional Specification:

Temperature Monitoring:

Poll CPU/GPU/NPU temperature every 1 second during inference
Maintain temperature history (rolling 60-second window)
Predict temperature trajectory
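
The monitoring steps above can be sketched as a rolling window with a simple linear trend. This is a minimal illustration, assuming one sample per second and a least-squares line as the trajectory model; the class and method names are hypothetical, and the FRD does not prescribe a specific prediction method.

```python
from collections import deque


class TemperatureTracker:
    """Rolling window of 1 Hz temperature samples with a linear trend estimate."""

    def __init__(self, window_seconds: int = 60):
        # deque with maxlen drops the oldest sample automatically
        self.samples = deque(maxlen=window_seconds)

    def add_sample(self, temp_c: float) -> None:
        self.samples.append(temp_c)

    def slope_c_per_second(self) -> float:
        """Least-squares slope over the window, assuming one sample per second."""
        n = len(self.samples)
        if n < 2:
            return 0.0
        mean_t = (n - 1) / 2
        mean_y = sum(self.samples) / n
        num = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(self.samples))
        den = sum((t - mean_t) ** 2 for t in range(n))
        return num / den

    def predict(self, seconds_ahead: float) -> float:
        """Extrapolate the latest reading along the fitted trend."""
        if not self.samples:
            raise ValueError("no temperature samples recorded")
        return self.samples[-1] + self.slope_c_per_second() * seconds_ahead
```

A predicted value that crosses the next throttling threshold could be used to throttle slightly early, although whether to act on predictions is a design decision left open here.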

Throttling Levels:

Level	Temperature	Action	Performance Impact
0 (Normal)	< 42°C	None	100%
1 (Mild)	42°C to < 45°C	Reduce token generation speed 10%	90%
2 (Moderate)	45°C to < 48°C	Reduce token speed 25%, switch to smaller model	75%
3 (Severe)	48°C to < 50°C	Emergency low-power mode, minimal AI	30%
4 (Critical)	≥ 50°C	System shutdown	0%

Implementation:

def thermal_throttle(current_temp):
    if current_temp < 42:
        return 1.0  # No throttling
    elif current_temp < 45:
        return 0.9  # 10% slower
    elif current_temp < 48:
        # Switch to smaller model + slow down
        if current_model != MODEL_3B:
            switch_model(MODEL_3B)
        return 0.75
    elif current_temp < 50:
        # Emergency mode
        pause_non_critical_ai()
        return 0.3
    else:
        # Critical shutdown
        emergency_shutdown()
        return 0.0

User Notification:

Level 1: No notification (transparent)
Level 2: Optional notification "AI running in power-saving mode"
Level 3: Visible notification "Device cooling down, AI limited"
Level 4: "Device too hot, shutting down for safety"

Validation:

Sustained AI load testing
Verify temperature never exceeds 50°C
Confirm proper throttling transitions

Traces To: REQ-AI-004 (System SRS)

FRD-AI-LLM-004 [P0]
System shall implement token generation optimization for low latency.

Functional Specification:

Optimization Techniques:

Model Caching:
- Keep model in RAM between requests
- Preload commonly used models
- Lazy-load rarely used models
KV Cache Management:
- Reuse KV cache from previous turns
- Prune old conversation turns intelligently
- Compress KV cache for long contexts
Speculative Decoding:
- Use small draft model (1B) to predict tokens
- Verify with main model in batch
- 1.5-2x speedup on simple responses
Batching:
- Batch multiple user requests when possible
- Batch tool calls
- Amortize inference overhead
Quantized Inference:
- Use INT4/INT8 kernels where available
- Leverage NPU/GPU for quantized ops
- CPU fallback for compatibility
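
To make the speculative decoding item concrete, the following toy sketch shows the greedy variant: a draft model proposes several tokens, the main model checks them, and the longest agreeing prefix is kept along with the main model's correction at the first mismatch. The two "models" are plain Python callables (an assumption for illustration); a real engine would verify all proposed positions in one batched forward pass and use a probabilistic acceptance rule when sampling.

```python
from typing import Callable, List

# A "model" here is any function mapping a token prefix to the next token.
NextToken = Callable[[List[str]], str]


def speculative_decode(
    prefix: List[str],
    draft_next: NextToken,
    target_next: NextToken,
    draft_len: int = 4,
    max_new_tokens: int = 16,
    eos: str = "<eos>",
) -> List[str]:
    """Greedy speculative decoding; output matches target-only greedy decoding."""
    tokens = list(prefix)
    produced = 0
    while produced < max_new_tokens:
        # 1. Draft model proposes draft_len tokens autoregressively.
        proposal: List[str] = []
        for _ in range(draft_len):
            proposal.append(draft_next(tokens + proposal))

        # 2. Target model verifies each proposed position
        #    (sequential here; batched in a real engine).
        accepted: List[str] = []
        for tok in proposal:
            expected = target_next(tokens + accepted)
            accepted.append(expected)  # the target's token is always kept
            if expected != tok or expected == eos:
                break  # mismatch or end of sequence: discard the rest

        # 3. Commit accepted tokens, respecting EOS and the length limit.
        for tok in accepted:
            tokens.append(tok)
            produced += 1
            if tok == eos or produced >= max_new_tokens:
                return tokens
    return tokens
```

Because every committed token comes from the main model, a poor draft model costs speed but never changes the output in this greedy setting.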

Performance Targets:

Time to First Token (TTFT): < 100ms
Tokens per Second: > 10 tokens/sec on 3B, > 5 tokens/sec on 7B
End-to-end latency: < 200ms for simple queries

Validation:

Benchmark latency across model sizes
Measure TTFT and token throughput
Real-world user acceptance testing

Traces To: REQ-AI-002, REQ-PERF-001 (System SRS)

3.2 Context Management

FRD-AI-LLM-010 [P0]
System shall implement intelligent context window management.

Functional Specification:

Context Window Structure:

┌─────────────────────────────────────┐
│  System Prompt (fixed)              │  ~500 tokens
├─────────────────────────────────────┤
│  User Identity & Values (fixed)     │  ~300 tokens
├─────────────────────────────────────┤
│  Recent Conversation (sliding)      │  ~2048 tokens
├─────────────────────────────────────┤
│  RAG Retrieved Context (dynamic)    │  ~1024 tokens
├─────────────────────────────────────┤
│  Current Query                      │  ~256 tokens
├─────────────────────────────────────┤
│  Reserved for Response              │  ~1024 tokens
└─────────────────────────────────────┘
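Total: ~5152 tokens (fits in the 8K target context; exceeds the 4096-token minimum, so the conversation and RAG budgets must shrink on 4K configurations)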

Context Pruning Strategy:

Keep: System prompt, user identity, current query
Prune: Old conversation turns using importance scoring
Compress: Summarize old turns into memory snapshots

Importance Scoring:

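import math

def importance_score(turn, recency_weight=1.0):
    # recency_weight is tunable; 1.0 is a placeholder default.
    # turn is expected to expose age_hours, marked_important,
    # reference_count and entity_count.
    score = 0.0
    # Recency (exponential decay with a 24-hour time constant)
    score += recency_weight * math.exp(-turn.age_hours / 24)
    
    # Explicit user request ("remember this")
    if turn.marked_important:
        score += 10
    
    # Emotional significance
    score += emotional_intensity(turn) * 2
    
    # Referenced in later turns
    score += turn.reference_count * 0.5
    
    # Contains key entities (people, places, dates)
    score += turn.entity_count * 0.3
    
    return score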
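
One way to apply importance scores when the conversation budget is exceeded is a greedy selection that keeps the highest-scoring turns and then restores chronological order. The sketch below assumes each turn already carries a score and a token count; `ScoredTurn` and `prune_turns` are hypothetical names, and summarizing the dropped turns into memory snapshots is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredTurn:
    index: int         # position in the conversation (0 = oldest)
    tokens: int        # token count of the turn
    importance: float  # output of an importance-scoring function


def prune_turns(turns: List[ScoredTurn], budget_tokens: int) -> List[ScoredTurn]:
    """Keep the most important turns that fit the budget, in original order."""
    kept: List[ScoredTurn] = []
    used = 0
    # Greedy pass from most to least important turn
    for turn in sorted(turns, key=lambda t: t.importance, reverse=True):
        if used + turn.tokens <= budget_tokens:
            kept.append(turn)
            used += turn.tokens
    # Restore chronological order so the prompt still reads naturally
    return sorted(kept, key=lambda t: t.index)
```

Greedy selection is not optimal in the knapsack sense, but it is cheap and predictable, which may matter more on-device.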

Validation:

Long conversation tests (100+ turns)
Verify important context retained
Confirm graceful degradation when context full

Traces To: REQ-AI-001 (System SRS)

3.3 Prompt Engineering

FRD-AI-LLM-020 [P0]
System shall use optimized system prompts for each product variant.

Functional Specification:

Base System Prompt (All Variants):

You are KLYRA, the AI assistant for GROOT FORCE smart glasses. You are:
- Loyal and bound to {USER_NAME} only
- Privacy-first (local processing, no cloud unless explicitly requested)
- Calm, helpful, and non-judgmental
- Direct and concise (avoid long monologues unless asked)
- Honest about uncertainty (never make up facts)

Core capabilities:
- Voice control and commands
- Real-time translation
- Object recognition and OCR
- Walking assistance and navigation
- Memory and knowledge retrieval
- Task planning and execution

Current mode: {CURRENT_MODE}
User state: {EMOTIONAL_STATE}

Respond naturally in a conversational tone. Use the user's preferred language.

Variant-Specific Additions:

GF-CL (Care & Joy - NDIS Support Worker):

Additional context:
- You assist NDIS support workers in their daily tasks
- Always prioritize participant consent and dignity
- Document observations objectively and professionally
- Alert to safety concerns (falls, distress, medical issues)
- Support clear communication with participants and families

When documenting:
- Use person-first language
- Record facts, not judgments
- Include positive observations
- Follow NDIS practice standards

GF-TX (TradeForce - Trades/Industrial):

Additional context:
- You assist tradespeople and industrial workers
- Prioritize workplace safety above all else
- Provide clear, step-by-step instructions
- Support hands-free documentation
- Alert to hazards and OH&S violations

Safety protocol:
- Stop immediately if dangerous situation detected
- Remind about PPE if required
- Document incidents thoroughly
- Support emergency procedures

GF-VI (VisionAssist - Low Vision):

Additional context:
- You assist users with vision impairment
- Describe visual scenes clearly and concisely
- Read text aloud accurately (OCR)
- Provide navigation guidance
- Alert to obstacles and hazards

Descriptive protocol:
- Start with overall scene ("You're in a kitchen")
- Describe relevant objects and their positions
- Use clock face directions ("Door at 2 o'clock")
- Confirm uncertain identifications ("Looks like a chair")
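
Combining the base prompt with a variant-specific addition could look like the sketch below. The template text is abbreviated from the prompts above, and `build_system_prompt` and the dictionary layout are hypothetical names chosen for illustration.

```python
# Abbreviated base template; the placeholders match the base system prompt.
BASE_PROMPT_TEMPLATE = (
    "You are KLYRA, the AI assistant for GROOT FORCE smart glasses. You are:\n"
    "- Loyal and bound to {USER_NAME} only\n"
    "- Honest about uncertainty (never make up facts)\n"
    "\n"
    "Current mode: {CURRENT_MODE}\n"
    "User state: {EMOTIONAL_STATE}\n"
)

# Abbreviated variant additions keyed by product variant code.
VARIANT_ADDITIONS = {
    "GF-CL": "Additional context:\n- You assist NDIS support workers in their daily tasks\n",
    "GF-TX": "Additional context:\n- You assist tradespeople and industrial workers\n",
    "GF-VI": "Additional context:\n- You assist users with vision impairment\n",
}


def build_system_prompt(
    variant: str, user_name: str, current_mode: str, emotional_state: str
) -> str:
    """Fill the base template and append the addition for one variant."""
    if variant not in VARIANT_ADDITIONS:
        raise ValueError(f"unknown product variant: {variant!r}")
    base = BASE_PROMPT_TEMPLATE.format(
        USER_NAME=user_name,
        CURRENT_MODE=current_mode,
        EMOTIONAL_STATE=emotional_state,
    )
    return base + "\n" + VARIANT_ADDITIONS[variant]
```

Rejecting unknown variant codes early avoids silently running a device with only the base prompt.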

Validation:

Test each variant prompt with appropriate scenarios
Verify variant-specific behavior emerges
User acceptance testing with target personas

Traces To: [Master PRD - Product Variants]

4. RAG (Retrieval-Augmented Generation) System

4.1 Architecture

FRD-AI-RAG-001 [P0]
System shall implement hybrid vector + metadata retrieval system.

Functional Specification:

Storage Backend:

Vector Store: FAISS (Facebook AI Similarity Search)
- IndexFlatL2 for < 10K documents
- IndexIVFFlat for 10K-100K documents
- Quantization for > 100K documents
Metadata Store: SQLite
- Document metadata (source, timestamp, domain, tags)
- User annotations and importance scores
- Access control and privacy flags

Data Model:

-- Documents table
CREATE TABLE documents (
    id INTEGER PRIMARY KEY,
    content TEXT NOT NULL,
    source TEXT NOT NULL,
    domain TEXT NOT NULL,  -- Finance, NDIS, Work, Personal, etc.
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    embedding_version INTEGER NOT NULL,
    privacy_level INTEGER DEFAULT 0,  -- 0=normal, 1=sensitive, 2=highly_sensitive
    user_marked_important BOOLEAN DEFAULT FALSE
);

-- Chunks table (for large documents)
CREATE TABLE chunks (
    id INTEGER PRIMARY KEY,
    document_id INTEGER REFERENCES documents(id),
    chunk_index INTEGER NOT NULL,
    content TEXT NOT NULL,
    embedding_id INTEGER NOT NULL,  -- Index in FAISS
    token_count INTEGER NOT NULL,
    start_token INTEGER NOT NULL,   -- Token position in original document
    end_token INTEGER NOT NULL,     -- Exclusive end position
    UNIQUE(document_id, chunk_index)
);

-- Tags table
CREATE TABLE tags (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);

-- Document-Tag junction
CREATE TABLE document_tags (
    document_id INTEGER REFERENCES documents(id),
    tag_id INTEGER REFERENCES tags(id),
    PRIMARY KEY (document_id, tag_id)
);
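
The sketch below shows how vector hits might be mapped back to chunks while enforcing a domain filter in SQL. It uses an abbreviated copy of the schema above with Python's built-in sqlite3 module; `open_store` and `chunks_for_domains` are hypothetical helper names. Note that SQLite enforces REFERENCES constraints only when foreign keys are enabled on the connection.

```python
import sqlite3

# Abbreviated schema: only the columns this example needs.
SCHEMA = """
CREATE TABLE documents (
    id INTEGER PRIMARY KEY,
    content TEXT NOT NULL,
    source TEXT NOT NULL,
    domain TEXT NOT NULL
);
CREATE TABLE chunks (
    id INTEGER PRIMARY KEY,
    document_id INTEGER REFERENCES documents(id),
    chunk_index INTEGER NOT NULL,
    content TEXT NOT NULL,
    embedding_id INTEGER NOT NULL,
    UNIQUE(document_id, chunk_index)
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # Foreign keys are off by default in SQLite and must be enabled per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def chunks_for_domains(conn, embedding_ids, domains):
    """Return (embedding_id, content, domain) rows limited to allowed domains."""
    if not embedding_ids or not domains:
        return []
    id_marks = ",".join("?" * len(embedding_ids))
    domain_marks = ",".join("?" * len(domains))
    sql = (
        "SELECT c.embedding_id, c.content, d.domain "
        "FROM chunks c JOIN documents d ON d.id = c.document_id "
        f"WHERE c.embedding_id IN ({id_marks}) AND d.domain IN ({domain_marks}) "
        "ORDER BY c.embedding_id"
    )
    return conn.execute(sql, [*embedding_ids, *domains]).fetchall()
```

Filtering in the metadata store after the vector search keeps the domain rule in one place, at the cost of occasionally returning fewer results than requested.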

Embedding Model:

Model: sentence-transformers/all-MiniLM-L6-v2 or equivalent
Dimensions: 384 (compact for mobile)
Speed: > 100 sentences/second
Quality: Good balance of speed and accuracy

Validation:

Load 1,000+ documents
Verify retrieval accuracy with ground truth queries
Benchmark query latency < 100ms

Traces To: REQ-AI-020 (System SRS)

FRD-AI-RAG-002 [P0]
System shall implement domain-based memory isolation.

Functional Specification:

Supported Domains:

Finance - Bank accounts, investments, taxes, budgets
NDIS - Participant information, care plans, incidents
Work - Projects, colleagues, meetings, tasks
Personal - Family, friends, hobbies, diary
Health - Medical records, medications, symptoms
Relationships - People, conversations, emotional context
Engineering - Technical knowledge, code, designs
Custom - User-defined domains

Domain Rules:

Documents belong to ONE primary domain (no cross-contamination)
Queries MUST specify allowed domains explicitly
Cross-domain queries return UNION of results from specified domains
Highly sensitive domains (Health, Finance) require extra confirmation

Query API:

from typing import List

def retrieve_context(
    query: str,
    domains: List[str],
    max_chunks: int = 5,
    relevance_threshold: float = 0.7
) -> List[Chunk]:
    """
    Retrieve relevant context from specified domains.
    
    Args:
        query: User's query
        domains: List of allowed domains (e.g., ["Work", "Personal"])
        max_chunks: Maximum chunks to return
        relevance_threshold: Minimum similarity score
    
    Returns:
        List of relevant chunks with metadata
    """
    # Generate query embedding
    query_embedding = embed(query)
    
    # Search each domain separately
    results = []
    for domain in domains:
        domain_results = faiss_search(
            query_embedding,
            domain=domain,
            top_k=max_chunks
        )
        results.extend(domain_results)
    
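    # Filter by relevance threshold. r.score is assumed to be a similarity
    # (higher is better, e.g. cosine); raw L2 distances from IndexFlatL2
    # must be converted first, because for distances lower is better
    results = [r for r in results if r.score >= relevance_threshold]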
    
    # Sort by relevance and return top N
    results.sort(key=lambda r: r.score, reverse=True)
    return results[:max_chunks]

Example Usage:

# Safe: Query only Personal domain
context = retrieve_context(
    "What did I do last weekend?",
    domains=["Personal"]
)

# Safe: Query work + personal for scheduling
context = retrieve_context(
    "When am I free next week?",
    domains=["Work", "Personal"]
)

# Requires extra confirmation: Sensitive domain
if user_confirms("Access health records?"):
    context = retrieve_context(
        "What medications am I on?",
        domains=["Health"]
    )

Validation:

Cross-domain contamination test (a query restricted to one domain must return no chunks from any other domain)
Proper domain filtering with test queries
User acceptance testing of privacy controls

Traces To: REQ-AI-021 (System SRS)

FRD-AI-RAG-003 [P0]
System shall implement chunking strategy for large documents.

Functional Specification:

Chunking Parameters:

Chunk Size: 512-768 tokens (target: 640 tokens)
Overlap: 128 tokens between chunks
Rationale: Overlap reduces context loss at chunk boundaries

Chunking Algorithm:

def chunk_document(text: str, chunk_size: int = 640, overlap: int = 128):
    """
    Split document into overlapping chunks.
    Preserves sentence boundaries where possible.
    """
    # Tokenize
    tokens = tokenize(text)
    
    chunks = []
    start = 0
    
    while start < len(tokens):
        # Define chunk end
        end = min(start + chunk_size, len(tokens))
        
        # Try to end at sentence boundary
        chunk_tokens = tokens[start:end]
        if end < len(tokens):  # Not at document end
            # Look for sentence boundary in last 50 tokens
            for i in range(len(chunk_tokens) - 1, max(0, len(chunk_tokens) - 50), -1):
                if is_sentence_boundary(chunk_tokens[i]):
                    end = start + i + 1
                    break
        
        # Extract chunk
        chunk_text = detokenize(tokens[start:end])
        chunks.append({
            'text': chunk_text,
            'start_token': start,
            'end_token': end,
            'chunk_index': len(chunks)
        })
        
        # Stop once the final token is covered; without this check the
        # overlap step would revisit the tail of the document forever
        if end == len(tokens):
            break
        
        # Move to next chunk with overlap, always advancing by at least one token
        start = max(end - overlap, start + 1)
    
    return chunks

Chunk Metadata: Each chunk stores:

chunk_index: Position in document (0, 1, 2, ...)
document_id: Parent document
token_count: Number of tokens
start_token, end_token: Token positions in original document
embedding_id: ID of the chunk's vector in the FAISS index

Retrieval Context Assembly: When a chunk is retrieved:

Include the chunk itself
Optionally include adjacent chunks for continuity
Include document metadata (title, source, date)

from typing import List

def assemble_context(retrieved_chunks: List[Chunk]) -> str:
    """
    Assemble context from retrieved chunks.
    Includes source attribution and metadata.
    """
    context_parts = []
    
    for chunk in retrieved_chunks:
        # Load document metadata
        doc = load_document(chunk.document_id)
        
        # Format context with source. Built line by line so the prompt
        # carries no stray indentation from the source code.
        context_part = "\n".join([
            f"Source: {doc.source} ({doc.domain})",
            f"Date: {doc.created_at}",
            "---",
            chunk.text,
            "---",
        ])
        context_parts.append(context_part)
    
    return "\n\n".join(context_parts)

Validation:

Test chunking on various document types
Verify overlap preserves context
Retrieval quality test with multi-chunk documents

Traces To: REQ-AI-020 (System SRS)

FRD-AI-RAG-004 [P1]
System shall support incremental updates without full reindex.

Functional Specification:

Operations:

Add Document:
- Insert document into SQLite
- Chunk document
- Generate embeddings for each chunk
- Add embeddings to FAISS index
- Time: < 1 second for typical document
Update Document:
- Compute diff with old version
- If minor edit: Update affected chunks only
- If major edit: Re-chunk and re-embed entire document
- Update FAISS index with new embeddings
Delete Document:
- Soft delete: Mark as deleted in SQLite
- Keep in FAISS (for rollback capability); retrieval must filter out soft-deleted documents
- Hard delete: Remove from SQLite and FAISS during cleanup
Index Optimization:
- Background task (runs when idle)
- Defragment FAISS index
- Clean up soft-deleted documents
- Rebuild index if fragmentation > 50%

Implementation:

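import numpy as np

class RAGSystem:
    # Assumes self.faiss_index is wrapped in faiss.IndexIDMap so that
    # embedding IDs stay stable after removals, and that self.embed
    # returns a 1-D vector matching the index dimension.
    
    def add_document(self, content: str, metadata: dict) -> int:
        # Insert into database
        doc_id = db.insert_document(content, metadata)
        
        # Chunk
        chunks = chunk_document(content)
        
        # Embed and add to FAISS
        for chunk in chunks:
            # FAISS expects a 2-D float32 array and add() returns nothing,
            # so the ID is allocated explicitly.
            # db.next_embedding_id() is a hypothetical ID allocator.
            embedding = np.asarray(self.embed(chunk['text']), dtype='float32').reshape(1, -1)
            embedding_id = db.next_embedding_id()
            self.faiss_index.add_with_ids(
                embedding, np.array([embedding_id], dtype='int64')
            )
            
            db.insert_chunk(
                document_id=doc_id,
                chunk_index=chunk['chunk_index'],
                content=chunk['text'],
                embedding_id=embedding_id,
                token_count=chunk['end_token'] - chunk['start_token']
            )
        
        return doc_id
    
    def update_document(self, doc_id: int, new_content: str) -> int:
        old_doc = db.get_document(doc_id)
        
        # Check diff size
        similarity = compute_similarity(old_doc.content, new_content)
        
        if similarity > 0.9:  # Minor edit
            # Update only changed chunks
            self._incremental_update(doc_id, new_content)
            return doc_id
        
        # Major edit: full re-index. The document receives a new ID,
        # which is returned so callers can update their references.
        self.delete_document(doc_id)
        return self.add_document(new_content, old_doc.metadata)
    
    def delete_document(self, doc_id: int, soft: bool = True):
        if soft:
            db.mark_deleted(doc_id)
        else:
            # Remove from FAISS (remove_ids takes an int64 array of IDs)
            chunk_ids = db.get_chunk_embedding_ids(doc_id)
            self.faiss_index.remove_ids(np.array(chunk_ids, dtype='int64'))
            
            # Remove from database
            db.delete_document(doc_id)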

Validation:

Add 100 documents rapidly
Update 50 documents
Delete 25 documents
Verify index integrity and retrieval accuracy

Traces To: REQ-AI-022 (System SRS)

4.2 Query Processing

FRD-AI-RAG-010 [P0]
System shall implement query preprocessing and expansion.

Functional Specification:

Query Preprocessing Steps:

Intent Detection:
- Classify query type (factual, procedural, opinion, etc.)
- Identify key entities (people, places, dates)
- Extract temporal constraints ("last week", "yesterday")
Query Expansion:
- Add synonyms for key terms
- Expand abbreviations (e.g., "NDIS" → "National Disability Insurance Scheme")
- Include related concepts (e.g., "car" also search "vehicle", "automobile")
Domain Selection:
- Auto-detect likely domains based on query content
- Ask user to confirm if multiple domains possible
- Default to safe domains (Personal, Work) unless specified
Time Filtering:
- Extract time constraints from query
- Filter documents by creation/update timestamp
- Examples:
  - "last week" → created_at > 7 days ago
  - "in 2024" → created_at >= '2024-01-01' AND created_at < '2025-01-01'
  - "recent" → created_at > 30 days ago
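
The time-filtering examples above can be expressed as a small phrase-to-range mapper. This sketch covers only the three example phrases and returns half-open ranges, so that timestamps late on the final day of a year are still included; `time_range_for` is a hypothetical name, and treating "last week" as the previous 7 days follows the example in the list.

```python
import re
from datetime import datetime, timedelta
from typing import Optional, Tuple


def time_range_for(phrase: str, now: datetime) -> Optional[Tuple[datetime, datetime]]:
    """Return a half-open [start, end) range for a recognized phrase, else None."""
    text = phrase.lower()
    if "last week" in text:
        return now - timedelta(days=7), now
    if "recent" in text:
        return now - timedelta(days=30), now
    match = re.search(r"\bin (\d{4})\b", text)
    if match:
        year = int(match.group(1))
        # End is the first instant of the following year (exclusive)
        return datetime(year, 1, 1), datetime(year + 1, 1, 1)
    return None
```

The resulting pair maps directly onto a `created_at >= start AND created_at < end` filter in the metadata store.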

Implementation:

def preprocess_query(query: str, context: ConversationContext) -> ProcessedQuery:
    # Intent detection
    intent = classify_intent(query)
    
    # Entity extraction
    entities = extract_entities(query)
    
    # Time extraction
    time_constraint = extract_time_constraint(query)
    
    # Domain detection
    detected_domains = detect_domains(query, entities)
    
    # Query expansion
    expanded_query = expand_query(query, entities)
    
    return ProcessedQuery(
        original=query,
        expanded=expanded_query,
        intent=intent,
        entities=entities,
        time_constraint=time_constraint,
        suggested_domains=detected_domains
    )

Validation:

Test query preprocessing with diverse inputs
Verify entity extraction accuracy > 90%
Confirm domain detection aligns with user intent

Traces To: REQ-AI-020 (System SRS)

5. Critical Reasoning Kernel (CRK)

5.1 Multi-Scale Reasoning

FRD-AI-CRK-001 [P0]
System shall implement three-scale reasoning: Micro, Meso, Macro.

Functional Specification:

Micro-Reasoning (Per Reply):

Scope: Single response
Purpose: Catch local logic errors and contradictions
Method: Self-critique pass after generating response

def micro_reasoning_check(response: str, context: dict) -> ReasoningResult:
    """
    Check response for local contradictions and errors.
    """
    issues = []
    
    # 1. Internal contradiction check
    contradictions = find_contradictions(response)
    if contradictions:
        issues.append({
            'type': 'contradiction',
            'severity': 'high',
            'details': contradictions
        })
    
    # 2. Math/calculation verification
    if contains_calculations(response):
        calc_errors = verify_calculations(response)
        if calc_errors:
            issues.append({
                'type': 'calculation_error',
                'severity': 'high',
                'details': calc_errors
            })
    
    # 3. Fact grounding check
    claims = extract_claims(response)
    for claim in claims:
        if not is_grounded_in_context(claim, context):
            issues.append({
                'type': 'ungrounded_claim',
                'severity': 'medium',
                'claim': claim
            })
    
    return ReasoningResult(
        passed=len(issues) == 0,
        issues=issues,
        confidence=calculate_confidence(issues)
    )

Meso-Reasoning (Session-Level):

Scope: Current conversation session
Purpose: Maintain consistency across multiple turns
Method: Track session state and check for drift

class SessionReasoningTracker:
    def __init__(self):
        self.session_state = {
            'established_facts': [],  # Facts user has told us
            'decisions_made': [],     # Decisions we've helped with
            'contradictions': []      # Detected inconsistencies
        }
    
    def check_consistency(self, new_response: str) -> bool:
        """
        Check if new response contradicts session history.
        """
        new_facts = extract_facts(new_response)
        
        for new_fact in new_facts:
            for old_fact in self.session_state['established_facts']:
                if contradicts(new_fact, old_fact):
                    self.session_state['contradictions'].append({
                        'new': new_fact,
                        'old': old_fact,
                        'timestamp': now()
                    })
                    return False
        
        # No contradictions found
        self.session_state['established_facts'].extend(new_facts)
        return True

Macro-Reasoning (Life Goals Alignment):

Scope: User's long-term goals and values
Purpose: Ensure advice aligns with user's life direction
Method: Check against stored user values and goals

def macro_reasoning_check(advice: str, user_profile: UserProfile) -> MacroReasoningResult:
    """
    Check if advice aligns with user's life goals and values.
    """
    # Load user's long-term goals and values
    goals = user_profile.long_term_goals
    values = user_profile.core_values
    red_lines = user_profile.red_lines  # Things user will never do
    
    # Analyze advice
    advice_implications = analyze_implications(advice)
    
    conflicts = []
    
    # Check against goals
    for goal in goals:
        if undermines(advice_implications, goal):
            conflicts.append({
                'type': 'goal_conflict',
                'goal': goal,
                'reason': explain_conflict(advice, goal)
            })
    
    # Check against values
    for value in values:
        if violates(advice_implications, value):
            conflicts.append({
                'type': 'value_violation',
                'value': value,
                'reason': explain_violation(advice, value)
            })
    
    # Check against red lines (absolute no-goes)
    for red_line in red_lines:
        if crosses(advice_implications, red_line):
            conflicts.append({
                'type': 'red_line_crossed',
                'red_line': red_line,
                'severity': 'critical'
            })
    
    return MacroReasoningResult(
        aligned=len(conflicts) == 0,
        conflicts=conflicts,
        recommendation='revise' if conflicts else 'proceed'
    )

Validation:

Micro: Test with responses containing planted errors (> 80% caught)
Meso: Multi-turn conversations with intentional contradictions
Macro: Test against mock user profiles with known goals/values

Traces To: REQ-AI-030 (System SRS)

FRD-AI-CRK-002 [P0]
System shall implement evidence tagging and source attribution.

Functional Specification:

Evidence Types:

User Data - Information provided by user in current or past conversations
RAG Retrieved - Information from user's documents
Model Knowledge - Information from LLM's training data
Web Search - Information from optional web search (if enabled)
Tool Output - Results from tool calls (calculator, weather, etc.)
Inference - Logical conclusion drawn by AI

Tagging Format:

from dataclasses import dataclass
from enum import Enum

# SourceType is defined first because EvidenceTag refers to it
class SourceType(Enum):
    USER_DATA = "user_data"
    RAG = "rag"
    MODEL = "model_knowledge"
    WEB_SEARCH = "web"
    TOOL = "tool"
    INFERENCE = "inference"

@dataclass
class EvidenceTag:
    claim: str               # The factual claim
    source_type: SourceType  # USER_DATA, RAG, MODEL, WEB_SEARCH, TOOL, INFERENCE
    source_reference: str    # Specific source (doc ID, URL, conversation turn)
    confidence: float        # 0.0-1.0
    verified: bool           # Whether claim has been cross-checked

Response Annotation:

def generate_response_with_evidence(query: str, context: dict) -> AnnotatedResponse:
    # Generate base response
    response_text = llm.generate(query, context)
    
    # Extract claims
    claims = extract_factual_claims(response_text)
    
    # Tag each claim with evidence
    evidence_tags = []
    for claim in claims:
        tag = determine_evidence_source(claim, context)
        evidence_tags.append(tag)
    
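    # Calculate overall confidence: the weakest claim sets the ceiling.
    # default=1.0 covers responses with no factual claims (min() of an empty sequence raises).
    overall_confidence = min((tag.confidence for tag in evidence_tags), default=1.0)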
    
    # Flag low-confidence claims
    low_confidence_claims = [
        tag for tag in evidence_tags 
        if tag.confidence < 0.7
    ]
    
    return AnnotatedResponse(
        text=response_text,
        evidence_tags=evidence_tags,
        overall_confidence=overall_confidence,
        low_confidence_claims=low_confidence_claims
    )

User Display:

High confidence (> 0.9): No hedging, confident tone
Medium confidence (0.7-0.9): Slight hedging ("It looks like...", "Based on...")
Low confidence (< 0.7): Explicit uncertainty ("I'm not certain, but...", "I don't have reliable information on this")
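
The display bands above can be sketched as a small lookup. This is an illustrative sketch only: the function name hedging_style and the returned labels are assumptions, not part of this FRD. The thresholds come from the list above, with exactly 0.9 falling into the medium band because the high band is strictly greater than 0.9.

```python
def hedging_style(confidence: float) -> str:
    """Map a confidence score (0.0-1.0) to the display band listed above."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be within 0.0-1.0")
    if confidence > 0.9:
        return "confident"             # No hedging
    if confidence >= 0.7:
        return "slight_hedge"          # "It looks like...", "Based on..."
    return "explicit_uncertainty"      # "I'm not certain, but..."
```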

Validation:

Test evidence tagging accuracy on labeled dataset
Verify confidence calibration (predicted confidence matches actual accuracy)
User studies: Can users identify certainty levels correctly?
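
One possible way to check confidence calibration is expected calibration error (ECE): group labeled claims by predicted confidence and compare each group's mean confidence with its observed accuracy. The sketch below is an assumption about how the check could be run; the function name, the bin count, and the sample format are illustrative and not specified by this FRD.

```python
from typing import List, Tuple

def expected_calibration_error(
    samples: List[Tuple[float, bool]], n_bins: int = 10
) -> float:
    """samples holds (predicted_confidence, claim_was_correct) pairs."""
    if not samples:
        raise ValueError("need at least one labeled sample")
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for confidence, correct in samples:
        # A confidence of exactly 1.0 goes into the last bin
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append((confidence, correct))
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_confidence = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        # Weight each bin's gap by the share of samples it holds
        ece += (len(bucket) / len(samples)) * abs(mean_confidence - accuracy)
    return ece
```

A value near 0.0 means predicted confidence tracks actual accuracy; a large value means the tags are over- or under-confident.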

Traces To: REQ-AI-031 (System SRS)

FRD-AI-CRK-003 [P0]
System shall implement PFC (Prefrontal Cortex) Load Estimator.

Functional Specification:

Purpose: Estimate cognitive load of a task to prevent user overwhelm.

Cognitive Load Factors:

Factor	Weight	Description
Step Count	0.3	Number of sequential steps required
Deadline Pressure	0.2	Urgency of task
Emotional Stakes	0.25	How much user cares about outcome
Ambiguity	0.15	Clarity of requirements
Novelty	0.1	Familiarity with task type

Load Calculation:

def estimate_cognitive_load(task: Task, user_state: UserState) -> CognitiveLoadScore:
    # Factor 1: Step count
    steps = decompose_task(task)
    step_load = min(len(steps) / 10.0, 1.0)  # Normalize to 0-1
    
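    # Factor 2: Deadline pressure
    if task.deadline:
        # timedelta has no total_hours(); convert from seconds
        hours_until = (task.deadline - now()).total_seconds() / 3600.0
        # 48+ hours away = relaxed (0.0); due now or overdue = 1.0
        deadline_load = min(1.0, max(0.0, 1.0 - hours_until / 48.0))
    else:
        deadline_load = 0.0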
    
    # Factor 3: Emotional stakes
    emotional_load = user_state.emotional_intensity * task.importance
    
    # Factor 4: Ambiguity
    ambiguity_load = 1.0 - task.clarity_score
    
    # Factor 5: Novelty
    novelty_load = 1.0 - user_state.familiarity_with(task.category)
    
    # Weighted sum
    total_load = (
        step_load * 0.3 +
        deadline_load * 0.2 +
        emotional_load * 0.25 +
        ambiguity_load * 0.15 +
        novelty_load * 0.1
    )
    
    return CognitiveLoadScore(
        total=total_load,
        breakdown={
            'steps': step_load,
            'deadline': deadline_load,
            'emotional': emotional_load,
            'ambiguity': ambiguity_load,
            'novelty': novelty_load
        },
        overload_risk=categorize_risk(total_load)
    )

def categorize_risk(load: float) -> str:
    if load < 0.3:
        return "low"
    elif load < 0.7:
        return "medium"
    else:
        return "high"

Response Adaptation Based on Load:

Low Load (< 0.3):

Present task as-is
Normal pacing
Standard detail level

Medium Load (0.3-0.7):

Break into phases
Suggest starting with easy wins
Offer encouragement

High Load (≥ 0.7):

Mandatory task decomposition
Smallest possible first step (< 5 minutes)
Emphasis on progress, not perfection
Offer to spread across multiple days
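
The mapping from load score to adaptations can be sketched as a lookup keyed by risk band. This is a minimal sketch: the names ADAPTATIONS and select_adaptations are hypothetical, and the thresholds mirror categorize_risk (a score of exactly 0.7 is treated as high).

```python
from typing import List

ADAPTATIONS = {
    "low": ["present task as-is", "normal pacing", "standard detail level"],
    "medium": ["break into phases", "suggest easy wins first", "offer encouragement"],
    "high": [
        "mandatory task decomposition",
        "smallest possible first step (< 5 minutes)",
        "emphasize progress, not perfection",
        "offer to spread across multiple days",
    ],
}

def select_adaptations(total_load: float) -> List[str]:
    # Same thresholds as categorize_risk: < 0.3 low, < 0.7 medium, else high
    if total_load < 0.3:
        band = "low"
    elif total_load < 0.7:
        band = "medium"
    else:
        band = "high"
    return ADAPTATIONS[band]
```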

Example:

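# User: "I need to do my taxes and I'm freaking out."

load = estimate_cognitive_load(
    Task(
        description="Do taxes",
        steps=["Gather documents", "Enter data", "Review", "Submit"],
        deadline=datetime.now() + timedelta(days=3),
        importance=0.9,
        clarity_score=0.4  # User unclear on process
    ),
    user_state=UserState(
        emotional_intensity=0.8,     # "freaking out"
        familiarity={"taxes": 0.3}   # Not familiar (hypothetical field read by familiarity_with)
    )
)

# Breakdown (assuming 4 steps): steps 0.40, deadline 0.00 (72 h is beyond the 48 h window),
#   emotional 0.72, ambiguity 0.60, novelty 0.70
# Result: load.total = 0.3*0.40 + 0.2*0.00 + 0.25*0.72 + 0.15*0.60 + 0.1*0.70 = 0.46 (MEDIUM)
# System response: Apply medium-load adaptations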

Validation:

Correlate load scores with user-reported difficulty
Test task decomposition quality
User studies: Does adaptation improve completion rates?

Traces To: REQ-AI-031, REQ-AI-050 (System SRS)

5.2 Self-Critique and Verification

FRD-AI-CRK-010 [P1]
System shall implement self-critique for high-stakes responses.

Functional Specification:

Trigger Conditions:

Financial advice (money, investments)
Health/medical information
Legal advice or documents
Safety-critical decisions
User explicitly requests "double-check this"

Critique Process:

def self_critique(response: str, query: str, context: dict) -> CritiqueResult:
    """
    Generate critique of initial response.
    Returns identified issues and suggestions.
    """
    # Generate critique prompt
    critique_prompt = f"""
    Original query: {query}
    Generated response: {response}
    
    Critically analyze this response. Identify:
    1. Unsupported claims (no evidence)
    2. Logical errors or contradictions
    3. Missing important caveats or warnings
    4. Potential misinterpretations of the query
    5. Alternative perspectives not considered
    
    Be thorough and skeptical.
    """
    
    # Run critique with separate LLM call
    critique_text = llm.generate(critique_prompt)
    
    # Parse critique
    issues = parse_critique_issues(critique_text)
    
    # Severity assessment
    critical_issues = [i for i in issues if i.severity == 'high']
    
    if critical_issues:
        # Response needs revision
        return CritiqueResult(
            passed=False,
            issues=critical_issues,
            recommendation='revise_response'
        )
    else:
        # Response acceptable (maybe minor improvements)
        return CritiqueResult(
            passed=True,
            issues=issues,
            recommendation='proceed_with_caveats'
        )

Revision Process: If critique finds issues, automatically revise:

def revise_response(original_response: str, critique: CritiqueResult) -> str:
    revision_prompt = f"""
    Original response: {original_response}
    
    Issues identified:
    {format_issues(critique.issues)}
    
    Generate a revised response that addresses these issues.
    Add appropriate caveats and warnings.
    """
    
    revised_response = llm.generate(revision_prompt)
    return revised_response
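

The critique and revision steps above can be combined into a bounded loop, so that a revision is itself re-checked and the process cannot cycle indefinitely. This is a sketch under stated assumptions: critique_and_revise, max_rounds, and the stub callables are hypothetical names, and the real callables would wrap the LLM calls shown above.

```python
from typing import Callable, List, Tuple

def critique_and_revise(
    response: str,
    find_critical_issues: Callable[[str], List[str]],
    revise: Callable[[str, List[str]], str],
    max_rounds: int = 2,
) -> Tuple[str, List[str]]:
    """Revise until no critical issues remain or the round budget runs out.

    Returns the final response plus any unresolved issues, so the caller
    can surface them to the user as caveats.
    """
    issues = find_critical_issues(response)
    rounds = 0
    while issues and rounds < max_rounds:
        response = revise(response, issues)
        issues = find_critical_issues(response)  # Re-check the revision
        rounds += 1
    return response, issues

# Usage with stand-in callables (hypothetical, no LLM involved)
def stub_critique(text: str) -> List[str]:
    return [] if "consult a professional" in text else ["missing caveat"]

def stub_revise(text: str, issues: List[str]) -> str:
    return text + " Please consult a professional."

final_text, unresolved = critique_and_revise(
    "Index funds are a common choice.", stub_critique, stub_revise
)
```

Capping the rounds keeps latency predictable on-device; anything still unresolved is returned rather than silently dropped.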

User Display:

If critique passed: Show response normally
If critique found issues: Show revised response with disclaimer
- "I've double-checked this response. Please note: [caveats]"
- "I'm not certain about [specific claim]. You may want to verify this."

Validation:

Plant errors in responses; verify critique catches > 80% of them
Measure false positive rate (flagging correct responses)
User studies: Does self-critique improve trust and accuracy?

Traces To: REQ-AI-032 (System SRS)

6. Emotional Engine

6.1 Emotional State Tracking

FRD-AI-EMO-001 [P0]
System shall track user emotional state in 3D affective space.

Functional Specification:

Affective Dimensions:

Valence: Negative (-1.0) to Positive (+1.0)
Arousal: Low (0.0) to High (1.0)
Control: No Control (0.0) to Full Control (1.0)

State Representation:

class EmotionalState:
    valence: float  # -1.0 to +1.0
    arousal: float  # 0.0 to 1.0
    control: float  # 0.0 to 1.0
    discrete_state: str  # Named state for easier reference
    confidence: float  # How certain we are about this state
    last_updated: datetime
    
    def as_discrete_state(self) -> str:
        """
        Map continuous dimensions to discrete emotional states.
        """
        if self.valence > 0.5 and self.arousal < 0.4:
            return "calm_content"
        elif self.valence > 0.5 and self.arousal > 0.6:
            return "excited_energized"
        elif self.valence < -0.3 and self.arousal > 0.6:
            return "anxious_stressed"
        elif self.valence < -0.3 and self.arousal < 0.4:
            return "sad_withdrawn"
        elif self.control < 0.3:
            return "overwhelmed_stuck"
        elif self.arousal < 0.2:
            return "tired_foggy"
        else:
            return "neutral"

State Detection Inputs:

Voice Prosody:
- Pitch variation (high variance → excited/anxious)
- Speech rate (fast → energized/anxious, slow → calm/depressed)
- Volume (loud → aroused, quiet → withdrawn)
- Voice quality (shaky → anxious, flat → depressed)
Word Choice:
- Negative words ("can't", "hate", "terrible") → Low valence
- Uncertainty words ("maybe", "I don't know") → Low control
- Swearing/strong language → High arousal
- Passive language ("it is", "things are") → Low control
Behavior Patterns:
- Task avoidance → Low control, possibly anxious
- Rapid task switching → High arousal, low control
- Long pauses → Low arousal or cognitive load
- Repeated failed attempts → Frustration, low control
Physiological (if sensors available):
- Heart rate (high → aroused)
- HRV (low → stressed)
- Skin temperature (high → aroused/anxious)

State Update Algorithm:

class EmotionalTracker:
    def __init__(self):
        self.state = EmotionalState(
            valence=0.0,
            arousal=0.5,
            control=0.7,
            discrete_state="neutral",
            confidence=0.5,
            last_updated=now()
        )
    
    def update_from_voice(self, audio_features: dict):
        # Extract emotional cues from voice
        pitch_var = audio_features['pitch_variance']
        speech_rate = audio_features['speech_rate']
        volume = audio_features['volume']
        
        # Update arousal: speech_rate is in words/min, pitch_var is normalized (0-1)
        if speech_rate > 150 or pitch_var > 0.3:
            self.state.arousal += 0.1
        elif speech_rate < 100:
            self.state.arousal -= 0.1
        
        # Clamp to valid range
        self.state.arousal = clamp(self.state.arousal, 0.0, 1.0)
        self.state.confidence = 0.6  # Moderate confidence from voice alone
    
    def update_from_text(self, text: str):
        # Sentiment analysis for valence
        sentiment = analyze_sentiment(text)
        self.state.valence = 0.7 * self.state.valence + 0.3 * sentiment
        
        # Detect control/agency language
        control_score = detect_control_language(text)
        self.state.control = 0.7 * self.state.control + 0.3 * control_score
        
        self.state.confidence = 0.7  # Good confidence from language
    
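    def update_from_behavior(self, behavior: str):
        # Behavior patterns shift arousal and control
        if behavior == "task_avoidance":
            self.state.control -= 0.2
            self.state.arousal += 0.1
        elif behavior == "rapid_switching":
            self.state.arousal += 0.2
            self.state.control -= 0.15
        
        # Clamp to valid ranges, as in update_from_voice
        self.state.arousal = clamp(self.state.arousal, 0.0, 1.0)
        self.state.control = clamp(self.state.control, 0.0, 1.0)
        
        self.state.confidence = 0.5  # Moderate confidence from behavior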
    
    def get_state(self) -> EmotionalState:
        # Update discrete state label
        self.state.discrete_state = self.state.as_discrete_state()
        self.state.last_updated = now()
        return self.state

Validation:

Human raters label emotional states from conversation samples
Compare system predictions to human labels (target: 75% agreement)
Track state changes over time, verify smooth transitions
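
The "smooth transitions" check can be illustrated with the 70/30 blend that update_from_text applies to valence and control. The sketch below is illustrative only: smooth and max_step are hypothetical helper names. It shows that no single update moves the state by more than 30% of the gap between the old value and the new observation.

```python
def smooth(previous: float, observation: float, alpha: float = 0.3) -> float:
    """Exponential moving average: keep (1 - alpha) of the old value."""
    return (1.0 - alpha) * previous + alpha * observation

def max_step(values):
    """Largest change between consecutive values in a trajectory."""
    return max(abs(b - a) for a, b in zip(values, values[1:]))

# Valence starts neutral, then five strongly negative readings arrive
trajectory = [0.0]
for sentiment in [-1.0] * 5:
    trajectory.append(smooth(trajectory[-1], sentiment))
```

A validation run could assert an upper bound on max_step over recorded sessions to flag abrupt jumps.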

Traces To: REQ-AI-040 (System SRS)

FRD-AI-EMO-002 [P0]
System shall maintain personalized Trigger Bank for each user.

Functional Specification:

Trigger Types:

Overload Triggers - Situations that overwhelm the user's PFC
Avoidance Triggers - Things user tends to procrastinate on
Fear/Shame Triggers - Topics that cause anxiety or shame
Activation Triggers - Things that energize and motivate
Soothing Triggers - Language/approaches that calm the user

Data Model:

class Trigger:
    id: str
    type: TriggerType  # OVERLOAD, AVOIDANCE, FEAR, ACTIVATION, SOOTHING
    pattern: str       # What to look for (keywords, context)
    strength: float    # 0.0-1.0, how strong this trigger is
    examples: List[str]  # Historical examples
    created_at: datetime
    last_observed: datetime
    observation_count: int
    
class TriggerBank:
    triggers: Dict[str, Trigger]
    
    def add_trigger(self, trigger: Trigger):
        self.triggers[trigger.id] = trigger
    
    def match_triggers(self, text: str, context: dict) -> List[Trigger]:
        matched = []
        for trigger in self.triggers.values():
            if trigger_matches(trigger, text, context):
                matched.append(trigger)
        return matched
    
    def update_trigger_strength(self, trigger_id: str, outcome: str):
        """
        Update trigger strength based on observed outcome.
        
        Args:
            trigger_id: Trigger to update
            outcome: 'success', 'partial', 'failure', 'shutdown'
        """
        trigger = self.triggers[trigger_id]
        
        if outcome == 'shutdown':  # Confirmed shutdown
            trigger.strength = min(1.0, trigger.strength + 0.1)
        elif outcome == 'success':  # User pushed through
            trigger.strength = max(0.0, trigger.strength - 0.05)
        
        trigger.observation_count += 1
        trigger.last_observed = now()

Example Triggers:

# Overload trigger example
trigger_taxes = Trigger(
    id="overload_taxes",
    type=TriggerType.OVERLOAD,
    pattern="tax|taxes|taxation",
    strength=0.8,
    examples=[
        "User said 'I need to do my taxes' and then avoided for 3 days",
        "User showed anxiety when discussing tax deadlines"
    ]
)

# Avoidance trigger example
trigger_phone_calls = Trigger(
    id="avoidance_phone_calls",
    type=TriggerType.AVOIDANCE,
    pattern="call|phone|speak to|contact",
    strength=0.6,
    examples=[
        "User repeatedly postponed calling landlord",
        "User asked to draft email instead of making phone call"
    ]
)

# Activation trigger example
trigger_music_production = Trigger(
    id="activation_music",
    type=TriggerType.ACTIVATION,
    pattern="music|beat|song|produce",
    strength=0.9,
    examples=[
        "User becomes energized when discussing music projects",
        "User stayed focused for 3 hours working on beats"
    ]
)

Trigger Learning:

Initialized with default triggers (common stressors)
Learns from observations over time
Strength adjusted based on user's actual responses
Can be manually edited by user ("This doesn't stress me anymore")
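
The data model leaves trigger_matches unspecified. One possible sketch treats each pattern as a set of regex alternatives and matches on whole words; pattern_matches and match_patterns are hypothetical names, and the context argument is omitted here.

```python
import re
from typing import Dict, List

def pattern_matches(pattern: str, text: str) -> bool:
    """True if any alternative in pattern appears as a whole word in text."""
    # \b anchors keep "call" from matching inside "recall" or "locally"
    regex = re.compile(r"\b(?:" + pattern + r")\b", re.IGNORECASE)
    return regex.search(text) is not None

def match_patterns(patterns: Dict[str, str], text: str) -> List[str]:
    """Return the IDs of all triggers whose pattern matches the text."""
    return [tid for tid, pat in patterns.items() if pattern_matches(pat, text)]

patterns = {
    "overload_taxes": "tax|taxes|taxation",
    "avoidance_phone_calls": "call|phone|speak to|contact",
}
```

Whole-word matching trades recall for precision: "calling" would not match "call" unless the pattern lists it, so patterns may need inflected forms or a stemming step.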

Validation:

Test trigger matching accuracy on example conversations
Verify strength adjustments improve response appropriateness
User satisfaction surveys on AI responsiveness

Traces To: REQ-AI-041 (System SRS)

FRD-AI-EMO-003 [P0]
System shall implement adaptive tone based on emotional state.

Functional Specification:

Tone Dimensions:

Formality: Casual to Formal
Empathy: Matter-of-fact to Highly Empathetic
Directness: Indirect/Soft to Direct/Firm
Verbosity: Terse to Detailed
Encouragement: Neutral to Highly Encouraging

State-to-Tone Mapping:

Emotional State	Formality	Empathy	Directness	Verbosity	Encouragement
Calm/Content	Casual	Moderate	Direct	Medium	Low
Excited/Energized	Casual	Low	Direct	Terse	Moderate
Anxious/Stressed	Casual	High	Gentle	Short	High
Overwhelmed	Casual	Very High	Very Gentle	Minimal	Very High
Sad/Withdrawn	Casual	High	Gentle	Short	High
Focused	Formal	Low	Direct	Terse	None
Tired/Foggy	Casual	Moderate	Gentle	Minimal	Moderate
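
The table uses labels, while adapt_tone compares numeric values against thresholds such as 0.7 and 0.3. One way to bridge the two is sketched below. All numeric encodings are assumptions chosen only to be consistent with those thresholds, tone_for is a hypothetical stand-in for map_state_to_tone, and only three rows are encoded. "Terse" is left out because its position relative to "Short" is not specified.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToneParams:
    empathy: float
    directness: float
    verbosity: float
    encouragement: float

# Assumed numeric encodings of the table's labels (not specified by this FRD)
EMPATHY = {"Low": 0.2, "Moderate": 0.5, "High": 0.8, "Very High": 1.0}
DIRECTNESS = {"Very Gentle": 0.1, "Gentle": 0.4, "Direct": 0.8}
VERBOSITY = {"Minimal": 0.1, "Short": 0.4, "Medium": 0.6}
ENCOURAGEMENT = {"None": 0.0, "Low": 0.2, "Moderate": 0.5, "High": 0.8, "Very High": 1.0}

# Rows copied from the table, keyed by the names as_discrete_state() returns
TONE_TABLE = {
    "calm_content": ("Moderate", "Direct", "Medium", "Low"),
    "anxious_stressed": ("High", "Gentle", "Short", "High"),
    "overwhelmed_stuck": ("Very High", "Very Gentle", "Minimal", "Very High"),
}

def tone_for(discrete_state: str) -> ToneParams:
    # Unknown states fall back to the calm/content row (an assumption)
    empathy, directness, verbosity, encouragement = TONE_TABLE.get(
        discrete_state, TONE_TABLE["calm_content"]
    )
    return ToneParams(
        EMPATHY[empathy],
        DIRECTNESS[directness],
        VERBOSITY[verbosity],
        ENCOURAGEMENT[encouragement],
    )
```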

Implementation:

def adapt_tone(response: str, emotional_state: EmotionalState) -> str:
    """
    Adjust response tone based on user's emotional state.
    """
    # Get tone parameters
    tone_params = map_state_to_tone(emotional_state)
    
    # Apply tone adjustments
    adjusted = response
    
    # Adjust empathy
    if tone_params.empathy > 0.7:
        adjusted = add_empathy_markers(adjusted)
        # "I understand this is difficult..."
        # "It's okay to feel overwhelmed..."
    
    # Adjust directness
    if tone_params.directness < 0.3:  # Very gentle
        adjusted = soften_language(adjusted)
        # "might want to" instead of "should"
        # "could try" instead of "do"
    
    # Adjust verbosity
    if tone_params.verbosity < 0.3:  # Minimal
        adjusted = shorten_response(adjusted, target_sentences=2)
    
    # Adjust encouragement
    if tone_params.encouragement > 0.7:
        adjusted = add_encouragement(adjusted)
        # "You've got this"
        # "Small progress is still progress"
    
    return adjusted

Example Transformations:

Original: "To complete your tax return, you need to gather these documents: W-2, 1099s, deduction receipts, and mortgage interest statements. Then..."

For Anxious User: "I know taxes feel overwhelming. Let's take this one tiny step at a time. First, just find your W-2 - that's it. Nothing else right now. Once you have that, we'll move to the next small step."

For Focused User: "Tax documents needed: W-2, 1099s, deductions, mortgage interest. Gather these first, then we'll proceed with entry."

Validation:

A/B testing: Users rate response appropriateness
Test tone adaptation on labeled emotional states
User satisfaction metrics for emotional intelligence

Traces To: REQ-AI-042 (System SRS)

7. Executive Function Framework (EFF)

7.1 Task Decomposition

FRD-AI-EFF-001 [P0]
System shall decompose complex tasks into micro-steps.

Functional Specification:

Decomposition Criteria:

Each step should take < 5 minutes
Steps should be atomic (no "and" statements)
Steps should be concrete (no vague language)
Steps should be ordered sequentially
Dependencies should be explicit
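
The first three criteria lend themselves to a simple automated check. The sketch below is one possible heuristic, not the specified validate_step: check_step, StepCheck, and the VAGUE_WORDS list are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List

@dataclass
class StepCheck:
    valid: bool
    problems: List[str] = field(default_factory=list)

VAGUE_WORDS = {"somehow", "stuff", "things", "etc", "whatever"}  # Assumed list

def check_step(description: str, estimated_minutes: float) -> StepCheck:
    problems = []
    if estimated_minutes >= 5:
        problems.append("takes 5 minutes or longer")
    words = set(re.findall(r"[a-z']+", description.lower()))
    if "and" in words:
        problems.append("combines actions with 'and'")
    if words & VAGUE_WORDS:
        problems.append("uses vague language")
    return StepCheck(valid=not problems, problems=problems)
```

A word-level test for "and" is crude: it would also flag a single action such as "Copy job title and company name to notes", so a production check would likely need to look at verbs rather than conjunctions.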

Decomposition Algorithm:

def decompose_task(task: str, context: dict) -> List[MicroStep]:
    """
    Break complex task into micro-steps.
    Returns ordered list of concrete, time-bounded actions.
    """
    # Generate decomposition prompt
    decomposition_prompt = f"""
    Task: {task}
    Context: {context}
    
    Break this into the smallest possible steps.
    Each step must be:
    - Completable in under 5 minutes
    - One single action (no "and")
    - Concrete and specific
    
    Format each step as:
    1. [Action] (estimated time)
    
    Example:
    Task: "Write project proposal"
    Steps:
    1. Open new document (30 seconds)
    2. Write project title at top (1 min)
    3. Write one-sentence project summary (2 min)
    4. List 3 main goals in bullet points (3 min)
    5. ... (continue)
    
    Now decompose the task above.
    """
    
    # Generate decomposition
    decomposition = llm.generate(decomposition_prompt)
    
    # Parse into structured steps
    steps = parse_steps(decomposition)
    
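    # Validate steps; re-decompose any step that is still too complex.
    # A depth counter carried in `context` bounds the recursion so that a step
    # which never validates cannot recurse forever (the cap of 3 is an assumed value).
    depth = context.get('decomposition_depth', 0)
    validated_steps = []
    for step in steps:
        if validate_step(step) or depth >= 3:
            validated_steps.append(step)
        else:
            # Step too complex, further decompose
            sub_context = {**context, 'decomposition_depth': depth + 1}
            sub_steps = decompose_task(step.description, sub_context)
            validated_steps.extend(sub_steps)
    
    return validated_steps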

class MicroStep:
    description: str
    estimated_time_minutes: int
    dependencies: List[int]  # Indices of previous steps required
    optional: bool
    difficulty: float  # 0.0-1.0

Step Presentation:

Show only current step, not entire list (reduces overwhelm)
After completing step, brief celebration + show next step
Track progress visually (e.g., "3/12 steps done")

Example:

Input: "I need to apply for a job"

Output:

Step 1: Open job listing page (30 seconds)
Step 2: Read job description once through (2 minutes)
Step 3: Copy job title and company name to notes (1 minute)
Step 4: List 3 skills from job description (2 minutes)
Step 5: Open resume file (30 seconds)
... (continue)

Validation:

Test decomposition on various complex tasks
User testing: Can naive users follow steps successfully?
Measure task completion rate with vs. without decomposition

Traces To: REQ-AI-050 (System SRS)

FRD-AI-EFF-002 [P0]
System shall implement prioritization engine for task ordering.

Functional Specification:

Prioritization Factors:

Factor	Weight	Description
Deadline Urgency	0.35	How soon is deadline?
Importance	0.25	Impact on goals/life
Cognitive Load	0.20	Mental effort required
Dependencies	0.15	Blocking other tasks?
Emotional State	0.05	User's current state

Priority Calculation:

def calculate_priority(task: Task, user_state: UserState) -> float:
    """
    Calculate task priority (0.0-1.0, higher = more urgent).
    """
    # Factor 1: Deadline urgency
    if task.deadline:
        # timedelta has no total_hours(); convert from seconds
        hours_until = (task.deadline - now()).total_seconds() / 3600.0
        if hours_until < 24:
            urgency_score = 1.0
        elif hours_until < 72:
            urgency_score = 0.7
        elif hours_until < 168:  # 1 week
            urgency_score = 0.4
        else:
            urgency_score = 0.1
    else:
        urgency_score = 0.0
    
    # Factor 2: Importance (user-defined or inferred)
    importance_score = task.importance  # 0.0-1.0
    
    # Factor 3: Cognitive load (inverse - easier tasks prioritized when tired)
    load = estimate_cognitive_load(task, user_state)
    if user_state.energy_level < 0.4:  # User tired
        cognitive_score = 1.0 - load.total  # Prioritize easy tasks
    else:  # User energized
        cognitive_score = 0.5  # Neutral factor
    
    # Factor 4: Dependencies (blocks other tasks?)
    dependency_score = 0.0
    if task.blocks_other_tasks:
        dependency_score = 0.8
    
    # Factor 5: Emotional state match
    emotional_score = 0.5
    if user_state.discrete_state == "anxious_stressed" and task.requires_calm:
        emotional_score = 0.1  # Deprioritize anxiety-inducing tasks
    elif user_state.discrete_state == "focused" and task.requires_focus:
        emotional_score = 0.9  # Prioritize focus-heavy tasks
    
    # Weighted sum
    priority = (
        urgency_score * 0.35 +
        importance_score * 0.25 +
        cognitive_score * 0.20 +
        dependency_score * 0.15 +
        emotional_score * 0.05
    )
    
    return priority

Task Ordering Strategies:

1. Energy-Based Ordering:

High energy: Tackle hard, important tasks first
Low energy: Start with easy wins to build momentum

2. Time-Based Ordering:

Deadline-driven: Sort by urgency
Time-available: Fit tasks into available time slots

3. Mood-Based Ordering:

Anxious: Start with calming, predictable tasks
Bored: Start with engaging, novel tasks
Focused: Tackle deep work

Recommendation Format:

def recommend_next_task(tasks: List[Task], user_state: UserState) -> TaskRecommendation:
    """
    Recommend best next task for user given current state.
    """
    # Calculate priorities
    task_priorities = [
        (task, calculate_priority(task, user_state))
        for task in tasks
    ]
    
    # Sort by priority
    task_priorities.sort(key=lambda x: x[1], reverse=True)
    
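    # Top recommendation (an empty task list has nothing to recommend)
    if not task_priorities:
        raise ValueError("no tasks to prioritize")
    best_task, best_priority = task_priorities[0]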
    
    # Alternative recommendations
    alternatives = task_priorities[1:4]  # Next 3
    
    return TaskRecommendation(
        primary=best_task,
        priority_score=best_priority,
        reasoning=explain_priority(best_task, user_state),
        alternatives=[t for t, _ in alternatives]
    )

Validation:

Test prioritization on diverse task lists
User studies: Does AI prioritization match user preferences?
Task completion rate with AI recommendations

Traces To: REQ-AI-050 (System SRS)

FRD-AI-EFF-003 [P0]
System shall implement gating to prevent harmful impulsive decisions.

Functional Specification:

Gating Triggers:

User Exhausted: Energy level < 0.2
Emotionally Unstable: High arousal + low valence + low control
High-Risk Decision: Financial, health, relationship, legal
Conflicting Goals: Decision contradicts stated goals
Unusual Behavior: Out-of-character request

Gating Actions:

Level 1: Soft Gate (Remind & Pause)

"This seems important. Want to sleep on it?"
"You mentioned wanting to save money - sure about this purchase?"
Continue if user confirms

Level 2: Hard Gate (Require Confirmation)

"This decision could have major consequences. Let's think through it."
Run mini-simulation showing outcomes
Require explicit "yes, proceed" confirmation

Level 3: Block (Refuse Action)

"I can't help with this right now - you're too stressed. Let's revisit when you're calmer."
Offer to set reminder for later
Log attempt for later review

Implementation:

def gate_decision(decision: Decision, user_state: UserState) -> GatingResult:
    """
    Determine if decision should be gated.
    """
    # Check gating triggers
    triggers = []
    
    # Trigger 1: User exhausted
    if user_state.energy_level < 0.2:
        triggers.append(GatingTrigger(
            type="exhaustion",
            severity="medium",
            message="You seem tired - major decisions are best made when rested"
        ))
    
    # Trigger 2: Emotionally unstable
    if (user_state.arousal > 0.7 and 
        user_state.valence < -0.3 and 
        user_state.control < 0.3):
        triggers.append(GatingTrigger(
            type="emotional_instability",
            severity="high",
            message="You seem overwhelmed - let's wait until you feel calmer"
        ))
    
    # Trigger 3: High-risk decision
    if decision.category in ["financial", "health", "relationship", "legal"]:
        risk_level = assess_risk(decision)
        if risk_level > 0.7:
            triggers.append(GatingTrigger(
                type="high_risk",
                severity="high",
                message="This decision has major consequences - let's think it through carefully"
            ))
    
    # Determine gating level
    if not triggers:
        return GatingResult(action="proceed", triggers=[])
    
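    # Severity labels are strings, so rank them explicitly; a plain max()
    # would compare alphabetically ("medium" > "low" > "high").
    severity_rank = {"low": 0, "medium": 1, "high": 2}
    max_severity = max((t.severity for t in triggers), key=severity_rank.__getitem__)
    
    if max_severity == "low":
        return GatingResult(action="soft_gate", triggers=triggers)
    elif max_severity == "medium":
        return GatingResult(action="hard_gate", triggers=triggers)
    else:  # high
        return GatingResult(action="block", triggers=triggers)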

User Override:

User can override gates (except critical safety ones)
Overrides are logged
Repeated overrides → adjust user profile (maybe they know what they're doing)

Example Scenarios:

Scenario 1: Late-night online shopping

User (11pm, aroused, tired): "Add this $500 gadget to cart"
System: *Soft Gate* "It's late and you seem tired. This is a significant purchase - want to wait until morning to decide?"
User: "Yeah, good call"
System: "I'll remind you tomorrow to reconsider if you're still interested"

Scenario 2: Angry email

User (angry, aroused): "Send this email to my boss: [angry rant]"
System: *Hard Gate* "I notice you're upset. This email could damage your relationship with your boss. Can we draft it but wait 30 minutes before sending?"
User: "Fine"
System: [Saves draft, sets 30-min timer]

Scenario 3: Critical medical decision

User (anxious, confused): "Should I stop taking my medication?"
System: *Block* "This is a medical decision I can't advise on. Please consult your doctor before making any changes to medication."

Validation:

Simulate test scenarios with gating triggers
Measure false positive rate (unnecessary gates)
User studies: Do users find gates helpful or annoying?

Traces To: REQ-AI-051 (System SRS)

7.2 Habit Formation

FRD-AI-EFF-010 [P1]
System shall support habit formation through micro-routines.

Functional Specification:

Habit Building Strategy:

Phase 1: Ultra-Small Start (Days 1-7)

Goal: 5 minutes daily
Focus: Consistency over performance
Reward: Celebration for showing up

Phase 2: Gradual Increase (Days 8-30)

Goal: Slowly increase difficulty
Add 1-2 minutes every 3 days
Focus: Sustainable growth

Phase 3: Integration (Days 31+)

Goal: Make it automatic
Link to existing routines
Reduce explicit reminders

Micro-Routine Structure:

class MicroRoutine:
    name: str
    description: str
    duration_minutes: int
    frequency: str  # "daily", "3x_week", etc.
    cue: str  # Trigger (time, event, location)
    steps: List[str]
    difficulty: float  # 0.0-1.0, adjusted over time
    streak: int  # Days in a row completed
    total_completions: int
    
class HabitBuilder:
    def create_habit(self, goal: str, user_state: UserState) -> MicroRoutine:
        """
        Create ultra-small starting routine for goal.
        """
        # Start absurdly small
        if "exercise" in goal.lower():
            return MicroRoutine(
                name="Morning Movement",
                description="Just 5 minutes of gentle movement",
                duration_minutes=5,
                frequency="daily",
                cue="right_after_waking_up",
                steps=[
                    "Put on comfortable clothes",
                    "Stand up and stretch arms overhead 3 times",
                    "Walk around room for 2 minutes",
                    "Celebrate with fist pump"
                ],
                difficulty=0.2,
                streak=0,
                total_completions=0
            )
        
        # Similar ultra-small starts for other goals
        ...
    
    def adapt_difficulty(self, routine: MicroRoutine, completion_rate: float):
        """
        Adjust difficulty based on user's adherence.
        """
        if completion_rate > 0.9 and routine.streak > 7:
            # User consistently succeeding - increase challenge
            routine.difficulty += 0.05
            routine.duration_minutes += 1
        elif completion_rate < 0.5:
            # User struggling - make easier
            routine.difficulty = max(0.1, routine.difficulty - 0.1)
            routine.duration_minutes = max(5, routine.duration_minutes - 1)

Hedonic Treadmill Management:

Users adapt to current difficulty
Slowly raise baseline to prevent boredom
Never jump too big (max 10% increase at a time)
Regression is okay - adjust down if life gets busy
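
The 10% ceiling can be made explicit when raising a routine's duration. This is a sketch under stated assumptions: next_duration is a hypothetical helper, and it allows fractional minutes even though MicroRoutine.duration_minutes is declared as an int.

```python
def next_duration(current_minutes: float, max_increase: float = 0.10) -> float:
    """Raise duration by one minute, but never by more than max_increase."""
    proposed = current_minutes + 1.0
    cap = current_minutes * (1.0 + max_increase)
    return min(proposed, cap)
```

For a 5-minute routine, a flat one-minute step would be a 20% jump, so the cap limits the increase to 5.5 minutes; at 20 minutes the one-minute step is already within the limit.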

Reward System:

def celebrate_completion(routine: MicroRoutine, completion: Completion):
    """
    Provide positive reinforcement.
    """
    messages = []
    
    # Base celebration
    messages.append("Great job! You did it! 🎉")
    
    # Streak milestones
    if routine.streak == 7:
        messages.append("That's a full week! You're building real momentum!")
    elif routine.streak == 30:
        messages.append("30 days! This is becoming a real habit!")
    elif routine.streak == 100:
        messages.append("100 days!! You're unstoppable!")
    
    # Progress recognition
    if routine.total_completions % 10 == 0:
        messages.append(f"That's {routine.total_completions} times you've shown up. Look at that consistency!")
    
    return messages

Validation:

30-day habit formation trials
Measure completion rate over time
Compare to control group (no AI assistance)

Traces To: REQ-AI-052 (System SRS)

8. Tool Calling and Skills System

8.1 Function Calling

FRD-AI-TOOL-001 [P0]
System shall implement function calling for tool execution.

Functional Specification:

Tool Registry:

class Tool:
    name: str
    description: str
    parameters: Dict[str, Parameter]
    returns: str
    safety_level: int  # 0=safe, 1=needs_confirmation, 2=sensitive
    
class Parameter:
    name: str
    type: str  # "string", "number", "boolean", "object", "array"
    required: bool
    description: str = ""  # Defaulted: several examples below omit it
    enum: Optional[List[str]] = None  # Optional: allowed values
    
# Example tools
tool_registry = [
    Tool(
        name="get_weather",
        description="Get current weather for a location",
        parameters={
            "location": Parameter(
                name="location",
                type="string",
                description="City name or zip code",
                required=True
            )
        },
        returns="Weather data including temperature, conditions, forecast",
        safety_level=0
    ),
    
    Tool(
        name="send_email",
        description="Send an email to recipient",
        parameters={
            "to": Parameter(name="to", type="string", required=True),
            "subject": Parameter(name="subject", type="string", required=True),
            "body": Parameter(name="body", type="string", required=True)
        },
        returns="Confirmation message",
        safety_level=1  # Requires confirmation
    ),
    
    Tool(
        name="create_calendar_event",
        description="Add event to user's calendar",
        parameters={
            "title": Parameter(name="title", type="string", required=True),
            "start_time": Parameter(name="start_time", type="string", required=True),
            "duration_minutes": Parameter(name="duration_minutes", type="number", required=False)
        },
        returns="Event confirmation",
        safety_level=0
    ),
]

Function Calling Flow:

Intent Detection: User request suggests tool use
Tool Selection: LLM selects appropriate tool from registry
Parameter Extraction: LLM extracts parameters from user input
Validation: Check parameters are valid and complete
Safety Check: If safety_level > 0, ask the user for confirmation
Execution: Call tool with parameters
Result Processing: Integrate tool output into response

Implementation:

def execute_tool_call(tool_name: str, parameters: dict, user_state: UserState) -> ToolResult:
    """
    Execute tool call with safety checks.
    """
    # Get tool from registry (tool_registry is a list, so look it up by name)
    tool = next((t for t in tool_registry if t.name == tool_name), None)
    if not tool:
        return ToolResult(success=False, error="Tool not found")
    
    # Validate parameters
    validation = validate_parameters(parameters, tool.parameters)
    if not validation.valid:
        return ToolResult(success=False, error=validation.error)
    
    # Safety check
    if tool.safety_level > 0:
        confirmation = get_user_confirmation(tool, parameters)
        if not confirmation.approved:
            return ToolResult(success=False, error="User declined", canceled=True)
    
    # Execute tool
    try:
        result = tool.execute(parameters)
        return ToolResult(success=True, data=result)
    except Exception as e:
        return ToolResult(success=False, error=str(e))
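
The implementation above calls validate_parameters without defining it. A minimal sketch of what such a check could look like follows; ParamSpec, ValidationResult, and the type mapping are assumptions made for illustration, not part of the specified API.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ParamSpec:
    """Hypothetical, trimmed-down stand-in for the Parameter class."""
    type: str  # "string", "number", "boolean", "object", "array"
    required: bool = True
    enum: Optional[List[str]] = None


@dataclass
class ValidationResult:
    valid: bool
    error: Optional[str] = None


# Assumed mapping from registry type names to Python types.
_TYPE_MAP = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate_parameters(values: Dict[str, Any],
                        specs: Dict[str, ParamSpec]) -> ValidationResult:
    """Check for unknown, missing, mistyped, and out-of-enum parameters."""
    for name in values:
        if name not in specs:
            return ValidationResult(False, f"Unknown parameter: {name}")

    for name, spec in specs.items():
        if name not in values:
            if spec.required:
                return ValidationResult(False, f"Missing required parameter: {name}")
            continue

        value = values[name]
        expected = _TYPE_MAP.get(spec.type)
        if expected is None:
            return ValidationResult(False, f"Unsupported type in spec: {spec.type}")
        # bool is a subclass of int, so reject it explicitly for "number".
        if spec.type == "number" and isinstance(value, bool):
            return ValidationResult(False, f"Wrong type for parameter: {name}")
        if not isinstance(value, expected):
            return ValidationResult(False, f"Wrong type for parameter: {name}")
        if spec.enum is not None and value not in spec.enum:
            return ValidationResult(False, f"Value not allowed for parameter: {name}")

    return ValidationResult(True)
```

Rejecting unknown parameters up front keeps a hallucinated argument from silently reaching a tool.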

LLM Function Calling Prompt:

Available tools:
{tool_registry_formatted}

User request: "{user_query}"

If this request requires a tool, respond with:
TOOL_CALL: {
  "tool": "tool_name",
  "parameters": {
    "param1": "value1",
    ...
  },
  "reasoning": "why this tool is needed"
}

If no tool is needed, respond normally.
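
The prompt above asks the model to emit a TOOL_CALL: marker followed by a JSON object. One way the response might be parsed is sketched below; parse_tool_call is a hypothetical helper, and the rule that "tool" must be a string and "parameters" an object is an assumption based on the template.

```python
import json
from typing import Optional

TOOL_CALL_MARKER = "TOOL_CALL:"


def parse_tool_call(llm_output: str) -> Optional[dict]:
    """Return the tool-call dict, or None if the output is a normal reply."""
    marker_pos = llm_output.find(TOOL_CALL_MARKER)
    if marker_pos == -1:
        return None

    brace_pos = llm_output.find("{", marker_pos + len(TOOL_CALL_MARKER))
    if brace_pos == -1:
        return None

    try:
        # raw_decode parses one JSON value starting at brace_pos and
        # ignores any trailing text the model may have appended.
        call, _end = json.JSONDecoder().raw_decode(llm_output, brace_pos)
    except json.JSONDecodeError:
        return None

    if not isinstance(call, dict):
        return None
    if not isinstance(call.get("tool"), str):
        return None
    if not isinstance(call.get("parameters"), dict):
        return None
    return call
```

Malformed or truncated JSON yields None here, so the caller can fall back to treating the output as a plain response or re-prompting.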

Validation:

Test tool selection accuracy on diverse queries
Verify parameter extraction correctness
Test safety gates for sensitive tools
Measure tool execution success rate

Traces To: [System SRS - Tool Calling], REQ-INT-010

FRD-AI-TOOL-002 [P0]
System shall implement simulate → confirm → execute pattern for critical tools.

Functional Specification:

Purpose: Prevent accidental irreversible actions.

Pattern:

Step 1: Simulate

Show user what WOULD happen
Don't actually execute
Display expected outcome

Step 2: Confirm

Ask explicit "Yes, do this" or "No, cancel"
Show summary again
Timeout after 30 seconds (default to cancel)

Step 3: Execute

Only if user confirms
Execute actual action
Show confirmation message

Example:

User: "Send email to boss saying I quit"

AI:

I can draft that email, but I want to make sure this is what you want.

Simulated email:
To: boss@company.com
Subject: Resignation
Body: I am resigning from my position...

This is a significant decision. Are you sure you want to send this?
- Yes, send it
- No, cancel (or let me revise)

Tools Requiring Confirmation:

send_email
delete_file
send_message
make_phone_call
post_to_social_media
financial_transaction
change_settings (if security-sensitive)

Implementation:

def execute_with_confirmation(tool: Tool, parameters: dict) -> ToolResult:
    """
    Simulate → Confirm → Execute pattern.
    """
    # Step 1: Simulate
    simulation = tool.simulate(parameters)
    
    # Step 2: Show to user and get confirmation
    confirmation_prompt = f"""
    I'm about to {tool.description}.
    
    Simulated result:
    {format_simulation(simulation)}
    
    This action {tool.consequences_description}.
    
    Confirm:
    - Say "yes" or "confirm" to proceed
    - Say "no" or "cancel" to stop
    - (Automatically cancels in 30 seconds if no response)
    """
    
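    # Assumed helper: present the prompt to the user via HUD or voice
    show_to_user(confirmation_prompt)
    
    # Returns None on timeout, which is treated as a cancel
    response = wait_for_user_response(timeout=30)
    
    if response is not None and response.strip().lower() in ["yes", "confirm", "do it", "proceed"]:
        # Step 3: Execute
        try:
            result = tool.execute(parameters)
        except Exception as e:
            return ToolResult(success=False, error=str(e))
        return ToolResult(success=True, data=result)
    else:
        return ToolResult(success=False, canceled=True, message="Action canceled")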
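
The "default to cancel" timeout can be sketched with a thread-safe queue that the speech or input layer fills. The names wait_for_response and is_confirmed are hypothetical, and delivering responses through a queue is an assumption about how input arrives.

```python
import queue
from typing import Optional

AFFIRMATIVE = {"yes", "confirm", "do it", "proceed"}


def wait_for_response(responses: "queue.Queue[str]",
                      timeout: float = 30.0) -> Optional[str]:
    """Block until a response arrives; return None if the timeout expires."""
    try:
        return responses.get(timeout=timeout)
    except queue.Empty:
        return None


def is_confirmed(response: Optional[str]) -> bool:
    """Only an exact affirmative phrase confirms; silence or anything else cancels."""
    if response is None:
        return False
    normalized = response.strip().lower().rstrip(".!")
    return normalized in AFFIRMATIVE
```

Matching whole phrases rather than substrings means an utterance such as "yes please cancel" is not read as approval.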

Validation:

Test confirmation flow for all critical tools
Verify timeout behaves correctly
User acceptance: Do confirmations feel appropriate or annoying?

Traces To: [System SRS - Tool Calling]

8.2 Skills System

FRD-AI-SKILL-001 [P0]
System shall support third-party skill installation and management.

Functional Specification:

Skill Structure:

/data/klyra/skills/{skill_id}/
    manifest.json       # Metadata and permissions
    handler.py          # Skill logic (Python)
    ui.hud              # Optional HUD interface definition
    assets/             # Images, icons, data files
    README.md           # Documentation

Manifest Format:

{
  "skill_id": "measure_tool",
  "name": "AR Measurement Tool",
  "version": "1.0.0",
  "author": "ThirdPartyDev",
  "description": "Measure real-world objects using AR and depth sensors",
  "category": "productivity",
  "permissions": [
    "camera",
    "tof_sensor",
    "lidar",
    "hud_display"
  ],
  "entry_point": "handler.py:MeasureSkill",
  "activation": {
    "voice_commands": ["measure", "how long is", "measure distance"],
    "gesture": "two_finger_tap"
  },
  "dependencies": {
    "python": ">=3.9",
    "libraries": ["numpy", "opencv-python"]
  }
}
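
Before installation, a manifest like the one above could be checked for required fields and recognized permissions. The sketch below is illustrative only: the required-field list and the permission set are assumptions taken from the example manifest, not a defined schema.

```python
import json
from typing import List

# Assumed for illustration; the real schema is not specified in this document.
REQUIRED_FIELDS = ("skill_id", "name", "version", "permissions", "entry_point")
KNOWN_PERMISSIONS = {"camera", "tof_sensor", "lidar", "hud_display"}


def validate_manifest(manifest_text: str) -> List[str]:
    """Return a list of problems; an empty list means the manifest passed."""
    try:
        manifest = json.loads(manifest_text)
    except json.JSONDecodeError as exc:
        return [f"Invalid JSON: {exc.msg}"]
    if not isinstance(manifest, dict):
        return ["Manifest must be a JSON object"]

    problems = []
    for field in REQUIRED_FIELDS:
        if field not in manifest:
            problems.append(f"Missing field: {field}")

    permissions = manifest.get("permissions", [])
    if not isinstance(permissions, list):
        problems.append("permissions must be a list")
        permissions = []
    for permission in permissions:
        if not isinstance(permission, str) or permission not in KNOWN_PERMISSIONS:
            problems.append(f"Unknown permission: {permission}")

    entry_point = manifest.get("entry_point", "")
    if isinstance(entry_point, str) and entry_point and entry_point.count(":") != 1:
        problems.append("entry_point must look like 'file.py:ClassName'")

    return problems
```

Returning every problem at once, rather than failing on the first, gives a skill developer a complete report in one pass.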

Skill Lifecycle:

Installation:
- User browses skill store
- Downloads skill package (.skillpkg file)
- Reviews permissions
- Confirms installation
- Skill extracted to /data/klyra/skills/

Activation:
- User triggers via voice command or gesture
- System launches skill handler
- Skill has access to granted permissions

Execution:
- Skill receives sensor data, context
- Processes data
- Returns results or updates HUD

Deactivation:
- User exits skill
- System cleanup

Update:
- Automatic update checks (daily)
- User approves updates
- Seamless replacement

Uninstallation:
- User removes skill
- All data deleted (except user data if requested to keep)

Skill API:

from klyra_sdk import Skill, HUD, Sensor

class MeasureSkill(Skill):
    def on_start(self):
        """Called when skill is activated."""
        self.hud.display("Point at object to measure")
        self.tof_sensor = Sensor.get("tof")
        self.camera = Sensor.get("camera")
    
    def on_sensor_data(self, sensor_name, data):
        """Called when sensor has new data."""
        if sensor_name == "tof":
            distance = data['distance']
            self.hud.display(f"Distance: {distance:.2f} meters")
    
    def on_voice_command(self, command):
        """Called when user speaks while skill active."""
        if "save" in command.lower():
            self.save_measurement()
    
    def on_stop(self):
        """Called when skill is deactivated."""
        self.cleanup()

Security Sandboxing:

Skills run in isolated process
Limited filesystem access (only skill's own directory + shared user data with permission)
Network access requires explicit permission
Cannot access other skills' data
CPU/memory limits enforced
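
The filesystem rule above (a skill may touch only its own directory) can be illustrated with a path check. This is a sketch under stated assumptions: is_path_allowed is a hypothetical helper, it handles only the skill's own directory, and it normalizes paths lexically without resolving symlinks, so a real sandbox would still need OS-level isolation.

```python
import posixpath

SKILLS_ROOT = "/data/klyra/skills"


def is_path_allowed(skill_id: str, requested_path: str) -> bool:
    """Allow access only to paths inside the skill's own directory."""
    # Reject skill IDs that could themselves escape the skills root.
    if "/" in skill_id or skill_id in ("", ".", ".."):
        return False

    skill_dir = posixpath.join(SKILLS_ROOT, skill_id)
    # Relative paths are interpreted relative to the skill directory;
    # an absolute requested_path replaces the base in posixpath.join.
    normalized = posixpath.normpath(posixpath.join(skill_dir, requested_path))
    return normalized == skill_dir or normalized.startswith(skill_dir + "/")
```

Comparing against skill_dir plus a trailing slash prevents a sibling directory with a shared prefix, such as measure_tool_evil, from passing the check.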

Validation:

Install and run test skills
Verify permission enforcement
Test skill updates
Security audit of sandboxing

Traces To: REQ-INT-010, [System SRS - Developer SDK]

9. Memory Architecture

9.1 Multi-Tier Memory

FRD-AI-MEM-001 [P0]
System shall implement five-tier memory architecture.

Functional Specification:

Tier 1: Short-Term Memory (Hours)

Storage: In-memory buffer (RAM)
Capacity: Last 50-100 conversation turns
Purpose: Active working memory for current conversation
Retention: Cleared after 24 hours of inactivity

Tier 2: Mid-Term Memory (Weeks)

Storage: SQLite database
Capacity: Recent tasks, reminders, temporary notes
Purpose: Ongoing projects and short-term context
Retention: 30 days, then archived or pruned

Tier 3: Long-Term Memory (Life History)

Storage: Encrypted SQLite + FAISS vector store
Capacity: User's life history, preferences, important moments
Purpose: Deep personal context, relationships, major events
Retention: Indefinite (user can delete)

Tier 4: Domain Stores (Specialized)

Storage: Separate vector stores per domain
Capacity: Domain-specific knowledge and documents
Purpose: Isolated storage for Finance, NDIS, Work, Health, etc.
Retention: Per-domain settings

Tier 5: Immutable Core (Identity)

Storage: Secure encrypted storage
Capacity: User identity, values, moral rules
Purpose: Core identity that never changes without explicit user action
Retention: Permanent (cannot be accidentally deleted)

Memory Promotion:

def promote_memory(item: MemoryItem, from_tier: int, to_tier: int):
    """
    Promote memory item to higher tier based on importance.
    """
    # Calculate importance score
    importance = calculate_importance(item)
    
    # Criteria for promotion
    if from_tier == 1 and to_tier == 2:
        # Short-term to mid-term
        if (importance > 0.6 or 
            item.user_marked_important or 
            item.reference_count > 3):
            move_to_tier2(item)
    
    elif from_tier == 2 and to_tier == 3:
        # Mid-term to long-term
        if (importance > 0.8 or 
            (item.age_days > 30 and item.reference_count > 5) or
            item.emotional_significance > 0.7):
            move_to_tier3(item)

Memory Garbage Collection:

Tier 1: Automatic (oldest turns dropped when buffer full)
Tier 2: Weekly cleanup (items > 30 days old archived or deleted)
Tier 3: User-controlled (never auto-delete)
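
The Tier 1 behavior (oldest turns dropped when the buffer is full) maps naturally onto a bounded deque. The class below is a minimal sketch; ShortTermBuffer and its method names are hypothetical, and the default of 100 turns follows the upper bound given for Tier 1.

```python
from collections import deque
from typing import Deque, List, Tuple


class ShortTermBuffer:
    """Fixed-capacity conversation buffer that evicts the oldest turn first."""

    def __init__(self, max_turns: int = 100):
        # deque(maxlen=...) discards items from the opposite end when full.
        self._turns: Deque[Tuple[str, str]] = deque(maxlen=max_turns)

    def add_turn(self, speaker: str, text: str) -> None:
        self._turns.append((speaker, text))

    def recent(self, n: int) -> List[Tuple[str, str]]:
        """Return up to the last n turns, oldest first."""
        if n <= 0:
            return []
        return list(self._turns)[-n:]

    def __len__(self) -> int:
        return len(self._turns)
```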

Validation:

Test memory promotion logic
Verify no unintentional data loss
Measure memory usage across tiers

Traces To: [System SRS - Memory Architecture]

FRD-AI-MEM-002 [P0]
System shall implement versioned embeddings for forward compatibility.

Functional Specification:

Problem: Embedding models improve over time. Old embeddings become incompatible.

Solution: Version all embeddings and support migration.

Data Model:

class EmbeddingVersion:
    version_id: int
    model_name: str
    dimensions: int
    created_at: datetime
    
class DocumentEmbedding:
    document_id: int
    embedding_version_id: int
    embedding_data: np.ndarray  # np.ndarray is the array type; np.array is a factory function

Migration Strategy:

Option 1: Lazy Migration

Keep old embeddings until document is accessed
On access, re-embed with new model
Gradual migration over time

Option 2: Background Migration

Schedule background job to re-embed all documents
Progress tracking
User can pause/resume
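
Option 1 (lazy migration) can be sketched as a lookup that re-embeds a document only when its stored version is stale. The names below (StoredEmbedding, get_embedding_lazy) and the in-memory dict store are assumptions for illustration; the real store would be the versioned FAISS/SQLite layer.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class StoredEmbedding:
    version_id: int
    vector: List[float]


def get_embedding_lazy(doc_id: int,
                       content: str,
                       store: Dict[int, StoredEmbedding],
                       current_version: int,
                       embed: Callable[[str], List[float]]) -> List[float]:
    """Return the document's embedding, re-embedding only if it is stale."""
    stored = store.get(doc_id)
    if stored is None or stored.version_id != current_version:
        # Stale or missing: embed with the current model and persist.
        stored = StoredEmbedding(current_version, embed(content))
        store[doc_id] = stored
    return stored.vector
```

The trade-off is that rarely accessed documents keep old embeddings indefinitely, which is why a background job (Option 2) is a useful complement.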

Implementation:

class EmbeddingManager:
    def __init__(self):
        self.current_version = self.get_current_version()
        self.models = {
            1: load_model("all-MiniLM-L6-v2"),
            2: load_model("all-mpnet-base-v2"),  # Future upgrade
        }
    
    def embed(self, text: str) -> np.ndarray:
        """Embed with current version."""
        model = self.models[self.current_version.version_id]
        return model.encode(text)
    
    def search(self, query: str, domain: str) -> List[Chunk]:
        """Search with automatic version handling."""
        query_embedding = self.embed(query)
        
        # Search only documents with matching embedding version
        results = self.faiss_search(
            query_embedding,
            domain=domain,
            embedding_version=self.current_version.version_id
        )
        
        # If few results and old version exists, include old version results
        if len(results) < 5:
            old_results = self.search_old_version(query, domain)
            results.extend(old_results)
        
        return results
    
    def migrate_embeddings(self, from_version: int, to_version: int):
        """Background migration task (generator yielding progress from 0.0 to 1.0)."""
        documents = get_documents_with_version(from_version)
        total = len(documents)
        
        for i, doc in enumerate(documents, start=1):
            new_embedding = self.embed_with_version(doc.content, to_version)
            update_embedding(doc.id, new_embedding, to_version)
            
            # Report every 100 documents and once more on completion
            if i % 100 == 0 or i == total:
                log_progress(i, total)
                yield i / total  # Fraction complete, not a percentage

Validation:

Test migration from v1 to v2 embeddings
Verify search works during migration
Measure migration performance

Traces To: REQ-AI-021 (System SRS)

10. Anticipation Layer (Pre-Thought System)

FRD-AI-ANTICIP-001 [P1]
System shall implement predictive assistance based on context.

Functional Specification:

Anticipation Triggers:

Time-based: User routines at specific times
Location-based: Entering known locations (home, work, gym)
Event-based: Calendar events approaching
Pattern-based: Repeated behaviors (e.g., always asks weather in morning)
Context-based: Sensor data indicating activity (walking, driving)

Anticipation Examples:

Scenario 1: Morning Routine

Trigger: 7:00 AM, user wakes up (detected by movement)
Pre-thought: Load weather, news, calendar for today
Action: "Good morning! It's 72°F and sunny. You have a 10 AM meeting with Sarah."

Scenario 2: Leaving Work

Trigger: 5:30 PM, user leaves office (location detected)
Pre-thought: Load traffic to home, check grocery list
Action: "Traffic home is light, 15 minutes. Want me to remind you to stop for milk?"

Scenario 3: Grocery Store

Trigger: User enters grocery store (location + context)
Pre-thought: Load grocery list, activate Identify & Profit scanner
Action: Display grocery list on HUD, ready to scan items

Implementation:

class AnticipationEngine:
    def __init__(self):
        self.patterns = load_user_patterns()
        self.predictions = []
    
    def observe_context(self, context: Context):
        """
        Observe current context and generate predictions.
        """
        predictions = []
        
        # Time-based predictions
        current_time = now()
        if current_time.hour == 7 and current_time.minute < 15:
            predictions.append(Prediction(
                action="load_morning_brief",
                confidence=0.9,
                data=self.generate_morning_brief()
            ))
        
        # Location-based predictions
        if context.location == "grocery_store":
            predictions.append(Prediction(
                action="show_grocery_list",
                confidence=0.95,
                data=load_grocery_list()
            ))
        
        # Pattern-based predictions
        if self.user_usually_does(action="check_weather", at_time=current_time):
            predictions.append(Prediction(
                action="fetch_weather",
                confidence=0.8,
                data=fetch_weather(context.location)
            ))
        
        # Store predictions for quick access
        self.predictions = predictions
        
        # Pre-load high-confidence predictions
        for pred in predictions:
            if pred.confidence > 0.8:
                pred.preload()
    
    def suggest_actions(self) -> List[Suggestion]:
        """
        Suggest actions user might want to take.
        """
        suggestions = []
        
        for pred in self.predictions:
            if pred.confidence > 0.7:
                suggestions.append(Suggestion(
                    description=pred.description,
                    action=pred.action,
                    confidence=pred.confidence
                ))
        
        return suggestions
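
The engine above relies on user_usually_does, which is not defined in this document. One possible heuristic is sketched below: an action counts as "usual" if it occurred near the given time of day on most observed days. The function name usually_does, the window, and both thresholds are assumptions, and the time comparison does not wrap around midnight.

```python
from datetime import datetime
from typing import Iterable, Tuple


def usually_does(history: Iterable[Tuple[str, datetime]],
                 action: str,
                 at_time: datetime,
                 window_minutes: int = 30,
                 min_days: int = 5,
                 min_ratio: float = 0.6) -> bool:
    """True if the action occurred near this time of day on most observed days."""
    target = at_time.hour * 60 + at_time.minute
    observed_days = set()
    matching_days = set()

    for past_action, timestamp in history:
        observed_days.add(timestamp.date())
        minute_of_day = timestamp.hour * 60 + timestamp.minute
        if past_action == action and abs(minute_of_day - target) <= window_minutes:
            matching_days.add(timestamp.date())

    # Too little history: do not claim a pattern yet.
    if len(observed_days) < min_days:
        return False
    return len(matching_days) / len(observed_days) >= min_ratio
```

Requiring a minimum number of observed days avoids suggesting a "routine" after one or two coincidental events.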

User Control:

Anticipation is SUGGESTIVE, not AUTOMATIC (except pre-loading)
User can disable specific anticipations
User can adjust confidence threshold (more or fewer suggestions)
All anticipations logged for transparency
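
The user controls listed above could be applied as a filter over predictions before they become suggestions. This is a minimal sketch: AnticipationSettings and filter_predictions are hypothetical names, and the 0.7 default mirrors the threshold used in suggest_actions.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Prediction:
    """Trimmed-down stand-in for the engine's Prediction type."""
    action: str
    confidence: float


@dataclass
class AnticipationSettings:
    suggestion_threshold: float = 0.7  # user-adjustable
    disabled_actions: Set[str] = field(default_factory=set)


def filter_predictions(predictions: List[Prediction],
                       settings: AnticipationSettings) -> List[Prediction]:
    """Keep predictions the user has not disabled and that clear the threshold."""
    return [
        p for p in predictions
        if p.action not in settings.disabled_actions
        and p.confidence > settings.suggestion_threshold
    ]
```

Raising suggestion_threshold yields fewer suggestions; lowering it yields more, which matches the adjustable behavior described in the list.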

Validation:

Test prediction accuracy on user routines
Measure user acceptance rate of suggestions
Track false positive rate (unhelpful suggestions)

Traces To: [System SRS - Anticipation Layer]

11. Integration and Dependencies

11.1 Hardware Dependencies

FRD-AI-HW-001 [P0]
AI System shall degrade gracefully if hardware sensors are unavailable.

Functional Specification:

Sensor Dependencies:

IMU: Required for gesture detection, head tracking
ToF/LiDAR: Required for obstacle detection, walking assistance
Camera: Required for OCR, object detection, photography
Microphone: Required for voice input
Speaker/Bone Conduction: Required for audio output

Degradation Strategy:

def check_hardware_availability() -> HardwareStatus:
    """Check which hardware components are available."""
    status = HardwareStatus()
    
    status.imu = test_sensor('imu')
    status.tof = test_sensor('tof')
    status.lidar = test_sensor('lidar')
    status.camera = test_sensor('camera')
    status.microphone = test_sensor('microphone')
    status.speaker = test_sensor('speaker')
    
    return status

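def adapt_ai_capabilities(hw_status: HardwareStatus):
    """Disable features that require missing hardware."""
    
    # IMU is listed above as required for gesture detection and head tracking
    if not hw_status.imu:
        disable_feature('gesture_detection')
        disable_feature('head_tracking')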
    
    if not hw_status.microphone:
        disable_feature('voice_input')
        enable_alternative('text_input_via_companion_app')
    
    if not hw_status.camera:
        disable_feature('ocr')
        disable_feature('object_detection')
        disable_feature('photography')
    
    if not hw_status.tof and not hw_status.lidar:
        disable_feature('obstacle_detection')
        disable_feature('walking_assist')
        warn_user("Safety features limited")
    
    if not hw_status.speaker:
        enable_alternative('visual_only_mode')
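
The same degradation rules can also be expressed as data, which makes "test with each sensor disabled" easy to automate. The table and the enabled_features function below are a sketch; the feature and sensor names follow the listing above, but the table-driven structure itself is an assumption.

```python
from typing import Dict, FrozenSet, List, Set

CAMERA = frozenset({"camera"})
TOF = frozenset({"tof"})
LIDAR = frozenset({"lidar"})

# Each feature lists alternative sensor sets; any one complete set is enough.
FEATURE_REQUIREMENTS: Dict[str, List[FrozenSet[str]]] = {
    "voice_input": [frozenset({"microphone"})],
    "ocr": [CAMERA],
    "object_detection": [CAMERA],
    "photography": [CAMERA],
    "obstacle_detection": [TOF, LIDAR],
    "walking_assist": [TOF, LIDAR],
}


def enabled_features(available_sensors: Set[str]) -> Set[str]:
    """Return the features whose sensor requirements are currently met."""
    return {
        feature
        for feature, options in FEATURE_REQUIREMENTS.items()
        if any(option <= available_sensors for option in options)
    }
```

Listing ToF and LiDAR as alternatives reproduces the rule that obstacle detection is disabled only when both are missing.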

Validation:

Test with each sensor disabled
Verify appropriate features disabled
Confirm alternative input/output methods work

Traces To: [Hardware Requirements Document]

11.2 Performance Dependencies

FRD-AI-PERF-001 [P0]
AI System shall adapt to available compute resources.

Functional Specification:

Resource Monitoring:

CPU usage
GPU/NPU availability and usage
RAM available
Battery level
Temperature

Adaptation Strategy:

class ResourceAdaptiveAI:
    def select_model_config(self) -> ModelConfig:
        """Select AI model based on available resources."""
        
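        cpu_usage = get_cpu_usage()          # Monitored, but not used in the selection below
        ram_available = get_ram_available()  # Assumed to be in MB (compared against 3000)
        battery_level = get_battery_level()  # Percent
        temperature = get_temperature()      # Assumed to be in °C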
        
        # Emergency mode: Critical resources
        if battery_level < 10 or temperature > 48:
            return ModelConfig(
                model_size="1B",
                quantization="4bit",
                max_tokens=256,
                reasoning="emergency_low_power"
            )
        
        # Low power mode: Limited resources
        elif battery_level < 20 or temperature > 45:
            return ModelConfig(
                model_size="3B",
                quantization="4bit",
                max_tokens=512,
                reasoning="power_saving"
            )
        
        # Normal mode: Adequate resources
        elif ram_available > 3000 and battery_level > 40:
            return ModelConfig(
                model_size="7B",
                quantization="4bit",
                max_tokens=2048,
                reasoning="full_capability"
            )
        
        # Default: Balanced mode
        else:
            return ModelConfig(
                model_size="4B",
                quantization="4bit",
                max_tokens=1024,
                reasoning="balanced"
            )
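
For validation, the threshold logic can be isolated as a pure function so boundary cases are easy to test. This sketch mirrors the branches above but returns only the model size; select_model_size is a hypothetical name, and the units (percent, °C, MB) are assumed.

```python
def select_model_size(battery_level: float,
                      temperature: float,
                      ram_available: float) -> str:
    """Mirror of the selection thresholds, returning only the model size."""
    # Emergency mode
    if battery_level < 10 or temperature > 48:
        return "1B"
    # Low power mode
    if battery_level < 20 or temperature > 45:
        return "3B"
    # Normal mode
    if ram_available > 3000 and battery_level > 40:
        return "7B"
    # Balanced default
    return "4B"
```

Note that the comparisons are strict: a battery level of exactly 10 selects low-power mode rather than emergency mode, and exactly 40 falls through to the balanced default.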

Validation:

Test model selection under various resource conditions
Verify performance remains acceptable in low-power mode
Measure battery life extension from adaptive strategies

Traces To: REQ-PERF-001, REQ-PERF-002 (System SRS)

12. Testing and Validation

12.1 Unit Testing

FRD-AI-TEST-001 [P0]
All AI system modules shall have unit tests with > 80% code coverage.

Test Categories:

Function correctness: Does each function produce expected outputs?
Edge cases: Handles invalid inputs gracefully
Performance: Meets latency and throughput requirements
Resource usage: Doesn't exceed memory or CPU budgets

Example Test:

def test_task_decomposition():
    """Test task decomposition produces valid micro-steps."""
    task = "Write a research paper on AI safety"
    
    steps = decompose_task(task, context={})
    
    # Assert: Multiple steps generated
    assert len(steps) > 5
    
    # Assert: Each step takes at most 5 minutes
    for step in steps:
        assert step.estimated_time_minutes <= 5
    
    # Assert: Steps are ordered
    for i, step in enumerate(steps):
        if step.dependencies:
            for dep in step.dependencies:
                assert dep < i  # Dependency comes before
    
    # Assert: Steps are concrete (no vague language)
    vague_words = ["think about", "consider", "maybe"]
    for step in steps:
        for word in vague_words:
            assert word not in step.description.lower()

12.2 Integration Testing

FRD-AI-TEST-010 [P0]
AI system shall have end-to-end integration tests for critical paths.

Critical Paths:

Voice query → Response (full pipeline)
Document upload → RAG retrieval → Response
Emotional state detection → Tone adaptation
Task decomposition → Prioritization → Execution
Tool calling → Confirmation → Execution
Multi-turn conversation with context management

12.3 Performance Testing

FRD-AI-TEST-020 [P0]
AI system shall meet performance benchmarks under load.

Benchmarks:

Inference latency: P50 < 150ms, P95 < 200ms
RAG retrieval: < 100ms
Token throughput: > 10 tokens/sec (3B model)
Memory usage: < 5 GB for 7-8B models
Battery runtime: 5.5-8 hours under real usage
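
Checking the P50 and P95 latency targets requires a percentile definition. The sketch below uses the nearest-rank method, which is one of several common conventions and is an assumption here; percentile and meets_latency_benchmark are hypothetical helper names.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_latency_benchmark(latencies_ms: Sequence[float]) -> bool:
    """Apply the inference latency targets: P50 < 150 ms and P95 < 200 ms."""
    return percentile(latencies_ms, 50) < 150 and percentile(latencies_ms, 95) < 200
```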

12.4 User Acceptance Testing

FRD-AI-TEST-030 [P0]
AI system shall be validated with target user personas.

Test Personas:

Sarah (accessibility user)
Marcus (tech enthusiast)
Jordan (enterprise professional)
Alex (active lifestyle)

Test Scenarios:

Daily use cases for each persona
Edge cases (confusion, errors, misunderstandings)
Stress tests (overwhelming tasks, emotional distress)

Success Metrics:

Task completion rate > 80%
User satisfaction > 4.5/5
NPS > 50
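
For reference, NPS is conventionally computed from 0-10 "would you recommend" ratings as the percentage of promoters (9-10) minus the percentage of detractors (0-6). A minimal sketch, with net_promoter_score as a hypothetical helper name:

```python
from typing import Sequence


def net_promoter_score(ratings: Sequence[int]) -> float:
    """Return NPS in the range -100 to 100 from 0-10 ratings."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)
```

For example, ten ratings with six promoters and two detractors give an NPS of 40, which would fall short of the target above.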

13. Appendices

13.1 Traceability Matrix

FRD Requirement	System SRS	Hardware Req	Test Case
FRD-AI-LLM-001	REQ-AI-001	REQ-HW-150	TC-AI-LLM-001
FRD-AI-RAG-001	REQ-AI-020	REQ-HW-150	TC-AI-RAG-001
FRD-AI-CRK-001	REQ-AI-030	-	TC-AI-CRK-001
FRD-AI-EMO-001	REQ-AI-040	REQ-HW-130	TC-AI-EMO-001
FRD-AI-EFF-001	REQ-AI-050	-	TC-AI-EFF-001

(Full matrix maintained separately)

13.2 Glossary

Term	Definition
CRK	Critical Reasoning Kernel - anti-hallucination system
EFF	Executive Function Framework - task management system
PFC	Prefrontal Cortex Load - measure of cognitive difficulty
RAG	Retrieval-Augmented Generation - knowledge retrieval
LLM	Large Language Model
GGUF	GPT-Generated Unified Format for quantized models
Quantization	Reducing model precision (e.g., 4-bit) for efficiency
Micro-step	Smallest possible task unit (< 5 minutes)
Trigger	Emotional/behavioral pattern that affects user state
Domain	Isolated memory space (Finance, Work, Health, etc.)

13.3 References

Standards:

ISO/IEC 25010: Software quality model

Internal Documents:

Master PRD
System SRS
Hardware Requirements Document
GROOT FORCE Master File Volumes 1-8

External Resources:

Llama.cpp documentation
FAISS documentation
Sentence Transformers documentation

Document Approval

Approved by:

AI/ML Lead: _________________ Date: _______
Software Architect: _________________ Date: _______
Security Lead: _________________ Date: _______

END OF FRD: CORE AI SYSTEM

This FRD defines the detailed functional requirements for the AI brain of GROOT FORCE. Implementation teams use this as the specification for building the intelligence that makes GROOT FORCE unique - a human-bound, emotionally intelligent, privacy-first AI assistant.
