Compare commits
7 Commits
7fffdb2660
...
main
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
71f583481a | ||
|
|
48d440a860 | ||
|
|
c02a5d8823 | ||
|
|
edae5f6081 | ||
|
|
8ce843415e | ||
|
|
46085bd8dd | ||
|
|
2ac83df1e0 |
49
=0.3.0
49
=0.3.0
@@ -1,49 +0,0 @@
|
|||||||
Requirement already satisfied: langchain in ./.venv/lib/python3.13/site-packages (1.2.12)
|
|
||||||
Requirement already satisfied: langchain-xai in ./.venv/lib/python3.13/site-packages (1.2.2)
|
|
||||||
Requirement already satisfied: langchain-core in ./.venv/lib/python3.13/site-packages (1.2.18)
|
|
||||||
Requirement already satisfied: langgraph<1.2.0,>=1.1.1 in ./.venv/lib/python3.13/site-packages (from langchain) (1.1.2)
|
|
||||||
Requirement already satisfied: pydantic<3.0.0,>=2.7.4 in ./.venv/lib/python3.13/site-packages (from langchain) (2.12.5)
|
|
||||||
Requirement already satisfied: jsonpatch<2.0.0,>=1.33.0 in ./.venv/lib/python3.13/site-packages (from langchain-core) (1.33)
|
|
||||||
Requirement already satisfied: langsmith<1.0.0,>=0.3.45 in ./.venv/lib/python3.13/site-packages (from langchain-core) (0.7.17)
|
|
||||||
Requirement already satisfied: packaging>=23.2.0 in ./.venv/lib/python3.13/site-packages (from langchain-core) (26.0)
|
|
||||||
Requirement already satisfied: pyyaml<7.0.0,>=5.3.0 in ./.venv/lib/python3.13/site-packages (from langchain-core) (6.0.3)
|
|
||||||
Requirement already satisfied: tenacity!=8.4.0,<10.0.0,>=8.1.0 in ./.venv/lib/python3.13/site-packages (from langchain-core) (9.1.4)
|
|
||||||
Requirement already satisfied: typing-extensions<5.0.0,>=4.7.0 in ./.venv/lib/python3.13/site-packages (from langchain-core) (4.15.0)
|
|
||||||
Requirement already satisfied: uuid-utils<1.0,>=0.12.0 in ./.venv/lib/python3.13/site-packages (from langchain-core) (0.14.1)
|
|
||||||
Requirement already satisfied: jsonpointer>=1.9 in ./.venv/lib/python3.13/site-packages (from jsonpatch<2.0.0,>=1.33.0->langchain-core) (3.0.0)
|
|
||||||
Requirement already satisfied: langgraph-checkpoint<5.0.0,>=2.1.0 in ./.venv/lib/python3.13/site-packages (from langgraph<1.2.0,>=1.1.1->langchain) (4.0.1)
|
|
||||||
Requirement already satisfied: langgraph-prebuilt<1.1.0,>=1.0.8 in ./.venv/lib/python3.13/site-packages (from langgraph<1.2.0,>=1.1.1->langchain) (1.0.8)
|
|
||||||
Requirement already satisfied: langgraph-sdk<0.4.0,>=0.3.0 in ./.venv/lib/python3.13/site-packages (from langgraph<1.2.0,>=1.1.1->langchain) (0.3.11)
|
|
||||||
Requirement already satisfied: xxhash>=3.5.0 in ./.venv/lib/python3.13/site-packages (from langgraph<1.2.0,>=1.1.1->langchain) (3.6.0)
|
|
||||||
Requirement already satisfied: ormsgpack>=1.12.0 in ./.venv/lib/python3.13/site-packages (from langgraph-checkpoint<5.0.0,>=2.1.0->langgraph<1.2.0,>=1.1.1->langchain) (1.12.2)
|
|
||||||
Requirement already satisfied: httpx>=0.25.2 in ./.venv/lib/python3.13/site-packages (from langgraph-sdk<0.4.0,>=0.3.0->langgraph<1.2.0,>=1.1.1->langchain) (0.28.1)
|
|
||||||
Requirement already satisfied: orjson>=3.11.5 in ./.venv/lib/python3.13/site-packages (from langgraph-sdk<0.4.0,>=0.3.0->langgraph<1.2.0,>=1.1.1->langchain) (3.11.7)
|
|
||||||
Requirement already satisfied: requests-toolbelt>=1.0.0 in ./.venv/lib/python3.13/site-packages (from langsmith<1.0.0,>=0.3.45->langchain-core) (1.0.0)
|
|
||||||
Requirement already satisfied: requests>=2.0.0 in ./.venv/lib/python3.13/site-packages (from langsmith<1.0.0,>=0.3.45->langchain-core) (2.32.5)
|
|
||||||
Requirement already satisfied: zstandard>=0.23.0 in ./.venv/lib/python3.13/site-packages (from langsmith<1.0.0,>=0.3.45->langchain-core) (0.25.0)
|
|
||||||
Requirement already satisfied: anyio in ./.venv/lib/python3.13/site-packages (from httpx>=0.25.2->langgraph-sdk<0.4.0,>=0.3.0->langgraph<1.2.0,>=1.1.1->langchain) (4.12.1)
|
|
||||||
Requirement already satisfied: certifi in ./.venv/lib/python3.13/site-packages (from httpx>=0.25.2->langgraph-sdk<0.4.0,>=0.3.0->langgraph<1.2.0,>=1.1.1->langchain) (2026.2.25)
|
|
||||||
Requirement already satisfied: httpcore==1.* in ./.venv/lib/python3.13/site-packages (from httpx>=0.25.2->langgraph-sdk<0.4.0,>=0.3.0->langgraph<1.2.0,>=1.1.1->langchain) (1.0.9)
|
|
||||||
Requirement already satisfied: idna in ./.venv/lib/python3.13/site-packages (from httpx>=0.25.2->langgraph-sdk<0.4.0,>=0.3.0->langgraph<1.2.0,>=1.1.1->langchain) (3.11)
|
|
||||||
Requirement already satisfied: h11>=0.16 in ./.venv/lib/python3.13/site-packages (from httpcore==1.*->httpx>=0.25.2->langgraph-sdk<0.4.0,>=0.3.0->langgraph<1.2.0,>=1.1.1->langchain) (0.16.0)
|
|
||||||
Requirement already satisfied: annotated-types>=0.6.0 in ./.venv/lib/python3.13/site-packages (from pydantic<3.0.0,>=2.7.4->langchain) (0.7.0)
|
|
||||||
Requirement already satisfied: pydantic-core==2.41.5 in ./.venv/lib/python3.13/site-packages (from pydantic<3.0.0,>=2.7.4->langchain) (2.41.5)
|
|
||||||
Requirement already satisfied: typing-inspection>=0.4.2 in ./.venv/lib/python3.13/site-packages (from pydantic<3.0.0,>=2.7.4->langchain) (0.4.2)
|
|
||||||
Requirement already satisfied: aiohttp<4.0.0,>=3.9.1 in ./.venv/lib/python3.13/site-packages (from langchain-xai) (3.13.3)
|
|
||||||
Requirement already satisfied: langchain-openai<2.0.0,>=1.1.7 in ./.venv/lib/python3.13/site-packages (from langchain-xai) (1.1.11)
|
|
||||||
Requirement already satisfied: aiohappyeyeballs>=2.5.0 in ./.venv/lib/python3.13/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-xai) (2.6.1)
|
|
||||||
Requirement already satisfied: aiosignal>=1.4.0 in ./.venv/lib/python3.13/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-xai) (1.4.0)
|
|
||||||
Requirement already satisfied: attrs>=17.3.0 in ./.venv/lib/python3.13/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-xai) (25.4.0)
|
|
||||||
Requirement already satisfied: frozenlist>=1.1.1 in ./.venv/lib/python3.13/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-xai) (1.8.0)
|
|
||||||
Requirement already satisfied: multidict<7.0,>=4.5 in ./.venv/lib/python3.13/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-xai) (6.7.1)
|
|
||||||
Requirement already satisfied: propcache>=0.2.0 in ./.venv/lib/python3.13/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-xai) (0.4.1)
|
|
||||||
Requirement already satisfied: yarl<2.0,>=1.17.0 in ./.venv/lib/python3.13/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-xai) (1.22.0)
|
|
||||||
Requirement already satisfied: openai<3.0.0,>=2.26.0 in ./.venv/lib/python3.13/site-packages (from langchain-openai<2.0.0,>=1.1.7->langchain-xai) (2.26.0)
|
|
||||||
Requirement already satisfied: tiktoken<1.0.0,>=0.7.0 in ./.venv/lib/python3.13/site-packages (from langchain-openai<2.0.0,>=1.1.7->langchain-xai) (0.12.0)
|
|
||||||
Requirement already satisfied: distro<2,>=1.7.0 in ./.venv/lib/python3.13/site-packages (from openai<3.0.0,>=2.26.0->langchain-openai<2.0.0,>=1.1.7->langchain-xai) (1.9.0)
|
|
||||||
Requirement already satisfied: jiter<1,>=0.10.0 in ./.venv/lib/python3.13/site-packages (from openai<3.0.0,>=2.26.0->langchain-openai<2.0.0,>=1.1.7->langchain-xai) (0.13.0)
|
|
||||||
Requirement already satisfied: sniffio in ./.venv/lib/python3.13/site-packages (from openai<3.0.0,>=2.26.0->langchain-openai<2.0.0,>=1.1.7->langchain-xai) (1.3.1)
|
|
||||||
Requirement already satisfied: tqdm>4 in ./.venv/lib/python3.13/site-packages (from openai<3.0.0,>=2.26.0->langchain-openai<2.0.0,>=1.1.7->langchain-xai) (4.67.3)
|
|
||||||
Requirement already satisfied: charset_normalizer<4,>=2 in ./.venv/lib/python3.13/site-packages (from requests>=2.0.0->langsmith<1.0.0,>=0.3.45->langchain-core) (3.4.4)
|
|
||||||
Requirement already satisfied: urllib3<3,>=1.21.1 in ./.venv/lib/python3.13/site-packages (from requests>=2.0.0->langsmith<1.0.0,>=0.3.45->langchain-core) (2.6.3)
|
|
||||||
Requirement already satisfied: regex>=2022.1.18 in ./.venv/lib/python3.13/site-packages (from tiktoken<1.0.0,>=0.7.0->langchain-openai<2.0.0,>=1.1.7->langchain-xai) (2026.2.28)
|
|
||||||
461
docs/INDEX.md
461
docs/INDEX.md
@@ -3,6 +3,7 @@
|
|||||||
> **For AI Assistants**: This document contains all critical patterns, conventions, and best practices. Read this first to understand the codebase structure and ensure consistency.
|
> **For AI Assistants**: This document contains all critical patterns, conventions, and best practices. Read this first to understand the codebase structure and ensure consistency.
|
||||||
|
|
||||||
**Quick Navigation:**
|
**Quick Navigation:**
|
||||||
|
- [iii Platform & Development Workflow](#iii-platform--development-workflow) - Platform evolution and CLI tools
|
||||||
- [Core Concepts](#core-concepts) - System architecture and patterns
|
- [Core Concepts](#core-concepts) - System architecture and patterns
|
||||||
- [Design Principles](#design-principles) - Event Storm & Bidirectional References
|
- [Design Principles](#design-principles) - Event Storm & Bidirectional References
|
||||||
- [Step Development](#step-development-best-practices) - How to create new steps
|
- [Step Development](#step-development-best-practices) - How to create new steps
|
||||||
@@ -23,6 +24,244 @@
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## iii Platform & Development Workflow
|
||||||
|
|
||||||
|
### Platform Evolution (v0.8 → v0.9+)
|
||||||
|
|
||||||
|
**Status:** March 2026 - iii v0.9+ production-ready
|
||||||
|
|
||||||
|
iii has evolved from an all-in-one development tool to a **modular, production-grade event engine** with clear separation between development and deployment workflows.
|
||||||
|
|
||||||
|
#### Structural Changes Overview
|
||||||
|
|
||||||
|
| Component | Before (v0.2-v0.7) | Now (v0.9+) | Impact |
|
||||||
|
|-----------|-------------------|-------------|--------|
|
||||||
|
| **Console/Dashboard** | Integrated in engine process (port 3111) | Separate process (`iii-cli console` or `dev`) | More flexibility, less resource overhead, better scaling |
|
||||||
|
| **CLI Tool** | Minimal or non-existent | `iii-cli` is the central dev tool | Terminal-based dev workflow, scriptable, faster iteration |
|
||||||
|
| **Project Structure** | Steps anywhere in project | **Recommended:** `src/` + `src/steps/` | Cleaner structure, reliable hot-reload |
|
||||||
|
| **Hot-Reload/Watcher** | Integrated in engine | Separate `shell::ExecModule` with `watch` paths | Only Python/TS files watched (configurable) |
|
||||||
|
| **Start & Services** | Single `iii` process | Engine (`iii` or `iii-cli start`) + Console separate | Better for production (engine) vs dev (console) |
|
||||||
|
| **Config Handling** | YAML + ENV | YAML + ENV + CLI flags prioritized | More control via CLI flags |
|
||||||
|
| **Observability** | Basic | Enhanced (OTel, Rollups, Alerts, Traces) | Production-ready telemetry |
|
||||||
|
| **Streams & State** | KV-Store (file/memory) | More adapters + file_based default | Better persistence handling |
|
||||||
|
|
||||||
|
**Key Takeaway:** iii is now a **modular, production-ready engine** where development (CLI + separate console) is clearly separated from production deployment.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Development Workflow with iii-cli
|
||||||
|
|
||||||
|
**`iii-cli` is your primary tool for local development, debugging, and testing.**
|
||||||
|
|
||||||
|
#### Essential Commands
|
||||||
|
|
||||||
|
| Command | Purpose | When to Use | Example |
|
||||||
|
|---------|---------|------------|---------|
|
||||||
|
| `iii-cli dev` | Start dev server with hot-reload + integrated console | Local development, immediate feedback on code changes | `iii-cli dev` |
|
||||||
|
| `iii-cli console` | Start dashboard only (separate port) | When you only need the console (no dev reload) | `iii-cli console --host 0.0.0.0 --port 3113` |
|
||||||
|
| `iii-cli start` | Start engine standalone (like `motia.service`) | Testing engine in isolation | `iii-cli start -c iii-config.yaml` |
|
||||||
|
| `iii-cli logs` | Live logs of all flows/workers/triggers | Debugging, error investigation | `iii-cli logs --level debug` |
|
||||||
|
| `iii-cli trace <id>` | Show detailed trace information (OTel) | Debug specific request/flow | `iii-cli trace abc123` |
|
||||||
|
| `iii-cli state ls` | List states (KV storage) | Verify state persistence | `iii-cli state ls` |
|
||||||
|
| `iii-cli state get` | Get specific state value | Inspect state content | `iii-cli state get key` |
|
||||||
|
| `iii-cli stream ls` | List all streams + groups | Inspect stream/websocket connections | `iii-cli stream ls` |
|
||||||
|
| `iii-cli flow list` | Show all registered flows/triggers | Overview of active steps & endpoints | `iii-cli flow list` |
|
||||||
|
| `iii-cli worker logs` | Worker logs (Python/TS execution) | Debug issues in step handlers | `iii-cli worker logs` |
|
||||||
|
|
||||||
|
#### Typical Development Workflow
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# 1. Navigate to project
|
||||||
|
cd /opt/motia-iii/bitbylaw
|
||||||
|
|
||||||
|
# 2. Start dev mode (hot-reload + console on port 3113)
|
||||||
|
iii-cli dev --host 0.0.0.0 --port 3113 --engine-port 3111
|
||||||
|
|
||||||
|
# Alternative: Separate engine + console
|
||||||
|
# Terminal 1:
|
||||||
|
iii-cli start -c iii-config.yaml
|
||||||
|
|
||||||
|
# Terminal 2:
|
||||||
|
iii-cli console --host 0.0.0.0 --port 3113 \
|
||||||
|
--engine-host 192.168.1.62 --engine-port 3111
|
||||||
|
|
||||||
|
# 3. Watch logs live (separate terminal)
|
||||||
|
iii-cli logs -f
|
||||||
|
|
||||||
|
# 4. Debug specific trace
|
||||||
|
iii-cli trace <trace-id-from-logs>
|
||||||
|
|
||||||
|
# 5. Inspect state
|
||||||
|
iii-cli state ls
|
||||||
|
iii-cli state get document:sync:status
|
||||||
|
|
||||||
|
# 6. Verify flows registered
|
||||||
|
iii-cli flow list
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Development vs. Production
|
||||||
|
|
||||||
|
**Development:**
|
||||||
|
- Use `iii-cli dev` for hot-reload
|
||||||
|
- Console accessible on localhost:3113
|
||||||
|
- Logs visible in terminal
|
||||||
|
- Immediate feedback on code changes
|
||||||
|
|
||||||
|
**Production:**
|
||||||
|
- `systemd` service runs `iii-cli start`
|
||||||
|
- Console runs separately (if needed)
|
||||||
|
- Logs via `journalctl -u motia.service -f`
|
||||||
|
- No hot-reload (restart service for changes)
|
||||||
|
|
||||||
|
**Example Production Service:**
|
||||||
|
```ini
|
||||||
|
[Unit]
|
||||||
|
Description=Motia III Engine
|
||||||
|
After=network.target redis.service
|
||||||
|
|
||||||
|
[Service]
|
||||||
|
Type=simple
|
||||||
|
User=motia
|
||||||
|
WorkingDirectory=/opt/motia-iii/bitbylaw
|
||||||
|
ExecStart=/usr/local/bin/iii-cli start -c /opt/motia-iii/bitbylaw/iii-config.yaml
|
||||||
|
Restart=always
|
||||||
|
RestartSec=10
|
||||||
|
Environment="PATH=/usr/local/bin:/usr/bin"
|
||||||
|
|
||||||
|
[Install]
|
||||||
|
WantedBy=multi-user.target
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Project Structure Best Practices
|
||||||
|
|
||||||
|
**Recommended Structure (v0.9+):**
|
||||||
|
```
|
||||||
|
bitbylaw/
|
||||||
|
├── iii-config.yaml # Main configuration
|
||||||
|
├── src/ # Source code root
|
||||||
|
│ └── steps/ # All steps here (hot-reload reliable)
|
||||||
|
│ ├── __init__.py
|
||||||
|
│ ├── vmh/
|
||||||
|
│ │ ├── __init__.py
|
||||||
|
│ │ ├── document_sync_event_step.py
|
||||||
|
│ │ └── webhook/
|
||||||
|
│ │ ├── __init__.py
|
||||||
|
│ │ └── document_create_api_step.py
|
||||||
|
│ └── advoware_proxy/
|
||||||
|
│ └── ...
|
||||||
|
├── services/ # Shared business logic
|
||||||
|
│ ├── __init__.py
|
||||||
|
│ ├── xai_service.py
|
||||||
|
│ ├── espocrm.py
|
||||||
|
│ └── ...
|
||||||
|
└── tests/ # Test files
|
||||||
|
```
|
||||||
|
|
||||||
|
**Why `src/steps/` is recommended:**
|
||||||
|
- **Hot-reload works reliably** - Watcher detects changes correctly
|
||||||
|
- **Cleaner project** - Source code isolated from config/docs
|
||||||
|
- **IDE support** - Better navigation and refactoring
|
||||||
|
- **Deployment** - Easier to package
|
||||||
|
|
||||||
|
**Note:** Old structure (steps in root) still works, but hot-reload may be less reliable.
|
||||||
|
|
||||||
|
#### Hot-Reload Configuration
|
||||||
|
|
||||||
|
**Hot-reload is configured via `shell::ExecModule` in `iii-config.yaml`:**
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
modules:
|
||||||
|
- type: shell::ExecModule
|
||||||
|
config:
|
||||||
|
watch:
|
||||||
|
- "src/**/*.py" # Watch Python files in src/
|
||||||
|
- "services/**/*.py" # Watch service files
|
||||||
|
# Add more patterns as needed
|
||||||
|
ignore:
|
||||||
|
- "**/__pycache__/**"
|
||||||
|
- "**/*.pyc"
|
||||||
|
- "**/tests/**"
|
||||||
|
```
|
||||||
|
|
||||||
|
**Behavior:**
|
||||||
|
- Only files matching `watch` patterns trigger reload
|
||||||
|
- Changes in `ignore` patterns are ignored
|
||||||
|
- Reload is automatic in `iii-cli dev` mode
|
||||||
|
- Production mode (`iii-cli start`) does NOT watch files
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Observability & Debugging
|
||||||
|
|
||||||
|
#### OpenTelemetry Integration
|
||||||
|
|
||||||
|
**iii v0.9+ has built-in OpenTelemetry support:**
|
||||||
|
|
||||||
|
```python
|
||||||
|
# Traces are automatically created for:
|
||||||
|
# - HTTP requests
|
||||||
|
# - Queue processing
|
||||||
|
# - Cron execution
|
||||||
|
# - Service calls (if instrumented)
|
||||||
|
|
||||||
|
# Access trace ID in handler:
|
||||||
|
async def handler(request: ApiRequest, ctx: FlowContext) -> ApiResponse:
|
||||||
|
trace_id = ctx.trace_id # Use for debugging
|
||||||
|
ctx.logger.info(f"Trace ID: {trace_id}")
|
||||||
|
```
|
||||||
|
|
||||||
|
**View traces:**
|
||||||
|
```bash
|
||||||
|
# Get trace details
|
||||||
|
iii-cli trace <trace-id>
|
||||||
|
|
||||||
|
# Filter logs by trace
|
||||||
|
iii-cli logs --trace <trace-id>
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Debugging Workflow
|
||||||
|
|
||||||
|
**1. Live Logs:**
|
||||||
|
```bash
|
||||||
|
# All logs
|
||||||
|
iii-cli logs -f
|
||||||
|
|
||||||
|
# Specific level
|
||||||
|
iii-cli logs --level error
|
||||||
|
|
||||||
|
# With grep
|
||||||
|
iii-cli logs -f | grep "document_sync"
|
||||||
|
```
|
||||||
|
|
||||||
|
**2. State Inspection:**
|
||||||
|
```bash
|
||||||
|
# List all state keys
|
||||||
|
iii-cli state ls
|
||||||
|
|
||||||
|
# Get specific state
|
||||||
|
iii-cli state get sync:document:last_run
|
||||||
|
```
|
||||||
|
|
||||||
|
**3. Flow Verification:**
|
||||||
|
```bash
|
||||||
|
# List all registered flows
|
||||||
|
iii-cli flow list
|
||||||
|
|
||||||
|
# Verify endpoint exists
|
||||||
|
iii-cli flow list | grep "/vmh/webhook"
|
||||||
|
```
|
||||||
|
|
||||||
|
**4. Worker Issues:**
|
||||||
|
```bash
|
||||||
|
# Worker-specific logs
|
||||||
|
iii-cli worker logs
|
||||||
|
|
||||||
|
# Check worker health
|
||||||
|
iii-cli worker status
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## Core Concepts
|
## Core Concepts
|
||||||
|
|
||||||
### System Overview
|
### System Overview
|
||||||
@@ -1271,24 +1510,41 @@ sudo systemctl enable motia.service
|
|||||||
sudo systemctl enable iii-console.service
|
sudo systemctl enable iii-console.service
|
||||||
```
|
```
|
||||||
|
|
||||||
**Manual (Development):**
|
**Development (iii-cli):**
|
||||||
```bash
|
```bash
|
||||||
# Start iii Engine
|
# Option 1: Dev mode with integrated console and hot-reload
|
||||||
cd /opt/motia-iii/bitbylaw
|
cd /opt/motia-iii/bitbylaw
|
||||||
/opt/bin/iii -c iii-config.yaml
|
iii-cli dev --host 0.0.0.0 --port 3113 --engine-port 3111
|
||||||
|
|
||||||
# Start iii Console (Web UI)
|
# Option 2: Separate engine and console
|
||||||
/opt/bin/iii-console --enable-flow --host 0.0.0.0 --port 3113 \
|
# Terminal 1: Start engine
|
||||||
--engine-host 192.168.67.233 --engine-port 3111 --ws-port 3114
|
iii-cli start -c iii-config.yaml
|
||||||
|
|
||||||
|
# Terminal 2: Start console
|
||||||
|
iii-cli console --host 0.0.0.0 --port 3113 \
|
||||||
|
--engine-host 192.168.1.62 --engine-port 3111
|
||||||
|
|
||||||
|
# Option 3: Manual (legacy)
|
||||||
|
/opt/bin/iii -c iii-config.yaml
|
||||||
```
|
```
|
||||||
|
|
||||||
### Check Registered Steps
|
### Check Registered Steps
|
||||||
|
|
||||||
|
**Using iii-cli (recommended):**
|
||||||
|
```bash
|
||||||
|
# List all flows and triggers
|
||||||
|
iii-cli flow list
|
||||||
|
|
||||||
|
# Filter for specific step
|
||||||
|
iii-cli flow list | grep document_sync
|
||||||
|
```
|
||||||
|
|
||||||
|
**Using curl (legacy):**
|
||||||
```bash
|
```bash
|
||||||
curl http://localhost:3111/_console/functions | python3 -m json.tool
|
curl http://localhost:3111/_console/functions | python3 -m json.tool
|
||||||
```
|
```
|
||||||
|
|
||||||
### Test HTTP Endpoint
|
### Test HTTP Endpoints
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Test document webhook
|
# Test document webhook
|
||||||
@@ -1298,6 +1554,11 @@ curl -X POST "http://localhost:3111/vmh/webhook/document/create" \
|
|||||||
|
|
||||||
# Test advoware proxy
|
# Test advoware proxy
|
||||||
curl "http://localhost:3111/advoware/proxy?endpoint=employees"
|
curl "http://localhost:3111/advoware/proxy?endpoint=employees"
|
||||||
|
|
||||||
|
# Test beteiligte sync
|
||||||
|
curl -X POST "http://localhost:3111/vmh/webhook/beteiligte/create" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{"entity_type": "CBeteiligte", "entity_id": "abc123", "action": "create"}'
|
||||||
```
|
```
|
||||||
|
|
||||||
### Manually Trigger Cron
|
### Manually Trigger Cron
|
||||||
@@ -1308,36 +1569,208 @@ curl -X POST "http://localhost:3111/_console/cron/trigger" \
|
|||||||
-d '{"function_id": "steps::VMH Beteiligte Sync Cron::trigger::0"}'
|
-d '{"function_id": "steps::VMH Beteiligte Sync Cron::trigger::0"}'
|
||||||
```
|
```
|
||||||
|
|
||||||
### View Logs
|
### View and Debug Logs
|
||||||
|
|
||||||
|
**Using iii-cli (recommended):**
|
||||||
```bash
|
```bash
|
||||||
# Live logs via journalctl
|
# Live logs (all)
|
||||||
journalctl -u motia-iii -f
|
iii-cli logs -f
|
||||||
|
|
||||||
|
# Live logs with specific level
|
||||||
|
iii-cli logs -f --level error
|
||||||
|
iii-cli logs -f --level debug
|
||||||
|
|
||||||
|
# Filter by component
|
||||||
|
iii-cli logs -f | grep "document_sync"
|
||||||
|
|
||||||
|
# Worker-specific logs
|
||||||
|
iii-cli worker logs
|
||||||
|
|
||||||
|
# Get specific trace
|
||||||
|
iii-cli trace <trace-id>
|
||||||
|
|
||||||
|
# Filter logs by trace ID
|
||||||
|
iii-cli logs --trace <trace-id>
|
||||||
|
```
|
||||||
|
|
||||||
|
**Using journalctl (production):**
|
||||||
|
```bash
|
||||||
|
# Live logs
|
||||||
|
journalctl -u motia.service -f
|
||||||
|
|
||||||
# Search for specific step
|
# Search for specific step
|
||||||
journalctl --since "today" | grep -i "document sync"
|
journalctl -u motia.service --since "today" | grep -i "document sync"
|
||||||
|
|
||||||
|
# Show errors only
|
||||||
|
journalctl -u motia.service -p err -f
|
||||||
|
|
||||||
|
# Last 100 lines
|
||||||
|
journalctl -u motia.service -n 100
|
||||||
|
|
||||||
|
# Specific time range
|
||||||
|
journalctl -u motia.service --since "2026-03-19 10:00" --until "2026-03-19 11:00"
|
||||||
|
```
|
||||||
|
|
||||||
|
**Using log files (legacy):**
|
||||||
|
```bash
|
||||||
# Check for errors
|
# Check for errors
|
||||||
tail -100 /opt/motia-iii/bitbylaw/iii_new.log | grep -i error
|
tail -100 /opt/motia-iii/bitbylaw/iii_new.log | grep -i error
|
||||||
|
|
||||||
|
# Follow log file
|
||||||
|
tail -f /opt/motia-iii/bitbylaw/iii_new.log
|
||||||
|
```
|
||||||
|
|
||||||
|
### Inspect State and Streams
|
||||||
|
|
||||||
|
**State Management:**
|
||||||
|
```bash
|
||||||
|
# List all state keys
|
||||||
|
iii-cli state ls
|
||||||
|
|
||||||
|
# Get specific state value
|
||||||
|
iii-cli state get document:sync:last_run
|
||||||
|
|
||||||
|
# Set state (if needed for testing)
|
||||||
|
iii-cli state set test:key "test value"
|
||||||
|
|
||||||
|
# Delete state
|
||||||
|
iii-cli state delete test:key
|
||||||
|
```
|
||||||
|
|
||||||
|
**Stream Management:**
|
||||||
|
```bash
|
||||||
|
# List all active streams
|
||||||
|
iii-cli stream ls
|
||||||
|
|
||||||
|
# Inspect specific stream
|
||||||
|
iii-cli stream info <stream-id>
|
||||||
|
|
||||||
|
# List consumer groups
|
||||||
|
iii-cli stream groups <stream-name>
|
||||||
|
```
|
||||||
|
|
||||||
|
### Debugging Workflow
|
||||||
|
|
||||||
|
**1. Identify the Issue:**
|
||||||
|
```bash
|
||||||
|
# Check if step is registered
|
||||||
|
iii-cli flow list | grep my_step
|
||||||
|
|
||||||
|
# View recent errors
|
||||||
|
iii-cli logs --level error -n 50
|
||||||
|
|
||||||
|
# Check service status
|
||||||
|
sudo systemctl status motia.service
|
||||||
|
```
|
||||||
|
|
||||||
|
**2. Get Detailed Information:**
|
||||||
|
```bash
|
||||||
|
# Live tail logs for specific step
|
||||||
|
iii-cli logs -f | grep "document_sync"
|
||||||
|
|
||||||
|
# Check worker processes
|
||||||
|
iii-cli worker logs
|
||||||
|
|
||||||
|
# Inspect state
|
||||||
|
iii-cli state ls
|
||||||
|
```
|
||||||
|
|
||||||
|
**3. Test Specific Functionality:**
|
||||||
|
```bash
|
||||||
|
# Trigger webhook manually
|
||||||
|
curl -X POST http://localhost:3111/vmh/webhook/...
|
||||||
|
|
||||||
|
# Check response and logs
|
||||||
|
iii-cli logs -f | grep "webhook"
|
||||||
|
|
||||||
|
# Verify state changed
|
||||||
|
iii-cli state get entity:sync:status
|
||||||
|
```
|
||||||
|
|
||||||
|
**4. Trace Specific Request:**
|
||||||
|
```bash
|
||||||
|
# Make request, note trace ID from logs
|
||||||
|
curl -X POST http://localhost:3111/vmh/webhook/document/create ...
|
||||||
|
|
||||||
|
# Get full trace
|
||||||
|
iii-cli trace <trace-id>
|
||||||
|
|
||||||
|
# View all logs for this trace
|
||||||
|
iii-cli logs --trace <trace-id>
|
||||||
|
```
|
||||||
|
|
||||||
|
### Performance Monitoring
|
||||||
|
|
||||||
|
**Check System Resources:**
|
||||||
|
```bash
|
||||||
|
# CPU and memory usage
|
||||||
|
htop
|
||||||
|
|
||||||
|
# Process-specific
|
||||||
|
ps aux | grep iii
|
||||||
|
|
||||||
|
# Redis memory
|
||||||
|
redis-cli info memory
|
||||||
|
|
||||||
|
# File descriptors
|
||||||
|
lsof -p $(pgrep -f "iii-cli start")
|
||||||
|
```
|
||||||
|
|
||||||
|
**Check Processing Metrics:**
|
||||||
|
```bash
|
||||||
|
# Queue lengths (if using Redis streams)
|
||||||
|
redis-cli XINFO STREAM vmh:document:sync
|
||||||
|
|
||||||
|
# Pending messages
|
||||||
|
redis-cli XPENDING vmh:document:sync group1
|
||||||
|
|
||||||
|
# Lock status
|
||||||
|
redis-cli KEYS "lock:*"
|
||||||
```
|
```
|
||||||
|
|
||||||
### Common Issues
|
### Common Issues
|
||||||
|
|
||||||
**Step not showing up:**
|
**Step not showing up:**
|
||||||
1. Check file naming: Must end with `_step.py`
|
1. Check file naming: Must end with `_step.py`
|
||||||
2. Check for import errors: `grep -i "importerror\|traceback" iii.log`
|
2. Check for syntax errors: `iii-cli logs --level error`
|
||||||
3. Verify `config` dict is present
|
3. Check for import errors: `iii-cli logs | grep -i "importerror\|traceback"`
|
||||||
4. Restart iii engine
|
4. Verify `config` dict is present
|
||||||
|
5. Restart: `sudo systemctl restart motia.service` or restart `iii-cli dev`
|
||||||
|
6. Verify hot-reload working: Check terminal output in `iii-cli dev`
|
||||||
|
|
||||||
**Redis connection failed:**
|
**Redis connection failed:**
|
||||||
- Check `REDIS_HOST` and `REDIS_PORT` environment variables
|
- Check `REDIS_HOST` and `REDIS_PORT` environment variables
|
||||||
- Verify Redis is running: `redis-cli ping`
|
- Verify Redis is running: `redis-cli ping`
|
||||||
|
- Check Redis logs: `journalctl -u redis -f`
|
||||||
- Service will work without Redis but with warnings
|
- Service will work without Redis but with warnings
|
||||||
|
|
||||||
|
**Hot-reload not working:**
|
||||||
|
- Verify using `iii-cli dev` (not `iii-cli start`)
|
||||||
|
- Check `watch` patterns in `iii-config.yaml`
|
||||||
|
- Ensure files are in watched directories (`src/**/*.py`)
|
||||||
|
- Look for watcher errors: `iii-cli logs | grep -i "watch"`
|
||||||
|
|
||||||
|
**Handler not triggered:**
|
||||||
|
- Verify endpoint registered: `iii-cli flow list`
|
||||||
|
- Check HTTP method matches (GET, POST, etc.)
|
||||||
|
- Test with curl to isolate issue
|
||||||
|
- Check trigger configuration in step's `config` dict
|
||||||
|
|
||||||
**AttributeError '_log' not found:**
|
**AttributeError '_log' not found:**
|
||||||
- Ensure service inherits from `BaseSyncUtils` OR
|
- Ensure service inherits from `BaseSyncUtils` OR
|
||||||
- Implement `_log()` method manually
|
- Implement `_log()` method manually
|
||||||
|
|
||||||
|
**Trace not found:**
|
||||||
|
- Ensure OpenTelemetry enabled in config
|
||||||
|
- Check if trace ID is valid format
|
||||||
|
- Use `iii-cli logs` with filters instead
|
||||||
|
|
||||||
|
**Console not accessible:**
|
||||||
|
- Check if console service running: `systemctl status iii-console.service`
|
||||||
|
- Verify port not blocked by firewall: `sudo ufw status`
|
||||||
|
- Check console logs: `journalctl -u iii-console.service -f`
|
||||||
|
- Try accessing via `localhost:3113` instead of public IP
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Key Patterns Summary
|
## Key Patterns Summary
|
||||||
|
|||||||
@@ -78,6 +78,6 @@ modules:
|
|||||||
- class: modules::shell::ExecModule
|
- class: modules::shell::ExecModule
|
||||||
config:
|
config:
|
||||||
watch:
|
watch:
|
||||||
- steps/**/*.py
|
- src/steps/**/*.py
|
||||||
exec:
|
exec:
|
||||||
- /opt/bin/uv run python -m motia.cli run --dir steps
|
- /usr/local/bin/uv run python -m motia.cli run --dir src/steps
|
||||||
|
|||||||
@@ -17,7 +17,7 @@ class LangChainXAIService:
|
|||||||
|
|
||||||
Usage:
|
Usage:
|
||||||
service = LangChainXAIService(ctx)
|
service = LangChainXAIService(ctx)
|
||||||
model = service.get_chat_model(model="grok-2-latest")
|
model = service.get_chat_model(model="grok-4-1-fast-reasoning")
|
||||||
model_with_tools = service.bind_file_search(model, collection_id)
|
model_with_tools = service.bind_file_search(model, collection_id)
|
||||||
result = await service.invoke_chat(model_with_tools, messages)
|
result = await service.invoke_chat(model_with_tools, messages)
|
||||||
"""
|
"""
|
||||||
@@ -46,7 +46,7 @@ class LangChainXAIService:
|
|||||||
|
|
||||||
def get_chat_model(
|
def get_chat_model(
|
||||||
self,
|
self,
|
||||||
model: str = "grok-2-latest",
|
model: str = "grok-4-1-fast-reasoning",
|
||||||
temperature: float = 0.7,
|
temperature: float = 0.7,
|
||||||
max_tokens: Optional[int] = None
|
max_tokens: Optional[int] = None
|
||||||
):
|
):
|
||||||
@@ -54,7 +54,7 @@ class LangChainXAIService:
|
|||||||
Initialisiert ChatXAI Model.
|
Initialisiert ChatXAI Model.
|
||||||
|
|
||||||
Args:
|
Args:
|
||||||
model: Model name (default: grok-2-latest)
|
model: Model name (default: grok-4-1-fast-reasoning)
|
||||||
temperature: Sampling temperature 0.0-1.0
|
temperature: Sampling temperature 0.0-1.0
|
||||||
max_tokens: Optional max tokens for response
|
max_tokens: Optional max tokens for response
|
||||||
|
|
||||||
|
|||||||
@@ -1,374 +0,0 @@
|
|||||||
"""AI Chat Completions API
|
|
||||||
|
|
||||||
OpenAI-compatible Chat Completions endpoint with xAI/LangChain backend.
|
|
||||||
|
|
||||||
Features:
|
|
||||||
- File Search (RAG) via xAI Collections
|
|
||||||
- Web Search via xAI web_search tool
|
|
||||||
- Aktenzeichen-based automatic collection lookup
|
|
||||||
- Multiple tools simultaneously
|
|
||||||
- Clean, reusable architecture for future LLM endpoints
|
|
||||||
|
|
||||||
Note: Streaming is not supported (Motia limitation - returns clear error).
|
|
||||||
|
|
||||||
Reusability:
|
|
||||||
- extract_request_params(): Parse requests for any LLM endpoint
|
|
||||||
- resolve_collection_id(): Auto-detect Aktenzeichen, lookup collection
|
|
||||||
- initialize_model_with_tools(): Bind tools to any LangChain model
|
|
||||||
- invoke_and_format_response(): Standard OpenAI response formatting
|
|
||||||
"""
|
|
||||||
import time
|
|
||||||
from typing import Any, Dict, List, Optional
|
|
||||||
from motia import FlowContext, http, ApiRequest, ApiResponse
|
|
||||||
|
|
||||||
config = {
|
|
||||||
"name": "AI Chat Completions API",
|
|
||||||
"description": "OpenAI-compatible Chat Completions API with xAI backend",
|
|
||||||
"flows": ["ai-general"],
|
|
||||||
"triggers": [
|
|
||||||
http("POST", "/ai/v1/chat/completions"),
|
|
||||||
http("POST", "/v1/chat/completions")
|
|
||||||
],
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
# ============================================================================
|
|
||||||
# MAIN HANDLER
|
|
||||||
# ============================================================================
|
|
||||||
|
|
||||||
async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
|
|
||||||
"""
|
|
||||||
OpenAI-compatible Chat Completions endpoint.
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
ApiResponse with chat completion or error
|
|
||||||
"""
|
|
||||||
ctx.logger.info("=" * 80)
|
|
||||||
ctx.logger.info("🤖 AI Chat Completions API")
|
|
||||||
ctx.logger.info("=" * 80)
|
|
||||||
|
|
||||||
try:
|
|
||||||
# 1. Parse and validate request
|
|
||||||
params = extract_request_params(request, ctx)
|
|
||||||
|
|
||||||
# 2. Check streaming (not supported)
|
|
||||||
if params['stream']:
|
|
||||||
return ApiResponse(
|
|
||||||
status=501,
|
|
||||||
body={
|
|
||||||
'error': {
|
|
||||||
'message': 'Streaming is not supported. Please set stream=false.',
|
|
||||||
'type': 'not_implemented',
|
|
||||||
'param': 'stream'
|
|
||||||
}
|
|
||||||
}
|
|
||||||
)
|
|
||||||
|
|
||||||
# 3. Resolve collection (explicit ID or Aktenzeichen lookup)
|
|
||||||
collection_id = await resolve_collection_id(
|
|
||||||
params['collection_id'],
|
|
||||||
params['messages'],
|
|
||||||
params['enable_web_search'],
|
|
||||||
ctx
|
|
||||||
)
|
|
||||||
|
|
||||||
# 4. Validate: collection or web_search required
|
|
||||||
if not collection_id and not params['enable_web_search']:
|
|
||||||
return ApiResponse(
|
|
||||||
status=400,
|
|
||||||
body={
|
|
||||||
'error': {
|
|
||||||
'message': 'Either collection_id or enable_web_search must be provided',
|
|
||||||
'type': 'invalid_request_error'
|
|
||||||
}
|
|
||||||
}
|
|
||||||
)
|
|
||||||
|
|
||||||
# 5. Initialize LLM with tools
|
|
||||||
model_with_tools = await initialize_model_with_tools(
|
|
||||||
model_name=params['model'],
|
|
||||||
temperature=params['temperature'],
|
|
||||||
max_tokens=params['max_tokens'],
|
|
||||||
collection_id=collection_id,
|
|
||||||
enable_web_search=params['enable_web_search'],
|
|
||||||
web_search_config=params['web_search_config'],
|
|
||||||
ctx=ctx
|
|
||||||
)
|
|
||||||
|
|
||||||
# 6. Invoke LLM
|
|
||||||
completion_id = f"chatcmpl-{int(time.time())}"
|
|
||||||
response = await invoke_and_format_response(
|
|
||||||
model=model_with_tools,
|
|
||||||
messages=params['messages'],
|
|
||||||
completion_id=completion_id,
|
|
||||||
model_name=params['model'],
|
|
||||||
ctx=ctx
|
|
||||||
)
|
|
||||||
|
|
||||||
ctx.logger.info(f"✅ Completion successful – {len(response.body['choices'][0]['message']['content'])} chars")
|
|
||||||
return response
|
|
||||||
|
|
||||||
except ValueError as e:
|
|
||||||
ctx.logger.error(f"❌ Validation error: {e}")
|
|
||||||
return ApiResponse(
|
|
||||||
status=400,
|
|
||||||
body={'error': {'message': str(e), 'type': 'invalid_request_error'}}
|
|
||||||
)
|
|
||||||
except Exception as e:
|
|
||||||
ctx.logger.error(f"❌ Error: {e}")
|
|
||||||
return ApiResponse(
|
|
||||||
status=500,
|
|
||||||
body={'error': {'message': 'Internal server error', 'type': 'server_error'}}
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
# ============================================================================
|
|
||||||
# REUSABLE HELPER FUNCTIONS
|
|
||||||
# ============================================================================
|
|
||||||
|
|
||||||
def extract_request_params(request: ApiRequest, ctx: FlowContext) -> Dict[str, Any]:
|
|
||||||
"""
|
|
||||||
Extract and validate request parameters.
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
Dict with validated parameters
|
|
||||||
|
|
||||||
Raises:
|
|
||||||
ValueError: If validation fails
|
|
||||||
"""
|
|
||||||
body = request.body or {}
|
|
||||||
|
|
||||||
if not isinstance(body, dict):
|
|
||||||
raise ValueError("Request body must be JSON object")
|
|
||||||
|
|
||||||
messages = body.get('messages', [])
|
|
||||||
if not messages or not isinstance(messages, list):
|
|
||||||
raise ValueError("messages must be non-empty array")
|
|
||||||
|
|
||||||
# Extract parameters with defaults
|
|
||||||
params = {
|
|
||||||
'model': body.get('model', 'grok-4-1-fast-reasoning'),
|
|
||||||
'messages': messages,
|
|
||||||
'temperature': body.get('temperature', 0.7),
|
|
||||||
'max_tokens': body.get('max_tokens'),
|
|
||||||
'stream': body.get('stream', False),
|
|
||||||
'extra_body': body.get('extra_body', {}),
|
|
||||||
}
|
|
||||||
|
|
||||||
# Handle enable_web_search (body or extra_body)
|
|
||||||
params['enable_web_search'] = body.get(
|
|
||||||
'enable_web_search',
|
|
||||||
params['extra_body'].get('enable_web_search', False)
|
|
||||||
)
|
|
||||||
|
|
||||||
# Handle web_search_config
|
|
||||||
params['web_search_config'] = body.get(
|
|
||||||
'web_search_config',
|
|
||||||
params['extra_body'].get('web_search_config', {})
|
|
||||||
)
|
|
||||||
|
|
||||||
# Handle collection_id (multiple sources)
|
|
||||||
params['collection_id'] = (
|
|
||||||
body.get('collection_id') or
|
|
||||||
body.get('custom_collection_id') or
|
|
||||||
params['extra_body'].get('collection_id')
|
|
||||||
)
|
|
||||||
|
|
||||||
# Log concisely
|
|
||||||
ctx.logger.info(f"📋 Model: {params['model']} | Stream: {params['stream']}")
|
|
||||||
ctx.logger.info(f"📋 Web Search: {params['enable_web_search']} | Collection: {params['collection_id'] or 'auto'}")
|
|
||||||
ctx.logger.info(f"📨 Messages: {len(messages)}")
|
|
||||||
|
|
||||||
return params
|
|
||||||
|
|
||||||
|
|
||||||
async def resolve_collection_id(
|
|
||||||
explicit_collection_id: Optional[str],
|
|
||||||
messages: List[Dict[str, Any]],
|
|
||||||
enable_web_search: bool,
|
|
||||||
ctx: FlowContext
|
|
||||||
) -> Optional[str]:
|
|
||||||
"""
|
|
||||||
Resolve collection ID from explicit ID or Aktenzeichen auto-detection.
|
|
||||||
|
|
||||||
Args:
|
|
||||||
explicit_collection_id: Explicitly provided collection ID
|
|
||||||
messages: Chat messages (for Aktenzeichen extraction)
|
|
||||||
enable_web_search: Whether web search is enabled
|
|
||||||
ctx: Motia context
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
Collection ID or None
|
|
||||||
"""
|
|
||||||
# Explicit collection ID takes precedence
|
|
||||||
if explicit_collection_id:
|
|
||||||
ctx.logger.info(f"🔍 Using explicit collection: {explicit_collection_id}")
|
|
||||||
return explicit_collection_id
|
|
||||||
|
|
||||||
# Try Aktenzeichen auto-detection from first user message
|
|
||||||
from services.aktenzeichen_utils import (
|
|
||||||
extract_aktenzeichen,
|
|
||||||
normalize_aktenzeichen,
|
|
||||||
remove_aktenzeichen
|
|
||||||
)
|
|
||||||
|
|
||||||
for msg in messages:
|
|
||||||
if msg.get('role') == 'user':
|
|
||||||
content = msg.get('content', '')
|
|
||||||
aktenzeichen_raw = extract_aktenzeichen(content)
|
|
||||||
|
|
||||||
if aktenzeichen_raw:
|
|
||||||
aktenzeichen = normalize_aktenzeichen(aktenzeichen_raw)
|
|
||||||
ctx.logger.info(f"🔍 Aktenzeichen detected: {aktenzeichen}")
|
|
||||||
|
|
||||||
collection_id = await lookup_collection_by_aktenzeichen(aktenzeichen, ctx)
|
|
||||||
|
|
||||||
if collection_id:
|
|
||||||
# Clean Aktenzeichen from message
|
|
||||||
msg['content'] = remove_aktenzeichen(content)
|
|
||||||
ctx.logger.info(f"✅ Collection found: {collection_id}")
|
|
||||||
return collection_id
|
|
||||||
else:
|
|
||||||
ctx.logger.warning(f"⚠️ No collection for Aktenzeichen: {aktenzeichen}")
|
|
||||||
break # Only check first user message
|
|
||||||
|
|
||||||
return None
|
|
||||||
|
|
||||||
|
|
||||||
async def initialize_model_with_tools(
|
|
||||||
model_name: str,
|
|
||||||
temperature: float,
|
|
||||||
max_tokens: Optional[int],
|
|
||||||
collection_id: Optional[str],
|
|
||||||
enable_web_search: bool,
|
|
||||||
web_search_config: Dict[str, Any],
|
|
||||||
ctx: FlowContext
|
|
||||||
) -> Any:
|
|
||||||
"""
|
|
||||||
Initialize LangChain model with tool bindings (file_search, web_search).
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
Model instance with tools bound
|
|
||||||
"""
|
|
||||||
from services.langchain_xai_service import LangChainXAIService
|
|
||||||
|
|
||||||
service = LangChainXAIService(ctx)
|
|
||||||
|
|
||||||
# Create base model
|
|
||||||
model = service.get_chat_model(
|
|
||||||
model=model_name,
|
|
||||||
temperature=temperature,
|
|
||||||
max_tokens=max_tokens
|
|
||||||
)
|
|
||||||
|
|
||||||
# Bind tools
|
|
||||||
model_with_tools = service.bind_tools(
|
|
||||||
model=model,
|
|
||||||
collection_id=collection_id,
|
|
||||||
enable_web_search=enable_web_search,
|
|
||||||
web_search_config=web_search_config,
|
|
||||||
max_num_results=10
|
|
||||||
)
|
|
||||||
|
|
||||||
return model_with_tools
|
|
||||||
|
|
||||||
|
|
||||||
async def invoke_and_format_response(
|
|
||||||
model: Any,
|
|
||||||
messages: List[Dict[str, Any]],
|
|
||||||
completion_id: str,
|
|
||||||
model_name: str,
|
|
||||||
ctx: FlowContext
|
|
||||||
) -> ApiResponse:
|
|
||||||
"""
|
|
||||||
Invoke LLM and format response in OpenAI-compatible format.
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
ApiResponse with chat completion
|
|
||||||
"""
|
|
||||||
from services.langchain_xai_service import LangChainXAIService
|
|
||||||
|
|
||||||
service = LangChainXAIService(ctx)
|
|
||||||
result = await service.invoke_chat(model, messages)
|
|
||||||
|
|
||||||
# Extract content (handle structured responses)
|
|
||||||
if hasattr(result, 'content'):
|
|
||||||
raw = result.content
|
|
||||||
if isinstance(raw, list):
|
|
||||||
# Extract text parts from structured response
|
|
||||||
text_parts = [
|
|
||||||
item.get('text', '')
|
|
||||||
for item in raw
|
|
||||||
if isinstance(item, dict) and item.get('type') == 'text'
|
|
||||||
]
|
|
||||||
content = ''.join(text_parts) or str(raw)
|
|
||||||
else:
|
|
||||||
content = raw
|
|
||||||
else:
|
|
||||||
content = str(result)
|
|
||||||
|
|
||||||
# Extract usage metadata (if available)
|
|
||||||
usage = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
|
|
||||||
if hasattr(result, 'usage_metadata'):
|
|
||||||
u = result.usage_metadata
|
|
||||||
usage = {
|
|
||||||
"prompt_tokens": getattr(u, 'input_tokens', 0),
|
|
||||||
"completion_tokens": getattr(u, 'output_tokens', 0),
|
|
||||||
"total_tokens": getattr(u, 'input_tokens', 0) + getattr(u, 'output_tokens', 0)
|
|
||||||
}
|
|
||||||
|
|
||||||
# Format OpenAI-compatible response
|
|
||||||
response_body = {
|
|
||||||
'id': completion_id,
|
|
||||||
'object': 'chat.completion',
|
|
||||||
'created': int(time.time()),
|
|
||||||
'model': model_name,
|
|
||||||
'choices': [{
|
|
||||||
'index': 0,
|
|
||||||
'message': {'role': 'assistant', 'content': content},
|
|
||||||
'finish_reason': 'stop'
|
|
||||||
}],
|
|
||||||
'usage': usage
|
|
||||||
}
|
|
||||||
|
|
||||||
return ApiResponse(status=200, body=response_body)
|
|
||||||
|
|
||||||
|
|
||||||
async def lookup_collection_by_aktenzeichen(
|
|
||||||
aktenzeichen: str,
|
|
||||||
ctx: FlowContext
|
|
||||||
) -> Optional[str]:
|
|
||||||
"""
|
|
||||||
Lookup xAI Collection ID by Aktenzeichen via EspoCRM.
|
|
||||||
|
|
||||||
Args:
|
|
||||||
aktenzeichen: Normalized Aktenzeichen (e.g., "1234/56")
|
|
||||||
ctx: Motia context
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
Collection ID or None if not found
|
|
||||||
"""
|
|
||||||
try:
|
|
||||||
from services.espocrm import EspoCRMAPI
|
|
||||||
|
|
||||||
espocrm = EspoCRMAPI(ctx)
|
|
||||||
|
|
||||||
search_result = await espocrm.search_entities(
|
|
||||||
entity_type='Raeumungsklage',
|
|
||||||
where=[{
|
|
||||||
'type': 'equals',
|
|
||||||
'attribute': 'advowareAkteBezeichner',
|
|
||||||
'value': aktenzeichen
|
|
||||||
}],
|
|
||||||
select=['id', 'xaiCollectionId'],
|
|
||||||
maxSize=1
|
|
||||||
)
|
|
||||||
|
|
||||||
if search_result and len(search_result) > 0:
|
|
||||||
return search_result[0].get('xaiCollectionId')
|
|
||||||
|
|
||||||
return None
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
ctx.logger.error(f"❌ Collection lookup failed: {e}")
|
|
||||||
return None
|
|
||||||
@@ -1,124 +0,0 @@
|
|||||||
"""AI Models List API
|
|
||||||
|
|
||||||
OpenAI-compatible models list endpoint for OpenWebUI and other clients.
|
|
||||||
Returns all available AI models that can be used with /ai/chat/completions.
|
|
||||||
"""
|
|
||||||
import time
|
|
||||||
from typing import Any
|
|
||||||
from motia import FlowContext, http, ApiRequest, ApiResponse
|
|
||||||
|
|
||||||
|
|
||||||
config = {
|
|
||||||
"name": "AI Models List API",
|
|
||||||
"description": "OpenAI-compatible models endpoint - lists available AI models",
|
|
||||||
"flows": ["ai-general"],
|
|
||||||
"triggers": [
|
|
||||||
http("GET", "/ai/v1/models"),
|
|
||||||
http("GET", "/v1/models"),
|
|
||||||
http("GET", "/ai/models")
|
|
||||||
],
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
|
|
||||||
"""
|
|
||||||
OpenAI-compatible models list endpoint.
|
|
||||||
|
|
||||||
Returns list of available models for OpenWebUI and other clients.
|
|
||||||
|
|
||||||
Response Format (OpenAI compatible):
|
|
||||||
{
|
|
||||||
"object": "list",
|
|
||||||
"data": [
|
|
||||||
{
|
|
||||||
"id": "grok-4.20-beta-0309-reasoning",
|
|
||||||
"object": "model",
|
|
||||||
"created": 1735689600,
|
|
||||||
"owned_by": "xai",
|
|
||||||
"permission": [],
|
|
||||||
"root": "grok-4.20-beta-0309-reasoning",
|
|
||||||
"parent": null
|
|
||||||
}
|
|
||||||
]
|
|
||||||
}
|
|
||||||
"""
|
|
||||||
ctx.logger.info("📋 Models list requested")
|
|
||||||
|
|
||||||
try:
|
|
||||||
# Define available models
|
|
||||||
# These correspond to models supported by /ai/chat/completions
|
|
||||||
current_timestamp = int(time.time())
|
|
||||||
|
|
||||||
models = [
|
|
||||||
{
|
|
||||||
"id": "grok-4.20-beta-0309-reasoning",
|
|
||||||
"object": "model",
|
|
||||||
"created": current_timestamp,
|
|
||||||
"owned_by": "xai",
|
|
||||||
"permission": [],
|
|
||||||
"root": "grok-4.20-beta-0309-reasoning",
|
|
||||||
"parent": None,
|
|
||||||
"capabilities": {
|
|
||||||
"file_search": True,
|
|
||||||
"web_search": True,
|
|
||||||
"streaming": True,
|
|
||||||
"reasoning": True
|
|
||||||
}
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"id": "grok-4.20-multi-agent-beta-0309",
|
|
||||||
"object": "model",
|
|
||||||
"created": current_timestamp,
|
|
||||||
"owned_by": "xai",
|
|
||||||
"permission": [],
|
|
||||||
"root": "grok-4.20-multi-agent-beta-0309",
|
|
||||||
"parent": None,
|
|
||||||
"capabilities": {
|
|
||||||
"file_search": True,
|
|
||||||
"web_search": True,
|
|
||||||
"streaming": True,
|
|
||||||
"reasoning": True,
|
|
||||||
"multi_agent": True
|
|
||||||
}
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"id": "grok-4-1-fast-reasoning",
|
|
||||||
"object": "model",
|
|
||||||
"created": current_timestamp,
|
|
||||||
"owned_by": "xai",
|
|
||||||
"permission": [],
|
|
||||||
"root": "grok-4-1-fast-reasoning",
|
|
||||||
"parent": None,
|
|
||||||
"capabilities": {
|
|
||||||
"file_search": True,
|
|
||||||
"web_search": True,
|
|
||||||
"streaming": True,
|
|
||||||
"reasoning": True
|
|
||||||
}
|
|
||||||
}
|
|
||||||
]
|
|
||||||
|
|
||||||
# Build OpenAI-compatible response
|
|
||||||
response_body = {
|
|
||||||
"object": "list",
|
|
||||||
"data": models
|
|
||||||
}
|
|
||||||
|
|
||||||
ctx.logger.info(f"✅ Returned {len(models)} models")
|
|
||||||
|
|
||||||
return ApiResponse(
|
|
||||||
status=200,
|
|
||||||
body=response_body
|
|
||||||
)
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
ctx.logger.error(f"❌ Error listing models: {e}")
|
|
||||||
return ApiResponse(
|
|
||||||
status=500,
|
|
||||||
body={
|
|
||||||
"error": {
|
|
||||||
"message": str(e),
|
|
||||||
"type": "server_error"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
)
|
|
||||||
@@ -1,523 +0,0 @@
|
|||||||
"""VMH xAI Chat Completions API
|
|
||||||
|
|
||||||
OpenAI-kompatible Chat Completions API mit xAI/LangChain Backend.
|
|
||||||
Unterstützt file_search über xAI Collections (RAG).
|
|
||||||
"""
|
|
||||||
import json
|
|
||||||
import time
|
|
||||||
from typing import Any, Dict, List, Optional
|
|
||||||
from motia import FlowContext, http, ApiRequest, ApiResponse
|
|
||||||
|
|
||||||
|
|
||||||
config = {
|
|
||||||
"name": "VMH xAI Chat Completions API",
|
|
||||||
"description": "OpenAI-compatible Chat Completions API with xAI LangChain backend",
|
|
||||||
"flows": ["vmh-chat"],
|
|
||||||
"triggers": [
|
|
||||||
http("POST", "/vmh/v1/chat/completions")
|
|
||||||
],
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
|
|
||||||
"""
|
|
||||||
OpenAI-compatible Chat Completions endpoint.
|
|
||||||
|
|
||||||
Request Body (OpenAI format):
|
|
||||||
{
|
|
||||||
"model": "grok-2-latest",
|
|
||||||
"messages": [
|
|
||||||
{"role": "system", "content": "You are helpful"},
|
|
||||||
{"role": "user", "content": "1234/56 Was ist der Stand?"}
|
|
||||||
],
|
|
||||||
"temperature": 0.7,
|
|
||||||
"max_tokens": 2000,
|
|
||||||
"stream": false,
|
|
||||||
"extra_body": {
|
|
||||||
"collection_id": "col_abc123", // Optional: override auto-detection
|
|
||||||
"enable_web_search": true, // Optional: enable web search (default: false)
|
|
||||||
"web_search_config": { // Optional: web search configuration
|
|
||||||
"allowed_domains": ["example.com"],
|
|
||||||
"excluded_domains": ["spam.com"],
|
|
||||||
"enable_image_understanding": true
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
Aktenzeichen-Erkennung (Priority):
|
|
||||||
1. extra_body.collection_id (explicit override)
|
|
||||||
2. First user message starts with Aktenzeichen (e.g., "1234/56 ...")
|
|
||||||
3. Error 400 if no collection_id found (strict mode)
|
|
||||||
|
|
||||||
Response (OpenAI format):
|
|
||||||
Non-Streaming:
|
|
||||||
{
|
|
||||||
"id": "chatcmpl-...",
|
|
||||||
"object": "chat.completion",
|
|
||||||
"created": 1234567890,
|
|
||||||
"model": "grok-2-latest",
|
|
||||||
"choices": [{
|
|
||||||
"index": 0,
|
|
||||||
"message": {"role": "assistant", "content": "..."},
|
|
||||||
"finish_reason": "stop"
|
|
||||||
}],
|
|
||||||
"usage": {"prompt_tokens": X, "completion_tokens": Y, "total_tokens": Z}
|
|
||||||
}
|
|
||||||
|
|
||||||
Streaming (SSE):
|
|
||||||
data: {"id":"chatcmpl-...","choices":[{"delta":{"content":"Hello"},...}]}
|
|
||||||
data: {"id":"chatcmpl-...","choices":[{"delta":{"content":" world"},...}]}
|
|
||||||
data: {"choices":[{"delta":{},"finish_reason":"stop"}]}
|
|
||||||
data: [DONE]
|
|
||||||
"""
|
|
||||||
from services.langchain_xai_service import LangChainXAIService
|
|
||||||
from services.aktenzeichen_utils import extract_aktenzeichen, normalize_aktenzeichen
|
|
||||||
from services.espocrm import EspoCRMAPI
|
|
||||||
|
|
||||||
ctx.logger.info("=" * 80)
|
|
||||||
ctx.logger.info("💬 VMH CHAT COMPLETIONS API")
|
|
||||||
ctx.logger.info("=" * 80)
|
|
||||||
|
|
||||||
try:
|
|
||||||
# Parse request body
|
|
||||||
body = request.body or {}
|
|
||||||
|
|
||||||
if not isinstance(body, dict):
|
|
||||||
ctx.logger.error(f"❌ Invalid request body type: {type(body)}")
|
|
||||||
return ApiResponse(
|
|
||||||
status=400,
|
|
||||||
body={'error': 'Request body must be JSON object'}
|
|
||||||
)
|
|
||||||
|
|
||||||
# Extract parameters
|
|
||||||
model_name = body.get('model', 'grok-4.20-beta-0309-reasoning')
|
|
||||||
messages = body.get('messages', [])
|
|
||||||
temperature = body.get('temperature', 0.7)
|
|
||||||
max_tokens = body.get('max_tokens')
|
|
||||||
stream = body.get('stream', False)
|
|
||||||
extra_body = body.get('extra_body', {})
|
|
||||||
|
|
||||||
# Web Search parameters (default: disabled)
|
|
||||||
enable_web_search = extra_body.get('enable_web_search', False)
|
|
||||||
web_search_config = extra_body.get('web_search_config', {})
|
|
||||||
|
|
||||||
ctx.logger.info(f"📋 Model: {model_name}")
|
|
||||||
ctx.logger.info(f"📋 Messages: {len(messages)}")
|
|
||||||
ctx.logger.info(f"📋 Stream: {stream}")
|
|
||||||
ctx.logger.info(f"📋 Web Search: {'enabled' if enable_web_search else 'disabled'}")
|
|
||||||
if enable_web_search and web_search_config:
|
|
||||||
ctx.logger.debug(f"Web Search Config: {json.dumps(web_search_config, indent=2)}")
|
|
||||||
|
|
||||||
# Log full conversation messages
|
|
||||||
ctx.logger.info("-" * 80)
|
|
||||||
ctx.logger.info("📨 REQUEST MESSAGES:")
|
|
||||||
for i, msg in enumerate(messages, 1):
|
|
||||||
role = msg.get('role', 'unknown')
|
|
||||||
content = msg.get('content', '')
|
|
||||||
preview = content[:150] + "..." if len(content) > 150 else content
|
|
||||||
ctx.logger.info(f" [{i}] {role}: {preview}")
|
|
||||||
ctx.logger.info("-" * 80)
|
|
||||||
|
|
||||||
# Validate messages
|
|
||||||
if not messages or not isinstance(messages, list):
|
|
||||||
ctx.logger.error("❌ Missing or invalid messages array")
|
|
||||||
return ApiResponse(
|
|
||||||
status=400,
|
|
||||||
body={'error': 'messages must be non-empty array'}
|
|
||||||
)
|
|
||||||
|
|
||||||
# Determine collection_id (Priority: extra_body > Aktenzeichen > error)
|
|
||||||
collection_id: Optional[str] = None
|
|
||||||
aktenzeichen: Optional[str] = None
|
|
||||||
|
|
||||||
# Priority 1: Explicit collection_id in extra_body
|
|
||||||
if 'collection_id' in extra_body:
|
|
||||||
collection_id = extra_body['collection_id']
|
|
||||||
ctx.logger.info(f"🔍 Collection ID from extra_body: {collection_id}")
|
|
||||||
|
|
||||||
# Priority 2: Extract Aktenzeichen from first user message
|
|
||||||
else:
|
|
||||||
for msg in messages:
|
|
||||||
if msg.get('role') == 'user':
|
|
||||||
content = msg.get('content', '')
|
|
||||||
aktenzeichen_raw = extract_aktenzeichen(content)
|
|
||||||
|
|
||||||
if aktenzeichen_raw:
|
|
||||||
aktenzeichen = normalize_aktenzeichen(aktenzeichen_raw)
|
|
||||||
ctx.logger.info(f"🔍 Aktenzeichen detected: {aktenzeichen}")
|
|
||||||
|
|
||||||
# Lookup collection_id via EspoCRM
|
|
||||||
collection_id = await lookup_collection_by_aktenzeichen(
|
|
||||||
aktenzeichen, ctx
|
|
||||||
)
|
|
||||||
|
|
||||||
if collection_id:
|
|
||||||
ctx.logger.info(f"✅ Collection found: {collection_id}")
|
|
||||||
|
|
||||||
# Remove Aktenzeichen from message (clean prompt)
|
|
||||||
from services.aktenzeichen_utils import remove_aktenzeichen
|
|
||||||
msg['content'] = remove_aktenzeichen(content)
|
|
||||||
ctx.logger.debug(f"Cleaned message: {msg['content']}")
|
|
||||||
else:
|
|
||||||
ctx.logger.warn(f"⚠️ No collection found for {aktenzeichen}")
|
|
||||||
|
|
||||||
break # Only check first user message
|
|
||||||
|
|
||||||
# Priority 3: Error if no collection_id AND web_search disabled
|
|
||||||
if not collection_id and not enable_web_search:
|
|
||||||
ctx.logger.error("❌ No collection_id found and web_search disabled")
|
|
||||||
ctx.logger.error(" Provide collection_id, enable web_search, or both")
|
|
||||||
return ApiResponse(
|
|
||||||
status=400,
|
|
||||||
body={
|
|
||||||
'error': 'collection_id or web_search required',
|
|
||||||
'message': 'Provide collection_id in extra_body, enable web_search, or start message with Aktenzeichen (e.g., "1234/56 question")'
|
|
||||||
}
|
|
||||||
)
|
|
||||||
|
|
||||||
# Initialize LangChain xAI Service
|
|
||||||
try:
|
|
||||||
langchain_service = LangChainXAIService(ctx)
|
|
||||||
except ValueError as e:
|
|
||||||
ctx.logger.error(f"❌ Service initialization failed: {e}")
|
|
||||||
return ApiResponse(
|
|
||||||
status=500,
|
|
||||||
body={'error': 'Service configuration error', 'details': str(e)}
|
|
||||||
)
|
|
||||||
|
|
||||||
# Create ChatXAI model
|
|
||||||
model = langchain_service.get_chat_model(
|
|
||||||
model=model_name,
|
|
||||||
temperature=temperature,
|
|
||||||
max_tokens=max_tokens
|
|
||||||
)
|
|
||||||
|
|
||||||
# Bind tools (file_search and/or web_search)
|
|
||||||
model_with_tools = langchain_service.bind_tools(
|
|
||||||
model=model,
|
|
||||||
collection_id=collection_id,
|
|
||||||
enable_web_search=enable_web_search,
|
|
||||||
web_search_config=web_search_config,
|
|
||||||
max_num_results=10
|
|
||||||
)
|
|
||||||
|
|
||||||
# Generate completion_id
|
|
||||||
completion_id = f"chatcmpl-{ctx.traceId[:12]}" if hasattr(ctx, 'traceId') else f"chatcmpl-{int(time.time())}"
|
|
||||||
created_ts = int(time.time())
|
|
||||||
|
|
||||||
# Branch: Streaming vs Non-Streaming
|
|
||||||
if stream:
|
|
||||||
ctx.logger.info("🌊 Starting streaming response...")
|
|
||||||
return await handle_streaming_response(
|
|
||||||
model_with_tools=model_with_tools,
|
|
||||||
messages=messages,
|
|
||||||
completion_id=completion_id,
|
|
||||||
created_ts=created_ts,
|
|
||||||
model_name=model_name,
|
|
||||||
langchain_service=langchain_service,
|
|
||||||
ctx=ctx
|
|
||||||
)
|
|
||||||
else:
|
|
||||||
ctx.logger.info("📦 Starting non-streaming response...")
|
|
||||||
return await handle_non_streaming_response(
|
|
||||||
model_with_tools=model_with_tools,
|
|
||||||
messages=messages,
|
|
||||||
completion_id=completion_id,
|
|
||||||
created_ts=created_ts,
|
|
||||||
model_name=model_name,
|
|
||||||
langchain_service=langchain_service,
|
|
||||||
ctx=ctx
|
|
||||||
)
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
ctx.logger.error("=" * 80)
|
|
||||||
ctx.logger.error("❌ ERROR: CHAT COMPLETIONS API")
|
|
||||||
ctx.logger.error("=" * 80)
|
|
||||||
ctx.logger.error(f"Error: {e}", exc_info=True)
|
|
||||||
ctx.logger.error(f"Request body: {json.dumps(request.body, indent=2, ensure_ascii=False)}")
|
|
||||||
ctx.logger.error("=" * 80)
|
|
||||||
|
|
||||||
return ApiResponse(
|
|
||||||
status=500,
|
|
||||||
body={
|
|
||||||
'error': 'Internal server error',
|
|
||||||
'message': str(e)
|
|
||||||
}
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
async def handle_non_streaming_response(
|
|
||||||
model_with_tools,
|
|
||||||
messages: List[Dict[str, Any]],
|
|
||||||
completion_id: str,
|
|
||||||
created_ts: int,
|
|
||||||
model_name: str,
|
|
||||||
langchain_service,
|
|
||||||
ctx: FlowContext
|
|
||||||
) -> ApiResponse:
|
|
||||||
"""
|
|
||||||
Handle non-streaming chat completion.
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
ApiResponse with OpenAI-format JSON body
|
|
||||||
"""
|
|
||||||
try:
|
|
||||||
# Invoke model
|
|
||||||
result = await langchain_service.invoke_chat(model_with_tools, messages)
|
|
||||||
|
|
||||||
# Extract content - handle both string and structured responses
|
|
||||||
if hasattr(result, 'content'):
|
|
||||||
raw_content = result.content
|
|
||||||
|
|
||||||
# If content is a list (tool calls + text message), extract text
|
|
||||||
if isinstance(raw_content, list):
|
|
||||||
# Find the text message (usually last element with type='text')
|
|
||||||
text_messages = [
|
|
||||||
item.get('text', '')
|
|
||||||
for item in raw_content
|
|
||||||
if isinstance(item, dict) and item.get('type') == 'text'
|
|
||||||
]
|
|
||||||
content = text_messages[0] if text_messages else str(raw_content)
|
|
||||||
else:
|
|
||||||
content = raw_content
|
|
||||||
else:
|
|
||||||
content = str(result)
|
|
||||||
|
|
||||||
# Build OpenAI-compatible response
|
|
||||||
response_body = {
|
|
||||||
'id': completion_id,
|
|
||||||
'object': 'chat.completion',
|
|
||||||
'created': created_ts,
|
|
||||||
'model': model_name,
|
|
||||||
'choices': [{
|
|
||||||
'index': 0,
|
|
||||||
'message': {
|
|
||||||
'role': 'assistant',
|
|
||||||
'content': content
|
|
||||||
},
|
|
||||||
'finish_reason': 'stop'
|
|
||||||
}],
|
|
||||||
'usage': {
|
|
||||||
'prompt_tokens': 0, # LangChain doesn't expose token counts easily
|
|
||||||
'completion_tokens': 0,
|
|
||||||
'total_tokens': 0
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
# Log token usage (if available)
|
|
||||||
if hasattr(result, 'usage_metadata'):
|
|
||||||
usage = result.usage_metadata
|
|
||||||
prompt_tokens = getattr(usage, 'input_tokens', 0)
|
|
||||||
completion_tokens = getattr(usage, 'output_tokens', 0)
|
|
||||||
response_body['usage'] = {
|
|
||||||
'prompt_tokens': prompt_tokens,
|
|
||||||
'completion_tokens': completion_tokens,
|
|
||||||
'total_tokens': prompt_tokens + completion_tokens
|
|
||||||
}
|
|
||||||
ctx.logger.info(f"📊 Token Usage: prompt={prompt_tokens}, completion={completion_tokens}")
|
|
||||||
|
|
||||||
# Log citations if available (from tool response annotations)
|
|
||||||
if hasattr(result, 'content') and isinstance(result.content, list):
|
|
||||||
# Extract citations from structured response
|
|
||||||
for item in result.content:
|
|
||||||
if isinstance(item, dict) and item.get('type') == 'text':
|
|
||||||
annotations = item.get('annotations', [])
|
|
||||||
if annotations:
|
|
||||||
ctx.logger.info(f"🔗 Citations: {len(annotations)}")
|
|
||||||
for i, citation in enumerate(annotations[:10], 1): # Log first 10
|
|
||||||
url = citation.get('url', 'N/A')
|
|
||||||
title = citation.get('title', '')
|
|
||||||
if url.startswith('collections://'):
|
|
||||||
# Internal collection reference
|
|
||||||
ctx.logger.debug(f" [{i}] Collection Document: {title}")
|
|
||||||
else:
|
|
||||||
# External URL
|
|
||||||
ctx.logger.debug(f" [{i}] {url}")
|
|
||||||
|
|
||||||
# Log complete response content
|
|
||||||
ctx.logger.info(f"✅ Chat completion: {len(content)} chars")
|
|
||||||
ctx.logger.info("=" * 80)
|
|
||||||
ctx.logger.info("📝 COMPLETE RESPONSE:")
|
|
||||||
ctx.logger.info("-" * 80)
|
|
||||||
ctx.logger.info(content)
|
|
||||||
ctx.logger.info("-" * 80)
|
|
||||||
ctx.logger.info("=" * 80)
|
|
||||||
|
|
||||||
return ApiResponse(
|
|
||||||
status=200,
|
|
||||||
body=response_body
|
|
||||||
)
|
|
||||||
|
|
||||||
except Exception as e:
|
|
||||||
ctx.logger.error(f"❌ Non-streaming completion failed: {e}", exc_info=True)
|
|
||||||
raise
|
|
||||||
|
|
||||||
|
|
||||||
async def handle_streaming_response(
    model_with_tools,
    messages: List[Dict[str, Any]],
    completion_id: str,
    created_ts: int,
    model_name: str,
    langchain_service,
    ctx: FlowContext
):
    """
    Handle streaming chat completion via SSE.

    Streams OpenAI-compatible ``chat.completion.chunk`` events, then a final
    chunk with ``finish_reason="stop"``, then the ``data: [DONE]`` sentinel.

    Args:
        model_with_tools: LangChain model (possibly bound with tools) to stream from.
        messages: Chat messages as OpenAI-style dicts.
        completion_id: Completion ID echoed in every chunk.
        created_ts: Unix timestamp echoed in every chunk.
        model_name: Model name echoed in every chunk.
        langchain_service: Service exposing ``astream_chat(model, messages)``.
        ctx: Motia context providing ``response`` (SSE sink) and ``logger``.

    Returns:
        Streaming response generator
    """

    def _chunk_event(delta_payload: Dict[str, Any], finish_reason) -> Dict[str, Any]:
        # Build one OpenAI-compatible chunk envelope (shared by content and
        # finish events so the two stay structurally identical).
        return {
            "id": completion_id,
            "object": "chat.completion.chunk",
            "created": created_ts,
            "model": model_name,
            "choices": [{
                "index": 0,
                "delta": delta_payload,
                "finish_reason": finish_reason
            }]
        }

    async def stream_generator():
        try:
            # Set SSE headers
            await ctx.response.status(200)
            await ctx.response.headers({
                "Content-Type": "text/event-stream",
                "Cache-Control": "no-cache",
                "Connection": "keep-alive"
            })

            ctx.logger.info("🌊 Streaming started")

            # Stream chunks
            chunk_count = 0
            total_content = ""

            async for chunk in langchain_service.astream_chat(model_with_tools, messages):
                # Extract delta content - handle structured chunks
                if hasattr(chunk, "content"):
                    chunk_content = chunk.content

                    # If chunk content is a list (tool calls), extract text parts;
                    # accumulate only text deltas.
                    if isinstance(chunk_content, list):
                        delta = ''.join(
                            item.get('text', '')
                            for item in chunk_content
                            if isinstance(item, dict) and item.get('type') == 'text'
                        )
                    else:
                        delta = chunk_content
                else:
                    delta = ""

                # Empty deltas are neither counted nor emitted.
                if delta:
                    total_content += delta
                    chunk_count += 1

                    data = _chunk_event({"content": delta}, None)
                    # ensure_ascii=False so non-ASCII text (e.g. umlauts)
                    # passes through to the client verbatim.
                    await ctx.response.stream(f"data: {json.dumps(data, ensure_ascii=False)}\n\n")

            # Send finish event (same serialization settings as content chunks).
            finish_data = _chunk_event({}, "stop")
            await ctx.response.stream(f"data: {json.dumps(finish_data, ensure_ascii=False)}\n\n")

            # Send [DONE]
            await ctx.response.stream("data: [DONE]\n\n")

            # Close stream
            await ctx.response.close()

            # Log complete streamed response
            ctx.logger.info(f"✅ Streaming completed: {chunk_count} chunks, {len(total_content)} chars")
            ctx.logger.info("=" * 80)
            ctx.logger.info("📝 COMPLETE STREAMED RESPONSE:")
            ctx.logger.info("-" * 80)
            ctx.logger.info(total_content)
            ctx.logger.info("-" * 80)
            ctx.logger.info("=" * 80)

        except Exception as e:
            ctx.logger.error(f"❌ Streaming failed: {e}", exc_info=True)

            # Best-effort: send an error event to the client. The connection
            # may already be broken, so guard this too — otherwise the second
            # failure would propagate out of the generator unlogged.
            error_data = {
                "error": {
                    "message": str(e),
                    "type": "server_error"
                }
            }
            try:
                await ctx.response.stream(f"data: {json.dumps(error_data, ensure_ascii=False)}\n\n")
                await ctx.response.close()
            except Exception as send_err:
                ctx.logger.error(f"❌ Failed to deliver error event: {send_err}", exc_info=True)

    return stream_generator()


async def lookup_collection_by_aktenzeichen(
    aktenzeichen: str,
    ctx: FlowContext
) -> Optional[str]:
    """
    Lookup xAI Collection ID for Aktenzeichen via EspoCRM.

    Search strategy:
    1. Query Raeumungsklage entities whose advowareAkteBezeichner equals the
       given Aktenzeichen (at most one match requested).
    2. Return that entity's xaiCollectionId if present.

    Args:
        aktenzeichen: Normalized Aktenzeichen (e.g., "1234/56")
        ctx: Motia context

    Returns:
        Collection ID or None if not found
    """
    try:
        # EspoCRM client bound to this request's context.
        espocrm = EspoCRMAPI(ctx)

        ctx.logger.info(f"🔍 Searching Räumungsklage for Aktenzeichen: {aktenzeichen}")

        matches = await espocrm.search_entities(
            entity_type='Raeumungsklage',
            where=[{
                'type': 'equals',
                'attribute': 'advowareAkteBezeichner',
                'value': aktenzeichen
            }],
            select=['id', 'xaiCollectionId', 'advowareAkteBezeichner'],
            maxSize=1
        )

        # Guard clause: no matching entity at all (None or empty result).
        if not matches:
            ctx.logger.warn(f"⚠️ No Räumungsklage found for {aktenzeichen}")
            return None

        record = matches[0]
        xai_collection_id = record.get('xaiCollectionId')

        # Entity exists but carries no collection reference.
        if not xai_collection_id:
            ctx.logger.warn(f"⚠️ Räumungsklage found but no xaiCollectionId: {record.get('id')}")
            return None

        ctx.logger.info(f"✅ Found Räumungsklage: {record.get('id')}")
        return xai_collection_id

    except Exception as e:
        ctx.logger.error(f"❌ Collection lookup failed: {e}", exc_info=True)
        return None


Reference in New Issue
Block a user