fix: Remove deprecated AI Chat Completions and Models List API implementations

fix: Remove deprecated VMH xAI Chat Completions API implementation
fix: Update ExecModule exec path to use correct binary location
2026-03-19 23:10:00 +00:00 · 2026-03-19 21:42:43 +00:00 · 2026-03-19 21:23:42 +00:00 · 2026-03-19 21:20:31 +00:00 · 2026-03-19 20:56:32 +00:00 · 2026-03-19 20:33:49 +00:00
41 changed files with 452 additions and 1089 deletions
--- a/=0.2.0
+++ b/=0.2.0
--- a/=0.3.0
+++ b/=0.3.0
@@ -1,49 +0,0 @@
-Requirement already satisfied: langchain in ./.venv/lib/python3.13/site-packages (1.2.12)
-Requirement already satisfied: langchain-xai in ./.venv/lib/python3.13/site-packages (1.2.2)
-Requirement already satisfied: langchain-core in ./.venv/lib/python3.13/site-packages (1.2.18)
-Requirement already satisfied: langgraph<1.2.0,>=1.1.1 in ./.venv/lib/python3.13/site-packages (from langchain) (1.1.2)
-Requirement already satisfied: pydantic<3.0.0,>=2.7.4 in ./.venv/lib/python3.13/site-packages (from langchain) (2.12.5)
-Requirement already satisfied: jsonpatch<2.0.0,>=1.33.0 in ./.venv/lib/python3.13/site-packages (from langchain-core) (1.33)
-Requirement already satisfied: langsmith<1.0.0,>=0.3.45 in ./.venv/lib/python3.13/site-packages (from langchain-core) (0.7.17)
-Requirement already satisfied: packaging>=23.2.0 in ./.venv/lib/python3.13/site-packages (from langchain-core) (26.0)
-Requirement already satisfied: pyyaml<7.0.0,>=5.3.0 in ./.venv/lib/python3.13/site-packages (from langchain-core) (6.0.3)
-Requirement already satisfied: tenacity!=8.4.0,<10.0.0,>=8.1.0 in ./.venv/lib/python3.13/site-packages (from langchain-core) (9.1.4)
-Requirement already satisfied: typing-extensions<5.0.0,>=4.7.0 in ./.venv/lib/python3.13/site-packages (from langchain-core) (4.15.0)
-Requirement already satisfied: uuid-utils<1.0,>=0.12.0 in ./.venv/lib/python3.13/site-packages (from langchain-core) (0.14.1)
-Requirement already satisfied: jsonpointer>=1.9 in ./.venv/lib/python3.13/site-packages (from jsonpatch<2.0.0,>=1.33.0->langchain-core) (3.0.0)
-Requirement already satisfied: langgraph-checkpoint<5.0.0,>=2.1.0 in ./.venv/lib/python3.13/site-packages (from langgraph<1.2.0,>=1.1.1->langchain) (4.0.1)
-Requirement already satisfied: langgraph-prebuilt<1.1.0,>=1.0.8 in ./.venv/lib/python3.13/site-packages (from langgraph<1.2.0,>=1.1.1->langchain) (1.0.8)
-Requirement already satisfied: langgraph-sdk<0.4.0,>=0.3.0 in ./.venv/lib/python3.13/site-packages (from langgraph<1.2.0,>=1.1.1->langchain) (0.3.11)
-Requirement already satisfied: xxhash>=3.5.0 in ./.venv/lib/python3.13/site-packages (from langgraph<1.2.0,>=1.1.1->langchain) (3.6.0)
-Requirement already satisfied: ormsgpack>=1.12.0 in ./.venv/lib/python3.13/site-packages (from langgraph-checkpoint<5.0.0,>=2.1.0->langgraph<1.2.0,>=1.1.1->langchain) (1.12.2)
-Requirement already satisfied: httpx>=0.25.2 in ./.venv/lib/python3.13/site-packages (from langgraph-sdk<0.4.0,>=0.3.0->langgraph<1.2.0,>=1.1.1->langchain) (0.28.1)
-Requirement already satisfied: orjson>=3.11.5 in ./.venv/lib/python3.13/site-packages (from langgraph-sdk<0.4.0,>=0.3.0->langgraph<1.2.0,>=1.1.1->langchain) (3.11.7)
-Requirement already satisfied: requests-toolbelt>=1.0.0 in ./.venv/lib/python3.13/site-packages (from langsmith<1.0.0,>=0.3.45->langchain-core) (1.0.0)
-Requirement already satisfied: requests>=2.0.0 in ./.venv/lib/python3.13/site-packages (from langsmith<1.0.0,>=0.3.45->langchain-core) (2.32.5)
-Requirement already satisfied: zstandard>=0.23.0 in ./.venv/lib/python3.13/site-packages (from langsmith<1.0.0,>=0.3.45->langchain-core) (0.25.0)
-Requirement already satisfied: anyio in ./.venv/lib/python3.13/site-packages (from httpx>=0.25.2->langgraph-sdk<0.4.0,>=0.3.0->langgraph<1.2.0,>=1.1.1->langchain) (4.12.1)
-Requirement already satisfied: certifi in ./.venv/lib/python3.13/site-packages (from httpx>=0.25.2->langgraph-sdk<0.4.0,>=0.3.0->langgraph<1.2.0,>=1.1.1->langchain) (2026.2.25)
-Requirement already satisfied: httpcore==1.* in ./.venv/lib/python3.13/site-packages (from httpx>=0.25.2->langgraph-sdk<0.4.0,>=0.3.0->langgraph<1.2.0,>=1.1.1->langchain) (1.0.9)
-Requirement already satisfied: idna in ./.venv/lib/python3.13/site-packages (from httpx>=0.25.2->langgraph-sdk<0.4.0,>=0.3.0->langgraph<1.2.0,>=1.1.1->langchain) (3.11)
-Requirement already satisfied: h11>=0.16 in ./.venv/lib/python3.13/site-packages (from httpcore==1.*->httpx>=0.25.2->langgraph-sdk<0.4.0,>=0.3.0->langgraph<1.2.0,>=1.1.1->langchain) (0.16.0)
-Requirement already satisfied: annotated-types>=0.6.0 in ./.venv/lib/python3.13/site-packages (from pydantic<3.0.0,>=2.7.4->langchain) (0.7.0)
-Requirement already satisfied: pydantic-core==2.41.5 in ./.venv/lib/python3.13/site-packages (from pydantic<3.0.0,>=2.7.4->langchain) (2.41.5)
-Requirement already satisfied: typing-inspection>=0.4.2 in ./.venv/lib/python3.13/site-packages (from pydantic<3.0.0,>=2.7.4->langchain) (0.4.2)
-Requirement already satisfied: aiohttp<4.0.0,>=3.9.1 in ./.venv/lib/python3.13/site-packages (from langchain-xai) (3.13.3)
-Requirement already satisfied: langchain-openai<2.0.0,>=1.1.7 in ./.venv/lib/python3.13/site-packages (from langchain-xai) (1.1.11)
-Requirement already satisfied: aiohappyeyeballs>=2.5.0 in ./.venv/lib/python3.13/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-xai) (2.6.1)
-Requirement already satisfied: aiosignal>=1.4.0 in ./.venv/lib/python3.13/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-xai) (1.4.0)
-Requirement already satisfied: attrs>=17.3.0 in ./.venv/lib/python3.13/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-xai) (25.4.0)
-Requirement already satisfied: frozenlist>=1.1.1 in ./.venv/lib/python3.13/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-xai) (1.8.0)
-Requirement already satisfied: multidict<7.0,>=4.5 in ./.venv/lib/python3.13/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-xai) (6.7.1)
-Requirement already satisfied: propcache>=0.2.0 in ./.venv/lib/python3.13/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-xai) (0.4.1)
-Requirement already satisfied: yarl<2.0,>=1.17.0 in ./.venv/lib/python3.13/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-xai) (1.22.0)
-Requirement already satisfied: openai<3.0.0,>=2.26.0 in ./.venv/lib/python3.13/site-packages (from langchain-openai<2.0.0,>=1.1.7->langchain-xai) (2.26.0)
-Requirement already satisfied: tiktoken<1.0.0,>=0.7.0 in ./.venv/lib/python3.13/site-packages (from langchain-openai<2.0.0,>=1.1.7->langchain-xai) (0.12.0)
-Requirement already satisfied: distro<2,>=1.7.0 in ./.venv/lib/python3.13/site-packages (from openai<3.0.0,>=2.26.0->langchain-openai<2.0.0,>=1.1.7->langchain-xai) (1.9.0)
-Requirement already satisfied: jiter<1,>=0.10.0 in ./.venv/lib/python3.13/site-packages (from openai<3.0.0,>=2.26.0->langchain-openai<2.0.0,>=1.1.7->langchain-xai) (0.13.0)
-Requirement already satisfied: sniffio in ./.venv/lib/python3.13/site-packages (from openai<3.0.0,>=2.26.0->langchain-openai<2.0.0,>=1.1.7->langchain-xai) (1.3.1)
-Requirement already satisfied: tqdm>4 in ./.venv/lib/python3.13/site-packages (from openai<3.0.0,>=2.26.0->langchain-openai<2.0.0,>=1.1.7->langchain-xai) (4.67.3)
-Requirement already satisfied: charset_normalizer<4,>=2 in ./.venv/lib/python3.13/site-packages (from requests>=2.0.0->langsmith<1.0.0,>=0.3.45->langchain-core) (3.4.4)
-Requirement already satisfied: urllib3<3,>=1.21.1 in ./.venv/lib/python3.13/site-packages (from requests>=2.0.0->langsmith<1.0.0,>=0.3.45->langchain-core) (2.6.3)
-Requirement already satisfied: regex>=2022.1.18 in ./.venv/lib/python3.13/site-packages (from tiktoken<1.0.0,>=0.7.0->langchain-openai<2.0.0,>=1.1.7->langchain-xai) (2026.2.28)
--- a/docs/INDEX.md
+++ b/docs/INDEX.md
@@ -3,6 +3,7 @@
 > **For AI Assistants**: This document contains all critical patterns, conventions, and best practices. Read this first to understand the codebase structure and ensure consistency.

 **Quick Navigation:**
+- [iii Platform & Development Workflow](#iii-platform--development-workflow) - Platform evolution and CLI tools
 - [Core Concepts](#core-concepts) - System architecture and patterns
 - [Design Principles](#design-principles) - Event Storm & Bidirectional References
 - [Step Development](#step-development-best-practices) - How to create new steps
@@ -23,6 +24,244 @@

 ---

+## iii Platform & Development Workflow
+
+### Platform Evolution (v0.8 → v0.9+)
+
+**Status:** March 2026 - iii v0.9+ production-ready
+
+iii has evolved from an all-in-one development tool to a **modular, production-grade event engine** with clear separation between development and deployment workflows.
+
+#### Structural Changes Overview
+
+| Component | Before (v0.2-v0.7) | Now (v0.9+) | Impact |
+|-----------|-------------------|-------------|--------|
+| **Console/Dashboard** | Integrated in engine process (port 3111) | Separate process (`iii-cli console` or `dev`) | More flexibility, less resource overhead, better scaling |
+| **CLI Tool** | Minimal or non-existent | `iii-cli` is the central dev tool | Terminal-based dev workflow, scriptable, faster iteration |
+| **Project Structure** | Steps anywhere in project | **Recommended:** `src/` + `src/steps/` | Cleaner structure, reliable hot-reload |
+| **Hot-Reload/Watcher** | Integrated in engine | Separate `shell::ExecModule` with `watch` paths | Only Python/TS files watched (configurable) |
+| **Start & Services** | Single `iii` process | Engine (`iii` or `iii-cli start`) + Console separate | Better for production (engine) vs dev (console) |
+| **Config Handling** | YAML + ENV | YAML + ENV + CLI flags prioritized | More control via CLI flags |
+| **Observability** | Basic | Enhanced (OTel, Rollups, Alerts, Traces) | Production-ready telemetry |
+| **Streams & State** | KV-Store (file/memory) | More adapters + file_based default | Better persistence handling |
+
+**Key Takeaway:** iii is now a **modular, production-ready engine** where development (CLI + separate console) is clearly separated from production deployment.
+
+---
+
+### Development Workflow with iii-cli
+
+**`iii-cli` is your primary tool for local development, debugging, and testing.**
+
+#### Essential Commands
+
+| Command | Purpose | When to Use | Example |
+|---------|---------|------------|---------|
+| `iii-cli dev` | Start dev server with hot-reload + integrated console | Local development, immediate feedback on code changes | `iii-cli dev` |
+| `iii-cli console` | Start dashboard only (separate port) | When you only need the console (no dev reload) | `iii-cli console --host 0.0.0.0 --port 3113` |
+| `iii-cli start` | Start engine standalone (like `motia.service`) | Testing engine in isolation | `iii-cli start -c iii-config.yaml` |
+| `iii-cli logs` | Live logs of all flows/workers/triggers | Debugging, error investigation | `iii-cli logs --level debug` |
+| `iii-cli trace <id>` | Show detailed trace information (OTel) | Debug specific request/flow | `iii-cli trace abc123` |
+| `iii-cli state ls` | List states (KV storage) | Verify state persistence | `iii-cli state ls` |
+| `iii-cli state get` | Get specific state value | Inspect state content | `iii-cli state get key` |
+| `iii-cli stream ls` | List all streams + groups | Inspect stream/websocket connections | `iii-cli stream ls` |
+| `iii-cli flow list` | Show all registered flows/triggers | Overview of active steps & endpoints | `iii-cli flow list` |
+| `iii-cli worker logs` | Worker logs (Python/TS execution) | Debug issues in step handlers | `iii-cli worker logs` |
+
+#### Typical Development Workflow
+
+```bash
+# 1. Navigate to project
+cd /opt/motia-iii/bitbylaw
+
+# 2. Start dev mode (hot-reload + console on port 3113)
+iii-cli dev --host 0.0.0.0 --port 3113 --engine-port 3111
+
+# Alternative: Separate engine + console
+# Terminal 1:
+iii-cli start -c iii-config.yaml
+
+# Terminal 2:
+iii-cli console --host 0.0.0.0 --port 3113 \
+  --engine-host 192.168.1.62 --engine-port 3111
+
+# 3. Watch logs live (separate terminal)
+iii-cli logs -f
+
+# 4. Debug specific trace
+iii-cli trace <trace-id-from-logs>
+
+# 5. Inspect state
+iii-cli state ls
+iii-cli state get document:sync:status
+
+# 6. Verify flows registered
+iii-cli flow list
+```
+
+#### Development vs. Production
+
+**Development:**
+- Use `iii-cli dev` for hot-reload
+- Console accessible on localhost:3113
+- Logs visible in terminal
+- Immediate feedback on code changes
+
+**Production:**
+- `systemd` service runs `iii-cli start`
+- Console runs separately (if needed)
+- Logs via `journalctl -u motia.service -f`
+- No hot-reload (restart service for changes)
+
+**Example Production Service:**
+```ini
+[Unit]
+Description=Motia III Engine
+After=network.target redis.service
+
+[Service]
+Type=simple
+User=motia
+WorkingDirectory=/opt/motia-iii/bitbylaw
+ExecStart=/usr/local/bin/iii-cli start -c /opt/motia-iii/bitbylaw/iii-config.yaml
+Restart=always
+RestartSec=10
+Environment="PATH=/usr/local/bin:/usr/bin"
+
+[Install]
+WantedBy=multi-user.target
+```
+
+#### Project Structure Best Practices
+
+**Recommended Structure (v0.9+):**
+```
+bitbylaw/
+├── iii-config.yaml           # Main configuration
+├── src/                      # Source code root
+│   └── steps/                # All steps here (hot-reload reliable)
+│       ├── __init__.py
+│       ├── vmh/
+│       │   ├── __init__.py
+│       │   ├── document_sync_event_step.py
+│       │   └── webhook/
+│       │       ├── __init__.py
+│       │       └── document_create_api_step.py
+│       └── advoware_proxy/
+│           └── ...
+├── services/                 # Shared business logic
+│   ├── __init__.py
+│   ├── xai_service.py
+│   ├── espocrm.py
+│   └── ...
+└── tests/                    # Test files
+```
+
+**Why `src/steps/` is recommended:**
+- **Hot-reload works reliably** - Watcher detects changes correctly
+- **Cleaner project** - Source code isolated from config/docs
+- **IDE support** - Better navigation and refactoring
+- **Deployment** - Easier to package
+
+**Note:** Old structure (steps in root) still works, but hot-reload may be less reliable.
+
+#### Hot-Reload Configuration
+
+**Hot-reload is configured via `shell::ExecModule` in `iii-config.yaml`:**
+
+```yaml
+modules:
+  - type: shell::ExecModule
+    config:
+      watch:
+        - "src/**/*.py"      # Watch Python files in src/
+        - "services/**/*.py" # Watch service files
+        # Add more patterns as needed
+      ignore:
+        - "**/__pycache__/**"
+        - "**/*.pyc"
+        - "**/tests/**"
+```
+
+**Behavior:**
+- Only files matching `watch` patterns trigger reload
+- Changes in `ignore` patterns are ignored
+- Reload is automatic in `iii-cli dev` mode
+- Production mode (`iii-cli start`) does NOT watch files
+
+---
+
+### Observability & Debugging
+
+#### OpenTelemetry Integration
+
+**iii v0.9+ has built-in OpenTelemetry support:**
+
+```python
+# Traces are automatically created for:
+# - HTTP requests
+# - Queue processing
+# - Cron execution
+# - Service calls (if instrumented)
+
+# Access trace ID in handler:
+async def handler(request: ApiRequest, ctx: FlowContext) -> ApiResponse:
+    trace_id = ctx.trace_id  # Use for debugging
+    ctx.logger.info(f"Trace ID: {trace_id}")
+```
+
+**View traces:**
+```bash
+# Get trace details
+iii-cli trace <trace-id>
+
+# Filter logs by trace
+iii-cli logs --trace <trace-id>
+```
+
+#### Debugging Workflow
+
+**1. Live Logs:**
+```bash
+# All logs
+iii-cli logs -f
+
+# Specific level
+iii-cli logs --level error
+
+# With grep
+iii-cli logs -f | grep "document_sync"
+```
+
+**2. State Inspection:**
+```bash
+# List all state keys
+iii-cli state ls
+
+# Get specific state
+iii-cli state get sync:document:last_run
+```
+
+**3. Flow Verification:**
+```bash
+# List all registered flows
+iii-cli flow list
+
+# Verify endpoint exists
+iii-cli flow list | grep "/vmh/webhook"
+```
+
+**4. Worker Issues:**
+```bash
+# Worker-specific logs
+iii-cli worker logs
+
+# Check worker health
+iii-cli worker status
+```
+
+---
+
 ## Core Concepts

 ### System Overview
@@ -1271,24 +1510,41 @@ sudo systemctl enable motia.service
 sudo systemctl enable iii-console.service
 ```

-**Manual (Development):**
+**Development (iii-cli):**
 ```bash
-# Start iii Engine
+# Option 1: Dev mode with integrated console and hot-reload
 cd /opt/motia-iii/bitbylaw
-/opt/bin/iii -c iii-config.yaml
+iii-cli dev --host 0.0.0.0 --port 3113 --engine-port 3111

-# Start iii Console (Web UI)
-/opt/bin/iii-console --enable-flow --host 0.0.0.0 --port 3113 \
-  --engine-host 192.168.67.233 --engine-port 3111 --ws-port 3114
+# Option 2: Separate engine and console
+# Terminal 1: Start engine
+iii-cli start -c iii-config.yaml
+
+# Terminal 2: Start console
+iii-cli console --host 0.0.0.0 --port 3113 \
+  --engine-host 192.168.1.62 --engine-port 3111
+
+# Option 3: Manual (legacy)
+/opt/bin/iii -c iii-config.yaml
 ```

 ### Check Registered Steps

+**Using iii-cli (recommended):**
+```bash
+# List all flows and triggers
+iii-cli flow list
+
+# Filter for specific step
+iii-cli flow list | grep document_sync
+```
+
+**Using curl (legacy):**
 ```bash
 curl http://localhost:3111/_console/functions | python3 -m json.tool
 ```

-### Test HTTP Endpoint
+### Test HTTP Endpoints

 ```bash
 # Test document webhook
@@ -1298,6 +1554,11 @@ curl -X POST "http://localhost:3111/vmh/webhook/document/create" \

 # Test advoware proxy
 curl "http://localhost:3111/advoware/proxy?endpoint=employees"
+
+# Test beteiligte sync
+curl -X POST "http://localhost:3111/vmh/webhook/beteiligte/create" \
+  -H "Content-Type: application/json" \
+  -d '{"entity_type": "CBeteiligte", "entity_id": "abc123", "action": "create"}'
 ```

 ### Manually Trigger Cron
@@ -1308,36 +1569,208 @@ curl -X POST "http://localhost:3111/_console/cron/trigger" \
  -d '{"function_id": "steps::VMH Beteiligte Sync Cron::trigger::0"}'
 ```

-### View Logs
+### View and Debug Logs

+**Using iii-cli (recommended):**
 ```bash
-# Live logs via journalctl
-journalctl -u motia-iii -f
+# Live logs (all)
+iii-cli logs -f
+
+# Live logs with specific level
+iii-cli logs -f --level error
+iii-cli logs -f --level debug
+
+# Filter by component
+iii-cli logs -f | grep "document_sync"
+
+# Worker-specific logs
+iii-cli worker logs
+
+# Get specific trace
+iii-cli trace <trace-id>
+
+# Filter logs by trace ID
+iii-cli logs --trace <trace-id>
+```
+
+**Using journalctl (production):**
+```bash
+# Live logs
+journalctl -u motia.service -f

 # Search for specific step
-journalctl --since "today" | grep -i "document sync"
+journalctl -u motia.service --since "today" | grep -i "document sync"

+# Show errors only
+journalctl -u motia.service -p err -f
+
+# Last 100 lines
+journalctl -u motia.service -n 100
+
+# Specific time range
+journalctl -u motia.service --since "2026-03-19 10:00" --until "2026-03-19 11:00"
+```
+
+**Using log files (legacy):**
+```bash
 # Check for errors
 tail -100 /opt/motia-iii/bitbylaw/iii_new.log | grep -i error
+
+# Follow log file
+tail -f /opt/motia-iii/bitbylaw/iii_new.log
+```
+
+### Inspect State and Streams
+
+**State Management:**
+```bash
+# List all state keys
+iii-cli state ls
+
+# Get specific state value
+iii-cli state get document:sync:last_run
+
+# Set state (if needed for testing)
+iii-cli state set test:key "test value"
+
+# Delete state
+iii-cli state delete test:key
+```
+
+**Stream Management:**
+```bash
+# List all active streams
+iii-cli stream ls
+
+# Inspect specific stream
+iii-cli stream info <stream-id>
+
+# List consumer groups
+iii-cli stream groups <stream-name>
+```
+
+### Debugging Workflow
+
+**1. Identify the Issue:**
+```bash
+# Check if step is registered
+iii-cli flow list | grep my_step
+
+# View recent errors
+iii-cli logs --level error -n 50
+
+# Check service status
+sudo systemctl status motia.service
+```
+
+**2. Get Detailed Information:**
+```bash
+# Live tail logs for specific step
+iii-cli logs -f | grep "document_sync"
+
+# Check worker processes
+iii-cli worker logs
+
+# Inspect state
+iii-cli state ls
+```
+
+**3. Test Specific Functionality:**
+```bash
+# Trigger webhook manually
+curl -X POST http://localhost:3111/vmh/webhook/...
+
+# Check response and logs
+iii-cli logs -f | grep "webhook"
+
+# Verify state changed
+iii-cli state get entity:sync:status
+```
+
+**4. Trace Specific Request:**
+```bash
+# Make request, note trace ID from logs
+curl -X POST http://localhost:3111/vmh/webhook/document/create ...
+
+# Get full trace
+iii-cli trace <trace-id>
+
+# View all logs for this trace
+iii-cli logs --trace <trace-id>
+```
+
+### Performance Monitoring
+
+**Check System Resources:**
+```bash
+# CPU and memory usage
+htop
+
+# Process-specific
+ps aux | grep iii
+
+# Redis memory
+redis-cli info memory
+
+# File descriptors
+lsof -p $(pgrep -f "iii-cli start")
+```
+
+**Check Processing Metrics:**
+```bash
+# Queue lengths (if using Redis streams)
+redis-cli XINFO STREAM vmh:document:sync
+
+# Pending messages
+redis-cli XPENDING vmh:document:sync group1
+
+# Lock status
+redis-cli KEYS "lock:*"
 ```

 ### Common Issues

 **Step not showing up:**
 1. Check file naming: Must end with `_step.py`
-2. Check for import errors: `grep -i "importerror\|traceback" iii.log`
-3. Verify `config` dict is present
-4. Restart iii engine
+2. Check for syntax errors: `iii-cli logs --level error`
+3. Check for import errors: `iii-cli logs | grep -i "importerror\|traceback"`
+4. Verify `config` dict is present
+5. Restart: `sudo systemctl restart motia.service` or restart `iii-cli dev`
+6. Verify hot-reload working: Check terminal output in `iii-cli dev`

 **Redis connection failed:**
 - Check `REDIS_HOST` and `REDIS_PORT` environment variables
 - Verify Redis is running: `redis-cli ping`
+- Check Redis logs: `journalctl -u redis -f`
 - Service will work without Redis but with warnings

+**Hot-reload not working:**
+- Verify using `iii-cli dev` (not `iii-cli start`)
+- Check `watch` patterns in `iii-config.yaml`
+- Ensure files are in watched directories (`src/**/*.py`)
+- Look for watcher errors: `iii-cli logs | grep -i "watch"`
+
+**Handler not triggered:**
+- Verify endpoint registered: `iii-cli flow list`
+- Check HTTP method matches (GET, POST, etc.)
+- Test with curl to isolate issue
+- Check trigger configuration in step's `config` dict
+
 **AttributeError '_log' not found:**
 - Ensure service inherits from `BaseSyncUtils` OR
 - Implement `_log()` method manually

+**Trace not found:**
+- Ensure OpenTelemetry enabled in config
+- Check if trace ID is valid format
+- Use `iii-cli logs` with filters instead
+
+**Console not accessible:**
+- Check if console service running: `systemctl status iii-console.service`
+- Verify port not blocked by firewall: `sudo ufw status`
+- Check console logs: `journalctl -u iii-console.service -f`
+- Try accessing via `localhost:3113` instead of public IP
+
 ---

 ## Key Patterns Summary
--- a/iii-config.yaml
+++ b/iii-config.yaml
@@ -78,6 +78,6 @@ modules:
  - class: modules::shell::ExecModule
    config:
      watch:
-        - steps/**/*.py
+        - src/steps/**/*.py
      exec:
-        - /opt/bin/uv run python -m motia.cli run --dir steps
+        - /usr/local/bin/uv run python -m motia.cli run --dir src/steps
--- a/services/langchain_xai_service.py
+++ b/services/langchain_xai_service.py
@@ -17,7 +17,7 @@ class LangChainXAIService:
    
    Usage:
        service = LangChainXAIService(ctx)
-        model = service.get_chat_model(model="grok-2-latest")
+        model = service.get_chat_model(model="grok-4-1-fast-reasoning")
        model_with_tools = service.bind_file_search(model, collection_id)
        result = await service.invoke_chat(model_with_tools, messages)
    """
@@ -46,7 +46,7 @@ class LangChainXAIService:
    
    def get_chat_model(
        self,
-        model: str = "grok-2-latest",
+        model: str = "grok-4-1-fast-reasoning",
        temperature: float = 0.7,
        max_tokens: Optional[int] = None
    ):
@@ -54,7 +54,7 @@ class LangChainXAIService:
        Initialisiert ChatXAI Model.
        
        Args:
-            model: Model name (default: grok-2-latest)
+            model: Model name (default: grok-4-1-fast-reasoning)
            temperature: Sampling temperature 0.0-1.0
            max_tokens: Optional max tokens for response
            
--- a/src/steps/init.py
+++ b/src/steps/init.py
--- a/src/steps/advoware_cal_sync/README.md
+++ b/src/steps/advoware_cal_sync/README.md
--- a/src/steps/advoware_cal_sync/init.py
+++ b/src/steps/advoware_cal_sync/init.py
--- a/src/steps/advoware_cal_sync/calendar_sync_all_step.py
+++ b/src/steps/advoware_cal_sync/calendar_sync_all_step.py
--- a/src/steps/advoware_cal_sync/calendar_sync_api_step.py
+++ b/src/steps/advoware_cal_sync/calendar_sync_api_step.py
--- a/src/steps/advoware_cal_sync/calendar_sync_cron_step.py
+++ b/src/steps/advoware_cal_sync/calendar_sync_cron_step.py
--- a/src/steps/advoware_cal_sync/calendar_sync_event_step.py
+++ b/src/steps/advoware_cal_sync/calendar_sync_event_step.py
--- a/src/steps/advoware_cal_sync/calendar_sync_utils.py
+++ b/src/steps/advoware_cal_sync/calendar_sync_utils.py
--- a/src/steps/advoware_proxy/README.md
+++ b/src/steps/advoware_proxy/README.md
--- a/src/steps/advoware_proxy/init.py
+++ b/src/steps/advoware_proxy/init.py
--- a/src/steps/advoware_proxy/advoware_api_proxy_delete_step.py
+++ b/src/steps/advoware_proxy/advoware_api_proxy_delete_step.py
--- a/src/steps/advoware_proxy/advoware_api_proxy_get_step.py
+++ b/src/steps/advoware_proxy/advoware_api_proxy_get_step.py
--- a/src/steps/advoware_proxy/advoware_api_proxy_post_step.py
+++ b/src/steps/advoware_proxy/advoware_api_proxy_post_step.py
--- a/src/steps/advoware_proxy/advoware_api_proxy_put_step.py
+++ b/src/steps/advoware_proxy/advoware_api_proxy_put_step.py
--- a/src/steps/vmh/init.py
+++ b/src/steps/vmh/init.py
--- a/src/steps/vmh/aiknowledge_full_sync_cron_step.py
+++ b/src/steps/vmh/aiknowledge_full_sync_cron_step.py
--- a/src/steps/vmh/aiknowledge_sync_event_step.py
+++ b/src/steps/vmh/aiknowledge_sync_event_step.py
--- a/src/steps/vmh/bankverbindungen_sync_event_step.py
+++ b/src/steps/vmh/bankverbindungen_sync_event_step.py
--- a/src/steps/vmh/beteiligte_sync_cron_step.py
+++ b/src/steps/vmh/beteiligte_sync_cron_step.py
--- a/src/steps/vmh/beteiligte_sync_event_step.py
+++ b/src/steps/vmh/beteiligte_sync_event_step.py
--- a/src/steps/vmh/document_sync_event_step.py
+++ b/src/steps/vmh/document_sync_event_step.py
--- a/src/steps/vmh/webhook/init.py
+++ b/src/steps/vmh/webhook/init.py
--- a/src/steps/vmh/webhook/aiknowledge_update_api_step.py
+++ b/src/steps/vmh/webhook/aiknowledge_update_api_step.py
--- a/src/steps/vmh/webhook/bankverbindungen_create_api_step.py
+++ b/src/steps/vmh/webhook/bankverbindungen_create_api_step.py
--- a/src/steps/vmh/webhook/bankverbindungen_delete_api_step.py
+++ b/src/steps/vmh/webhook/bankverbindungen_delete_api_step.py
--- a/src/steps/vmh/webhook/bankverbindungen_update_api_step.py
+++ b/src/steps/vmh/webhook/bankverbindungen_update_api_step.py
--- a/src/steps/vmh/webhook/beteiligte_create_api_step.py
+++ b/src/steps/vmh/webhook/beteiligte_create_api_step.py
--- a/src/steps/vmh/webhook/beteiligte_delete_api_step.py
+++ b/src/steps/vmh/webhook/beteiligte_delete_api_step.py
--- a/src/steps/vmh/webhook/beteiligte_update_api_step.py
+++ b/src/steps/vmh/webhook/beteiligte_update_api_step.py
--- a/src/steps/vmh/webhook/document_create_api_step.py
+++ b/src/steps/vmh/webhook/document_create_api_step.py
--- a/src/steps/vmh/webhook/document_delete_api_step.py
+++ b/src/steps/vmh/webhook/document_delete_api_step.py
--- a/src/steps/vmh/webhook/document_update_api_step.py
+++ b/src/steps/vmh/webhook/document_update_api_step.py
--- a/steps/ai/init.py
+++ b/steps/ai/init.py
--- a/steps/ai/chat_completions_api_step.py
+++ b/steps/ai/chat_completions_api_step.py
@@ -1,374 +0,0 @@
-"""AI Chat Completions API
-
-OpenAI-compatible Chat Completions endpoint with xAI/LangChain backend.
-
-Features:
- File Search (RAG) via xAI Collections  
- Web Search via xAI web_search tool
- Aktenzeichen-based automatic collection lookup
- Multiple tools simultaneously
- Clean, reusable architecture for future LLM endpoints
-
-Note: Streaming is not supported (Motia limitation - returns clear error).
-
-Reusability:
- extract_request_params(): Parse requests for any LLM endpoint
- resolve_collection_id(): Auto-detect Aktenzeichen, lookup collection
- initialize_model_with_tools(): Bind tools to any LangChain model
- invoke_and_format_response(): Standard OpenAI response formatting
-"""
-import time
-from typing import Any, Dict, List, Optional
-from motia import FlowContext, http, ApiRequest, ApiResponse
-
-config = {
-    "name": "AI Chat Completions API",
-    "description": "OpenAI-compatible Chat Completions API with xAI backend",
-    "flows": ["ai-general"],
-    "triggers": [
-        http("POST", "/ai/v1/chat/completions"),
-        http("POST", "/v1/chat/completions")
-    ],
-}
-
-
-# ============================================================================
-# MAIN HANDLER
-# ============================================================================
-
-async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
-    """
-    OpenAI-compatible Chat Completions endpoint.
-    
-    Returns:
-        ApiResponse with chat completion or error
-    """
-    ctx.logger.info("=" * 80)
-    ctx.logger.info("🤖 AI Chat Completions API")
-    ctx.logger.info("=" * 80)
-    
-    try:
-        # 1. Parse and validate request
-        params = extract_request_params(request, ctx)
-        
-        # 2. Check streaming (not supported)
-        if params['stream']:
-            return ApiResponse(
-                status=501,
-                body={
-                    'error': {
-                        'message': 'Streaming is not supported. Please set stream=false.',
-                        'type': 'not_implemented',
-                        'param': 'stream'
-                    }
-                }
-            )
-        
-        # 3. Resolve collection (explicit ID or Aktenzeichen lookup)
-        collection_id = await resolve_collection_id(
-            params['collection_id'],
-            params['messages'],
-            params['enable_web_search'],
-            ctx
-        )
-        
-        # 4. Validate: collection or web_search required
-        if not collection_id and not params['enable_web_search']:
-            return ApiResponse(
-                status=400,
-                body={
-                    'error': {
-                        'message': 'Either collection_id or enable_web_search must be provided',
-                        'type': 'invalid_request_error'
-                    }
-                }
-            )
-        
-        # 5. Initialize LLM with tools
-        model_with_tools = await initialize_model_with_tools(
-            model_name=params['model'],
-            temperature=params['temperature'],
-            max_tokens=params['max_tokens'],
-            collection_id=collection_id,
-            enable_web_search=params['enable_web_search'],
-            web_search_config=params['web_search_config'],
-            ctx=ctx
-        )
-        
-        # 6. Invoke LLM
-        completion_id = f"chatcmpl-{int(time.time())}"
-        response = await invoke_and_format_response(
-            model=model_with_tools,
-            messages=params['messages'],
-            completion_id=completion_id,
-            model_name=params['model'],
-            ctx=ctx
-        )
-        
-        ctx.logger.info(f"✅ Completion successful – {len(response.body['choices'][0]['message']['content'])} chars")
-        return response
-        
-    except ValueError as e:
-        ctx.logger.error(f"❌ Validation error: {e}")
-        return ApiResponse(
-            status=400,
-            body={'error': {'message': str(e), 'type': 'invalid_request_error'}}
-        )
-    except Exception as e:
-        ctx.logger.error(f"❌ Error: {e}")
-        return ApiResponse(
-            status=500,
-            body={'error': {'message': 'Internal server error', 'type': 'server_error'}}
-        )
-
-
-# ============================================================================
-# REUSABLE HELPER FUNCTIONS
-# ============================================================================
-
-def extract_request_params(request: ApiRequest, ctx: FlowContext) -> Dict[str, Any]:
-    """
-    Extract and validate request parameters.
-    
-    Returns:
-        Dict with validated parameters
-        
-    Raises:
-        ValueError: If validation fails
-    """
-    body = request.body or {}
-    
-    if not isinstance(body, dict):
-        raise ValueError("Request body must be JSON object")
-    
-    messages = body.get('messages', [])
-    if not messages or not isinstance(messages, list):
-        raise ValueError("messages must be non-empty array")
-    
-    # Extract parameters with defaults
-    params = {
-        'model': body.get('model', 'grok-4-1-fast-reasoning'),
-        'messages': messages,
-        'temperature': body.get('temperature', 0.7),
-        'max_tokens': body.get('max_tokens'),
-        'stream': body.get('stream', False),
-        'extra_body': body.get('extra_body', {}),
-    }
-    
-    # Handle enable_web_search (body or extra_body)
-    params['enable_web_search'] = body.get(
-        'enable_web_search',
-        params['extra_body'].get('enable_web_search', False)
-    )
-    
-    # Handle web_search_config
-    params['web_search_config'] = body.get(
-        'web_search_config',
-        params['extra_body'].get('web_search_config', {})
-    )
-    
-    # Handle collection_id (multiple sources)
-    params['collection_id'] = (
-        body.get('collection_id') or
-        body.get('custom_collection_id') or
-        params['extra_body'].get('collection_id')
-    )
-    
-    # Log concisely
-    ctx.logger.info(f"📋 Model: {params['model']} | Stream: {params['stream']}")
-    ctx.logger.info(f"📋 Web Search: {params['enable_web_search']} | Collection: {params['collection_id'] or 'auto'}")
-    ctx.logger.info(f"📨 Messages: {len(messages)}")
-    
-    return params
-
-
-async def resolve_collection_id(
-    explicit_collection_id: Optional[str],
-    messages: List[Dict[str, Any]],
-    enable_web_search: bool,
-    ctx: FlowContext
-) -> Optional[str]:
-    """
-    Resolve collection ID from explicit ID or Aktenzeichen auto-detection.
-    
-    Args:
-        explicit_collection_id: Explicitly provided collection ID
-        messages: Chat messages (for Aktenzeichen extraction)
-        enable_web_search: Whether web search is enabled
-        ctx: Motia context
-        
-    Returns:
-        Collection ID or None
-    """
-    # Explicit collection ID takes precedence
-    if explicit_collection_id:
-        ctx.logger.info(f"🔍 Using explicit collection: {explicit_collection_id}")
-        return explicit_collection_id
-    
-    # Try Aktenzeichen auto-detection from first user message
-    from services.aktenzeichen_utils import (
-        extract_aktenzeichen,
-        normalize_aktenzeichen,
-        remove_aktenzeichen
-    )
-    
-    for msg in messages:
-        if msg.get('role') == 'user':
-            content = msg.get('content', '')
-            aktenzeichen_raw = extract_aktenzeichen(content)
-            
-            if aktenzeichen_raw:
-                aktenzeichen = normalize_aktenzeichen(aktenzeichen_raw)
-                ctx.logger.info(f"🔍 Aktenzeichen detected: {aktenzeichen}")
-                
-                collection_id = await lookup_collection_by_aktenzeichen(aktenzeichen, ctx)
-                
-                if collection_id:
-                    # Clean Aktenzeichen from message
-                    msg['content'] = remove_aktenzeichen(content)
-                    ctx.logger.info(f"✅ Collection found: {collection_id}")
-                    return collection_id
-                else:
-                    ctx.logger.warning(f"⚠️  No collection for Aktenzeichen: {aktenzeichen}")
-            break  # Only check first user message
-    
-    return None
-
-
-async def initialize_model_with_tools(
-    model_name: str,
-    temperature: float,
-    max_tokens: Optional[int],
-    collection_id: Optional[str],
-    enable_web_search: bool,
-    web_search_config: Dict[str, Any],
-    ctx: FlowContext
-) -> Any:
-    """
-    Initialize LangChain model with tool bindings (file_search, web_search).
-    
-    Returns:
-        Model instance with tools bound
-    """
-    from services.langchain_xai_service import LangChainXAIService
-    
-    service = LangChainXAIService(ctx)
-    
-    # Create base model
-    model = service.get_chat_model(
-        model=model_name,
-        temperature=temperature,
-        max_tokens=max_tokens
-    )
-    
-    # Bind tools
-    model_with_tools = service.bind_tools(
-        model=model,
-        collection_id=collection_id,
-        enable_web_search=enable_web_search,
-        web_search_config=web_search_config,
-        max_num_results=10
-    )
-    
-    return model_with_tools
-
-
-async def invoke_and_format_response(
-    model: Any,
-    messages: List[Dict[str, Any]],
-    completion_id: str,
-    model_name: str,
-    ctx: FlowContext
-) -> ApiResponse:
-    """
-    Invoke LLM and format response in OpenAI-compatible format.
-    
-    Returns:
-        ApiResponse with chat completion
-    """
-    from services.langchain_xai_service import LangChainXAIService
-    
-    service = LangChainXAIService(ctx)
-    result = await service.invoke_chat(model, messages)
-    
-    # Extract content (handle structured responses)
-    if hasattr(result, 'content'):
-        raw = result.content
-        if isinstance(raw, list):
-            # Extract text parts from structured response
-            text_parts = [
-                item.get('text', '')
-                for item in raw
-                if isinstance(item, dict) and item.get('type') == 'text'
-            ]
-            content = ''.join(text_parts) or str(raw)
-        else:
-            content = raw
-    else:
-        content = str(result)
-    
-    # Extract usage metadata (if available)
-    usage = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
-    if hasattr(result, 'usage_metadata'):
-        u = result.usage_metadata
-        usage = {
-            "prompt_tokens": getattr(u, 'input_tokens', 0),
-            "completion_tokens": getattr(u, 'output_tokens', 0),
-            "total_tokens": getattr(u, 'input_tokens', 0) + getattr(u, 'output_tokens', 0)
-        }
-    
-    # Format OpenAI-compatible response
-    response_body = {
-        'id': completion_id,
-        'object': 'chat.completion',
-        'created': int(time.time()),
-        'model': model_name,
-        'choices': [{
-            'index': 0,
-            'message': {'role': 'assistant', 'content': content},
-            'finish_reason': 'stop'
-        }],
-        'usage': usage
-    }
-    
-    return ApiResponse(status=200, body=response_body)
-
-
-async def lookup_collection_by_aktenzeichen(
-    aktenzeichen: str,
-    ctx: FlowContext
-) -> Optional[str]:
-    """
-    Lookup xAI Collection ID by Aktenzeichen via EspoCRM.
-    
-    Args:
-        aktenzeichen: Normalized Aktenzeichen (e.g., "1234/56")
-        ctx: Motia context
-        
-    Returns:
-        Collection ID or None if not found
-    """
-    try:
-        from services.espocrm import EspoCRMAPI
-        
-        espocrm = EspoCRMAPI(ctx)
-        
-        search_result = await espocrm.search_entities(
-            entity_type='Raeumungsklage',
-            where=[{
-                'type': 'equals',
-                'attribute': 'advowareAkteBezeichner',
-                'value': aktenzeichen
-            }],
-            select=['id', 'xaiCollectionId'],
-            maxSize=1
-        )
-        
-        if search_result and len(search_result) > 0:
-            return search_result[0].get('xaiCollectionId')
-        
-        return None
-        
-    except Exception as e:
-        ctx.logger.error(f"❌ Collection lookup failed: {e}")
-        return None
--- a/steps/ai/models_list_api_step.py
+++ b/steps/ai/models_list_api_step.py
@@ -1,124 +0,0 @@
-"""AI Models List API
-
-OpenAI-compatible models list endpoint for OpenWebUI and other clients.
-Returns all available AI models that can be used with /ai/chat/completions.
-"""
-import time
-from typing import Any
-from motia import FlowContext, http, ApiRequest, ApiResponse
-
-
-config = {
-    "name": "AI Models List API",
-    "description": "OpenAI-compatible models endpoint - lists available AI models",
-    "flows": ["ai-general"],
-    "triggers": [
-        http("GET", "/ai/v1/models"),
-        http("GET", "/v1/models"),
-        http("GET", "/ai/models")
-    ],
-}
-
-
-async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
-    """
-    OpenAI-compatible models list endpoint.
-    
-    Returns list of available models for OpenWebUI and other clients.
-    
-    Response Format (OpenAI compatible):
-        {
-            "object": "list",
-            "data": [
-                {
-                    "id": "grok-4.20-beta-0309-reasoning",
-                    "object": "model",
-                    "created": 1735689600,
-                    "owned_by": "xai",
-                    "permission": [],
-                    "root": "grok-4.20-beta-0309-reasoning",
-                    "parent": null
-                }
-            ]
-        }
-    """
-    ctx.logger.info("📋 Models list requested")
-    
-    try:
-        # Define available models
-        # These correspond to models supported by /ai/chat/completions
-        current_timestamp = int(time.time())
-        
-        models = [
-            {
-                "id": "grok-4.20-beta-0309-reasoning",
-                "object": "model",
-                "created": current_timestamp,
-                "owned_by": "xai",
-                "permission": [],
-                "root": "grok-4.20-beta-0309-reasoning",
-                "parent": None,
-                "capabilities": {
-                    "file_search": True,
-                    "web_search": True,
-                    "streaming": True,
-                    "reasoning": True
-                }
-            },
-            {
-                "id": "grok-4.20-multi-agent-beta-0309",
-                "object": "model",
-                "created": current_timestamp,
-                "owned_by": "xai",
-                "permission": [],
-                "root": "grok-4.20-multi-agent-beta-0309",
-                "parent": None,
-                "capabilities": {
-                    "file_search": True,
-                    "web_search": True,
-                    "streaming": True,
-                    "reasoning": True,
-                    "multi_agent": True
-                }
-            },
-            {
-                "id": "grok-4-1-fast-reasoning",
-                "object": "model",
-                "created": current_timestamp,
-                "owned_by": "xai",
-                "permission": [],
-                "root": "grok-4-1-fast-reasoning",
-                "parent": None,
-                "capabilities": {
-                    "file_search": True,
-                    "web_search": True,
-                    "streaming": True,
-                    "reasoning": True
-                }
-            }
-        ]
-        
-        # Build OpenAI-compatible response
-        response_body = {
-            "object": "list",
-            "data": models
-        }
-        
-        ctx.logger.info(f"✅ Returned {len(models)} models")
-        
-        return ApiResponse(
-            status=200,
-            body=response_body
-        )
-    
-    except Exception as e:
-        ctx.logger.error(f"❌ Error listing models: {e}")
-        return ApiResponse(
-            status=500,
-            body={
-                "error": {
-                    "message": str(e),
-                    "type": "server_error"
-                }
-            }
-        )
--- a/steps/vmh/xai_chat_completion_api_step.py
+++ b/steps/vmh/xai_chat_completion_api_step.py
@@ -1,523 +0,0 @@
-"""VMH xAI Chat Completions API
-
-OpenAI-kompatible Chat Completions API mit xAI/LangChain Backend.
-Unterstützt file_search über xAI Collections (RAG).
-"""
-import json
-import time
-from typing import Any, Dict, List, Optional
-from motia import FlowContext, http, ApiRequest, ApiResponse
-
-
-config = {
-    "name": "VMH xAI Chat Completions API",
-    "description": "OpenAI-compatible Chat Completions API with xAI LangChain backend",
-    "flows": ["vmh-chat"],
-    "triggers": [
-        http("POST", "/vmh/v1/chat/completions")
-    ],
-}
-
-
-async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
-    """
-    OpenAI-compatible Chat Completions endpoint.
-    
-    Request Body (OpenAI format):
-        {
-            "model": "grok-2-latest",
-            "messages": [
-                {"role": "system", "content": "You are helpful"},
-                {"role": "user", "content": "1234/56 Was ist der Stand?"}
-            ],
-            "temperature": 0.7,
-            "max_tokens": 2000,
-            "stream": false,
-            "extra_body": {
-                "collection_id": "col_abc123",  // Optional: override auto-detection
-                "enable_web_search": true,       // Optional: enable web search (default: false)
-                "web_search_config": {           // Optional: web search configuration
-                    "allowed_domains": ["example.com"],
-                    "excluded_domains": ["spam.com"],
-                    "enable_image_understanding": true
-                }
-            }
-        }
-    
-    Aktenzeichen-Erkennung (Priority):
-        1. extra_body.collection_id (explicit override)
-        2. First user message starts with Aktenzeichen (e.g., "1234/56 ...")
-        3. Error 400 if no collection_id found (strict mode)
-    
-    Response (OpenAI format):
-        Non-Streaming:
-            {
-                "id": "chatcmpl-...",
-                "object": "chat.completion",
-                "created": 1234567890,
-                "model": "grok-2-latest",
-                "choices": [{
-                    "index": 0,
-                    "message": {"role": "assistant", "content": "..."},
-                    "finish_reason": "stop"
-                }],
-                "usage": {"prompt_tokens": X, "completion_tokens": Y, "total_tokens": Z}
-            }
-        
-        Streaming (SSE):
-            data: {"id":"chatcmpl-...","choices":[{"delta":{"content":"Hello"},...}]}
-            data: {"id":"chatcmpl-...","choices":[{"delta":{"content":" world"},...}]}
-            data: {"choices":[{"delta":{},"finish_reason":"stop"}]}
-            data: [DONE]
-    """
-    from services.langchain_xai_service import LangChainXAIService
-    from services.aktenzeichen_utils import extract_aktenzeichen, normalize_aktenzeichen
-    from services.espocrm import EspoCRMAPI
-    
-    ctx.logger.info("=" * 80)
-    ctx.logger.info("💬 VMH CHAT COMPLETIONS API")
-    ctx.logger.info("=" * 80)
-    
-    try:
-        # Parse request body
-        body = request.body or {}
-        
-        if not isinstance(body, dict):
-            ctx.logger.error(f"❌ Invalid request body type: {type(body)}")
-            return ApiResponse(
-                status=400,
-                body={'error': 'Request body must be JSON object'}
-            )
-        
-        # Extract parameters
-        model_name = body.get('model', 'grok-4.20-beta-0309-reasoning')
-        messages = body.get('messages', [])
-        temperature = body.get('temperature', 0.7)
-        max_tokens = body.get('max_tokens')
-        stream = body.get('stream', False)
-        extra_body = body.get('extra_body', {})
-        
-        # Web Search parameters (default: disabled)
-        enable_web_search = extra_body.get('enable_web_search', False)
-        web_search_config = extra_body.get('web_search_config', {})
-        
-        ctx.logger.info(f"📋 Model: {model_name}")
-        ctx.logger.info(f"📋 Messages: {len(messages)}")
-        ctx.logger.info(f"📋 Stream: {stream}")
-        ctx.logger.info(f"📋 Web Search: {'enabled' if enable_web_search else 'disabled'}")
-        if enable_web_search and web_search_config:
-            ctx.logger.debug(f"Web Search Config: {json.dumps(web_search_config, indent=2)}")
-        
-        # Log full conversation messages
-        ctx.logger.info("-" * 80)
-        ctx.logger.info("📨 REQUEST MESSAGES:")
-        for i, msg in enumerate(messages, 1):
-            role = msg.get('role', 'unknown')
-            content = msg.get('content', '')
-            preview = content[:150] + "..." if len(content) > 150 else content
-            ctx.logger.info(f"  [{i}] {role}: {preview}")
-        ctx.logger.info("-" * 80)
-        
-        # Validate messages
-        if not messages or not isinstance(messages, list):
-            ctx.logger.error("❌ Missing or invalid messages array")
-            return ApiResponse(
-                status=400,
-                body={'error': 'messages must be non-empty array'}
-            )
-        
-        # Determine collection_id (Priority: extra_body > Aktenzeichen > error)
-        collection_id: Optional[str] = None
-        aktenzeichen: Optional[str] = None
-        
-        # Priority 1: Explicit collection_id in extra_body
-        if 'collection_id' in extra_body:
-            collection_id = extra_body['collection_id']
-            ctx.logger.info(f"🔍 Collection ID from extra_body: {collection_id}")
-        
-        # Priority 2: Extract Aktenzeichen from first user message
-        else:
-            for msg in messages:
-                if msg.get('role') == 'user':
-                    content = msg.get('content', '')
-                    aktenzeichen_raw = extract_aktenzeichen(content)
-                    
-                    if aktenzeichen_raw:
-                        aktenzeichen = normalize_aktenzeichen(aktenzeichen_raw)
-                        ctx.logger.info(f"🔍 Aktenzeichen detected: {aktenzeichen}")
-                        
-                        # Lookup collection_id via EspoCRM
-                        collection_id = await lookup_collection_by_aktenzeichen(
-                            aktenzeichen, ctx
-                        )
-                        
-                        if collection_id:
-                            ctx.logger.info(f"✅ Collection found: {collection_id}")
-                            
-                            # Remove Aktenzeichen from message (clean prompt)
-                            from services.aktenzeichen_utils import remove_aktenzeichen
-                            msg['content'] = remove_aktenzeichen(content)
-                            ctx.logger.debug(f"Cleaned message: {msg['content']}")
-                        else:
-                            ctx.logger.warn(f"⚠️  No collection found for {aktenzeichen}")
-                    
-                    break  # Only check first user message
-        
-        # Priority 3: Error if no collection_id AND web_search disabled
-        if not collection_id and not enable_web_search:
-            ctx.logger.error("❌ No collection_id found and web_search disabled")
-            ctx.logger.error("   Provide collection_id, enable web_search, or both")
-            return ApiResponse(
-                status=400,
-                body={
-                    'error': 'collection_id or web_search required',
-                    'message': 'Provide collection_id in extra_body, enable web_search, or start message with Aktenzeichen (e.g., "1234/56 question")'
-                }
-            )
-        
-        # Initialize LangChain xAI Service
-        try:
-            langchain_service = LangChainXAIService(ctx)
-        except ValueError as e:
-            ctx.logger.error(f"❌ Service initialization failed: {e}")
-            return ApiResponse(
-                status=500,
-                body={'error': 'Service configuration error', 'details': str(e)}
-            )
-        
-        # Create ChatXAI model
-        model = langchain_service.get_chat_model(
-            model=model_name,
-            temperature=temperature,
-            max_tokens=max_tokens
-        )
-        
-        # Bind tools (file_search and/or web_search)
-        model_with_tools = langchain_service.bind_tools(
-            model=model,
-            collection_id=collection_id,
-            enable_web_search=enable_web_search,
-            web_search_config=web_search_config,
-            max_num_results=10
-        )
-        
-        # Generate completion_id
-        completion_id = f"chatcmpl-{ctx.traceId[:12]}" if hasattr(ctx, 'traceId') else f"chatcmpl-{int(time.time())}"
-        created_ts = int(time.time())
-        
-        # Branch: Streaming vs Non-Streaming
-        if stream:
-            ctx.logger.info("🌊 Starting streaming response...")
-            return await handle_streaming_response(
-                model_with_tools=model_with_tools,
-                messages=messages,
-                completion_id=completion_id,
-                created_ts=created_ts,
-                model_name=model_name,
-                langchain_service=langchain_service,
-                ctx=ctx
-            )
-        else:
-            ctx.logger.info("📦 Starting non-streaming response...")
-            return await handle_non_streaming_response(
-                model_with_tools=model_with_tools,
-                messages=messages,
-                completion_id=completion_id,
-                created_ts=created_ts,
-                model_name=model_name,
-                langchain_service=langchain_service,
-                ctx=ctx
-            )
-    
-    except Exception as e:
-        ctx.logger.error("=" * 80)
-        ctx.logger.error("❌ ERROR: CHAT COMPLETIONS API")
-        ctx.logger.error("=" * 80)
-        ctx.logger.error(f"Error: {e}", exc_info=True)
-        ctx.logger.error(f"Request body: {json.dumps(request.body, indent=2, ensure_ascii=False)}")
-        ctx.logger.error("=" * 80)
-        
-        return ApiResponse(
-            status=500,
-            body={
-                'error': 'Internal server error',
-                'message': str(e)
-            }
-        )
-
-
-async def handle_non_streaming_response(
-    model_with_tools,
-    messages: List[Dict[str, Any]],
-    completion_id: str,
-    created_ts: int,
-    model_name: str,
-    langchain_service,
-    ctx: FlowContext
-) -> ApiResponse:
-    """
-    Handle non-streaming chat completion.
-    
-    Returns:
-        ApiResponse with OpenAI-format JSON body
-    """
-    try:
-        # Invoke model
-        result = await langchain_service.invoke_chat(model_with_tools, messages)
-        
-        # Extract content - handle both string and structured responses
-        if hasattr(result, 'content'):
-            raw_content = result.content
-            
-            # If content is a list (tool calls + text message), extract text
-            if isinstance(raw_content, list):
-                # Find the text message (usually last element with type='text')
-                text_messages = [
-                    item.get('text', '') 
-                    for item in raw_content 
-                    if isinstance(item, dict) and item.get('type') == 'text'
-                ]
-                content = text_messages[0] if text_messages else str(raw_content)
-            else:
-                content = raw_content
-        else:
-            content = str(result)
-        
-        # Build OpenAI-compatible response
-        response_body = {
-            'id': completion_id,
-            'object': 'chat.completion',
-            'created': created_ts,
-            'model': model_name,
-            'choices': [{
-                'index': 0,
-                'message': {
-                    'role': 'assistant',
-                    'content': content
-                },
-                'finish_reason': 'stop'
-            }],
-            'usage': {
-                'prompt_tokens': 0,  # LangChain doesn't expose token counts easily
-                'completion_tokens': 0,
-                'total_tokens': 0
-            }
-        }
-        
-        # Log token usage (if available)
-        if hasattr(result, 'usage_metadata'):
-            usage = result.usage_metadata
-            prompt_tokens = getattr(usage, 'input_tokens', 0)
-            completion_tokens = getattr(usage, 'output_tokens', 0)
-            response_body['usage'] = {
-                'prompt_tokens': prompt_tokens,
-                'completion_tokens': completion_tokens,
-                'total_tokens': prompt_tokens + completion_tokens
-            }
-            ctx.logger.info(f"📊 Token Usage: prompt={prompt_tokens}, completion={completion_tokens}")
-        
-        # Log citations if available (from tool response annotations)
-        if hasattr(result, 'content') and isinstance(result.content, list):
-            # Extract citations from structured response
-            for item in result.content:
-                if isinstance(item, dict) and item.get('type') == 'text':
-                    annotations = item.get('annotations', [])
-                    if annotations:
-                        ctx.logger.info(f"🔗 Citations: {len(annotations)}")
-                        for i, citation in enumerate(annotations[:10], 1):  # Log first 10
-                            url = citation.get('url', 'N/A')
-                            title = citation.get('title', '')
-                            if url.startswith('collections://'):
-                                # Internal collection reference
-                                ctx.logger.debug(f"   [{i}] Collection Document: {title}")
-                            else:
-                                # External URL
-                                ctx.logger.debug(f"   [{i}] {url}")
-        
-        # Log complete response content
-        ctx.logger.info(f"✅ Chat completion: {len(content)} chars")
-        ctx.logger.info("=" * 80)
-        ctx.logger.info("📝 COMPLETE RESPONSE:")
-        ctx.logger.info("-" * 80)
-        ctx.logger.info(content)
-        ctx.logger.info("-" * 80)
-        ctx.logger.info("=" * 80)
-        
-        return ApiResponse(
-            status=200,
-            body=response_body
-        )
-    
-    except Exception as e:
-        ctx.logger.error(f"❌ Non-streaming completion failed: {e}", exc_info=True)
-        raise
-
-
-async def handle_streaming_response(
-    model_with_tools,
-    messages: List[Dict[str, Any]],
-    completion_id: str,
-    created_ts: int,
-    model_name: str,
-    langchain_service,
-    ctx: FlowContext
-):
-    """
-    Handle streaming chat completion via SSE.
-    
-    Returns:
-        Streaming response generator
-    """
-    async def stream_generator():
-        try:
-            # Set SSE headers
-            await ctx.response.status(200)
-            await ctx.response.headers({
-                "Content-Type": "text/event-stream",
-                "Cache-Control": "no-cache",
-                "Connection": "keep-alive"
-            })
-            
-            ctx.logger.info("🌊 Streaming started")
-            
-            # Stream chunks
-            chunk_count = 0
-            total_content = ""
-            
-            async for chunk in langchain_service.astream_chat(model_with_tools, messages):
-                # Extract delta content - handle structured chunks
-                if hasattr(chunk, "content"):
-                    chunk_content = chunk.content
-                    
-                    # If chunk content is a list (tool calls), extract text parts
-                    if isinstance(chunk_content, list):
-                        # Accumulate only text deltas
-                        text_parts = [
-                            item.get('text', '')
-                            for item in chunk_content
-                            if isinstance(item, dict) and item.get('type') == 'text'
-                        ]
-                        delta = ''.join(text_parts)
-                    else:
-                        delta = chunk_content
-                else:
-                    delta = ""
-                
-                if delta:
-                    total_content += delta
-                    chunk_count += 1
-                    
-                    # Build SSE data
-                    data = {
-                        "id": completion_id,
-                        "object": "chat.completion.chunk",
-                        "created": created_ts,
-                        "model": model_name,
-                        "choices": [{
-                            "index": 0,
-                            "delta": {"content": delta},
-                            "finish_reason": None
-                        }]
-                    }
-                    
-                    # Send SSE event
-                    await ctx.response.stream(f"data: {json.dumps(data, ensure_ascii=False)}\n\n")
-            
-            # Send finish event
-            finish_data = {
-                "id": completion_id,
-                "object": "chat.completion.chunk",
-                "created": created_ts,
-                "model": model_name,
-                "choices": [{
-                    "index": 0,
-                    "delta": {},
-                    "finish_reason": "stop"
-                }]
-            }
-            await ctx.response.stream(f"data: {json.dumps(finish_data)}\n\n")
-            
-            # Send [DONE]
-            await ctx.response.stream("data: [DONE]\n\n")
-            
-            # Close stream
-            await ctx.response.close()
-            
-            # Log complete streamed response
-            ctx.logger.info(f"✅ Streaming completed: {chunk_count} chunks, {len(total_content)} chars")
-            ctx.logger.info("=" * 80)
-            ctx.logger.info("📝 COMPLETE STREAMED RESPONSE:")
-            ctx.logger.info("-" * 80)
-            ctx.logger.info(total_content)
-            ctx.logger.info("-" * 80)
-            ctx.logger.info("=" * 80)
-        
-        except Exception as e:
-            ctx.logger.error(f"❌ Streaming failed: {e}", exc_info=True)
-            
-            # Send error event
-            error_data = {
-                "error": {
-                    "message": str(e),
-                    "type": "server_error"
-                }
-            }
-            await ctx.response.stream(f"data: {json.dumps(error_data)}\n\n")
-            await ctx.response.close()
-    
-    return stream_generator()
-
-
-async def lookup_collection_by_aktenzeichen(
-    aktenzeichen: str,
-    ctx: FlowContext
-) -> Optional[str]:
-    """
-    Lookup xAI Collection ID for Aktenzeichen via EspoCRM.
-    
-    Search strategy:
-        1. Search for Raeumungsklage with matching advowareAkteBezeichner
-        2. Return xaiCollectionId if found
-    
-    Args:
-        aktenzeichen: Normalized Aktenzeichen (e.g., "1234/56")
-        ctx: Motia context
-        
-    Returns:
-        Collection ID or None if not found
-    """
-    try:
-        # Initialize EspoCRM API
-        espocrm = EspoCRMAPI(ctx)
-        
-        # Search Räumungsklage by advowareAkteBezeichner
-        ctx.logger.info(f"🔍 Searching Räumungsklage for Aktenzeichen: {aktenzeichen}")
-        
-        search_result = await espocrm.search_entities(
-            entity_type='Raeumungsklage',
-            where=[{
-                'type': 'equals',
-                'attribute': 'advowareAkteBezeichner',
-                'value': aktenzeichen
-            }],
-            select=['id', 'xaiCollectionId', 'advowareAkteBezeichner'],
-            maxSize=1
-        )
-        
-        if search_result and len(search_result) > 0:
-            entity = search_result[0]
-            collection_id = entity.get('xaiCollectionId')
-            
-            if collection_id:
-                ctx.logger.info(f"✅ Found Räumungsklage: {entity.get('id')}")
-                return collection_id
-            else:
-                ctx.logger.warn(f"⚠️  Räumungsklage found but no xaiCollectionId: {entity.get('id')}")
-        else:
-            ctx.logger.warn(f"⚠️  No Räumungsklage found for {aktenzeichen}")
-        
-        return None
-    
-    except Exception as e:
-        ctx.logger.error(f"❌ Collection lookup failed: {e}", exc_info=True)
-        return None
Author	SHA1	Message	Date
bsiggel	71f583481a	fix: Remove deprecated AI Chat Completions and Models List API implementations	2026-03-19 23:10:00 +00:00
bsiggel	48d440a860	fix: Remove deprecated VMH xAI Chat Completions API implementation	2026-03-19 21:42:43 +00:00
bsiggel	c02a5d8823	fix: Update ExecModule exec path to use correct binary location	2026-03-19 21:23:42 +00:00
bsiggel	edae5f6081	fix: Update ExecModule configuration to use correct source directory for step scripts	2026-03-19 21:20:31 +00:00
bsiggel	8ce843415e	feat: Enhance developer guide with updated platform evolution and workflow details	2026-03-19 20:56:32 +00:00
bsiggel	46085bd8dd	update to iii 0.90 and change directory structure	2026-03-19 20:33:49 +00:00
bsiggel	2ac83df1e0	fix: Update default chat model to grok-4-1-fast-reasoning and enhance logging for LLM responses	2026-03-19 09:50:31 +00:00