Compare commits

...

41 Commits

Author SHA1 Message Date
bsiggel
71f583481a fix: Remove deprecated AI Chat Completions and Models List API implementations 2026-03-19 23:10:00 +00:00
bsiggel
48d440a860 fix: Remove deprecated VMH xAI Chat Completions API implementation 2026-03-19 21:42:43 +00:00
bsiggel
c02a5d8823 fix: Update ExecModule exec path to use correct binary location 2026-03-19 21:23:42 +00:00
bsiggel
edae5f6081 fix: Update ExecModule configuration to use correct source directory for step scripts 2026-03-19 21:20:31 +00:00
bsiggel
8ce843415e feat: Enhance developer guide with updated platform evolution and workflow details 2026-03-19 20:56:32 +00:00
bsiggel
46085bd8dd update to iii 0.90 and change directory structure 2026-03-19 20:33:49 +00:00
bsiggel
2ac83df1e0 fix: Update default chat model to grok-4-1-fast-reasoning and enhance logging for LLM responses 2026-03-19 09:50:31 +00:00
bsiggel
7fffdb2660 fix: Simplify error logging in models list API handler 2026-03-19 09:48:57 +00:00
bsiggel
69f0c6a44d feat: Implement AI Chat Completions API with streaming support and models list endpoint
- Enhanced the AI Chat Completions API to support true streaming using async generators and proper SSE headers.
- Updated endpoint paths to align with OpenAI's API versioning.
- Improved logging for request details and error handling.
- Added a new AI Models List API to return available models compatible with chat completions.
- Refactored code for better readability and maintainability, including the extraction of common functionalities.
- Introduced a VMH-specific Chat Completions API with similar features and structure.
2026-03-18 21:30:59 +00:00
bsiggel
949a5fd69c feat: Implement AI Chat Completions API with support for file search, web search, and Aktenzeichen-based collection lookup 2026-03-18 18:22:04 +00:00
bsiggel
8e53fd6345 fix: Enhance tool binding in LangChainXAIService to support web search and update API handler for new parameters 2026-03-15 16:37:57 +00:00
bsiggel
59fdd7d9ec fix: Normalize MIME type for PDF uploads and update collection management endpoint to use vector store API 2026-03-15 16:34:13 +00:00
bsiggel
eaab14ae57 fix: Adjust multipart form to use raw UTF-8 encoding for filenames in file uploads 2026-03-14 23:00:49 +00:00
bsiggel
331d43390a fix: Import unquote for URL decoding in AI Knowledge synchronization utilities 2026-03-14 22:50:59 +00:00
bsiggel
18f2ff775e fix: URL-decode filenames in document synchronization to handle special characters 2026-03-14 22:49:07 +00:00
bsiggel
c032e24d7a fix: Update default model name to 'grok-4-1-fast-reasoning' in xAI Chat Completions API 2026-03-14 08:39:50 +00:00
bsiggel
4a5065aea4 feat: Add Aktenzeichen utility functions and LangChain xAI service integration
- Implemented utility functions for extracting, validating, and normalizing Aktenzeichen in 'aktenzeichen_utils.py'.
- Created LangChainXAIService for integrating LangChain ChatXAI with file search capabilities in 'langchain_xai_service.py'.
- Developed VMH xAI Chat Completions API to handle OpenAI-compatible requests with support for Aktenzeichen detection and file search in 'xai_chat_completion_api_step.py'.
2026-03-13 10:10:33 +00:00
bsiggel
bb13d59ddb fix: Improve orphan detection and Blake3 hash verification in document synchronization 2026-03-13 08:40:20 +00:00
bsiggel
b0fceef4e2 fix: Update sync mode logging to clarify Blake3 hash verification status 2026-03-12 23:09:21 +00:00
bsiggel
e727582584 fix: Update JunctionData URL construction to use API Gateway instead of direct EspoCRM endpoint 2026-03-12 23:07:33 +00:00
bsiggel
2292fd4762 feat: Enhance document synchronization logic to continue syncing after collection activation 2026-03-12 23:06:40 +00:00
bsiggel
9ada48d8c8 fix: Update collection ID retrieval logic and simplify error logging in AI Knowledge sync event handler 2026-03-12 23:04:01 +00:00
bsiggel
9a3e01d447 fix: Correct logging method from warning to warn for lock acquisition in AI Knowledge sync handler 2026-03-12 23:00:08 +00:00
bsiggel
e945333c1a feat: Update activation status references to 'aktivierungsstatus' for consistency across AI Knowledge sync utilities 2026-03-12 22:53:47 +00:00
bsiggel
6f7f847939 feat: Enhance AI Knowledge Update webhook handler to validate payload structure and handle empty lists 2026-03-12 22:51:44 +00:00
bsiggel
46c0bbf381 feat: Refactor AI Knowledge sync processes to remove full sync parameter and ensure Blake3 verification is always performed 2026-03-12 22:41:19 +00:00
bsiggel
8f1533337c feat: Enhance AI Knowledge sync process with full sync mode and attachment handling 2026-03-12 22:35:48 +00:00
bsiggel
6bf2343a12 feat: Enhance document synchronization by integrating CAIKnowledge handling and improving error logging 2026-03-12 22:30:11 +00:00
bsiggel
8ed7cca432 feat: Add logging utility for calendar sync operations and enhance error handling 2026-03-12 19:26:04 +00:00
bsiggel
9bbfa61b3b feat: Implement AI Knowledge Sync Utilities and Event Handlers
- Added AIKnowledgeActivationStatus and AIKnowledgeSyncStatus enums to models.py for managing activation and sync states.
- Introduced AIKnowledgeSync class in aiknowledge_sync_utils.py for synchronizing CAIKnowledge entities with XAI Collections, including collection lifecycle management, document synchronization, and metadata updates.
- Created a daily cron job (aiknowledge_full_sync_cron_step.py) to perform a full sync of CAIKnowledge entities.
- Developed an event handler (aiknowledge_sync_event_step.py) to synchronize CAIKnowledge entities with XAI Collections triggered by webhooks and cron jobs.
- Implemented a webhook handler (aiknowledge_update_api_step.py) to receive updates from EspoCRM for CAIKnowledge entities and enqueue sync events.
- Enhanced xai_service.py with methods for collection management, document listing, and metadata updates.
2026-03-11 21:14:52 +00:00
bsiggel
a5a122b688 refactor(logging): enhance error handling and resource management in rate limiting and sync operations 2026-03-08 22:47:05 +00:00
bsiggel
6c3cf3ca91 refactor(logging): remove unused logger instances and enhance error logging in webhook steps 2026-03-08 22:21:08 +00:00
bsiggel
1c765d1eec refactor(logging): standardize status code handling and enhance logging in webhook and cron handlers 2026-03-08 22:09:22 +00:00
bsiggel
a0cf845877 Refactor and enhance logging in webhook handlers and Redis client
- Translated comments and docstrings from German to English for better clarity.
- Improved logging consistency across various webhook handlers for create, delete, and update operations.
- Centralized logging functionality by utilizing a dedicated logger utility.
- Added new enums for file and XAI sync statuses in models.
- Updated Redis client factory to use a centralized logger and improved error handling.
- Enhanced API responses to include more descriptive messages and status codes.
2026-03-08 21:50:34 +00:00
bsiggel
f392ec0f06 refactor(typing): update handler signatures to use Dict and Any for improved type hinting 2026-03-08 21:24:12 +00:00
bsiggel
2532bd89ee refactor(logging): standardize logging approach across services and steps 2026-03-08 21:20:49 +00:00
bsiggel
2e449d2928 docs: enhance error handling and locking strategies in document synchronization 2026-03-08 20:58:58 +00:00
bsiggel
fd0196ec31 docs: add guidelines for Steps vs. Utils architecture and decision matrix 2026-03-08 20:30:33 +00:00
bsiggel
d71b5665b6 docs: update Design Principles section with enhanced lock strategy and event handling guidelines 2026-03-08 20:28:57 +00:00
bsiggel
d69801ed97 docs: expand Design Principles section with detailed guidelines and examples for event handling and locking strategies 2026-03-08 20:18:18 +00:00
bsiggel
6e2303c5eb feat(docs): add Design Principles section with Event Storm and Bidirectional Reference patterns 2026-03-08 20:09:32 +00:00
55 changed files with 5399 additions and 502 deletions

599
docs/AI_KNOWLEDGE_SYNC.md Normal file
View File

@@ -0,0 +1,599 @@
# AI Knowledge Collection Sync - Dokumentation
**Version**: 1.0
**Datum**: 11. März 2026
**Status**: ✅ Implementiert
---
## Überblick
Synchronisiert EspoCRM `CAIKnowledge` Entities mit XAI Collections für semantische Dokumentensuche. Unterstützt vollständigen Collection-Lifecycle, BLAKE3-basierte Integritätsprüfung und robustes Hash-basiertes Change Detection.
## Features
**Collection Lifecycle Management**
- NEW → Collection erstellen in XAI
- ACTIVE → Automatischer Sync der Dokumente
- PAUSED → Sync pausiert, Collection bleibt
- DEACTIVATED → Collection aus XAI löschen
**Dual-Hash Change Detection**
- EspoCRM Hash (MD5/SHA256) für lokale Änderungserkennung
- XAI BLAKE3 Hash für Remote-Integritätsverifikation
- Metadata-Hash für Beschreibungs-Änderungen
**Robustheit**
- BLAKE3 Verification nach jedem Upload
- Metadata-Only Updates via PATCH
- Orphan Detection & Cleanup
- Distributed Locking (Redis)
- Daily Full Sync (02:00 Uhr nachts)
**Fehlerbehandlung**
- Unsupported MIME Types → Status "unsupported"
- Transient Errors → Retry mit Exponential Backoff
- Partial Failures toleriert
---
## Architektur
```
┌─────────────────────────────────────────────────────────────────┐
│ EspoCRM CAIKnowledge │
│ ├─ activationStatus: new/active/paused/deactivated │
│ ├─ syncStatus: unclean/pending_sync/synced/failed │
│ └─ datenbankId: XAI Collection ID │
└─────────────────────────────────────────────────────────────────┘
↓ Webhook
┌─────────────────────────────────────────────────────────────────┐
│ Motia Webhook Handler │
│ → POST /vmh/webhook/aiknowledge/update │
└─────────────────────────────────────────────────────────────────┘
↓ Emit Event
┌─────────────────────────────────────────────────────────────────┐
│ Queue: aiknowledge.sync │
└─────────────────────────────────────────────────────────────────┘
↓ Lock: aiknowledge:{id}
┌─────────────────────────────────────────────────────────────────┐
│ Sync Handler │
│ ├─ Check activationStatus │
│ ├─ Manage Collection Lifecycle │
│ ├─ Sync Documents (with BLAKE3 verification) │
│ └─ Update Statuses │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ XAI Collections API │
│ └─ Collections with embedded documents │
└─────────────────────────────────────────────────────────────────┘
```
---
## EspoCRM Konfiguration
### 1. Entity: CAIKnowledge
**Felder:**
| Feld | Typ | Beschreibung | Werte |
|------|-----|--------------|-------|
| `name` | varchar(255) | Name der Knowledge Base | - |
| `datenbankId` | varchar(255) | XAI Collection ID | Automatisch gefüllt |
| `activationStatus` | enum | Lifecycle-Status | new, active, paused, deactivated |
| `syncStatus` | enum | Sync-Status | unclean, pending_sync, synced, failed |
| `lastSync` | datetime | Letzter erfolgreicher Sync | ISO 8601 |
| `syncError` | text | Fehlermeldung bei Failure | Max 2000 Zeichen |
**Enum-Definitionen:**
```json
{
"activationStatus": {
"type": "enum",
"options": ["new", "active", "paused", "deactivated"],
"default": "new"
},
"syncStatus": {
"type": "enum",
"options": ["unclean", "pending_sync", "synced", "failed"],
"default": "unclean"
}
}
```
### 2. Junction: CAIKnowledgeCDokumente
**additionalColumns:**
| Feld | Typ | Beschreibung |
|------|-----|--------------|
| `aiDocumentId` | varchar(255) | XAI file_id |
| `syncstatus` | enum | Per-Document Sync-Status |
| `syncedHash` | varchar(64) | MD5/SHA256 von EspoCRM |
| `xaiBlake3Hash` | varchar(128) | BLAKE3 Hash von XAI |
| `syncedMetadataHash` | varchar(64) | Hash der Metadaten |
| `lastSync` | datetime | Letzter Sync dieses Dokuments |
**Enum-Definition:**
```json
{
"syncstatus": {
"type": "enum",
"options": ["new", "unclean", "synced", "failed", "unsupported"]
}
}
```
### 3. Webhooks
**Webhook 1: CREATE**
```json
{
"event": "CAIKnowledge.afterSave",
"url": "https://your-motia-domain.com/vmh/webhook/aiknowledge/update",
"method": "POST",
"payload": "{\"entity_id\": \"{$id}\", \"entity_type\": \"CAIKnowledge\", \"action\": \"create\"}",
"condition": "entity.isNew()"
}
```
**Webhook 2: UPDATE**
```json
{
"event": "CAIKnowledge.afterSave",
"url": "https://your-motia-domain.com/vmh/webhook/aiknowledge/update",
"method": "POST",
"payload": "{\"entity_id\": \"{$id}\", \"entity_type\": \"CAIKnowledge\", \"action\": \"update\"}",
"condition": "!entity.isNew()"
}
```
**Webhook 3: DELETE (Optional)**
```json
{
"event": "CAIKnowledge.afterRemove",
"url": "https://your-motia-domain.com/vmh/webhook/aiknowledge/delete",
"method": "POST",
"payload": "{\"entity_id\": \"{$id}\", \"entity_type\": \"CAIKnowledge\", \"action\": \"delete\"}"
}
```
**Empfehlung**: Nur CREATE + UPDATE verwenden. DELETE über `activationStatus="deactivated"` steuern.
### 4. Hooks (EspoCRM Backend)
**Hook 1: Document Link → syncStatus auf "unclean"**
```php
// Hooks/Custom/CAIKnowledge/AfterRelateLinkMultiple.php
namespace Espo\Custom\Hooks\CAIKnowledge;
class AfterRelateLinkMultiple extends \Espo\Core\Hooks\Base
{
public function afterRelateLinkMultiple($entity, $options, $data)
{
if ($data['link'] === 'dokumentes') {
// Mark as unclean when documents linked
$entity->set('syncStatus', 'unclean');
$this->getEntityManager()->saveEntity($entity);
}
}
}
```
**Hook 2: Document Change → Junction auf "unclean"**
```php
// Hooks/Custom/CDokumente/AfterSave.php
namespace Espo\Custom\Hooks\CDokumente;
class AfterSave extends \Espo\Core\Hooks\Base
{
public function afterSave($entity, $options)
{
if ($entity->isAttributeChanged('description') ||
$entity->isAttributeChanged('md5') ||
$entity->isAttributeChanged('sha256')) {
// Mark all junction entries as unclean
$this->updateJunctionStatuses($entity->id, 'unclean');
// Mark all related CAIKnowledge as unclean
$this->markRelatedKnowledgeUnclean($entity->id);
}
}
}
```
---
## Environment Variables
```bash
# XAI API Keys (erforderlich)
XAI_API_KEY=your_xai_api_key_here
XAI_MANAGEMENT_KEY=your_xai_management_key_here
# Redis (für Locking)
REDIS_HOST=localhost
REDIS_PORT=6379
# EspoCRM
ESPOCRM_API_BASE_URL=https://crm.bitbylaw.com/api/v1
ESPOCRM_API_KEY=your_espocrm_api_key
```
---
## Workflows
### Workflow 1: Neue Knowledge Base erstellen
```
1. User erstellt CAIKnowledge in EspoCRM
└─ activationStatus: "new" (default)
2. Webhook CREATE gefeuert
└─ Event: aiknowledge.sync
3. Sync Handler:
└─ activationStatus="new" → Collection erstellen in XAI
└─ Update EspoCRM:
├─ datenbankId = collection_id
├─ activationStatus = "active"
└─ syncStatus = "unclean"
4. Nächster Webhook (UPDATE):
└─ activationStatus="active" → Dokumente syncen
```
### Workflow 2: Dokumente hinzufügen
```
1. User verknüpft Dokumente mit CAIKnowledge
└─ EspoCRM Hook setzt syncStatus = "unclean"
2. Webhook UPDATE gefeuert
└─ Event: aiknowledge.sync
3. Sync Handler:
└─ Für jedes Junction-Entry:
├─ Check: MIME Type supported?
├─ Check: Hash changed?
├─ Download von EspoCRM
├─ Upload zu XAI mit Metadata
├─ Verify Upload (BLAKE3)
└─ Update Junction: syncstatus="synced"
4. Update CAIKnowledge:
└─ syncStatus = "synced"
└─ lastSync = now()
```
### Workflow 3: Metadata-Änderung
```
1. User ändert Document.description in EspoCRM
└─ EspoCRM Hook setzt Junction syncstatus = "unclean"
└─ EspoCRM Hook setzt CAIKnowledge syncStatus = "unclean"
2. Webhook UPDATE gefeuert
3. Sync Handler:
└─ Berechne Metadata-Hash
└─ Hash unterschiedlich? → PATCH zu XAI
└─ Falls PATCH fehlschlägt → Fallback: Re-upload
└─ Update Junction: syncedMetadataHash
```
### Workflow 4: Knowledge Base deaktivieren
```
1. User setzt activationStatus = "deactivated"
2. Webhook UPDATE gefeuert
3. Sync Handler:
└─ Collection aus XAI löschen
└─ Alle Junction Entries zurücksetzen:
├─ syncstatus = "new"
└─ aiDocumentId = NULL
└─ CAIKnowledge bleibt in EspoCRM (mit datenbankId)
```
### Workflow 5: Daily Full Sync
```
Cron: Täglich um 02:00 Uhr
1. Lade alle CAIKnowledge mit:
└─ activationStatus = "active"
└─ syncStatus IN ("unclean", "failed")
2. Für jedes:
└─ Emit: aiknowledge.sync Event
3. Queue verarbeitet alle sequenziell
└─ Fängt verpasste Webhooks ab
```
---
## Monitoring & Troubleshooting
### Logs prüfen
```bash
# Motia Service Logs
sudo journalctl -u motia-iii -f | grep -i "ai knowledge"
# Letzte 100 Sync-Events
sudo journalctl -u motia-iii -n 100 | grep "AI KNOWLEDGE SYNC"
# Fehler der letzten 24 Stunden
sudo journalctl -u motia-iii --since "24 hours ago" | grep "❌"
```
### EspoCRM Status prüfen
```sql
-- Alle Knowledge Bases mit Status
SELECT
id,
name,
activation_status,
sync_status,
last_sync,
sync_error
FROM c_ai_knowledge
WHERE activation_status = 'active';
-- Junction Entries mit Sync-Problemen
SELECT
j.id,
k.name AS knowledge_name,
d.name AS document_name,
j.syncstatus,
j.last_sync
FROM c_ai_knowledge_c_dokumente j
JOIN c_ai_knowledge k ON j.c_ai_knowledge_id = k.id
JOIN c_dokumente d ON j.c_dokumente_id = d.id
WHERE j.syncstatus IN ('failed', 'unsupported');
```
### Häufige Probleme
#### Problem: "Lock busy for aiknowledge:xyz"
**Ursache**: Vorheriger Sync noch aktiv oder abgestürzt
**Lösung**:
```bash
# Redis lock manuell freigeben
redis-cli
> DEL sync_lock:aiknowledge:xyz
```
#### Problem: "Unsupported MIME type"
**Ursache**: Document hat MIME Type, den XAI nicht unterstützt
**Lösung**:
- Dokument konvertieren (z.B. RTF → PDF)
- Oder: Akzeptieren (bleibt mit Status "unsupported")
#### Problem: "Upload verification failed"
**Ursache**: XAI liefert kein BLAKE3 Hash oder Hash-Mismatch
**Lösung**:
1. Prüfe XAI API Dokumentation (Hash-Format geändert?)
2. Falls temporär: Retry läuft automatisch
3. Falls persistent: XAI Support kontaktieren
#### Problem: "Collection not found"
**Ursache**: Collection wurde manuell in XAI gelöscht
**Lösung**: Automatisch gelöst - Sync erstellt neue Collection
---
## API Endpoints
### Webhook Endpoint
```http
POST /vmh/webhook/aiknowledge/update
Content-Type: application/json
{
"entity_id": "kb-123",
"entity_type": "CAIKnowledge",
"action": "update"
}
```
**Response:**
```json
{
"success": true,
"knowledge_id": "kb-123"
}
```
---
## Performance
### Typische Sync-Zeiten
| Szenario | Zeit | Notizen |
|----------|------|---------|
| Collection erstellen | < 1s | Nur API Call |
| 1 Dokument (1 MB) | 2-4s | Upload + Verify |
| 10 Dokumente (10 MB) | 20-40s | Sequenziell |
| 100 Dokumente (100 MB) | 3-6 min | Lock TTL: 30 min |
| Metadata-only Update | < 1s | Nur PATCH |
| Orphan Cleanup | 1-3s | Pro 10 Dokumente |
### Lock TTLs
- **AIKnowledge Sync**: 30 Minuten (1800 Sekunden)
- **Redis Lock**: Same as above
- **Auto-Release**: Bei Timeout (TTL expired)
### Rate Limits
**XAI API:**
- Files Upload: ~100 requests/minute
- Management API: ~1000 requests/minute
**Strategie bei Rate Limit (429)**:
- Exponential Backoff: 2s, 4s, 8s, 16s, 32s
- Respect `Retry-After` Header
- Max 5 Retries
---
## XAI Collections Metadata
### Document Metadata Fields
Werden für jedes Dokument in XAI gespeichert:
```json
{
"fields": {
"document_name": "Vertrag.pdf",
"description": "Mietvertrag Mustermann",
"created_at": "2024-01-01T00:00:00Z",
"modified_at": "2026-03-10T15:30:00Z",
"espocrm_id": "dok-123"
}
}
```
**inject_into_chunk**: `true` für `document_name` und `description`
→ Verbessert semantische Suche
### Collection Metadata
```json
{
"metadata": {
"espocrm_entity_type": "CAIKnowledge",
"espocrm_entity_id": "kb-123",
"created_at": "2026-03-11T10:00:00Z"
}
}
```
---
## Testing
### Manueller Test
```bash
# 1. Erstelle CAIKnowledge in EspoCRM
# 2. Prüfe Logs
sudo journalctl -u motia-iii -f
# 3. Prüfe Redis Lock
redis-cli
> KEYS sync_lock:aiknowledge:*
# 4. Prüfe XAI Collection
curl -H "Authorization: Bearer $XAI_MANAGEMENT_KEY" \
https://management-api.x.ai/v1/collections
```
### Integration Test
```python
# tests/test_aiknowledge_sync.py
async def test_full_sync_workflow():
"""Test complete sync workflow"""
# 1. Create CAIKnowledge with status "new"
knowledge = await espocrm.create_entity('CAIKnowledge', {
'name': 'Test KB',
'activationStatus': 'new'
})
# 2. Trigger webhook
await trigger_webhook(knowledge['id'])
# 3. Wait for sync
await asyncio.sleep(5)
# 4. Check collection created
knowledge = await espocrm.get_entity('CAIKnowledge', knowledge['id'])
assert knowledge['datenbankId'] is not None
assert knowledge['activationStatus'] == 'active'
# 5. Link document
await espocrm.link_entities('CAIKnowledge', knowledge['id'], 'CDokumente', doc_id)
# 6. Trigger webhook again
await trigger_webhook(knowledge['id'])
await asyncio.sleep(10)
# 7. Check junction synced
junction = await espocrm.get_junction_entries(
'CAIKnowledgeCDokumente',
'cAIKnowledgeId',
knowledge['id']
)
assert junction[0]['syncstatus'] == 'synced'
assert junction[0]['xaiBlake3Hash'] is not None
```
---
## Maintenance
### Wöchentliche Checks
- [ ] Prüfe failed Syncs in EspoCRM
- [ ] Prüfe Redis Memory Usage
- [ ] Prüfe XAI Storage Usage
- [ ] Review Logs für Patterns
### Monatliche Tasks
- [ ] Cleanup alte syncError Messages
- [ ] Verify XAI Collection Integrity
- [ ] Review Performance Metrics
- [ ] Update MIME Type Support List
---
## Support
**Bei Problemen:**
1. **Logs prüfen**: `journalctl -u motia-iii -f`
2. **EspoCRM Status prüfen**: SQL Queries (siehe oben)
3. **Redis Locks prüfen**: `redis-cli KEYS sync_lock:*`
4. **XAI API Status**: https://status.x.ai
**Kontakt:**
- Team: BitByLaw Development
- Motia Docs: `/opt/motia-iii/bitbylaw/docs/INDEX.md`
---
**Version History:**
- **1.0** (11.03.2026) - Initial Release
- Collection Lifecycle Management
- BLAKE3 Hash Verification
- Daily Full Sync
- Metadata Change Detection

File diff suppressed because it is too large Load Diff

View File

@@ -78,6 +78,6 @@ modules:
- class: modules::shell::ExecModule
config:
watch:
- steps/**/*.py
- src/steps/**/*.py
exec:
- /opt/bin/uv run python -m motia.cli run --dir steps
- /usr/local/bin/uv run python -m motia.cli run --dir src/steps

View File

@@ -18,5 +18,8 @@ dependencies = [
"google-api-python-client>=2.100.0", # Google Calendar API
"google-auth>=2.23.0", # Google OAuth2
"backoff>=2.2.1", # Retry/backoff decorator
"langchain>=0.3.0", # LangChain framework
"langchain-xai>=0.2.0", # xAI integration for LangChain
"langchain-core>=0.3.0", # LangChain core
]

View File

@@ -7,9 +7,6 @@ Basierend auf ADRESSEN_SYNC_ANALYSE.md Abschnitt 12.
from typing import Dict, Any, Optional
from datetime import datetime
import logging
logger = logging.getLogger(__name__)
class AdressenMapper:

View File

@@ -26,8 +26,6 @@ from services.espocrm import EspoCRMAPI
from services.adressen_mapper import AdressenMapper
from services.notification_utils import NotificationManager
logger = logging.getLogger(__name__)
class AdressenSync:
"""Sync-Klasse für Adressen zwischen EspoCRM und Advoware"""

View File

@@ -8,7 +8,6 @@ import hashlib
import base64
import os
import datetime
import logging
from typing import Optional, Dict, Any
from services.exceptions import (
@@ -21,8 +20,6 @@ from services.redis_client import get_redis_client
from services.config import ADVOWARE_CONFIG, API_CONFIG
from services.logging_utils import get_service_logger
logger = logging.getLogger(__name__)
class AdvowareAPI:
"""
@@ -75,6 +72,11 @@ class AdvowareAPI:
self._session: Optional[aiohttp.ClientSession] = None
def _log(self, message: str, level: str = 'info') -> None:
"""Internal logging helper"""
log_func = getattr(self.logger, level, self.logger.info)
log_func(message)
async def _get_session(self) -> aiohttp.ClientSession:
if self._session is None or self._session.closed:
self._session = aiohttp.ClientSession()
@@ -93,7 +95,7 @@ class AdvowareAPI:
try:
api_key_bytes = base64.b64decode(self.api_key)
logger.debug("API Key decoded from base64")
self.logger.debug("API Key decoded from base64")
except Exception as e:
self._log(f"API Key not base64-encoded, using as-is: {e}", level='debug')
api_key_bytes = self.api_key.encode('utf-8') if isinstance(self.api_key, str) else self.api_key
@@ -101,8 +103,8 @@ class AdvowareAPI:
signature = hmac.new(api_key_bytes, message, hashlib.sha512)
return base64.b64encode(signature.digest()).decode('utf-8')
def _fetch_new_access_token(self) -> str:
"""Fetch new access token from Advoware Auth API"""
async def _fetch_new_access_token(self) -> str:
"""Fetch new access token from Advoware Auth API (async)"""
self.logger.info("Fetching new access token from Advoware")
nonce = str(uuid.uuid4())
@@ -125,40 +127,41 @@ class AdvowareAPI:
self.logger.debug(f"Token request: AppID={self.app_id}, User={self.user}")
# Using synchronous requests for token fetch (called from sync context)
# TODO: Convert to async in future version
import requests
# Async token fetch using aiohttp
session = await self._get_session()
try:
response = requests.post(
async with session.post(
ADVOWARE_CONFIG.auth_url,
json=data,
headers=headers,
timeout=self.api_timeout_seconds
)
self.logger.debug(f"Token response status: {response.status_code}")
if response.status_code == 401:
raise AdvowareAuthError(
"Authentication failed - check credentials",
status_code=401
)
response.raise_for_status()
except requests.Timeout:
timeout=aiohttp.ClientTimeout(total=self.api_timeout_seconds)
) as response:
self.logger.debug(f"Token response status: {response.status}")
if response.status == 401:
raise AdvowareAuthError(
"Authentication failed - check credentials",
status_code=401
)
if response.status >= 400:
error_text = await response.text()
raise AdvowareAPIError(
f"Token request failed ({response.status}): {error_text}",
status_code=response.status
)
result = await response.json()
except asyncio.TimeoutError:
raise AdvowareTimeoutError(
"Token request timed out",
status_code=408
)
except requests.RequestException as e:
raise AdvowareAPIError(
f"Token request failed: {str(e)}",
status_code=getattr(e.response, 'status_code', None) if hasattr(e, 'response') else None
)
except aiohttp.ClientError as e:
raise AdvowareAPIError(f"Token request failed: {str(e)}")
result = response.json()
access_token = result.get("access_token")
if not access_token:
@@ -176,7 +179,7 @@ class AdvowareAPI:
return access_token
def get_access_token(self, force_refresh: bool = False) -> str:
async def get_access_token(self, force_refresh: bool = False) -> str:
"""
Get valid access token (from cache or fetch new).
@@ -190,11 +193,11 @@ class AdvowareAPI:
if not self.redis_client:
self.logger.info("No Redis available, fetching new token")
return self._fetch_new_access_token()
return await self._fetch_new_access_token()
if force_refresh:
self.logger.info("Force refresh requested, fetching new token")
return self._fetch_new_access_token()
return await self._fetch_new_access_token()
# Check cache
cached_token = self.redis_client.get(ADVOWARE_CONFIG.token_cache_key)
@@ -213,7 +216,7 @@ class AdvowareAPI:
self.logger.debug(f"Error reading cached token: {e}")
self.logger.info("Cached token expired or invalid, fetching new")
return self._fetch_new_access_token()
return await self._fetch_new_access_token()
async def api_call(
self,
@@ -257,7 +260,7 @@ class AdvowareAPI:
# Get auth token
try:
token = self.get_access_token()
token = await self.get_access_token()
except AdvowareAuthError:
raise
except Exception as e:
@@ -285,7 +288,7 @@ class AdvowareAPI:
# Handle 401 - retry with fresh token
if response.status == 401:
self.logger.warning("401 Unauthorized, refreshing token")
token = self.get_access_token(force_refresh=True)
token = await self.get_access_token(force_refresh=True)
effective_headers['Authorization'] = f'Bearer {token}'
async with session.request(

View File

@@ -1,24 +1,29 @@
"""
Advoware Service Wrapper
Erweitert AdvowareAPI mit höheren Operations
Extends AdvowareAPI with higher-level operations for business logic.
"""
import logging
from typing import Dict, Any, Optional
from services.advoware import AdvowareAPI
logger = logging.getLogger(__name__)
from services.logging_utils import get_service_logger
class AdvowareService:
"""
Service-Layer für Advoware Operations
Verwendet AdvowareAPI für API-Calls
Service layer for Advoware operations.
Uses AdvowareAPI for API calls.
"""
def __init__(self, context=None):
self.api = AdvowareAPI(context)
self.context = context
self.logger = get_service_logger('advoware_service', context)
def _log(self, message: str, level: str = 'info') -> None:
"""Internal logging helper"""
log_func = getattr(self.logger, level, self.logger.info)
log_func(message)
async def api_call(self, *args, **kwargs):
"""Delegate api_call to underlying AdvowareAPI"""
@@ -26,29 +31,29 @@ class AdvowareService:
# ========== BETEILIGTE ==========
async def get_beteiligter(self, betnr: int) -> Optional[Dict]:
async def get_beteiligter(self, betnr: int) -> Optional[Dict[str, Any]]:
"""
Lädt Beteiligten mit allen Daten
Load Beteiligte with all data.
Returns:
Beteiligte-Objekt
Beteiligte object or None
"""
try:
endpoint = f"api/v1/advonet/Beteiligte/{betnr}"
result = await self.api.api_call(endpoint, method='GET')
return result
except Exception as e:
logger.error(f"[ADVO] Fehler beim Laden von Beteiligte {betnr}: {e}", exc_info=True)
self._log(f"[ADVO] Error loading Beteiligte {betnr}: {e}", level='error')
return None
# ========== KOMMUNIKATION ==========
async def create_kommunikation(self, betnr: int, data: Dict[str, Any]) -> Optional[Dict]:
async def create_kommunikation(self, betnr: int, data: Dict[str, Any]) -> Optional[Dict[str, Any]]:
"""
Erstellt neue Kommunikation
Create new Kommunikation.
Args:
betnr: Beteiligten-Nummer
betnr: Beteiligte number
data: {
'tlf': str, # Required
'bemerkung': str, # Optional
@@ -57,68 +62,68 @@ class AdvowareService:
}
Returns:
Neue Kommunikation mit 'id'
New Kommunikation with 'id' or None
"""
try:
endpoint = f"api/v1/advonet/Beteiligte/{betnr}/Kommunikationen"
result = await self.api.api_call(endpoint, method='POST', json_data=data)
if result:
logger.info(f"[ADVO] ✅ Created Kommunikation: betnr={betnr}, kommKz={data.get('kommKz')}")
self._log(f"[ADVO] ✅ Created Kommunikation: betnr={betnr}, kommKz={data.get('kommKz')}")
return result
except Exception as e:
logger.error(f"[ADVO] Fehler beim Erstellen von Kommunikation: {e}", exc_info=True)
self._log(f"[ADVO] Error creating Kommunikation: {e}", level='error')
return None
async def update_kommunikation(self, betnr: int, komm_id: int, data: Dict[str, Any]) -> bool:
"""
Aktualisiert bestehende Kommunikation
Update existing Kommunikation.
Args:
betnr: Beteiligten-Nummer
komm_id: Kommunikation-ID
betnr: Beteiligte number
komm_id: Kommunikation ID
data: {
'tlf': str, # Optional
'bemerkung': str, # Optional
'online': bool # Optional
}
NOTE: kommKz ist READ-ONLY und kann nicht geändert werden
NOTE: kommKz is READ-ONLY and cannot be changed
Returns:
True wenn erfolgreich
True if successful
"""
try:
endpoint = f"api/v1/advonet/Beteiligte/{betnr}/Kommunikationen/{komm_id}"
await self.api.api_call(endpoint, method='PUT', json_data=data)
logger.info(f"[ADVO] ✅ Updated Kommunikation: betnr={betnr}, komm_id={komm_id}")
self._log(f"[ADVO] ✅ Updated Kommunikation: betnr={betnr}, komm_id={komm_id}")
return True
except Exception as e:
logger.error(f"[ADVO] Fehler beim Update von Kommunikation: {e}", exc_info=True)
self._log(f"[ADVO] Error updating Kommunikation: {e}", level='error')
return False
async def delete_kommunikation(self, betnr: int, komm_id: int) -> bool:
"""
Löscht Kommunikation (aktuell 403 Forbidden)
Delete Kommunikation (currently returns 403 Forbidden).
NOTE: DELETE ist in Advoware API deaktiviert
Verwende stattdessen: Leere Slots mit empty_slot_marker
NOTE: DELETE is disabled in Advoware API.
Use empty slots with empty_slot_marker instead.
Returns:
True wenn erfolgreich
True if successful
"""
try:
endpoint = f"api/v1/advonet/Beteiligte/{betnr}/Kommunikationen/{komm_id}"
await self.api.api_call(endpoint, method='DELETE')
logger.info(f"[ADVO] ✅ Deleted Kommunikation: betnr={betnr}, komm_id={komm_id}")
self._log(f"[ADVO] ✅ Deleted Kommunikation: betnr={betnr}, komm_id={komm_id}")
return True
except Exception as e:
# Expected: 403 Forbidden
logger.warning(f"[ADVO] DELETE not allowed (expected): {e}")
self._log(f"[ADVO] DELETE not allowed (expected): {e}", level='warning')
return False

View File

@@ -0,0 +1,545 @@
"""
AI Knowledge Sync Utilities
Utility functions for synchronizing CAIKnowledge entities with XAI Collections:
- Collection lifecycle management (create, delete)
- Document synchronization with BLAKE3 hash verification
- Metadata-only updates via PATCH
- Orphan detection and cleanup
"""
import hashlib
import json
from typing import Dict, Any, Optional, List, Tuple
from datetime import datetime
from urllib.parse import unquote
from services.sync_utils_base import BaseSyncUtils
from services.models import (
AIKnowledgeActivationStatus,
AIKnowledgeSyncStatus,
JunctionSyncStatus
)
class AIKnowledgeSync(BaseSyncUtils):
"""Utility class for AI Knowledge ↔ XAI Collections synchronization"""
def _get_lock_key(self, entity_id: str) -> str:
"""Redis lock key for AI Knowledge entities"""
return f"sync_lock:aiknowledge:{entity_id}"
async def acquire_sync_lock(self, knowledge_id: str) -> bool:
"""
Acquire distributed lock via Redis + update EspoCRM syncStatus.
Args:
knowledge_id: CAIKnowledge entity ID
Returns:
True if lock acquired, False if already locked
"""
try:
# STEP 1: Atomic Redis lock
lock_key = self._get_lock_key(knowledge_id)
if not self._acquire_redis_lock(lock_key):
self._log(f"Redis lock already active for {knowledge_id}", level='warn')
return False
# STEP 2: Update syncStatus to pending_sync
try:
await self.espocrm.update_entity('CAIKnowledge', knowledge_id, {
'syncStatus': AIKnowledgeSyncStatus.PENDING_SYNC.value
})
except Exception as e:
self._log(f"Could not set syncStatus: {e}", level='debug')
self._log(f"Sync lock acquired for {knowledge_id}")
return True
except Exception as e:
self._log(f"Error acquiring lock: {e}", level='error')
# Clean up Redis lock on error
lock_key = self._get_lock_key(knowledge_id)
self._release_redis_lock(lock_key)
return False
async def release_sync_lock(
self,
knowledge_id: str,
success: bool = True,
error_message: Optional[str] = None
) -> None:
"""
Release sync lock and set final status.
Args:
knowledge_id: CAIKnowledge entity ID
success: Whether sync succeeded
error_message: Optional error message
"""
try:
update_data = {
'syncStatus': AIKnowledgeSyncStatus.SYNCED.value if success else AIKnowledgeSyncStatus.FAILED.value
}
if success:
update_data['lastSync'] = datetime.now().isoformat()
update_data['syncError'] = None
elif error_message:
update_data['syncError'] = error_message[:2000]
await self.espocrm.update_entity('CAIKnowledge', knowledge_id, update_data)
self._log(f"Sync lock released: {knowledge_id}{'success' if success else 'failed'}")
# Release Redis lock
lock_key = self._get_lock_key(knowledge_id)
self._release_redis_lock(lock_key)
except Exception as e:
self._log(f"Error releasing lock: {e}", level='error')
# Ensure Redis lock is released
lock_key = self._get_lock_key(knowledge_id)
self._release_redis_lock(lock_key)
async def sync_knowledge_to_xai(self, knowledge_id: str, ctx) -> None:
"""
Main sync orchestrator with activation status handling.
Args:
knowledge_id: CAIKnowledge entity ID
ctx: Motia context for logging
"""
from services.espocrm import EspoCRMAPI
from services.xai_service import XAIService
espocrm = EspoCRMAPI(ctx)
xai = XAIService(ctx)
try:
# 1. Load knowledge entity
knowledge = await espocrm.get_entity('CAIKnowledge', knowledge_id)
activation_status = knowledge.get('aktivierungsstatus')
collection_id = knowledge.get('datenbankId')
ctx.logger.info("=" * 80)
ctx.logger.info(f"📋 Processing: {knowledge['name']}")
ctx.logger.info(f" aktivierungsstatus: {activation_status}")
ctx.logger.info(f" datenbankId: {collection_id or 'NONE'}")
ctx.logger.info("=" * 80)
# ═══════════════════════════════════════════════════════════
# CASE 1: NEW → Create Collection
# ═══════════════════════════════════════════════════════════
if activation_status == AIKnowledgeActivationStatus.NEW.value:
ctx.logger.info("🆕 Status 'new' → Creating XAI Collection")
collection = await xai.create_collection(
name=knowledge['name'],
metadata={
'espocrm_entity_type': 'CAIKnowledge',
'espocrm_entity_id': knowledge_id,
'created_at': datetime.now().isoformat()
}
)
# XAI API returns 'collection_id' not 'id'
collection_id = collection.get('collection_id') or collection.get('id')
# Update EspoCRM: Set datenbankId + change status to 'active'
await espocrm.update_entity('CAIKnowledge', knowledge_id, {
'datenbankId': collection_id,
'aktivierungsstatus': AIKnowledgeActivationStatus.ACTIVE.value,
'syncStatus': AIKnowledgeSyncStatus.UNCLEAN.value
})
ctx.logger.info(f"✅ Collection created: {collection_id}")
ctx.logger.info(" Status changed to 'active', now syncing documents...")
# Continue to document sync immediately (don't return)
# Fall through to sync logic below
# ═══════════════════════════════════════════════════════════
# CASE 2: DEACTIVATED → Delete Collection from XAI
# ═══════════════════════════════════════════════════════════
elif activation_status == AIKnowledgeActivationStatus.DEACTIVATED.value:
ctx.logger.info("🗑️ Status 'deactivated' → Deleting XAI Collection")
if collection_id:
try:
await xai.delete_collection(collection_id)
ctx.logger.info(f"✅ Collection deleted from XAI: {collection_id}")
except Exception as e:
ctx.logger.error(f"❌ Failed to delete collection: {e}")
else:
ctx.logger.info("⏭️ No collection ID, nothing to delete")
# Reset junction entries
documents = await espocrm.get_knowledge_documents_with_junction(knowledge_id)
for doc in documents:
doc_id = doc['documentId']
try:
await espocrm.update_knowledge_document_junction(
knowledge_id,
doc_id,
{
'syncstatus': 'new',
'aiDocumentId': None
},
update_last_sync=False
)
except Exception as e:
ctx.logger.warn(f"⚠️ Failed to reset junction for {doc_id}: {e}")
ctx.logger.info(f"✅ Deactivation complete, {len(documents)} junction entries reset")
return
# ═══════════════════════════════════════════════════════════
# CASE 3: PAUSED → Skip Sync
# ═══════════════════════════════════════════════════════════
elif activation_status == AIKnowledgeActivationStatus.PAUSED.value:
ctx.logger.info("⏸️ Status 'paused' → No sync performed")
return
# ═══════════════════════════════════════════════════════════
# CASE 4: ACTIVE → Normal Sync (or just created from NEW)
# ═══════════════════════════════════════════════════════════
if activation_status in (AIKnowledgeActivationStatus.ACTIVE.value, AIKnowledgeActivationStatus.NEW.value):
if not collection_id:
ctx.logger.error("❌ Status 'active' but no datenbankId!")
raise RuntimeError("Active knowledge without collection ID")
if activation_status == AIKnowledgeActivationStatus.ACTIVE.value:
ctx.logger.info(f"🔄 Status 'active' → Syncing documents to {collection_id}")
# Verify collection exists
collection = await xai.get_collection(collection_id)
if not collection:
ctx.logger.warn(f"⚠️ Collection {collection_id} not found, recreating")
collection = await xai.create_collection(
name=knowledge['name'],
metadata={
'espocrm_entity_type': 'CAIKnowledge',
'espocrm_entity_id': knowledge_id
}
)
collection_id = collection['id']
await espocrm.update_entity('CAIKnowledge', knowledge_id, {
'datenbankId': collection_id
})
# Sync documents (both for ACTIVE status and after NEW → ACTIVE transition)
await self._sync_knowledge_documents(knowledge_id, collection_id, ctx)
elif activation_status not in (AIKnowledgeActivationStatus.DEACTIVATED.value, AIKnowledgeActivationStatus.PAUSED.value):
ctx.logger.error(f"❌ Unknown aktivierungsstatus: {activation_status}")
raise ValueError(f"Invalid aktivierungsstatus: {activation_status}")
finally:
await xai.close()
async def _sync_knowledge_documents(
self,
knowledge_id: str,
collection_id: str,
ctx
) -> None:
"""
Sync all documents of a knowledge base to XAI collection.
Uses efficient JunctionData endpoint to get all documents with junction data
and blake3 hashes in a single API call. Hash comparison is always performed.
Args:
knowledge_id: CAIKnowledge entity ID
collection_id: XAI Collection ID
ctx: Motia context
"""
from services.espocrm import EspoCRMAPI
from services.xai_service import XAIService
espocrm = EspoCRMAPI(ctx)
xai = XAIService(ctx)
# ═══════════════════════════════════════════════════════════════
# STEP 1: Load all documents with junction data (single API call)
# ═══════════════════════════════════════════════════════════════
ctx.logger.info(f"📥 Loading documents with junction data for knowledge {knowledge_id}")
documents = await espocrm.get_knowledge_documents_with_junction(knowledge_id)
ctx.logger.info(f"📊 Found {len(documents)} document(s)")
if not documents:
ctx.logger.info("✅ No documents to sync")
return
# ═══════════════════════════════════════════════════════════════
# STEP 2: Sync each document based on status/hash
# ═══════════════════════════════════════════════════════════════
successful = 0
failed = 0
skipped = 0
# Track aiDocumentIds for orphan detection (collected during sync)
synced_file_ids: set = set()
for doc in documents:
doc_id = doc['documentId']
doc_name = doc.get('documentName', 'Unknown')
junction_status = doc.get('syncstatus', 'new')
ai_document_id = doc.get('aiDocumentId')
blake3_hash = doc.get('blake3hash')
ctx.logger.info(f"\n📄 {doc_name} (ID: {doc_id})")
ctx.logger.info(f" Status: {junction_status}")
ctx.logger.info(f" aiDocumentId: {ai_document_id or 'N/A'}")
ctx.logger.info(f" blake3hash: {blake3_hash[:16] if blake3_hash else 'N/A'}...")
try:
# Decide if sync needed
needs_sync = False
reason = ""
if junction_status in ['new', 'unclean', 'failed']:
needs_sync = True
reason = f"status={junction_status}"
elif junction_status == 'synced':
# Synced status should have both blake3_hash and ai_document_id
if not blake3_hash:
needs_sync = True
reason = "inconsistency: synced but no blake3 hash"
ctx.logger.warn(f" ⚠️ Synced document missing blake3 hash!")
elif not ai_document_id:
needs_sync = True
reason = "inconsistency: synced but no aiDocumentId"
ctx.logger.warn(f" ⚠️ Synced document missing aiDocumentId!")
else:
# Verify Blake3 hash with XAI (always, since hash from JunctionData API is free)
try:
xai_doc_info = await xai.get_collection_document(collection_id, ai_document_id)
if xai_doc_info:
xai_blake3 = xai_doc_info.get('blake3_hash')
if xai_blake3 != blake3_hash:
needs_sync = True
reason = f"blake3 mismatch (XAI: {xai_blake3[:16] if xai_blake3 else 'N/A'}... vs EspoCRM: {blake3_hash[:16]}...)"
ctx.logger.info(f" 🔄 Blake3 mismatch detected!")
else:
ctx.logger.info(f" ✅ Blake3 hash matches")
else:
needs_sync = True
reason = "file not found in XAI collection"
ctx.logger.warn(f" ⚠️ Document marked synced but not in XAI!")
except Exception as e:
needs_sync = True
reason = f"verification failed: {e}"
ctx.logger.warn(f" ⚠️ Failed to verify Blake3, will re-sync: {e}")
if not needs_sync:
ctx.logger.info(f" ⏭️ Skipped (no sync needed)")
# Document is already synced, track its aiDocumentId
if ai_document_id:
synced_file_ids.add(ai_document_id)
skipped += 1
continue
ctx.logger.info(f" 🔄 Syncing: {reason}")
# Get complete document entity with attachment info
doc_entity = await espocrm.get_entity('CDokumente', doc_id)
attachment_id = doc_entity.get('dokumentId')
if not attachment_id:
ctx.logger.error(f" ❌ No attachment ID found for document {doc_id}")
failed += 1
continue
# Get attachment details for MIME type and original filename
try:
attachment = await espocrm.get_entity('Attachment', attachment_id)
mime_type = attachment.get('type', 'application/octet-stream')
file_size = attachment.get('size', 0)
original_filename = attachment.get('name', doc_name) # Original filename with extension
# URL-decode filename (fixes special chars like §, ä, ö, ü, etc.)
original_filename = unquote(original_filename)
except Exception as e:
ctx.logger.warn(f" ⚠️ Failed to get attachment details: {e}, using defaults")
mime_type = 'application/octet-stream'
file_size = 0
original_filename = unquote(doc_name) # Also decode fallback name
ctx.logger.info(f" 📎 Attachment: {attachment_id} ({mime_type}, {file_size} bytes)")
ctx.logger.info(f" 📄 Original filename: {original_filename}")
# Download document
file_content = await espocrm.download_attachment(attachment_id)
ctx.logger.info(f" 📥 Downloaded {len(file_content)} bytes")
# Upload to XAI with original filename (includes extension)
filename = original_filename
xai_file_id = await xai.upload_file(file_content, filename, mime_type)
ctx.logger.info(f" 📤 Uploaded to XAI: {xai_file_id}")
# Add to collection
await xai.add_to_collection(collection_id, xai_file_id)
ctx.logger.info(f" ✅ Added to collection {collection_id}")
# Update junction
await espocrm.update_knowledge_document_junction(
knowledge_id,
doc_id,
{
'aiDocumentId': xai_file_id,
'syncstatus': 'synced'
},
update_last_sync=True
)
ctx.logger.info(f" ✅ Junction updated")
# Track the new aiDocumentId for orphan detection
synced_file_ids.add(xai_file_id)
successful += 1
except Exception as e:
failed += 1
ctx.logger.error(f" ❌ Sync failed: {e}")
# Mark as failed in junction
try:
await espocrm.update_knowledge_document_junction(
knowledge_id,
doc_id,
{'syncstatus': 'failed'},
update_last_sync=False
)
except Exception as update_err:
ctx.logger.error(f" ❌ Failed to update junction status: {update_err}")
# ═══════════════════════════════════════════════════════════════
# STEP 3: Remove orphaned documents from XAI collection
# ═══════════════════════════════════════════════════════════════
try:
ctx.logger.info(f"\n🧹 Checking for orphaned documents in XAI collection...")
# Get all files in XAI collection (normalized structure)
xai_documents = await xai.list_collection_documents(collection_id)
xai_file_ids = {doc.get('file_id') for doc in xai_documents if doc.get('file_id')}
# Use synced_file_ids (collected during this sync) for orphan detection
# This includes both pre-existing synced docs and newly uploaded ones
ctx.logger.info(f" XAI has {len(xai_file_ids)} files, we have {len(synced_file_ids)} synced")
# Find orphans (in XAI but not in our current sync)
orphans = xai_file_ids - synced_file_ids
if orphans:
ctx.logger.info(f" Found {len(orphans)} orphaned file(s)")
for orphan_id in orphans:
try:
await xai.remove_from_collection(collection_id, orphan_id)
ctx.logger.info(f" 🗑️ Removed {orphan_id}")
except Exception as e:
ctx.logger.warn(f" ⚠️ Failed to remove {orphan_id}: {e}")
else:
ctx.logger.info(f" ✅ No orphans found")
except Exception as e:
ctx.logger.warn(f"⚠️ Failed to clean up orphans: {e}")
# ═══════════════════════════════════════════════════════════════
# STEP 4: Summary
# ═══════════════════════════════════════════════════════════════
ctx.logger.info("")
ctx.logger.info("=" * 80)
ctx.logger.info(f"📊 Sync Statistics:")
ctx.logger.info(f" ✅ Synced: {successful}")
ctx.logger.info(f" ⏭️ Skipped: {skipped}")
ctx.logger.info(f" ❌ Failed: {failed}")
ctx.logger.info(f" Mode: Blake3 hash verification enabled")
ctx.logger.info("=" * 80)
def _calculate_metadata_hash(self, document: Dict) -> str:
"""
Calculate hash of sync-relevant metadata.
Args:
document: CDokumente entity
Returns:
MD5 hash (32 chars)
"""
metadata = {
'name': document.get('name', ''),
'description': document.get('description', ''),
}
metadata_str = json.dumps(metadata, sort_keys=True)
return hashlib.md5(metadata_str.encode()).hexdigest()
def _build_xai_metadata(self, document: Dict) -> Dict[str, str]:
"""
Build XAI metadata from CDokumente entity.
Args:
document: CDokumente entity
Returns:
Metadata dict for XAI
"""
return {
'document_name': document.get('name', ''),
'description': document.get('description', ''),
'created_at': document.get('createdAt', ''),
'modified_at': document.get('modifiedAt', ''),
'espocrm_id': document.get('id', '')
}
async def _get_document_download_info(
self,
document: Dict,
ctx
) -> Optional[Dict[str, Any]]:
"""
Get download info for CDokumente entity.
Args:
document: CDokumente entity
ctx: Motia context
Returns:
Dict with attachment_id, filename, mime_type
"""
from services.espocrm import EspoCRMAPI
espocrm = EspoCRMAPI(ctx)
# Check for dokumentId (CDokumente custom field)
attachment_id = None
filename = None
if document.get('dokumentId'):
attachment_id = document.get('dokumentId')
filename = document.get('dokumentName')
elif document.get('fileId'):
attachment_id = document.get('fileId')
filename = document.get('fileName')
if not attachment_id:
ctx.logger.error(f"❌ No attachment ID for document {document['id']}")
return None
# Get attachment details
try:
attachment = await espocrm.get_entity('Attachment', attachment_id)
return {
'attachment_id': attachment_id,
'filename': filename or attachment.get('name', 'unknown'),
'mime_type': attachment.get('type', 'application/octet-stream')
}
except Exception as e:
ctx.logger.error(f"❌ Failed to get attachment {attachment_id}: {e}")
return None

View File

@@ -0,0 +1,110 @@
"""Aktenzeichen-Erkennung und Validation
Utility functions für das Erkennen, Validieren und Normalisieren von
Aktenzeichen im Format '1234/56' oder 'ABC/23'.
"""
import re
from typing import Optional
# Regex für Aktenzeichen: 1-4 Zeichen (alphanumerisch) + "/" + 2 Ziffern
AKTENZEICHEN_REGEX = re.compile(r'^([A-Za-z0-9]{1,4}/\d{2})\s*', re.IGNORECASE)
def extract_aktenzeichen(text: str) -> Optional[str]:
"""
Extrahiert Aktenzeichen vom Anfang des Textes.
Pattern: ^[A-Za-z0-9]{1,4}/\d{2}
Examples:
>>> extract_aktenzeichen("1234/56 Was ist der Stand?")
"1234/56"
>>> extract_aktenzeichen("ABC/23 Frage zum Vertrag")
"ABC/23"
>>> extract_aktenzeichen("Kein Aktenzeichen hier")
None
Args:
text: Eingabetext (z.B. erste Message)
Returns:
Aktenzeichen als String, oder None wenn nicht gefunden
"""
if not text or not isinstance(text, str):
return None
match = AKTENZEICHEN_REGEX.match(text.strip())
return match.group(1) if match else None
def remove_aktenzeichen(text: str) -> str:
"""
Entfernt Aktenzeichen vom Anfang des Textes.
Examples:
>>> remove_aktenzeichen("1234/56 Was ist der Stand?")
"Was ist der Stand?"
>>> remove_aktenzeichen("Kein Aktenzeichen")
"Kein Aktenzeichen"
Args:
text: Eingabetext mit Aktenzeichen
Returns:
Text ohne Aktenzeichen (whitespace getrimmt)
"""
if not text or not isinstance(text, str):
return text
return AKTENZEICHEN_REGEX.sub('', text, count=1).strip()
def validate_aktenzeichen(az: str) -> bool:
"""
Validiert Aktenzeichen-Format.
Pattern: ^[A-Za-z0-9]{1,4}/\d{2}$
Examples:
>>> validate_aktenzeichen("1234/56")
True
>>> validate_aktenzeichen("ABC/23")
True
>>> validate_aktenzeichen("12345/567") # Zu lang
False
>>> validate_aktenzeichen("1234-56") # Falsches Trennzeichen
False
Args:
az: Aktenzeichen zum Validieren
Returns:
True wenn valide, False sonst
"""
if not az or not isinstance(az, str):
return False
return bool(re.match(r'^[A-Za-z0-9]{1,4}/\d{2}$', az, re.IGNORECASE))
def normalize_aktenzeichen(az: str) -> str:
"""
Normalisiert Aktenzeichen (uppercase, trim whitespace).
Examples:
>>> normalize_aktenzeichen("abc/23")
"ABC/23"
>>> normalize_aktenzeichen(" 1234/56 ")
"1234/56"
Args:
az: Aktenzeichen zum Normalisieren
Returns:
Normalisiertes Aktenzeichen (uppercase, getrimmt)
"""
if not az or not isinstance(az, str):
return az
return az.strip().upper()

View File

@@ -6,9 +6,6 @@ Transformiert Bankverbindungen zwischen den beiden Systemen
from typing import Dict, Any, Optional, List
from datetime import datetime
import logging
logger = logging.getLogger(__name__)
class BankverbindungenMapper:

View File

@@ -17,7 +17,7 @@ import pytz
from services.exceptions import LockAcquisitionError, SyncError, ValidationError
from services.redis_client import get_redis_client
from services.config import SYNC_CONFIG, get_lock_key, get_retry_delay_seconds
from services.logging_utils import get_logger
from services.logging_utils import get_service_logger
import redis
@@ -31,7 +31,7 @@ class BeteiligteSync:
def __init__(self, espocrm_api, redis_client: Optional[redis.Redis] = None, context=None):
self.espocrm = espocrm_api
self.context = context
self.logger = get_logger('beteiligte_sync', context)
self.logger = get_service_logger('beteiligte_sync', context)
# Use provided Redis client or get from factory
self.redis = redis_client or get_redis_client(strict=False)
@@ -46,6 +46,11 @@ class BeteiligteSync:
from services.notification_utils import NotificationManager
self.notification_manager = NotificationManager(espocrm_api=self.espocrm, context=context)
def _log(self, message: str, level: str = 'info') -> None:
"""Delegate logging to the logger with optional level"""
log_func = getattr(self.logger, level, self.logger.info)
log_func(message)
async def acquire_sync_lock(self, entity_id: str) -> bool:
"""
Atomic distributed lock via Redis + syncStatus update
@@ -87,7 +92,7 @@ class BeteiligteSync:
return True
except Exception as e:
self._log(f"Fehler beim Acquire Lock: {e}", level='error')
self.logger.error(f"Fehler beim Acquire Lock: {e}")
# Clean up Redis lock on error
if self.redis:
try:
@@ -202,16 +207,15 @@ class BeteiligteSync:
except:
pass
@staticmethod
def parse_timestamp(ts: Any) -> Optional[datetime]:
def parse_timestamp(self, ts: Any) -> Optional[datetime]:
"""
Parse verschiedene Timestamp-Formate zu datetime
Parse various timestamp formats to datetime.
Args:
ts: String, datetime oder None
ts: String, datetime or None
Returns:
datetime-Objekt oder None
datetime object or None
"""
if not ts:
return None
@@ -220,13 +224,13 @@ class BeteiligteSync:
return ts
if isinstance(ts, str):
# EspoCRM Format: "2026-02-07 14:30:00"
# Advoware Format: "2026-02-07T14:30:00" oder "2026-02-07T14:30:00Z"
# EspoCRM format: "2026-02-07 14:30:00"
# Advoware format: "2026-02-07T14:30:00" or "2026-02-07T14:30:00Z"
try:
# Entferne trailing Z falls vorhanden
# Remove trailing Z if present
ts = ts.rstrip('Z')
# Versuche verschiedene Formate
# Try various formats
for fmt in [
'%Y-%m-%d %H:%M:%S',
'%Y-%m-%dT%H:%M:%S',
@@ -237,11 +241,11 @@ class BeteiligteSync:
except ValueError:
continue
# Fallback: ISO-Format
# Fallback: ISO format
return datetime.fromisoformat(ts)
except Exception as e:
logger.warning(f"Konnte Timestamp nicht parsen: {ts} - {e}")
self._log(f"Could not parse timestamp: {ts} - {e}", level='warning')
return None
return None

View File

@@ -1,20 +1,19 @@
"""
Document Sync Utilities
Hilfsfunktionen für Document-Synchronisation mit xAI:
Utility functions for document synchronization with xAI:
- Distributed locking via Redis + syncStatus
- Entscheidungslogik: Wann muss ein Document zu xAI?
- Related Entities ermitteln (Many-to-Many Attachments)
- xAI Collection Management
- Decision logic: When does a document need xAI sync?
- Related entities determination (Many-to-Many attachments)
- xAI Collection management
"""
from typing import Dict, Any, Optional, List, Tuple
from datetime import datetime, timedelta
import logging
from urllib.parse import unquote
from services.sync_utils_base import BaseSyncUtils
logger = logging.getLogger(__name__)
from services.models import FileStatus, XAISyncStatus
# Max retry before permanent failure
MAX_SYNC_RETRIES = 5
@@ -22,12 +21,18 @@ MAX_SYNC_RETRIES = 5
# Retry backoff: Wartezeit zwischen Retries (in Minuten)
RETRY_BACKOFF_MINUTES = [1, 5, 15, 60, 240] # 1min, 5min, 15min, 1h, 4h
# Legacy file status values (for backward compatibility)
# These are old German and English status values that may still exist in the database
LEGACY_NEW_STATUS_VALUES = {'neu', 'Neu', 'New'}
LEGACY_CHANGED_STATUS_VALUES = {'geändert', 'Geändert', 'Changed'}
LEGACY_SYNCED_STATUS_VALUES = {'synced', 'Synced', 'synchronized', 'Synchronized'}
class DocumentSync(BaseSyncUtils):
"""Utility-Klasse für Document-Synchronisation mit xAI"""
"""Utility class for document synchronization with xAI"""
def _get_lock_key(self, entity_id: str) -> str:
"""Redis Lock-Key für Documents"""
"""Redis lock key for documents"""
return f"sync_lock:document:{entity_id}"
async def acquire_sync_lock(self, entity_id: str, entity_type: str = 'CDokumente') -> bool:
@@ -48,13 +53,13 @@ class DocumentSync(BaseSyncUtils):
self._log(f"Redis lock bereits aktiv für {entity_type} {entity_id}", level='warn')
return False
# STEP 2: Update xaiSyncStatus auf pending_sync
# STEP 2: Update xaiSyncStatus to pending_sync
try:
await self.espocrm.update_entity(entity_type, entity_id, {
'xaiSyncStatus': 'pending_sync'
'xaiSyncStatus': XAISyncStatus.PENDING_SYNC.value
})
except Exception as e:
self._log(f"Konnte xaiSyncStatus nicht setzen: {e}", level='debug')
self._log(f"Could not set xaiSyncStatus: {e}", level='debug')
self._log(f"Sync-Lock für {entity_type} {entity_id} erworben")
return True
@@ -87,16 +92,16 @@ class DocumentSync(BaseSyncUtils):
try:
update_data = {}
# xaiSyncStatus setzen: clean bei Erfolg, failed bei Fehler
# Set xaiSyncStatus: clean on success, failed on error
try:
update_data['xaiSyncStatus'] = 'clean' if success else 'failed'
update_data['xaiSyncStatus'] = XAISyncStatus.CLEAN.value if success else XAISyncStatus.FAILED.value
if error_message:
update_data['xaiSyncError'] = error_message[:2000]
else:
update_data['xaiSyncError'] = None
except:
pass # Felder existieren evtl. nicht
pass # Fields may not exist
# Merge extra fields (z.B. xaiFileId, xaiCollections)
if extra_fields:
@@ -123,37 +128,37 @@ class DocumentSync(BaseSyncUtils):
entity_type: str = 'CDokumente'
) -> Tuple[bool, List[str], str]:
"""
Entscheidet ob ein Document zu xAI synchronisiert werden muss
Decide if a document needs to be synchronized to xAI.
Prüft:
1. Datei-Status Feld ("Neu", "Geändert")
2. Hash-Werte für Change Detection
3. Related Entities mit xAI Collections
Checks:
1. File status field ("new", "changed")
2. Hash values for change detection
3. Related entities with xAI collections
Args:
document: Vollständiges Document Entity von EspoCRM
document: Complete document entity from EspoCRM
Returns:
Tuple[bool, List[str], str]:
- bool: Ob Sync nötig ist
- List[str]: Liste der Collection-IDs in die das Document soll
- str: Grund/Beschreibung der Entscheidung
- bool: Whether sync is needed
- List[str]: List of collection IDs where the document should go
- str: Reason/description of the decision
"""
doc_id = document.get('id')
doc_name = document.get('name', 'Unbenannt')
# xAI-relevante Felder
# xAI-relevant fields
xai_file_id = document.get('xaiFileId')
xai_collections = document.get('xaiCollections') or []
xai_sync_status = document.get('xaiSyncStatus')
# Datei-Status und Hash-Felder
# File status and hash fields
datei_status = document.get('dateiStatus') or document.get('fileStatus')
file_md5 = document.get('md5') or document.get('fileMd5')
file_sha = document.get('sha') or document.get('fileSha')
xai_synced_hash = document.get('xaiSyncedHash') # Hash beim letzten xAI-Sync
xai_synced_hash = document.get('xaiSyncedHash') # Hash at last xAI sync
self._log(f"📋 Document Analysis: {doc_name} (ID: {doc_id})")
self._log(f"📋 Document analysis: {doc_name} (ID: {doc_id})")
self._log(f" xaiFileId: {xai_file_id or 'N/A'}")
self._log(f" xaiCollections: {xai_collections}")
self._log(f" xaiSyncStatus: {xai_sync_status or 'N/A'}")
@@ -168,65 +173,69 @@ class DocumentSync(BaseSyncUtils):
entity_type=entity_type
)
# Prüfe xaiSyncStatus="no_sync" → kein Sync für dieses Dokument
if xai_sync_status == 'no_sync':
self._log("⏭️ Kein xAI-Sync nötig: xaiSyncStatus='no_sync'")
return (False, [], "xaiSyncStatus ist 'no_sync'")
# Check xaiSyncStatus="no_sync" -> no sync for this document
if xai_sync_status == XAISyncStatus.NO_SYNC.value:
self._log("⏭️ No xAI sync needed: xaiSyncStatus='no_sync'")
return (False, [], "xaiSyncStatus is 'no_sync'")
if not target_collections:
self._log("⏭️ Kein xAI-Sync nötig: Keine Related Entities mit xAI Collections")
return (False, [], "Keine verknüpften Entities mit xAI Collections")
self._log("⏭️ No xAI sync needed: No related entities with xAI collections")
return (False, [], "No linked entities with xAI collections")
# ═══════════════════════════════════════════════════════════════
# PRIORITY CHECK 1: xaiSyncStatus="unclean" → Dokument wurde geändert
# PRIORITY CHECK 1: xaiSyncStatus="unclean" -> document was changed
# ═══════════════════════════════════════════════════════════════
if xai_sync_status == 'unclean':
self._log(f"🆕 xaiSyncStatus='unclean' → xAI-Sync ERFORDERLICH")
if xai_sync_status == XAISyncStatus.UNCLEAN.value:
self._log(f"🆕 xaiSyncStatus='unclean' → xAI sync REQUIRED")
return (True, target_collections, "xaiSyncStatus='unclean'")
# ═══════════════════════════════════════════════════════════════
# PRIORITY CHECK 2: fileStatus "new" oder "changed"
# PRIORITY CHECK 2: fileStatus "new" or "changed"
# ═══════════════════════════════════════════════════════════════
if datei_status in ['new', 'changed', 'neu', 'geändert', 'New', 'Changed', 'Neu', 'Geändert']:
self._log(f"🆕 fileStatus: '{datei_status}' → xAI-Sync ERFORDERLICH")
# Check for standard enum values and legacy values
is_new = (datei_status == FileStatus.NEW.value or datei_status in LEGACY_NEW_STATUS_VALUES)
is_changed = (datei_status == FileStatus.CHANGED.value or datei_status in LEGACY_CHANGED_STATUS_VALUES)
if is_new or is_changed:
self._log(f"🆕 fileStatus: '{datei_status}' → xAI sync REQUIRED")
if target_collections:
return (True, target_collections, f"fileStatus: {datei_status}")
else:
# Datei ist neu/geändert aber keine Collections gefunden
self._log(f"⚠️ fileStatus '{datei_status}' aber keine Collections gefunden - überspringe Sync")
return (False, [], f"fileStatus: {datei_status}, aber keine Collections")
# File is new/changed but no collections found
self._log(f"⚠️ fileStatus '{datei_status}' but no collections found - skipping sync")
return (False, [], f"fileStatus: {datei_status}, but no collections")
# ═══════════════════════════════════════════════════════════════
# FALL 1: Document ist bereits in xAI UND Collections sind gesetzt
# CASE 1: Document is already in xAI AND collections are set
# ═══════════════════════════════════════════════════════════════
if xai_file_id:
self._log(f"✅ Document bereits in xAI gesynct mit {len(target_collections)} Collection(s)")
self._log(f"✅ Document already synced to xAI with {len(target_collections)} collection(s)")
# Prüfe ob File-Inhalt geändert wurde (Hash-Vergleich)
# Check if file content was changed (hash comparison)
current_hash = file_md5 or file_sha
if current_hash and xai_synced_hash:
if current_hash != xai_synced_hash:
self._log(f"🔄 Hash-Änderung erkannt! RESYNC erforderlich")
self._log(f" Alt: {xai_synced_hash[:16]}...")
self._log(f" Neu: {current_hash[:16]}...")
return (True, target_collections, "File-Inhalt geändert (Hash-Mismatch)")
self._log(f"🔄 Hash change detected! RESYNC required")
self._log(f" Old: {xai_synced_hash[:16]}...")
self._log(f" New: {current_hash[:16]}...")
return (True, target_collections, "File content changed (hash mismatch)")
else:
self._log(f"✅ Hash identisch - keine Änderung")
self._log(f"✅ Hash identical - no change")
else:
self._log(f"⚠️ Keine Hash-Werte verfügbar für Vergleich")
self._log(f"⚠️ No hash values available for comparison")
return (False, target_collections, "Bereits gesynct, keine Änderung erkannt")
return (False, target_collections, "Already synced, no change detected")
# ═══════════════════════════════════════════════════════════════
# FALL 2: Document hat xaiFileId aber Collections ist leer/None
# CASE 2: Document has xaiFileId but collections is empty/None
# ═══════════════════════════════════════════════════════════════
# ═══════════════════════════════════════════════════════════════
# FALL 3: Collections vorhanden aber kein Status/Hash-Trigger
# CASE 3: Collections present but no status/hash trigger
# ═══════════════════════════════════════════════════════════════
self._log(f"✅ Document ist mit {len(target_collections)} Entity/ies verknüpft die Collections haben")
return (True, target_collections, "Verknüpft mit Entities die Collections benötigen")
self._log(f"✅ Document is linked to {len(target_collections)} entity/ies with collections")
return (True, target_collections, "Linked to entities that require collections")
async def _get_required_collections_from_relations(
self,
@@ -234,78 +243,67 @@ class DocumentSync(BaseSyncUtils):
entity_type: str = 'Document'
) -> List[str]:
"""
Ermittelt alle xAI Collection-IDs von Entities die mit diesem Document verknüpft sind
Determine all xAI collection IDs of CAIKnowledge entities linked to this document.
EspoCRM Many-to-Many: Document kann mit beliebigen Entities verknüpft sein
(CBeteiligte, Account, CVmhErstgespraech, etc.)
Checks CAIKnowledgeCDokumente junction table:
- Status 'active' + datenbankId: Returns collection ID
- Status 'new': Returns "NEW:{knowledge_id}" marker (collection must be created first)
- Other statuses (paused, deactivated): Skips
Args:
document_id: Document ID
entity_type: Entity type (e.g., 'CDokumente')
Returns:
Liste von xAI Collection-IDs (dedupliziert)
List of collection IDs or markers:
- Normal IDs: "abc123..." (existing collections)
- New markers: "NEW:kb-id..." (collection needs to be created via knowledge sync)
"""
collections = set()
self._log(f"🔍 Prüfe Relations von {entity_type} {document_id}...")
self._log(f"🔍 Checking relations of {entity_type} {document_id}...")
# ═══════════════════════════════════════════════════════════════
# SPECIAL HANDLING: CAIKnowledge via Junction Table
# ═══════════════════════════════════════════════════════════════
try:
entity_def = await self.espocrm.get_entity_def(entity_type)
links = entity_def.get('links', {}) if isinstance(entity_def, dict) else {}
except Exception as e:
self._log(f"⚠️ Konnte Metadata fuer {entity_type} nicht laden: {e}", level='warn')
links = {}
link_types = {'hasMany', 'hasChildren', 'manyMany', 'hasManyThrough'}
for link_name, link_def in links.items():
try:
if not isinstance(link_def, dict):
continue
if link_def.get('type') not in link_types:
continue
related_entity = link_def.get('entity')
if not related_entity:
continue
related_def = await self.espocrm.get_entity_def(related_entity)
related_fields = related_def.get('fields', {}) if isinstance(related_def, dict) else {}
select_fields = ['id']
if 'xaiCollectionId' in related_fields:
select_fields.append('xaiCollectionId')
offset = 0
page_size = 100
while True:
result = await self.espocrm.list_related(
entity_type,
document_id,
link_name,
select=','.join(select_fields),
offset=offset,
max_size=page_size
)
entities = result.get('list', [])
if not entities:
break
for entity in entities:
collection_id = entity.get('xaiCollectionId')
if collection_id:
junction_entries = await self.espocrm.get_junction_entries(
'CAIKnowledgeCDokumente',
'cDokumenteId',
document_id
)
if junction_entries:
self._log(f" 📋 Found {len(junction_entries)} CAIKnowledge link(s)")
for junction in junction_entries:
knowledge_id = junction.get('cAIKnowledgeId')
if not knowledge_id:
continue
try:
knowledge = await self.espocrm.get_entity('CAIKnowledge', knowledge_id)
activation_status = knowledge.get('aktivierungsstatus')
collection_id = knowledge.get('datenbankId')
if activation_status == 'active' and collection_id:
# Existing collection - use it
collections.add(collection_id)
if len(entities) < page_size:
break
offset += page_size
except Exception as e:
self._log(f" ⚠️ Fehler beim Prüfen von Link {link_name}: {e}", level='warn')
continue
self._log(f" ✅ CAIKnowledge {knowledge_id}: {collection_id} (active)")
elif activation_status == 'new':
# Collection doesn't exist yet - return special marker
# Format: "NEW:{knowledge_id}" signals to caller: trigger knowledge sync first
collections.add(f"NEW:{knowledge_id}")
self._log(f" 🆕 CAIKnowledge {knowledge_id}: status='new' → collection must be created first")
else:
self._log(f" ⏭️ CAIKnowledge {knowledge_id}: status={activation_status}, datenbankId={collection_id or 'N/A'}")
except Exception as e:
self._log(f" ⚠️ Failed to load CAIKnowledge {knowledge_id}: {e}", level='warn')
except Exception as e:
self._log(f" ⚠️ Failed to check CAIKnowledge junction: {e}", level='warn')
result = list(collections)
self._log(f"📊 Gesamt: {len(result)} eindeutige Collection(s) gefunden")
@@ -368,6 +366,10 @@ class DocumentSync(BaseSyncUtils):
# Filename: Nutze dokumentName/fileName falls vorhanden, sonst aus Attachment
final_filename = filename or attachment.get('name', 'unknown')
# URL-decode filename (fixes special chars like §, ä, ö, ü, etc.)
# EspoCRM stores filenames URL-encoded: %C2%A7 → §
final_filename = unquote(final_filename)
return {
'attachment_id': attachment_id,
'download_url': f"/api/v1/Attachment/file/{attachment_id}",

View File

@@ -17,8 +17,6 @@ from services.redis_client import get_redis_client
from services.config import ESPOCRM_CONFIG, API_CONFIG
from services.logging_utils import get_service_logger
logger = logging.getLogger(__name__)
class EspoCRMAPI:
"""
@@ -60,6 +58,10 @@ class EspoCRMAPI:
self._entity_defs_cache: Dict[str, Dict[str, Any]] = {}
self._entity_defs_cache_ttl_seconds = int(os.getenv('ESPOCRM_METADATA_TTL_SECONDS', '300'))
# Metadata cache (complete metadata loaded once)
self._metadata_cache: Optional[Dict[str, Any]] = None
self._metadata_cache_ts: float = 0
# Optional Redis for caching/rate limiting (centralized)
self.redis_client = get_redis_client(strict=False)
if self.redis_client:
@@ -89,20 +91,76 @@ class EspoCRMAPI:
if self._session and not self._session.closed:
await self._session.close()
async def get_entity_def(self, entity_type: str) -> Dict[str, Any]:
async def get_metadata(self) -> Dict[str, Any]:
"""
Get complete EspoCRM metadata (cached).
Loads once and caches for TTL duration.
Much faster than individual entity def calls.
Returns:
Complete metadata dict with entityDefs, clientDefs, etc.
"""
now = time.monotonic()
cached = self._entity_defs_cache.get(entity_type)
if cached and (now - cached['ts']) < self._entity_defs_cache_ttl_seconds:
return cached['data']
# Return cached if still valid
if (self._metadata_cache is not None and
(now - self._metadata_cache_ts) < self._entity_defs_cache_ttl_seconds):
return self._metadata_cache
# Load fresh metadata
try:
data = await self.api_call(f"/Metadata/EntityDefs/{entity_type}", method='GET')
except EspoCRMAPIError:
all_defs = await self.api_call("/Metadata/EntityDefs", method='GET')
data = all_defs.get(entity_type, {}) if isinstance(all_defs, dict) else {}
self._log("📥 Loading complete EspoCRM metadata...", level='debug')
metadata = await self.api_call("/Metadata", method='GET')
if not isinstance(metadata, dict):
self._log("⚠️ Metadata response is not a dict, using empty", level='warn')
metadata = {}
# Cache it
self._metadata_cache = metadata
self._metadata_cache_ts = now
entity_count = len(metadata.get('entityDefs', {}))
self._log(f"✅ Metadata cached: {entity_count} entity definitions", level='debug')
return metadata
except Exception as e:
self._log(f"❌ Failed to load metadata: {e}", level='error')
# Return empty dict as fallback
return {}
self._entity_defs_cache[entity_type] = {'ts': now, 'data': data}
return data
async def get_entity_def(self, entity_type: str) -> Dict[str, Any]:
"""
Get entity definition for a specific entity type (cached via metadata).
Uses complete metadata cache - much faster and correct API usage.
Args:
entity_type: Entity type (e.g., 'Document', 'CDokumente', 'Account')
Returns:
Entity definition dict with fields, links, etc.
"""
try:
metadata = await self.get_metadata()
entity_defs = metadata.get('entityDefs', {})
if not isinstance(entity_defs, dict):
self._log(f"⚠️ entityDefs is not a dict for {entity_type}", level='warn')
return {}
entity_def = entity_defs.get(entity_type, {})
if not entity_def:
self._log(f"⚠️ No entity definition found for '{entity_type}'", level='debug')
return entity_def
except Exception as e:
self._log(f"⚠️ Could not load entity def for {entity_type}: {e}", level='warn')
return {}
async def api_call(
self,
@@ -475,3 +533,199 @@ class EspoCRMAPI:
except aiohttp.ClientError as e:
self._log(f"Download failed: {e}", level='error')
raise EspoCRMError(f"Download request failed: {e}") from e
# ========== Junction Table Operations ==========
async def get_junction_entries(
self,
junction_entity: str,
filter_field: str,
filter_value: str,
max_size: int = 1000
) -> List[Dict[str, Any]]:
"""
Load junction table entries with filtering.
Args:
junction_entity: Junction entity name (e.g., 'CAIKnowledgeCDokumente')
filter_field: Field to filter on (e.g., 'cAIKnowledgeId')
filter_value: Value to match
max_size: Maximum entries to return
Returns:
List of junction records with ALL additionalColumns
Example:
entries = await espocrm.get_junction_entries(
'CAIKnowledgeCDokumente',
'cAIKnowledgeId',
'kb-123'
)
"""
self._log(f"Loading junction entries: {junction_entity} where {filter_field}={filter_value}")
result = await self.list_entities(
junction_entity,
where=[{
'type': 'equals',
'attribute': filter_field,
'value': filter_value
}],
max_size=max_size
)
entries = result.get('list', [])
self._log(f"✅ Loaded {len(entries)} junction entries")
return entries
async def update_junction_entry(
self,
junction_entity: str,
junction_id: str,
fields: Dict[str, Any]
) -> None:
"""
Update junction table entry.
Args:
junction_entity: Junction entity name
junction_id: Junction entry ID
fields: Fields to update
Example:
await espocrm.update_junction_entry(
'CAIKnowledgeCDokumente',
'jct-123',
{'syncstatus': 'synced', 'lastSync': '2026-03-11T20:00:00Z'}
)
"""
await self.update_entity(junction_entity, junction_id, fields)
async def get_knowledge_documents_with_junction(
self,
knowledge_id: str
) -> List[Dict[str, Any]]:
"""
Get all documents linked to a CAIKnowledge entry with junction data.
Uses custom EspoCRM endpoint: GET /JunctionData/CAIKnowledge/{knowledge_id}/dokumentes
Returns enriched list with:
- junctionId: Junction table ID
- cAIKnowledgeId, cDokumenteId: Junction keys
- aiDocumentId: XAI document ID from junction
- syncstatus: Sync status from junction (new, synced, failed, unclean)
- lastSync: Last sync timestamp from junction
- documentId, documentName: Document info
- blake3hash: Blake3 hash from document entity
- documentCreatedAt, documentModifiedAt: Document timestamps
This consolidates multiple API calls into one efficient query.
Args:
knowledge_id: CAIKnowledge entity ID
Returns:
List of document dicts with junction data
Example:
docs = await espocrm.get_knowledge_documents_with_junction('69b1b03582bb6e2da')
for doc in docs:
print(f"{doc['documentName']}: {doc['syncstatus']}")
"""
# JunctionData uses API Gateway URL, not direct EspoCRM
# Use gateway URL from env or construct from ESPOCRM_API_BASE_URL
gateway_url = os.getenv('ESPOCRM_GATEWAY_URL', 'https://api.bitbylaw.com/vmh/crm')
url = f"{gateway_url}/JunctionData/CAIKnowledge/{knowledge_id}/dokumentes"
self._log(f"GET {url}")
try:
session = await self._get_session()
timeout = aiohttp.ClientTimeout(total=self.api_timeout_seconds)
async with session.get(url, headers=self._get_headers(), timeout=timeout) as response:
self._log(f"Response status: {response.status}")
if response.status == 404:
# Knowledge base not found or no documents linked
return []
if response.status >= 400:
error_text = await response.text()
raise EspoCRMAPIError(f"JunctionData GET failed: {response.status} - {error_text}")
result = await response.json()
documents = result.get('list', [])
self._log(f"✅ Loaded {len(documents)} document(s) with junction data")
return documents
except asyncio.TimeoutError:
raise EspoCRMTimeoutError(f"Timeout getting junction data for knowledge {knowledge_id}")
except aiohttp.ClientError as e:
raise EspoCRMAPIError(f"Network error getting junction data: {e}")
async def update_knowledge_document_junction(
self,
knowledge_id: str,
document_id: str,
fields: Dict[str, Any],
update_last_sync: bool = True
) -> Dict[str, Any]:
"""
Update junction columns for a specific document link.
Uses custom EspoCRM endpoint:
PUT /JunctionData/CAIKnowledge/{knowledge_id}/dokumentes/{document_id}
Args:
knowledge_id: CAIKnowledge entity ID
document_id: CDokumente entity ID
fields: Junction fields to update (aiDocumentId, syncstatus, etc.)
update_last_sync: Whether to update lastSync timestamp (default: True)
Returns:
Updated junction data
Example:
await espocrm.update_knowledge_document_junction(
'69b1b03582bb6e2da',
'69a68b556a39771bf',
{
'aiDocumentId': 'xai-file-abc123',
'syncstatus': 'synced'
},
update_last_sync=True
)
"""
# JunctionData uses API Gateway URL, not direct EspoCRM
gateway_url = os.getenv('ESPOCRM_GATEWAY_URL', 'https://api.bitbylaw.com/vmh/crm')
url = f"{gateway_url}/JunctionData/CAIKnowledge/{knowledge_id}/dokumentes/{document_id}"
payload = {**fields}
if update_last_sync:
payload['updateLastSync'] = True
self._log(f"PUT {url}")
self._log(f" Payload: {payload}")
try:
session = await self._get_session()
timeout = aiohttp.ClientTimeout(total=self.api_timeout_seconds)
async with session.put(url, headers=self._get_headers(), json=payload, timeout=timeout) as response:
self._log(f"Response status: {response.status}")
if response.status >= 400:
error_text = await response.text()
raise EspoCRMAPIError(f"JunctionData PUT failed: {response.status} - {error_text}")
result = await response.json()
self._log(f"✅ Junction updated: junctionId={result.get('junctionId')}")
return result
except asyncio.TimeoutError:
raise EspoCRMTimeoutError(f"Timeout updating junction data")
except aiohttp.ClientError as e:
raise EspoCRMAPIError(f"Network error updating junction data: {e}")

View File

@@ -18,8 +18,6 @@ from services.models import (
from services.exceptions import ValidationError
from services.config import FEATURE_FLAGS
logger = logging.getLogger(__name__)
class BeteiligteMapper:
"""Mapper für CBeteiligte (EspoCRM) ↔ Beteiligte (Advoware)"""

View File

@@ -24,8 +24,6 @@ from services.kommunikation_mapper import (
from services.advoware_service import AdvowareService
from services.espocrm import EspoCRMAPI
logger = logging.getLogger(__name__)
class KommunikationSyncManager:
"""Manager für Kommunikation-Synchronisation"""

View File

@@ -0,0 +1,218 @@
"""LangChain xAI Integration Service
Service für LangChain ChatXAI Integration mit File Search Binding.
Analog zu xai_service.py für xAI Files API.
"""
import os
from typing import Dict, List, Any, Optional, AsyncIterator
from services.logging_utils import get_service_logger
class LangChainXAIService:
"""
Wrapper für LangChain ChatXAI mit Motia-Integration.
Benötigte Umgebungsvariablen:
- XAI_API_KEY: API Key für xAI (für ChatXAI model)
Usage:
service = LangChainXAIService(ctx)
model = service.get_chat_model(model="grok-4-1-fast-reasoning")
model_with_tools = service.bind_file_search(model, collection_id)
result = await service.invoke_chat(model_with_tools, messages)
"""
def __init__(self, ctx=None):
"""
Initialize LangChain xAI Service.
Args:
ctx: Optional Motia context for logging
Raises:
ValueError: If XAI_API_KEY not configured
"""
self.api_key = os.getenv('XAI_API_KEY', '')
self.ctx = ctx
self.logger = get_service_logger('langchain_xai', ctx)
if not self.api_key:
raise ValueError("XAI_API_KEY not configured in environment")
def _log(self, msg: str, level: str = 'info') -> None:
"""Delegate logging to service logger"""
log_func = getattr(self.logger, level, self.logger.info)
log_func(msg)
def get_chat_model(
self,
model: str = "grok-4-1-fast-reasoning",
temperature: float = 0.7,
max_tokens: Optional[int] = None
):
"""
Initialisiert ChatXAI Model.
Args:
model: Model name (default: grok-4-1-fast-reasoning)
temperature: Sampling temperature 0.0-1.0
max_tokens: Optional max tokens for response
Returns:
ChatXAI model instance
Raises:
ImportError: If langchain_xai not installed
"""
try:
from langchain_xai import ChatXAI
except ImportError:
raise ImportError(
"langchain_xai not installed. "
"Run: pip install langchain-xai>=0.2.0"
)
self._log(f"🤖 Initializing ChatXAI: model={model}, temp={temperature}")
kwargs = {
"model": model,
"api_key": self.api_key,
"temperature": temperature
}
if max_tokens:
kwargs["max_tokens"] = max_tokens
return ChatXAI(**kwargs)
def bind_tools(
self,
model,
collection_id: Optional[str] = None,
enable_web_search: bool = False,
web_search_config: Optional[Dict[str, Any]] = None,
max_num_results: int = 10
):
"""
Bindet xAI Tools (file_search und/oder web_search) an Model.
Args:
model: ChatXAI model instance
collection_id: Optional xAI Collection ID für file_search
enable_web_search: Enable web search tool (default: False)
web_search_config: Optional web search configuration:
{
'allowed_domains': ['example.com'], # Max 5 domains
'excluded_domains': ['spam.com'], # Max 5 domains
'enable_image_understanding': True
}
max_num_results: Max results from file search (default: 10)
Returns:
Model with requested tools bound (file_search and/or web_search)
"""
tools = []
# Add file_search tool if collection_id provided
if collection_id:
self._log(f"🔍 Binding file_search: collection={collection_id}")
tools.append({
"type": "file_search",
"vector_store_ids": [collection_id],
"max_num_results": max_num_results
})
# Add web_search tool if enabled
if enable_web_search:
self._log("🌐 Binding web_search")
web_search_tool = {"type": "web_search"}
# Add optional web search filters
if web_search_config:
if 'allowed_domains' in web_search_config:
domains = web_search_config['allowed_domains'][:5] # Max 5
web_search_tool['filters'] = {'allowed_domains': domains}
self._log(f" Allowed domains: {domains}")
elif 'excluded_domains' in web_search_config:
domains = web_search_config['excluded_domains'][:5] # Max 5
web_search_tool['filters'] = {'excluded_domains': domains}
self._log(f" Excluded domains: {domains}")
if web_search_config.get('enable_image_understanding'):
web_search_tool['enable_image_understanding'] = True
self._log(" Image understanding: enabled")
tools.append(web_search_tool)
if not tools:
self._log("⚠️ No tools to bind (no collection_id and web_search disabled)", level='warn')
return model
self._log(f"🔧 Binding {len(tools)} tool(s) to model")
return model.bind_tools(tools)
def bind_file_search(
self,
model,
collection_id: str,
max_num_results: int = 10
):
"""
Legacy method: Bindet nur file_search Tool an Model.
Use bind_tools() for more flexibility.
"""
return self.bind_tools(
model=model,
collection_id=collection_id,
max_num_results=max_num_results
)
async def invoke_chat(
self,
model,
messages: List[Dict[str, Any]]
) -> Any:
"""
Non-streaming Chat Completion.
Args:
model: ChatXAI model (with or without tools)
messages: List of message dicts [{"role": "user", "content": "..."}]
Returns:
LangChain AIMessage with response
Raises:
Exception: If API call fails
"""
self._log(f"💬 Invoking chat: {len(messages)} messages", level='debug')
result = await model.ainvoke(messages)
self._log(f"✅ Response received: {len(result.content)} chars", level='debug')
return result
async def astream_chat(
self,
model,
messages: List[Dict[str, Any]]
) -> AsyncIterator:
"""
Streaming Chat Completion.
Args:
model: ChatXAI model (with or without tools)
messages: List of message dicts
Yields:
Chunks from streaming response
Example:
async for chunk in service.astream_chat(model, messages):
delta = chunk.content if hasattr(chunk, "content") else ""
# Process delta...
"""
self._log(f"💬 Streaming chat: {len(messages)} messages", level='debug')
async for chunk in model.astream(messages):
yield chunk

View File

@@ -5,6 +5,59 @@ Vereinheitlicht Logging über:
- Standard Python Logger
- Motia FlowContext Logger
- Structured Logging
Usage Guidelines:
=================
FOR SERVICES: Use get_service_logger('service_name', context)
-----------------------------------------------------------------
Example:
from services.logging_utils import get_service_logger
class XAIService:
def __init__(self, ctx=None):
self.logger = get_service_logger('xai', ctx)
def upload(self):
self.logger.info("Uploading file...")
FOR STEPS: Use ctx.logger directly (preferred)
-----------------------------------------------------------------
Steps already have ctx.logger available - use it directly:
async def handler(event_data, ctx: FlowContext):
ctx.logger.info("Processing event")
Alternative: Use get_step_logger() for additional loggers:
step_logger = get_step_logger('beteiligte_sync', ctx)
FOR SYNC UTILS: Inherit from BaseSyncUtils (provides self.logger)
-----------------------------------------------------------------
from services.sync_utils_base import BaseSyncUtils
class MySync(BaseSyncUtils):
def __init__(self, espocrm, redis, context):
super().__init__(espocrm, redis, context)
# self.logger is now available
def sync(self):
self._log("Syncing...", level='info')
FOR STANDALONE UTILITIES: Use get_logger()
-----------------------------------------------------------------
from services.logging_utils import get_logger
logger = get_logger('my_module', context)
logger.info("Processing...")
CONSISTENCY RULES:
==================
✅ Services: get_service_logger('service_name', ctx)
✅ Steps: ctx.logger (direct) or get_step_logger('step_name', ctx)
✅ Sync Utils: Inherit from BaseSyncUtils → use self._log() or self.logger
✅ Standalone: get_logger('module_name', ctx)
❌ DO NOT: Use module-level logging.getLogger(__name__)
❌ DO NOT: Mix get_logger() and get_service_logger() in same module
"""
import logging

View File

@@ -16,7 +16,7 @@ from enum import Enum
# ========== Enums ==========
class Rechtsform(str, Enum):
"""Rechtsformen für Beteiligte"""
"""Legal forms for Beteiligte"""
NATUERLICHE_PERSON = ""
GMBH = "GmbH"
AG = "AG"
@@ -29,7 +29,7 @@ class Rechtsform(str, Enum):
class SyncStatus(str, Enum):
"""Sync Status für EspoCRM Entities"""
"""Sync status for EspoCRM entities (Beteiligte)"""
PENDING_SYNC = "pending_sync"
SYNCING = "syncing"
CLEAN = "clean"
@@ -38,14 +38,70 @@ class SyncStatus(str, Enum):
PERMANENTLY_FAILED = "permanently_failed"
class FileStatus(str, Enum):
"""Valid values for CDokumente.fileStatus field"""
NEW = "new"
CHANGED = "changed"
SYNCED = "synced"
def __str__(self) -> str:
return self.value
class XAISyncStatus(str, Enum):
"""Valid values for CDokumente.xaiSyncStatus field"""
NO_SYNC = "no_sync" # Entity has no xAI collections
PENDING_SYNC = "pending_sync" # Sync in progress (locked)
CLEAN = "clean" # Synced successfully
UNCLEAN = "unclean" # Needs re-sync (file changed)
FAILED = "failed" # Sync failed (see xaiSyncError)
def __str__(self) -> str:
return self.value
class SalutationType(str, Enum):
"""Anredetypen"""
"""Salutation types"""
HERR = "Herr"
FRAU = "Frau"
DIVERS = "Divers"
FIRMA = ""
class AIKnowledgeActivationStatus(str, Enum):
"""Activation status for CAIKnowledge collections"""
NEW = "new" # Collection noch nicht in XAI erstellt
ACTIVE = "active" # Collection aktiv, Sync läuft
PAUSED = "paused" # Collection existiert, aber kein Sync
DEACTIVATED = "deactivated" # Collection aus XAI gelöscht
def __str__(self) -> str:
return self.value
class AIKnowledgeSyncStatus(str, Enum):
"""Sync status for CAIKnowledge"""
UNCLEAN = "unclean" # Änderungen pending
PENDING_SYNC = "pending_sync" # Sync läuft (locked)
SYNCED = "synced" # Alles synced
FAILED = "failed" # Sync fehlgeschlagen
def __str__(self) -> str:
return self.value
class JunctionSyncStatus(str, Enum):
"""Sync status for junction tables (CAIKnowledgeCDokumente)"""
NEW = "new"
UNCLEAN = "unclean"
SYNCED = "synced"
FAILED = "failed"
UNSUPPORTED = "unsupported"
def __str__(self) -> str:
return self.value
# ========== Advoware Models ==========
class AdvowareBeteiligteBase(BaseModel):

View File

@@ -1,51 +1,58 @@
"""
Redis Client Factory
Zentralisierte Redis-Client-Verwaltung mit:
- Singleton Pattern
- Connection Pooling
- Automatic Reconnection
- Health Checks
Centralized Redis client management with:
- Singleton pattern
- Connection pooling
- Automatic reconnection
- Health checks
"""
import redis
import os
import logging
from typing import Optional
from services.exceptions import RedisConnectionError
logger = logging.getLogger(__name__)
from services.logging_utils import get_service_logger
class RedisClientFactory:
"""
Singleton Factory für Redis Clients.
Singleton factory for Redis clients.
Vorteile:
- Eine zentrale Konfiguration
- Connection Pooling
- Lazy Initialization
- Besseres Error Handling
Benefits:
- Centralized configuration
- Connection pooling
- Lazy initialization
- Better error handling
"""
_instance: Optional[redis.Redis] = None
_connection_pool: Optional[redis.ConnectionPool] = None
_logger = None
@classmethod
def _get_logger(cls):
"""Get logger instance (lazy initialization)"""
if cls._logger is None:
cls._logger = get_service_logger('redis_factory', None)
return cls._logger
@classmethod
def get_client(cls, strict: bool = False) -> Optional[redis.Redis]:
"""
Gibt Redis Client zurück (erstellt wenn nötig).
Return Redis client (creates if needed).
Args:
strict: Wenn True, wirft Exception bei Verbindungsfehlern.
Wenn False, gibt None zurück (für optionale Redis-Nutzung).
strict: If True, raises exception on connection failures.
If False, returns None (for optional Redis usage).
Returns:
Redis client oder None (wenn strict=False und Verbindung fehlschlägt)
Redis client or None (if strict=False and connection fails)
Raises:
RedisConnectionError: Wenn strict=True und Verbindung fehlschlägt
RedisConnectionError: If strict=True and connection fails
"""
logger = cls._get_logger()
if cls._instance is None:
try:
cls._instance = cls._create_client()
@@ -65,14 +72,15 @@ class RedisClientFactory:
@classmethod
def _create_client(cls) -> redis.Redis:
"""
Erstellt neuen Redis Client mit Connection Pool.
Create new Redis client with connection pool.
Returns:
Configured Redis client
Raises:
redis.ConnectionError: Bei Verbindungsproblemen
redis.ConnectionError: On connection problems
"""
logger = cls._get_logger()
# Load configuration from environment
redis_host = os.getenv('REDIS_HOST', 'localhost')
redis_port = int(os.getenv('REDIS_PORT', '6379'))
@@ -94,7 +102,7 @@ class RedisClientFactory:
socket_timeout=redis_timeout,
socket_connect_timeout=redis_timeout,
max_connections=redis_max_connections,
decode_responses=True # Auto-decode bytes zu strings
decode_responses=True # Auto-decode bytes to strings
)
# Create client from pool
@@ -108,10 +116,11 @@ class RedisClientFactory:
@classmethod
def reset(cls) -> None:
"""
Reset factory state (hauptsächlich für Tests).
Reset factory state (mainly for tests).
Schließt bestehende Verbindungen und setzt Singleton zurück.
Closes existing connections and resets singleton.
"""
logger = cls._get_logger()
if cls._instance:
try:
cls._instance.close()
@@ -131,11 +140,12 @@ class RedisClientFactory:
@classmethod
def health_check(cls) -> bool:
"""
Prüft Redis-Verbindung.
Check Redis connection.
Returns:
True wenn Redis erreichbar, False sonst
True if Redis is reachable, False otherwise
"""
logger = cls._get_logger()
try:
client = cls.get_client(strict=False)
if client is None:
@@ -150,11 +160,12 @@ class RedisClientFactory:
@classmethod
def get_info(cls) -> Optional[dict]:
"""
Gibt Redis Server Info zurück (für Monitoring).
Return Redis server info (for monitoring).
Returns:
Redis info dict oder None bei Fehler
Redis info dict or None on error
"""
logger = cls._get_logger()
try:
client = cls.get_client(strict=False)
if client is None:
@@ -170,22 +181,22 @@ class RedisClientFactory:
def get_redis_client(strict: bool = False) -> Optional[redis.Redis]:
"""
Convenience function für Redis Client.
Convenience function for Redis client.
Args:
strict: Wenn True, wirft Exception bei Fehler
strict: If True, raises exception on error
Returns:
Redis client oder None
Redis client or None
"""
return RedisClientFactory.get_client(strict=strict)
def is_redis_available() -> bool:
"""
Prüft ob Redis verfügbar ist.
Check if Redis is available.
Returns:
True wenn Redis erreichbar
True if Redis is reachable
"""
return RedisClientFactory.health_check()

View File

@@ -14,7 +14,7 @@ import pytz
from services.exceptions import RedisConnectionError, LockAcquisitionError
from services.redis_client import get_redis_client
from services.config import SYNC_CONFIG, get_lock_key
from services.logging_utils import get_logger
from services.logging_utils import get_service_logger
import redis
@@ -31,7 +31,7 @@ class BaseSyncUtils:
"""
self.espocrm = espocrm_api
self.context = context
self.logger = get_logger('sync_utils', context)
self.logger = get_service_logger('sync_utils', context)
# Use provided Redis client or get from factory
self.redis = redis_client or get_redis_client(strict=False)

View File

@@ -1,10 +1,9 @@
"""xAI Files & Collections Service"""
import os
import asyncio
import aiohttp
import logging
from typing import Optional, List
logger = logging.getLogger(__name__)
from typing import Optional, List, Dict, Tuple
from services.logging_utils import get_service_logger
XAI_FILES_URL = "https://api.x.ai"
XAI_MANAGEMENT_URL = "https://management-api.x.ai"
@@ -23,6 +22,7 @@ class XAIService:
self.api_key = os.getenv('XAI_API_KEY', '')
self.management_key = os.getenv('XAI_MANAGEMENT_KEY', '')
self.ctx = ctx
self.logger = get_service_logger('xai', ctx)
self._session: Optional[aiohttp.ClientSession] = None
if not self.api_key:
@@ -31,10 +31,9 @@ class XAIService:
raise ValueError("XAI_MANAGEMENT_KEY not configured in environment")
def _log(self, msg: str, level: str = 'info') -> None:
if self.ctx:
getattr(self.ctx.logger, level, self.ctx.logger.info)(msg)
else:
getattr(logger, level, logger.info)(msg)
"""Delegate logging to service logger"""
log_func = getattr(self.logger, level, self.logger.info)
log_func(msg)
async def _get_session(self) -> aiohttp.ClientSession:
if self._session is None or self._session.closed:
@@ -64,14 +63,31 @@ class XAIService:
Raises:
RuntimeError: bei HTTP-Fehler oder fehlendem file_id in der Antwort
"""
self._log(f"📤 Uploading {len(file_content)} bytes to xAI: {filename}")
# Normalize MIME type: xAI needs correct Content-Type for proper processing
# If generic octet-stream but file is clearly a PDF, fix it
if mime_type == 'application/octet-stream' and filename.lower().endswith('.pdf'):
mime_type = 'application/pdf'
self._log(f"⚠️ Corrected MIME type to application/pdf for {filename}")
self._log(f"📤 Uploading {len(file_content)} bytes to xAI: {filename} ({mime_type})")
session = await self._get_session()
url = f"{XAI_FILES_URL}/v1/files"
headers = {"Authorization": f"Bearer {self.api_key}"}
form = aiohttp.FormData()
form.add_field('file', file_content, filename=filename, content_type=mime_type)
# Create multipart form with explicit UTF-8 filename encoding
# aiohttp automatically URL-encodes filenames with special chars,
# but xAI expects raw UTF-8 in the filename parameter
form = aiohttp.FormData(quote_fields=False)
form.add_field(
'file',
file_content,
filename=filename,
content_type=mime_type
)
# CRITICAL: purpose="file_search" enables proper PDF processing
# Without this, xAI throws "internal error" on complex PDFs
form.add_field('purpose', 'file_search')
async with session.post(url, data=form, headers=headers) as response:
try:
@@ -96,9 +112,12 @@ class XAIService:
async def add_to_collection(self, collection_id: str, file_id: str) -> None:
"""
Fügt eine Datei einer xAI-Collection hinzu.
Fügt eine Datei einer xAI-Collection (Vector Store) hinzu.
POST https://management-api.x.ai/v1/collections/{collection_id}/documents/{file_id}
POST https://api.x.ai/v1/vector_stores/{vector_store_id}/files
Uses the OpenAI-compatible API pattern for adding files to vector stores.
This triggers proper indexing and processing.
Raises:
RuntimeError: bei HTTP-Fehler
@@ -106,13 +125,16 @@ class XAIService:
self._log(f"📚 Adding file {file_id} to collection {collection_id}")
session = await self._get_session()
url = f"{XAI_MANAGEMENT_URL}/v1/collections/{collection_id}/documents/{file_id}"
# Use the OpenAI-compatible endpoint (not management API)
url = f"{XAI_FILES_URL}/v1/vector_stores/{collection_id}/files"
headers = {
"Authorization": f"Bearer {self.management_key}",
"Authorization": f"Bearer {self.api_key}",
"Content-Type": "application/json",
}
async with session.post(url, headers=headers) as response:
payload = {"file_id": file_id}
async with session.post(url, json=payload, headers=headers) as response:
if response.status not in (200, 201):
raw = await response.text()
raise RuntimeError(
@@ -175,3 +197,333 @@ class XAIService:
f"⚠️ Fehler beim Entfernen aus Collection {collection_id}: {e}",
level='warn'
)
# ========== Collection Management ==========
async def create_collection(
self,
name: str,
metadata: Optional[Dict[str, str]] = None,
field_definitions: Optional[List[Dict]] = None
) -> Dict:
"""
Erstellt eine neue xAI Collection.
POST https://management-api.x.ai/v1/collections
Args:
name: Collection name
metadata: Optional metadata dict
field_definitions: Optional field definitions for metadata fields
Returns:
Collection object mit 'id' field
Raises:
RuntimeError: bei HTTP-Fehler
"""
self._log(f"📚 Creating collection: {name}")
# Standard field definitions für document metadata
if field_definitions is None:
field_definitions = [
{"key": "document_name", "inject_into_chunk": True},
{"key": "description", "inject_into_chunk": True},
{"key": "created_at", "inject_into_chunk": False},
{"key": "modified_at", "inject_into_chunk": False},
{"key": "espocrm_id", "inject_into_chunk": False}
]
session = await self._get_session()
url = f"{XAI_MANAGEMENT_URL}/v1/collections"
headers = {
"Authorization": f"Bearer {self.management_key}",
"Content-Type": "application/json"
}
body = {
"collection_name": name,
"field_definitions": field_definitions
}
# Add metadata if provided
if metadata:
body["metadata"] = metadata
async with session.post(url, json=body, headers=headers) as response:
if response.status not in (200, 201):
raw = await response.text()
raise RuntimeError(
f"Failed to create collection ({response.status}): {raw}"
)
data = await response.json()
# API returns 'collection_id' not 'id'
collection_id = data.get('collection_id') or data.get('id')
self._log(f"✅ Collection created: {collection_id}")
return data
async def get_collection(self, collection_id: str) -> Optional[Dict]:
"""
Holt Collection-Details.
GET https://management-api.x.ai/v1/collections/{collection_id}
Returns:
Collection object or None if not found
Raises:
RuntimeError: bei HTTP-Fehler (außer 404)
"""
self._log(f"📄 Getting collection: {collection_id}")
session = await self._get_session()
url = f"{XAI_MANAGEMENT_URL}/v1/collections/{collection_id}"
headers = {"Authorization": f"Bearer {self.management_key}"}
async with session.get(url, headers=headers) as response:
if response.status == 404:
self._log(f"⚠️ Collection not found: {collection_id}", level='warn')
return None
if response.status not in (200,):
raw = await response.text()
raise RuntimeError(
f"Failed to get collection ({response.status}): {raw}"
)
data = await response.json()
self._log(f"✅ Collection retrieved: {data.get('collection_name', 'N/A')}")
return data
async def delete_collection(self, collection_id: str) -> None:
"""
Löscht eine XAI Collection.
DELETE https://management-api.x.ai/v1/collections/{collection_id}
NOTE: Documents in der Collection werden NICHT gelöscht!
Sie können noch in anderen Collections sein.
Raises:
RuntimeError: bei HTTP-Fehler
"""
self._log(f"🗑️ Deleting collection {collection_id}")
session = await self._get_session()
url = f"{XAI_MANAGEMENT_URL}/v1/collections/{collection_id}"
headers = {"Authorization": f"Bearer {self.management_key}"}
async with session.delete(url, headers=headers) as response:
if response.status not in (200, 204):
raw = await response.text()
raise RuntimeError(
f"Failed to delete collection {collection_id} ({response.status}): {raw}"
)
self._log(f"✅ Collection deleted: {collection_id}")
async def list_collection_documents(self, collection_id: str) -> List[Dict]:
"""
Listet alle Dokumente in einer Collection.
GET https://management-api.x.ai/v1/collections/{collection_id}/documents
Returns:
List von normalized document objects:
[
{
'file_id': 'file_...',
'filename': 'doc.pdf',
'blake3_hash': 'hex_string', # Plain hex, kein prefix
'size_bytes': 12345,
'content_type': 'application/pdf',
'fields': {}, # Custom metadata
'status': 'DOCUMENT_STATUS_...'
}
]
Raises:
RuntimeError: bei HTTP-Fehler
"""
self._log(f"📋 Listing documents in collection {collection_id}")
session = await self._get_session()
url = f"{XAI_MANAGEMENT_URL}/v1/collections/{collection_id}/documents"
headers = {"Authorization": f"Bearer {self.management_key}"}
async with session.get(url, headers=headers) as response:
if response.status not in (200,):
raw = await response.text()
raise RuntimeError(
f"Failed to list documents ({response.status}): {raw}"
)
data = await response.json()
# API gibt Liste zurück oder dict mit 'documents' key
if isinstance(data, list):
raw_documents = data
elif isinstance(data, dict) and 'documents' in data:
raw_documents = data['documents']
else:
raw_documents = []
# Normalize nested structure: file_metadata -> top-level
normalized = []
for doc in raw_documents:
file_meta = doc.get('file_metadata', {})
normalized.append({
'file_id': file_meta.get('file_id'),
'filename': file_meta.get('name'),
'blake3_hash': file_meta.get('hash'), # Plain hex string
'size_bytes': int(file_meta.get('size_bytes', 0)) if file_meta.get('size_bytes') else 0,
'content_type': file_meta.get('content_type'),
'created_at': file_meta.get('created_at'),
'fields': doc.get('fields', {}),
'status': doc.get('status')
})
self._log(f"✅ Listed {len(normalized)} documents")
return normalized
async def get_collection_document(self, collection_id: str, file_id: str) -> Optional[Dict]:
"""
Holt Dokument-Details aus einer XAI Collection.
GET https://management-api.x.ai/v1/collections/{collection_id}/documents/{file_id}
Returns:
Normalized dict mit document info:
{
'file_id': 'file_xyz',
'filename': 'document.pdf',
'blake3_hash': 'hex_string', # Plain hex, kein prefix
'size_bytes': 12345,
'content_type': 'application/pdf',
'fields': {...} # Custom metadata
}
Returns None if not found.
"""
self._log(f"📄 Getting document {file_id} from collection {collection_id}")
session = await self._get_session()
url = f"{XAI_MANAGEMENT_URL}/v1/collections/{collection_id}/documents/{file_id}"
headers = {"Authorization": f"Bearer {self.management_key}"}
async with session.get(url, headers=headers) as response:
if response.status == 404:
return None
if response.status not in (200,):
raw = await response.text()
raise RuntimeError(
f"Failed to get document from collection ({response.status}): {raw}"
)
data = await response.json()
# Normalize nested structure
file_meta = data.get('file_metadata', {})
normalized = {
'file_id': file_meta.get('file_id'),
'filename': file_meta.get('name'),
'blake3_hash': file_meta.get('hash'), # Plain hex
'size_bytes': int(file_meta.get('size_bytes', 0)) if file_meta.get('size_bytes') else 0,
'content_type': file_meta.get('content_type'),
'created_at': file_meta.get('created_at'),
'fields': data.get('fields', {}),
'status': data.get('status')
}
self._log(f"✅ Document info retrieved: {normalized.get('filename', 'N/A')}")
return normalized
async def update_document_metadata(
self,
collection_id: str,
file_id: str,
metadata: Dict[str, str]
) -> None:
"""
Aktualisiert nur Metadaten eines Documents (kein File-Upload).
PATCH https://management-api.x.ai/v1/collections/{collection_id}/documents/{file_id}
Args:
collection_id: XAI Collection ID
file_id: XAI file_id
metadata: Updated metadata fields
Raises:
RuntimeError: bei HTTP-Fehler
"""
self._log(f"📝 Updating metadata for document {file_id}")
session = await self._get_session()
url = f"{XAI_MANAGEMENT_URL}/v1/collections/{collection_id}/documents/{file_id}"
headers = {
"Authorization": f"Bearer {self.management_key}",
"Content-Type": "application/json"
}
body = {"fields": metadata}
async with session.patch(url, json=body, headers=headers) as response:
if response.status not in (200, 204):
raw = await response.text()
raise RuntimeError(
f"Failed to update document metadata ({response.status}): {raw}"
)
self._log(f"✅ Metadata updated for {file_id}")
def is_mime_type_supported(self, mime_type: str) -> bool:
"""
Prüft, ob XAI diesen MIME-Type unterstützt.
Args:
mime_type: MIME type string
Returns:
True wenn unterstützt, False sonst
"""
# Liste der unterstützten MIME-Types basierend auf XAI Dokumentation
supported_types = {
# Documents
'application/pdf',
'application/msword',
'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
'application/vnd.ms-excel',
'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
'application/vnd.oasis.opendocument.text',
'application/epub+zip',
'application/vnd.openxmlformats-officedocument.presentationml.presentation',
# Text
'text/plain',
'text/html',
'text/markdown',
'text/csv',
'text/xml',
# Code
'text/javascript',
'application/json',
'application/xml',
'text/x-python',
'text/x-java-source',
'text/x-c',
'text/x-c++src',
# Other
'application/zip',
}
# Normalisiere MIME-Type (lowercase, strip whitespace)
normalized = mime_type.lower().strip()
return normalized in supported_types

View File

@@ -17,7 +17,7 @@ from calendar_sync_utils import (
import math
import time
from datetime import datetime
from typing import Any
from typing import Any, Dict
from motia import queue, FlowContext
from pydantic import BaseModel, Field
from services.advoware_service import AdvowareService
@@ -33,7 +33,7 @@ config = {
}
async def handler(input_data: dict, ctx: FlowContext):
async def handler(input_data: Dict[str, Any], ctx: FlowContext) -> None:
"""
Handler that fetches all employees, sorts by last sync time,
and emits calendar_sync_employee events for the oldest ones.

View File

@@ -7,7 +7,7 @@ Supports syncing a single employee or all employees.
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent))
from calendar_sync_utils import get_redis_client, set_employee_lock, log_operation
from calendar_sync_utils import get_redis_client, set_employee_lock, get_logger
from motia import http, ApiRequest, ApiResponse, FlowContext
@@ -41,7 +41,7 @@ async def handler(request: ApiRequest, ctx: FlowContext) -> ApiResponse:
status=400,
body={
'error': 'kuerzel required',
'message': 'Bitte kuerzel im Body angeben'
'message': 'Please provide kuerzel in body'
}
)
@@ -49,7 +49,7 @@ async def handler(request: ApiRequest, ctx: FlowContext) -> ApiResponse:
if kuerzel_upper == 'ALL':
# Emit sync-all event
log_operation('info', "Calendar Sync API: Emitting sync-all event", context=ctx)
ctx.logger.info("Calendar Sync API: Emitting sync-all event")
await ctx.enqueue({
"topic": "calendar_sync_all",
"data": {
@@ -60,7 +60,7 @@ async def handler(request: ApiRequest, ctx: FlowContext) -> ApiResponse:
status=200,
body={
'status': 'triggered',
'message': 'Calendar sync wurde für alle Mitarbeiter ausgelöst',
'message': 'Calendar sync triggered for all employees',
'triggered_by': 'api'
}
)
@@ -69,7 +69,7 @@ async def handler(request: ApiRequest, ctx: FlowContext) -> ApiResponse:
redis_client = get_redis_client(ctx)
if not set_employee_lock(redis_client, kuerzel_upper, 'api', ctx):
log_operation('info', f"Calendar Sync API: Sync already active for {kuerzel_upper}, skipping", context=ctx)
ctx.logger.info(f"Calendar Sync API: Sync already active for {kuerzel_upper}, skipping")
return ApiResponse(
status=409,
body={
@@ -80,7 +80,7 @@ async def handler(request: ApiRequest, ctx: FlowContext) -> ApiResponse:
}
)
log_operation('info', f"Calendar Sync API called for {kuerzel_upper}", context=ctx)
ctx.logger.info(f"Calendar Sync API called for {kuerzel_upper}")
# Lock successfully set, now emit event
await ctx.enqueue({
@@ -95,14 +95,14 @@ async def handler(request: ApiRequest, ctx: FlowContext) -> ApiResponse:
status=200,
body={
'status': 'triggered',
'message': f'Calendar sync was triggered for {kuerzel_upper}',
'message': f'Calendar sync triggered for {kuerzel_upper}',
'kuerzel': kuerzel_upper,
'triggered_by': 'api'
}
)
except Exception as e:
log_operation('error', f"Error in API trigger: {e}", context=ctx)
ctx.logger.error(f"Error in API trigger: {e}")
return ApiResponse(
status=500,
body={

View File

@@ -9,6 +9,7 @@ from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent))
from calendar_sync_utils import log_operation
from typing import Dict, Any
from motia import cron, FlowContext
@@ -17,16 +18,19 @@ config = {
'description': 'Runs calendar sync automatically every 15 minutes',
'flows': ['advoware-calendar-sync'],
'triggers': [
cron("0 */15 * * * *") # Every 15 minutes at second 0 (6-field: sec min hour day month weekday)
cron("0 15 1 * * *") # Every 15 minutes at second 0 (6-field: sec min hour day month weekday)
],
'enqueues': ['calendar_sync_all']
}
async def handler(input_data: dict, ctx: FlowContext):
async def handler(input_data: None, ctx: FlowContext) -> None:
"""Cron handler that triggers the calendar sync cascade."""
try:
log_operation('info', "Calendar Sync Cron: Starting to emit sync-all event", context=ctx)
ctx.logger.info("=" * 80)
ctx.logger.info("🕐 CALENDAR SYNC CRON: STARTING")
ctx.logger.info("=" * 80)
ctx.logger.info("Emitting sync-all event")
# Enqueue sync-all event
await ctx.enqueue({
@@ -36,15 +40,11 @@ async def handler(input_data: dict, ctx: FlowContext):
}
})
log_operation('info', "Calendar Sync Cron: Emitted sync-all event", context=ctx)
return {
'status': 'completed',
'triggered_by': 'cron'
}
ctx.logger.info("Calendar sync-all event emitted successfully")
ctx.logger.info("=" * 80)
except Exception as e:
log_operation('error', f"Fehler beim Cron-Job: {e}", context=ctx)
return {
'status': 'error',
'error': str(e)
}
ctx.logger.error("=" * 80)
ctx.logger.error("❌ ERROR: CALENDAR SYNC CRON")
ctx.logger.error(f"Error: {e}")
ctx.logger.error("=" * 80)

View File

@@ -14,6 +14,7 @@ import asyncio
import os
import datetime
from datetime import timedelta
from typing import Dict, Any
import pytz
import backoff
import time
@@ -64,7 +65,8 @@ async def enforce_global_rate_limit(context=None):
socket_timeout=int(os.getenv('REDIS_TIMEOUT_SECONDS', '5'))
)
lua_script = """
try:
lua_script = """
local key = KEYS[1]
local current_time_ms = tonumber(ARGV[1])
local max_tokens = tonumber(ARGV[2])
@@ -96,7 +98,6 @@ async def enforce_global_rate_limit(context=None):
end
"""
try:
script = redis_client.register_script(lua_script)
while True:
@@ -120,6 +121,12 @@ async def enforce_global_rate_limit(context=None):
except Exception as e:
log_operation('error', f"Rate limiting failed: {e}. Proceeding without limit.", context=context)
finally:
# Always close Redis connection to prevent resource leaks
try:
redis_client.close()
except Exception:
pass
@backoff.on_exception(backoff.expo, HttpError, max_tries=4, base=3,
@@ -945,18 +952,19 @@ config = {
}
async def handler(input_data: dict, ctx: FlowContext):
async def handler(input_data: Dict[str, Any], ctx: FlowContext) -> None:
"""Main event handler for calendar sync."""
start_time = time.time()
kuerzel = input_data.get('kuerzel')
if not kuerzel:
log_operation('error', "No kuerzel provided in event", context=ctx)
return {'status': 400, 'body': {'error': 'No kuerzel provided'}}
return
log_operation('info', f"Starting calendar sync for employee {kuerzel}", context=ctx)
redis_client = get_redis_client(ctx)
service = None
try:
log_operation('debug', "Initializing Advoware service", context=ctx)
@@ -1047,11 +1055,24 @@ async def handler(input_data: dict, ctx: FlowContext):
log_operation('info', f"Handler duration: {time.time() - start_time}", context=ctx)
return {'status': 200, 'body': {'status': 'completed', 'kuerzel': kuerzel}}
except Exception as e:
log_operation('error', f"Sync failed for {kuerzel}: {e}", context=ctx)
log_operation('info', f"Handler duration (failed): {time.time() - start_time}", context=ctx)
return {'status': 500, 'body': {'error': str(e)}}
finally:
# Always close resources to prevent memory leaks
if service is not None:
try:
service.close()
except Exception as e:
log_operation('debug', f"Error closing Google service: {e}", context=ctx)
try:
redis_client.close()
except Exception as e:
log_operation('debug', f"Error closing Redis client: {e}", context=ctx)
# Ensure lock is always released
clear_employee_lock(redis_client, kuerzel, ctx)

View File

@@ -3,50 +3,44 @@ Calendar Sync Utilities
Shared utility functions for calendar synchronization between Google Calendar and Advoware.
"""
import logging
import asyncpg
import os
import redis
import time
from typing import Optional, Any, List
from googleapiclient.discovery import build
from google.oauth2 import service_account
# Configure logging
logger = logging.getLogger(__name__)
from services.logging_utils import get_service_logger
def log_operation(level: str, message: str, context=None, **context_vars):
"""Centralized logging with context, supporting file and console logging."""
context_str = ' '.join(f"{k}={v}" for k, v in context_vars.items() if v is not None)
full_message = f"{message} {context_str}".strip()
def get_logger(context=None):
"""Get logger for calendar sync operations"""
return get_service_logger('calendar_sync', context)
def log_operation(level: str, message: str, context=None, **extra):
"""
Log calendar sync operations with structured context.
# Use ctx.logger if context is available (Motia III FlowContext)
if context and hasattr(context, 'logger'):
if level == 'info':
context.logger.info(full_message)
elif level == 'warning':
context.logger.warning(full_message)
elif level == 'error':
context.logger.error(full_message)
elif level == 'debug':
context.logger.debug(full_message)
Args:
level: Log level ('debug', 'info', 'warning', 'error')
message: Log message
context: FlowContext if available
**extra: Additional key-value pairs to log
"""
logger = get_logger(context)
log_func = getattr(logger, level.lower(), logger.info)
if extra:
extra_str = " | " + " | ".join(f"{k}={v}" for k, v in extra.items())
log_func(message + extra_str)
else:
# Fallback to standard logger
if level == 'info':
logger.info(full_message)
elif level == 'warning':
logger.warning(full_message)
elif level == 'error':
logger.error(full_message)
elif level == 'debug':
logger.debug(full_message)
# Also log to console for journalctl visibility
print(f"[{level.upper()}] {full_message}")
log_func(message)
async def connect_db(context=None):
"""Connect to Postgres DB from environment variables."""
logger = get_logger(context)
try:
conn = await asyncpg.connect(
host=os.getenv('POSTGRES_HOST', 'localhost'),
@@ -57,12 +51,13 @@ async def connect_db(context=None):
)
return conn
except Exception as e:
log_operation('error', f"Failed to connect to DB: {e}", context=context)
logger.error(f"Failed to connect to DB: {e}")
raise
async def get_google_service(context=None):
"""Initialize Google Calendar service."""
logger = get_logger(context)
try:
service_account_path = os.getenv('GOOGLE_CALENDAR_SERVICE_ACCOUNT_PATH', 'service-account.json')
if not os.path.exists(service_account_path):
@@ -75,48 +70,53 @@ async def get_google_service(context=None):
service = build('calendar', 'v3', credentials=creds)
return service
except Exception as e:
log_operation('error', f"Failed to initialize Google service: {e}", context=context)
logger.error(f"Failed to initialize Google service: {e}")
raise
def get_redis_client(context=None):
def get_redis_client(context=None) -> redis.Redis:
"""Initialize Redis client for calendar sync operations."""
logger = get_logger(context)
try:
redis_client = redis.Redis(
host=os.getenv('REDIS_HOST', 'localhost'),
port=int(os.getenv('REDIS_PORT', '6379')),
db=int(os.getenv('REDIS_DB_CALENDAR_SYNC', '2')),
socket_timeout=int(os.getenv('REDIS_TIMEOUT_SECONDS', '5'))
socket_timeout=int(os.getenv('REDIS_TIMEOUT_SECONDS', '5')),
decode_responses=True
)
return redis_client
except Exception as e:
log_operation('error', f"Failed to initialize Redis client: {e}", context=context)
logger.error(f"Failed to initialize Redis client: {e}")
raise
async def get_advoware_employees(advoware, context=None):
async def get_advoware_employees(advoware, context=None) -> List[Any]:
"""Fetch list of employees from Advoware."""
logger = get_logger(context)
try:
result = await advoware.api_call('api/v1/advonet/Mitarbeiter', method='GET', params={'aktiv': 'true'})
employees = result if isinstance(result, list) else []
log_operation('info', f"Fetched {len(employees)} Advoware employees", context=context)
logger.info(f"Fetched {len(employees)} Advoware employees")
return employees
except Exception as e:
log_operation('error', f"Failed to fetch Advoware employees: {e}", context=context)
logger.error(f"Failed to fetch Advoware employees: {e}")
raise
def set_employee_lock(redis_client, kuerzel: str, triggered_by: str, context=None) -> bool:
def set_employee_lock(redis_client: redis.Redis, kuerzel: str, triggered_by: str, context=None) -> bool:
"""Set lock for employee sync operation."""
logger = get_logger(context)
employee_lock_key = f'calendar_sync_lock_{kuerzel}'
if redis_client.set(employee_lock_key, triggered_by, ex=1800, nx=True) is None:
log_operation('info', f"Sync already active for {kuerzel}, skipping", context=context)
logger.info(f"Sync already active for {kuerzel}, skipping")
return False
return True
def clear_employee_lock(redis_client, kuerzel: str, context=None):
def clear_employee_lock(redis_client: redis.Redis, kuerzel: str, context=None) -> None:
"""Clear lock for employee sync operation and update last-synced timestamp."""
logger = get_logger(context)
try:
employee_lock_key = f'calendar_sync_lock_{kuerzel}'
employee_last_synced_key = f'calendar_sync_last_synced_{kuerzel}'
@@ -128,6 +128,6 @@ def clear_employee_lock(redis_client, kuerzel: str, context=None):
# Delete the lock
redis_client.delete(employee_lock_key)
log_operation('debug', f"Cleared lock and updated last-synced for {kuerzel} to {current_time}", context=context)
logger.debug(f"Cleared lock and updated last-synced for {kuerzel} to {current_time}")
except Exception as e:
log_operation('warning', f"Failed to clear lock and update last-synced for {kuerzel}: {e}", context=context)
logger.warning(f"Failed to clear lock and update last-synced for {kuerzel}: {e}")

View File

@@ -32,23 +32,33 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
body={'error': 'Endpoint required as query parameter'}
)
ctx.logger.info("=" * 80)
ctx.logger.info("🔄 ADVOWARE PROXY: DELETE REQUEST")
ctx.logger.info("=" * 80)
ctx.logger.info(f"Endpoint: {endpoint}")
ctx.logger.info("=" * 80)
# Initialize Advoware client
advoware = AdvowareAPI(ctx)
# Forward all query params except 'endpoint'
params = {k: v for k, v in request.query_params.items() if k != 'endpoint'}
ctx.logger.info(f"Proxying DELETE request to Advoware: {endpoint}")
result = await advoware.api_call(
endpoint,
method='DELETE',
params=params
)
ctx.logger.info("✅ Proxy DELETE erfolgreich")
return ApiResponse(status=200, body={'result': result})
except Exception as e:
ctx.logger.error(f"Proxy error: {e}")
ctx.logger.error("=" * 80)
ctx.logger.error("❌ ADVOWARE PROXY DELETE FEHLER")
ctx.logger.error(f"Endpoint: {request.query_params.get('endpoint', 'N/A')}")
ctx.logger.error(f"Error: {e}")
ctx.logger.error("=" * 80)
return ApiResponse(
status=500,
body={'error': 'Internal server error', 'details': str(e)}

View File

@@ -32,23 +32,33 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
body={'error': 'Endpoint required as query parameter'}
)
ctx.logger.info("=" * 80)
ctx.logger.info("🔄 ADVOWARE PROXY: GET REQUEST")
ctx.logger.info("=" * 80)
ctx.logger.info(f"Endpoint: {endpoint}")
ctx.logger.info("=" * 80)
# Initialize Advoware client
advoware = AdvowareAPI(ctx)
# Forward all query params except 'endpoint'
params = {k: v for k, v in request.query_params.items() if k != 'endpoint'}
ctx.logger.info(f"Proxying GET request to Advoware: {endpoint}")
result = await advoware.api_call(
endpoint,
method='GET',
params=params
)
ctx.logger.info("✅ Proxy GET erfolgreich")
return ApiResponse(status=200, body={'result': result})
except Exception as e:
ctx.logger.error(f"Proxy error: {e}")
ctx.logger.error("=" * 80)
ctx.logger.error("❌ ADVOWARE PROXY GET FEHLER")
ctx.logger.error(f"Endpoint: {request.query_params.get('endpoint', 'N/A')}")
ctx.logger.error(f"Error: {e}")
ctx.logger.error("=" * 80)
return ApiResponse(
status=500,
body={'error': 'Internal server error', 'details': str(e)}

View File

@@ -34,6 +34,12 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
body={'error': 'Endpoint required as query parameter'}
)
ctx.logger.info("=" * 80)
ctx.logger.info("🔄 ADVOWARE PROXY: POST REQUEST")
ctx.logger.info("=" * 80)
ctx.logger.info(f"Endpoint: {endpoint}")
ctx.logger.info("=" * 80)
# Initialize Advoware client
advoware = AdvowareAPI(ctx)
@@ -43,7 +49,6 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
# Get request body
json_data = request.body
ctx.logger.info(f"Proxying POST request to Advoware: {endpoint}")
result = await advoware.api_call(
endpoint,
method='POST',
@@ -51,10 +56,15 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
json_data=json_data
)
ctx.logger.info("✅ Proxy POST erfolgreich")
return ApiResponse(status=200, body={'result': result})
except Exception as e:
ctx.logger.error(f"Proxy error: {e}")
ctx.logger.error("=" * 80)
ctx.logger.error("❌ ADVOWARE PROXY POST FEHLER")
ctx.logger.error(f"Endpoint: {request.query_params.get('endpoint', 'N/A')}")
ctx.logger.error(f"Error: {e}")
ctx.logger.error("=" * 80)
return ApiResponse(
status=500,
body={'error': 'Internal server error', 'details': str(e)}

View File

@@ -34,6 +34,12 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
body={'error': 'Endpoint required as query parameter'}
)
ctx.logger.info("=" * 80)
ctx.logger.info("🔄 ADVOWARE PROXY: PUT REQUEST")
ctx.logger.info("=" * 80)
ctx.logger.info(f"Endpoint: {endpoint}")
ctx.logger.info("=" * 80)
# Initialize Advoware client
advoware = AdvowareAPI(ctx)
@@ -43,7 +49,6 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
# Get request body
json_data = request.body
ctx.logger.info(f"Proxying PUT request to Advoware: {endpoint}")
result = await advoware.api_call(
endpoint,
method='PUT',
@@ -51,10 +56,15 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
json_data=json_data
)
ctx.logger.info("✅ Proxy PUT erfolgreich")
return ApiResponse(status=200, body={'result': result})
except Exception as e:
ctx.logger.error(f"Proxy error: {e}")
ctx.logger.error("=" * 80)
ctx.logger.error("❌ ADVOWARE PROXY PUT FEHLER")
ctx.logger.error(f"Endpoint: {request.query_params.get('endpoint', 'N/A')}")
ctx.logger.error(f"Error: {e}")
ctx.logger.error("=" * 80)
return ApiResponse(
status=500,
body={'error': 'Internal server error', 'details': str(e)}

View File

@@ -0,0 +1,90 @@
"""AI Knowledge Daily Sync - Cron Job"""
from typing import Any
from motia import FlowContext, cron
config = {
"name": "AI Knowledge Daily Sync",
"description": "Daily sync of all CAIKnowledge entities (catches missed webhooks, Blake3 verification included)",
"flows": ["aiknowledge-full-sync"],
"triggers": [
cron("0 0 2 * * *"), # Daily at 2:00 AM
],
"enqueues": ["aiknowledge.sync"],
}
async def handler(input_data: None, ctx: FlowContext[Any]) -> None:
"""
Daily sync handler - ensures all active knowledge bases are synchronized.
Loads all CAIKnowledge entities that need sync and emits events.
Blake3 hash verification is always performed (hash available from JunctionData API).
Runs every day at 02:00:00.
"""
from services.espocrm import EspoCRMAPI
from services.models import AIKnowledgeActivationStatus, AIKnowledgeSyncStatus
ctx.logger.info("=" * 80)
ctx.logger.info("🌙 DAILY AI KNOWLEDGE SYNC STARTED")
ctx.logger.info("=" * 80)
espocrm = EspoCRMAPI(ctx)
try:
# Load all CAIKnowledge entities with status 'active' that need sync
result = await espocrm.list_entities(
'CAIKnowledge',
where=[
{
'type': 'equals',
'attribute': 'aktivierungsstatus',
'value': AIKnowledgeActivationStatus.ACTIVE.value
},
{
'type': 'in',
'attribute': 'syncStatus',
'value': [
AIKnowledgeSyncStatus.UNCLEAN.value,
AIKnowledgeSyncStatus.FAILED.value
]
}
],
select='id,name,syncStatus',
max_size=1000 # Adjust if you have more
)
entities = result.get('list', [])
total = len(entities)
ctx.logger.info(f"📊 Found {total} knowledge bases needing sync")
if total == 0:
ctx.logger.info("✅ All knowledge bases are synced")
ctx.logger.info("=" * 80)
return
# Enqueue sync events for all (Blake3 verification always enabled)
for i, entity in enumerate(entities, 1):
await ctx.enqueue({
'topic': 'aiknowledge.sync',
'data': {
'knowledge_id': entity['id'],
'source': 'daily_cron'
}
})
ctx.logger.info(
f"📤 [{i}/{total}] Enqueued: {entity['name']} "
f"(syncStatus={entity.get('syncStatus')})"
)
ctx.logger.info("=" * 80)
ctx.logger.info(f"✅ Daily sync complete: {total} events enqueued")
ctx.logger.info("=" * 80)
except Exception as e:
ctx.logger.error("=" * 80)
ctx.logger.error("❌ FULL SYNC FAILED")
ctx.logger.error("=" * 80)
ctx.logger.error(f"Error: {e}", exc_info=True)
raise

View File

@@ -0,0 +1,89 @@
"""AI Knowledge Sync Event Handler"""
from typing import Dict, Any
from redis import Redis
from motia import FlowContext, queue
config = {
"name": "AI Knowledge Sync",
"description": "Synchronizes CAIKnowledge entities with XAI Collections",
"flows": ["vmh-aiknowledge"],
"triggers": [
queue("aiknowledge.sync")
],
}
async def handler(event_data: Dict[str, Any], ctx: FlowContext[Any]) -> None:
"""
Event handler for AI Knowledge synchronization.
Emitted by:
- Webhook on CAIKnowledge update
- Daily full sync cron job
Args:
event_data: Event payload with knowledge_id
ctx: Motia context
"""
from services.redis_client import RedisClientFactory
from services.aiknowledge_sync_utils import AIKnowledgeSync
ctx.logger.info("=" * 80)
ctx.logger.info("🔄 AI KNOWLEDGE SYNC STARTED")
ctx.logger.info("=" * 80)
# Extract data
knowledge_id = event_data.get('knowledge_id')
source = event_data.get('source', 'unknown')
if not knowledge_id:
ctx.logger.error("❌ Missing knowledge_id in event data")
return
ctx.logger.info(f"📋 Knowledge ID: {knowledge_id}")
ctx.logger.info(f"📋 Source: {source}")
ctx.logger.info("=" * 80)
# Get Redis for locking
redis_client = RedisClientFactory.get_client(strict=False)
# Initialize sync utils
sync_utils = AIKnowledgeSync(ctx, redis_client)
# Acquire lock
lock_acquired = await sync_utils.acquire_sync_lock(knowledge_id)
if not lock_acquired:
ctx.logger.warn(f"⏸️ Lock already held for {knowledge_id}, skipping")
ctx.logger.info(" (Will be retried by Motia queue)")
raise RuntimeError(f"Lock busy for {knowledge_id}") # Motia will retry
try:
# Perform sync (Blake3 hash verification always enabled)
await sync_utils.sync_knowledge_to_xai(knowledge_id, ctx)
ctx.logger.info("=" * 80)
ctx.logger.info("✅ AI KNOWLEDGE SYNC COMPLETED")
ctx.logger.info("=" * 80)
# Release lock with success=True
await sync_utils.release_sync_lock(knowledge_id, success=True)
except Exception as e:
ctx.logger.error("=" * 80)
ctx.logger.error("❌ AI KNOWLEDGE SYNC FAILED")
ctx.logger.error("=" * 80)
ctx.logger.error(f"Error: {e}")
ctx.logger.error(f"Knowledge ID: {knowledge_id}")
ctx.logger.error("=" * 80)
# Release lock with failure
await sync_utils.release_sync_lock(
knowledge_id,
success=False,
error_message=str(e)
)
# Re-raise to let Motia retry
raise

View File

@@ -11,30 +11,29 @@ Verarbeitet:
"""
from typing import Dict, Any, Optional
from motia import FlowContext
from motia import FlowContext, queue
from services.advoware import AdvowareAPI
from services.espocrm import EspoCRMAPI
from services.bankverbindungen_mapper import BankverbindungenMapper
from services.notification_utils import NotificationManager
from services.redis_client import get_redis_client
import json
import redis
import os
config = {
"name": "VMH Bankverbindungen Sync Handler",
"description": "Zentraler Sync-Handler für Bankverbindungen (Webhooks + Cron Events)",
"flows": ["vmh-bankverbindungen"],
"triggers": [
{"type": "queue", "topic": "vmh.bankverbindungen.create"},
{"type": "queue", "topic": "vmh.bankverbindungen.update"},
{"type": "queue", "topic": "vmh.bankverbindungen.delete"},
{"type": "queue", "topic": "vmh.bankverbindungen.sync_check"}
queue("vmh.bankverbindungen.create"),
queue("vmh.bankverbindungen.update"),
queue("vmh.bankverbindungen.delete"),
queue("vmh.bankverbindungen.sync_check")
],
"enqueues": []
}
async def handler(event_data: Dict[str, Any], ctx: FlowContext[Any]):
async def handler(event_data: Dict[str, Any], ctx: FlowContext[Any]) -> None:
"""Zentraler Sync-Handler für Bankverbindungen"""
entity_id = event_data.get('entity_id')
@@ -47,20 +46,11 @@ async def handler(event_data: Dict[str, Any], ctx: FlowContext[Any]):
ctx.logger.info(f"🔄 Bankverbindungen Sync gestartet: {action.upper()} | Entity: {entity_id} | Source: {source}")
# Shared Redis client
redis_host = os.getenv('REDIS_HOST', 'localhost')
redis_port = int(os.getenv('REDIS_PORT', '6379'))
redis_db = int(os.getenv('REDIS_DB_ADVOWARE_CACHE', '1'))
# Shared Redis client (centralized factory)
redis_client = get_redis_client(strict=False)
redis_client = redis.Redis(
host=redis_host,
port=redis_port,
db=redis_db,
decode_responses=True
)
# APIs initialisieren
espocrm = EspoCRMAPI()
# APIs initialisieren (mit Context für besseres Logging)
espocrm = EspoCRMAPI(ctx)
advoware = AdvowareAPI(ctx)
mapper = BankverbindungenMapper()
notification_mgr = NotificationManager(espocrm_api=espocrm, context=ctx)
@@ -130,7 +120,7 @@ async def handler(event_data: Dict[str, Any], ctx: FlowContext[Any]):
pass
async def handle_create(entity_id, betnr, espo_entity, espocrm, advoware, mapper, ctx, redis_client, lock_key):
async def handle_create(entity_id, betnr, espo_entity, espocrm, advoware, mapper, ctx, redis_client, lock_key) -> None:
"""Erstellt neue Bankverbindung in Advoware"""
try:
ctx.logger.info(f"🔨 CREATE Bankverbindung in Advoware für Beteiligter {betnr}...")
@@ -176,7 +166,7 @@ async def handle_create(entity_id, betnr, espo_entity, espocrm, advoware, mapper
redis_client.delete(lock_key)
async def handle_update(entity_id, betnr, advoware_id, espo_entity, espocrm, notification_mgr, ctx, redis_client, lock_key):
async def handle_update(entity_id, betnr, advoware_id, espo_entity, espocrm, notification_mgr, ctx, redis_client, lock_key) -> None:
"""Update nicht möglich - Sendet Notification an User"""
try:
ctx.logger.warn(f"⚠️ UPDATE: Advoware API unterstützt kein PUT für Bankverbindungen")
@@ -219,7 +209,7 @@ async def handle_update(entity_id, betnr, advoware_id, espo_entity, espocrm, not
redis_client.delete(lock_key)
async def handle_delete(entity_id, betnr, advoware_id, espo_entity, espocrm, notification_mgr, ctx, redis_client, lock_key):
async def handle_delete(entity_id, betnr, advoware_id, espo_entity, espocrm, notification_mgr, ctx, redis_client, lock_key) -> None:
"""Delete nicht möglich - Sendet Notification an User"""
try:
ctx.logger.warn(f"⚠️ DELETE: Advoware API unterstützt kein DELETE für Bankverbindungen")

View File

@@ -25,14 +25,14 @@ config = {
}
async def handler(input_data: Dict[str, Any], ctx: FlowContext):
async def handler(input_data: Dict[str, Any], ctx: FlowContext) -> None:
"""
Cron-Handler: Findet alle Beteiligte die Sync benötigen und emittiert Events
"""
ctx.logger.info("🕐 Beteiligte Sync Cron gestartet")
try:
espocrm = EspoCRMAPI()
espocrm = EspoCRMAPI(ctx)
# Berechne Threshold für "veraltete" Syncs (24 Stunden)
threshold = datetime.datetime.now() - datetime.timedelta(hours=24)

View File

@@ -11,7 +11,7 @@ Verarbeitet:
"""
from typing import Dict, Any, Optional
from motia import FlowContext
from motia import FlowContext, queue
from services.advoware import AdvowareAPI
from services.advoware_service import AdvowareService
from services.espocrm import EspoCRMAPI
@@ -33,25 +33,22 @@ config = {
"description": "Zentraler Sync-Handler für Beteiligte (Webhooks + Cron Events)",
"flows": ["vmh-beteiligte"],
"triggers": [
{"type": "queue", "topic": "vmh.beteiligte.create"},
{"type": "queue", "topic": "vmh.beteiligte.update"},
{"type": "queue", "topic": "vmh.beteiligte.delete"},
{"type": "queue", "topic": "vmh.beteiligte.sync_check"}
queue("vmh.beteiligte.create"),
queue("vmh.beteiligte.update"),
queue("vmh.beteiligte.delete"),
queue("vmh.beteiligte.sync_check")
],
"enqueues": []
}
async def handler(event_data: Dict[str, Any], ctx: FlowContext[Any]) -> Optional[Dict[str, Any]]:
async def handler(event_data: Dict[str, Any], ctx: FlowContext[Any]) -> None:
"""
Zentraler Sync-Handler für Beteiligte
Args:
event_data: Event data mit entity_id, action, source
ctx: Motia FlowContext
Returns:
Optional result dict
"""
entity_id = event_data.get('entity_id')
action = event_data.get('action')
@@ -61,11 +58,13 @@ async def handler(event_data: Dict[str, Any], ctx: FlowContext[Any]) -> Optional
if not entity_id:
step_logger.error("Keine entity_id im Event gefunden")
return None
return
step_logger.info(
f"🔄 Sync-Handler gestartet: {action.upper()} | Entity: {entity_id} | Source: {source}"
)
step_logger.info("=" * 80)
step_logger.info(f"🔄 BETEILIGTE SYNC HANDLER: {action.upper()}")
step_logger.info("=" * 80)
step_logger.info(f"Entity: {entity_id} | Source: {source}")
step_logger.info("=" * 80)
# Get shared Redis client (centralized)
redis_client = get_redis_client(strict=False)
@@ -175,7 +174,7 @@ async def handler(event_data: Dict[str, Any], ctx: FlowContext[Any]) -> Optional
ctx.logger.error(traceback.format_exc())
async def handle_create(entity_id, espo_entity, espocrm, advoware, sync_utils, mapper, ctx):
async def handle_create(entity_id, espo_entity, espocrm, advoware, sync_utils, mapper, ctx) -> None:
"""Erstellt neuen Beteiligten in Advoware"""
try:
ctx.logger.info(f"🔨 CREATE in Advoware...")
@@ -234,7 +233,7 @@ async def handle_create(entity_id, espo_entity, espocrm, advoware, sync_utils, m
await sync_utils.release_sync_lock(entity_id, 'failed', str(e), increment_retry=True)
async def handle_update(entity_id, betnr, espo_entity, espocrm, advoware, sync_utils, mapper, ctx):
async def handle_update(entity_id, betnr, espo_entity, espocrm, advoware, sync_utils, mapper, ctx) -> None:
"""Synchronisiert existierenden Beteiligten"""
try:
ctx.logger.info(f"🔍 Fetch von Advoware betNr={betnr}...")

View File

@@ -10,29 +10,28 @@ Verarbeitet:
"""
from typing import Dict, Any
from motia import FlowContext
from motia import FlowContext, queue
from services.espocrm import EspoCRMAPI
from services.document_sync_utils import DocumentSync
from services.xai_service import XAIService
from services.redis_client import get_redis_client
import hashlib
import json
import redis
import os
config = {
"name": "VMH Document Sync Handler",
"description": "Zentraler Sync-Handler für Documents mit xAI Collections",
"flows": ["vmh-documents"],
"triggers": [
{"type": "queue", "topic": "vmh.document.create"},
{"type": "queue", "topic": "vmh.document.update"},
{"type": "queue", "topic": "vmh.document.delete"}
queue("vmh.document.create"),
queue("vmh.document.update"),
queue("vmh.document.delete")
],
"enqueues": []
}
async def handler(event_data: Dict[str, Any], ctx: FlowContext[Any]):
async def handler(event_data: Dict[str, Any], ctx: FlowContext[Any]) -> None:
"""Zentraler Sync-Handler für Documents"""
entity_id = event_data.get('entity_id')
entity_type = event_data.get('entity_type', 'CDokumente') # Default: CDokumente
@@ -52,20 +51,11 @@ async def handler(event_data: Dict[str, Any], ctx: FlowContext[Any]):
ctx.logger.info(f"Source: {source}")
ctx.logger.info("=" * 80)
# Shared Redis client for distributed locking
redis_host = os.getenv('REDIS_HOST', 'localhost')
redis_port = int(os.getenv('REDIS_PORT', '6379'))
redis_db = int(os.getenv('REDIS_DB_ADVOWARE_CACHE', '1'))
# Shared Redis client for distributed locking (centralized factory)
redis_client = get_redis_client(strict=False)
redis_client = redis.Redis(
host=redis_host,
port=redis_port,
db=redis_db,
decode_responses=True
)
# APIs initialisieren
espocrm = EspoCRMAPI()
# APIs initialisieren (mit Context für besseres Logging)
espocrm = EspoCRMAPI(ctx)
sync_utils = DocumentSync(espocrm, redis_client, ctx)
xai_service = XAIService(ctx)
@@ -137,7 +127,7 @@ async def handler(event_data: Dict[str, Any], ctx: FlowContext[Any]):
ctx.logger.error(traceback.format_exc())
async def handle_create_or_update(entity_id: str, document: Dict[str, Any], sync_utils: DocumentSync, xai_service: XAIService, ctx: FlowContext[Any], entity_type: str = 'CDokumente'):
async def handle_create_or_update(entity_id: str, document: Dict[str, Any], sync_utils: DocumentSync, xai_service: XAIService, ctx: FlowContext[Any], entity_type: str = 'CDokumente') -> None:
"""
Behandelt Create/Update von Documents
@@ -162,6 +152,42 @@ async def handle_create_or_update(entity_id: str, document: Dict[str, Any], sync
if collection_ids:
ctx.logger.info(f" Collections: {collection_ids}")
# ═══════════════════════════════════════════════════════════════
# CHECK: Knowledge Bases mit Status "new" (noch keine Collection)
# ═══════════════════════════════════════════════════════════════
new_knowledge_bases = [cid for cid in collection_ids if cid.startswith('NEW:')]
if new_knowledge_bases:
ctx.logger.info("")
ctx.logger.info("=" * 80)
ctx.logger.info("🆕 DOKUMENT IST MIT KNOWLEDGE BASE(S) VERKNÜPFT (Status: new)")
ctx.logger.info("=" * 80)
for new_kb in new_knowledge_bases:
kb_id = new_kb[4:] # Remove "NEW:" prefix
ctx.logger.info(f"📋 CAIKnowledge {kb_id}")
ctx.logger.info(f" Status: new → Collection muss zuerst erstellt werden")
# Trigger Knowledge Sync
ctx.logger.info(f"📤 Triggering aiknowledge.sync event...")
await ctx.emit('aiknowledge.sync', {
'entity_id': kb_id,
'entity_type': 'CAIKnowledge',
'triggered_by': 'document_sync',
'document_id': entity_id
})
ctx.logger.info(f"✅ Event emitted for {kb_id}")
# Release lock and skip document sync - knowledge sync will handle documents
ctx.logger.info("")
ctx.logger.info("=" * 80)
ctx.logger.info("✅ KNOWLEDGE SYNC GETRIGGERT")
ctx.logger.info(" Document Sync wird übersprungen")
ctx.logger.info(" (Knowledge Sync erstellt Collection und synchronisiert dann Dokumente)")
ctx.logger.info("=" * 80)
await sync_utils.release_sync_lock(entity_id, success=True, entity_type=entity_type)
return
# ═══════════════════════════════════════════════════════════════
# PREVIEW-GENERIERUNG bei neuen/geänderten Dateien
# ═══════════════════════════════════════════════════════════════
@@ -326,7 +352,7 @@ async def handle_create_or_update(entity_id: str, document: Dict[str, Any], sync
await sync_utils.release_sync_lock(entity_id, success=False, error_message=str(e))
async def handle_delete(entity_id: str, document: Dict[str, Any], sync_utils: DocumentSync, xai_service: XAIService, ctx: FlowContext[Any], entity_type: str = 'CDokumente'):
async def handle_delete(entity_id: str, document: Dict[str, Any], sync_utils: DocumentSync, xai_service: XAIService, ctx: FlowContext[Any], entity_type: str = 'CDokumente') -> None:
"""
Behandelt Delete von Documents

View File

@@ -0,0 +1,91 @@
"""VMH Webhook - AI Knowledge Update"""
from typing import Any
from motia import FlowContext, http, ApiRequest, ApiResponse
config = {
"name": "VMH Webhook AI Knowledge Update",
"description": "Receives update webhooks from EspoCRM for CAIKnowledge entities",
"flows": ["vmh-aiknowledge"],
"triggers": [
http("POST", "/vmh/webhook/aiknowledge/update")
],
"enqueues": ["aiknowledge.sync"],
}
async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
"""
Webhook handler for CAIKnowledge updates in EspoCRM.
Triggered when:
- activationStatus changes
- syncStatus changes (e.g., set to 'unclean')
- Documents linked/unlinked
"""
try:
ctx.logger.info("=" * 80)
ctx.logger.info("🔔 AI Knowledge Update Webhook")
ctx.logger.info("=" * 80)
# Extract payload
payload = request.body
# Handle case where payload is a list (e.g., from array-based webhook)
if isinstance(payload, list):
if not payload:
ctx.logger.error("❌ Empty payload list")
return ApiResponse(
status=400,
body={'success': False, 'error': 'Empty payload'}
)
payload = payload[0] # Take first item
# Ensure payload is a dict
if not isinstance(payload, dict):
ctx.logger.error(f"❌ Invalid payload type: {type(payload)}")
return ApiResponse(
status=400,
body={'success': False, 'error': f'Invalid payload type: {type(payload).__name__}'}
)
# Validate required fields
knowledge_id = payload.get('entity_id') or payload.get('id')
entity_type = payload.get('entity_type', 'CAIKnowledge')
action = payload.get('action', 'update')
if not knowledge_id:
ctx.logger.error("❌ Missing entity_id in payload")
return ApiResponse(
status=400,
body={'success': False, 'error': 'Missing entity_id'}
)
ctx.logger.info(f"📋 Entity Type: {entity_type}")
ctx.logger.info(f"📋 Entity ID: {knowledge_id}")
ctx.logger.info(f"📋 Action: {action}")
# Enqueue sync event
await ctx.enqueue({
'topic': 'aiknowledge.sync',
'data': {
'knowledge_id': knowledge_id,
'source': 'webhook',
'action': action
}
})
ctx.logger.info(f"✅ Sync event enqueued for {knowledge_id}")
ctx.logger.info("=" * 80)
return ApiResponse(
status=200,
body={'success': True, 'knowledge_id': knowledge_id}
)
except Exception as e:
ctx.logger.error(f"❌ Webhook error: {e}")
return ApiResponse(
status=500,
body={'success': False, 'error': str(e)}
)

View File

@@ -7,7 +7,7 @@ from motia import FlowContext, http, ApiRequest, ApiResponse
config = {
"name": "VMH Webhook Bankverbindungen Create",
"description": "Empfängt Create-Webhooks von EspoCRM für Bankverbindungen",
"description": "Receives create webhooks from EspoCRM for Bankverbindungen",
"flows": ["vmh-bankverbindungen"],
"triggers": [
http("POST", "/vmh/webhook/bankverbindungen/create")
@@ -23,10 +23,13 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
try:
payload = request.body or []
ctx.logger.info("VMH Webhook Bankverbindungen Create empfangen")
ctx.logger.info("=" * 80)
ctx.logger.info("📥 VMH WEBHOOK: BANKVERBINDUNGEN CREATE")
ctx.logger.info("=" * 80)
ctx.logger.info(f"Payload: {json.dumps(payload, indent=2, ensure_ascii=False)}")
ctx.logger.info("=" * 80)
# Sammle alle IDs aus dem Batch
# Collect all IDs from batch
entity_ids = set()
if isinstance(payload, list):
@@ -36,7 +39,7 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
elif isinstance(payload, dict) and 'id' in payload:
entity_ids.add(payload['id'])
ctx.logger.info(f"{len(entity_ids)} IDs zum Create-Sync gefunden")
ctx.logger.info(f"{len(entity_ids)} IDs found for create sync")
# Emit events
for entity_id in entity_ids:
@@ -50,7 +53,8 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
}
})
ctx.logger.info(f"VMH Create Webhook verarbeitet: {len(entity_ids)} Events emittiert")
ctx.logger.info("VMH Create Webhook processed: "
f"{len(entity_ids)} events emitted")
return ApiResponse(
status=200,
@@ -62,7 +66,10 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
)
except Exception as e:
ctx.logger.error(f"Fehler beim Verarbeiten des VMH Create Webhooks: {e}")
ctx.logger.error("=" * 80)
ctx.logger.error("❌ ERROR: BANKVERBINDUNGEN CREATE WEBHOOK")
ctx.logger.error(f"Error: {e}")
ctx.logger.error("=" * 80)
return ApiResponse(
status=500,
body={'error': 'Internal server error', 'details': str(e)}

View File

@@ -7,7 +7,7 @@ from motia import FlowContext, http, ApiRequest, ApiResponse
config = {
"name": "VMH Webhook Bankverbindungen Delete",
"description": "Empfängt Delete-Webhooks von EspoCRM für Bankverbindungen",
"description": "Receives delete webhooks from EspoCRM for Bankverbindungen",
"flows": ["vmh-bankverbindungen"],
"triggers": [
http("POST", "/vmh/webhook/bankverbindungen/delete")
@@ -23,10 +23,13 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
try:
payload = request.body or []
ctx.logger.info("VMH Webhook Bankverbindungen Delete empfangen")
ctx.logger.info("=" * 80)
ctx.logger.info("📥 VMH WEBHOOK: BANKVERBINDUNGEN DELETE")
ctx.logger.info("=" * 80)
ctx.logger.info(f"Payload: {json.dumps(payload, indent=2, ensure_ascii=False)}")
ctx.logger.info("=" * 80)
# Sammle alle IDs
# Collect all IDs
entity_ids = set()
if isinstance(payload, list):
@@ -36,7 +39,7 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
elif isinstance(payload, dict) and 'id' in payload:
entity_ids.add(payload['id'])
ctx.logger.info(f"{len(entity_ids)} IDs zum Delete-Sync gefunden")
ctx.logger.info(f"{len(entity_ids)} IDs found for delete sync")
# Emit events
for entity_id in entity_ids:
@@ -50,7 +53,8 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
}
})
ctx.logger.info(f"VMH Delete Webhook verarbeitet: {len(entity_ids)} Events emittiert")
ctx.logger.info("VMH Delete Webhook processed: "
f"{len(entity_ids)} events emitted")
return ApiResponse(
status=200,
@@ -62,7 +66,10 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
)
except Exception as e:
ctx.logger.error(f"Fehler beim Verarbeiten des VMH Delete Webhooks: {e}")
ctx.logger.error("=" * 80)
ctx.logger.error("❌ ERROR: BANKVERBINDUNGEN DELETE WEBHOOK")
ctx.logger.error(f"Error: {e}")
ctx.logger.error("=" * 80)
return ApiResponse(
status=500,
body={'error': 'Internal server error', 'details': str(e)}

View File

@@ -7,7 +7,7 @@ from motia import FlowContext, http, ApiRequest, ApiResponse
config = {
"name": "VMH Webhook Bankverbindungen Update",
"description": "Empfängt Update-Webhooks von EspoCRM für Bankverbindungen",
"description": "Receives update webhooks from EspoCRM for Bankverbindungen",
"flows": ["vmh-bankverbindungen"],
"triggers": [
http("POST", "/vmh/webhook/bankverbindungen/update")
@@ -23,10 +23,13 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
try:
payload = request.body or []
ctx.logger.info("VMH Webhook Bankverbindungen Update empfangen")
ctx.logger.info("=" * 80)
ctx.logger.info("📥 VMH WEBHOOK: BANKVERBINDUNGEN UPDATE")
ctx.logger.info("=" * 80)
ctx.logger.info(f"Payload: {json.dumps(payload, indent=2, ensure_ascii=False)}")
ctx.logger.info("=" * 80)
# Sammle alle IDs
# Collect all IDs
entity_ids = set()
if isinstance(payload, list):
@@ -36,7 +39,7 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
elif isinstance(payload, dict) and 'id' in payload:
entity_ids.add(payload['id'])
ctx.logger.info(f"{len(entity_ids)} IDs zum Update-Sync gefunden")
ctx.logger.info(f"{len(entity_ids)} IDs found for update sync")
# Emit events
for entity_id in entity_ids:
@@ -50,7 +53,8 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
}
})
ctx.logger.info(f"VMH Update Webhook verarbeitet: {len(entity_ids)} Events emittiert")
ctx.logger.info("VMH Update Webhook processed: "
f"{len(entity_ids)} events emitted")
return ApiResponse(
status=200,
@@ -62,7 +66,10 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
)
except Exception as e:
ctx.logger.error(f"Fehler beim Verarbeiten des VMH Update Webhooks: {e}")
ctx.logger.error("=" * 80)
ctx.logger.error("❌ ERROR: BANKVERBINDUNGEN UPDATE WEBHOOK")
ctx.logger.error(f"Error: {e}")
ctx.logger.error("=" * 80)
return ApiResponse(
status=500,
body={'error': 'Internal server error', 'details': str(e)}

View File

@@ -7,7 +7,7 @@ from motia import FlowContext, http, ApiRequest, ApiResponse
config = {
"name": "VMH Webhook Beteiligte Create",
"description": "Empfängt Create-Webhooks von EspoCRM für Beteiligte",
"description": "Receives create webhooks from EspoCRM for Beteiligte",
"flows": ["vmh-beteiligte"],
"triggers": [
http("POST", "/vmh/webhook/beteiligte/create")
@@ -26,10 +26,13 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
try:
payload = request.body or []
ctx.logger.info("VMH Webhook Beteiligte Create empfangen")
ctx.logger.info("=" * 80)
ctx.logger.info("📥 VMH WEBHOOK: BETEILIGTE CREATE")
ctx.logger.info("=" * 80)
ctx.logger.info(f"Payload: {json.dumps(payload, indent=2, ensure_ascii=False)}")
ctx.logger.info("=" * 80)
# Sammle alle IDs aus dem Batch
# Collect all IDs from batch
entity_ids = set()
if isinstance(payload, list):
@@ -39,9 +42,9 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
elif isinstance(payload, dict) and 'id' in payload:
entity_ids.add(payload['id'])
ctx.logger.info(f"{len(entity_ids)} IDs zum Create-Sync gefunden")
ctx.logger.info(f"{len(entity_ids)} IDs found for create sync")
# Emit events für Queue-Processing (Deduplizierung erfolgt im Event-Handler via Lock)
# Emit events for queue processing (deduplication via lock in event handler)
for entity_id in entity_ids:
await ctx.enqueue({
'topic': 'vmh.beteiligte.create',
@@ -53,7 +56,8 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
}
})
ctx.logger.info(f"VMH Create Webhook verarbeitet: {len(entity_ids)} Events emittiert")
ctx.logger.info("VMH Create Webhook processed: "
f"{len(entity_ids)} events emitted")
return ApiResponse(
status=200,
@@ -65,7 +69,14 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
)
except Exception as e:
ctx.logger.error(f"Fehler beim Verarbeiten des VMH Create Webhooks: {e}")
ctx.logger.error("=" * 80)
ctx.logger.error("❌ ERROR: VMH CREATE WEBHOOK")
ctx.logger.error("=" * 80)
ctx.logger.error(f"Error: {e}")
ctx.logger.error(f"Entity IDs attempted: {list(entity_ids) if 'entity_ids' in locals() else 'N/A'}")
ctx.logger.error(f"Full Payload: {json.dumps(request.body, indent=2, ensure_ascii=False)}")
ctx.logger.error(f"Timestamp: {datetime.datetime.now().isoformat()}")
ctx.logger.error("=" * 80)
return ApiResponse(
status=500,
body={

View File

@@ -7,7 +7,7 @@ from motia import FlowContext, http, ApiRequest, ApiResponse
config = {
"name": "VMH Webhook Beteiligte Delete",
"description": "Empfängt Delete-Webhooks von EspoCRM für Beteiligte",
"description": "Receives delete webhooks from EspoCRM for Beteiligte",
"flows": ["vmh-beteiligte"],
"triggers": [
http("POST", "/vmh/webhook/beteiligte/delete")
@@ -23,10 +23,13 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
try:
payload = request.body or []
ctx.logger.info("VMH Webhook Beteiligte Delete empfangen")
ctx.logger.info("=" * 80)
ctx.logger.info("📥 VMH WEBHOOK: BETEILIGTE DELETE")
ctx.logger.info("=" * 80)
ctx.logger.info(f"Payload: {json.dumps(payload, indent=2, ensure_ascii=False)}")
ctx.logger.info("=" * 80)
# Sammle alle IDs aus dem Batch
# Collect all IDs from batch
entity_ids = set()
if isinstance(payload, list):
@@ -36,9 +39,9 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
elif isinstance(payload, dict) and 'id' in payload:
entity_ids.add(payload['id'])
ctx.logger.info(f"{len(entity_ids)} IDs zum Delete-Sync gefunden")
ctx.logger.info(f"{len(entity_ids)} IDs found for delete sync")
# Emit events für Queue-Processing
# Emit events for queue processing
for entity_id in entity_ids:
await ctx.enqueue({
'topic': 'vmh.beteiligte.delete',
@@ -50,7 +53,8 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
}
})
ctx.logger.info(f"VMH Delete Webhook verarbeitet: {len(entity_ids)} Events emittiert")
ctx.logger.info("VMH Delete Webhook processed: "
f"{len(entity_ids)} events emitted")
return ApiResponse(
status=200,
@@ -62,7 +66,10 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
)
except Exception as e:
ctx.logger.error(f"Fehler beim Delete-Webhook: {e}")
ctx.logger.error("=" * 80)
ctx.logger.error("❌ ERROR: BETEILIGTE DELETE WEBHOOK")
ctx.logger.error(f"Error: {e}")
ctx.logger.error("=" * 80)
return ApiResponse(
status=500,
body={'error': 'Internal server error', 'details': str(e)}

View File

@@ -7,7 +7,7 @@ from motia import FlowContext, http, ApiRequest, ApiResponse
config = {
"name": "VMH Webhook Beteiligte Update",
"description": "Empfängt Update-Webhooks von EspoCRM für Beteiligte",
"description": "Receives update webhooks from EspoCRM for Beteiligte",
"flows": ["vmh-beteiligte"],
"triggers": [
http("POST", "/vmh/webhook/beteiligte/update")
@@ -20,16 +20,19 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
"""
Webhook handler for Beteiligte updates in EspoCRM.
Note: Loop-Prevention ist auf EspoCRM-Seite implementiert.
rowId-Updates triggern keine Webhooks mehr, daher keine Filterung nötig.
Note: Loop prevention is implemented on EspoCRM side.
rowId updates no longer trigger webhooks, so no filtering needed.
"""
try:
payload = request.body or []
ctx.logger.info("VMH Webhook Beteiligte Update empfangen")
ctx.logger.info("=" * 80)
ctx.logger.info("📥 VMH WEBHOOK: BETEILIGTE UPDATE")
ctx.logger.info("=" * 80)
ctx.logger.info(f"Payload: {json.dumps(payload, indent=2, ensure_ascii=False)}")
ctx.logger.info("=" * 80)
# Sammle alle IDs aus dem Batch
# Collect all IDs from batch
entity_ids = set()
if isinstance(payload, list):
@@ -39,9 +42,9 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
elif isinstance(payload, dict) and 'id' in payload:
entity_ids.add(payload['id'])
ctx.logger.info(f"{len(entity_ids)} IDs zum Update-Sync gefunden")
ctx.logger.info(f"{len(entity_ids)} IDs found for update sync")
# Emit events für Queue-Processing
# Emit events for queue processing
for entity_id in entity_ids:
await ctx.enqueue({
'topic': 'vmh.beteiligte.update',
@@ -53,7 +56,8 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
}
})
ctx.logger.info(f"VMH Update Webhook verarbeitet: {len(entity_ids)} Events emittiert")
ctx.logger.info("VMH Update Webhook processed: "
f"{len(entity_ids)} events emitted")
return ApiResponse(
status=200,
@@ -65,7 +69,14 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
)
except Exception as e:
ctx.logger.error(f"Fehler beim Verarbeiten des VMH Update Webhooks: {e}")
ctx.logger.error("=" * 80)
ctx.logger.error("❌ ERROR: VMH UPDATE WEBHOOK")
ctx.logger.error("=" * 80)
ctx.logger.error(f"Error: {e}")
ctx.logger.error(f"Entity IDs attempted: {list(entity_ids) if 'entity_ids' in locals() else 'N/A'}")
ctx.logger.error(f"Full Payload: {json.dumps(request.body, indent=2, ensure_ascii=False)}")
ctx.logger.error(f"Timestamp: {datetime.datetime.now().isoformat()}")
ctx.logger.error("=" * 80)
return ApiResponse(
status=500,
body={

View File

@@ -1,5 +1,6 @@
"""VMH Webhook - Document Create"""
import json
import datetime
from typing import Any
from motia import FlowContext, http, ApiRequest, ApiResponse
@@ -25,48 +26,61 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
try:
payload = request.body or []
ctx.logger.info("VMH Webhook Document Create empfangen")
ctx.logger.info("=" * 80)
ctx.logger.info("📥 VMH WEBHOOK: DOCUMENT CREATE")
ctx.logger.info("=" * 80)
ctx.logger.debug(f"Payload: {json.dumps(payload, indent=2, ensure_ascii=False)}")
# Sammle alle IDs aus dem Batch
# Collect all IDs from batch
entity_ids = set()
entity_type = 'CDokumente' # Default
if isinstance(payload, list):
for entity in payload:
if isinstance(entity, dict) and 'id' in entity:
entity_ids.add(entity['id'])
# Extrahiere entityType falls vorhanden
entity_type = entity.get('entityType', 'CDokumente')
# Take entityType from first entity if present
if entity_type == 'CDokumente':
entity_type = entity.get('entityType', 'CDokumente')
elif isinstance(payload, dict) and 'id' in payload:
entity_ids.add(payload['id'])
entity_type = payload.get('entityType', 'CDokumente')
ctx.logger.info(f"{len(entity_ids)} Document IDs zum Create-Sync gefunden")
ctx.logger.info(f"{len(entity_ids)} document IDs found for create sync")
# Emit events für Queue-Processing (Deduplizierung erfolgt im Event-Handler via Lock)
# Emit events for queue processing (deduplication via lock in event handler)
for entity_id in entity_ids:
await ctx.enqueue({
'topic': 'vmh.document.create',
'data': {
'entity_id': entity_id,
'entity_type': entity_type if 'entity_type' in locals() else 'CDokumente',
'entity_type': entity_type,
'action': 'create',
'timestamp': payload[0].get('modifiedAt') if isinstance(payload, list) and payload else None
}
})
ctx.logger.info("✅ Document Create Webhook processed: "
f"{len(entity_ids)} events emitted")
return ApiResponse(
status=200,
body={
'success': True,
'message': f'{len(entity_ids)} Document(s) zum Sync enqueued',
'message': f'{len(entity_ids)} document(s) enqueued for sync',
'entity_ids': list(entity_ids)
}
)
except Exception as e:
ctx.logger.error(f"Fehler im Document Create Webhook: {e}")
ctx.logger.error(f"Payload: {request.body}")
ctx.logger.error("=" * 80)
ctx.logger.error("❌ ERROR: DOCUMENT CREATE WEBHOOK")
ctx.logger.error("=" * 80)
ctx.logger.error(f"Error: {e}")
ctx.logger.error(f"Entity IDs attempted: {list(entity_ids) if 'entity_ids' in locals() else 'N/A'}")
ctx.logger.error(f"Full Payload: {json.dumps(request.body, indent=2, ensure_ascii=False)}")
ctx.logger.error(f"Timestamp: {datetime.datetime.now().isoformat()}")
ctx.logger.error("=" * 80)
return ApiResponse(
status=500,

View File

@@ -1,5 +1,6 @@
"""VMH Webhook - Document Delete"""
import json
import datetime
from typing import Any
from motia import FlowContext, http, ApiRequest, ApiResponse
@@ -25,47 +26,61 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
try:
payload = request.body or []
ctx.logger.info("VMH Webhook Document Delete empfangen")
ctx.logger.info("=" * 80)
ctx.logger.info("📥 VMH WEBHOOK: DOCUMENT DELETE")
ctx.logger.info("=" * 80)
ctx.logger.debug(f"Payload: {json.dumps(payload, indent=2, ensure_ascii=False)}")
# Sammle alle IDs aus dem Batch
# Collect all IDs from batch
entity_ids = set()
entity_type = 'CDokumente' # Default
if isinstance(payload, list):
for entity in payload:
if isinstance(entity, dict) and 'id' in entity:
entity_ids.add(entity['id'])
entity_type = entity.get('entityType', 'CDokumente')
# Take entityType from first entity if present
if entity_type == 'CDokumente':
entity_type = entity.get('entityType', 'CDokumente')
elif isinstance(payload, dict) and 'id' in payload:
entity_ids.add(payload['id'])
entity_type = payload.get('entityType', 'CDokumente')
ctx.logger.info(f"{len(entity_ids)} Document IDs zum Delete-Sync gefunden")
ctx.logger.info(f"{len(entity_ids)} document IDs found for delete sync")
# Emit events für Queue-Processing
# Emit events for queue processing
for entity_id in entity_ids:
await ctx.enqueue({
'topic': 'vmh.document.delete',
'data': {
'entity_id': entity_id,
'entity_type': entity_type if 'entity_type' in locals() else 'CDokumente',
'entity_type': entity_type,
'action': 'delete',
'timestamp': payload[0].get('deletedAt') if isinstance(payload, list) and payload else None
}
})
ctx.logger.info("✅ Document Delete Webhook processed: "
f"{len(entity_ids)} events emitted")
return ApiResponse(
status=200,
body={
'success': True,
'message': f'{len(entity_ids)} Document(s) zum Delete enqueued',
'message': f'{len(entity_ids)} document(s) enqueued for deletion',
'entity_ids': list(entity_ids)
}
)
except Exception as e:
ctx.logger.error(f"Fehler im Document Delete Webhook: {e}")
ctx.logger.error(f"Payload: {request.body}")
ctx.logger.error("=" * 80)
ctx.logger.error("❌ ERROR: DOCUMENT DELETE WEBHOOK")
ctx.logger.error("=" * 80)
ctx.logger.error(f"Error: {e}")
ctx.logger.error(f"Entity IDs attempted: {list(entity_ids) if 'entity_ids' in locals() else 'N/A'}")
ctx.logger.error(f"Full Payload: {json.dumps(request.body, indent=2, ensure_ascii=False)}")
ctx.logger.error(f"Timestamp: {datetime.datetime.now().isoformat()}")
ctx.logger.error("=" * 80)
return ApiResponse(
status=500,

View File

@@ -1,5 +1,6 @@
"""VMH Webhook - Document Update"""
import json
import datetime
from typing import Any
from motia import FlowContext, http, ApiRequest, ApiResponse
@@ -25,47 +26,61 @@ async def handler(request: ApiRequest, ctx: FlowContext[Any]) -> ApiResponse:
try:
payload = request.body or []
ctx.logger.info("VMH Webhook Document Update empfangen")
ctx.logger.info("=" * 80)
ctx.logger.info("📥 VMH WEBHOOK: DOCUMENT UPDATE")
ctx.logger.info("=" * 80)
ctx.logger.debug(f"Payload: {json.dumps(payload, indent=2, ensure_ascii=False)}")
# Sammle alle IDs aus dem Batch
# Collect all IDs from batch
entity_ids = set()
entity_type = 'CDokumente' # Default
if isinstance(payload, list):
for entity in payload:
if isinstance(entity, dict) and 'id' in entity:
entity_ids.add(entity['id'])
entity_type = entity.get('entityType', 'CDokumente')
# Take entityType from first entity if present
if entity_type == 'CDokumente':
entity_type = entity.get('entityType', 'CDokumente')
elif isinstance(payload, dict) and 'id' in payload:
entity_ids.add(payload['id'])
entity_type = payload.get('entityType', 'CDokumente')
ctx.logger.info(f"{len(entity_ids)} Document IDs zum Update-Sync gefunden")
ctx.logger.info(f"{len(entity_ids)} document IDs found for update sync")
# Emit events für Queue-Processing
# Emit events for queue processing
for entity_id in entity_ids:
await ctx.enqueue({
'topic': 'vmh.document.update',
'data': {
'entity_id': entity_id,
'entity_type': entity_type if 'entity_type' in locals() else 'CDokumente',
'entity_type': entity_type,
'action': 'update',
'timestamp': payload[0].get('modifiedAt') if isinstance(payload, list) and payload else None
}
})
ctx.logger.info("✅ Document Update Webhook processed: "
f"{len(entity_ids)} events emitted")
return ApiResponse(
status=200,
body={
'success': True,
'message': f'{len(entity_ids)} Document(s) zum Sync enqueued',
'message': f'{len(entity_ids)} document(s) enqueued for sync',
'entity_ids': list(entity_ids)
}
)
except Exception as e:
ctx.logger.error(f"Fehler im Document Update Webhook: {e}")
ctx.logger.error(f"Payload: {request.body}")
ctx.logger.error("=" * 80)
ctx.logger.error("❌ ERROR: DOCUMENT UPDATE WEBHOOK")
ctx.logger.error("=" * 80)
ctx.logger.error(f"Error: {e}")
ctx.logger.error(f"Entity IDs attempted: {list(entity_ids) if 'entity_ids' in locals() else 'N/A'}")
ctx.logger.error(f"Full Payload: {json.dumps(request.body, indent=2, ensure_ascii=False)}")
ctx.logger.error(f"Timestamp: {datetime.datetime.now().isoformat()}")
ctx.logger.error("=" * 80)
return ApiResponse(
status=500,

1033
uv.lock generated

File diff suppressed because it is too large Load Diff