docs: add guidelines for Steps vs. Utils architecture and decision matrix

bsiggel
2026-03-08 20:30:33 +00:00
parent d71b5665b6
commit fd0196ec31

@@ -151,6 +151,93 @@ Handlers (parallel, different entities):
└─ Run 3: Already synced → Early return (idempotent!)
```
---
### Steps vs. Utils: Parallel vs. Sequential
**Separation Principle: "Events for Parallelism, Functions for Composition"**
```
╔═══════════════════════════════════════════════════════╗
║ Choose Architecture by Execution Model: ║
║ ║
║ ✅ Separate Steps → When parallel possible ║
║ → Communicate via Events ║
║ → Own lock scope, independent retry ║
║ ║
║ ✅ Shared Utils → When sequential required ║
║ → Call as function (need return values) ║
║ → Reusable across multiple steps ║
╚═══════════════════════════════════════════════════════╝
```
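The difference between the two execution models can be sketched with a small in-memory event bus. This is an illustration only: `InMemoryBus`, `drain()`, and the simulated failure are hypothetical stand-ins, not the project's event system. It shows that handlers triggered by events run as independent tasks, so one failing document does not block the others.

```python
import asyncio


class InMemoryBus:
    """Hypothetical stand-in for the real event system (not the project's API)."""

    def __init__(self):
        self._handlers = {}
        self._tasks = []

    def subscribe(self, topic, handler):
        self._handlers.setdefault(topic, []).append(handler)

    async def emit(self, topic, data):
        # Schedule every subscriber as its own task; the emitter does not wait
        for handler in self._handlers.get(topic, []):
            self._tasks.append(asyncio.create_task(handler(data)))

    async def drain(self):
        # Collect results; exceptions are returned instead of raised,
        # so one failed handler does not cancel the others
        results = await asyncio.gather(*self._tasks, return_exceptions=True)
        self._tasks.clear()
        return results


async def sync_document(data):
    # Simulated handler: one document fails, the others succeed
    if data["entity_id"] == "doc-2":
        raise RuntimeError("simulated failure")
    await asyncio.sleep(0)
    return f"synced {data['entity_id']}"


async def demo():
    bus = InMemoryBus()
    bus.subscribe("cdokumente.update", sync_document)
    for doc_id in ("doc-1", "doc-2", "doc-3"):
        await bus.emit("cdokumente.update", {"entity_id": doc_id})
    return await bus.drain()


results = asyncio.run(demo())
```

A plain function call, by contrast, is awaited in place and hands its return value straight back to the caller, which is why it suits sequential work.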
**Decision Matrix:**
| Requirement | Architecture | Communication | Example |
|------------|--------------|---------------|---------|
| **Can run in parallel** | Separate Steps | Events | Document sync + Collection create |
| **Needs return value** | Utils function | Function call | `get_or_create_collection()` |
| **Reusable logic** | Utils function | Import + call | `should_sync_to_xai()` |
| **One-time handler** | Inline in Step | N/A | Event-specific parsing |
**Examples:**
```python
# ✅ GOOD: Parallel → Separate Steps + Events
# steps/collection_manager_step.py
async def handle_entity_update(event_data, ctx):
    """Creates an xAI collection for the entity."""
    # Payload keys are assumed here; adjust to the actual event schema
    entity_type = event_data["entity_type"]
    entity_id = event_data["entity_id"]
    collection_id = await create_collection(entity_type, entity_id)
    # Fire events for all linked documents (parallel processing!)
    # linked_docs: documents linked to this entity (lookup omitted here)
    for doc in linked_docs:
        await ctx.emit("cdokumente.update", {"entity_id": doc.id})

# steps/document_sync_step.py
async def handle_document_update(event_data, ctx):
    """Syncs a document to xAI."""
    # Runs in parallel with collection_manager_step!
    entity_id = event_data["entity_id"]
    await sync_document_to_xai(entity_id)

# ✅ GOOD: Sequential + Reusable → Utils
# services/document_sync_utils.py
async def get_required_collections(entity_id):
    """Reusable function; the caller needs the return value."""
    return await _scan_entity_relations(entity_id)

# ❌ BAD: Sequential logic in Step (not reusable)
async def handle_document_update(event_data, ctx):
    # Inline function call - hard to test and reuse
    collection_id = await create_collection_inline(...)
    await upload_file(collection_id, ...)  # Needs collection_id from above!
```
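One way to repair the BAD case is to move the dependent calls into a util and leave the step as a thin wrapper. The sketch below is a guess at the shape, not the project's code: `ensure_collection_and_upload` is a hypothetical name, and `create_collection` and `upload_file` are stubs with assumed signatures so the example runs on its own.

```python
import asyncio

calls = []  # records call order, for this demo only


async def create_collection(entity_id):
    # Stub standing in for the real xAI call (assumed signature)
    calls.append(("create", entity_id))
    return f"col-{entity_id}"


async def upload_file(collection_id, entity_id):
    # Stub standing in for the real upload (assumed signature)
    calls.append(("upload", collection_id, entity_id))


# Would live in services/document_sync_utils.py
async def ensure_collection_and_upload(entity_id):
    """Sequential by nature: the upload needs the collection_id first."""
    collection_id = await create_collection(entity_id)
    await upload_file(collection_id, entity_id)
    return collection_id


# Would live in steps/document_sync_step.py
async def handle_document_update(event_data, ctx):
    # Thin step: delegate the sequential work to the util
    return await ensure_collection_and_upload(event_data["entity_id"])


collection_id = asyncio.run(handle_document_update({"entity_id": "42"}, ctx=None))
```

The util can now be called from any step and tested without an event context.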
**Guidelines:**
1. **Default to Steps + Events** - Maximize parallelism
2. **Use Utils when:**
   - Need immediate return value
   - Logic reused across 2+ steps
   - Complex computation (not I/O bound)
3. **Keep Steps thin** - Mostly orchestration + event emission
4. **Keep Utils testable** - Plain functions with explicit inputs and return values, no event emission
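Guidelines 3 and 4 can be sketched together as follows. Every name here is hypothetical: `needs_sync` is an invented decision function in the spirit of `should_sync_to_xai()`, the `sync_status` field is assumed, and `FakeContext` is a test double that only mimics the `ctx.emit(topic, data)` call used in the examples above.

```python
import asyncio


def needs_sync(document):
    """Pure decision: no I/O, no event emission (field name is assumed)."""
    return document.get("sync_status") != "synced"


class FakeContext:
    """Test double that records emitted events instead of dispatching them."""

    def __init__(self):
        self.emitted = []

    async def emit(self, topic, data):
        self.emitted.append((topic, data))


async def handle_entity_update(event_data, ctx):
    # Thin step: ask the util, then emit; no business logic lives here
    for doc in event_data["documents"]:
        if needs_sync(doc):
            await ctx.emit("cdokumente.update", {"entity_id": doc["id"]})


ctx = FakeContext()
asyncio.run(
    handle_entity_update(
        {
            "documents": [
                {"id": "a", "sync_status": "synced"},
                {"id": "b", "sync_status": "pending"},
            ]
        },
        ctx,
    )
)
```

Because the util never emits, it can be checked with plain assertions, while the step is verified by inspecting what the fake context recorded.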
**Code Organization:**
```
steps/
├─ document_sync_event_step.py     # Event handler (thin)
├─ collection_manager_step.py      # Event handler (thin)
└─ vmh/
   └─ entity_link_webhook_step.py  # Webhook handler (thin)
services/
├─ document_sync_utils.py          # Reusable functions
├─ xai_service.py                  # API client
└─ espocrm.py                      # CRM client
```
---
### Bidirectional Reference Pattern
**Principle: Every sync maintains references on both sides**