diff --git a/docs/INDEX.md b/docs/INDEX.md
index e8a2f99..bd90d97 100644
--- a/docs/INDEX.md
+++ b/docs/INDEX.md
@@ -64,13 +64,53 @@ Webhook (HTTP) → Queue Event → Event Handler → External APIs
 ╚═══════════════════════════════════════════════════════╝
 ```
 
-**Guideline:**
-- Fire events liberally. 10 redundant events are cheaper than 1 complex lock coordination.
-- Make every handler idempotent with early returns.
-- Accept that events may trigger multiple times - handlers must be robust.
-- Use locks only for expensive operations (file uploads, rate-limited APIs).
+**Guidelines:**
 
-**Example: Entity Link Event**
+1. **Fire events liberally** - 10 redundant events are cheaper than complex coordination
+2. **Make handlers idempotent** - Early returns when nothing to do
+3. **Sequential per entity, parallel across entities** - Lock prevents collisions, not updates
+4. **Accept event storms** - Handlers queue up, process one by one
+
+**Lock Strategy: Sequential Processing per Entity**
+
+```
+Event Storm for Document A:
+├─ Event 1: update → Handler starts → Lock acquired
+├─ Event 2: update → Queued (waits for lock)
+├─ Event 3: update → Queued (after Event 2)
+└─ Event 4: update → Queued (after Event 3)
+
+Document B (parallel):
+└─ Event 1: update → Own lock, processes in parallel!
+
+Result:
+- Same entity: Sequential (prevents file upload collisions)
+- Different entities: Parallel (independent locks)
+- Lost events: Zero (all queued and processed)
+- Duplicate work: Prevented by idempotency checks
+```
+
+**Example Flow:**
+```
+t=0: User changes Document A (fileStatus → "changed")
+t=1: Event 1 fired → Lock acquired → Sync starts
+t=2: User changes Document A again (fileStatus → "changed")
+t=3: Event 2 fired → Queued (lock busy)
+t=4: Event 1 completes → fileStatus="unchanged", xaiSyncStatus="clean"
+t=5: Event 2 starts → Lock acquired
+     Check: fileStatus="unchanged", xaiSyncStatus="clean"
+     → Early return (nothing to do) ✅
+
+Result: Second event processed but no duplicate work!
+```
+
+**Why This Works:**
+- **Lock prevents chaos**: No parallel file uploads for same entity
+- **Queue enables updates**: New changes processed sequentially
+- **Idempotency prevents waste**: Redundant events → cheap early returns
+- **Parallel scaling**: Different entities process simultaneously
+
+**Practical Example: Entity Link Event**
 ```
 User links Document ↔ Räumungsklage
 
@@ -78,19 +118,19 @@ Webhooks fire:
 ├─ POST /vmh/webhook/entity/link
 └─ Emits: raeumungsklage.update, cdokumente.update
 
-Handlers (parallel):
+Handlers (parallel, different entities):
 ├─ Räumungsklage Handler
+│  ├─ Lock: raeumungsklage:abc123
 │  ├─ Creates xAI Collection (if missing)
-│  └─ Fires: cdokumente.update (for all linked docs)
+│  └─ Fires: cdokumente.update (for all linked docs) ← Event Storm!
 │
-└─ Document Handler (may run 2-3x)
-   ├─ Check: Already synced? → Return early (cheap!)
-   ├─ Check: Collection ready? → No? Return, retry later
-   └─ Sync: Upload to xAI + add to collections
+└─ Document Handler (may run 2-3x on same doc)
+   ├─ Lock: document:doc456 (sequential processing)
+   ├─ Run 1: Collections not ready → Skip (cheap return)
+   ├─ Run 2: Collection ready → Upload & sync
+   └─ Run 3: Already synced → Early return (idempotent!)
 ```
 
-**Result:** Overhead through multiple checks << Complexity of coordination.
-
 ### Bidirectional Reference Pattern
 
 **Principle: Every sync maintains references on both sides**
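
Note on the lock strategy added in the first hunk: the combination of "sequential per entity, parallel across entities" and idempotent early returns can be illustrated with a minimal sketch. This is not the project's handler code. `Document`, `handle_update`, `run_storm`, and the in-memory `_locks` registry are hypothetical names, the lock is a process-local `asyncio.Lock` (a real deployment would more likely need a shared or distributed lock), and `asyncio.sleep` stands in for the file upload.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical in-memory stand-in for a synced entity."""
    doc_id: str
    file_status: str = "changed"   # mirrors fileStatus in the example flow
    sync_status: str = "unclean"   # mirrors xaiSyncStatus in the example flow
    uploads: int = 0               # counts expensive syncs actually performed


# One lock per entity key such as "document:doc456" (assumed, process-local).
_locks = defaultdict(asyncio.Lock)


async def handle_update(doc: Document) -> str:
    """Process one update event and report "synced" or "skipped"."""
    # Events for the same entity queue up here; other entities use other locks.
    async with _locks[f"document:{doc.doc_id}"]:
        # Idempotency check: nothing to do if an earlier event already synced.
        if doc.file_status == "unchanged" and doc.sync_status == "clean":
            return "skipped"
        await asyncio.sleep(0.01)  # placeholder for the expensive upload
        doc.uploads += 1
        doc.file_status = "unchanged"
        doc.sync_status = "clean"
        return "synced"


async def run_storm():
    """Fire four events for document A and one for document B at once."""
    doc_a, doc_b = Document("A"), Document("B")
    events = [doc_a, doc_a, doc_a, doc_a, doc_b]
    results = await asyncio.gather(*(handle_update(d) for d in events))
    return list(results), doc_a, doc_b
```

Running `asyncio.run(run_storm())` under these assumptions performs one upload per document: the first event for A does the work, the three queued events for A return early, and B proceeds under its own lock without waiting for A.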