bsiggel/motia

Fork 0

Files

bitbylaw 8de2654d74 feat(sync): Fix Var6 revert logic for direction='to_advoware' and enhance conflict handling

2026-02-08 22:07:55 +00:00

18 KiB

Raw Blame History

Code-Analyse: Kommunikation Sync (Komplett-Review)

Datum: 8. Februar 2026

Executive Summary

Gesamtbewertung: ⚠️ GUT mit 4 KRITISCHEN BUGS

Der Code ist grundsätzlich gut strukturiert, verwendet intelligente Change Detection mit Hashes und implementiert alle 6 Varianten korrekt. ABER es gibt 4 kritische Bugs die dazu führen können, dass Systeme nicht synchron bleiben.

1. Architektur-Bewertung

✅ Stärken

3-Way Diffing mit Hash-basierter Konflikt-Erkennung
- Verwendet Kommunikations-Hash (MD5 der rowIds) statt Beteiligte-rowId
- Erkennt wer geändert hat (Advoware vs EspoCRM vs beide)
- Konfliktauflösung: EspoCRM wins
Marker-Strategie ist elegant
- [ESPOCRM:base64_value:kommKz] ermöglicht Matching auch bei Value-Änderungen
- [ESPOCRM-SLOT:kommKz] für gelöschte Einträge (Advoware DELETE gibt 403)
- User-Bemerkungen werden preserviert
Alle 6 Varianten implementiert
- Var1: Neu in EspoCRM → CREATE/REUSE in Advoware ✅
- Var2: Gelöscht in EspoCRM → Empty Slot in Advoware ✅
- Var3: Gelöscht in Advoware → DELETE in EspoCRM ✅
- Var4: Neu in Advoware → CREATE in EspoCRM ✅
- Var5: Geändert in EspoCRM → UPDATE in Advoware ✅
- Var6: Geändert in Advoware → UPDATE in EspoCRM ✅
Retry-Mechanismus mit Exponential Backoff
- [1, 5, 15, 60, 240] Minuten
- Auto-Reset nach 24h für permanently_failed
- Round-trip Validation nach jedem Sync
Distributed Locking
- Redis + EspoCRM syncStatus
- TTL: 15 Minuten (verhindert Deadlocks)

⚠️ Schwächen

Komplexität: Hash-basierte Logik ist schwer zu debuggen
Performance: Lädt ALLE Kommunikationen bei jedem Sync
Error Handling: Einige Fehler werden nur geloggt, nicht propagiert
Testing: Keine Unit-Tests für edge cases

2. Szenario-Matrix: ALLE möglichen Konstellationen

Legende

✅ = Korrekt implementiert
❌ = BUG gefunden
⚠️ = Potentielles Problem

2.1 Single-Side Changes (keine Konflikte)

#	Szenario	EspoCRM	Advoware	Erwartung	Status	Bug-ID
1	Neu in EspoCRM	+email	-	Var1: CREATE in Advoware	✅	-
2	Neu in Advoware	-	+phone	Var4: CREATE in EspoCRM	✅	-
3	Gelöscht in EspoCRM	-email	(marker)	Var2: Empty Slot	✅	-
4	Gelöscht in Advoware	(entry)	-phone	Var3: DELETE in EspoCRM	✅	-
5	Geändert in EspoCRM	email↑	(synced)	Var5: UPDATE in Advoware	✅	-
6	Geändert in Advoware	(synced)	phone↑	Var6: UPDATE in EspoCRM	✅	-

2.2 Conflict Scenarios (beide Seiten ändern)

#	Szenario	EspoCRM	Advoware	Erwartung	Status	Bug-ID
7	Beide neu angelegt	+emailA	+phoneB	EspoCRM wins: nur emailA	❌	BUG-1
8	Beide geändert (same)	emailA→B	emailA→B	Beide Änderungen	✅	-
9	Beide geändert (diff)	emailA→B	emailA→C	EspoCRM wins: A→B	⚠️	BUG-2
10	Einer neu, einer gelöscht	+emailA	-phoneX	EspoCRM wins	✅	-
11	Primary flag conflict	primary↑	primary↑	EspoCRM wins	✅	-

BUG-1: ❌ Bei Konflikt wird Var4 (neu in Advoware) zu Empty Slot, ABER wenn espo_wins=False, wird Var4 trotzdem zu EspoCRM übertragen → Daten-Inkonsistenz möglich

BUG-2: ⚠️ Bei gleichzeitiger Änderung wird Advoware-Änderung revertiert, ABER wenn User sofort wieder ändert, entsteht Ping-Pong

2.3 Initial Sync (kein kommunikationHash)

#	Szenario	EspoCRM	Advoware	Erwartung	Status	Bug-ID
12	Initial: Beide leer	-	-	Kein Sync	✅	-
13	Initial: Nur EspoCRM	+emails	-	CREATE in Advoware	✅	-
14	Initial: Nur Advoware	-	+phones	CREATE in EspoCRM	✅	-
15	Initial: Beide haben Daten	+emailA	+phoneB	Merge beide	✅	-
16	Initial: Gleiche Email	+emailA	+emailA	Nur 1x	❌	BUG-3

BUG-3: ❌ Bei Initial Sync werden identische Werte doppelt angelegt (einmal mit Marker, einmal ohne)

2.4 Edge Cases mit Empty Slots

#	Szenario	EspoCRM	Advoware	Erwartung	Status	Bug-ID
17	User füllt Slot mit Daten	-	Slot→+phone	Var4: Sync zu EspoCRM	✅	FIXED
18	EspoCRM füllt gleichen Slot	+email	Slot	Var1: Slot reuse	✅	-
19	Mehrere Slots gleicher kommKz	-	Slot1, Slot2	First reused	✅	-
20	Slot mit User-Bemerkung	-	Slot+Text	Text bleibt	✅	-

2.5 Direction Parameter

#	Direction	EspoCRM Change	Advoware Change	Erwartung	Status	Bug-ID
21	'both'	+emailA	+phoneB	Beide synced	✅	-
22	'to_espocrm'	+emailA	+phoneB	Nur phoneB→EspoCRM	❌	BUG-4
23	'to_advoware'	+emailA	+phoneB	Nur emailA→Advoware, phoneB DELETED	✅	FIXED
24	'to_advoware'	-	phoneA→B	phoneB→A revert	✅	FIXED

BUG-4: ❌ Bei direction='to_espocrm' werden EspoCRM-Änderungen ignoriert, ABER Hash wird trotzdem updated → Verlust von EspoCRM-Änderungen beim nächsten Sync

2.6 Hash Calculation

#	Szenario	Kommunikationen	Hash Basis	Status	Bug-ID
25	Nur sync-relevante	4 synced, 5 andere	4 rowIds	✅	FIXED
26	Mit Empty Slots	3 synced, 2 slots	3 rowIds	✅	FIXED
27	Alle leer	0 synced	"" → empty hash	✅	-

2.7 kommKz Detection

#	Szenario	Input	Erwartet	Status	Notiz
28	Email ohne Marker	test@mail.com	kommKz=4 (MailGesch)	✅	Via pattern
29	Phone ohne Marker	+4930123	kommKz=1 (TelGesch)	✅	Via pattern
30	Mit EspoCRM type 'Mobile'	phone	kommKz=3	✅	Via type map
31	Mit EspoCRM type 'Fax'	phone	kommKz=2	✅	Via type map
32	Mit Marker	any	kommKz aus Marker	✅	Priority 1

3. Gefundene BUGS

🔴 BUG-1: Konflikt-Handling inkonsistent (CRITICAL)

Problem: Bei espo_wins=False (kein Konflikt) wird Var4 (neu in Advoware) zu EspoCRM übertragen. ABER bei espo_wins=True wird Var4 zu Empty Slot. Das ist korrekt. ABER: Wenn nach Konflikt wieder espo_wins=False, wird der Slot NICHT automatisch wiederhergestellt.

Location: sync_bidirectional() Lines 119-136

if direction in ['both', 'to_espocrm'] and not espo_wins:
    # Var4 wird zu EspoCRM übertragen ✅
    espo_result = await self._apply_advoware_to_espocrm(...)
elif direction in ['both', 'to_espocrm'] and espo_wins:
    # Var4 wird zu Empty Slot ✅
    for komm in diff['advo_new']:
        await self._create_empty_slot(betnr, komm)

Problem: User legt in Advoware phone an → Konflikt → Slot → Konflikt gelöst → Slot bleibt leer, wird NICHT zu EspoCRM übertragen!

Fix: Nach Konflikt-Auflösung sollte ein "recovery" Sync laufen, der Slots mit User-Daten wieder aktiviert.

Severity: 🔴 CRITICAL - Datenverlust möglich

🔴 BUG-3: Initial Sync mit identischen Werten (CRITICAL)

Problem: Bei Initial Sync (kein Hash) werden identische Einträge doppelt angelegt:

Advoware hat test@mail.com (ohne Marker)
EspoCRM hat test@mail.com
→ Ergebnis: 2x test@mail.com in beiden Systemen!

Location: _compute_diff() Lines 365-385

# Var4: Neu in Advoware
for komm in advo_without_marker:
    diff['advo_new'].append(komm)  # Wird zu EspoCRM übertragen

# Var1: Neu in EspoCRM
for value, espo_item in espo_values.items():
    if value not in advo_with_marker:
        diff['espo_new'].append((value, espo_item))  # Wird zu Advoware übertragen

Root Cause: Bei Initial Sync sollte Value-Matching statt nur Marker-Matching verwendet werden!

Fix: In _compute_diff() bei Initial Sync auch value in [komm['tlf'] for komm in advo_without_marker] prüfen.

Severity: 🔴 CRITICAL - Daten-Duplikate

🟡 BUG-4: direction='to_espocrm' verliert EspoCRM-Änderungen (MEDIUM)

Problem: Bei direction='to_espocrm' werden nur Advoware→EspoCRM Änderungen übertragen, ABER der Hash wird trotzdem updated. Beim nächsten Sync (direction='both') gehen EspoCRM-Änderungen verloren.

Beispiel:

User ändert in EspoCRM: emailA→B
Sync mit direction='to_espocrm' → emailA→B wird NICHT zu Advoware übertragen
Hash wird updated (basiert auf Advoware rowIds)
Nächster Sync: Diff erkennt KEINE Änderung (Hash gleich) → emailA→B geht verloren!

Location: sync_bidirectional() Lines 167-174

# Hash wird IMMER updated, auch bei direction='to_espocrm'
if total_changes > 0 or is_initial_sync:
    # ... Hash berechnen ...
    await self.espocrm.update_entity('CBeteiligte', beteiligte_id, {
        'kommunikationHash': new_komm_hash
    })

Fix: Hash nur updaten wenn:

direction='both' ODER
direction='to_advoware' UND Advoware wurde geändert ODER
direction='to_espocrm' UND EspoCRM wurde geändert UND zu Advoware übertragen

Severity: 🟡 MEDIUM - Datenverlust bei falscher direction-Verwendung

🟢 BUG-2: Ping-Pong bei gleichzeitigen Änderungen (LOW)

Problem: Bei gleichzeitiger Änderung wird Advoware revertiert. Wenn User sofort wieder ändert, entsteht Ping-Pong.

Beispiel:

User ändert in Advoware: phoneA→B
Admin ändert in EspoCRM: phoneA→C
Sync: EspoCRM wins → phoneA→C in beiden
User ändert wieder: phoneC→B
Sync: phoneC→B in beiden (kein Konflikt mehr)
Admin bemerkt Änderung, ändert zurück: phoneB→C
→ Endlos-Schleife

Fix: Conflict-Notification mit "Lock" bis Admin bestätigt.

Severity: 🟢 LOW - UX Problem, keine Daten-Inkonsistenz

4. Code-Qualität Bewertung

Eleganz: ⭐⭐⭐⭐☆ (4/5)

Gut:

Klare Trennung: Diff Computation vs. Apply Changes
Marker-Strategie ist clever und robust
Hash-basierte Konflikt-Erkennung ist innovativ

Schlecht:

Zu viele verschachtelte if-else (Lines 119-163)
_compute_diff() ist 300+ Zeilen lang → schwer zu testen
Duplikation von kommKz-Detection Logic

Verbesserung:

# Aktuell:
if direction in ['both', 'to_espocrm'] and not espo_wins:
    ...
elif direction in ['both', 'to_espocrm'] and espo_wins:
    ...
else:
    ...

# Besser:
should_apply_advo = direction in ['both', 'to_espocrm']
should_revert_advo = should_apply_advo and espo_wins

if should_apply_advo and not espo_wins:
    ...
elif should_revert_advo:
    ...

Effizienz: ⭐⭐⭐☆☆ (3/5)

Performance-Probleme:

Doppelte API-Calls:

# Line 71-75: Lädt Advoware
advo_result = await self.advoware.get_beteiligter(betnr)

# Line 167-174: Lädt NOCHMAL
advo_result_final = await self.advoware.get_beteiligter(betnr)

Fix: Cache das Ergebnis, nur neu laden wenn Änderungen gemacht wurden.

Hash-Berechnung bei jedem Sync:
- Sortiert ALLE rowIds, auch wenn keine Änderungen
- Fix: Lazy evaluation - nur berechnen wenn total_changes > 0

N+1 Problem bei Updates:

# _apply_advoware_to_espocrm(): Update einzeln
for komm, old_value, new_value in diff['advo_changed']:
    await self.advoware.update_kommunikation(...)  # N API-Calls

Fix: Batch-Update API (wenn Advoware unterstützt)

Keine Parallelisierung:
- Var1-6 werden sequenziell verarbeitet
- Fix: asyncio.gather() für unabhängige Operations

Optimierte Version:

async def sync_bidirectional(self, ...):
    # 1. Load data (parallel)
    advo_task = self.advoware.get_beteiligter(betnr)
    espo_task = self.espocrm.get_entity('CBeteiligte', beteiligte_id)
    advo_bet, espo_bet = await asyncio.gather(advo_task, espo_task)
    
    # 2. Compute diff (sync, fast)
    diff = self._compute_diff(...)
    
    # 3. Apply changes (parallel where possible)
    tasks = []
    if direction in ['both', 'to_espocrm'] and not espo_wins:
        tasks.append(self._apply_advoware_to_espocrm(...))
    if direction in ['both', 'to_advoware']:
        tasks.append(self._apply_espocrm_to_advoware(...))
    
    results = await asyncio.gather(*tasks, return_exceptions=True)

Robustheit: ⭐⭐⭐⭐☆ (4/5)

Gut:

Distributed Locking verhindert Race Conditions
Retry mit Exponential Backoff
Auto-Reset nach 24h
Round-trip Validation

Probleme:

Error Propagation unvollständig:

# _apply_advoware_to_espocrm()
except Exception as e:
    result['errors'].append(str(e))  # Nur geloggt, nicht propagiert!
    return result  # Caller weiß nicht ob partial failure

Fix: Bei kritischen Fehlern Exception werfen, nicht nur loggen.

Partial Updates nicht atomic:
- Var1-6 werden sequenziell verarbeitet
- Bei Fehler in Var3 sind Var1-2 schon geschrieben
- Kein Rollback möglich (Advoware DELETE gibt 403!)
Fix:
- Phase 1: Collect all changes (dry-run)
- Phase 2: Apply all or nothing (mit compensation)
- Phase 3: Update hash only if Phase 2 successful

Hash-Mismatch nach partial failure:

# Hash wird updated auch bei Fehlern
if total_changes > 0:  # total_changes kann > 0 sein auch wenn errors!
    await self.espocrm.update_entity(..., {'kommunikationHash': new_hash})

Fix: if total_changes > 0 AND not result['errors']:

Lock-Release bei Exception:

async def sync_bidirectional(self, ...):
    try:
        ...
    except Exception:
        # Lock wird NICHT released!
        pass

Fix: try/finally mit Lock-Release im finally-Block.

5. Korrektheit-Matrix

Alle 6 Varianten: Verarbeitung korrekt?

Variante	Single-Side	Konflikt	Initial Sync	Direction	Gesamt
Var1: Neu EspoCRM	✅	✅	✅	⚠️ BUG-4	🟡
Var2: Del EspoCRM	✅	✅	N/A	✅	✅
Var3: Del Advoware	✅	✅	N/A	✅	✅
Var4: Neu Advoware	✅	⚠️ BUG-1	❌ BUG-3	✅	🔴
Var5: Chg EspoCRM	✅	✅	✅	⚠️ BUG-4	🟡
Var6: Chg Advoware	✅	✅	✅	✅	✅

Legende:

✅ = Korrekt
⚠️ = Mit Einschränkungen
❌ = Fehlerhaft
🔴 = Critical Bug
🟡 = Medium Bug

6. Recommendations

6.1 CRITICAL FIXES (sofort)

BUG-3: Initial Sync Duplikate

def _compute_diff(self, ...):
    # Bei Initial Sync: Value-Matching statt nur Marker-Matching
    if is_initial_sync:
        advo_values = {k['tlf']: k for k in advo_without_marker if k.get('tlf')}

        for value, espo_item in espo_values.items():
            if value in advo_values:
                # Match gefunden - setze Marker, KEIN Var1/Var4
                komm = advo_values[value]
                # Setze Marker in Advoware
                ...
            elif value not in advo_with_marker:
                diff['espo_new'].append((value, espo_item))

BUG-1: Konflikt-Recovery

# Nach Konflikt-Auflösung: Recovery-Check für Slots mit User-Daten
if last_status == 'conflict' and new_status == 'clean':
    # Check für Slots die User-Daten haben
    recovery_komms = [
        k for k in advo_kommunikationen 
        if parse_marker(k.get('bemerkung', '')).get('is_slot') 
        and k.get('tlf')  # Slot hat Daten!
    ]
    if recovery_komms:
        # Trigger Var4-Sync
        for komm in recovery_komms:
            diff['advo_new'].append(komm)

6.2 MEDIUM FIXES (nächster Sprint)

BUG-4: Hash-Update Logik

should_update_hash = (
    (direction in ['both', 'to_advoware'] and result['espocrm_to_advoware']['created'] + result['espocrm_to_advoware']['updated'] + result['espocrm_to_advoware']['deleted'] > 0) or
    (direction in ['both', 'to_espocrm'] and result['advoware_to_espocrm']['emails_synced'] + result['advoware_to_espocrm']['phones_synced'] > 0)
)

if should_update_hash and not result['errors']:
    # Update hash
    ...

Error Propagation

async def _apply_advoware_to_espocrm(self, ...):
    critical_errors = []

    try:
        ...
    except CriticalException as e:
        critical_errors.append(e)
        raise  # Propagate up!
    except Exception as e:
        result['errors'].append(str(e))

    return result

6.3 OPTIMIZATIONS (Backlog)

Performance: Batch-Updates, Parallelisierung
Code-Qualität: Extract _compute_diff() in kleinere Methoden
Testing: Unit-Tests für alle 32 Szenarien
Monitoring: Metrics für Sync-Dauer, Fehler-Rate, Konflikt-Rate

7. Fazit

Ist der Code gut, elegant, effizient und robust?

Gut: ⭐⭐⭐⭐☆ (4/5) - Ja, grundsätzlich gut
Elegant: ⭐⭐⭐⭐☆ (4/5) - Marker-Strategie clever, aber zu verschachtelt
Effizient: ⭐⭐⭐☆☆ (3/5) - N+1 Problem, keine Parallelisierung
Robust: ⭐⭐⭐⭐☆ (4/5) - Mit Einschränkungen (partial failures)

Werden alle Varianten korrekt verarbeitet?

JA, mit 3 CRITICAL EXCEPTIONS:

❌ BUG-1: Konflikt-Recovery fehlt
❌ BUG-3: Initial Sync mit Duplikaten
⚠️ BUG-4: direction='to_espocrm' verliert EspoCRM-Änderungen

Sind alle Konstellationen abgedeckt?

Größtenteils JA: 28 von 32 Szenarien korrekt (87.5%)

Missing:

Initial Sync mit identischen Werten (BUG-3)
Konflikt-Recovery nach Auflösung (BUG-1)
Partial failure handling
Concurrent modifications während Sync

8. Next Steps

✅ Sofort: Fixe BUG-3 (Initial Sync Duplikate)
✅ Heute: Fixe BUG-1 (Konflikt-Recovery)
⏳ Diese Woche: Fixe BUG-4 (Hash-Update Logik)
⏳ Nächste Woche: Performance-Optimierungen
⏳ Backlog: Unit-Tests für alle Szenarien

Review erstellt von: GitHub Copilot
Review-Datum: 8. Februar 2026
Code-Version: Latest (nach allen bisherigen Fixes)

18 KiB Raw Blame History