This commit is contained in:
2026-02-07 09:23:49 +00:00
parent 96eabe3db6
commit 36552903e7
85 changed files with 9820870 additions and 1767 deletions


@@ -0,0 +1,800 @@
# Troubleshooting Guide
## Service Issues
### Service Won't Start
**Symptoms**: `systemctl start motia.service` fails
**Diagnosis**:
```bash
# Check service status
sudo systemctl status motia.service
# View detailed logs
sudo journalctl -u motia.service -n 100 --no-pager
# Check for port conflicts
sudo netstat -tlnp | grep 3000
```
**Common causes**:
1. **Port 3000 already in use**:
```bash
# Find process
sudo lsof -i :3000
# Stop the process (plain SIGTERM first; fall back to -9 only if it does not exit)
sudo kill <PID>
```
2. **Missing dependencies**:
```bash
cd /opt/motia-app/bitbylaw
sudo -u www-data npm install
sudo -u www-data bash -c 'source python_modules/bin/activate && pip install -r requirements.txt'
```
3. **Wrong permissions**:
```bash
sudo chown -R www-data:www-data /opt/motia-app
sudo chmod 600 /opt/motia-app/service-account.json
```
4. **Missing environment variables**:
```bash
# Check systemd environment
sudo systemctl show motia.service -p Environment
# Verify required vars
sudo systemctl cat motia.service | grep Environment
```
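To compare the environment against a list of expected names, a small helper along these lines may be useful. It is only a sketch: `REQUIRED_VARS` and `missing_vars` are illustrative names, and the list simply collects variables mentioned elsewhere in this guide, so the real set for your deployment may differ.

```python
import os

# Assumed list: variable names that appear in this guide. Adjust to your deployment.
REQUIRED_VARS = [
    "REDIS_HOST",
    "REDIS_PORT",
    "ADVOWARE_API_KEY",
    "GOOGLE_CALENDAR_SERVICE_ACCOUNT_PATH",
]


def missing_vars(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars(REQUIRED_VARS)
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set")
```

Run it in the same environment as the service (for example from a step, or with the unit's variables exported), since an interactive shell usually does not see the systemd `Environment=` entries.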
### Service Keeps Crashing
**Symptoms**: Service starts but crashes shortly afterwards
**Diagnosis**:
```bash
# Watch logs in real-time
sudo journalctl -u motia.service -f
# Check for OOM (Out of Memory)
dmesg | grep -i "out of memory"
sudo grep -i "killed process" /var/log/syslog
```
**Solutions**:
1. **Increase the memory limit**:
```ini
# In /etc/systemd/system/motia.service
Environment=NODE_OPTIONS=--max-old-space-size=8192
```
2. **Python Memory Leak**:
```bash
# Check memory usage
ps aux | grep python
# Restart service periodically (workaround)
# Add to crontab:
0 3 * * * systemctl restart motia.service
```
3. **Unhandled Exception**:
```bash
# Check error logs
sudo journalctl -u motia.service -p err
# Wrap the failing code in the problematic step in try/except and log the error
```
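One way to make unhandled exceptions visible in the journal is to wrap a step handler so that errors are logged before they propagate. The sketch below assumes an async handler that takes `(data, context)` and that `context.logger` offers an `error` method next to the `debug` method used elsewhere in this guide; `log_unhandled` is a hypothetical helper name, not part of the project.

```python
import functools


def log_unhandled(handler):
    """Wrap an async step handler so unexpected exceptions are logged, then re-raised."""

    @functools.wraps(handler)
    async def wrapper(data, context):
        try:
            return await handler(data, context)
        except Exception as exc:
            # Assumption: context.logger.error exists alongside context.logger.debug
            context.logger.error(f"Unhandled error in {handler.__name__}: {exc!r}")
            raise  # re-raise so the failure is not silently swallowed

    return wrapper
```

Re-raising keeps the failure visible to whatever called the handler; swallowing the exception instead would hide the crash rather than explain it.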
## Redis Issues
### Redis Connection Failed
**Symptoms**: "Redis connection failed" in logs
**Diagnosis**:
```bash
# Check Redis status
sudo systemctl status redis-server
# Test connection
redis-cli ping
# Check config
redis-cli CONFIG GET bind
redis-cli CONFIG GET port
```
**Solutions**:
1. **Redis not running**:
```bash
sudo systemctl start redis-server
sudo systemctl enable redis-server
```
2. **Wrong host/port**:
```bash
# Check environment
echo $REDIS_HOST
echo $REDIS_PORT
# Test connection
redis-cli -h $REDIS_HOST -p $REDIS_PORT ping
```
3. **Permission denied**:
```bash
# Check Redis log
sudo tail -f /var/log/redis/redis-server.log
# Fix permissions
sudo chown redis:redis /var/lib/redis
sudo chmod 750 /var/lib/redis
```
### Redis Out of Memory
**Symptoms**: "OOM command not allowed" errors
**Diagnosis**:
```bash
# Check memory usage
redis-cli INFO memory
# Check maxmemory setting
redis-cli CONFIG GET maxmemory
```
**Solutions**:
1. **Increase maxmemory**:
```bash
# In /etc/redis/redis.conf
maxmemory 2gb
maxmemory-policy allkeys-lru
sudo systemctl restart redis-server
```
2. **Clear old data**:
```bash
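# Clear cache in DB 1 (Advoware tokens are re-fetched on the next request).
# Note: the commands later in this guide show that DB 1 also holds the
# calendar sync locks and webhook dedup sets, which FLUSHDB removes as well.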
redis-cli -n 1 FLUSHDB
# Clear calendar sync state
redis-cli -n 2 FLUSHDB
```
3. **Check for memory leaks**:
```bash
# Find large keys
redis-cli --bigkeys
# Check specific key size
redis-cli MEMORY USAGE <key>
```
## Advoware API Issues
### Authentication Failed
**Symptoms**: "401 Unauthorized" or "HMAC signature invalid"
**Diagnosis**:
```bash
# Check logs for auth errors
sudo journalctl -u motia.service | grep -i "auth\|token\|401"
# Test token fetch manually
python3 << 'EOF'
from services.advoware import AdvowareAPI
api = AdvowareAPI()
token = api.get_access_token(force_refresh=True)
print(f"Token: {token[:20]}...")
EOF
```
**Solutions**:
1. **Invalid API Key**:
```bash
# Verify API Key is Base64
echo "$ADVOWARE_API_KEY" | base64 -d
# Re-encode if needed
echo -n "raw_key" | base64
```
2. **Wrong credentials**:
```bash
# Verify environment variables
sudo systemctl show motia.service -p Environment | grep ADVOWARE
# Update in systemd service
sudo nano /etc/systemd/system/motia.service
sudo systemctl daemon-reload
sudo systemctl restart motia.service
```
3. **Token expired**:
```bash
# Clear cached token
redis-cli -n 1 DEL advoware_access_token advoware_token_timestamp
# Retry request (will fetch new token)
```
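The Base64 check on the API key can also be done in Python, which rejects stray characters that a lenient decoder might ignore. This is a minimal sketch; `is_valid_base64` is an illustrative helper, not part of the project code.

```python
import base64
import binascii


def is_valid_base64(value: str) -> bool:
    """Return True if `value` is non-empty and strictly valid Base64."""
    try:
        # validate=True rejects characters outside the Base64 alphabet
        base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return bool(value)
```

A value copied with a trailing newline or surrounding quotes fails this check, which is a common cause of an otherwise correct key being rejected.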
### API Timeout
**Symptoms**: "Request timeout" or "API call failed"
**Diagnosis**:
```bash
# Check API response time
time curl "http://localhost:3000/advoware/proxy?endpoint=employees"
# Check network connectivity
ping www2.advo-net.net
curl -I https://www2.advo-net.net:90/
```
**Solutions**:
1. **Increase timeout**:
```bash
# In environment
export ADVOWARE_API_TIMEOUT_SECONDS=60
# Or in systemd service
Environment=ADVOWARE_API_TIMEOUT_SECONDS=60
```
2. **Network issues**:
```bash
# Check firewall
sudo ufw status
# Test direct connection
curl -v https://www2.advo-net.net:90/
```
3. **Advoware API down**:
```bash
# Wait and retry
# Implement exponential backoff in code
```
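The exponential backoff mentioned above could look roughly like this. It is a sketch under assumptions: `call_with_backoff` and its parameters are hypothetical names, the delays are example values, and in real code `retry_on` should be narrowed to the timeout and connection errors your HTTP client raises.

```python
import random
import time


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call fn() until it succeeds, doubling the wait between attempts."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # 1s, 2s, 4s, ... capped at max_delay
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Up to 10% random jitter so parallel workers do not retry in lockstep
            sleep(delay + random.uniform(0, delay * 0.1))
```

The `sleep` parameter is injectable so the retry logic can be exercised without actually waiting.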
## Google Calendar Issues
### Service Account Not Found
**Symptoms**: "service-account.json not found"
**Diagnosis**:
```bash
# Check file exists
ls -la /opt/motia-app/service-account.json
# Check that the service user can read it
sudo -u www-data test -r /opt/motia-app/service-account.json && echo "readable"
# Check environment variable
echo $GOOGLE_CALENDAR_SERVICE_ACCOUNT_PATH
```
**Solutions**:
1. **File missing**:
```bash
# Copy from backup
sudo cp /backup/service-account.json /opt/motia-app/
# Set permissions
sudo chmod 600 /opt/motia-app/service-account.json
sudo chown www-data:www-data /opt/motia-app/service-account.json
```
2. **Wrong path**:
```bash
# Update environment
# In /etc/systemd/system/motia.service:
Environment=GOOGLE_CALENDAR_SERVICE_ACCOUNT_PATH=/opt/motia-app/service-account.json
sudo systemctl daemon-reload
sudo systemctl restart motia.service
```
### Calendar API Rate Limit
**Symptoms**: "403 Rate limit exceeded" or "429 Too Many Requests"
**Diagnosis**:
```bash
# Check rate limiting in logs
sudo journalctl -u motia.service | grep -i "rate\|403\|429"
# Check Redis rate limit tokens
redis-cli -n 2 GET google_calendar_api_tokens
```
**Solutions**:
1. **Wait for rate limit reset**:
```bash
# Rate limit resets every minute
# Wait 60 seconds and retry
```
2. **Adjust rate limit settings**:
```python
# In calendar_sync_event_step.py
MAX_TOKENS = 7 # Decrease if hitting limits
REFILL_RATE_PER_MS = 7 / 1000
```
3. **Request quota increase**:
- Go to Google Cloud Console
- Navigate to "APIs & Services" → "Quotas"
- Request increase for Calendar API
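
The `MAX_TOKENS` and `REFILL_RATE_PER_MS` settings suggest a token-bucket limiter. The sketch below shows how such a bucket behaves, using an in-memory counter for illustration; the project appears to keep its token state in Redis (see the `google_calendar_api_tokens` key above), so the class and method names here are assumptions rather than the actual implementation.

```python
import time


class TokenBucket:
    """In-memory token bucket: holds up to max_tokens, refilled continuously."""

    def __init__(self, max_tokens=7, refill_rate_per_ms=7 / 1000, clock=time.monotonic):
        self.max_tokens = max_tokens
        self.refill_rate_per_ms = refill_rate_per_ms
        self._clock = clock
        self._tokens = float(max_tokens)  # start with a full bucket
        self._last = clock()

    def try_consume(self, n=1):
        """Take n tokens if available; return False when the caller should wait."""
        now = self._clock()
        elapsed_ms = (now - self._last) * 1000.0
        self._last = now
        # Refill for the elapsed time, never exceeding the bucket capacity
        self._tokens = min(self.max_tokens,
                           self._tokens + elapsed_ms * self.refill_rate_per_ms)
        if self._tokens >= n:
            self._tokens -= n
            return True
        return False
```

Lowering `max_tokens` shrinks the burst size, while lowering `refill_rate_per_ms` reduces the sustained request rate.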
### Calendar Access Denied
**Symptoms**: "Access denied" or "Insufficient permissions"
**Diagnosis**:
```bash
# Check service account email
python3 << 'EOF'
import json
with open('/opt/motia-app/service-account.json') as f:
    data = json.load(f)
print(f"Service Account: {data['client_email']}")
EOF
# Test API access
python3 << 'EOF'
from google.oauth2 import service_account
from googleapiclient.discovery import build
creds = service_account.Credentials.from_service_account_file(
    '/opt/motia-app/service-account.json',
    scopes=['https://www.googleapis.com/auth/calendar']
)
service = build('calendar', 'v3', credentials=creds)
result = service.calendarList().list().execute()
print(f"Calendars: {len(result.get('items', []))}")
EOF
```
**Solutions**:
1. **Calendar not shared**:
```bash
# Share calendar with service account email
# In Google Calendar UI: Settings → Share → Add service account email
```
2. **Wrong scopes**:
```bash
# Verify scopes in code
# Should be: https://www.googleapis.com/auth/calendar
```
3. **Domain-wide delegation**:
```bash
# For Google Workspace (formerly G Suite), enable domain-wide delegation
# See GOOGLE_SETUP_README.md
```
## Calendar Sync Issues
### Sync Not Running
**Symptoms**: No calendar updates, no sync logs
**Diagnosis**:
```bash
# Check if cron is triggering
sudo journalctl -u motia.service | grep -i "calendar_sync_cron"
# Manually trigger sync
curl -X POST "http://localhost:3000/advoware/calendar/sync" \
-H "Content-Type: application/json" \
-d '{"full_content": true}'
# Check for locks
redis-cli -n 1 KEYS "calendar_sync:lock:*"
```
**Solutions**:
1. **Cron not configured**:
```python
# Verify calendar_sync_cron_step.py has correct schedule
config = {
'schedule': '0 2 * * *', # Daily at 2 AM
}
```
2. **Lock stuck**:
```bash
# Clear all locks
python /opt/motia-app/bitbylaw/delete_employee_locks.py
# Or manually
redis-cli -n 1 DEL calendar_sync:lock:SB
```
3. **Errors in sync**:
```bash
# Check error logs
sudo journalctl -u motia.service -p err | grep calendar
```
### Duplicate Events
**Symptoms**: Events appear multiple times in Google Calendar
**Diagnosis**:
```bash
# Check for concurrent syncs
redis-cli -n 1 KEYS "calendar_sync:lock:*"
# Check logs for duplicate processing
sudo journalctl -u motia.service | grep -i "duplicate\|already exists"
```
**Solutions**:
1. **Locking not working**:
```bash
# Verify Redis lock TTL
redis-cli -n 1 TTL calendar_sync:lock:SB
# Should return positive number if locked
```
2. **Manual cleanup**:
```bash
# Delete duplicates in Google Calendar UI
# Or use cleanup script (if available)
```
3. **Improve deduplication logic**:
```python
# In calendar_sync_event_step.py
# Add better event matching logic
```
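For reference, a per-employee lock with a TTL is commonly taken with a single `SET ... NX EX` call. The sketch below assumes a redis-py style client (connected to DB 1, as in the commands above) and a 600-second TTL chosen purely for illustration; `try_acquire_lock` is a hypothetical helper, and the project's own locking code may differ.

```python
import uuid
from typing import Optional

LOCK_PREFIX = "calendar_sync:lock:"  # key pattern used in the commands above


def try_acquire_lock(client, kuerzel: str, ttl_seconds: int = 600) -> Optional[str]:
    """Try to take the sync lock for one employee.

    Returns a random token on success, or None if another sync holds the lock.
    """
    token = uuid.uuid4().hex
    # SET key value NX EX ttl: writes only if the key does not exist yet, and
    # lets the lock expire on its own if the worker dies before releasing it.
    acquired = client.set(LOCK_PREFIX + kuerzel, token, nx=True, ex=ttl_seconds)
    return token if acquired else None
```

If two syncs for the same employee can both acquire the lock, check that the acquire step is a single atomic command like this one rather than a separate existence check followed by a write.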
### Events Not Syncing
**Symptoms**: Advoware events are missing from Google Calendar
**Diagnosis**:
```bash
# Check specific employee
curl -X POST "http://localhost:3000/advoware/calendar/sync" \
-H "Content-Type: application/json" \
-d '{"kuerzel": "SB", "full_content": true}'
# Check logs for that employee
sudo journalctl -u motia.service | grep "SB"
# Check if calendar exists
python3 << 'EOF'
from google.oauth2 import service_account
from googleapiclient.discovery import build
creds = service_account.Credentials.from_service_account_file(
    '/opt/motia-app/service-account.json',
    scopes=['https://www.googleapis.com/auth/calendar']
)
service = build('calendar', 'v3', credentials=creds)
result = service.calendarList().list().execute()
for cal in result.get('items', []):
    if 'AW-SB' in cal['summary']:
        print(f"Found: {cal['summary']} - {cal['id']}")
EOF
```
**Solutions**:
1. **Calendar doesn't exist**:
```bash
# Will be auto-created on first sync
# Force sync to trigger creation
```
2. **Date range mismatch**:
```python
# Check FETCH_FROM and FETCH_TO in calendar_sync_event_step.py
# Default: Previous year to 9 years ahead
```
3. **Write protection enabled**:
```bash
# Check environment
echo $ADVOWARE_WRITE_PROTECTION
# Should be "false" for two-way sync
```
## Webhook Issues
### Webhooks Not Received
**Symptoms**: EspoCRM sends webhooks, but they are not processed
**Diagnosis**:
```bash
# Check if endpoint reachable
curl -X POST "http://localhost:3000/vmh/webhook/beteiligte/create" \
-H "Content-Type: application/json" \
-d '[{"id": "test-123"}]'
# Check firewall
sudo ufw status
# Check nginx logs (if using reverse proxy)
sudo tail -f /var/log/nginx/motia-access.log
sudo tail -f /var/log/nginx/motia-error.log
```
**Solutions**:
1. **Firewall blocking**:
```bash
# Allow port (if direct access)
sudo ufw allow 3000/tcp
# Or use reverse proxy (recommended)
```
2. **Wrong URL in EspoCRM**:
```bash
# Verify URL in EspoCRM webhook configuration
# Should be: https://your-domain.com/vmh/webhook/beteiligte/create
```
3. **SSL certificate issues**:
```bash
# Check certificate
openssl s_client -connect your-domain.com:443
# Renew certificate
sudo certbot renew
```
### Webhook Deduplication Not Working
**Symptoms**: The same webhooks are processed multiple times
**Diagnosis**:
```bash
# Check Redis dedup sets
redis-cli -n 1 SMEMBERS vmh:beteiligte:create_pending
redis-cli -n 1 SMEMBERS vmh:beteiligte:update_pending
redis-cli -n 1 SMEMBERS vmh:beteiligte:delete_pending
# Check for concurrent webhook processing
sudo journalctl -u motia.service | grep "Webhook.*received"
```
**Solutions**:
1. **Redis SET not working**:
```bash
# Test Redis SET operations
redis-cli -n 1 SADD test_set "value1"
redis-cli -n 1 SMEMBERS test_set
redis-cli -n 1 DEL test_set
```
2. **Clear dedup sets**:
```bash
# If corrupted
redis-cli -n 1 DEL vmh:beteiligte:create_pending
redis-cli -n 1 DEL vmh:beteiligte:update_pending
redis-cli -n 1 DEL vmh:beteiligte:delete_pending
```
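Set-based deduplication typically relies on the return value of `SADD`, which reports how many members were newly added. The sketch below assumes a redis-py style client and uses the hypothetical helper name `claim_new_ids`; it illustrates the technique and is not the project's actual handler code.

```python
def claim_new_ids(client, set_key, entity_ids):
    """Add ids to the pending set and return only those that were not there yet."""
    new_ids = []
    for entity_id in entity_ids:
        # SADD returns the number of members actually added:
        # 1 if the id is new, 0 if it was already pending
        if client.sadd(set_key, entity_id) == 1:
            new_ids.append(entity_id)
    return new_ids
```

Because each `SADD` is atomic, two webhook deliveries racing on the same id cannot both receive `1`, so only one of them goes on to process the entity.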
## Performance Issues
### High CPU Usage
**Diagnosis**:
```bash
# Check CPU usage
top -p "$(pgrep -d, -f 'motia start')"
# Profile with Node.js
# Already enabled with --inspect flag
# Connect to chrome://inspect
```
**Solutions**:
1. **Too many parallel syncs**:
```bash
# Reduce concurrent syncs
# Adjust DEBUG_KUERZEL to process fewer employees
```
2. **Infinite loop**:
```bash
# Check logs for repeated patterns
# -o cat prints only the message text, so identical messages group together
sudo journalctl -u motia.service -n 1000 -o cat | sort | uniq -c | sort -rn | head
```
### High Memory Usage
**Diagnosis**:
```bash
# Check memory (RSS in KiB); the [m] keeps grep from matching itself
ps aux | grep '[m]otia' | awk '{print $6}'
# Heap snapshot (if enabled)
kill -SIGUSR2 $(pgrep -f "motia start")
# Snapshot saved to current directory
```
**Solutions**:
1. **Increase memory limit**:
```ini
# In systemd service
Environment=NODE_OPTIONS=--max-old-space-size=16384
```
2. **Memory leak**:
```bash
# Restart service periodically
# Add to crontab:
0 3 * * * systemctl restart motia.service
```
### Slow API Responses
**Diagnosis**:
```bash
# Measure response time
time curl "http://localhost:3000/advoware/proxy?endpoint=employees"
# Check for database/Redis latency
redis-cli --latency
```
**Solutions**:
1. **Redis slow**:
```bash
# Check slow log
redis-cli SLOWLOG GET 10
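# Optimize Redis: tcp-backlog is a startup setting, so change it in
# /etc/redis/redis.conf and restart instead of using CONFIG SET
redis-cli CONFIG GET tcp-backlog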
```
2. **Advoware API slow**:
```bash
# Increase timeout
export ADVOWARE_API_TIMEOUT_SECONDS=60
# Add caching layer
```
## Debugging Tools
### Enable Debug Logging
```bash
# Set in systemd service
Environment=MOTIA_LOG_LEVEL=debug
sudo systemctl daemon-reload
sudo systemctl restart motia.service
```
### Redis Debugging
```bash
# Connect to Redis
redis-cli
# Monitor all commands
MONITOR
# Slow log
SLOWLOG GET 10
# Info
INFO all
```
### Python Debugging
```python
# Add to step code
import pdb; pdb.set_trace()
# Or use logging
context.logger.debug(f"Variable value: {variable}")
```
### Node.js Debugging
```bash
# Connect to inspector
# Chrome DevTools: chrome://inspect
# VSCode: Attach to Process
```
## Getting Help
### Check Logs First
```bash
# Last 100 lines
sudo journalctl -u motia.service -n 100
# Errors only
sudo journalctl -u motia.service -p err
# Specific time range
sudo journalctl -u motia.service --since "1 hour ago"
```
### Common Log Patterns
**Success**:
```
[INFO] Calendar sync completed for SB
[INFO] VMH Webhook received
```
**Warning**:
```
[WARNING] Rate limit approaching
[WARNING] Lock already exists for SB
```
**Error**:
```
[ERROR] Redis connection failed
[ERROR] API call failed: 401 Unauthorized
[ERROR] Unexpected error: ...
```
### Collect Debug Information
```bash
# System info
uname -a
node --version
python3 --version
# Service status
sudo systemctl status motia.service
# Recent logs
sudo journalctl -u motia.service -n 200 > motia-logs.txt
# Redis info
redis-cli INFO > redis-info.txt
# Configuration (redact secrets!)
sudo systemctl show motia.service -p Environment > env.txt
```
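Before sharing `env.txt`, secret values should be masked. A rough sketch of such a filter follows; the name pattern (`KEY`, `SECRET`, `TOKEN`, `PASSWORD`) and the function name `redact_env` are assumptions, and values containing spaces are only masked up to the first space, so review the result by hand.

```python
import re

# Assumed heuristic: treat any variable whose name contains one of these words as secret
SENSITIVE = re.compile(
    r'\b([A-Z0-9_]*(?:KEY|SECRET|TOKEN|PASSWORD)[A-Z0-9_]*)=([^\s"]+)'
)


def redact_env(text: str) -> str:
    """Mask the values of variables whose names look sensitive."""
    return SENSITIVE.sub(r"\1=***REDACTED***", text)


if __name__ == "__main__":
    import sys
    # Reads text on stdin and writes the redacted version to stdout
    sys.stdout.write(redact_env(sys.stdin.read()))
```

Saved under a name of your choice, the script can sit at the end of the pipeline, so the unredacted output never reaches disk.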
## Related Documentation
- [Architecture](ARCHITECTURE.md)
- [Configuration](CONFIGURATION.md)
- [Deployment](DEPLOYMENT.md)
- [Development Guide](DEVELOPMENT.md)