Deep Debugging Report - Port Conflict Resolution
Date: December 17, 2025
Issue: Backend service failing to start with "Address already in use" error
Status: ✅ RESOLVED with safeguards implemented
🔍 ROOT CAUSE ANALYSIS
The Problem
Backend systemd service (church-music-backend.service) was failing repeatedly with error:
[ERROR] Connection in use: ('127.0.0.1', 8080)
[ERROR] connection to ('127.0.0.1', 8080) failed: [Errno 98] Address already in use
[ERROR] Can't connect to ('127.0.0.1', 8080)
Investigation Process
1. Service Status Check
- Backend service in failed state after 5 restart attempts
- Systemd restart limit reached (StartLimitBurst=5)
- Exit code 1 (FAILURE)
2. Log Analysis
- Error logs showed consistent port 8080 binding failures
- No application errors - purely infrastructure issue
- Repeated retry attempts over ~90 seconds
3. Port Analysis
sudo lsof -i :8080
# Found: python 17329 pts - python app.py
4. Process Investigation
ps aux | grep 17329
# Result: python app.py running as development server
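The port and process lookups above can be combined into one step. The following is a minimal sketch, not part of the project's scripts: the function names `pids_from_lsof` and `describe_port_owners` are hypothetical, and it assumes `lsof` and `ps` are installed.

```shell
#!/bin/sh
# Sketch: find which processes are listening on a TCP port.

# Extract unique PIDs from lsof's default table on stdin
# (one header line, then one row per open socket; PID is column 2).
pids_from_lsof() {
    awk 'NR > 1 { print $2 }' | sort -un
}

# Hypothetical wrapper: print PID, user, and full command line for
# every process listening on TCP port $1.
describe_port_owners() {
    for pid in $(lsof -nP -iTCP:"$1" -sTCP:LISTEN 2>/dev/null | pids_from_lsof); do
        ps -o pid=,user=,args= -p "$pid"
    done
}
```

Running `describe_port_owners 8080` as root would show the full command line of each listener, which is what exposed the stray `python app.py` in this incident.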
Root Cause Identified
A Flask development server (python app.py) was running in the background, occupying port 8080 and preventing the production Gunicorn service from starting.
How it happened:
- The start-dev-mode.sh script starts python app.py in the background
- No cleanup when switching to production mode
- No collision detection between dev and production modes
- Process persisted across reboots/sessions
🛠️ FIXES IMPLEMENTED
1. Immediate Fix: Kill Rogue Process
sudo kill 17329 # Freed port 8080
sudo systemctl reset-failed church-music-backend.service
sudo systemctl start church-music-backend.service
Result: ✅ Backend service started successfully
2. Systemd Service Enhancement
File: church-music-backend.service
Added pre-start check:
ExecStartPre=/media/pts/Website/Church_HOP_MusicData/backend/pre-start-check.sh
This script:
- Checks if port 8080 is in use before starting
- Kills any rogue processes (except systemd services)
- Prevents startup if port can't be freed
- Logs all actions for debugging
File: backend/pre-start-check.sh
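The contents of pre-start-check.sh are not reproduced in this report. As an illustration only, a check along these lines could look like the sketch below. The function names, the `WAIT_SECS` grace period, and the log prefix are assumptions, and the exemption for systemd-managed processes is omitted for brevity.

```shell
#!/bin/sh
# Sketch of a pre-start port check (assumed logic, not the project's actual script).
PORT="${PORT:-8080}"
WAIT_SECS="${WAIT_SECS:-5}"   # grace period before escalating to SIGKILL

log() { echo "[pre-start-check] $*"; }

# Print PIDs listening on TCP port $1, one per line (nothing if the port is free).
listening_pids() {
    lsof -t -nP -iTCP:"$1" -sTCP:LISTEN 2>/dev/null | sort -un
}

# Try to free port $1. Returns 0 if the port ends up free, 1 otherwise.
free_port() {
    port="$1"
    pids="$(listening_pids "$port")"
    if [ -z "$pids" ]; then
        log "port $port is free"
        return 0
    fi
    for pid in $pids; do
        log "port $port held by PID $pid: $(ps -o args= -p "$pid" 2>/dev/null)"
        kill -TERM "$pid" 2>/dev/null || true
    done
    # Poll until the port is released or the grace period runs out.
    waited=0
    while [ "$waited" -lt "$WAIT_SECS" ] && [ -n "$(listening_pids "$port")" ]; do
        sleep 1
        waited=$((waited + 1))
    done
    remaining="$(listening_pids "$port")"
    if [ -n "$remaining" ]; then
        for pid in $remaining; do
            log "PID $pid ignored SIGTERM, sending SIGKILL"
            kill -KILL "$pid" 2>/dev/null || true
        done
        sleep 1   # give the kernel a moment to release the socket
    fi
    if [ -n "$(listening_pids "$port")" ]; then
        log "port $port could not be freed"
        return 1
    fi
    log "port $port freed"
    return 0
}
```

A script built on this would end with `free_port "$PORT" || exit 1`. Systemd aborts the start when an ExecStartPre command exits non-zero, so the service never reaches the bind step while the port is still occupied.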
3. Port Cleanup Utility
File: cleanup-ports.sh
Comprehensive port management script:
- Checks ports 8080 (backend) and 5100 (frontend)
- Identifies processes using each port
- Distinguishes between systemd services and rogue processes
- Safely kills only non-systemd processes
- Cleans up stale PID files
- Color-coded output for clarity
Usage:
./cleanup-ports.sh
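One way to distinguish systemd services from rogue processes is to inspect each process's cgroup. The sketch below assumes Linux with a readable /proc and that the units are named church-music-backend.service and church-music-frontend.service. The function names are hypothetical, and the real cleanup-ports.sh may use a different method.

```shell
#!/bin/sh
# Sketch: tell systemd-managed service processes apart from stray ones via cgroups.

# Succeeds if the cgroup listing on stdin belongs to one of the project's units.
in_service_cgroup() {
    grep -Eq '/church-music-(backend|frontend)\.service(/|$)'
}

# Classify PID $1 as "systemd", "rogue", or "gone" (Linux only: reads /proc).
classify_pid() {
    cgroup_file="/proc/$1/cgroup"
    if [ ! -r "$cgroup_file" ]; then
        echo "gone"
    elif in_service_cgroup < "$cgroup_file"; then
        echo "systemd"
    else
        echo "rogue"
    fi
}

# Terminate only the non-systemd listeners on TCP port $1.
cleanup_port() {
    for pid in $(lsof -t -nP -iTCP:"$1" -sTCP:LISTEN 2>/dev/null); do
        case "$(classify_pid "$pid")" in
            rogue)
                echo "killing rogue PID $pid on port $1"
                kill -TERM "$pid" 2>/dev/null || true
                ;;
            systemd)
                echo "leaving systemd-managed PID $pid alone"
                ;;
        esac
    done
}
```

With these helpers, `for port in 8080 5100; do cleanup_port "$port"; done` would cover both the backend and frontend ports. Matching on the cgroup rather than on the main PID also protects Gunicorn worker processes, which share the service's cgroup.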
4. Development Mode Safeguards
File: start-dev-mode.sh
Enhanced with:
- Production service detection: Warns if systemd services are running
- Interactive prompt: Asks permission to stop production services
- Old process cleanup: Kills previous dev mode processes
- PID file management: Removes stale PID files
- Clear status display: Shows running services and how to stop them
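The production check and interactive prompt described above could be structured as in the following sketch. It is an assumed flow rather than the script's actual code, and the helper names `service_active`, `confirm`, and `ensure_no_production` are hypothetical.

```shell
#!/bin/sh
# Sketch of the production check in dev mode (assumed flow, hypothetical helper names).
UNITS="church-music-backend church-music-frontend"

# Succeeds if the systemd unit $1 is currently active.
service_active() { systemctl is-active --quiet "$1" 2>/dev/null; }

# Ask a yes/no question on stdin; anything other than y/yes counts as "no".
confirm() {
    printf '%s [y/N] ' "$1"
    read -r reply || return 1
    case "$reply" in
        [yY] | [yY][eE][sS]) return 0 ;;
        *) return 1 ;;
    esac
}

# Returns 0 only when no production unit is left running.
ensure_no_production() {
    for unit in $UNITS; do
        service_active "$unit" || continue
        echo "WARNING: $unit is running under systemd"
        if confirm "Stop $unit before starting dev mode?"; then
            sudo systemctl stop "$unit" || return 1
        else
            echo "Refusing to start dev mode while $unit is active"
            return 1
        fi
    done
    return 0
}
```

Calling `ensure_no_production || exit 1` before launching any dev process makes the default answer the safe one: pressing Enter leaves production untouched and aborts dev mode.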
File: stop-dev-mode.sh (NEW)
Properly stops development mode:
- Kills backend and frontend dev processes
- Removes PID files
- Kills any stray processes
- Prevents port conflicts
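PID file handling of this kind can be sketched as follows. The PID file path is an assumption (the report only shows dev logs under /tmp), and the function names are hypothetical.

```shell
#!/bin/sh
# Sketch of PID-file handling for dev mode; the path below is an assumption.
PID_FILE="${PID_FILE:-/tmp/church-backend-dev.pid}"

# Report "running", "stale", or "absent" for the PID file at $1.
# Note: a live process owned by another user also reports "stale",
# because kill -0 fails without permission to signal it.
pidfile_status() {
    [ -f "$1" ] || { echo "absent"; return; }
    pid="$(cat "$1")"
    case "$pid" in
        '' | *[!0-9]*) echo "stale"; return ;;   # empty or non-numeric content
    esac
    if kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "stale"
    fi
}

# Stop the recorded process (if any) and always remove the PID file.
stop_from_pidfile() {
    if [ "$(pidfile_status "$1")" = "running" ]; then
        kill -TERM "$(cat "$1")" 2>/dev/null || true
    fi
    rm -f "$1"
}
```

A start script would write `$!` to `$PID_FILE` right after backgrounding the server, and a stop script would call `stop_from_pidfile "$PID_FILE"`. Checking the status first avoids signalling an unrelated process that happens to have reused an old PID only in the stale case where the original process is gone; it cannot detect reuse by a live process.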
5. Documentation Updates
- WEBSOCKET_HTTPS_FIX.md - WebSocket security fix
- STATUS.md - Updated system status
- This file - Comprehensive debugging documentation
🔒 SAFEGUARDS ADDED
1. Pre-Start Port Validation
- Automatic port conflict detection
- Kills rogue processes before service start
- Prevents "Address already in use" errors
- Logged for audit trail
2. Dev/Production Separation
- Development mode checks for production services
- Interactive warning system
- Cannot run both modes simultaneously
- Clear error messages
3. Process Tracking
- PID files for development mode
- Automatic cleanup of stale PIDs
- Process state validation
4. Monitoring & Diagnostics
- Enhanced logging in service files
- Dedicated cleanup script
- Verification script for WebSocket fix
- Clear error messages with solutions
🧪 VERIFICATION TESTS
Test 1: Service Startup
sudo systemctl status church-music-backend
Result: ✅ Active (running) with pre-start check successful
Test 2: API Endpoints
curl http://localhost:8080/api/health
Result: ✅ {"status":"ok","ts":"2025-12-17T07:24:06.301875"}
Test 3: HTTPS Access
curl -I https://houseofprayer.ddns.net/
Result: ✅ HTTP/2 200
Test 4: No Port Conflicts
sudo lsof -i :8080
Result: ✅ Only gunicorn workers (systemd service)
Test 5: Pre-Start Check
sudo systemctl restart church-music-backend
sudo systemctl status church-music-backend | grep ExecStartPre
Result: ✅ ExecStartPre=/media/pts/Website/Church_HOP_MusicData/backend/pre-start-check.sh (code=exited, status=0/SUCCESS)
📊 FAILURE POINTS ANALYSIS
Identified Failure Points
1. Port Binding
- Risk: Multiple processes competing for same port
- Mitigation: Pre-start port check, automatic cleanup
- Detection: Service fails immediately with clear error
2. Development vs Production Conflict
- Risk: Running both modes simultaneously
- Mitigation: Interactive warnings, automatic detection
- Detection: start-dev-mode.sh checks systemd services
3. Stray Background Processes
- Risk: Background processes persisting after crashes
- Mitigation: PID tracking, automatic cleanup
- Detection: cleanup-ports.sh finds and kills them
4. Service Restart Limits
- Risk: Hitting StartLimitBurst causing permanent failure
- Mitigation: Pre-start checks prevent repeated failures
- Recovery: Manual reset with systemctl reset-failed
5. Missing Dependencies
- Risk: Backend starts before database ready
- Mitigation: After=postgresql.service in service file (ordering only; add Wants= or Requires= if the database unit must also be started)
- Detection: Backend logs show connection errors
Monitoring Recommendations
1. Port Monitoring
# Add to cron for automated monitoring
*/5 * * * * /media/pts/Website/Church_HOP_MusicData/cleanup-ports.sh
2. Service Health Checks
curl http://localhost:8080/api/health
3. Log Monitoring
sudo journalctl -u church-music-backend -f
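The health check above can be turned into a probe that returns a usable exit status for cron or a systemd timer. This is a minimal sketch with hypothetical function names. It assumes the response body keeps the `{"status":"ok",...}` shape shown in Test 2.

```shell
#!/bin/sh
# Sketch of a health probe for cron or a systemd timer (assumed, not the project's script).
HEALTH_URL="${HEALTH_URL:-http://localhost:8080/api/health}"

# Succeeds if the JSON body on stdin contains "status":"ok".
is_healthy() {
    grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}

# Fetch URL $1 with a timeout; -f makes curl fail on HTTP errors (4xx/5xx),
# so an error page never reaches the status check.
check_health() {
    if curl -fsS --max-time 5 "$1" 2>/dev/null | is_healthy; then
        echo "healthy"
    else
        echo "unhealthy"
        return 1
    fi
}
```

A monitoring job could then run `check_health "$HEALTH_URL" || logger "church-music-backend health check failed"`, so failures land in the system log next to the service's own entries.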
📝 USAGE GUIDE
Production Mode (Recommended)
# Start services
sudo systemctl start church-music-backend
sudo systemctl start church-music-frontend
# Check status
sudo systemctl status church-music-backend
sudo systemctl status church-music-frontend
# View logs
sudo journalctl -u church-music-backend -f
Development Mode
# Start (will check for conflicts)
./start-dev-mode.sh
# Stop
./stop-dev-mode.sh
# View logs
tail -f /tmp/church-*.log
Troubleshooting
# Clean up port conflicts
./cleanup-ports.sh
# Reset failed services
sudo systemctl reset-failed church-music-backend
# Verify WebSocket fix (for frontend)
./verify-websocket-fix.sh
📈 IMPROVEMENTS SUMMARY
Before
- ❌ Port conflicts caused service failures
- ❌ No detection of dev/prod conflicts
- ❌ Manual cleanup required
- ❌ Difficult to diagnose issues
- ❌ Stray background processes persisted
After
- ✅ Automatic port conflict resolution
- ✅ Dev/prod conflict detection and warnings
- ✅ Automated cleanup scripts
- ✅ Clear error messages and logs
- ✅ Automatic cleanup of stray background processes
- ✅ Pre-start validation
- ✅ Comprehensive documentation
🎯 LESSONS LEARNED
1. Always validate port availability before binding
- Implement pre-start checks in systemd services
- Log port conflicts with process details
2. Separate development and production environments
- Never mix dev and prod processes
- Implement conflict detection
- Clear documentation of each mode
3. Track background processes properly
- Use PID files for all background processes
- Clean up PIDs on exit
- Validate process state before operations
4. Provide clear error messages
- Log what's wrong and how to fix it
- Include process details in errors
- Offer automated solutions
5. Document everything
- Usage guides for operators
- Troubleshooting steps
- Architecture decisions
🔗 RELATED FILES
Created/Updated
- cleanup-ports.sh - Port conflict resolution
- backend/pre-start-check.sh - Service pre-start validation
- start-dev-mode.sh - Enhanced with safeguards
- stop-dev-mode.sh - Proper cleanup
- church-music-backend.service - Added pre-start check
- WEBSOCKET_HTTPS_FIX.md - WebSocket security fix
- STATUS.md - Updated system status
Configuration Files
- nginx-ssl.conf - HTTPS proxy configuration
- frontend/.env - WebSocket security settings
- frontend/.env.production - Production build settings
✅ FINAL STATUS
Backend Service: ✅ Running (with pre-start protection)
Frontend Service: ✅ Running (production build)
WebSocket Error: ✅ Fixed (no dev server in production)
Port Conflicts: ✅ Prevented (automatic cleanup)
Documentation: ✅ Complete
Safeguards: ✅ Implemented
System Status: FULLY OPERATIONAL with enhanced reliability
🆘 EMERGENCY PROCEDURES
If services fail to start:
1. Quick Fix
./cleanup-ports.sh
sudo systemctl reset-failed church-music-backend
sudo systemctl start church-music-backend
2. Check Logs
sudo journalctl -u church-music-backend --no-pager | tail -50
3. Manual Port Check
sudo lsof -i :8080
sudo kill -9 <PID> # If rogue process found
4. Restart All
./stop-dev-mode.sh
sudo systemctl restart church-music-backend
sudo systemctl restart church-music-frontend
Author: GitHub Copilot (Claude Sonnet 4.5)
Date: December 17, 2025
Status: Production Ready ✅