Deployment with Dokploy
Overview
Dokploy provides a self-hosted deployment platform with:
- 🎯 Web UI for deployment management
- 📊 Real-time logs and container monitoring
- 🔄 One-click rollbacks
- 🔔 Discord/Slack notifications
- 🚀 Zero-downtime deployments
- 🌐 Automatic SSL via Traefik
Architecture
GitHub (dev/main branches)
↓ (push triggers CI)
GitHub Actions
↓ (builds & pushes Docker image)
GitHub Container Registry (GHCR)
↓ (webhook on new image)
Dokploy
↓ (pulls image & deploys with zero downtime)
Docker Swarm (production containers)
↓
Traefik (auto-routing, auto-SSL)
Supporting services (postgres, redis, watermark) run on the host via docker-compose and connect to Dokploy's bot over a shared network.
Prerequisites
- ✅ Dokploy installed on your server
- ✅ Domain with DNS configured
- ✅ GitHub repository access (GHCR is automatically available)
Initial Setup
1. Create Docker Network
SSH into your server and create the shared network:
docker network create catto-network
This network allows Dokploy's bot containers to communicate with host services (postgres/redis/watermark).
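Running the create command a second time fails because the network already exists. A minimal sketch of an idempotent variant follows; `ensure_network` is a hypothetical helper name, not a Docker or Dokploy command.

```shell
# Create the shared network only if it does not exist yet.
# ensure_network is a hypothetical helper, not part of Docker or Dokploy.
ensure_network() {
  # "docker network inspect" exits non-zero when the network is missing.
  if docker network inspect "$1" >/dev/null 2>&1; then
    echo "network $1 already exists"
  else
    docker network create "$1" >/dev/null && echo "network $1 created"
  fi
}

# Usage on the server:
#   ensure_network catto-network
```

This makes the step safe to keep in a provisioning script that may run more than once.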
2. Start Supporting Services
On your server, start the supporting services:
cd /path/to/catto/repo
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d postgres redis watermark
Verify they're running:
docker ps | grep catto
docker exec catto-postgres psql -U postgres -l
docker exec catto-redis redis-cli ping
3. GitHub Container Registry
No setup needed! GHCR is automatically available for your repository.
- Images publish to: ghcr.io/your-org/your-repo:tag
- Uses built-in GITHUB_TOKEN (no secrets to configure)
- After first build, images appear in Packages tab on GitHub
- By default, packages are private (can be made public in package settings)
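Docker repository paths must be lowercase, so an organisation or repository name with capitals needs normalising before it is used as an image name. The sketch below builds the reference; `image_ref` and the `GHCR_PAT` variable are assumptions made for illustration.

```shell
# image_ref is a hypothetical helper: it builds the GHCR image reference
# from an owner, a repository name and a tag.
image_ref() {
  # Lowercase the repository path only; the tag is left untouched.
  path=$(printf '%s/%s' "$1" "$2" | tr '[:upper:]' '[:lower:]')
  printf 'ghcr.io/%s:%s\n' "$path" "$3"
}

# Usage (a private package needs a login first; GHCR_PAT is an assumed
# variable holding a Personal Access Token with the read:packages scope):
#   echo "$GHCR_PAT" | docker login ghcr.io -u your-github-username --password-stdin
#   docker pull "$(image_ref your-org your-repo latest)"
```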
Dokploy Configuration
4. Connect GitHub Source
- In Dokploy dashboard: Sources → Add Source
- Select GitHub
- Authorize Dokploy to access your repository
- Verify connection successful
5. Create Production Project
- Projects → New Project
- Name: catto-production
- Description: "Production deployment - main branch"
6. Create Production Bot Resource
- Inside project: New Resource → Docker Image
- Configuration:
  - Name: catto-bot-prod
  - Image: ghcr.io/your-org/your-repo:latest (replace with your org/repo)
  - Registry: GitHub Container Registry
  - Registry Credentials (if package is private):
    - Username: your GitHub username
    - Password: Personal Access Token with read:packages scope
  - Auto-deploy: ✅ Enable
  - Network: Select catto-network (or configure in Advanced settings)
7. Configure Production Environment Variables
Go to Environment Variables tab and add all variables from .env.dokploy.example:
Critical variables:
DISCORD_TOKEN=your_production_token
DATABASE_URL=postgresql://postgres:password@postgres:5432/catto_prod?schema=public
REDIS_HOST=redis
REDIS_PASSWORD=your_redis_password
WATERMARK_SERVICE_URL=http://watermark:3847
SESSION_ENCRYPTION_KEY=your_32_char_hex
NODE_ENV=production
See .env.dokploy.example for the full list.
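A forgotten variable usually only surfaces when the bot fails to start. A minimal sketch of a pre-flight check is shown here, assuming you keep a local copy of the values in an env file before pasting them into Dokploy; `missing_vars` and the `.env.production` file name are hypothetical.

```shell
# missing_vars is a hypothetical helper: it prints every critical variable
# that is absent or empty in the given env file.
missing_vars() {
  for name in DISCORD_TOKEN DATABASE_URL REDIS_HOST REDIS_PASSWORD \
              WATERMARK_SERVICE_URL SESSION_ENCRYPTION_KEY NODE_ENV; do
    # A variable counts as set when a line reads NAME=<at least one character>.
    grep -Eq "^${name}=.+" "$1" || echo "$name"
  done
}

# Usage (.env.production is an assumed file name):
#   missing_vars .env.production
```

An empty output means all seven critical variables listed above are present.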
8. Configure Production Domain
- Domains tab → Add Domain
- Domain: api.yourdomain.com
- Enable HTTPS (Traefik auto-configures Let's Encrypt)
- Port: 4000
- Save
9. Configure Zero-Downtime Deployments
This is critical for production stability!
- Advanced tab → Cluster Settings → Swarm Settings
- Add Health Check JSON:
{
"Test": ["CMD", "curl", "-f", "http://localhost:4000/api/health"],
"Interval": 30000000000,
"Timeout": 10000000000,
"StartPeriod": 40000000000,
"Retries": 3
}
- Add Update Config JSON:
{
"FailureAction": "rollback",
"Order": "start-first"
}
This ensures:
- New container starts and passes health checks
- Traffic switches to new container
- Old container shuts down
- On failure, automatic rollback to previous version
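The Interval, Timeout and StartPeriod values in the health check are durations in nanoseconds, which is why 30 seconds appears as 30000000000. The sketch below regenerates the same JSON from readable second values; `to_ns` and `health_check_json` are hypothetical helper names. Note that the check runs inside the container, so it assumes curl is installed in the image.

```shell
# to_ns is a hypothetical helper converting whole seconds to nanoseconds,
# the unit Docker uses for health check durations.
to_ns() {
  echo $(( $1 * 1000000000 ))
}

# health_check_json prints the health check from step 9 using second values.
health_check_json() {
  cat <<EOF
{
  "Test": ["CMD", "curl", "-f", "http://localhost:4000/api/health"],
  "Interval": $(to_ns 30),
  "Timeout": $(to_ns 10),
  "StartPeriod": $(to_ns 40),
  "Retries": 3
}
EOF
}

# Usage:
#   health_check_json
```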
Development Environment
Repeat steps 5-9 for development:
- Project: catto-development
- Resource: catto-bot-dev
- Image: ghcr.io/your-org/your-repo:dev
- Domain: api-dev.yourdomain.com
Environment variable differences:
- DISCORD_TOKEN → dev bot (if separate)
- DATABASE_URL → ...@postgres:5432/catto_dev?schema=public
- REDIS_DB → 1 (different Redis database)
- DEPLOY_VERSION → dev
- API_REDIRECT → https://api-dev.yourdomain.com/api/oauth/callback
DNS Configuration
Configure DNS A records:
api.yourdomain.com → your_server_ip
api-dev.yourdomain.com → your_server_ip
Wait 5-60 minutes for propagation.
First Deployment
Production
- Push code to main branch
- GitHub Actions builds and pushes ghcr.io/your-org/your-repo:latest to GHCR
- Dokploy receives webhook and pulls image
- In Dokploy UI: watch deployment logs in real-time
- Verify:
curl https://api.yourdomain.com/api/health
Development
- Push code to dev branch
- GitHub Actions builds and pushes ghcr.io/your-org/your-repo:dev to GHCR
- Dokploy auto-deploys
- Verify:
curl https://api-dev.yourdomain.com/api/health
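A single curl right after a push can fail simply because the new container is still warming up. A minimal sketch of a polling check follows; `wait_for_health` is a hypothetical helper and the default attempt count and delay are arbitrary choices.

```shell
# wait_for_health is a hypothetical helper: it polls a URL until curl
# succeeds or the attempts run out.
#   $1 = URL, $2 = max attempts (default 10), $3 = seconds between attempts (default 5)
wait_for_health() {
  attempt=1
  while [ "$attempt" -le "${2:-10}" ]; do
    # -f makes curl fail on HTTP errors; -sS hides progress but keeps error messages.
    if curl -fsS -o /dev/null "$1"; then
      echo "healthy after $attempt attempt(s)"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "${3:-5}"
  done
  echo "still unhealthy after ${2:-10} attempt(s)" >&2
  return 1
}

# Usage:
#   wait_for_health https://api-dev.yourdomain.com/api/health 12 5
```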
Daily Operations
Viewing Logs
- In Dokploy UI: navigate to resource → Logs tab
- Real-time streaming logs from all containers
- Filter by container if needed
Manual Deployment
- Click Deploy button in resource view
- Optionally specify a different image tag
- Watch deployment progress in logs
Rollback
- Go to Deployments history tab
- Find a previous successful deployment
- Click Redeploy
- Zero-downtime rollback to that version
Notifications
- Project settings → Notifications
- Add Discord/Slack webhook URL
- Enable for: deployment started, succeeded, failed
Deployment Flow
1. Developer pushes to main/dev branch
↓
2. GitHub Actions CI runs (lint, test, typecheck)
↓
3. CI builds Docker image with tag (latest or dev)
↓
4. CI pushes image to GitHub Container Registry
↓
5. Dokploy receives webhook from GHCR
↓
6. Dokploy pulls new image
↓
7. Dokploy starts new container in Swarm
↓
8. Health checks pass on new container
↓
9. Traefik switches traffic to new container (zero downtime!)
↓
10. Old container gracefully shuts down
↓
11. Discord notification: "Deploy successful 🚀"
Troubleshooting
Deployment Fails
Check:
- Deployment logs in Dokploy UI
- Image exists on GHCR: Go to GitHub repo → Packages tab
- Or try: docker pull ghcr.io/your-org/your-repo:latest
- Environment variables are correct
- Docker network: docker network inspect catto-network
- If package is private, verify Dokploy has valid GHCR credentials
Health Check Fails
Check:
- Bot logs in Dokploy: look for startup errors
- Health endpoint responds: curl http://localhost:4000/api/health
- Redis connectivity: docker exec catto-redis redis-cli ping
- Postgres connectivity: docker exec catto-postgres psql -U postgres -l
Common issues:
- Wrong DATABASE_URL (check hostname is postgres not localhost)
- Wrong REDIS_HOST (should be redis not localhost)
- Missing environment variables
- Wrong network configuration
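The localhost mistake is easy to miss inside a long connection string. The sketch below pulls the hostname out of a URL and flags loopback values; `url_host` and `check_not_localhost` are hypothetical helpers, and the parsing assumes any special characters in the password are URL-encoded.

```shell
# url_host is a hypothetical helper: it extracts the hostname from a
# connection URL such as DATABASE_URL.
url_host() {
  rest=${1#*://}      # drop the scheme
  rest=${rest%%/*}    # drop the path and query string
  rest=${rest##*@}    # drop user:password@ if present
  printf '%s\n' "${rest%%:*}"   # drop the port
}

# check_not_localhost NAME HOST: fail when a service hostname points at the
# container itself instead of a service on the shared network.
check_not_localhost() {
  case "$2" in
    localhost|127.0.0.1)
      echo "$1 points at $2; use the service name instead"
      return 1 ;;
    *)
      echo "$1 host is $2"
      return 0 ;;
  esac
}

# Usage:
#   check_not_localhost DATABASE_URL "$(url_host "$DATABASE_URL")"
#   check_not_localhost REDIS_HOST "$REDIS_HOST"
```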
Domain/SSL Issues
Check:
- DNS propagation: dig api.yourdomain.com (should show server IP)
- Port 80/443 open: sudo ufw status or check cloud provider firewall
- Traefik logs in Dokploy
- Domain correctly configured in Dokploy UI
Auto-Deploy Not Triggering
Check:
- Image was pushed to GHCR successfully (check GitHub Actions logs)
- Image appears in GitHub repo → Packages tab
- Auto-deploy is enabled in Dokploy resource settings
- Image name matches exactly (including ghcr.io/ prefix and tag)
- Dokploy webhook is configured for GHCR
- If package is private, verify Dokploy can authenticate to GHCR
Bot Can't Connect to Postgres/Redis
Check:
- Services are running: docker ps | grep catto
- Services are on catto-network: docker network inspect catto-network
- Dokploy bot is also on catto-network (configure in Advanced settings)
- Hostnames are correct: postgres, redis, watermark (not localhost)
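The full output of docker network inspect is verbose. A minimal sketch that reduces it to a yes/no answer per container is shown below; `on_network` is a hypothetical helper, and it matches on a name prefix because the exact container name Dokploy assigns to the bot is an assumption you will need to check with docker ps.

```shell
# on_network is a hypothetical helper: it succeeds when a container whose name
# starts with the given prefix is attached to the given network.
#   $1 = network name, $2 = container name prefix
on_network() {
  docker network inspect "$1" \
    --format '{{range .Containers}}{{println .Name}}{{end}}' \
    | grep -q "^$2"
}

# Usage (catto-postgres and catto-redis are the container names used earlier):
#   for c in catto-postgres catto-redis; do
#     on_network catto-network "$c" && echo "$c: attached" || echo "$c: NOT attached"
#   done
```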
Migration from GitHub Actions CD
If you're migrating from the old GitHub Actions CD approach:
- ✅ GitHub Actions CI still runs (builds images now instead of deploying)
- ✅ Supporting services stay on host (no change needed)
- ❌ Old .github/workflows/cd.yml archived (Dokploy handles deployment)
- ❌ Caddy removed (Traefik replaces it)
- ❌ scripts/deploy.sh no longer needed
Your docker-compose supporting services continue running as before; only the bot deployment method has changed.
Advanced: Zero-Downtime Deep Dive
Dokploy uses Docker Swarm's rolling update strategy:
- start-first: New container starts before old one stops
- Health check: New container must pass health checks
- Start period: 40s grace period for app warmup (each individual check times out after 10s)
- Retries: 3 attempts before marking unhealthy
- Rollback: On failure, automatically reverts to previous version
This keeps your HTTP API responsive during deploys. The Discord Gateway connection still drops briefly when the old container stops (see Phase 2 below).
Phase 2: True Zero-Downtime (Future)
Dokploy's zero-downtime is good for HTTP APIs, but Discord Gateway connections still disconnect during container restarts.
For true Discord Gateway handoff (no events missed), the Phase 2 plan with Redis leader election and pub/sub handoff is still needed. This would be implemented as a pre-deployment script in Dokploy or as an external orchestration layer.
The current setup covers the vast majority of use cases: Gateway reconnects are fast (~2s) and BullMQ jobs survive container restarts.