Deployment with Dokploy

Overview

Dokploy provides a self-hosted deployment platform with:

🎯 Web UI for deployment management
📊 Real-time logs and container monitoring
🔄 One-click rollbacks
🔔 Discord/Slack notifications
🚀 Zero-downtime deployments
🌐 Automatic SSL via Traefik

Architecture

GitHub (dev/main branches)
    ↓ (push triggers CI)
GitHub Actions
    ↓ (builds & pushes Docker image)
GitHub Container Registry (GHCR)
    ↓ (webhook on new image)
Dokploy
    ↓ (pulls image & deploys with zero downtime)
Docker Swarm (production containers)
    ↓
Traefik (auto-routing, auto-SSL)

Supporting services (postgres, redis, watermark) run on the host via Docker Compose and communicate with the Dokploy-managed bot container over a shared Docker network.

Prerequisites

✅ Dokploy installed on your server
✅ Domain with DNS configured
✅ GitHub repository access (GHCR is automatically available)

Initial Setup

1. Create Docker Network

SSH into your server and create the shared network:

bash

docker network create catto-network

This network allows Dokploy's bot containers to communicate with host services (postgres/redis/watermark).
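
Running docker network create a second time fails because the name is already taken. The following is a minimal sketch of an idempotent variant for setup scripts that may run more than once. The helper name ensure_network is illustrative and not part of Docker or Dokploy.

```shell
# Create the network only when it does not exist yet.
# ensure_network is a hypothetical helper name.
ensure_network() {
  name="$1"
  if docker network inspect "$name" >/dev/null 2>&1; then
    echo "network $name already exists"
  else
    docker network create "$name" >/dev/null && echo "network $name created"
  fi
}
```

Usage: ensure_network catto-network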

2. Start Supporting Services

On your server, start the supporting services:

bash

cd /path/to/catto/repo
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d postgres redis watermark

Verify they're running:

bash

docker ps | grep catto
docker exec catto-postgres psql -U postgres -l
docker exec catto-redis redis-cli ping
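
Right after docker compose up -d the containers may still be initializing, so a single check can fail spuriously. A small retry helper is one way to handle that. This is a sketch only: wait_for and WAIT_INTERVAL are hypothetical names. pg_isready is the readiness probe that ships with PostgreSQL.

```shell
# Retry a command until it succeeds or the attempt budget is used up.
# wait_for and WAIT_INTERVAL are illustrative names, not Docker features.
wait_for() {
  attempts="$1"
  shift
  while [ "$attempts" -gt 0 ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    attempts=$((attempts - 1))
    if [ "$attempts" -gt 0 ]; then
      sleep "${WAIT_INTERVAL:-2}"
    fi
  done
  return 1
}
```

Usage: wait_for 15 docker exec catto-postgres pg_isready -U postgres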

3. GitHub Container Registry

No setup needed! GHCR is automatically available for your repository.

Images publish to: ghcr.io/your-org/your-repo:tag
Uses the built-in GITHUB_TOKEN (no extra secrets to configure; the workflow job needs the packages: write permission)
After first build, images appear in Packages tab on GitHub
By default, packages are private (can be made public in package settings)
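
Docker rejects uppercase characters in repository names, so an organization or repository name with capitals has to be lowercased before it is used as an image path. A minimal sketch, assuming the owner/repo string and the tag are passed as arguments; ghcr_image is a hypothetical helper name.

```shell
# Build a GHCR image reference from "owner/repo" and a tag.
# The repository path is lowercased; the tag is kept as given.
ghcr_image() {
  repo=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  printf 'ghcr.io/%s:%s\n' "$repo" "$2"
}
```

Usage: ghcr_image Your-Org/Your-Repo dev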

Dokploy Configuration

4. Connect GitHub Source

In Dokploy dashboard: Sources → Add Source
Select GitHub
Authorize Dokploy to access your repository
Verify connection successful

5. Create Production Project

Projects → New Project
Name: catto-production
Description: "Production deployment - main branch"

6. Create Production Bot Resource

Inside project: New Resource → Docker Image
Configuration:
- Name: catto-bot-prod
- Image: ghcr.io/your-org/your-repo:latest (replace with your org/repo)
- Registry: GitHub Container Registry
- Registry Credentials (if package is private):
  - Username: your GitHub username
  - Password: Personal Access Token (classic) with read:packages scope
- Auto-deploy: ✅ Enable
- Network: Select catto-network (or configure in Advanced settings)
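
For a private package, it can help to confirm from a shell that the token is able to pull the image before entering it in Dokploy. The sketch below assumes the username and token are available in the environment variables GHCR_USER and GHCR_TOKEN. Those names and the helper check_ghcr_pull are hypothetical.

```shell
# Log in to GHCR with the given credentials and try to pull the image.
# check_ghcr_pull, GHCR_USER and GHCR_TOKEN are illustrative names.
check_ghcr_pull() {
  image="$1"
  printf '%s' "$GHCR_TOKEN" |
    docker login ghcr.io -u "$GHCR_USER" --password-stdin >/dev/null || return 1
  docker pull "$image" >/dev/null || return 1
  echo "pull ok: $image"
}
```

Usage: check_ghcr_pull ghcr.io/your-org/your-repo:latest. docker login stores the credentials on the machine where it runs; docker logout ghcr.io removes them again.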

7. Configure Production Environment Variables

Go to Environment Variables tab and add all variables from .env.dokploy.example:

Critical variables:

bash

DISCORD_TOKEN=your_production_token
DATABASE_URL=postgresql://postgres:password@postgres:5432/catto_prod?schema=public
REDIS_HOST=redis
REDIS_PASSWORD=your_redis_password
WATERMARK_SERVICE_URL=http://watermark:3847
SESSION_ENCRYPTION_KEY=your_32_char_hex
NODE_ENV=production

See .env.dokploy.example for the full list.
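
A quick local check of the critical variables can catch typos before the first deploy. The sketch below is not part of the project. check_env is a hypothetical helper, and the rule that SESSION_ENCRYPTION_KEY is exactly 32 hexadecimal characters is an assumption read off the placeholder your_32_char_hex.

```shell
# Validate the critical variables listed above (hypothetical helper).
check_env() {
  status=0
  for name in DISCORD_TOKEN DATABASE_URL REDIS_HOST REDIS_PASSWORD \
              WATERMARK_SERVICE_URL SESSION_ENCRYPTION_KEY NODE_ENV; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      status=1
    fi
  done
  # Assumption: the key is exactly 32 hexadecimal characters.
  key="${SESSION_ENCRYPTION_KEY:-}"
  case "$key" in
    *[!0-9a-fA-F]*) echo "SESSION_ENCRYPTION_KEY is not hex"; status=1 ;;
  esac
  if [ -n "$key" ] && [ "${#key}" -ne 32 ]; then
    echo "SESSION_ENCRYPTION_KEY must be 32 characters"
    status=1
  fi
  # Inside the container, localhost is the bot itself, not postgres.
  case "${DATABASE_URL:-}" in
    *@localhost*|*@127.0.0.1*) echo "DATABASE_URL points at localhost"; status=1 ;;
  esac
  return "$status"
}
```

Usage: export the variables in a shell, then run check_env. A key of the assumed shape can be generated with openssl rand -hex 16, which prints 32 hexadecimal characters.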

8. Configure Production Domain

Domains tab → Add Domain
Domain: api.yourdomain.com
Enable HTTPS (Traefik auto-configures Let's Encrypt)
Port: 4000
Save

9. Configure Zero-Downtime Deployments

This is critical for production stability!

Advanced tab → Cluster Settings → Swarm Settings
Add Health Check JSON (the check runs inside the container, so the image must include curl):

json

{
  "Test": ["CMD", "curl", "-f", "http://localhost:4000/api/health"],
  "Interval": 30000000000,
  "Timeout": 10000000000,
  "StartPeriod": 40000000000,
  "Retries": 3
}
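
The large integers in the health check are durations in nanoseconds, which is the unit the Docker API uses for Interval, Timeout and StartPeriod. The sketch below converts whole seconds to that unit so the values can be changed without counting zeros. secs_to_ns is a hypothetical helper name.

```shell
# Convert whole seconds to the nanosecond integers used in the JSON above.
secs_to_ns() {
  echo "$(( $1 * 1000000000 ))"
}

secs_to_ns 30   # Interval
secs_to_ns 10   # Timeout
secs_to_ns 40   # StartPeriod
```

The three calls print 30000000000, 10000000000 and 40000000000, matching the values in the JSON.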

Add Update Config JSON:

json

{
  "FailureAction": "rollback",
  "Order": "start-first"
}

This ensures:

New container starts and passes health checks
Traffic switches to new container
Old container shuts down
On failure, automatic rollback to previous version
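
To confirm that these settings reached Swarm, the stored update policy can be read back with docker service inspect. A small sketch: show_update_config is a hypothetical helper, and the service name is whatever Dokploy assigned, which docker service ls lists.

```shell
# Print the update policy Swarm has stored for a service as JSON.
# show_update_config is an illustrative name.
show_update_config() {
  docker service inspect --format '{{json .Spec.UpdateConfig}}' "$1"
}
```

Usage: show_update_config <service-name>. The output should include "FailureAction":"rollback" and "Order":"start-first" if the configuration was applied.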

Development Environment

Repeat steps 5-9 for development:

Project: catto-development
Resource: catto-bot-dev
Image: ghcr.io/your-org/your-repo:dev
Domain: api-dev.yourdomain.com

Environment variable differences:

DISCORD_TOKEN → dev bot (if separate)
DATABASE_URL → ...@postgres:5432/catto_dev?schema=public
REDIS_DB → 1 (different Redis database)
DEPLOY_VERSION → dev
API_REDIRECT → https://api-dev.yourdomain.com/api/oauth/callback

DNS Configuration

Configure DNS A records:

api.yourdomain.com      →  your_server_ip
api-dev.yourdomain.com  →  your_server_ip

Allow 5-60 minutes for propagation, depending on the TTL of the records.
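
Whether a record has propagated can be checked from a shell instead of waiting a fixed time. The sketch below relies on dig +short, which prints one address per line. dns_points_to is a hypothetical helper, and 203.0.113.10 in the usage line is a documentation-range placeholder for your server IP.

```shell
# Succeed when the hostname's A records include the expected address.
# dns_points_to is an illustrative name.
dns_points_to() {
  dig +short "$1" A | grep -qxF "$2"
}
```

Usage: dns_points_to api.yourdomain.com 203.0.113.10 && echo "DNS ready"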

First Deployment

Production

Push code to main branch
GitHub Actions builds and pushes ghcr.io/your-org/your-repo:latest to GHCR
Dokploy receives webhook and pulls image
In Dokploy UI: watch deployment logs in real-time
Verify: curl https://api.yourdomain.com/api/health

Development

Push code to dev branch
GitHub Actions builds and pushes ghcr.io/your-org/your-repo:dev to GHCR
Dokploy auto-deploys
Verify: curl https://api-dev.yourdomain.com/api/health
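
Both verification steps can be combined into one check that reports the HTTP status of each endpoint. This is a sketch under the assumption that a healthy bot answers /api/health with status 200. health_status and check_endpoints are hypothetical helper names.

```shell
# Print the HTTP status of a URL; curl reports 000 when there is no response.
health_status() {
  curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}

# Report every endpoint and fail if any does not answer 200.
check_endpoints() {
  failed=0
  for url in "$@"; do
    code=$(health_status "$url") || code="000"
    if [ "$code" = "200" ]; then
      echo "ok   $url"
    else
      echo "FAIL $url (HTTP $code)"
      failed=1
    fi
  done
  return "$failed"
}
```

Usage: check_endpoints https://api.yourdomain.com/api/health https://api-dev.yourdomain.com/api/health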

Daily Operations

Viewing Logs

In Dokploy UI: navigate to resource → Logs tab
Real-time streaming logs from all containers
Filter by container if needed

Manual Deployment

Click Deploy button in resource view
Optionally specify a different image tag
Watch deployment progress in logs

Rollback

Go to Deployments history tab
Find a previous successful deployment
Click Redeploy
Zero-downtime rollback to that version

Notifications

Project settings → Notifications
Add Discord/Slack webhook URL
Enable for: deployment started, succeeded, failed

Deployment Flow

1. Developer pushes to main/dev branch
        ↓
2. GitHub Actions CI runs (lint, test, typecheck)
        ↓
3. CI builds Docker image with tag (latest or dev)
        ↓
4. CI pushes image to GitHub Container Registry
        ↓
5. Dokploy receives webhook from GHCR
        ↓
6. Dokploy pulls new image
        ↓
7. Dokploy starts new container in Swarm
        ↓
8. Health checks pass on new container
        ↓
9. Traefik switches traffic to new container (zero downtime!)
        ↓
10. Old container gracefully shuts down
        ↓
11. Discord notification: "Deploy successful 🚀"

Troubleshooting

Deployment Fails

Check:

Deployment logs in Dokploy UI
Image exists on GHCR: Go to GitHub repo → Packages tab
Or try: docker pull ghcr.io/your-org/your-repo:latest
Environment variables are correct
Docker network: docker network inspect catto-network
If package is private, verify Dokploy has valid GHCR credentials

Health Check Fails

Check:

Bot logs in Dokploy: look for startup errors
Health endpoint responds: curl http://localhost:4000/api/health
Redis connectivity: docker exec catto-redis redis-cli ping
Postgres connectivity: docker exec catto-postgres psql -U postgres -l

Common issues:

Wrong DATABASE_URL (check hostname is postgres not localhost)
Wrong REDIS_HOST (should be redis not localhost)
Missing environment variables
Wrong network configuration

Domain/SSL Issues

Check:

DNS propagation: dig api.yourdomain.com (should show server IP)
Ports 80 and 443 open: sudo ufw status or check cloud provider firewall
Traefik logs in Dokploy
Domain correctly configured in Dokploy UI

Auto-Deploy Not Triggering

Check:

Image was pushed to GHCR successfully (check GitHub Actions logs)
Image appears in GitHub repo → Packages tab
Auto-deploy is enabled in Dokploy resource settings
Image name matches exactly (including ghcr.io/ prefix and tag)
Dokploy webhook is configured for GHCR
If package is private, verify Dokploy can authenticate to GHCR

Bot Can't Connect to Postgres/Redis

Check:

Services are running: docker ps | grep catto
Services are on catto-network: docker network inspect catto-network
Dokploy bot is also on catto-network (configure in Advanced settings)
Hostnames are correct: postgres, redis, watermark (not localhost)
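
The network checks above can be scripted by listing the container names attached to the network and comparing them with the expected ones. A minimal sketch: missing_from_network is a hypothetical helper, and catto-watermark is an assumed container name by analogy with catto-postgres and catto-redis.

```shell
# Print each expected container that is NOT attached to the network.
# missing_from_network is an illustrative name.
missing_from_network() {
  network="$1"
  shift
  attached=$(docker network inspect \
    --format '{{range .Containers}}{{.Name}} {{end}}' "$network") || return 2
  for name in "$@"; do
    case " $attached " in
      *" $name "*) ;;
      *) echo "$name" ;;
    esac
  done
}
```

Usage: missing_from_network catto-network catto-postgres catto-redis catto-watermark. Empty output means all listed containers are attached. The bot container's name is assigned by Swarm, so look it up with docker ps first.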

Migration from GitHub Actions CD

If you're migrating from the old GitHub Actions CD approach:

✅ GitHub Actions CI still runs (builds images now instead of deploying)
✅ Supporting services stay on host (no change needed)
❌ Old .github/workflows/cd.yml archived (Dokploy handles deployment)
❌ Caddy removed (Traefik replaces it)
❌ scripts/deploy.sh no longer needed

Your docker-compose supporting services continue running as before; only the bot's deployment method changes.

Advanced: Zero-Downtime Deep Dive

Dokploy uses Docker Swarm's rolling update strategy:

start-first: New container starts before old one stops
Health check: New container must pass health checks
Start period: 40s for app warmup, during which failed checks do not count toward the retry limit
Timeout: 10s per health check
Retries: 3 attempts before marking unhealthy
Rollback: On failure, automatically reverts to previous version

This keeps the HTTP API responsive during deploys, with no 5-10s outage window. The Discord Gateway connection still reconnects when the old container stops (see Phase 2 below).
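
The health check values also determine roughly how long a broken release can run before it is marked unhealthy and rolled back. The sketch below gives a rough estimate only. It assumes every probe runs for the full timeout and that failures during the start period are not counted, and it ignores any extra monitoring delay Swarm may add. unhealthy_after is a hypothetical helper name.

```shell
# Rough time in seconds until a container that never becomes healthy
# is marked unhealthy: start period plus one interval and one timeout
# per counted retry. This is an estimate, not an exact Docker guarantee.
unhealthy_after() {
  interval="$1"; timeout="$2"; start_period="$3"; retries="$4"
  echo "$(( start_period + retries * (interval + timeout) ))"
}

unhealthy_after 30 10 40 3
```

With the values from step 9 this prints 160, so a failing deploy may take a few minutes to roll back. Shorter intervals reduce that delay at the cost of more frequent probes.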

Resources

Phase 2: True Zero-Downtime (Future)

Dokploy's zero-downtime deployment works well for HTTP APIs, but Discord Gateway connections still disconnect during container restarts.

For true Discord Gateway handoff (no events missed), the Phase 2 plan with Redis leader election and pub/sub handoff is still needed. This would be implemented as a pre-deployment script in Dokploy or as an external orchestration layer.

The current setup covers most use cases: Gateway reconnects are fast (~2s) and BullMQ jobs survive container restarts.
