DevOps · 19 December 2024 · Michael Hettwer · 12 min read

Zero-Downtime Deployments with Nginx and Systemd

A step-by-step walkthrough for achieving true zero-downtime releases using Nginx upstream switching and systemd socket activation — no Kubernetes required.

Zero-downtime deployments are often presented as a Kubernetes-only capability. They are not. With Nginx as a reverse proxy and systemd socket activation, you can achieve true zero-downtime application updates on a single server with no container orchestrator required — and understand exactly what is happening at each step.

The Core Idea

The strategy has two parts: (1) let systemd own the listening socket, so that during a restart new connections queue in the kernel instead of being refused, and (2) put Nginx in front as a reverse proxy to retry transient upstream failures and keep client connections alive across the switchover. For a brief window the outgoing process drains its in-flight requests while the incoming one picks up the queued connections — no request is dropped.

Step 1: Systemd Socket Activation

Socket activation lets systemd own the listening socket. When your application restarts, the socket remains open — incoming connections queue in the kernel — and are handed to the new process once it is ready. No connections are refused.

```ini
# /etc/systemd/system/myapp.socket
[Unit]
Description=MyApp socket

[Socket]
ListenStream=127.0.0.1:3000
# Accept=no: hand the listening socket itself to one service instance,
# rather than spawning a new instance per connection
Accept=no

[Install]
WantedBy=sockets.target
```
```ini
# /etc/systemd/system/myapp.service
[Unit]
Description=MyApp application server
Requires=myapp.socket
After=myapp.socket

[Service]
ExecStart=/usr/bin/node /srv/myapp/current/server.js
WorkingDirectory=/srv/myapp/current
User=myapp
Group=myapp
Restart=on-failure

# systemd passes the socket as fd 3 and sets LISTEN_FDS/LISTEN_PID;
# the application must listen on that fd instead of opening its own port
Environment=NODE_ENV=production

[Install]
WantedBy=multi-user.target
```
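With both files in place, systemd still has to be told about them. A quick sketch, assuming the unit names above (the socket starts the service on demand when the first connection arrives):

```shell
sudo systemctl daemon-reload
sudo systemctl enable --now myapp.socket   # systemd now listens on 127.0.0.1:3000
sudo systemctl status myapp.socket         # should report "active (listening)"

# Optional: start the service immediately instead of on first connection
sudo systemctl start myapp.service
```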

Step 2: Nginx Upstream Configuration

```nginx
# /etc/nginx/conf.d/myapp.conf
upstream myapp {
    server 127.0.0.1:3000;
    keepalive 32;
}

server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://myapp;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        # Retry on connection errors and timeouts during the switchover;
        # the kernel's socket queue does the actual holding of connections
        proxy_next_upstream error timeout;
        proxy_connect_timeout 5s;
        proxy_read_timeout 60s;
    }
}
```

Step 3: The Deployment Script

```bash
#!/bin/bash
# deploy.sh — zero-downtime deployment
set -euo pipefail

APP_DIR=/srv/myapp
RELEASE=$(date +%Y%m%d%H%M%S)
RELEASE_DIR="$APP_DIR/releases/$RELEASE"

echo "==> Creating release directory: $RELEASE_DIR"
mkdir -p "$RELEASE_DIR"

echo "==> Pulling latest code"
git clone --depth 1 git@github.com:yourorg/myapp.git "$RELEASE_DIR"

echo "==> Installing dependencies"
cd "$RELEASE_DIR"
npm ci --production

echo "==> Running database migrations"
npm run migrate

echo "==> Switching symlink"
ln -sfn "$RELEASE_DIR" "$APP_DIR/current"

echo "==> Restarting application (socket remains open)"
# systemd keeps the listening socket alive while the service restarts
systemctl restart myapp.service

echo "==> Verifying health check"
sleep 2
curl -sf http://127.0.0.1:3000/health || { echo "Health check failed!"; exit 1; }

echo "==> Cleaning up old releases (keeping last 5)"
# xargs -r: do nothing if there are five or fewer releases
ls -dt "$APP_DIR/releases"/* | tail -n +6 | xargs -r rm -rf

echo "==> Deployment complete: $RELEASE"
```
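The cleanup pipeline at the end is easy to sanity-check in isolation. A small demo in a scratch directory with fabricated timestamps, showing that only the five most recently modified release directories survive:

```shell
# Create seven fake releases with increasing mtimes (GNU touch -d @epoch)
demo=$(mktemp -d)
mkdir -p "$demo/releases"
for i in 1 2 3 4 5 6 7; do
  mkdir "$demo/releases/r$i"
  touch -d "@$((1700000000 + i))" "$demo/releases/r$i"
done

# Same pipeline as deploy.sh: list newest first, drop everything after the 5th
ls -dt "$demo/releases"/* | tail -n +6 | xargs -r rm -rf

ls "$demo/releases"   # r3 r4 r5 r6 r7 remain
```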
Tip:

The key is ln -sfn, which replaces the current symlink in a single command. (Strictly speaking it unlinks and re-creates the link, leaving a microscopic window with no current; a rename-based swap closes even that, as shown next.) From the moment the symlink changes, any process started by systemd serves the new code, while requests already in flight finish on the old process — provided your application shuts down gracefully on SIGTERM.
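If you want the swap to be atomic in the strict rename(2) sense, build the new link under a temporary name and rename it into place. A sketch in a scratch directory (hypothetical paths):

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/releases/old" "$workdir/releases/new"
ln -s "$workdir/releases/old" "$workdir/current"

# Create the replacement link under a temp name, then rename(2) it over
# "current" — mv -T replaces the symlink itself instead of descending
# into the directory it points to
ln -s "$workdir/releases/new" "$workdir/current.tmp"
mv -T "$workdir/current.tmp" "$workdir/current"

readlink "$workdir/current"   # now points at releases/new
```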

Step 4: Verify Zero Downtime

```bash
# Run this in a second terminal while deploying;
# it will report immediately if any request fails
while true; do
  STATUS=$(curl -s -o /dev/null -w "%{http_code}" http://example.com/health)
  echo "$(date +%H:%M:%S) — HTTP $STATUS"
  sleep 0.2
done
```

You should see a continuous stream of "HTTP 200" lines — including through the deployment. If you see a 502 or 503, check your proxy_next_upstream settings and ensure your application starts accepting connections quickly (within the proxy_connect_timeout window).

When to Graduate to Kubernetes

This approach works well for teams running a single server or a small handful, each hosting one application. When you need multi-node deployments, automatic horizontal scaling, or a service mesh, Kubernetes earns its complexity cost. Until then, Nginx + systemd is simpler, faster to debug, and requires no cluster to maintain.