← Horizontal Scaling · intermediate · 8 min · 05 / 06 বাংলা

WebSockets & Shared State at Scale

How to handle persistent connections, pub/sub fan-out, and the Redis adapter that makes Socket.io work across instances.

WebSocketsSocket.ioRedis adapterpub/substicky sessions

Real-World Analogy

A walkie-talkie network: each device (instance) has its own radio, but to broadcast to all devices, you need a repeater (Redis) that relays your signal to every radio on the network. Without the repeater, your message only reaches devices within direct range of yours.

The Problem with WebSockets and Multiple Instances

HTTP is stateless — each request is independent. WebSockets are stateful — a connection persists on a specific instance.

With one app server, every WebSocket message from any client goes to the right place. With multiple instances, a message sent to instance A cannot natively reach clients connected to instance B.

Instance A:  [user-1, user-3, user-5 connected]
Instance B:  [user-2, user-4, user-6 connected]

user-1 sends message → arrives at Instance A
Instance A wants to broadcast to all users in user-1's room
→ Instance A knows about user-3 and user-5 (connected to it)
→ Instance A does NOT know about user-2, user-4, user-6
→ They miss the message

Socket.io with Redis Adapter

The Redis adapter uses Redis Pub/Sub to relay events across all instances:

import { createServer } from 'http';
import { Server } from 'socket.io';
import { createAdapter } from '@socket.io/redis-adapter';
import { createClient } from 'redis';

const httpServer = createServer(app);
const io = new Server(httpServer, {
	cors: { origin: 'https://myapp.com' }
});

// Two Redis clients: one for publishing, one for subscribing
const pubClient = createClient({ url: process.env.REDIS_URL });
const subClient = pubClient.duplicate();

await Promise.all([pubClient.connect(), subClient.connect()]);

// Wire up the adapter — now io.to().emit() works across all instances
io.adapter(createAdapter(pubClient, subClient));

// This now fans out to ALL instances, not just this one
io.to('room-123').emit('message', { text: 'Hello everyone' });

What the Redis adapter does:

Instance A emits to room-123
  → publishes to Redis channel "socket.io#room-123#"
  → Instance B, C subscribed to that channel receive it
  → Instance B, C deliver to their local room-123 sockets

Architecture

Client 1  ──── WebSocket ──→  Instance A  ──→  Redis Pub/Sub
Client 2  ──── WebSocket ──→  Instance B  ──→  Redis Pub/Sub
Client 3  ──── WebSocket ──→  Instance A      ↑
                                               │
                              Instance B ──────┘ (subscribes, relays to Client 2)

Every instance subscribes to Redis. When any instance publishes a room event, all instances receive it and deliver to their locally-connected clients.

Sticky Sessions as a Temporary Measure

Socket.io requires the HTTP upgrade handshake and subsequent WebSocket frames to hit the same instance. Without sticky sessions, the handshake might go to instance A, but the first WebSocket frame hits instance B (which has no record of the handshake) and fails.

upstream socketio {
    ip_hash;   # route same client IP to same instance
    server instance-1:3000;
    server instance-2:3000;
}

Sticky sessions are acceptable here — unlike application state, WebSocket connections naturally “belong” to one instance. The problem is failure: if an instance dies, its clients disconnect and reconnect (to another instance). This is expected behavior for WebSockets, not a data integrity issue.

AWS ALB sticky sessions:

aws elbv2 modify-target-group-attributes \
  --target-group-arn arn:... \
  --attributes '[
    {"Key": "stickiness.enabled", "Value": "true"},
    {"Key": "stickiness.type", "Value": "lb_cookie"},
    {"Key": "stickiness.lb_cookie.duration_seconds", "Value": "86400"}
  ]'

Presence and Connection Registry

Track which users are currently connected across all instances:

// On connection: register in Redis
io.on('connection', async (socket) => {
	const userId = socket.handshake.auth.userId;

	// Mark user as online with TTL (auto-expires if instance crashes)
	await redis.setex(`presence:${userId}`, 30, socket.id);

	// Refresh TTL periodically to handle long connections
	const refreshInterval = setInterval(async () => {
		await redis.expire(`presence:${userId}`, 30);
	}, 10_000);

	socket.on('disconnect', async () => {
		clearInterval(refreshInterval);
		await redis.del(`presence:${userId}`);
		// Notify others this user went offline
		io.to(`friends-of-${userId}`).emit('user-offline', { userId });
	});
});

// Check if a user is online (from any instance)
async function isUserOnline(userId: string): Promise<boolean> {
	return (await redis.exists(`presence:${userId}`)) === 1;
}

// Get all online users
async function getOnlineUsers(userIds: string[]): Promise<string[]> {
	const keys = userIds.map((id) => `presence:${id}`);
	const results = await redis.mget(...keys);
	return userIds.filter((_, i) => results[i] !== null);
}

Scaling Limits

Each WebSocket connection consumes a file descriptor on the server. Linux default limit is 1024 per process — but this is easily raised:

# Check current limits
ulimit -n   # file descriptors per process

# Raise for the node process
ulimit -n 65536

# Or in /etc/security/limits.conf (permanent)
# * soft nofile 65536
# * hard nofile 65536

# Kernel-level socket backlog
sysctl -w net.core.somaxconn=65535
sysctl -w net.ipv4.tcp_max_syn_backlog=65535

Practical limits per instance with Node.js:

~10,000 concurrent WebSocket connections (comfortable)
~50,000 with tuning
Beyond that: scale out (add more instances + Redis adapter handles fan-out)

When to Not Use WebSockets

WebSockets have real overhead. Consider cheaper alternatives:

Server-Sent Events (SSE): One-way push from server to client. Simpler, lower overhead, HTTP/2-compatible. Right for notifications, live feeds, dashboard updates.

app.get('/events', (req, res) => {
	res.setHeader('Content-Type', 'text/event-stream');
	res.setHeader('Cache-Control', 'no-cache');
	res.setHeader('Connection', 'keep-alive');

	const send = (data: unknown) => {
		res.write(`data: ${JSON.stringify(data)}\n\n`);
	};

	const sub = redis.subscribe('notifications', (message) => {
		const event = JSON.parse(message);
		if (event.userId === req.user.id) send(event);
	});

	req.on('close', () => sub.unsubscribe());
});

Long-polling: Client makes a request, server holds it open until there’s data, client immediately re-requests. Works anywhere HTTP works. Worse for high-frequency updates but simpler operationally.

Webhook push: Server pushes to a client-provided URL when events occur. Right for integrations, not user-facing realtime.

Use WebSockets when you need bidirectional communication (chat, collaborative editing, multiplayer games). For one-way server push, SSE is usually simpler and sufficient.