Nginx Reverse Proxy Complete Guide: Upstream, Buffering, and Timeouts
3 AM. Phone vibrating like crazy—production alert.
Checking the logs: all 504 Gateway Timeout errors. The backend service hadn't crashed, but the Nginx timeout was set too short. A traffic spike hit, and requests got cut off before they finished processing. I stared at that proxy_read_timeout 60s line and thought: I literally pulled that number out of thin air.
That incident sent me down a week-long rabbit hole understanding three core Nginx reverse proxy modules: upstream load balancing, proxy buffer configuration, and timeout settings. Honestly, when these three are configured right, your reverse proxy can handle 10x traffic. When they’re wrong, you get 3 AM alerts like I did.
This article captures all the pitfalls I hit, debugging lessons learned, and the principles I finally grokked. If you’re doing backend, DevOps, or just want to understand what those Nginx config parameters actually mean, this should save you some time.
Upstream Load Balancing: More Than “Distributing Requests”
Let’s Start with Basic Syntax
The upstream block is Nginx load balancing’s core. You’ve probably seen this:
upstream backend {
server 192.168.1.10:8080;
server 192.168.1.11:8080;
server 192.168.1.12:8080;
}
server {
location / {
proxy_pass http://backend;
}
}
Looks simple—define backend servers, proxy_pass to them. But honestly, that’s not enough for production. Real environments need more: what if a server crashes? Can beefier machines get more traffic? Should we keep connections alive?
Four Load Balancing Algorithms, Each for Different Scenarios
Nginx defaults to round-robin—distribute in order. Fair, but not smart.
If your backend handles long connections—WebSocket, database connection pools—round-robin might suddenly overload certain servers. Least connections (least_conn) is better here:
upstream backend {
least_conn;
server 192.168.1.10:8080;
server 192.168.1.11:8080;
}
It tracks active connections per server, sending new requests to the least busy one. I had a WebSocket project pushing real-time messages—round-robin caused one server’s memory to explode. Switching to least_conn balanced the load properly.
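The selection logic is easy to picture in a few lines of Python. This is a sketch of the idea only; Nginx's real least_conn also factors in server weights:

```python
def pick_least_conn(active: dict) -> str:
    """Pick the backend with the fewest active connections (ties: first wins)."""
    return min(active, key=active.get)

# One server is bogged down with long-lived connections; new requests avoid it.
print(pick_least_conn({"192.168.1.10:8080": 12, "192.168.1.11:8080": 3}))
# -> 192.168.1.11:8080
```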
Another scenario you might know: user logs in, subsequent requests must hit the same server (session stored locally). IP Hash handles this:
upstream backend {
ip_hash;
server 192.168.1.10:8080;
server 192.168.1.11:8080;
}
Same client IP gets hashed to a fixed backend. But honestly, this has flaws—if that server dies, the session is gone. Better approach is Redis for sessions, using ip_hash as a temporary fix.
Fourth is consistent hashing (hash), common for distributed caching:
upstream backend {
hash $request_uri consistent;
server 192.168.1.10:8080;
server 192.168.1.11:8080;
}
Nginx creates 160 virtual nodes per weight unit, hashing request URIs to specific servers. Benefit: high cache hit rate—same URI always hits the same machine.
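The mechanism behind virtual nodes can be sketched in Python. This mirrors the idea, not Nginx's exact hashing (the class name, md5 choice, and key format here are my own for illustration):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal ketama-style ring: each server gets points * weight virtual nodes."""

    def __init__(self, servers: dict, points: int = 160):
        self.ring = []  # sorted list of (hash, server) pairs
        for server, weight in servers.items():
            for i in range(points * weight):
                h = int(hashlib.md5(f"{server}-{i}".encode()).hexdigest(), 16)
                self.ring.append((h, server))
        self.ring.sort()
        self.keys = [h for h, _ in self.ring]

    def lookup(self, key: str) -> str:
        """Hash the key and walk clockwise to the next virtual node."""
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        i = bisect.bisect(self.keys, h) % len(self.ring)  # wrap around the ring
        return self.ring[i][1]

ring = ConsistentHashRing({"192.168.1.10:8080": 1, "192.168.1.11:8080": 1})
# Same URI always maps to the same server, which is what keeps cache hit rates high:
assert ring.lookup("/api/users/42") == ring.lookup("/api/users/42")
```

The payoff of the virtual nodes: when a server is added or removed, only the keys that hashed to its slice of the ring move, instead of nearly everything reshuffling as with a plain modulo hash.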
Weight Configuration: When Machines Have Different Specs
Backend servers with different specs are common. Some have 32GB RAM and 8 CPU cores; others 16GB and 4 cores. Fair round-robin? That wastes the beefier machines.
upstream backend {
server 192.168.1.10:8080 weight=3;
server 192.168.1.11:8080 weight=2;
server 192.168.1.12:8080 weight=1;
}
weight=3 gets triple the requests. Better machines do more work, weaker ones do less—that makes sense.
There’s also backup, standby server:
upstream backend {
server 192.168.1.10:8080;
server 192.168.1.11:8080;
server 192.168.1.12:8080 backup;
}
backup doesn’t participate normally, only activates when the primaries fail. Like a bench player—only plays when starters are out.
Keepalive Connection Pool: The Secret to Doubling Performance
This one gets overlooked. Nginx default behavior: create a new TCP connection to backend per request, close after response. Sounds fine? It’s not.
TCP connection needs three-way handshake to establish, four-way handshake to close. High concurrency, this overhead is brutal. Keepalive connection pool reuses connections, eliminating this cost.
Example config:
upstream backend {
server 192.168.1.10:8080;
keepalive 32; # Each worker keeps 32 idle connections
}
server {
location / {
proxy_pass http://backend;
proxy_http_version 1.1;
proxy_set_header Connection "";
}
}
Two things to note:
- keepalive 32 sets the max idle connections per worker process
- You must set proxy_http_version 1.1 and Connection "" together; HTTP/1.0 doesn't support persistent connections
I tested an API service—without keepalive, QPS around 2000. With it, 4000+. Doubling isn’t hype, it’s real.
But don’t set keepalive too high. I once set it to 100 in test environment with only 1 ECS container backend—it got crushed by connection count. Production formula:
keepalive ≈ Total QPS × Avg response time ÷ Worker process count
Say you expect QPS 10000, avg response 50ms, 4 workers:
10000 × 0.05 ÷ 4 = 125
keepalive around 125 makes sense.
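The formula is just Little's law (concurrent connections = arrival rate × service time) split across workers. A throwaway helper to sanity-check your own numbers (keepalive_size is my name, not an Nginx concept):

```python
import math

def keepalive_size(qps: float, avg_response_s: float, workers: int) -> int:
    """Rule-of-thumb idle pool size per worker: expected concurrent upstream
    connections (QPS x average response time) divided across workers."""
    return math.ceil(qps * avg_response_s / workers)

print(keepalive_size(10000, 0.05, 4))  # -> 125
```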
Health Checks: Auto-Remove Dead Servers
Nginx open source only has passive health checks—marks unhealthy after request fails:
upstream backend {
server 192.168.1.10:8080;
server 192.168.1.11:8080;
}
server {
location / {
proxy_pass http://backend;
proxy_next_upstream error timeout http_502 http_503 http_504;
proxy_next_upstream_tries 3;
}
}
proxy_next_upstream defines when to retry next server: connection error, timeout, or 502/503/504. proxy_next_upstream_tries 3 means max 3 attempts.
But passive checks have delay—you only discover a dead server after a request fails. If availability matters, NGINX Plus active health checks are better:
upstream backend {
server 192.168.1.10:8080;
server 192.168.1.11:8080;
}
server {
location / {
proxy_pass http://backend;
health_check interval=5s fails=3 passes=2;
}
}
Every 5 seconds, actively send health check requests. 3 consecutive failures marks unhealthy, 2 consecutive successes restores it.
Proxy Buffering: Helper or Troublemaker
What Buffering Actually Does
Concept: Nginx doesn’t send backend response directly to client—it buffers first.
Why? Client network speed is unpredictable. Backend might output data fast, but if client is slow, Nginx gets stuck waiting. With buffer, Nginx stores response once, then slowly sends to client—backend doesn’t wait, can handle next request sooner.
But buffering costs: memory consumption. Large responses, high concurrency—memory usage gets real.
Three Core Parameters, Understanding Their Relationships
proxy_buffer_size 4k;
proxy_buffers 8 32k;
proxy_busy_buffers_size 64k;
These three confused me at first—similar names, tangled meanings. Had to draw a diagram:
- proxy_buffer_size: buffer for the response headers, one per request
- proxy_buffers: buffer array for the response body, in the format "count size_each"
- proxy_busy_buffers_size: buffers currently being sent to the client, shouldn't exceed half of the total buffer size
Example: with proxy_buffers 8 32k, the total is 8 × 32k = 256k. proxy_busy_buffers_size 64k is a quarter of that, comfortably within the rule.
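A quick way to sanity-check your own numbers before reloading Nginx (check_proxy_buffers is a throwaway helper, not an Nginx API):

```python
def check_proxy_buffers(count: int, size_kb: int, busy_kb: int) -> bool:
    """Sanity-check: busy_buffers should stay at or under half the total pool."""
    total_kb = count * size_kb
    return busy_kb <= total_kb / 2

print(check_proxy_buffers(8, 32, 64))   # proxy_buffers 8 32k, busy 64k -> True
print(check_proxy_buffers(8, 32, 256))  # busy equals the whole pool   -> False
```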
When to adjust these?
If backend response headers are huge (lots of cookies), you might see “upstream sent too big header”. Fix: increase proxy_buffer_size:
proxy_buffer_size 16k;
If response bodies are often large (big JSON payloads), increase buffers:
proxy_buffers 16 64k;
Special Cases: Disable Buffering
Sometimes buffering causes problems.
Server-Sent Events (SSE): backend continuously pushes event stream. If Nginx buffers, client gets delayed messages. Config needs buffering off:
location /events {
proxy_pass http://backend;
proxy_buffering off;
proxy_cache off;
proxy_read_timeout 86400s;
}
proxy_read_timeout 86400s (a day) because SSE is long-lived, can’t timeout.
WebSocket: Similar, bidirectional real-time:
location /ws {
proxy_pass http://backend;
proxy_buffering off;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_read_timeout 86400s;
}
Large file upload: Client uploads 1GB, if Nginx buffers everything before forwarding, memory explodes. Disable request buffering:
location /upload {
proxy_pass http://backend;
proxy_request_buffering off;
client_max_body_size 1G;
}
proxy_request_buffering off makes Nginx stream directly—receive and forward simultaneously.
Timeout: The Logic Behind Config Values
Three Timeout Parameters, Each with Its Own Job
proxy_connect_timeout 10s;
proxy_read_timeout 60s;
proxy_send_timeout 60s;
Names look similar? They have distinct roles:
- proxy_connect_timeout: time Nginx waits to establish the TCP connection. If the backend is slow to accept (network congestion, firewall), exceeding this aborts.
- proxy_read_timeout: after the connection is established, time Nginx waits for backend data. The interval between two read operations exceeding this is a timeout.
- proxy_send_timeout: time limit for Nginx sending the request body to the backend.
Common confusion: proxy_read_timeout isn’t total timeout—it’s interval between reads. If backend takes 5 minutes but sends heartbeats during processing, proxy_read_timeout 60s works. If backend is silent for 5 minutes, need proxy_read_timeout 300s.
Timeout vs 502/504 Relationship
The 3 AM alert taught me one crucial lesson:
- 502 Bad Gateway: Nginx got an error from the backend: connection refused, connection reset, or an invalid response (service down, port not listening)
- 504 Gateway Timeout: a timeout expired, either while connecting to the backend or while waiting for its response
Example: with proxy_connect_timeout 10s, if the backend takes 15s to accept the connection, Nginx gives up at the 10-second mark and returns 504 (a connect timeout is still a timeout; it's connection refused that produces a 502). And if the connection establishes fast but the backend processes for 2 minutes before responding, proxy_read_timeout 60s gets you a 504 as well.
Timeout Strategies for Different Scenarios
API services: 30-60 seconds usually enough. APIs should respond fast—short timeout catches slow requests:
proxy_connect_timeout 5s;
proxy_read_timeout 30s;
proxy_send_timeout 30s;
File processing: Export reports, generate PDFs might take minutes. Relax timeout:
proxy_connect_timeout 10s;
proxy_read_timeout 300s;
proxy_send_timeout 300s;
Streaming services: Video, WebSocket, SSE—long connections, a day is normal:
proxy_read_timeout 86400s;
502/504 Troubleshooting in Practice
Root Cause Analysis
Cases I’ve encountered:
- Backend actually crashed: process died, port occupied, OOM
- Connection exhaustion: backend connection pool full, Nginx can’t connect
- Timeout too short: like my 3 AM incident—proxy_read_timeout 60s, backend needed 2 minutes
- Firewall/network issues: security group rules missing, iptables blocking
Log Diagnosis Method
First step: always check error_log:
error_log /var/log/nginx/error.log warn;
Common errors:
upstream timed out (110: Connection timed out) while reading response header from upstream
That’s 504, read timeout.
connect() failed (111: Connection refused) while connecting to upstream
That’s 502, connection refused—backend not listening.
Advanced technique: custom log format showing upstream status:
log_format upstream_status '$status $upstream_status $upstream_response_time';
access_log /var/log/nginx/access.log upstream_status;
You'll see output like 200 502, 504, 200 0.03, 60.002, 0.21: the request was retried across three backends; the first returned 502, the second timed out (504), the third answered 200, and the three numbers are the per-attempt response times.
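If you need to eyeball these logs in bulk, a small parser helps. This is a sketch under one assumption (the three-field format above; parse_upstream_log is my own name). It relies on Nginx joining per-attempt values with ", " when a request is retried:

```python
def parse_upstream_log(line: str):
    """Parse a '$status $upstream_status $upstream_response_time' log line.
    Retried requests produce comma-joined lists for the last two fields."""
    status, rest = line.split(" ", 1)
    parts = rest.split(", ")
    statuses, times = [], []
    for p in parts:
        if " " in p:  # boundary element: the last status and first time share it
            s, t = p.split(" ", 1)
            statuses.append(s)
            times.append(t)
        elif not times:
            statuses.append(p)
        else:
            times.append(p)
    return int(status), statuses, [float(t) for t in times]

print(parse_upstream_log("200 502, 504, 200 0.03, 60.002, 0.21"))
# -> (200, ['502', '504', '200'], [0.03, 60.002, 0.21])
```

From there it's a one-liner to filter for requests whose slowest attempt exceeded your timeout budget.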
Typical Solutions
Scenario 1: Backend slow, frequent 504
Fix: increase proxy_read_timeout, verify backend can actually finish. Don’t just tune Nginx—backend timeout needs to match.
Scenario 2: Connection refused, 502
Fix: check if backend process runs, port listens, firewall allows.
netstat -tlnp | grep 8080
ps aux | grep your_app
Scenario 3: High concurrency, connection exhaustion
Fix: increase backend connection pool limit, or enable Nginx upstream keepalive to reduce connection creation overhead.
Performance Optimization Best Practices
Worker Configuration
Nginx is multi-process. worker_processes sets count, typically equals CPU cores:
worker_processes auto;
auto detects CPU cores. 8-core machine = 8 worker processes.
worker_connections is max connections per worker:
events {
worker_connections 4096;
}
Theoretical max concurrent connections = worker_processes × worker_connections. 8 cores × 4096 = 32768. But actual value depends on system file descriptor limits.
TCP Optimization Trio
sendfile on;
tcp_nopush on;
tcp_nodelay on;
These three combined significantly boost performance:
- sendfile on: kernel-level file transfer, bypasses user-space buffers
- tcp_nopush on: with sendfile, batches packets instead of sending one-by-one
- tcp_nodelay on: small packets sent immediately, no waiting for the buffer to fill
I tested static file serving—enabling these three boosted throughput 30%+.
Other Optimizations
gzip compression: compress text responses, saves bandwidth:
gzip on;
gzip_types text/plain text/css application/json application/javascript;
gzip_min_length 1024;
File descriptor limits: high concurrency might exhaust. Check system limit:
ulimit -n
If only 1024, increase it. Edit /etc/security/limits.conf:
* soft nofile 65535
* hard nofile 65535
Complete Configuration Example
Production-ready config template:
# Basic config
worker_processes auto;
events {
worker_connections 4096;
multi_accept on;
}
http {
# TCP optimization
sendfile on;
tcp_nopush on;
tcp_nodelay on;
# Keepalive
keepalive_timeout 30;
keepalive_requests 100;
# Buffer config
proxy_buffering on;
proxy_buffer_size 4k;
proxy_buffers 8 32k;
proxy_busy_buffers_size 64k;
# Timeout config
proxy_connect_timeout 10s;
proxy_read_timeout 60s;
proxy_send_timeout 60s;
# gzip
gzip on;
gzip_types text/plain text/css application/json;
upstream backend {
least_conn;
server 192.168.1.10:8080 weight=3;
server 192.168.1.11:8080 weight=2;
server 192.168.1.12:8080 backup;
keepalive 32;
}
server {
listen 80;
location / {
proxy_pass http://backend;
proxy_http_version 1.1;
proxy_set_header Connection "";
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_next_upstream error timeout http_502 http_503 http_504;
proxy_next_upstream_tries 3;
}
# SSE dedicated config
location /events {
proxy_pass http://backend;
proxy_buffering off;
proxy_read_timeout 86400s;
}
}
}
Summary
After all this, three key points:
- Upstream config: pick right load balancing algorithm, enable keepalive pool, configure health checks
- Buffer config: understand three parameters’ relationships, disable for special cases
- Timeout config: understand what each parameter controls, use different strategies per scenario
That 3 AM incident taught me: Nginx config isn’t just filling in parameters. Each has design logic behind it—understanding principles prevents pitfalls.
If you’re new to Nginx, start with defaults, adjust when issues arise—don’t randomly write proxy_read_timeout 60s for production like I did. If you’ve already hit pitfalls, this article should help organize scattered experience into a system.
Finishing this article, I checked my current production config—keepalive 32, proxy_read_timeout 120s, least_conn load balancing. No more 3 AM alerts.
FAQ
Is proxy_read_timeout total timeout or interval between reads?
It's the interval between two successive read operations, not a total. A backend that keeps sending data (or heartbeats) can take longer overall, as long as no single silent gap exceeds the timeout.
When should proxy_buffering be disabled?
• Server-Sent Events (SSE): real-time push, buffering causes message delay
• WebSocket: bidirectional real-time, needs streaming
• Large file upload: avoid memory explosion, receive and forward simultaneously (proxy_request_buffering off)
What value should keepalive be set to?
A reasonable starting point is keepalive ≈ total QPS × avg response time ÷ worker process count. For 10000 QPS, 50ms responses, and 4 workers, that works out to about 125.
What's the difference between 502 and 504?
502 Bad Gateway means the backend returned an error or refused the connection; 504 Gateway Timeout means a timeout expired while connecting to or waiting for the backend.
Which load balancing algorithm to choose?
• Round-robin (default): stateless services, fair distribution
• least_conn: long connection scenarios (WebSocket, DB pools)
• ip_hash: session persistence needed (temporary fix, Redis is better)
• hash: distributed caching, improves hit rate
How to fix upstream sent too big header?
Increase proxy_buffer_size (for example to 16k); that buffer holds the response headers, and large cookies overflow the default.
Why do sendfile + tcp_nopush + tcp_nodelay boost performance?
sendfile copies files inside the kernel, tcp_nopush batches packets into full frames, and tcp_nodelay flushes small packets immediately; together they cut syscall and network overhead.
9 min read · Published on: Mar 30, 2026 · Modified on: Mar 30, 2026