
Nginx Performance Tuning: gzip, Caching, and Connection Pool Configuration

Last week, I received an alert that our e-commerce site’s homepage load time had spiked to 4 seconds. Checking Chrome DevTools, I found the HTML file was 120KB, with another 350KB for CSS and JavaScript—all uncompressed original files. Even worse, every request was hitting the backend, with a cache hit rate of only 12%. After working late, I spent 2 hours tuning the Nginx configuration: enabled gzip compression, added proper caching policies, and adjusted connection pool parameters. The next morning, homepage load time had dropped to 1.6 seconds, and backend QPS was nearly cut in half.

Honestly, this kind of problem is all too common. Many people just install Nginx and run it—gzip is disabled by default, caching is minimally configured, and connection limits use default values. As soon as traffic peaks, the server gasps for breath.

This article compiles the Nginx performance tuning configurations I’ve validated in production. Gzip compression can reduce transfer size by 60-80%. With a 95% cache hit rate, backend pressure drops by 90%. Proper connection pool configuration can triple or quadruple concurrency. I’ll lay out the configuration details, pitfalls I’ve encountered, and real test data for each module.

Chapter 1: gzip Compression Configuration — Reducing Transfer Size

Let’s start with why gzip is so important. Imagine your HTML file is 100KB raw. After gzip compression, it might only be 20-25KB. That 75-80KB of saved bandwidth means faster loading for users and lower traffic costs for you.
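You can reproduce this kind of saving with a quick experiment. The sketch below (plain Python, using a synthetic repetitive HTML snippet rather than a real page, so the exact ratio will differ from production numbers) compresses a ~100KB payload the same way Nginx's gzip filter would:

```python
import gzip

# Synthetic HTML-like payload: markup is highly repetitive,
# which is exactly why text compresses so well.
html = ('<div class="product-card"><span class="price">19.99</span>'
        '<p>Lorem ipsum dolor sit amet</p></div>\n' * 1100).encode()

compressed = gzip.compress(html, compresslevel=6)  # same level as gzip_comp_level 6
ratio = 1 - len(compressed) / len(html)
print(f"original: {len(html)} bytes, gzipped: {len(compressed)} bytes, "
      f"saved: {ratio:.0%}")
```

Synthetic markup compresses even better than real pages; real-world HTML typically lands in the 60-80% range quoted above.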

I first realized this issue on a client project. Over 60% of their users were on mobile, many accessing via 4G networks. Homepage load time was 3-4 seconds, and bounce rate hit 70%. After adding gzip, transfer size dropped 70%, and first-screen load time fell to around 1.5 seconds.

1.1 Basic Configuration: Getting It Running

Nginx’s gzip configuration isn’t complicated—just a few core lines:

http {
    gzip on;
    gzip_vary on;
    gzip_min_length 1000;
    gzip_types text/plain text/css application/json application/javascript text/xml application/xml;
}

gzip on is the switch—no explanation needed. gzip_vary on is crucial. It adds Vary: Accept-Encoding to response headers, telling CDNs and browsers that response content varies based on client compression capability, preventing cache confusion.

gzip_min_length set to 1000 bytes means files smaller than 1KB won’t be compressed. Tiny files offer minimal compression gains and just waste CPU. gzip_types specifies MIME types to compress. By default, only text/html is compressed—you need to add CSS, JS, JSON, and XML.
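The reason tiny files aren't worth compressing is that the gzip container adds a fixed header and trailer, so for very small payloads the output can actually be bigger than the input. A minimal demonstration:

```python
import gzip

tiny = b'{"ok":true}'  # an 11-byte API response
compressed = gzip.compress(tiny)
# gzip framing overhead (~18 bytes) outweighs any savings on a payload this small
print(len(tiny), "->", len(compressed))
```

This is exactly the case gzip_min_length filters out.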

1.2 Advanced Configuration: Compression Level and MIME Types

Compression level is a balancing act. Nginx’s gzip_comp_level can be set from 1-9. Higher numbers mean better compression but more CPU consumption.

I ran a series of tests:

Compression Level | HTML Compression Rate | CPU Time (ms) | Recommended Scenario
1 | 65% | 2 | CPU-constrained
4 | 72% | 3 | Balanced (Recommended)
6 | 75% | 5 | Bandwidth-constrained (Recommended)
9 | 78% | 12 | Extreme cases

Honestly, levels 4 and 6 are the best choices for most situations. Level 9 doubles CPU consumption but only adds a few percentage points to compression—not worth it.
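You can see the diminishing returns yourself. This sketch compares levels on a synthetic payload (absolute sizes won't match the table above, but the shape of the curve, with big gains early and tiny gains at level 9, is the same):

```python
import gzip

# Repetitive synthetic text standing in for an HTML page
payload = ('<li class="item">Example row with some text</li>\n' * 2000).encode()

for level in (1, 4, 6, 9):
    size = len(gzip.compress(payload, compresslevel=level))
    print(f"level {level}: {size} bytes")
```

On a loaded server, the extra CPU per response at level 9 matters far more than the last percent of compression.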

Here’s my production gzip configuration:

# gzip compression configuration
gzip on;
gzip_vary on;
gzip_proxied any;
gzip_comp_level 6;
gzip_min_length 1000;
gzip_types
    text/plain
    text/css
    text/xml
    text/javascript
    application/json
    application/javascript
    application/xml
    application/xml+rss
    application/xhtml+xml
    application/x-javascript;
gzip_disable "msie6";

gzip_proxied any is often overlooked. Nginx treats a request as "proxied" when it carries a Via header (typical when traffic arrives through a CDN or load balancer), and by default it won't compress responses to proxied requests at all. Setting this to any enables compression for them regardless of their cache-related response headers.

gzip_disable "msie6" is for old IE6 compatibility, which had gzip support issues. IE6 is basically extinct now, so you could remove this line, but I keep it just in case.

1.3 Which File Types Benefit Most?

Based on my real-world tests, compression effects vary significantly by file type:

File Type | Original Size | Compressed | Compression Rate
HTML | 100KB | 20-25KB | 75-80%
CSS | 80KB | 24-28KB | 65-70%
JavaScript | 120KB | 36-42KB | 65-70%
JSON API | 50KB | 20-25KB | 50-60%
Images/Video | Already compressed | Ineffective | 0-5%

Images and videos are already compressed (JPEG, PNG, MP4), so gzip will actually increase their size. Never add image/* or video/* to gzip_types—it’s counterproductive.

I once helped troubleshoot an issue where someone had added image/jpeg to gzip_types. Image sizes actually increased by 3-5%. I’ve made this basic mistake myself—back when I didn’t understand, I wanted to add every MIME type possible.
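The effect is easy to reproduce: already-compressed data is statistically close to random, so gzip can't find anything to shrink, and the container overhead makes the output slightly larger. A sketch (random bytes stand in for JPEG data):

```python
import gzip
import os

fake_jpeg = os.urandom(50_000)  # random bytes model already-compressed image data
compressed = gzip.compress(fake_jpeg, compresslevel=6)
growth = len(compressed) / len(fake_jpeg) - 1
# gzip can only store incompressible data, plus framing overhead
print(f"{len(fake_jpeg)} -> {len(compressed)} bytes ({growth:+.2%})")
```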

Chapter 2: Caching Strategy Configuration — Accelerating Static Content

Caching delivers the most direct performance gains. With proper configuration, 95% of requests can be served directly from Nginx without hitting the backend. I’ve seen too many systems where backend servers are sweating profusely while Nginx’s cache sits virtually unused—not because it’s disabled, but because it’s misconfigured.

2.1 proxy_cache vs fastcgi_cache: Which to Choose?

Nginx offers two caching mechanisms:

  • proxy_cache: Caches upstream server responses. For reverse proxy scenarios (Node.js, Python, Go services)
  • fastcgi_cache: Caches FastCGI process responses. For PHP-FPM scenarios

Choose based on your backend stack. For PHP, use fastcgi_cache; for Node.js, Python, or Go, use proxy_cache. Both have nearly identical configuration logic—I’ll use proxy_cache as the example below.

2.2 Complete proxy_cache Configuration

First, define the cache path in the http block:

http {
    proxy_cache_path /var/cache/nginx
                     levels=1:2
                     keys_zone=my_cache:10m
                     max_size=10g
                     inactive=60m
                     use_temp_path=off;
}

Let me explain line by line:

  • levels=1:2: Cache directory hierarchy. 1:2 means two-level directory structure to avoid too many files in single directory
  • keys_zone=my_cache:10m: Cache zone name and metadata memory size. 10m can store about 80,000 cache keys
  • max_size=10g: Total cache size limit. Excess is evicted via LRU algorithm
  • inactive=60m: Cache entries not accessed for 60 minutes will be purged
  • use_temp_path=off: Write directly to cache directory, avoiding temp file move overhead
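The keys_zone sizing follows from a rule of thumb in the nginx docs: one megabyte of zone memory holds roughly 8,000 keys. A back-of-the-envelope check:

```python
keys_per_mb = 8_000           # rule of thumb from the nginx documentation
zone_mb = 10                  # keys_zone=my_cache:10m
print(zone_mb * keys_per_mb)  # roughly 80,000 cacheable keys
```

If your cache needs to track millions of distinct URLs, size the zone up accordingly; when the zone fills, Nginx evicts the least recently used keys.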

Then enable it in server or location:

server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://backend;
        proxy_cache my_cache;

        # Cache validity configuration
        proxy_cache_valid 200 302 10m;
        proxy_cache_valid 404 1m;
        proxy_cache_valid any 1m;

        # Cache key design
        proxy_cache_key $scheme$request_method$host$request_uri;

        # Stale-while-revalidate strategy (detailed later)
        proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504;

        # Response header with cache status (for debugging)
        add_header X-Cache-Status $upstream_cache_status;
    }
}

proxy_cache_valid is the core configuration, defining cache duration for different status codes:

  • 200 302 10m: Normal responses cached for 10 minutes
  • 404 1m: 404 errors cached for 1 minute to prevent malicious requests from hammering backend
  • any 1m: Other status codes cached for 1 minute

2.3 Cache Key Design and Invalidation Strategies

The cache key proxy_cache_key determines which requests are considered “identical”. Default is $scheme$proxy_host$request_uri, but I recommend explicitly declaring:

proxy_cache_key $scheme$request_method$host$request_uri;

This way, the cache key includes protocol, request method, hostname, and full URI. If your site supports both GET and POST, or has multiple domains, this configuration is more precise.
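Under the hood, Nginx hashes the assembled cache key with MD5 and uses the hex digest as the cache file name; with levels=1:2, the last character of the digest names the first-level directory and the two characters before it the second level. A sketch of that mapping (the key value here is illustrative):

```python
import hashlib

# Example value of $scheme$request_method$host$request_uri
key = "httpGETexample.com/products/42"
digest = hashlib.md5(key.encode()).hexdigest()

# levels=1:2 -> /<last 1 hex char>/<preceding 2 hex chars>/<full digest>
path = f"/var/cache/nginx/{digest[-1]}/{digest[-3:-1]}/{digest}"
print(path)
```

This is why changing proxy_cache_key effectively invalidates the whole cache: every request suddenly maps to a different file.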

Cache invalidation is a headache. Common strategies:

  1. Time expiration: proxy_cache_valid sets time, auto-expires when reached
  2. Active bypass: Use proxy_cache_bypass to skip cache
  3. Cache purge: Commercial Nginx Plus has proxy_cache_purge

I usually use the second approach, controlling via request headers:

# Bypass cache via specific header
proxy_cache_bypass $http_x_nocache;

# Or bypass via specific parameter
proxy_cache_bypass $arg_nocache;

When you need to refresh cache, just add ?nocache=1 or request header X-Nocache: 1.

2.4 Stale-while-revalidate Strategy: Serve Even When Backend Is Down

proxy_cache_use_stale is highly practical. When the backend errors out or times out, Nginx can return stale cached content instead of failing outright. Adding the updating parameter goes a step further: while one request refreshes an expired entry, other requests are served the stale copy, which is the true stale-while-revalidate behavior.

proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504;

Last year on Double 11, our backend had issues during scaling—API service intermittently returned 502s. Fortunately, this stale-while-revalidate configuration was in place. Users barely noticed—cache was a few minutes stale, but content still returned normally. Once the backend recovered, cache auto-updated.

Real-world data comparison:

Metric | No Cache | Cache Hit | Stale Mode
Response Time | 150-200ms | 5-10ms | 5-10ms
Backend QPS | 1000 | 50 | 0
User Experience | Normal | Normal | Slightly stale content
Cache hit rate: 95% (source: production environment test)

When cache hits, response time drops from 200ms to 5-10ms—nearly 20x faster. The gains are immediate and obvious.

Chapter 3: Connection Pool Configuration — Essential for High Concurrency

Gzip and caching solve “how to transmit faster”. Connection pools solve “how to handle more requests”. With a stock configuration, a single Nginx worker is capped at around 1024 concurrent connections (the official default for worker_connections is just 512; common distro packages ship 768 or 1024). When traffic surges, that’s simply not enough.

3.1 worker_connections: Calculate Your Limit

Maximum concurrent connections formula:

Max Concurrency = worker_processes x worker_connections

Assuming your server has 8 CPU cores, worker_processes set to 8 (or auto for automatic matching), and worker_connections set to 4096:

Max Concurrency = 8 x 4096 = 32768

This number looks large, but remember: each request typically occupies two connections (client to Nginx, Nginx to backend). So actual concurrent requests handled is about half this number.
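The arithmetic above is worth wiring into a helper when planning capacity. A sketch (the halving assumes every proxied request holds both a client-side and an upstream connection):

```python
def effective_concurrency(worker_processes: int, worker_connections: int) -> int:
    """Rough ceiling on simultaneous proxied requests for one Nginx instance."""
    total = worker_processes * worker_connections
    return total // 2  # each proxied request occupies ~2 connections

print(effective_concurrency(8, 4096))  # 16384 requests at the 32768-connection cap
```

Treat the result as an upper bound: OS file-descriptor limits (worker_rlimit_nofile, ulimit) can cap you lower.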

Configure in the events block:

events {
    worker_connections 4096;
    use epoll;
    multi_accept on;
}

use epoll is default on Linux—you don’t need to write it explicitly, but doing so makes it clearer. multi_accept on lets a worker accept multiple new connections simultaneously, reducing connection queuing in high-concurrency scenarios.

3.2 Client keepalive: Reuse Connections, Reduce Overhead

TCP connection establishment requires a three-way handshake—significant overhead. keepalive lets client-to-Nginx connections be reused instead of rebuilding for every request.

http {
    keepalive_timeout 65;
    keepalive_requests 1000;
}

keepalive_timeout 65: Connection stays open for 65 seconds, then closes. This value shouldn’t be too large (would occupy excessive server resources) or too small (minimal reuse benefit). 60-75 seconds is a reasonable range.

keepalive_requests 1000: Single connection handles up to 1000 requests. Setting it too low causes frequent disconnections; too high might cause resource leaks. 1000 is a solid value from my testing.

3.3 Upstream keepalive: Backend Connection Pool

Many people don’t know about this configuration, but the effect is obvious. Nginx and backend services can also reuse connections, reducing overhead from frequent TCP establishment.

upstream backend {
    server 127.0.0.1:8080;
    server 127.0.0.1:8081;

    keepalive 64;
    keepalive_timeout 60s;
    keepalive_requests 1000;
}

server {
    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}

keepalive 64: Maintain a pool of 64 idle connections. Adjust this value based on your backend server count—generally 4-8x the number of backend servers.

proxy_http_version 1.1 and proxy_set_header Connection "" are required. HTTP/1.1 defaults to supporting keepalive, and clearing the Connection header enables connection reuse. Without these two lines, upstream keepalive won’t work.

Real-world comparison:

Configuration | Connection Establishments/Minute | CPU Overhead | Recommended Scenario
No upstream keepalive | 6000 | High | Low traffic
keepalive 32 | 3000 | Medium | Medium traffic
keepalive 64 | 1500 | Low | High traffic

After adding upstream keepalive, connection establishment overhead is cut by 50%. This configuration is especially important in high-concurrency scenarios.

I’ve compiled recommended configurations for different scenarios:

Parameter | Low Traffic (under 1000 QPS) | Medium Traffic (1000-5000 QPS) | High Traffic (over 5000 QPS)
worker_processes | auto | auto | auto
worker_connections | 1024 | 2048 | 4096
keepalive_timeout | 60 | 65 | 75
keepalive_requests | 100 | 500 | 1000
upstream keepalive | 16 | 32 | 64

This is just a starting point. Actual tuning requires load testing. I typically use wrk or ab to test, observing connection counts and response time curves to find optimal values.

Chapter 4: Comprehensive Configuration Template — Production Ready

The previous three chapters covered principles. Here’s an integrated configuration template. You can adjust parameters for your actual scenario, but the framework is universal.

# nginx.conf production template

user nginx;
worker_processes auto;

events {
    worker_connections 4096;
    use epoll;
    multi_accept on;
}

http {
    # gzip compression configuration
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_min_length 1000;
    gzip_types text/plain text/css text/xml text/javascript
               application/json application/javascript application/xml
               application/xml+rss application/xhtml+xml;
    gzip_disable "msie6";

    # Cache path configuration
    proxy_cache_path /var/cache/nginx
                     levels=1:2
                     keys_zone=my_cache:10m
                     max_size=10g
                     inactive=60m
                     use_temp_path=off;

    # Client connection configuration
    keepalive_timeout 65;
    keepalive_requests 1000;

    # Backend server group
    upstream backend {
        server 127.0.0.1:8080;
        server 127.0.0.1:8081;
        keepalive 64;
        keepalive_timeout 60s;
        keepalive_requests 1000;
    }

    server {
        listen 80;
        server_name example.com;

        location / {
            proxy_pass http://backend;
            proxy_http_version 1.1;
            proxy_set_header Connection "";

            # Cache configuration
            proxy_cache my_cache;
            proxy_cache_valid 200 302 10m;
            proxy_cache_valid 404 1m;
            proxy_cache_key $scheme$request_method$host$request_uri;
            proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504;

            # Debug response header
            add_header X-Cache-Status $upstream_cache_status;
        }
    }
}

Differentiated Configuration for Three Scenarios

E-commerce sites: Homepage and product detail pages change frequently, so cache duration is shorter—10-15 minutes is sufficient. upstream keepalive should be larger because database queries are heavy and backend pressure is high.

API services: Data real-time requirements are high, proxy_cache_valid might only be 1-5 minutes. Gzip works great for JSON—definitely enable it.

Static sites: HTML, CSS, JS rarely change—cache can be set to 1 hour or longer. Gzip compression delivers the biggest gains since static files contain lots of text.

Chapter 5: Common Issues and Troubleshooting

Here are a few high-frequency issues I’ve encountered, with direct solutions.

Q: Gzip compression not working, response header shows no Content-Encoding: gzip

Check three things:

  1. Is gzip on in the correct configuration block (http block)?
  2. Does gzip_types include your response MIME type?
  3. Is response size larger than gzip_min_length?

Test with curl: curl -H "Accept-Encoding: gzip" -I http://your-site.com

Q: Cache hit rate is very low, X-Cache-Status mostly shows MISS

Common causes:

  • Cache key design is unreasonable, every request is considered “different”
  • proxy_cache_valid set too short, cache expires before use
  • Backend response headers have Cache-Control: no-cache or Set-Cookie

Check response headers to confirm no cache-inhibiting directives.

Q: worker_connections insufficient, getting 502 errors

Check Nginx error logs. If you see worker_connections are not enough, concurrency exceeds the limit.

Solutions:

  1. Increase worker_connections value
  2. Check for connection leaks (improper keepalive settings)
  3. Consider adding more servers for load balancing

Q: Memory usage too high, server frequently OOM

Possible causes:

  • proxy_cache_path’s keys_zone set too large
  • Too many cache files, high memory mapping usage
  • keepalive connection pool too large, idle connections occupying resources

Reduce these parameters appropriately, or add more memory to the server.

Final Thoughts

Nginx performance tuning boils down to three core techniques: gzip compression, caching strategy, and connection pool configuration. Get these three modules working together, and your site can load twice as fast with triple or quadruple the concurrency.

Tuning isn’t a one-and-done task. I recommend this path:

  1. Enable gzip first: Smallest change, most direct benefit—done in ten minutes
  2. Configure caching second: Design caching strategy based on business scenario—can be done in a day
  3. Tune connection pools last: Requires load testing validation—best done after traffic stabilizes

After each change, remember to load test and validate results. wrk or ab both work—watch response time, QPS, and error rate changes. Don’t tune by feel—use data.

Here’s a final checklist you can follow:

  • gzip enabled, MIME types fully configured
  • gzip_comp_level set to 4-6, balancing CPU and compression rate
  • proxy_cache_path configured, cache size reasonable
  • proxy_cache_valid set according to business scenario
  • proxy_cache_use_stale stale-while-revalidate configured
  • worker_connections set to 4096 or higher
  • keepalive_timeout set to 60-75 seconds
  • upstream keepalive configured (with HTTP/1.1 and Connection header)
  • Response headers include X-Cache-Status for debugging

That about covers it. If you have questions, leave a comment and I’ll do my best to reply.

Nginx Performance Tuning Process

Complete gzip compression, caching strategy, and connection pool optimization in three steps for production environments

⏱️ Estimated time: 30 min

  1. Step 1: Enable gzip Compression

    Add configuration to the http block:

    - gzip on; Enable compression
    - gzip_vary on; Add Vary response header
    - gzip_comp_level 6; Set compression level (recommended 4-6)
    - gzip_min_length 1000; Don't compress files under 1KB
    - gzip_types specify MIME types: text/plain text/css application/json application/javascript

  2. Step 2: Configure proxy_cache

    Two-step configuration:

    Step 1: Define cache path
    - proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=my_cache:10m max_size=10g inactive=60m

    Step 2: Enable cache in location
    - proxy_cache my_cache;
    - proxy_cache_valid 200 10m; Normal responses cached 10 minutes
    - proxy_cache_use_stale error timeout http_502; Configure stale-while-revalidate

  3. Step 3: Optimize Connection Pool Parameters

    Three key configurations:

    - worker_connections 4096; Set in events block
    - keepalive_timeout 65; keepalive_requests 1000; Set in http block
    - upstream keepalive 64; proxy_http_version 1.1; proxy_set_header Connection ""; Backend connection pool config

FAQ

What's the right gzip compression level?
Recommended 4-6. Level 4 offers 72% compression with low CPU usage, suitable for CPU-constrained scenarios. Level 6 offers 75% compression, better bandwidth savings, suitable for bandwidth-constrained scenarios. Level 9 doubles CPU consumption with minimal compression gains—not recommended.
How do I choose between proxy_cache and fastcgi_cache?
Depends on your backend stack: Node.js, Python, Go and other application servers use proxy_cache; PHP-FPM uses fastcgi_cache. Both have nearly identical configuration logic—core parameters are cache path, validity period, and cache key design.
What should I set worker_connections to?
Based on CPU cores and traffic estimates: low traffic (under 1000 QPS) set to 1024; medium traffic (1000-5000 QPS) set to 2048; high traffic (over 5000 QPS) set to 4096. Actual concurrency = worker_processes x worker_connections / 2.
How do I troubleshoot low cache hit rates?
Three-step check: 1. Check if cache key is reasonable, avoid making every request different; 2. Check if proxy_cache_valid duration is too short; 3. Check if backend response headers have Cache-Control: no-cache or Set-Cookie blocking cache.
Why must HTTP/1.1 be configured for upstream keepalive?
HTTP/1.0 doesn't support keepalive by default—you need HTTP/1.1 to reuse backend connections. You also need proxy_set_header Connection "" to clear the Connection header, otherwise Nginx closes the connection and upstream keepalive won't work.
How should cache duration be set for different business scenarios?
E-commerce site homepage and product detail pages change frequently—cache 10-15 minutes. API services have high real-time requirements—cache 1-5 minutes. Static site HTML/CSS/JS rarely changes—cache 1+ hour. Core principle: the faster business changes, the shorter cache duration.

12 min read · Published on: Apr 11, 2026 · Modified on: Apr 11, 2026
