FM-KED-007 — 502 Bad Gateway Error (Application Unreachable)

Severity: S1 — Critical
Recovery Class: B — Standard Recovery
Covered by Monthly Support: Yes (known causes only)


Description

Nginx returns a 502 Bad Gateway error when it cannot reach its upstream, the Django application backend.

In most observed cases, the cause is an out-of-memory (OOM) condition on the virtual machine hosting the application worker, which leads the kernel to terminate Gunicorn or an equivalent application process.


Typical Symptoms

  • HTTP 502 responses from Nginx
  • Application intermittently unavailable
  • Gunicorn workers restarting or disappearing
  • Kernel logs indicating OOM events

Primary Root Cause

  • Application requests producing excessive memory usage
  • Large datasets loaded into memory
  • Insufficient RAM on the worker virtual machine

When the worker VM runs out of memory, the kernel's OOM killer terminates the application process, leaving Nginx without a valid upstream.


Diagnostic Checklist

Confirm OOM Condition

dmesg | grep -i oom
journalctl -k | grep -i kill
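
To see which process was killed, rather than only that an OOM event occurred, the kernel log can be filtered further. The sketch below assumes the kernel writes a line containing "Killed process <pid> (<name>)", which is the wording recent Linux kernels use; check it against your own log output. The function name oom_victims is illustrative.

```shell
# Print "pid name" for each process the kernel OOM killer terminated.
# Reads kernel log text on stdin, so it works with either dmesg or journalctl -k.
# Assumption: the log contains lines of the form "... Killed process <pid> (<name>) ...".
oom_victims() {
    sed -n 's/.*Killed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}

# Usage (reading the kernel log may require root):
#   dmesg | oom_victims
#   journalctl -k --since "1 hour ago" | oom_victims
```

If gunicorn appears in the output, the 502 is very likely the OOM case described here. No output means no matching kill was logged in the inspected window.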

Check Available Memory

free -h
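
For a single number that is easy to compare between checks, the "available" figure can be expressed as a percentage of total memory. This is a minimal sketch that reads the MemTotal and MemAvailable fields of /proc/meminfo. The function name and the optional file argument are illustrative additions, not part of any existing tooling.

```shell
# Print MemAvailable as a whole-number percentage of MemTotal.
# The optional argument (default: /proc/meminfo) lets the function be pointed
# at a saved copy of the file.
mem_available_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", avail * 100 / total }' "${1:-/proc/meminfo}"
}

# Usage:
#   mem_available_pct
```

A consistently low value under normal load suggests the VM is undersized. A value that collapses only during specific requests points to request scope.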

Verify Application Server Status

systemctl status gunicorn
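
Beyond the service status, it can help to see how much memory the Gunicorn processes hold and whether systemd has been restarting the unit. The sketch below assumes the processes run under the command name gunicorn, which may differ if they are launched through a Python wrapper, and that the installed systemd version exposes the NRestarts property. The helper name sum_rss_mib is illustrative.

```shell
# Sum resident set sizes given in KiB (one value per line on stdin)
# and print the total in whole MiB.
sum_rss_mib() {
    awk '{ sum += $1 } END { printf "%d\n", sum / 1024 }'
}

# Usage:
#   ps -C gunicorn -o rss= | sum_rss_mib                 # total RSS of all gunicorn processes
#   ps -C gunicorn -o pid,rss,etime,args --sort=-rss     # per-process view, largest first
#   systemctl show gunicorn -p ActiveState -p NRestarts  # unit state and restart count
```

Short elapsed times (etime) on workers, or a rising restart count, are consistent with the "workers restarting or disappearing" symptom.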

Recovery Options

Apply one or more of the following, depending on constraints.


Option 1: Reduce Request Scope

  • Apply stricter filters to API requests
  • Limit requested date ranges
  • Reduce number of portfolios, instruments, or entities per request
  • Avoid bulk data retrieval in a single call

This reduces memory pressure at the application level.
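
One way to limit date ranges on the client side is to split a long period into shorter windows and issue one request per window. The sketch below only generates the windows. It assumes GNU date (the -d option), and the function name, the window length, and the example URL and query parameters in the usage comment are all hypothetical placeholders rather than the actual API.

```shell
# Print "start end" pairs covering the period from $1 up to $2 in steps of
# $3 days (default 7). Dates are YYYY-MM-DD. Requires GNU date.
date_windows() {
    start=$1
    end=$2
    step=${3:-7}
    while [ "$(date -u -d "$start" +%s)" -lt "$(date -u -d "$end" +%s)" ]; do
        next=$(date -u -d "$start +$step days" +%F)
        # Clamp the final window so it does not run past the requested end date.
        if [ "$(date -u -d "$next" +%s)" -gt "$(date -u -d "$end" +%s)" ]; then
            next=$end
        fi
        echo "$start $next"
        start=$next
    done
}

# Usage (hypothetical endpoint and parameter names):
#   date_windows 2024-01-01 2024-03-01 7 | while read -r from to; do
#       curl -fsS "https://example.invalid/api/report?date_from=$from&date_to=$to"
#   done
```

Each request then loads at most one window of data into memory on the application side, at the cost of more round trips.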


Option 2: Increase RAM on Worker Virtual Machine

  • Increase memory allocation on the worker VM
  • Restart application services after resizing
  • Verify stability under the load that previously triggered the failure

This addresses the issue at the infrastructure level.
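
After a resize, it is worth confirming that the guest operating system actually sees the new allocation before restarting services. This is a minimal sketch, assuming the unit is named gunicorn as in the diagnostic step. The function name and the threshold in the usage comment are illustrative. MemTotal is usually slightly below the configured VM size, so choose a threshold with some margin.

```shell
# Succeed (exit 0) if MemTotal is at least $1 MiB, fail (exit 1) otherwise.
# The optional second argument (default: /proc/meminfo) selects the file to read.
mem_total_at_least_mib() {
    awk -v want="$1" '
        BEGIN        { rc = 1 }                               # fail if MemTotal is missing
        /^MemTotal:/ { rc = ($2 / 1024 >= want + 0) ? 0 : 1 } # kB -> MiB
        END          { exit rc }' "${2:-/proc/meminfo}"
}

# Usage (example threshold for a VM resized to 16 GiB):
#   mem_total_at_least_mib 15000 && sudo systemctl restart gunicorn
#   systemctl is-active gunicorn
```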


Escalation and Unknown Issues

If the issue persists after:

  • request scope reduction, and
  • sufficient memory allocation

then the incident is classified as an unknown issue.

Such cases require investigation, profiling, or architectural analysis and are not covered by the standard monthly support allocation.


Preventive Notes

  • Avoid unbounded API queries
  • Monitor memory usage trends
  • Define safe defaults and limits at API level
  • Prefer asynchronous processing for heavy workloads

Responsibility Boundary

Finmars SCSA provides best-effort diagnostics and guidance for known memory-related causes.
Application design decisions and infrastructure capacity planning beyond documented scenarios require separate analysis.