Troubleshooting Load Balancer Backend Server Issues
Learn about backend server issues associated with load balancers.
Debugging a Backend Server Timeout
When a backend server takes too long to respond to a request, the load balancer returns a 504
error, which indicates that the backend server is either down or not responding to the
request forwarded by the load balancer. The client application receives the following
response code:
HTTP/1.1 504 Gateway Timeout
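As a minimal sketch of how a client-side script might recognize this response, the following helper extracts the status code from an HTTP status line. The function name `http_status_code` and the placeholder `load_balancer_address` are assumptions made for illustration; they are not part of any Oracle tooling.

```shell
# Hypothetical helper (illustration only): print the numeric status code
# from an HTTP status line such as "HTTP/1.1 504 Gateway Timeout".
http_status_code() {
  # Strip a trailing carriage return, then print the second field.
  printf '%s\n' "$1" | tr -d '\r' | awk 'NR == 1 { print $2 }'
}

# Offline demonstration using the status line quoted above.
code=$(http_status_code "HTTP/1.1 504 Gateway Timeout")
if [ "$code" = "504" ]; then
  echo "504: the load balancer timed out waiting for the backend"
fi

# Against a live listener (load_balancer_address is a placeholder):
#   http_status_code "$(curl -sI http://load_balancer_address/ | head -n 1)"
```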
Errors can occur for the following reasons:
-
The load balancer failed to establish a connection to the backend server before the connection timeout expired.
-
The load balancer established a connection to the backend server but the backend did not respond before the idle timeout period elapsed.
-
The security lists or network security groups for the subnet or the VNIC did not allow traffic between the load balancer and the backends.
-
The backend server or application server failed.
Follow these steps to troubleshoot the backend server timeout errors:
-
Use the curl utility to test the backend server directly from a host in the same network:
curl -i http://backend_ip_address
If this test takes longer than one second to respond, an application-level issue is causing latency. Oracle recommends that you check any upstream dependencies that might cause latency, including:
-
Network attached storage such as iSCSI or NFS
-
Database latency
-
An off-premises API
-
An application tier
-
Check the application by accessing it directly from the backend server. Check its access logs to determine if the application can be accessed and is functioning properly.
-
If the load balancer and the backend server are in different subnets, then check whether the security lists contain rules to allow traffic. If no rules exist, then traffic is not allowed.
-
Enter the following commands to determine whether firewall rules on the backend servers block traffic:
sudo iptables -L
lists all firewall rules enforced by iptables
sudo firewall-cmd --list-all
lists all firewall rules enforced by firewalld
-
Enable logging on the load balancer to determine whether the load balancer or the backend server is causing the latency.
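The one-second check in the first step can be sketched as follows. This is an illustration under stated assumptions: the function name `classify_latency` is hypothetical, the threshold defaults to the one second mentioned above, and the measurement relies on curl's `--write-out` variable `time_total`.

```shell
# Hypothetical helper (illustration only): compare a response time in
# seconds with a threshold. Prints "slow" when the time exceeds the
# threshold, otherwise "ok".
classify_latency() {
  # $1 = measured seconds (for example 0.842), $2 = threshold (default 1)
  awk -v t="$1" -v limit="${2:-1}" \
    'BEGIN { if (t + 0 > limit + 0) print "slow"; else print "ok" }'
}

# Offline demonstration with sample timings.
classify_latency 0.25     # prints "ok"
classify_latency 1.73     # prints "slow"

# Against a real backend (backend_ip_address is a placeholder):
#   t=$(curl -o /dev/null -s -w '%{time_total}' http://backend_ip_address)
#   classify_latency "$t"
```

Running the measurement several times gives a better picture than a single request, because one slow response can be caused by a cold cache rather than a persistent dependency problem.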
Testing TCP and HTTP Backend Servers
This topic describes how to troubleshoot a load balancer connection. The topology used in this procedure has a public load balancer in a public subnet and the backends are in the same subnet.
Oracle recommends that you use the Oracle Cloud Infrastructure Logging service to troubleshoot issues. (See Details for Load Balancer Logs.)
In addition to Oracle Cloud Infrastructure Logging, you can use the other utilities listed in this section to troubleshoot the traffic that is processed by the load balancer and sent to a backend. To perform these tests, Oracle recommends that you create an instance in the same network as your load balancer and allow its traffic in the same network security groups and security lists. Use the following tools to troubleshoot:
-
ping
Before using the more advanced utilities listed here, Oracle recommends that you perform a basic ping test. For this test to succeed, you must allow ICMP traffic between the test instance and the backend.
$ ping backend_ip_address
The response should look similar to:
PING 192.0.2.2 (192.0.2.2) 56(84) bytes of data.
64 bytes from 192.0.2.2: icmp_seq=1 ttl=64 time=0.028 ms
64 bytes from 192.0.2.2: icmp_seq=2 ttl=64 time=0.044 ms
If you receive a message that contains "64 bytes from...", then the ping succeeded.
Receiving a message that contains "Destination Host Unreachable" indicates that no system answers at that address, typically because the system does not exist.
Receiving no message indicates that the system exists but the ICMP protocol is not allowed. Check all firewalls, security lists, and network security groups to ensure ICMP is allowed.
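The three ping outcomes described above can be sketched as a small classifier. The function name `classify_ping` and its output labels are hypothetical, and the usage comment assumes the Linux iputils version of ping, where `-c` sets the packet count and `-W` sets the reply timeout in seconds.

```shell
# Hypothetical helper (illustration only): map ping output to one of the
# three outcomes described in the text.
classify_ping() {
  case "$1" in
    *"bytes from"*)                   echo "reachable" ;;
    *"Destination Host Unreachable"*) echo "unreachable" ;;
    *)                                echo "no-reply" ;;   # ICMP may be blocked
  esac
}

# Offline demonstration with sample output lines.
classify_ping "64 bytes from 192.0.2.2: icmp_seq=1 ttl=64 time=0.028 ms"
classify_ping "From 192.0.2.1 icmp_seq=1 Destination Host Unreachable"

# Against a real backend (backend_ip_address is a placeholder):
#   classify_ping "$(ping -c 3 -W 2 backend_ip_address 2>&1)"
```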
-
curl
Use the curl utility to send HTTP requests to a specific host, port, or URL.
-
The following example shows using curl to connect to a backend that is sending a 403 Forbidden error:
$ curl -I http://backend_ip_address/health
HTTP/1.1 403 Forbidden
Date: Tue, 17 Mar 2021 17:47:10 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 3539
Connection: keep-alive
Last-Modified: Tue, 10 Mar 2021 20:33:28 GMT
ETag: "dd3-5b3c6975e7600"
Accept-Ranges: bytes
In the preceding example, the health check fails, returning a 403 error, indicating that the backend does not have local file permissions configured properly for the health check page.
-
The following example shows using curl to connect to a backend that is sending a 404 Not Found error:
$ curl -I http://backend_ip_address/health
HTTP/1.1 404 Not Found
Date: Tue, 17 Mar 2021 17:47:10 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 3539
Connection: keep-alive
Last-Modified: Tue, 10 Mar 2021 20:33:28 GMT
ETag: "dd3-5b3c6975e7600"
Accept-Ranges: bytes
In the preceding example, the health check fails, returning a 404 error, indicating that the health check page does not exist in the expected location.
-
The following example shows a backend that exists and either a network security group, the security lists, or a local firewall is blocking the traffic:
$ curl -I backend_ip_address
curl: (7) Failed connect to backend_ip_address:port; Connection refused
-
The following example shows a backend that does not exist:
$ curl -I backend_ip_address
curl: (7) Failed connect to backend_ip_address:port; No route to host
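The four curl outcomes shown in the preceding examples can be summarized in a lookup sketch. The function name `diagnose_curl` and its output labels are hypothetical; the interpretations are the ones given in the text above, not an exhaustive diagnosis.

```shell
# Hypothetical helper (illustration only): map the first line of curl
# output, or curl's error text, to the interpretation given in the text.
diagnose_curl() {
  case "$1" in
    *"403 Forbidden"*)      echo "health check page permissions" ;;
    *"404 Not Found"*)      echo "health check page missing" ;;
    *"Connection refused"*) echo "traffic blocked" ;;
    *"No route to host"*)   echo "backend not found" ;;
    *)                      echo "unclassified" ;;
  esac
}

# Offline demonstration with lines taken from the examples above.
diagnose_curl "HTTP/1.1 403 Forbidden"
diagnose_curl "curl: (7) Failed connect to backend_ip_address:port; No route to host"

# Against a real backend (backend_ip_address is a placeholder):
#   diagnose_curl "$(curl -sSI --max-time 5 http://backend_ip_address/health 2>&1 | head -n 1)"
```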
-
Netcat
Netcat is a networking utility for reading from and writing to network connections using TCP or UDP.
-
The following example shows using the netcat utility at the TCP level to ensure that the destination backend server can receive a connection:
$ nc -vz backend_ip_address port
Ncat: Connected to backend_ip_address:port.
In the preceding example, port is open for connections.
-
The following example shows a connection attempt that times out:
$ nc -vn backend_ip_address port
Ncat: Connection timed out.
In the preceding example, port is closed.
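For scripted checks, the open or closed result can be taken from the exit status of nc instead of its messages. The sketch below is an illustration under stated assumptions: `check_port` and the `PROBE` variable are hypothetical names, and the default probe `nc -z -w 3` assumes an nc build that supports `-z` (connect without sending data) and `-w` (timeout in seconds). `PROBE` exists only so the function can be exercised without a network.

```shell
# Hypothetical helper (illustration only): report whether a TCP port
# accepts connections, based on the exit status of the probe command.
check_port() {
  # $1 = host, $2 = port. PROBE defaults to "nc -z -w 3" and is left
  # unquoted on purpose so that it splits into command and options.
  if ${PROBE:-nc -z -w 3} "$1" "$2" >/dev/null 2>&1; then
    echo "$1:$2 open"
  else
    echo "$1:$2 closed"
  fi
}

# Offline demonstration: substitute "true" and "false" for the probe.
(PROBE=true;  check_port 192.0.2.2 80)    # prints "192.0.2.2:80 open"
(PROBE=false; check_port 192.0.2.2 80)    # prints "192.0.2.2:80 closed"

# Against a real backend (backend_ip_address is a placeholder):
#   check_port backend_ip_address 80
```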
-
Tcpdump
Use the tcpdump utility to capture all traffic to a backend to verify which traffic is coming from a load balancer and what is being returned to the load balancer.
sudo tcpdump -i any -A port port and src load_balancer_ip_address
11:25:54.799014 IP 192.0.2.224.39224 > 192.0.2.224.80: Flags [P.], seq 1458768667:1458770008, ack 2440130792, win 704, options [nop,nop,TS val 461552632 ecr 208900561], length 1341: HTTP: POST /health HTTP/1.1
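A `src` filter matches only packets sent by the load balancer. To also see what the backend returns, a `host` filter can be used, because it matches the address as either source or destination. The following sketch builds such a filter expression; the function name `lb_capture_filter` and the sample address are hypothetical.

```shell
# Hypothetical helper (illustration only): build a tcpdump filter that
# matches traffic in both directions between the backend port and the
# load balancer.
lb_capture_filter() {
  # $1 = backend port, $2 = load balancer IP address
  printf 'tcp port %s and host %s\n' "$1" "$2"
}

# Offline demonstration: print the filter for port 80.
lb_capture_filter 80 192.0.2.10    # prints "tcp port 80 and host 192.0.2.10"

# Capture with the generated filter (requires root; the address is a placeholder):
#   sudo tcpdump -i any -A "$(lb_capture_filter 80 load_balancer_ip_address)"
```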
-
OpenSSL
When troubleshooting SSL issues between the load balancer instance and the backend servers, Oracle recommends using the openssl utility. This utility opens an SSL connection to a specific host name and port, and prints the SSL certificate and other parameters. Other options for troubleshooting issues are:
-
-showcerts
This option prints all certificates in the certificate chain presented by the backend server. Use this option to identify issues, such as a missing intermediate certificate authority certificate.
-
-cipher cipher_name
This option forces the client and server to use a specific cipher suite and helps to determine whether the backend allows specific ciphers.
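These options belong to the `s_client` subcommand of openssl. As a minimal sketch of the missing-intermediate check, the helper below counts the certificates that a server presents when `-showcerts` is used. The function name `count_chain_certs` and port 443 are assumptions; a count of 1 only suggests a missing intermediate, because a certificate issued directly by a trusted root is also presented alone.

```shell
# Hypothetical helper (illustration only): count the PEM certificate
# blocks in "openssl s_client -showcerts" output read from stdin.
count_chain_certs() {
  # grep -c prints the count; "|| true" keeps a zero count from being
  # treated as a failure.
  grep -c -- '-----BEGIN CERTIFICATE-----' || true
}

# Offline demonstration with a placeholder two-certificate chain.
sample='-----BEGIN CERTIFICATE-----
(leaf certificate body)
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
(intermediate certificate body)
-----END CERTIFICATE-----'
printf '%s\n' "$sample" | count_chain_certs    # prints 2

# Against a real backend (backend_ip_address and 443 are placeholders):
#   openssl s_client -connect backend_ip_address:443 -showcerts </dev/null 2>/dev/null | count_chain_certs
```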
-
Netstat
Use the netstat -natp command to ensure that the application running on the backend server is up and running. For TCP or HTTP traffic, the backend application must be listening on the expected IP address and port. If the application port on the backend server is not in listen mode, then the TCP port of the application is not up.
To resolve this issue, ensure that the application is up and running by restarting either the application or the backend server.
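The listen-mode check can be sketched as a filter over netstat output. The function name `is_listening` is hypothetical, and the sketch assumes the Linux net-tools column layout, in which the local address is the fourth field and the state is the sixth.

```shell
# Hypothetical helper (illustration only): read "netstat -nat" style
# output from stdin and report whether the given port is in LISTEN state.
is_listening() {
  # $1 = port number to look for in the local address column
  awk -v port="$1" '
    $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
    END { print (found ? "listening" : "not listening") }
  '
}

# Offline demonstration with sample netstat lines.
sample='tcp        0      0 0.0.0.0:80      0.0.0.0:*       LISTEN      1234/nginx
tcp        0      0 10.0.0.5:22     10.0.0.9:51514  ESTABLISHED 987/sshd'
printf '%s\n' "$sample" | is_listening 80      # prints "listening"
printf '%s\n' "$sample" | is_listening 8080    # prints "not listening"

# On a real backend server:
#   sudo netstat -natp | is_listening 80
```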