Environment
- YugabyteDB - YSQL
Issue
Connections are being closed and the following error is seen in the logs:
Could not receive data from client: Connection reset by peer
Resolution
Overview
The error message Could not receive data from client: Connection reset by peer
appears in YugabyteDB's PostgreSQL logs, but it is not caused by a YugabyteDB issue. The possible reasons for this error are:
- The client application is going away without closing the connection properly.
- Network interruption between the client and the server.
- Intermediate network devices, such as load balancers, close idle connections because of their own idle timeout settings.
Troubleshooting Steps
- Check the client application logs to see if there are any errors or exceptions that might be causing the client to go down.
- Perform a network test between the client and the server to see if there are any network issues.
- Check the intermediate network devices such as load balancers to see if they are closing the connections due to their own timeout settings. To avoid this, you can increase the timeout settings on the load balancers or enable YSQL to send keep-alive packets to the client.
- To make backend connections send keepalive packets to the client, set the tcp_keepalives_idle, tcp_keepalives_interval, and tcp_keepalives_count parameters. These parameters can be set using the ysql_pg_conf_csv flag. For example:
--ysql_pg_conf_csv=tcp_keepalives_idle=250,tcp_keepalives_interval=10,tcp_keepalives_count=9
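The flag value above is a comma-separated list of name=value pairs. As a minimal sketch, the following Python helper assembles such a flag string from the three keepalive settings. The function name build_keepalive_flag is hypothetical and is not part of any YugabyteDB tooling; it only illustrates the format shown in the example.

```python
def build_keepalive_flag(idle_s: int, interval_s: int, count: int) -> str:
    """Return a --ysql_pg_conf_csv flag string carrying the three keepalive settings.

    Hypothetical helper for illustration only; it is not a YugabyteDB API.
    """
    settings = {
        "tcp_keepalives_idle": idle_s,
        "tcp_keepalives_interval": interval_s,
        "tcp_keepalives_count": count,
    }
    for name, value in settings.items():
        # 0 selects the operating system default; negative values are rejected.
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer, got {value!r}")
    csv_value = ",".join(f"{name}={value}" for name, value in settings.items())
    return f"--ysql_pg_conf_csv={csv_value}"


# Reproduces the flag from the example above.
print(build_keepalive_flag(250, 10, 9))
```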
About the parameters:
- tcp_keepalives_idle: Specifies the time in seconds with no network activity after which TCP should send a keepalive packet to the client. If this value is specified without units, it is taken as seconds. A value of 0 (the default) selects the operating system's default (which is usually 7200 seconds). This means if there is no network activity for 2 hours, TCP will send a keepalive packet to the client.
- tcp_keepalives_interval: Specifies the time in seconds after the last keepalive packet is sent and no response is received before TCP should send another keepalive packet. If this value is specified without units, it is taken as seconds. A value of 0 (the default) selects the operating system's default (which is usually 75 seconds). This means if no response is received after 75 seconds, another keepalive packet will be sent.
- tcp_keepalives_count: Specifies the maximum number of keepalive packets that can be lost before the connection is considered dead. A value of 0 (the default) means that the operating system's default is used (which is usually 9). This means if 9 keepalive packets are lost, the connection will be considered dead.
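To see which operating system defaults apply when these parameters are left at 0, the kernel settings can be read directly on the database node. The sketch below assumes a Linux host, where the values are exposed under /proc/sys/net/ipv4; the function name read_os_keepalive_defaults is hypothetical, and on other platforms the function simply reports None for each setting.

```python
from pathlib import Path

# Assumption: Linux host. The kernel exposes its TCP keepalive defaults here.
PROC_DIR = Path("/proc/sys/net/ipv4")

# PostgreSQL parameter name -> Linux kernel setting used when the parameter is 0.
KERNEL_SETTINGS = {
    "tcp_keepalives_idle": "tcp_keepalive_time",
    "tcp_keepalives_interval": "tcp_keepalive_intvl",
    "tcp_keepalives_count": "tcp_keepalive_probes",
}


def read_os_keepalive_defaults(proc_dir: Path = PROC_DIR) -> dict:
    """Return the OS keepalive defaults keyed by PostgreSQL parameter name.

    A value of None means the kernel file could not be read or parsed.
    """
    defaults = {}
    for pg_name, kernel_name in KERNEL_SETTINGS.items():
        try:
            defaults[pg_name] = int((proc_dir / kernel_name).read_text().strip())
        except (OSError, ValueError):
            defaults[pg_name] = None
    return defaults


for name, value in read_os_keepalive_defaults().items():
    print(f"{name}: OS default = {value}")
```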
For example, if connections are being dropped after 5 minutes of inactivity, you can set tcp_keepalives_idle to 240 seconds (4 minutes) and tcp_keepalives_interval to 30 seconds. This way, if there is no network activity for 4 minutes, TCP sends a keepalive packet to the client. If the client is still alive, it responds to the keepalive packet and the connection is kept alive. If no response is received within 30 seconds, another keepalive packet is sent, and if 9 keepalive packets are lost, the connection is considered dead.
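The timing in this example can be worked out with simple arithmetic. The sketch below estimates how long an unresponsive client goes undetected: the idle time before the first keepalive packet plus one interval for each lost packet. The function name is hypothetical, and the result is an approximation of typical TCP keepalive behavior rather than an exact guarantee.

```python
def dead_peer_detection_seconds(idle_s: int, interval_s: int, count: int) -> int:
    """Approximate seconds of silence before a connection is considered dead.

    idle_s     -> tcp_keepalives_idle (wait before the first keepalive packet)
    interval_s -> tcp_keepalives_interval (wait between unanswered packets)
    count      -> tcp_keepalives_count (lost packets tolerated)
    """
    return idle_s + interval_s * count


# Values from the example above: 240 s idle, 30 s interval, 9 lost packets.
print(dead_peer_detection_seconds(240, 30, 9))   # 510 seconds (8.5 minutes)

# Usual operating system defaults: 7200 s idle, 75 s interval, 9 lost packets.
print(dead_peer_detection_seconds(7200, 75, 9))  # 7875 seconds (about 2.2 hours)
```

A live client answers the first keepalive packet, so only the idle value needs to be shorter than the timeout that is closing the connections. The interval and count control how quickly a client that has really gone away is detected.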