Environment
Product: Yugabyte-Platform
Version: all
This should affect on-prem workflows, especially where a universe has previously been created using the same nodes.
Issue
While creating a new cluster in a universe, the following error appears under a "failed uptime" health check:
Invalid uptime 238519 247305
Resolution
Overview
This error is caused by extra yb-tserver and yb-master processes running on the node, also known as “zombie” processes.
Root Cause
This issue occurs when rogue or leftover processes are running on the server. Their differing elapsed times give the health check conflicting uptime data.
Tservers may fail to start if the pid file is corrupted.
Steps
1. Run the following command:
# Confirm whether the following command returns more than one line (one line per running process)
# There should be only a single tserver or master process
ps -C <process_name> -o etimes=
For example:
[yugabyte@yb-1-user-identity-2-n1 ~]$ ps -C yb-tserver -o etimes=
1011
251064
[yugabyte@yb-1-user-identity-2-n1 ~]$ ps -C yb-master -o etimes=
251142
997
The example above shows two yb-master and two yb-tserver processes. There should be only one of each.
2. Kill the extra yb-master and yb-tserver processes:
kill <PID>
3. Confirm that the output now shows a single process for each:
[yugabyte@yb-1-user-identity-2-n1 ~]$ ps -C yb-tserver -o etimes=
1386513
[yugabyte@yb-1-user-identity-2-n1 ~]$ ps -C yb-master -o etimes=
1387991
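The check in steps 1 and 3 can also be scripted. The following is a minimal sketch, assuming a Linux node with the procps version of ps; the function name count_instances is hypothetical and is not part of any Yugabyte tooling. It only reports how many instances are running and lists them. It does not decide which process is the extra one.

```shell
#!/usr/bin/env bash
# count_instances is a hypothetical helper, not part of Yugabyte tooling.
# It prints how many processes with the given command name are running.
count_instances() {
  local name="$1"
  # "-o pid=" suppresses the header, so ps prints one line per process.
  # grep -c counts those lines; with no match the count is 0.
  ps -C "$name" -o pid= 2>/dev/null | grep -c . || true
}

for proc in yb-master yb-tserver; do
  n="$(count_instances "$proc")"
  if [ "$n" -gt 1 ]; then
    echo "WARNING: $n $proc processes found, expected 1:"
    # Show PID, elapsed seconds, and command line to help identify the extra one.
    ps -C "$proc" -o pid= -o etimes= -o args=
  else
    echo "OK: $n $proc process(es) running"
  fi
done
```

The PID column in the warning output is the value to pass to kill in step 2 once the leftover process has been identified.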
Next Steps
If the above actions do not result in a successful Universe creation, please open a ticket with Yugabyte Support.
Other notes
What does etimes mean?
etime: elapsed time since the process was started, in the form [[DD-]hh:]mm:ss.
etimes: elapsed time since the process was started, in seconds.
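To see the two formats side by side, the following sketch prints both for the current shell. It assumes the procps version of ps; the variable name elapsed is only illustrative.

```shell
# $$ expands to the PID of the current shell; the empty "=" suppresses the header.
ps -p "$$" -o etime=                            # formatted as [[DD-]hh:]mm:ss
elapsed="$(ps -p "$$" -o etimes= | tr -d ' ')"  # plain seconds
echo "shell has been running for ${elapsed} seconds"
```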