Environment
YugabyteDB - Core DB
Issue:
- The tserver process crashes with the following FATAL log:
hybrid_clock.cc:169] Too big clock skew is detected: 0.503s, while max allowed is: 0.500s
F0418 23:14:41.833551
- Master logs report the following errors:
hybrid_clock.cc:172] Too big clock skew is detected: 1.078s, while max allowed is: 0.500s
I0610 12:39:59.924661 48 cluster_balance.cc:311] Total pending adds=1, total pending removals=0, total pending leader stepdowns=0
Resolution:
To keep the clocks in sync across the nodes of a universe, install NTP or Chrony.
To check whether the clock is in sync using NTP, run ntpq -p:
~]$ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+clock.util.phx2 .CDMA.           1 u  111  128  377  175.495    3.076   2.250
*clock02.util.ph .CDMA.           1 u   69  128  377  175.357    7.641   3.671
 ms21.snowflakeh .STEP.          16 u    - 1024    0    0.000    0.000   0.000
 rs11.lvs.iif.hu .STEP.          16 u    - 1024    0    0.000    0.000   0.000
 2001:470:28:bde .STEP.          16 u    - 1024    0    0.000    0.000   0.000
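The ntpq -p output can also be checked programmatically. The following is a minimal Python sketch, not an official Yugabyte tool: it finds the peer marked with `*` (the source the NTP daemon is currently synchronized to) and compares its offset, which ntpq reports in milliseconds, against a threshold. The function name `check_sync` and the reuse of the 0.500s limit from the log as a per-node offset bound are assumptions made for illustration. Note that the offset to an NTP source is not the same quantity as the skew between two database nodes, so treat the result as a rough health signal only.

```python
# Abridged sample of `ntpq -p` output, copied from the article above.
SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+clock.util.phx2 .CDMA.           1 u  111  128  377  175.495    3.076   2.250
*clock02.util.ph .CDMA.           1 u   69  128  377  175.357    7.641   3.671
 ms21.snowflakeh .STEP.          16 u    - 1024    0    0.000    0.000   0.000
"""

# Assumption: reuse "max allowed is: 0.500s" from the log as the offset bound.
MAX_OFFSET_MS = 500.0


def check_sync(text, max_offset_ms=MAX_OFFSET_MS):
    """Return (peer, offset_ms, within_limit) for the '*' peer, or None.

    None means no peer is currently selected, i.e. the clock is not synced.
    """
    for line in text.splitlines():
        # Skip blank lines and the '=====' separator.
        if not line.strip() or line.startswith("="):
            continue
        # The first column is the tally code; the rest are 10 data fields:
        # remote refid st t when poll reach delay offset jitter
        tally, fields = line[0], line[1:].split()
        if len(fields) != 10 or fields[0] == "remote":
            continue  # header row or an unexpected layout
        if tally == "*":
            offset_ms = float(fields[8])
            return fields[0], offset_ms, abs(offset_ms) <= max_offset_ms
    return None


if __name__ == "__main__":
    print(check_sync(SAMPLE))
```

To run it against a live node, the text could be obtained with `subprocess.run(["ntpq", "-p"], capture_output=True, text=True).stdout` and passed to `check_sync`.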
To check whether the clock is in sync using Chrony, run chronyc sources.
See the appropriate documentation on using chrony or ntp for your distribution. For convenience, some documentation is listed below. Remember to confer with your systems team about the appropriate solution for your environment.
Red Hat Enterprise Linux:
ntp documentation
chrony documentation
Root Cause:
The above error indicates that the nodes running the tserver/master processes have clock skew outside the acceptable range. Clock skew and clock drift can lead to significant consistency issues and should be fixed as soon as possible. YugabyteDB uses the fail_on_out_of_range_clock_skew flag to govern the behavior of the tserver/master process when clock skew is detected.
fail_on_out_of_range_clock_skew is set to true in all Yugabyte releases starting in YugabyteDB 2.8.
Find the documentation on release versioning here
If the fail_on_out_of_range_clock_skew gflag is set to false, the tserver/master process will not crash on clock skew. However, it will still log the error messages, and if the clock skew is not addressed, data inconsistencies may occur.
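The behavior described above can be summarized with a small toy model in Python. This is an illustrative sketch only, not YugabyteDB's actual implementation: the function `handle_clock_skew` and the exception `ClockSkewTooBig` are hypothetical names, and the handling of a skew exactly equal to the limit is an assumption. Only the flag name, the 0.500s limit, and the message text come from the logs and description in this article.

```python
import logging

# "max allowed is: 0.500s" in the logs above.
MAX_ALLOWED_SKEW_S = 0.500


class ClockSkewTooBig(RuntimeError):
    """Hypothetical stand-in for the FATAL crash of the tserver/master process."""


def handle_clock_skew(skew_s,
                      fail_on_out_of_range_clock_skew=True,
                      max_allowed_s=MAX_ALLOWED_SKEW_S):
    """Toy model: crash or only log when skew exceeds the allowed range."""
    # Assumption: a skew exactly equal to the limit is treated as acceptable.
    if skew_s <= max_allowed_s:
        return "ok"
    msg = (f"Too big clock skew is detected: {skew_s:.3f}s, "
           f"while max allowed is: {max_allowed_s:.3f}s")
    if fail_on_out_of_range_clock_skew:
        # Flag is true: the process terminates.
        raise ClockSkewTooBig(msg)
    # Flag is false: the process keeps running but still logs the error,
    # so data inconsistencies remain possible until the skew is fixed.
    logging.error(msg)
    return "logged"
```

In this model, `handle_clock_skew(0.503)` raises, while `handle_clock_skew(1.078, fail_on_out_of_range_clock_skew=False)` only logs the message and returns.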