Environment
- YugabyteDB Anywhere - 2.20.x, 2024.x
- Linux Host Operating System Version:
- RHEL 8+ (kernel 4.18)
- Alma 8+ (kernel 4.18)
Issue
tserver nodes crash because the operating system runs out of memory (OOM), or become unresponsive and require a reboot of the host.
Resolution
Overview
A tserver can run out of memory and crash for many reasons. This article is a guide to gathering the information needed to characterize the conditions that may contribute to these out-of-memory crashes.
Steps to Identify Excess Resident Set Size (RSS) Memory
1. Provide a support bundle which covers the time period of the node restarts. See Support Bundle documentation for more information.
2. Provide Prometheus metrics from the cluster across the same time period for offline analysis.
- For versions 2024.2.x or later, the Prometheus metrics can be gathered as part of the Support Bundle. See Support Bundle for more information.
- Alternatively, a stand-alone version of promdump is available for gathering Yugabyte Prometheus metrics.
3. For live issue analysis, run one of the following PromQL queries in the Prometheus UI (see the documentation for Prometheus access) over a period of 2 weeks (the default metrics retention window):
- For YBA version 2024.2.x or later, issue the following query, which calculates the amount of excess Resident Set Size (RSS) memory.
- The NODEPREFIX value can be found on the Nodes tab for the Universe. For example, if there is a node called yb-prod-appname-n1, the correct prefix value for this Universe is yb-prod-appname.
label_replace(
  avg_over_time(yb_process_memory_kb{
    type="resident", process=~"yb-tserver", node_prefix="NODEPREFIX"}[1m]
  ) * 1024
  - on (exported_instance)
  avg_over_time(
    {export_type=~"tserver_export",
     saved_name="generic_current_allocated_bytes", node_prefix="NODEPREFIX"}[1m])
  - on (exported_instance)
  avg_over_time(
    {export_type=~"tserver_export",
     saved_name="tcmalloc_pageheap_free_bytes", node_prefix="NODEPREFIX"}[1m])
  , "saved_name", "Tserver_other_RSS", "saved_name", ""
)
- For YBA versions earlier than 2024.2.x, issue the following query, which gives an approximate percentage of memory which may be reclaimable.
- The NODEPREFIX value can be found on the Nodes tab for the Universe. For example, if there is a node called yb-prod-appname-n1, the correct prefix value for this Universe is yb-prod-appname.
100 -
(
  sum by (exported_instance) (
    avg_over_time(
      {node_prefix="NODEPREFIX",
       saved_name=~"node_memory_(Cached|Buffers|MemFree)_bytes"}[1m]
    )
    or
    avg_over_time(
      {node_prefix="NODEPREFIX", export_type=~"tserver_export",
       saved_name=~"(generic_current_allocated_bytes|tcmalloc_pageheap_free_bytes)"}[1m]
    )
    or
    avg_over_time(
      {node_prefix="NODEPREFIX", export_type=~"master_export",
       saved_name=~"(generic_current_allocated_bytes|tcmalloc_pageheap_free_bytes)"}[1m]
    )
  )
  /
  sum by (exported_instance) (
    avg_over_time(
      node_memory_MemTotal_bytes{node_prefix="NODEPREFIX"}[1m]
    )
  )
  * 100
)
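To make the arithmetic behind the two queries above easier to follow, here is a minimal shell sketch that reproduces each calculation for a single node. The function names and the sample numbers are hypothetical and chosen only for illustration; real values come from the Prometheus series named in the queries.

```shell
#!/bin/sh
# Worked arithmetic for the two PromQL queries, using made-up sample values.

# Mirrors the 2024.2.x+ query:
#   excess RSS (bytes) = resident KB * 1024
#                        - generic_current_allocated_bytes
#                        - tcmalloc_pageheap_free_bytes
excess_rss_bytes() {
  rss_kb=$1
  allocated_bytes=$2
  pageheap_free_bytes=$3
  echo $(( rss_kb * 1024 - allocated_bytes - pageheap_free_bytes ))
}

# Mirrors the pre-2024.2.x query:
#   100 - (sum of the selected series / MemTotal * 100)
# $1 = sum of Cached, Buffers, MemFree and the tserver/master tcmalloc series
# $2 = node_memory_MemTotal_bytes
unaccounted_pct() {
  awk -v acc="$1" -v total="$2" \
    'BEGIN { printf "%.1f\n", 100 - (acc / total * 100) }'
}

# Sample: 8 GiB resident, 6 GB allocated, 1 GB free in the tcmalloc page heap.
excess_rss_bytes 8388608 6000000000 1000000000   # prints 1589934592

# Sample: 12 GB covered by the summed series on a host with 16 GB MemTotal.
unaccounted_pct 12000000000 16000000000          # prints 25.0
```

In the first sample, roughly 1.6 GB of resident memory is not explained by tcmalloc's own accounting. In the second, 25% of host memory is not covered by the summed series.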
4. Open the T-Server admin UI for a node which may be affected, browse to Utilities > Total Memory, and compare two values:
Actual memory used (physical + swap)
vs.
TOTAL: ( MiB) Bytes resident (physical memory used)
The example below shows a comparison of the two values and gaps which can emerge between the two.
5. On each host in the cluster, gather the Transparent Hugepage (THP) settings by running the following command at the OS command line:
find /sys/kernel/mm/transparent_hugepage/ -type f -exec sh -c 'printf "%-70s %s\n" "{}" "$(cat {})"' \;
The output will look similar to the following (with potential differences in reported values):
/sys/kernel/mm/transparent_hugepage/defrag always defer defer+madvise [madvise] never
/sys/kernel/mm/transparent_hugepage/khugepaged/defrag 1
/sys/kernel/mm/transparent_hugepage/khugepaged/scan_sleep_millisecs 10000
/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none 511
/sys/kernel/mm/transparent_hugepage/khugepaged/pages_to_scan 4096
/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_swap 64
/sys/kernel/mm/transparent_hugepage/khugepaged/alloc_sleep_millisecs 60000
/sys/kernel/mm/transparent_hugepage/khugepaged/pages_collapsed 167670
/sys/kernel/mm/transparent_hugepage/khugepaged/full_scans 778
/sys/kernel/mm/transparent_hugepage/enabled [always] madvise never
/sys/kernel/mm/transparent_hugepage/use_zero_page 1
/sys/kernel/mm/transparent_hugepage/shmem_enabled always within_size advise [never] deny force
/sys/kernel/mm/transparent_hugepage/hpage_pmd_size 2097152
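In the listing above, the value in square brackets on a line such as `enabled` or `defrag` is the setting currently in effect. As a small convenience, the sketch below extracts that bracketed value from one line of output. The helper name `thp_active_value` is hypothetical and not part of any Yugabyte tooling.

```shell
#!/bin/sh
# Print the active (bracketed) value from a THP setting string,
# e.g. "[always] madvise never" -> "always".
# Prints nothing if the string contains no bracketed value.
thp_active_value() {
  printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Using the sample values from the listing above:
thp_active_value "[always] madvise never"                      # prints always
thp_active_value "always defer defer+madvise [madvise] never"  # prints madvise

# On a live host, the same helper could be applied to the sysfs file:
# thp_active_value "$(cat /sys/kernel/mm/transparent_hugepage/enabled)"
```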
6. Collect the kernel information of the host by running the following commands, which may vary depending on the Linux distribution:
Kernel Version:
uname -a
Additional OS information (may vary with OS):
cat /etc/*release
And/Or the following command:
hostnamectl
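When several hosts are involved, it may be convenient to capture the output of steps 5 and 6 in one file per host. The following is a minimal sketch only: the function name `collect_host_info` and the output file name are hypothetical, and the `find` invocation is a slight variation of the one in step 5 that passes each path to `sh` as an argument.

```shell
#!/bin/sh
# Hypothetical helper: write the THP settings and kernel/OS details
# from steps 5 and 6 into a single file for attaching to a ticket.
collect_host_info() {
  out=$1
  {
    echo "== uname -a =="
    uname -a
    echo "== /etc/*release =="
    cat /etc/*release 2>/dev/null || true
    echo "== hostnamectl =="
    # hostnamectl is not present on every distribution, so check first.
    if command -v hostnamectl >/dev/null 2>&1; then
      hostnamectl 2>/dev/null || true
    fi
    echo "== transparent_hugepage =="
    find /sys/kernel/mm/transparent_hugepage/ -type f \
      -exec sh -c 'printf "%-70s %s\n" "$1" "$(cat "$1" 2>/dev/null)"' _ {} \; \
      2>/dev/null || true
  } > "$out"
}

# Example usage (the file name is an assumption):
# collect_host_info "./$(uname -n)-oom-info.txt"
```

Sections for commands that are unavailable on a given host are left empty rather than causing the collection to fail.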
7. Contact Yugabyte Support via a ticket to assess the appropriate next steps.