Environment
- Yugabyte Platform - kubernetes install
Issue
Yugabyte Anywhere installation with Helm fails and the pod is stuck in CrashLoopBackOff because the postgres container exits with the error below.
kubectl logs <pod-name> -n <namespace-name> -c postgres
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.utf8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
fixing permissions on existing directory /var/lib/postgresql/data/pgdata ... ok
creating subdirectories ... ok
selecting default max_connections ... 20
selecting default shared_buffers ... 400kB
selecting default timezone ... Etc/UTC
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
Bus error <=========
child process exited with exit code 135
initdb: removing contents of data directory "/var/lib/postgresql/data/pgdata"
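The exit code in the log above is itself a clue: 135 = 128 + 7, and signal 7 on Linux is SIGBUS, which matches the "Bus error" line printed by initdb. You can confirm the signal name from a bash shell:

```shell
# initdb's child exited with code 135; codes above 128 mean
# "killed by signal (code - 128)".
sig=$((135 - 128))
# bash's kill -l maps a signal number to its name (without the SIG prefix).
kill -l "$sig"   # prints: BUS
```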
- Additional information: you will see logs like the following in the yugaware container.
kubectl logs <pod-name> -n <namespace-name> -c yugaware
2022-07-07 11:16:34.363 [info] DefaultDBApi.scala:70 [main] Database [default] initialized at jdbc:postgresql://[::1]:5432/yugaware
2022-07-07 11:16:34.364 [info] HikariCPModule.scala:54 [main] Creating Pool for datasource 'default'
2022-07-07 11:16:35.366 [error] HikariPool.java:567 [main] HikariPool-2 - Exception during pool initialization.
org.postgresql.util.PSQLException: Connection to [::1]:5432 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
Resolution
Overview
This issue occurs when huge pages are enabled on the Kubernetes node but not allocated to the postgres container, so PostgreSQL cannot fall back to regular pages and crashes with a bus error. When you describe a Kubernetes node where huge pages are enabled, you will see something similar to the following.
kubectl describe node <node-name>
Capacity:
cpu: 80
ephemeral-storage: 314287172Ki
hugepages-1Gi: 80Gi
memory: 394686320Ki
pods: 110
Allocatable:
cpu: 64
ephemeral-storage: 289647057236
hugepages-1Gi: 80Gi
memory: 310697840Ki
pods: 110
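To check quickly whether 1Gi huge pages are allocatable on a node, you can filter the describe output. Below is a minimal sketch: the embedded sample text is an assumption standing in for real cluster output; in practice, pipe `kubectl describe node <node-name>` into the same awk filter shown in the comment.

```shell
# In a real cluster you would run:
#   kubectl describe node <node-name> | awk '/^Allocatable:/{a=1} a && /hugepages-1Gi:/{print $2; exit}'
# The sample below stands in for that output so the filter can be demonstrated.
sample='Capacity:
  cpu:                80
  hugepages-1Gi:      80Gi
Allocatable:
  cpu:                64
  hugepages-1Gi:      80Gi'

# Print the allocatable (not capacity) hugepages-1Gi value; a non-zero
# value means the node hands out 1Gi huge pages.
printf '%s\n' "$sample" | awk '/^Allocatable:/{a=1} a && /hugepages-1Gi:/{print $2; exit}'
# prints: 80Gi
```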
- We can solve this issue by allocating huge pages to the postgres container in the yugaware StatefulSet.
- List the StatefulSets in the namespace with the command below:
kubectl get statefulset -n <namespace>
- Example (names and output may differ in your environment):
➜ ~ kubectl get statefulset -n yb-platform
NAME READY AGE
yb-platform-yugaware 0/1 2d16h
- To edit the yb-platform-yugaware StatefulSet, run the command below:
kubectl edit statefulset yb-platform-yugaware -n yb-platform
- The above command opens the StatefulSet YAML in the vi editor. Make the changes below in the resources section of the postgres container (note the hugepages-1Gi: 2Gi lines):
    ports:
    - containerPort: 5432
      name: postgres
      protocol: TCP
    resources:
      limits:
        cpu: 500m
        hugepages-1Gi: 2Gi
        memory: 1Gi
      requests:
        cpu: 500m
        hugepages-1Gi: 2Gi
        memory: 1Gi
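If you prefer a non-interactive change (for example, for scripting), the same edit can be expressed as a JSON patch. The sketch below assumes the postgres container is at index 0 in the pod spec; verify the container order first with `kubectl get statefulset yb-platform-yugaware -n yb-platform -o jsonpath='{.spec.template.spec.containers[*].name}'` and adjust the index accordingly. Save the patch as hugepages-patch.json and apply it with `kubectl patch statefulset yb-platform-yugaware -n yb-platform --type=json --patch-file hugepages-patch.json`.

```json
[
  {"op": "add", "path": "/spec/template/spec/containers/0/resources/limits/hugepages-1Gi", "value": "2Gi"},
  {"op": "add", "path": "/spec/template/spec/containers/0/resources/requests/hugepages-1Gi", "value": "2Gi"}
]
```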
- Once you save the file, you will see output like the following, and the pod will start normally.
➜ keys kubectl edit statefulset yb-platform-yugaware -n yb-platform
statefulset.apps/yb-platform-yugaware edited
➜ keys kubectl get pods -n yb-platform
NAME READY STATUS RESTARTS AGE
yb-platform-yugaware-0 4/4 Running 0 2m42s