Environment
- Yugabyte CoreDB - All versions.
Issue
Queries are failing with the following error:
Service unavailable (yb/tserver/tablet_service.cc:257): SST files limit exceeded 58 against (24, 48), score: 0.35422774182913203: 3.854s (tablet server delay 3.854s)
This message is emitted when the number of SST files has exceeded its limit.
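To see how often writes are being rejected for this reason, you can scan the YB-TServer logs for the message. Below is a minimal sketch; the log directory and file naming are assumptions to adjust for your installation (they follow the --log_dir setting).
```python
import re
from collections import Counter
from pathlib import Path

# Hypothetical log location; adjust to your installation's --log_dir.
LOG_DIR = Path("/home/yugabyte/tserver/logs")

pattern = re.compile(r"SST files limit exceeded (\d+) against \((\d+), (\d+)\)")
hits = Counter()

for log_file in LOG_DIR.glob("yb-tserver*INFO*"):
    with open(log_file, errors="replace") as f:
        for line in f:
            m = pattern.search(line)
            if m:
                # Count rejections, keyed by the observed SST file count.
                hits[int(m.group(1))] += 1

for sst_count, n in sorted(hits.items()):
    print(f"{n} rejections while a tablet had {sst_count} SST files")
```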
Root Cause
Typically, the client is running a high INSERT/UPDATE/DELETE workload, and compactions are falling behind.
Some of the reasons these queries can fail are:
- CPU bottleneck
- Disk bottleneck
- Hot tablet
- Software bug
- Memory pressure
- Bad query pattern
CPU Bottleneck
- Add a new node? (horizontal scaling)
- Increase the instance size? (vertical scaling)
In both cases, it might also be necessary to perform tablet splitting in order to spread the load out among the nodes.
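Before scaling, it helps to confirm that CPU is actually saturated on the tserver node while the workload and compactions are running. A minimal sketch, assuming the third-party psutil package is installed (any monitoring tool that shows sustained high CPU will do):
```python
import psutil  # third-party: pip install psutil

# Sample overall CPU utilization for one minute; sustained values near 100%
# while inserts and compactions are running suggest a CPU bottleneck.
samples = [psutil.cpu_percent(interval=5) for _ in range(12)]

print(f"avg CPU: {sum(samples) / len(samples):.1f}%, peak: {max(samples):.1f}%")
```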
Disk Bottleneck
- Upgrade storage
- Increase disk count
- Add new nodes?
- Change instance type? (some clouds limit disk usage by instance class or size)
It might also be necessary to perform tablet splitting in order to spread the load out among the nodes.
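To check whether the disks are the limiter, compare the observed write IOPS and throughput against what the volume can sustain. A minimal sketch, again assuming psutil; the device name is a placeholder for the device backing the tserver data directory:
```python
import time
import psutil  # third-party: pip install psutil

DEVICE = "nvme0n1"  # placeholder: the device backing the tserver data directory
INTERVAL = 10       # seconds to sample

before = psutil.disk_io_counters(perdisk=True)[DEVICE]
time.sleep(INTERVAL)
after = psutil.disk_io_counters(perdisk=True)[DEVICE]

write_iops = (after.write_count - before.write_count) / INTERVAL
write_mbps = (after.write_bytes - before.write_bytes) / INTERVAL / (1024 * 1024)
print(f"writes: {write_iops:.0f} IOPS, {write_mbps:.1f} MB/s")
```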
Hot Tablet
- A single tablet is being updated too often
- Query pattern?
- Bad table schema?
- Should this tablet be split by range instead of hash?
- Should this tablet be split by hash instead of range?
- Does this table have a 'bad' primary key?
Depending on the root cause, this might be fixed by splitting the hot tablet, but it may also need to be fixed in the application, as this is usually due to a mistake in schema or application design.
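One way to locate a hot tablet is to sample the per-tablet metrics exposed by the YB-TServer web UI twice and see which tablets accumulate the most writes between samples. The sketch below is illustrative only: the default web UI port (9000), the /metrics JSON layout, and the rows_inserted metric name are assumptions to verify against your version.
```python
import json
import time
import urllib.request

TSERVER = "http://127.0.0.1:9000"  # placeholder tserver address
METRIC = "rows_inserted"           # assumed per-tablet write metric; verify for your version

def tablet_writes():
    """Return {tablet_id: metric_value} from the tserver /metrics JSON endpoint."""
    with urllib.request.urlopen(f"{TSERVER}/metrics") as resp:
        entities = json.load(resp)
    out = {}
    for entity in entities:
        if entity.get("type") != "tablet":
            continue
        for metric in entity.get("metrics", []):
            if metric.get("name") == METRIC:
                out[entity["id"]] = metric.get("value", 0)
    return out

before = tablet_writes()
time.sleep(60)
after = tablet_writes()

deltas = {tid: after.get(tid, 0) - before.get(tid, 0) for tid in after}
for tid, delta in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)[:10]:
    print(f"tablet {tid}: {delta} writes in the last minute")
```
If one tablet dominates the deltas while the others are mostly idle, that tablet is the hot spot to investigate.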
Memory Pressure
Disk might not actually be the bottleneck here. Because we rely on buffered I/O to ensure fast writes, when the node comes under memory pressure the OS may be forced to start performing synchronous writes. If this is the case, then we need to find a way to lower the memory overhead on the node, or scale to a larger instance type.
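A rough signal of this situation is how much memory is genuinely available on the node and how much dirty page-cache data is waiting to be flushed. A minimal sketch reading /proc/meminfo (Linux only); the 10% threshold is an arbitrary illustration, not a Yugabyte recommendation:
```python
def meminfo():
    """Parse /proc/meminfo into a {field: kB} dict."""
    values = {}
    with open("/proc/meminfo") as f:
        for line in f:
            field, rest = line.split(":", 1)
            values[field] = int(rest.strip().split()[0])  # values are reported in kB
    return values

m = meminfo()
total = m["MemTotal"]
available = m["MemAvailable"]
dirty = m.get("Dirty", 0)

print(f"available: {available / total:.0%} of {total // 1024} MiB")
print(f"dirty pages waiting for writeback: {dirty // 1024} MiB")
if available / total < 0.10:
    print("Low available memory: buffered writes may degrade to synchronous I/O.")
```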
Bad Query Pattern
- Is the write/read pattern to the table actually reasonable?
- Are updates occurring in multiple steps when they could be consolidated into a single query? (See the sketch after this list.)
- Is there an unintended consequence of running a stored procedure?
- Is a single transaction doing a lot of work?
- Are there a lot of transaction conflicts?
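As an illustration of consolidating multi-step updates, batching many single-row statements into one multi-row statement reduces round trips and write pressure. A minimal sketch using the third-party psycopg2 driver against the YSQL endpoint; the connection settings and the demo_kv table are placeholders, not part of any real schema:
```python
import psycopg2
from psycopg2.extras import execute_values  # third-party: pip install psycopg2-binary

# Placeholder connection settings for a YSQL endpoint (default port 5433).
conn = psycopg2.connect(host="127.0.0.1", port=5433, dbname="yugabyte", user="yugabyte")
rows = [(i, f"value-{i}") for i in range(1000)]

with conn, conn.cursor() as cur:
    # Instead of 1000 separate INSERT statements, send one multi-row INSERT.
    execute_values(cur, "INSERT INTO demo_kv (k, v) VALUES %s", rows)

conn.close()
```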
Resolution
Overview
This message is emitted when the number of SST files has exceeded its limit. Usually, the client is running a high INSERT/UPDATE/DELETE workload and compactions are falling behind.
To determine why this error is happening, check the disk bandwidth and network bandwidth, and find out whether enough CPU is available on the server.
The limits are controlled by the following YB-TServer configuration flags: --sst_files_hard_limit=48 and --sst_files_soft_limit=24.
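You can confirm the values in effect on a running tserver from its web UI flags page. A minimal sketch, assuming the default web UI port 9000 and that the flags page is served at /varz (verify the path on your version):
```python
import re
import urllib.request

TSERVER = "http://127.0.0.1:9000"  # placeholder tserver address

# Assumption: the /varz page lists the gflags the process is running with.
with urllib.request.urlopen(f"{TSERVER}/varz") as resp:
    page = resp.read().decode(errors="replace")

for flag in ("sst_files_soft_limit", "sst_files_hard_limit"):
    match = re.search(rf"--{flag}=([^\s<]+)", page)
    print(flag, "=", match.group(1) if match else "not found")
```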
Steps
The CPU spikes are likely caused by the compaction threads running, and thus, most logically, compaction fell behind because of too many inserts.
A workaround is to wait a bit for the compactions to consume the smaller SST files and merge them into bigger ones. The SST file limit is a per-tablet limit, and so are --sst_files_hard_limit and --sst_files_soft_limit.
You can monitor the number of SST files for a tablet by looking at the statistic rocksdb_current_version_num_sst_files.
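Below is a minimal sketch for watching that statistic, assuming the tserver exposes Prometheus-format metrics at /prometheus-metrics on its web UI port (9000 by default). Depending on the version and aggregation settings, the metric may be labelled per tablet or aggregated per table, so the sketch simply prints whatever labels appear.
```python
import re
import urllib.request

TSERVER = "http://127.0.0.1:9000"  # placeholder tserver address

with urllib.request.urlopen(f"{TSERVER}/prometheus-metrics") as resp:
    text = resp.read().decode(errors="replace")

# Lines look roughly like: rocksdb_current_version_num_sst_files{<labels>} <value> <timestamp>
pattern = re.compile(r"rocksdb_current_version_num_sst_files\{([^}]*)\}\s+(\d+)")

for labels, value in sorted(pattern.findall(text), key=lambda kv: int(kv[1]), reverse=True)[:10]:
    print(f"{value} SST files  [{labels}]")
```
Values approaching --sst_files_soft_limit indicate that writes are about to be throttled; values above --sst_files_hard_limit mean they are being rejected.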
The actual issue is that the tserver rejects writes because the number of SST files is too high. That number should come down once compaction picks up the SST files and merges them.
Compaction works by reading SST files, uncompressing the file contents, and then compressing and writing them back out. That suggests the logical limiters are IOPS or MBPS limits, but that view is too simplistic.
IOPS and MBPS are certainly part of this, but if CPU is scarce, compaction might not get enough CPU slices to run, and compaction is absolutely CPU intensive.
Yugabyte currently performs I/O in a buffered way. That means spare memory has a large effect on I/O speed: consuming too much memory can turn I/O that previously could be served from memory into actual physical I/O, which can make a thousand-fold difference. So memory starvation might play a part in this too.
Because of this, the error can occur even on a system running with relatively low disk I/O.