Table of Contents
- Environment
- Overview
- Accessing the UIs
- Part 1: Master UI Pages (Port 7000)
  - 1.1 Home: /
  - 1.2 Tablet Servers: /tablet-servers
  - 1.3 Tables: /tables
  - 1.4 Table Details: /table?id=<table_uuid>
  - 1.5 Masters: /masters
  - 1.6 Cluster Config: /cluster-config
  - 1.7 Replica Info: /tablet-replication
  - 1.8 Load Balancer: /load-distribution
  - 1.9 Tasks: /tasks
  - 1.10 xCluster: /xcluster
  - 1.11 Namespaces: /namespaces
  - 1.12 Tablet Server Clocks: /tablet-server-clocks
  - 1.13 Stateful Services: /stateful-services
- Part 2: TServer UI Pages (Port 9000)
  - 2.1 Home: /
  - 2.2 Tables: /tables
  - 2.3 Tablets: /tablets
  - 2.4 Tablet Details: /tablet?id=<tablet_id>
  - 2.5 Tablet Consensus Status: /tablet-consensus-status?id=<tablet_id>
  - 2.6 Transactions: /transactions?id=<tablet_id>
  - 2.7 Wait Queue: /waitqueue?id=<tablet_id>
  - 2.8 RocksDB: /rocksdb?id=<tablet_id>
  - 2.9 In-Memory Locks: /sharedlockmanager?id=<tablet_id>
  - 2.10 Log Anchors: /log-anchors?id=<tablet_id>
  - 2.11 Preparer: /preparer?id=<tablet_id>
  - 2.12 Operations: /operations
  - 2.13 Remote Bootstraps: /remotebootstraps
  - 2.14 Maintenance Manager: /maintenance-manager
  - 2.15 Snapshots: /snapshots
  - 2.16 xCluster: /xcluster
  - 2.17 TSLocalLockManager: /TSLocalLockManager
- Part 3: Common Pages (Available on Both Master and TServer)
- Part 4: JSON API Quick Reference
- Part 5: Tips and Best Practices
Environment
- YugabyteDB (YSQL / YCQL)
Overview
Every YugabyteDB cluster exposes two web UIs:
- Master UI — runs on each master node, default port 7000. The leader master serves the authoritative view; follower masters redirect to it automatically.
- TServer UI — runs on each tablet server, default port 9000. Each TServer's UI shows data local to that node.
Both UIs also expose JSON API endpoints (prefixed with /api/v1/) that can be consumed by scripts and automation.
This runbook covers what each page contains, when to use it, and how to combine pages to troubleshoot common issues.
Accessing the UIs
Master UI: http://<master-ip>:7000
TServer UI: http://<tserver-ip>:9000
If TLS is enabled for the HTTP endpoint, use https and the configured HTTPS port (default: master 7000, tserver 9000, but may differ if --webserver_port is overridden).
Tip: You can discover all TServer IPs from the Master UI at /tablet-servers, and all Master IPs from /masters.
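For scripts that visit several nodes, the URL rules above can be captured in a small helper. This is a minimal sketch: the function name ui_base_url and its parameters are hypothetical, and the only facts taken from this runbook are the default ports and the http/https choice.

```python
# Default web UI ports stated in this runbook.
DEFAULT_PORTS = {"master": 7000, "tserver": 9000}


def ui_base_url(host, role, tls=False, port=None):
    """Return the web UI base URL for a master or tserver node."""
    if role not in DEFAULT_PORTS:
        raise ValueError(f"unknown role: {role!r}")
    scheme = "https" if tls else "http"
    # An explicit port models a node where --webserver_port was overridden.
    return f"{scheme}://{host}:{port or DEFAULT_PORTS[role]}"


print(ui_base_url("10.0.0.1", "master"))
print(ui_base_url("10.0.0.2", "tserver", tls=True, port=9443))
```

The addresses and the port 9443 are placeholders; substitute the values from your own cluster.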
Part 1: Master UI Pages (Port 7000)
1.1 Home — /
The landing page. Shows the cluster UUID, the master's role (LEADER or FOLLOWER), and navigation links to all other pages.
When to use: Quick sanity check that the master is running and you're connected to the leader.
1.2 Tablet Servers — /tablet-servers
Lists every TServer in the cluster with:
| Column | What It Tells You |
|---|---|
| UUID | Unique identifier for each TServer |
| Host:Port | RPC address |
| Heartbeat Delay | Time since the last heartbeat. Values above a few seconds indicate network issues or an overloaded TServer |
| Status | ALIVE or DEAD |
| Uptime | How long the TServer has been running — helps spot recent restarts |
| RAM Used | Memory consumed by the TServer process |
| Num SST Files | Total SST file count across all tablets on this TServer |
| Total SST File Size | Aggregate on-disk data size |
| Uncompressed SST File Size | Data size before compression |
| Read/Write ops/sec | Current throughput on this TServer |
| Tablet Leaders / Total | Number of tablet leaders hosted vs total tablet replicas |
| Cloud/Region/Zone | Placement information |
When to use:
- Cluster health check: If any TServer shows DEAD status or a heartbeat delay > 10s, investigate that node
- Load imbalance: Compare tablet leader counts across TServers. A large skew may indicate a load balancer issue or a node that recently restarted and hasn't received leaders back
- Capacity planning: Check total SST file sizes to understand data distribution
- Post-restart verification: Confirm the restarted node is back to ALIVE and its tablet count is recovering
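The health check in the first bullet can be expressed as a simple filter. The sketch below assumes the rows have already been transcribed from the page into dictionaries; the field names (host, status, heartbeat_delay_s) are hypothetical and are not the page's or the JSON API's actual schema.

```python
def unhealthy_tservers(tservers, max_heartbeat_delay_s=10.0):
    """Return (host, reason) pairs for TServers that need a closer look."""
    findings = []
    for ts in tservers:
        if ts["status"] != "ALIVE":
            findings.append((ts["host"], f"status is {ts['status']}"))
        elif ts["heartbeat_delay_s"] > max_heartbeat_delay_s:
            findings.append((ts["host"], f"heartbeat delay {ts['heartbeat_delay_s']}s"))
    return findings


# Hypothetical rows, for illustration only.
rows = [
    {"host": "10.0.0.1:9100", "status": "ALIVE", "heartbeat_delay_s": 0.4},
    {"host": "10.0.0.2:9100", "status": "ALIVE", "heartbeat_delay_s": 12.5},
    {"host": "10.0.0.3:9100", "status": "DEAD", "heartbeat_delay_s": 95.0},
]
for host, reason in unhealthy_tservers(rows):
    print(host, reason)
```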
1.3 Tables — /tables
Lists all tables in the cluster, grouped into:
- User Tables — application tables and materialized views
- Index Tables — secondary indexes
- Parent Tables (for colocated databases)
- System Tables — internal catalog and transaction tables
Each entry shows table name, table UUID, state (RUNNING, ALTERING, etc.), keyspace, and table type (YSQL, YCQL, SYSTEM).
When to use:
- Verify a table exists after CREATE TABLE or during schema migration
- Check table state — an ALTERING state that persists may indicate a stuck DDL
- Find the table UUID needed for other API calls or yb-admin commands
1.4 Table Details — /table?id=<table_uuid>
The most frequently used troubleshooting page on the Master UI. Shows:
- Schema: Column names, types, and key information
- Tablets: Every tablet for this table, including:
  - Tablet UUID
  - Key range (partition start/end)
  - State (RUNNING, SPLITTING, NOT_STARTED, etc.)
  - Leader TServer and replica locations with their Raft roles (LEADER, FOLLOWER, LEARNER)
  - Size information per tablet (SST files, WAL)
Query parameters:
- id: table UUID
- show_deleted: include deleted/hidden tablets (useful for post-split debugging)
When to use:
- Find tablet IDs and leaders: needed before checking TServer-level pages like /waitqueue or /transactions
- Investigate tablet splitting: Look for tablets in SPLITTING state or check if child/parent tablets exist
- Detect under-replication: If a tablet has fewer replicas than the replication factor, or a replica is in NOT_STARTED state
- Identify hotspots: Tablets with significantly larger SST files may be receiving disproportionate write load
- Verify placement: Confirm replicas are spread across the expected zones/regions
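The under-replication rule above (too few replicas, or any replica in NOT_STARTED) can be sketched as follows. The input layout is hypothetical: it assumes the tablet rows were copied from the page into dictionaries with tablet_id and replicas keys.

```python
def under_replicated(tablets, replication_factor=3):
    """Return IDs of tablets with too few replicas or a NOT_STARTED replica."""
    flagged = []
    for tablet in tablets:
        states = [replica["state"] for replica in tablet["replicas"]]
        if len(states) < replication_factor or "NOT_STARTED" in states:
            flagged.append(tablet["tablet_id"])
    return flagged


# Hypothetical tablets, for illustration only.
tablets = [
    {"tablet_id": "tablet-1", "replicas": [{"state": "RUNNING"}] * 3},
    {"tablet_id": "tablet-2", "replicas": [{"state": "RUNNING"}] * 2},
    {"tablet_id": "tablet-3", "replicas": [
        {"state": "RUNNING"}, {"state": "RUNNING"}, {"state": "NOT_STARTED"},
    ]},
]
print(under_replicated(tablets))
```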
1.5 Masters — /masters
Lists all master nodes with their UUID, RPC address, role (LEADER/FOLLOWER), and state.
When to use:
- Verify master quorum — all masters should be visible and one must be LEADER
- After a master failover — confirm the new leader is elected and followers are connected
- Identify the leader master — important because some operations (e.g., DDL, load balancing) only run on the leader
1.6 Cluster Config — /cluster-config
Shows the cluster's replication configuration in protobuf text format:
- Replication factor
- Placement information (cloud, region, zone constraints)
- Read replica configuration
- Blacklisted TServers (marked for decommission)
- Encryption-at-rest status
When to use:
- Verify placement rules after a topology change
- Check blacklist — TServers being decommissioned appear here
- Confirm encryption status during security audits
- Troubleshoot placement failures — if tablets cannot find placement for replicas, check that placement constraints match available TServers
1.7 Replica Info — /tablet-replication or Master UI -> Utilities -> Replica Info
Shows the health of tablet replication across the cluster. Identifies:
- Under-replicated tablets (fewer live replicas than the replication factor)
- Tablets with unavailable leaders
When to use:
- After a node failure — check how many tablets lost a replica and whether re-replication is progressing
- During rolling upgrades — monitor for under-replicated tablets before proceeding to the next node
- Alert investigation — if monitoring alerts on under-replication, this page gives the full picture
1.8 Load Balancer — /load-distribution or Master UI -> Utilities -> Load Balancer
This page has three sections:
Section 1: Ongoing Remote Bootstraps
Shows active remote bootstrap (RBS) operations — i.e., tablet data being copied from one TServer to another. Each row contains:
| Column | Description |
|---|---|
| Tablet ID | The tablet being bootstrapped |
| Namespace.Table | The table the tablet belongs to |
| Source TServer UUID | The TServer sending the data |
| Destination TServer UUID | The TServer receiving the data |
| Progress | Current progress of the bootstrap operation |
If this table is empty, there are no active remote bootstraps. During rebalancing or after a node restart, you will typically see entries here as tablets are moved or re-replicated.
Section 2: Last Run Summary
Shows the result of the load balancer's most recent run, split into two sub-tables:
Warnings Summary — warnings generated by the load balancer, grouped by type. Each row shows an example warning message and the count of similar warnings.
Example Warnings Summary output:
| Example Warning | Count |
|---|---|
| Skipping adding replicas for table 000034d4000030008000000000004003: Cannot add replicas. Currently have a total overreplication of 50, when max allowed is 50, overreplicated tablets: 35da74a6..., 16d4a085... | 3 |
In the first warning, the load balancer tried to add replicas (to move tablets for better balance) but could not, because 50 tablets are already over-replicated; that is, they temporarily have more replicas than the replication factor. 50 is the maximum allowed by load_balancer_max_over_replicated_tablets. This means the removal step (RemoveServer, the last step of the move sequence described under Tasks Summary) is lagging behind. This is common during large-scale rebalancing (e.g., after adding a new node) and usually resolves on its own as removals catch up. If it persists, check whether TServers are overloaded (slow Raft config changes) or whether there are stuck remote bootstraps.
In the second warning, two tablets have replicas in wrong placements but there's no eligible TServer to move them to. Verify the cluster has enough TServers in each required placement zone using /tablet-servers.
Tasks Summary — load balancer tasks grouped by type and state. Each row shows:
| Column | Description |
|---|---|
| Example description | A representative description of the task type and reason for the task |
| Task state | Current state — e.g., kRunning, kComplete, kFailed, kAborted |
| Count | Number of tasks of this type in this state |
| Example status (if complete) | For terminal-state tasks, shows the completion status |
Example Tasks Summary output:
| Example description | Task state | Count | Example status |
|---|---|---|---|
| AddServer ChangeConfig RPC for tablet 35da74a6... (t1 [id=000034d4...004003]) on peer b73797f7... Reason: Source tserver has more tablets for this table than destination (95 > 20) | kComplete | 10 | OK |
| Stepdown Leader RPC for tablet 16d4a085... (t1 [id=000034d4...004003]) on peer be077ef1... Reason: Tablet is over-replicated (this is expected if the tablet is being moved) | kComplete | 38 | OK |
| RemoveServer ChangeConfig RPC for tablet 067def28... (t1 [id=000034d4...004003]) on peer f322db17... with cas_config_opid_index 9. Reason: Tablet is over-replicated (this is expected if the tablet is being moved) | kComplete | 4 | OK |
How to interpret the task types:
The load balancer moves a tablet from one TServer to another in a three-step sequence:
1. AddServer: adds the destination TServer as a new replica for the tablet. The "Reason" field explains why the move was initiated (e.g., "Source tserver has more tablets for this table than destination (95 > 20)" means TServer A had 95 replicas of that table while TServer B had only 20, so a replica is being moved to balance the load).
2. Stepdown Leader: if the tablet being moved is a leader on the source TServer, the load balancer first steps it down so that a different replica becomes leader. The reason "Tablet is over-replicated" is expected: at this point the tablet temporarily has an extra replica (the new one added in step 1).
3. RemoveServer: removes the old replica from the source TServer, bringing the replica count back to the replication factor.
A healthy rebalancing operation shows all three task types with status OK. If you see kFailed tasks, check the status column for the error — common failures include RPC timeouts (TServer overloaded) or Tablet not found (tablet was deleted or split during the move).
When the load balancer is idle and the cluster is balanced, both tables will be empty. Active entries here indicate the load balancer is working to rebalance the cluster.
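To triage a long Tasks Summary, it can help to total the rows by task type and state and then list the types that have failures. The sketch below assumes the rows were copied from the page as (description, state, count) tuples; the helper names are hypothetical, and the kFailed row is invented for illustration and does not come from the sample output above.

```python
from collections import Counter

# Task types named in the three-step move sequence.
TASK_TYPES = ("AddServer", "Stepdown Leader", "RemoveServer")


def summarize_tasks(rows):
    """Total task counts per (task type, state)."""
    totals = Counter()
    for description, state, count in rows:
        task_type = next((t for t in TASK_TYPES if description.startswith(t)), "Other")
        totals[(task_type, state)] += count
    return totals


def failed_types(totals):
    """Return the task types that have at least one kFailed task."""
    return sorted({t for (t, state), n in totals.items() if state == "kFailed" and n > 0})


rows = [
    ("AddServer ChangeConfig RPC for tablet 35da74a6...", "kComplete", 10),
    ("Stepdown Leader RPC for tablet 16d4a085...", "kComplete", 38),
    ("RemoveServer ChangeConfig RPC for tablet 067def28...", "kComplete", 4),
    ("RemoveServer ChangeConfig RPC (hypothetical failed row)", "kFailed", 2),
]
totals = summarize_tasks(rows)
print(failed_types(totals))
```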
Section 3: Tablet Distribution
A matrix showing how tablet replicas and leaders are distributed across TServers for every user table and index:
- Rows: One row per user table/index, showing keyspace, table name (linked to the /table detail page), and total tablet count
- Columns: One column per TServer, identified by host:port and UUID
- Cells: Each cell shows Total/Leaders: the total number of replicas of that table on that TServer, and how many of those are leaders
How to read the Tablet Distribution table:
For a well-balanced cluster with replication factor 3 and N TServers, you should see:
- Total replicas per TServer roughly equal to (tablet_count × 3) / N for each table
- Leaders per TServer roughly equal to tablet_count / N for each table
- Significant skew in either column indicates an imbalance the load balancer should address
Example interpretation:
Keyspace | Table | Tablets | TServer-1 | TServer-2 | TServer-3
---------|------------|---------|------------|------------|----------
mydb | orders | 6 | 6/2 | 6/2 | 6/2
mydb | customers | 6 | 6/4 | 6/1 | 6/1
In this example, orders is well balanced (2 leaders each), but customers has a leader imbalance — TServer-1 has 4 out of 6 leaders, which means reads and writes for this table are disproportionately hitting TServer-1.
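The arithmetic behind this reading can be sketched in a few lines. The function names and the tolerance of one leader are assumptions chosen for illustration; the cell values are the ones from the example above.

```python
def expected_per_tserver(tablet_count, num_tservers, replication_factor=3):
    """Return (expected replicas, expected leaders) per TServer for one table."""
    replicas = tablet_count * replication_factor / num_tservers
    leaders = tablet_count / num_tservers
    return replicas, leaders


def leader_skew(cells, tablet_count, tolerance=1):
    """Return TServers whose leader count is off by more than `tolerance`.

    `cells` maps a TServer name to its (total, leaders) cell.
    """
    _, expected_leaders = expected_per_tserver(tablet_count, len(cells))
    return {
        tserver: leaders
        for tserver, (_, leaders) in cells.items()
        if abs(leaders - expected_leaders) > tolerance
    }


orders = {"TServer-1": (6, 2), "TServer-2": (6, 2), "TServer-3": (6, 2)}
customers = {"TServer-1": (6, 4), "TServer-2": (6, 1), "TServer-3": (6, 1)}

print(expected_per_tserver(6, 3))   # (6.0, 2.0)
print(leader_skew(orders, 6))       # {}
print(leader_skew(customers, 6))    # {'TServer-1': 4}
```

With 6 tablets on 3 TServers, each node is expected to hold 6 replicas and 2 leaders, so only TServer-1's 4 leaders for customers exceed the tolerance.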
When to use:
- After adding/removing nodes — verify the load balancer is rebalancing tablets. New nodes should gradually appear in the distribution with increasing replica counts
- Tablet imbalance investigation — if some nodes have significantly more tablets or leaders than others, this table pinpoints exactly which tables are skewed and on which TServers
- Stalled rebalancing — if the Tasks Summary shows no active tasks but the distribution is still uneven, the load balancer may be blocked (check Warnings Summary for the reason)
- Leader imbalance — even if total replicas are balanced, leaders may be skewed. This matters because leaders handle all reads and writes. Look for tables where one TServer has significantly more leaders than others
- Post-failure recovery — after a TServer comes back, monitor this page to see replicas and leaders being redistributed back to it
- Remote bootstrap monitoring — the Ongoing Remote Bootstraps section shows the actual data movement in progress during rebalancing
1.9 Tasks — /tasks
Shows background tasks on the master, such as:
- Tablet creation
- Tablet deletion
- Snapshot operations
- Async flush operations
When to use:
- DDL operations hanging — check if there are stuck tasks
- Post-split cleanup — verify parent tablet deletion tasks are completing
- Snapshot operations — monitor backup/restore progress
1.10 xCluster — /xcluster
Shows xCluster (cross-cluster) replication configuration, including:
- Replication groups
- Tables being replicated
- Replication lag
- Stream status
When to use:
- xCluster replication lag — identify which tables or streams are lagging
- Replication health — verify all streams are active
- Post-failover — confirm replication state after a cluster failover
1.11 Namespaces — /namespaces
Lists all namespaces (databases in YSQL, keyspaces in YCQL) with their UUID and type.
When to use:
- Verify database existence after CREATE DATABASE
- Find namespace UUID for API calls
1.12 Tablet Server Clocks — /tablet-server-clocks
Shows the hybrid clock skew between the master and each TServer.
When to use:
- Clock skew alerts — large skew can cause transaction errors.
- NTP configuration issues — high skew usually points to misconfigured NTP.
1.13 Stateful Services — /stateful-services
Shows stateful services (like transaction status tables, metrics snapshots service).
When to use: Diagnosing issues with internal stateful services.
Part 2: TServer UI Pages (Port 9000)
2.1 Home — /
The landing page for the TServer, showing navigation links and basic server info.
2.2 Tables — /tables
Lists all tables that have tablets on this specific TServer, with per-table aggregates:
- SST file count and total size
- WAL file count and size
- State
- Raft leader count for this table on this TServer
When to use:
- Disk usage investigation — identify which tables consume the most space on this node
- Compare across TServers — large differences for the same table may indicate a hotspot or uneven splitting
2.3 Tablets — /tablets
Lists every tablet replica hosted on this TServer:
- Tablet UUID
- Table name
- Raft role (LEADER, FOLLOWER, LEARNER)
- State (RUNNING, TABLET_DATA_COPYING, NOT_STARTED, SHUTDOWN, FAILED)
- SST file size and count
- WAL size
- Last status
When to use:
- Identify large tablets — candidates for tablet splitting
- Disk breakdown — see exactly which tablets use the most space
2.4 Tablet Details — /tablet?id=<tablet_id>
Detailed view of a single tablet:
- Schema information
- Links to the tablet's consensus status, log anchors, transactions, RocksDB, wait queue, shared lock manager, and preparer pages
When to use: As an entry point to drill into tablet-specific debugging pages.
2.5 Tablet Consensus Status — /tablet-consensus-status?id=<tablet_id>
Shows the Raft consensus state for a specific tablet:
- Current term and role
- Leader UUID
- Committed and received OpId indices
- Pending operations count
- Raft Config
When to use:
- Leader election issues — check if a tablet is stuck in leader election or has no leader
- Replication lag — compare committed vs received OpId across replicas
- Split-brain investigation — verify all replicas agree on the current term and leader
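Comparing replicas is easier once the values from each replica's consensus page are side by side. A minimal sketch, assuming the term and committed index were noted by hand for each peer; the dictionary layout and peer names are hypothetical.

```python
def commit_lag(replicas):
    """Return, per peer, how many entries it is behind the highest committed index."""
    head = max(r["committed"] for r in replicas.values())
    return {peer: head - r["committed"] for peer, r in replicas.items()}


def terms_agree(replicas):
    """True if every replica reports the same Raft term."""
    return len({r["term"] for r in replicas.values()}) == 1


# Hypothetical values, for illustration only.
replicas = {
    "peer-a": {"term": 7, "committed": 1200, "received": 1200},
    "peer-b": {"term": 7, "committed": 1198, "received": 1200},
    "peer-c": {"term": 6, "committed": 950, "received": 960},
}
print(commit_lag(replicas))
print(terms_agree(replicas))
```

Here peer-c is both far behind and on an older term, which makes it the replica to inspect first.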
2.6 Transactions — /transactions?id=<tablet_id>
Shows all transaction participants on a specific tablet:
- Transaction ID
- Start time
- Last known status (PENDING, COMMITTED, ABORTED)
- Number of write intents (next_write_id)
- Transaction metadata
When to use:
- Zombie transaction detection: look for PENDING transactions with a start_time older than 1 hour. These may be zombies (orphaned transactions with no active backend)
- Transaction volume: see how many active transactions touch this tablet
- Post-zombie-cleanup verification: after cancelling a zombie, confirm it no longer appears here
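The one-hour rule for zombie candidates can be sketched as a filter over the page's rows. The dictionary keys (txn_id, status, start_time) are hypothetical stand-ins for the values shown on the page, and start_time is assumed to have been parsed into a timezone-aware datetime already.

```python
from datetime import datetime, timedelta, timezone


def zombie_candidates(transactions, now, max_age=timedelta(hours=1)):
    """Return IDs of PENDING transactions that started more than `max_age` ago."""
    return [
        txn["txn_id"]
        for txn in transactions
        if txn["status"] == "PENDING" and now - txn["start_time"] > max_age
    ]


# Hypothetical transactions, for illustration only.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
txns = [
    {"txn_id": "txn-a", "status": "PENDING", "start_time": now - timedelta(hours=3)},
    {"txn_id": "txn-b", "status": "PENDING", "start_time": now - timedelta(minutes=5)},
    {"txn_id": "txn-c", "status": "COMMITTED", "start_time": now - timedelta(hours=2)},
]
print(zombie_candidates(txns, now))
```

A match is only a candidate: a long-running but legitimate transaction would also be listed, so confirm there is no active backend before cancelling anything.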
2.7 Wait Queue — /waitqueue?id=<tablet_id>
Shows the current transaction wait queue on a tablet. Contains two sections:
Txn Waiters table:
| Column | Description |
|---|---|
| WaiterId | Transaction ID of the waiting (blocked) transaction |
| RequestId | Internal request identifier |
| PgSessionReqVersion | Session request version |
| BlockerId | Transaction ID of the blocking transaction |
Blockers table:
| Column | Description |
|---|---|
| BlockerId | Transaction ID of the blocker |
| Status | PENDING, released, committed, or aborted |
When to use:
- Blocked queries/transactions — identify what is blocking a stuck DML
- Zombie transaction identification: a PENDING blocker with no corresponding entry in pg_stat_activity is a zombie
- Deadlock investigation: look for circular wait patterns across tablets
Important: The wait queue is transient — entries only appear while a transaction is actively waiting. If a DML times out, the waiter is removed. You must check this page while a query is actively stuck.
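A circular wait can be spotted by treating each (WaiterId, BlockerId) row as an edge in a wait-for graph and searching for a cycle. The sketch below uses a depth-first search; the function name and the short transaction IDs are hypothetical, and the edges are assumed to have been collected from the Txn Waiters tables of the tablets involved.

```python
def find_wait_cycle(edges):
    """Return one cycle as a list of transaction IDs, or None if there is none.

    `edges` is an iterable of (waiter_id, blocker_id) pairs.
    """
    graph = {}
    for waiter, blocker in edges:
        graph.setdefault(waiter, set()).add(blocker)

    visiting, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Back edge: the cycle is the tail of the current path.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for blocker in sorted(graph.get(node, ())):
            cycle = visit(blocker)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in sorted(graph):
        cycle = visit(start)
        if cycle:
            return cycle
    return None


print(find_wait_cycle([("t1", "t2"), ("t2", "t3"), ("t3", "t1")]))
print(find_wait_cycle([("t1", "t2"), ("t3", "t2")]))
```

The first call prints a cycle starting and ending at t1; the second prints None because both waiters are simply queued behind t2.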
2.8 RocksDB — /rocksdb?id=<tablet_id>
Shows RocksDB (DocDB storage engine) details for a specific tablet:
- SST file list with sizes, levels, and key ranges
- Compaction statistics
- Block cache hit rates
- Bloom filter statistics
- Read/write amplification
- RocksDB options in effect
When to use:
- Compaction issues — check if compaction is falling behind (many L0 files) or if post-split compaction is stuck
- Read performance — low block cache hit rates or high read amplification may explain slow reads
- Storage investigation — see the actual SST files and their sizes at each level
- Tombstone accumulation — excessive tombstones (from DELETEs) can degrade read performance
2.9 In-Memory Locks — /sharedlockmanager?id=<tablet_id>
Shows in-memory locks held on a specific tablet.
When to use:
- Lock contention debugging — see what locks are currently held at the tablet level
- Complement to wait queue — provides a different view of lock state
2.10 Log Anchors — /log-anchors?id=<tablet_id>
Shows WAL (Write-Ahead Log) anchors for a tablet. Anchors prevent WAL segments from being garbage collected.
When to use:
- WAL disk usage growing — if WAL files are not being cleaned up, check for stale log anchors
- Slow followers — a lagging follower can anchor old WAL segments on the leader
2.11 Preparer — /preparer?id=<tablet_id>
Shows the transaction preparer state for a tablet.
When to use: Advanced debugging of transaction preparation pipeline issues.
2.12 Operations — /operations
Shows in-flight RPC operations across all tablets on this TServer:
- Operation type (write, read, etc.)
- Duration
- Target tablet
- Client information
Query parameters:
- raw: return raw text instead of HTML
- include_traces: include detailed trace information for each operation
When to use:
- Slow operations — find long-running operations that may indicate a problem
- Operation type breakdown — understand the read/write mix hitting this TServer
- Stuck operations — operations that have been running for an unusually long time
2.13 Remote Bootstraps — /remotebootstraps
Shows active remote bootstrap operations (tablet data transfer from one TServer to another).
When to use:
- After adding a new node — monitor bootstrap progress
- Tablet re-replication — after a node failure, check bootstrap progress on surviving nodes
- Stuck bootstraps — bootstraps that don't complete may indicate network or disk issues
2.14 Maintenance Manager — /maintenance-manager
Shows maintenance operations (compactions, flushes, WAL cleanup) and their scheduling state.
Query parameters:
- raw: return raw protobuf text
When to use:
- Compaction scheduling — see which compactions are queued and running
- Resource contention — check if maintenance operations are starved
2.15 Snapshots — /snapshots
Shows snapshot state on this TServer.
When to use: Monitoring backup/restore snapshot operations.
2.16 xCluster — /xcluster
Shows xCluster consumer (replication target) state on this TServer.
When to use: Debugging xCluster replication issues at the TServer level.
2.17 TSLocalLockManager — /TSLocalLockManager
Shows the TServer-level local lock manager state.
When to use: Advanced debugging of object-level locks.
Part 3: Common Pages (Available on Both Master and TServer)
These pages are available on both Master (port 7000) and TServer (port 9000):
3.1 Flags — /varz
Shows all gflags and their current values for the process.
JSON API: /api/v1/varz
When to use:
- Verify flag values — confirm that a configuration change was applied
- Compare flags across nodes — differing flags can cause inconsistent behavior
- Troubleshoot unexpected behavior — check if non-default flags explain the issue
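Comparing flags across nodes amounts to a dictionary diff. The sketch below assumes the flags for each node have already been collected into a plain {flag: value} mapping; how that mapping is extracted from the /api/v1/varz response is not shown, and the node names and values are placeholders.

```python
def differing_flags(flags_by_node):
    """Return {flag: {node: value}} for flags that are not identical on all nodes.

    A flag missing on a node is reported with the value None.
    """
    all_flags = set().union(*flags_by_node.values())
    diffs = {}
    for flag in sorted(all_flags):
        values = {node: flags.get(flag) for node, flags in flags_by_node.items()}
        if len(set(values.values())) > 1:
            diffs[flag] = values
    return diffs


# Placeholder values, for illustration only.
flags_by_node = {
    "node-1": {"webserver_port": "9000", "load_balancer_max_over_replicated_tablets": "50"},
    "node-2": {"webserver_port": "9000", "load_balancer_max_over_replicated_tablets": "10"},
    "node-3": {"webserver_port": "9000"},
}
print(differing_flags(flags_by_node))
```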
3.2 Memory (Total) — /memz
Shows total memory usage for the process, including:
- Heap size
- Mapped memory
- Process RSS
When to use:
- OOM investigation — check memory before and after the event
- Memory growth — compare current usage against baseline
3.3 Memory (Detail) — /mem-trackers
Shows detailed memory breakdown by tracker (e.g., block cache, tablet overhead, RPC layer, log cache).
JSON API: /api/v1/mem-trackers
When to use:
- Identify memory consumers — find which subsystem is using the most memory
- Block cache tuning — check actual block cache usage vs limit
- Memory leak investigation — track growth of individual trackers over time
- Per-tablet memory — see which tablets consume the most memory
3.4 Logs — /logs
Shows the most recent log entries from the process's log file.
When to use:
- Quick log inspection — when you don't have SSH access to the node
- Correlate with events — check what the process was doing at a specific time
3.5 Metrics — /metrics and /prometheus-metrics
Exposes all internal metrics in JSON (/metrics) or Prometheus (/prometheus-metrics) format. Use the YugabyteDB Anywhere dashboard to visualize these metrics.
3.6 Drives — /drives
Shows filesystem usage for all data directories.
When to use:
- Disk space issues — check used vs total for each data directory
- Multi-disk setups — verify all configured drives are present and have expected capacity
3.7 TLS Certificates — /tls
Shows TLS certificate details if encryption in transit is enabled.
When to use:
- Certificate expiry — check when certificates will expire
- TLS troubleshooting — verify the correct certificates are loaded
3.8 Version Info — /api/v1/version-info
JSON endpoint returning the build version, git hash, and build timestamp.
When to use:
- Version verification — confirm all nodes run the same version during upgrades
- Bug reporting — include exact version in bug reports
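During a rolling upgrade, the version check reduces to grouping nodes by the version string each one reports. A minimal sketch, assuming the version strings have already been read from each node; the node names and version numbers are placeholders, and the response fields of the endpoint are not modeled here.

```python
from collections import defaultdict


def group_by_version(versions_by_node):
    """Return {version: [nodes]} so mixed-version clusters are easy to spot."""
    groups = defaultdict(list)
    for node, version in sorted(versions_by_node.items()):
        groups[version].append(node)
    return dict(groups)


def upgrade_complete(versions_by_node, target):
    """True once every node reports the target version."""
    return all(version == target for version in versions_by_node.values())


# Placeholder versions, for illustration only.
versions = {"node-1": "2.20.2.0", "node-2": "2.20.2.0", "node-3": "2.20.1.0"}
print(group_by_version(versions))
print(upgrade_complete(versions, "2.20.2.0"))
```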
3.9 Profiling — /pprof/...
CPU and memory profiling endpoints (same as Go/C++ pprof):
| Endpoint | What It Does |
|---|---|
| /pprof/heap | Heap memory profile |
| /pprof/profile | CPU profile (blocks for the sampling duration) |
| /pprof/growth | Heap growth profile |
| /pprof/contention | Lock contention profile |
| /pprof/symbol | Symbol resolution |
| /pprof/cmdline | Process command line |
When to use:
- Performance investigation — capture a CPU profile during high-latency periods
- Memory leak hunting — capture heap profiles at intervals and compare
- Lock contention — identify hot locks under high concurrency
Part 4: JSON API Quick Reference
These endpoints return machine-readable JSON and are useful for scripting and automation.
Master JSON APIs
| Endpoint | Description |
|---|---|
| /api/v1/tablet-servers | TServer list with status, metrics |
| /api/v1/health-check | Cluster health status |
| /api/v1/tablet-replication | Tablet replication health |
| /api/v1/tablet-under-replication | Under-replicated tablets |
| /api/v1/tables | All tables (use ?only_user_tables=true for user tables only) |
| /api/v1/table?id=<uuid> | Single table details |
| /api/v1/namespaces | All namespaces/databases |
| /api/v1/masters | Master list with roles |
| /api/v1/cluster-config | Cluster configuration |
| /api/v1/is-leader | Whether this master is the leader |
| /api/v1/version | Build version |
| /api/v1/varz | All flags as JSON |
| /api/v1/mem-trackers | Memory tracker details |
| /api/v1/meta-cache | Metadata cache state |
| /api/v1/xcluster | xCluster replication state |
| /api/v1/stateful-services | Stateful services info |
| /dump-entities | Full cluster entity dump (tables, tablets, TServers) |
TServer JSON APIs
| Endpoint | Description |
|---|---|
| /api/v1/health-check | TServer health status |
| /api/v1/version | Build version |
| /api/v1/masters | Master list (as seen by this TServer) |
| /api/v1/tablets | All tablets on this TServer |
| /api/v1/varz | All flags as JSON |
| /api/v1/mem-trackers | Memory tracker details |
| /api/v1/meta-cache | Metadata cache state |
| /api/v1/xcluster | xCluster consumer state |
Part 5: Tips and Best Practices
- Always start at the Master UI. It gives the cluster-wide view. Drill into TServer UIs only once you've identified the specific node and tablet to investigate.
- Use JSON APIs for scripting. The /api/v1/ endpoints are stable and return structured data. Prefer them over scraping HTML pages.
- Use /dump-entities for full cluster snapshots. This Master endpoint returns every table, tablet, and TServer in one JSON blob, which is useful for offline analysis or sharing with engineering.
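As a starting point for scripting against the JSON endpoints, the sketch below wraps the standard library's urlopen in a small helper. The helper name fetch_json and the injectable opener parameter are assumptions made so the example runs without a live cluster; the fake opener only echoes the requested URL and does not imitate any real response.

```python
import io
import json
from urllib.request import urlopen


def fetch_json(base_url, path, opener=urlopen, timeout=5):
    """GET base_url + path and parse the response body as JSON."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    with opener(url, timeout=timeout) as response:
        return json.load(response)


def fake_opener(url, timeout):
    """Stand-in for a live node: returns a canned payload echoing the URL."""
    return io.BytesIO(json.dumps({"requested": url}).encode())


print(fetch_json("http://127.0.0.1:7000/", "/api/v1/is-leader", opener=fake_opener))
```

Against a real node, drop the opener argument so the default urlopen is used, and pass the base URL of the master or TServer you want to query.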