Description
Currently in Phoenix, we create one HBase scan per region to be scanned and submit these scans for execution in parallel. The HBase scan metrics are collected as the sum of scan metrics across all parallel scans. But because the scans execute in parallel, the sum isn't very helpful: it doesn't let us focus on the slowest scan, which is what slows down the whole query.
We need a way to identify the top N slowest scans so that their metrics can be captured for reporting and for debugging why query execution slowed down.
Additionally, with HBASE-29233, we capture the region hash and RS name in the scan metrics by capturing region-level scan metrics. If we can capture the region hash and RS name for the top N slowest scans, then we can pinpoint exactly which RS is running the slowest scans.
Thus, enabling HBASE-29233 along with reporting metrics for the top N slowest scans can help track down the slowest scan and drill down into the reason it was slow.
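One possible way to keep only the top N slowest scans is a bounded min-heap keyed on scan duration. The sketch below is illustrative only and is not Phoenix or HBase code: the class TopNSlowestScans, the record ScanStats, and its fields (regionHash, regionServer, durationMs) are hypothetical names chosen for this example, and it assumes each parallel scan reports one record when it finishes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch: retains the N slowest scans seen so far. */
public class TopNSlowestScans {

    /** Hypothetical per-scan record; not an existing Phoenix/HBase type. */
    public static final class ScanStats {
        public final String regionHash;
        public final String regionServer;
        public final long durationMs;

        public ScanStats(String regionHash, String regionServer, long durationMs) {
            this.regionHash = regionHash;
            this.regionServer = regionServer;
            this.durationMs = durationMs;
        }
    }

    private static final Comparator<ScanStats> BY_DURATION =
        Comparator.comparingLong((ScanStats s) -> s.durationMs);

    private final int n;
    // Min-heap: the head is the fastest of the retained scans,
    // so it is the one to evict when a slower scan arrives.
    private final PriorityQueue<ScanStats> heap;

    public TopNSlowestScans(int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        this.n = n;
        this.heap = new PriorityQueue<>(n, BY_DURATION);
    }

    /** Called once per finished scan; synchronized because scans run in parallel. */
    public synchronized void offer(ScanStats stats) {
        if (heap.size() < n) {
            heap.add(stats);
        } else if (heap.peek().durationMs < stats.durationMs) {
            heap.poll();
            heap.add(stats);
        }
    }

    /** Returns the retained scans ordered from slowest to fastest. */
    public synchronized List<ScanStats> slowestFirst() {
        List<ScanStats> out = new ArrayList<>(heap);
        out.sort(BY_DURATION.reversed());
        return out;
    }
}
```

Each offer costs O(log N) and memory stays bounded at N records regardless of how many regions the query touches, which is why a heap is used here instead of sorting all per-scan metrics at the end.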