Release 0.6 changes
Paresh Gupta edited this page Jun 11, 2021
- Pulls the MgmtEntity class to get the FI leadership state (primary or subordinate)
- Added location in BackplanePortStats
- FI-A and FI-B show their leadership states - Primary or Subordinate
- Fixed the over-reporting of PAUSE frames in Locations dashboard
- Added new use-case for top 10 congested servers
- Edited the links to carry the current time range
- Edited the links so they no longer open in a new tab. Use browser functionality (middle-click or right-click > Open in new tab) if opening in a new tab is required.
- Improved calculation of total FC and Eth traffic on locations dashboard
- Added location filter in top-10 panels on Locations dashboard
- Added UTM version in the locations dashboard
- Removed the Multistat-panel horizontal bar charts on the Locations dashboard. The bar graphs now use native Grafana table gradient bars.
- Because of the above change, the Locations dashboard bar graphs offer a compact design with more high-level visualization in less space.
- Fixed the occasional display of 0 as the total number of uplink and server ports on the Locations dashboard.
- Added new bar charts with domain name, FI ID, and port name for Eth and FC errors on Locations dashboard. Also added the errors from Server ports.
- Error counters now use sum() instead of mean()
- Changes on Ingress Traffic Congestion:
- Renamed the dashboard from Ingress Traffic Congestion to Congestion Monitoring
- Deprecated the Chassis PAUSE frame monitoring dashboard and migrated its use-cases to the Congestion Monitoring dashboard
- Added use-cases for top-10 congested ports, top-10 congested servers, and many more.
- Updated the navigation on the other dashboards
- The top-10 tabular views offer Avg and Peak utilization
- In the Domain Traffic dashboard, under the row with the tabular view of uplink and server ports, added avg, peak, errors, and port speed.
- Using max instead of mean for all graphs.
- The mean calculation flattens peaks when the traffic fluctuates. The mean may look nicer, but it can mislead by hiding high link utilization. This behavior is not visible when traffic is constant and the selected time duration is short enough that the Grafana interval is 1m, which is also the default UTM collector polling interval. However, as the duration increases, the Grafana interval also increases, and the mean calculation flattens the peak within each GROUP BY time bucket. For example, when the interval is 1m, max, mean, and last all equal the single value. But when the interval is 2m with values 10 and 2, the mean becomes 6, flattening the peak of 10. Using max, 10 is reported, which retains the peak and the severity of the utilization.
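The flattening described above can be reproduced with a short sketch (the per-minute samples and the `bucketize` helper are hypothetical; UTM performs this aggregation in Grafana/InfluxDB, not in Python):

```python
# Sketch of how GROUP BY time aggregation flattens peaks.
# Hypothetical per-minute utilization samples (%); not real UTM data.
samples = [10, 2, 3, 9, 1, 8]

def bucketize(values, bucket_size):
    """Group consecutive samples into fixed-size time buckets."""
    return [values[i:i + bucket_size] for i in range(0, len(values), bucket_size)]

# 1m interval: one sample per bucket, so mean == max == the value itself.
one_min = bucketize(samples, 1)
assert all(max(b) == sum(b) / len(b) for b in one_min)

# 2m interval: mean flattens the peak, max retains it.
two_min = bucketize(samples, 2)            # [[10, 2], [3, 9], [1, 8]]
means = [sum(b) / len(b) for b in two_min]
peaks = [max(b) for b in two_min]
print(means)  # [6.0, 6.0, 4.5] - the 10% peak is reported as 6.0
print(peaks)  # [10, 9, 8] - the 10% peak survives
```

The longer the selected time range, the wider the buckets, and the more the mean understates short bursts; max never understates them.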
- For many single-stat displays, changed the time range from
  `time > [[__to]]ms - [[5m_in_ms]]ms and time < [[__to]]ms`
  to
  `$timeFilter`
  - Pros: Displays the output even if only a single data point is available in the selected time range. The earlier implementation limited the range to 5 minutes, which may not show any results in some cases.
  - Cons: Relatively slower to load. The earlier implementation limited the data points to the last 5 minutes regardless of the selected time range; the changed implementation scans the entire selected time range. If you select the last 30 days, it looks through the entire range (for some metrics, like count, etc.).
- In some cases, a link utilization may be shown as more than 100%. Consider any higher value to be the same as 100%. This behavior happens for two reasons:
- Clock skews in metric calculation within UCS, collection by UTM, etc. By default, the UTM collector pulls metrics every 60 seconds. To calculate the rate, the difference in bytes is divided by 60. In some cases, if the metrics are collected after 61 seconds, the difference is still divided by 60, resulting in > 100% reporting. I don't want to artificially hide this reality.
- In some cases, especially on FC interfaces, the advertised speed differs from the maximum throughput capacity of the link. For example, an 8G FC port can transmit only 6.8 Gbps. Likewise, a 16G FC link can transmit only 13.8 Gbps. UTM, in favor of simplicity, uses an fc_data_rate variable set at 0.85 to adjust the % utilization. The 0.85 value is close, but not perfect, for the different FC speeds. Hence, it reports minor deviations from the exact value. In all such cases, your action plan should not change because of these minor differences. Be it 90%, 100%, or 102%, all such values indicate that you should upgrade or move your workloads to higher-capacity servers/links.
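Both causes can be illustrated with a small sketch (the `utilization_pct` function and the sample byte counts are hypothetical, not UTM code; only the 60-second interval, the 0.85 fc_data_rate value, and the 13.8 Gbps figure come from the description above):

```python
# Sketch of why reported utilization can exceed 100%. Illustrative only.

POLL_INTERVAL_S = 60   # UTM collector's default polling interval
FC_DATA_RATE = 0.85    # UTM's approximation for FC max throughput vs advertised speed

def utilization_pct(delta_bytes, speed_gbps, fc=False):
    """Rate = byte delta / assumed 60s interval, as % of link capacity."""
    rate_bps = delta_bytes * 8 / POLL_INTERVAL_S
    capacity_bps = speed_gbps * 1e9
    if fc:
        capacity_bps *= FC_DATA_RATE  # adjust advertised speed toward real max
    return 100 * rate_bps / capacity_bps

# Reason 1: clock skew. A 10G link fully utilized for 61 seconds, but the
# byte delta is still divided by the assumed 60-second interval.
delta = 10e9 / 8 * 61                          # bytes actually moved in 61 s
print(utilization_pct(delta, 10))              # ~101.7 -> reported above 100%

# Reason 2: 0.85 is close but not exact. A 16G FC link running at its real
# max of 13.8 Gbps, while 16 * 0.85 assumes a 13.6 Gbps capacity.
delta_fc = 13.8e9 / 8 * 60
print(utilization_pct(delta_fc, 16, fc=True))  # ~101.5 -> reported above 100%
```

In both cases the overshoot is a reporting artifact of a few percent, which is why any value near or above 100% calls for the same action.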