Replies: 2 comments
-
@Sehiro thanks for the suggestion. Node-level network metrics would be useful for spotting bottlenecks. I've looked at […]. I'll keep investigating whether there's a way to get more granular node network data that would be useful for real-time monitoring.
-
@rcourtman thanks for your reply. The native Proxmox frontend shows some metrics under Summary, and since you can choose the same granularity there as for rrddata, I'd assume that rrddata returns the data that is visualized on the Summary page. In the Proxmox frontend, there is no way to see any of these metrics for all nodes on one page. As I understand the project, it is all about seeing the current status of the cluster's health. With this in mind, only the "hourly" metrics would be of interest. Choosing hourly in the Summary pane of a node shows you quite well how your network is doing, even if the metric is updated only once a minute. Whether average or maximum would be the best number to show is also debatable... If it were possible to get more up-to-date network data, that would be even better :)
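To make the average-versus-maximum question concrete, here is a minimal sketch with invented sample values (not real Proxmox data). It only shows how a single short burst dominates the maximum while barely standing out in the average; the function name `summarize` is hypothetical.

```python
from statistics import mean


def summarize(samples):
    """Return the average and the maximum of a list of throughput samples."""
    return {"average": mean(samples), "maximum": max(samples)}


# Invented per-minute throughput values in MB/s; one minute contains a burst.
samples_mb_per_s = [12.0, 15.0, 11.0, 118.0, 14.0]

summary = summarize(samples_mb_per_s)
print(summary)
```

With these numbers the average stays far below the burst, so a saturated link would be easy to miss if only the average were displayed.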
-
Hi,
First of all, thanks for the project and the work you're putting in. It is really awesome to have all relevant metrics at a glance.
Some resources are limited per node, such as CPU, memory, and disk space, and they can all be seen at once. Another limited resource is network I/O. It is quite useful to see net-in and net-out per VM, but the overall bottleneck is the node's network interfaces. Especially if you are using Ceph, the net I/O per VM, or even the cumulative net I/O of all VMs, might not tell you anything, because Ceph itself might be using 100% of your network interfaces.
I have looked at the API and, as far as I can see, this counter can only be obtained by calling /nodes/{node}/rrddata for each node (maybe I have overlooked something). I realize that this would cause much more traffic than one call to /cluster/resources. For me, it would be worth it ;) I would also use the MAX parameter and not AVERAGE, because I would like to see whether my link gets saturated.
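A rough sketch of what the per-node polling described above could look like, using only the Python standard library. It is not how Pulse does it. The helper names, the `/api2/json` prefix with `timeframe` and `cf` query parameters, the `netin`/`netout` field names, and the API token header format are assumptions based on my reading of the Proxmox API and should be checked against the API viewer.

```python
import json
import urllib.parse
import urllib.request


def rrddata_url(base_url, node, timeframe="hour", cf="MAX"):
    """Build the rrddata URL for one node (path and parameters are assumptions)."""
    query = urllib.parse.urlencode({"timeframe": timeframe, "cf": cf})
    return f"{base_url}/api2/json/nodes/{urllib.parse.quote(node)}/rrddata?{query}"


def peak_network(samples):
    """Return (peak netin, peak netout), skipping samples without a value."""
    netin = [s["netin"] for s in samples if s.get("netin") is not None]
    netout = [s["netout"] for s in samples if s.get("netout") is not None]
    return (max(netin, default=0.0), max(netout, default=0.0))


def fetch_node_peaks(base_url, nodes, api_token):
    """Call rrddata once per node; this is the extra traffic mentioned above."""
    peaks = {}
    for node in nodes:
        request = urllib.request.Request(
            rrddata_url(base_url, node),
            # Assumed token format: USER@REALM!TOKENID=SECRET
            headers={"Authorization": f"PVEAPIToken={api_token}"},
        )
        with urllib.request.urlopen(request) as response:
            samples = json.load(response)["data"]
        peaks[node] = peak_network(samples)
    return peaks
```

Something like `fetch_node_peaks("https://pve.example:8006", ["node1", "node2"], token)` would then issue one request per node. A host with a self-signed certificate would additionally need a suitable SSL context, which is left out here.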
For me, it would be great if net-in and net-out were shown relative to the maximum link speed (like the memory usage percentage). That requires knowing the link speed; for me, it would be sufficient to be able to set a maximum network speed in Pulse manually (per node?), as I have only one network interface.
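The calculation I have in mind is roughly the following sketch. It assumes the net-in/net-out values are rates in bytes per second and that the link speed is entered manually in Mbit/s; the function name and the per-node settings dictionary are hypothetical and not an existing Pulse option.

```python
def link_utilization_percent(bytes_per_second, link_speed_mbps):
    """Express a throughput in bytes/s as a percentage of a link speed in Mbit/s."""
    if link_speed_mbps <= 0:
        raise ValueError("link speed must be positive")
    bits_per_second = bytes_per_second * 8
    return 100.0 * bits_per_second / (link_speed_mbps * 1_000_000)


# Hypothetical manual setting: one configured link speed per node, in Mbit/s.
NODE_LINK_SPEED_MBPS = {"node1": 1000.0, "node2": 10000.0}

# 62.5 MB/s on a 1 Gbit/s link is half of the available bandwidth.
print(link_utilization_percent(62_500_000, NODE_LINK_SPEED_MBPS["node1"]))
```

A value close to 100 would then indicate a saturated link, in the same way the memory percentage indicates a nearly full node.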
I realize that this metric might not be interesting for everyone. It might also not be too valuable if you have more than one physical link, as Proxmox counts total net-in and net-out traffic (I think). On the other hand, one would get an idea of the overall traffic, and this might count for something, too ;) It might point to network problems if this value is higher than expected.
All in all, this would be an important metric to observe, and it is quite inconvenient to check in the Proxmox GUI, as you have to select a node, then "Summary", and then scroll down to the Network chart.
Thanks for any consideration ;)
Stefan