Chrony Monitoring

What is Chrony?

Chrony is an open-source, low-level utility for managing the system clock. It can keep a computer's clock accurate across a network, or even in the absence of an internet connection. Chrony is designed to be more accurate and resilient than traditional utilities such as ntpd, and can adjust the system clock even in the presence of large time offsets or network outages. Chrony also offers features such as automatic time synchronization, access control lists, and logging.

Monitoring Chrony with Netdata

To monitor Chrony with Netdata, you need both Chrony and Netdata installed on your system.

Netdata auto-discovers hundreds of services, and for those that aren't discovered, you can use manual discovery with a one-line configuration. For more information on configuring Netdata for Chrony monitoring, please read the collector documentation.

You should now see the Chrony section on the Overview tab in Netdata Cloud already populated with charts about all the metrics you care about.

Netdata has a public demo space (no login required) where you can explore different monitoring use-cases and get a feel for Netdata.

What Chrony metrics are important to monitor?

stratum

The stratum indicates the distance (in hops) to the computer with the reference clock. The higher the stratum number, the more the timing accuracy and stability degrade.

current_correction

Any error in the system clock is corrected by slightly speeding up or slowing down the system clock until the error has been removed and then returning to the system clock's normal speed. A consequence of this is that there will be a period when the system clock (as read by other programs) will be different from chronyd's estimate of the current true time (which it reports to NTP clients when it is operating as a server). The reported value is the difference due to this effect.

root_delay

The total of the network path delays to the stratum-1 computer from which the computer is ultimately synchronised.

root_dispersion

The total dispersion accumulated through all the computers back to the stratum-1 computer from which the computer is ultimately synchronised. Dispersion is due to system clock resolution, statistical measurement variations, etc.
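The root delay and root dispersion are often read together with the current correction. The chrony documentation gives an absolute bound on clock accuracy (assuming the stratum-1 computer is correct) as clock_error <= |system_time_offset| + root_dispersion + 0.5 * root_delay. The following is a minimal sketch of that calculation; the function name, parameter names, and input values are illustrative and not part of Chrony or Netdata.

```python
def clock_error_bound(system_time_offset: float,
                      root_delay: float,
                      root_dispersion: float) -> float:
    """Return an upper bound on the clock error, in seconds.

    All inputs are in seconds. The bound assumes the stratum-1
    source at the end of the chain is itself correct.
    """
    return abs(system_time_offset) + root_dispersion + 0.5 * root_delay


# Illustrative values only (seconds).
bound = clock_error_bound(
    system_time_offset=0.000006523,
    root_delay=0.013639022,
    root_dispersion=0.001100737,
)
print(f"clock error bound: {bound * 1000:.3f} ms")
```

With these inputs, half the root delay dominates the bound, which is typical when the time sources are reached over a wide-area network.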

last_offset

The estimated local offset on the last clock update. A positive value indicates the local time (as previously estimated true time) was ahead of the time sources.

rms_offset

The root mean square (RMS) offset of the system clock from true time. Large offsets may indicate a problem with the clock or network synchronization.
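To make the term concrete, here is a minimal sketch of a root mean square calculation over a handful of offset samples. It is a plain RMS over a fixed list, shown only to illustrate the arithmetic; chronyd maintains its own long-term average internally, and the sample values and function name here are made up.

```python
import math


def rms(offsets):
    """Root mean square of a non-empty sequence of offsets (seconds)."""
    if not offsets:
        raise ValueError("at least one offset sample is required")
    return math.sqrt(sum(x * x for x in offsets) / len(offsets))


# Hypothetical offset samples in seconds; the sign shows the direction of the error.
sample_offsets = [0.00002, -0.00003, 0.00004, -0.00001]
print(f"RMS offset: {rms(sample_offsets) * 1e6:.1f} microseconds")
```

Because each sample is squared, positive and negative offsets do not cancel out, so the RMS reflects the typical magnitude of the error rather than its average direction.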

frequency

The frequency is the rate by which the system's clock would be wrong if chronyd was not correcting it. It is expressed in ppm (parts per million). For example, a value of 1 ppm would mean that when the system's clock thinks it has advanced 1 second, it has actually advanced by 1.000001 seconds relative to true time.
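A ppm value is easier to judge once converted into accumulated drift over time. The sketch below does that conversion; the function and constant names are illustrative, and the calculation assumes the frequency error stays constant over the interval.

```python
SECONDS_PER_DAY = 86_400


def drift_seconds(frequency_ppm: float, elapsed_seconds: float) -> float:
    """Drift accumulated by an uncorrected clock over elapsed_seconds.

    Assumes a constant frequency error, expressed in parts per million.
    """
    return frequency_ppm * 1e-6 * elapsed_seconds


per_day = drift_seconds(1.0, SECONDS_PER_DAY)
print(f"1 ppm is about {per_day * 1000:.1f} ms of drift per day")
```

An uncorrected error of 1 ppm therefore adds up to 86.4 ms per day.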

residual_frequency

The residual frequency for the currently selected reference source. This reflects any difference between what the measurements from the reference source indicate the frequency should be and the frequency currently being used. The reason this is not always zero is that a smoothing procedure is applied to the frequency.

skew

The estimated error bound on the frequency.

update_interval

The interval between clock updates. Shorter intervals may improve accuracy but may also increase network load.

ref_measurement_time

The time elapsed since the last measurement from the reference source was processed.

leap_status

The current leap status of the source. Statuses include the following:
- Normal - indicates the normal status (no leap second).
- InsertSecond - indicates that a leap second will be inserted at the end of the month.
- DeleteSecond - indicates that a leap second will be deleted at the end of the month.
- Unsynchronised - the server has not synchronized properly with the NTP server.
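The metrics described so far correspond to the fields printed by the `chronyc tracking` command, so they can also be inspected outside Netdata. The following is a minimal sketch that parses such output into a dictionary. The sample text is illustrative rather than captured from a live system, and `parse_tracking` is a hypothetical helper, not part of Chrony or Netdata.

```python
# Illustrative `chronyc tracking` output; real values will differ.
SAMPLE_TRACKING = """\
Reference ID    : CB00710F (foo.example.net)
Stratum         : 3
Ref time (UTC)  : Fri Jan 27 09:49:17 2017
System time     : 0.000006523 seconds slow of NTP time
Last offset     : -0.000006747 seconds
RMS offset      : 0.000035822 seconds
Frequency       : 3.225 ppm slow
Residual freq   : -0.000 ppm
Skew            : 0.129 ppm
Root delay      : 0.013639022 seconds
Root dispersion : 0.001100737 seconds
Update interval : 64.2 seconds
Leap status     : Normal
"""


def parse_tracking(text):
    """Map each field name to its raw string value."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, since values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


tracking = parse_tracking(SAMPLE_TRACKING)
stratum = int(tracking["Stratum"])
rms_offset = float(tracking["RMS offset"].split()[0])  # seconds
print(stratum, rms_offset, tracking["Leap status"])
```

On a real host, the text could come from running `chronyc tracking` through `subprocess.run`; the sketch keeps the input inline so that it runs without Chrony installed.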

activity

The number of servers and peers that are online and offline. The following explains the status options:
- Online - the server or peer is currently online (i.e. assumed by chronyd to be reachable).
- Offline - the server or peer is currently offline (i.e. assumed by chronyd to be unreachable and no measurements from it will be attempted).
- BurstOnline - a burst command has been initiated for the server or peer and is being performed. After the burst is complete the server or peer will be returned to the online state.
- BurstOffline - a burst command has been initiated for the server or peer and is being performed. After the burst is complete the server or peer will be returned to the offline state.
- Unresolved - the name of the server or peer was not resolved to an address yet.
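The same counts can be read from the `chronyc activity` command. Below is a minimal sketch that maps its output lines onto the status names listed above and warns when no source is online. The sample output is illustrative, and the `LABELS` mapping and `parse_activity` helper are assumptions made for this example, not part of Chrony or Netdata.

```python
import re

# Illustrative `chronyc activity` output; real counts will differ.
SAMPLE_ACTIVITY = """\
200 OK
8 sources online
0 sources offline
0 sources doing burst (return to online)
0 sources doing burst (return to offline)
0 sources with unknown address
"""

# Assumed mapping from chronyc wording to the status names used above.
LABELS = {
    "online": "Online",
    "offline": "Offline",
    "doing burst (return to online)": "BurstOnline",
    "doing burst (return to offline)": "BurstOffline",
    "with unknown address": "Unresolved",
}


def parse_activity(text):
    """Return a dict of status name -> number of sources."""
    counts = {}
    for line in text.splitlines():
        match = re.fullmatch(r"(\d+) sources (.+)", line.strip())
        if match and match.group(2) in LABELS:
            counts[LABELS[match.group(2)]] = int(match.group(1))
    return counts


activity = parse_activity(SAMPLE_ACTIVITY)
if activity.get("Online", 0) == 0:
    print("warning: no time sources are online")
else:
    print(f"{activity['Online']} sources online")
```

A drop in the Online count, or a persistent Unresolved count, is usually worth alerting on, since chronyd cannot take measurements from sources in those states.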
