Is accurate timing the key to optimizing data center performance?

Let's explore the practices and timing architectures that are required to achieve the precision and resilience needed for today's cloud data centers.
Ulrich Kohn

Hyperscale data centers host hundreds of thousands of servers connected through a high-performance data network. Such powerful cloud infrastructure provides compute resources for many customers, enabling them to control and operate their enterprises reliably and efficiently.

Hosting of data and applications in core and edge clouds requires precise common time for reasons such as:

  • Distributed compute processes need to be synchronized 
  • Transaction sequence must be maintained
  • Timestamping assures efficient backup and mirroring of huge amounts of data
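
To make the transaction-ordering point concrete, here is a minimal Python sketch. It assumes each server can state a worst-case error bound for its clock; the `Stamp` type, the `definitely_before` helper and all numbers are hypothetical illustrations, not part of any product or protocol described in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stamp:
    """A local timestamp plus the assumed worst-case error of the clock that took it."""
    t_ns: int    # reported time in nanoseconds
    err_ns: int  # assumed bound on |reported time - true time|


def definitely_before(a: Stamp, b: Stamp) -> bool:
    """True only if a's latest possible true time precedes b's earliest possible one."""
    return a.t_ns + a.err_ns < b.t_ns - b.err_ns


# Two writes recorded 300 microseconds apart on different servers.
write_a_ns, write_b_ns = 1_000_000, 1_300_000

# Clocks good to 1 ms: the timestamps alone cannot prove which write came first.
loose = definitely_before(Stamp(write_a_ns, 1_000_000), Stamp(write_b_ns, 1_000_000))

# Clocks good to 1 microsecond: the order is unambiguous.
tight = definitely_before(Stamp(write_a_ns, 1_000), Stamp(write_b_ns, 1_000))

print(loose, tight)
```

The tighter the error bound on each server clock, the closer together two events can be while still being ordered from their timestamps alone.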

Today’s timing architectures commonly use network-hosted NTP servers. These NTP solutions are sufficiently accurate and reliable for best-effort timing with millisecond accuracy, but they might not meet the requirements of mission-critical applications that need increasingly resilient and precise synchronization. So where do they fall short?

Challenges for data center timing

Current data center timing practices have several weaknesses that must be addressed:

  • Network-delivered time suffers from delay and delay variation in packet networks, and even more from asymmetric delay. Congested links further degrade the accuracy of packet time protocols. Methods that improve the forwarding characteristics of timestamped packets are therefore needed.
  • A network-hosted server might suffer from outages caused by DDoS attacks, attacks on the global DNS or GNSS issues. Any data center should have independent time sources, using independent network-delivered timing technologies, local clocks with extended holdover, or its own robust GNSS receivers, and preferably a mix of these technologies.
  • Accurate time must be provided to applications hosted on standard servers in data centers. The IT network in a data center as well as the server architecture might not be optimized for delivery of accurate timing. Hence, the on-site IT architecture needs a way to deliver accurate time to the software applications running on COTS servers. 
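
The effect of asymmetric delay in the first point above can be reproduced with the standard four-timestamp offset calculation used by two-way packet time protocols such as NTP. The following Python sketch is an illustration only: the function names, the simulated delays and the 10 µs server processing time are assumptions chosen for the example.

```python
def two_way_exchange(true_offset_us, fwd_delay_us, rev_delay_us, server_proc_us=10.0):
    """Simulate the four timestamps of one request/response exchange.

    true_offset_us is (server clock - client clock). t1 and t4 are read on the
    client clock, t2 and t3 on the server clock.
    """
    t1 = 0.0                                  # client sends request
    t2 = t1 + fwd_delay_us + true_offset_us   # server receives request
    t3 = t2 + server_proc_us                  # server sends response
    t4 = t3 - true_offset_us + rev_delay_us   # client receives response
    return t1, t2, t3, t4


def estimate_offset(t1, t2, t3, t4):
    """Offset estimate that assumes forward and reverse delays are equal."""
    return ((t2 - t1) + (t3 - t4)) / 2.0


TRUE_OFFSET_US = 250.0

symmetric = estimate_offset(*two_way_exchange(TRUE_OFFSET_US, 100.0, 100.0))
asymmetric = estimate_offset(*two_way_exchange(TRUE_OFFSET_US, 100.0, 160.0))

print("error with symmetric path: ", symmetric - TRUE_OFFSET_US)
print("error with asymmetric path:", asymmetric - TRUE_OFFSET_US)
```

The estimate is exact only when both directions take equally long. Otherwise the error equals half the difference between the forward and reverse delays, and the client cannot detect it from the timestamps alone.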

Optimized timing architecture for cloud data centers

While NTP has served us well over the last few decades, this technology has some limitations. Accuracy could be significantly improved by supplying NTP clients from a GNSS-synchronized NTP server at each data center. On its own, however, this creates an unacceptable vulnerability, as GNSS receivers can easily be compromised by jamming and spoofing attacks.

Applying cesium atomic clocks in data centers offers a solution. These clocks provide excellent holdover capabilities and can survive even extended unavailability of satellite-delivered timing. An ePRTC solution provides highest accuracy and availability by combining GNSS receivers with atomic cesium clocks in a redundant configuration. A combination of multi-band GNSS receivers (which mitigate atmospheric propagation issues) with ultra-stable cesium atomic clocks provides UTC-traceable time that can be maintained with an accuracy of better than 40ns for more than two weeks. This allows a data center to survive even extended periods of GNSS outages.
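
A rough calculation shows why holdover of this kind calls for an atomic clock. The Python sketch below uses a deliberately simplified model in which time error grows linearly from a constant fractional frequency offset; real oscillators also exhibit noise and aging, which are ignored here. The 40ns and two-week figures are taken from the paragraph above, while the 1e-11 comparison value is a hypothetical number chosen only for illustration.

```python
SECONDS_PER_DAY = 86_400


def max_frequency_offset(time_error_s, holdover_s):
    """Largest constant fractional frequency offset that stays within the time budget."""
    return time_error_s / holdover_s


def holdover_time(time_error_s, frequency_offset):
    """Seconds until a constant fractional frequency offset uses up the time budget."""
    return time_error_s / abs(frequency_offset)


budget_s = 40e-9                      # time error budget from the text
two_weeks_s = 14 * SECONDS_PER_DAY    # holdover period from the text

required = max_frequency_offset(budget_s, two_weeks_s)
hypothetical = holdover_time(budget_s, 1e-11)   # illustrative, less stable oscillator

print(f"required average fractional frequency offset: {required:.2e}")
print(f"holdover at 1e-11 offset: {hypothetical / 3600:.2f} hours")
```

Under these assumptions the clock must hold an average fractional frequency offset of roughly 3.3e-14 for the full two weeks, whereas an oscillator that is off by 1e-11 would exhaust the same budget in a little over an hour.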

With highly accurate timing at core sites, there is a need for a more sophisticated way of delivering synchronization to edge data centers. PTP is a superior time protocol to NTP as it provides significantly higher accuracy as well as sophisticated management options. Packet networks that assist PTP delivery with boundary and transparent clock functionality can significantly improve timing accuracy.
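
The benefit of transparent clocks can be illustrated with the PTP delay request-response calculation. In the Python sketch below, a single switch sits between master and slave and queues the Sync message for longer than the Delay_Req message. With transparent clock support, the switch reports each residence time in the message's correction field and the slave subtracts it. This is a simplified one-step model with integer nanoseconds; the function names and all numeric values are assumptions for the example.

```python
def ptp_exchange(offset_ns, link_delay_ns, res_sync_ns, res_req_ns):
    """Simulate t1..t4 for one Sync / Delay_Req exchange through one switch.

    offset_ns is (slave clock - master clock). t1 and t4 are master timestamps,
    t2 and t3 are slave timestamps. res_*_ns is the switch residence time.
    """
    t1 = 1_000_000                                        # master sends Sync
    t2 = t1 + link_delay_ns + res_sync_ns + offset_ns     # slave receives Sync
    t3 = t2 + 50_000                                      # slave sends Delay_Req
    t4 = t3 - offset_ns + link_delay_ns + res_req_ns      # master receives Delay_Req
    return t1, t2, t3, t4


def offset_from_master(t1, t2, t3, t4, corr_sync_ns=0, corr_resp_ns=0):
    """Slave offset estimate, optionally removing reported residence times."""
    mean_path_delay = ((t2 - t1) + (t4 - t3) - corr_sync_ns - corr_resp_ns) // 2
    return (t2 - t1) - mean_path_delay - corr_sync_ns


TRUE_OFFSET_NS = 1_500
RES_SYNC_NS, RES_REQ_NS = 42_000, 2_000   # congested downstream, idle upstream

stamps = ptp_exchange(TRUE_OFFSET_NS, 5_000, RES_SYNC_NS, RES_REQ_NS)

plain = offset_from_master(*stamps)                              # no network assistance
assisted = offset_from_master(*stamps, RES_SYNC_NS, RES_REQ_NS)  # transparent clock

print("error without correction:", plain - TRUE_OFFSET_NS, "ns")
print("error with correction:   ", assisted - TRUE_OFFSET_NS, "ns")
```

Without correction, the unequal queuing delays appear as path asymmetry and bias the offset by half their difference. With the residence times removed, only the fixed link delay remains and the estimate matches the true offset in this idealized model.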

However, packet network behavior still has some impact on the quality of the synchronization network. An independent timing network that uses a separate wavelength in the DWDM transport network, known as an optical timing channel (OTC), is a sensible way to deliver timing with the highest accuracy. As DWDM technology is used for data center interconnect, this OTC can be applied on an out-of-band wavelength above 1600nm.

A combination of such native optical PTP transport with boundary clock class D functionality provides very precise synchronization, sufficiently accurate for even the most time-sensitive applications.

For delivery of accurate time to server-hosted applications, the servers need to be enhanced with the ability to process highly accurate time. There are different approaches, such as integrating timing features into the server hardware and software, adding NICs with timing capability, or inserting time cards or time modules into open servers. With the Open Compute Project’s (OCP) Time Appliance Project (TAP) initiated by Facebook, the third approach is gaining strong momentum. Pluggable time cards with integrated GNSS receivers and reasonably stable oscillators offer a practical way to provide high-accuracy, UTC-traceable timing in close proximity to the software applications running on the server.

The diagram below compares essential timing technologies. There are pros and cons for each of them. But combining network-delivered with satellite-delivered synchronization and backing it up with cesium atomic clocks is a highly resilient strategy for meeting the timing requirements of even the most demanding data center applications.

[Figure: comparison of essential timing technologies]

Precise and resilient data center synchronization

The Oscilloquartz solution portfolio from ADVA includes a rich set of products for any synchronization need. It can be easily applied to NTP architectures, transforming them into highly robust and accurate PTP solutions, backed up with GNSS receivers and ultra-stable cesium atomic clocks. Some of the relevant products are outlined below.

  • The OSA 5401 SyncPlug™ is a powerful SFP-hosted PTP grandmaster featuring a hardware-based NTP server that supports more than 500,000 NTP clients. While this unit can be plugged into a spare port of a network device, the OSA 5405 SyncReach™ is a stand-alone version for indoor and outdoor applications featuring a multi-band GNSS receiver with an integrated antenna.
  • The OSA 5400 embedded timing module is designed for commercial off-the-shelf (COTS) switches and servers as applied in data centers. The OSA 5400 SyncModule™ and OSA 5400 TimeCard™ can easily be inserted into standard COTS servers using internal M.2 or PCIe interfaces.
  • The shelf-variants OSA 5412/22 excel in scalability, redundancy, and interface options. 
  • The grandmasters OSA 5430/40 can be combined with cesium atomic clocks for highly available, ultra-stable core clock solutions. 

All of these products can be effectively complemented with ADVA’s FSP 3000 optical transport products in optical timing channel applications.

What’s more, Ensemble Sync Director provides management that simplifies the operation of timing networks, combining comprehensive Syncjack™ GNSS, PTP and NTP assurance capabilities with a transparent and easy-to-operate GUI.

