Data centers are hungry for power, and their appetite is only growing. The demand for data storage and online services has fueled a steady rise in data center power consumption. That trend has been amplified sharply by the recent explosion of generative AI and cryptocurrency mining, both of which require immense computational resources. Reports predict that global data center electricity use will double by 2026 [1], growth roughly equivalent to adding the entire power consumption of a nation like Germany to the world's grid. The situation is particularly acute in the US, which houses a third of the world's data centers and where data center power consumption is projected to reach six percent of the nation's total electricity usage by 2026.

Data centers have always been significant consumers of power, but the advent of generative AI has introduced a new level of demand. Generative AI, with its complex models and vast data processing requirements, demands substantial computational power. The hardware behind these operations, especially the GPUs (Graphics Processing Units) used for machine learning, consumes five to ten times more power than the traditional CPUs (Central Processing Units) used for more basic computational tasks such as a Google search. This is evident in the natural language processing, image generation, and deep learning models that underpin generative AI technologies. The impact of AI's power demand can be seen in the forecast of global data center power demand in [2].

The power consumption in data centers can be attributed to several factors, including cooling systems, servers, storage, and networking equipment. Much attention has been given to optimizing cooling systems through liquid cooling and to improving server efficiency through advancements in chip design, since these areas consume the majority of the energy in data centers. However, networking equipment also presents an opportunity for substantial energy savings. This is where robotic patch panels and optical cross connect switches (OCS) come into play, especially within the context of the different types of networks operating within data centers. In fact, the Telescent robotic cross connect system was originally funded by a Small Business Innovation Research (SBIR) contract from the U.S. Department of Energy with the purpose of addressing the growing energy demand of data centers through more efficient use of fiber networking.

In traditional data center architectures, such as those running front-end networks for tasks like Google searches, the network typically follows a Clos design. In a recent paper, Google reported that deploying its OCS reduced the network's power consumption by 40% [3]. Considering that optical networking equipment has been reported to consume about 7% of a data center's total power, a 40% reduction translates to an overall savings of roughly 3% of total power demand. For a next-generation hyperscale data center that may consume 600 MW, that amounts to a savings of approximately 18 MW, enough to power a typical town.
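As a rough sanity check, the arithmetic behind these savings can be sketched in a few lines of Python. The 7% networking share, 40% reduction, and 600 MW facility size are the illustrative figures cited above, not measured values for any particular facility.

```python
# Back-of-envelope estimate of facility-level savings from an OCS deployment.
# Inputs are the illustrative figures cited in the text, not measured data.

network_share = 0.07        # networking's approximate share of total data center power
ocs_reduction = 0.40        # reduction in network power reported by Google [3]
facility_power_mw = 600.0   # assumed next-generation hyperscale facility size, in MW

facility_reduction = network_share * ocs_reduction      # 0.028, i.e. roughly 3%
savings_mw = facility_power_mw * facility_reduction     # roughly 17-18 MW

print(f"Facility-level power reduction: {facility_reduction:.1%}")
print(f"Estimated savings: {savings_mw:.0f} MW")
```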

The scenario changes significantly with the back-end networks in hyperscale data centers that handle machine learning workloads. Here, the power consumption of GPUs, which is 5 to 10 times higher than that of traditional CPUs, alters the dynamics of power usage. While networking's share of total power may be smaller in this context because of the disproportionate draw of the GPUs, optimizing bandwidth through technologies like the Telescent robotic switch can lead to more efficient workload management. Both Google's OCS and MIT's implementation of the Telescent system have demonstrated a 3x improvement in training workload efficiency [4, 5]. Further information on optimizing optical networks for machine learning workloads can be found in this blog [6].

This improvement does not necessarily mean that a data center will consume a third of its usual power. Rather, it suggests that the same amount of energy could support three times the workload. This efficiency could lead to a scenario where fewer mega data centers are needed, thus reducing the overall strain on power grids.
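To make that concrete, here is a minimal sketch, using purely hypothetical workload and energy figures, of how a ~3x efficiency gain changes the number of facilities needed to serve a fixed amount of training demand.

```python
# Illustration of the "same energy, three times the workload" reading of a
# ~3x training-efficiency gain [4, 5]. All quantities here are hypothetical.

efficiency_gain = 3.0            # ~3x training work per unit of energy with OCS
facility_energy_mwh = 1_000.0    # hypothetical annual energy budget of one facility
baseline_work_per_mwh = 1.0      # hypothetical training work per MWh without OCS

work_per_facility_baseline = facility_energy_mwh * baseline_work_per_mwh
work_per_facility_ocs = work_per_facility_baseline * efficiency_gain

# For a fixed aggregate training demand, fewer facilities (and megawatts) are needed.
total_demand = 9_000.0  # hypothetical total training work to be served
facilities_baseline = total_demand / work_per_facility_baseline   # 9 facilities
facilities_ocs = total_demand / work_per_facility_ocs             # 3 facilities

print(f"Facilities needed without OCS: {facilities_baseline:.0f}")
print(f"Facilities needed with OCS:    {facilities_ocs:.0f}")
```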

The challenge of managing the increasing power demand of data centers in the age of generative AI is complex, but not insurmountable. Solutions like Telescent’s robotic patch panels and optical cross connect switches offer promising avenues for significantly reducing power consumption without hindering the progress and deployment of advanced technologies. By optimizing network efficiency and thereby making better use of available power, the tech industry can continue to innovate sustainably, addressing both operational costs and environmental concerns.

Ready to join the fiber revolution? Visit Telescent at www.telescent.com and see how robotic patch panels can help your data center tame the AI power surge.

[1] "Global data center electricity use to double by 2026 - IEA report," Data Center Dynamics (datacenterdynamics.com)

[2] "AI Datacenter Energy Dilemma - Race for AI Datacenter Space," SemiAnalysis (semianalysis.com)

[3] "Mission Apollo: Landing Optical Circuit Switching at Datacenter Scale," arXiv:2208.10041 (arxiv.org)

[4] "TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings," arXiv:2304.01433 (arxiv.org)

[5] "TopoOpt: Co-optimizing Network Topology and Parallelization Strategy for Distributed Training Jobs," arXiv:2202.00433 (arxiv.org)

[6] "Optimizing Optical Networks for Machine Learning Workloads," Telescent