Data centers consume a lot of power.  A single data center can consume as much power as a city, and it is estimated that data centers account for ~2% of total US energy consumption. [1]  While power usage efficiency has improved over time, especially in hyperscale data centers, the growth in the number of data centers is starting to overwhelm the available power in many data center locations.  For example, Ireland and Singapore have imposed or are considering moratoriums on new data center construction.  Even in Ashburn, Virginia, known as Data Center Alley for having the densest concentration of data centers in the world, a power shortage is causing developers to scramble. [2]  Data center operators need to find every possible way to reduce power consumption.

Replacing electrical switches with optical switches is one clear way to reduce power consumption in a data center network fabric.  At a recent SIGCOMM meeting, Google released details on their use of optical circuit switches to replace the electrical spine layer in their data center fabric.  They claimed that this change reduced electrical usage by over 30% while also delivering significant benefits in capital costs and uptime. [3]

The diagram below, taken from the Google presentation at SIGCOMM, shows the initial placement of the optical circuit switches (OCS) between the aggregation blocks and the spine layer.  The initial value of adding the OCS to the DC fabric was to allow automated restriping for data center expansion.  Since hyperscale data centers can contain tens of thousands of servers, a data center is typically not “turned on” all at once and instead is installed in stages.  This staged approach also improves the use of capital, allowing the installed capacity to more closely match demand. [4]  Since the restriping is only performed periodically (on the order of weeks), the electrical spine layer can be removed and replaced entirely with the optical circuit switches.  This creates significant power and cost savings in the DC network fabric.

There is another benefit of replacing the electrical spine layer with the OCS.  Due to the gradual buildout of a data center as well as the eventual refresh of the compute and storage equipment, data centers can operate with different generations of equipment.  Since optical switches are bit-rate agnostic, they can be used with multiple generations of transceivers.  This allows the latest generation of transceivers to be used as the data center is upgraded, and it allows efficient use of the mixed bit-rate equipment in the network.

To achieve the impressive results described by Google in the papers, they chose to internally develop a MEMS-based OCS.  As stated in the papers, “due to the difficulties in maintaining reliability and quality of this solution at scale, the decision was made to internally develop an OCS system.” [3]  Based on patent filings by those involved in the development, this decision was likely made about 10 years ago.  However, the choice of a MEMS OCS led to other development challenges.  The MEMS-based OCS has yield and reliability issues: the Google OCS has 136x136 ports even though the internal MEMS switch is 176x176, implying a yield of only about 77% (136/176) at the MEMS chip level.  The low port count of the MEMS OCS led Google to use bidirectional optics and CWDM to increase the effective number of ports in the system.  The use of bidirectional optics required an optical circulator in the optical path.  This increased the effective loss in the system, requiring home-run links between aggregation blocks using APC connections to minimize the effect of multipath interference (MPI) and return loss issues.  It also led to the development of improved FEC algorithms to deal with the poor return loss of the optical links.
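As a quick check on the port-count arithmetic above, the sketch below computes the ratio of ports exposed by the OCS (136) to ports fabricated on the MEMS switch (176).  The function name is hypothetical and used only for illustration, and reading this ratio as a chip-level yield is this article's inference from the published port counts rather than a figure reported by Google.

```python
def usable_port_fraction(usable_ports: int, fabricated_ports: int) -> float:
    """Return the fraction of fabricated ports that are exposed as usable ports.

    Hypothetical helper for illustration; not part of any vendor tool.
    """
    if fabricated_ports <= 0:
        raise ValueError("fabricated_ports must be positive")
    if not 0 <= usable_ports <= fabricated_ports:
        raise ValueError("usable_ports must be between 0 and fabricated_ports")
    return usable_ports / fabricated_ports


# Port counts taken from the text: 136x136 usable, 176x176 fabricated.
fraction = usable_port_fraction(136, 176)
print(f"Usable port fraction: {fraction:.1%}")  # prints "Usable port fraction: 77.3%"
```

In other words, a little under four out of every five fabricated ports end up as usable switch ports under this reading.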

Telescent has an all-fiber, high port count, low-loss optical switch that makes implementing an optical switch layer in the data center fabric much easier.  The Telescent system consists of a short fiber link between two LC fiber ports with a robot that moves the selected port to the requested new location.  The key element of the Telescent system is the routing algorithm that the robot uses to weave the fiber around over 1,000 other fibers to the new location.  With over 1,000 ports, the Telescent system has a port count equivalent to roughly 8 of the Google MEMS switches.  The Telescent system has passed NEBS Level 3 certification and has been used in production networks.  An interesting point is that the Telescent system today offers many of the elements that the Google paper listed as future development requests for the MEMS approach, including higher port counts, lower loss and better reliability.  Contact Telescent today to learn more about reducing power consumption in data centers through the use of optical circuit switches.

[1]  The Amount of Data Center Energy Use - AKCP Monitoring
[2]  Ashburn Power Crunch May Cause Delays in Data Center Construction (datacenterfrontier.com)
[3]  Jupiter Evolving: Transforming Google's Datacenter Network via Optical Circuit Switches and Software-Defined Networking – Google Research
[4]  Minimal Rewiring: Efficient Live Expansion for Clos Data Center Networks | USENIX