What We Can Learn from the Energy Efficiency of Supercomputers

Supercomputing was once limited to the realms of research-intensive, scientific tasks such as analysing large amounts of data to solve medical, environmental and infrastructural challenges. 

However, due to events like the pandemic and the subsequent shift to cloud-based technologies such as artificial intelligence (AI) and machine learning (ML), high-performance computing (HPC) – which uses supercomputers and compute clusters to solve advanced computation problems – has started to make its way into the enterprise. 

Just as enterprise cloud computing created new ways for businesses to engage customers and enable a new, more flexible way of working, supercomputing is opening up new possibilities for innovation breakthroughs by accelerating R&D speed and product development by orders of magnitude.   

Naturally, some businesses remain sceptical of HPC and are of the opinion that these technologies won’t become relevant to day-to-day operations for years to come. However, much like hyperconverged infrastructure and virtual desktop infrastructure (VDI), this seemingly futuristic technology is already shaping the future of the enterprise. 

Ultimately, HPC can be used as an indicator of the technologies that will filter into the “civil” space over time; for example, if particular CPUs and GPUs become increasingly deployed in the HPC space, it’s a pretty good sign that they will trickle down into the enterprise and mid-market soon. 

Paying attention to these technologies now can help your organisation stay at the forefront of innovation and remain ahead of its competitors. After all, users in HPC belong in the early adopter customer group and look for the latest and fastest technologies before those technologies are acquired by more cautious enterprises.

Energy Efficiency 

According to IDC, Asia/Pacific is experiencing a transformation from core to edge, with the emergence of sub-regional datacenter clusters and hubs. Interestingly, “about one-third of organizations in Indonesia see sustainability as one of the key considerations when selecting a colocation provider. These organizations identify investments in renewable energy sources and green initiatives as the selection criteria.”

You might think that ‘energy efficiency’ and ‘supercomputing’ are terms that don’t go hand-in-hand. After all, many of these machines require more than a megawatt of electricity to operate, and annual electricity costs can easily run into millions of dollars.

However, a new generation of supercomputers is helping organisations be kinder to the planet in two ways: the machines themselves offer impressive performance per watt, and they are being used to develop the next generation of fuel-efficient products and solutions that help limit climate change.

Take Frontier, for example, a supercomputer powered by optimized 3rd Gen AMD EPYC™ CPUs and AMD Instinct™ accelerators to deliver more than 1.5 exaflops of peak processing power. Not only does it take the top spot in the latest instalment of the Top500 list, but it also tops the latest Green500 list, which measures supercomputer energy efficiency.

Whereas the previous top Green500 machine, MN-3 in Japan, delivered 39.38 gigaflops per watt, Frontier – which was made by HPE for the US Department of Energy’s Oak Ridge National Laboratory – achieves 62.68 gigaflops per watt. This means that AMD EPYC™ processors and AMD Instinct accelerators now power some of the most efficient supercomputers in the world. 
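To put the two Green500 figures side by side, the short sketch below computes the ratio between them and the power draw each efficiency level would imply for the same sustained workload. The function names and the one-exaflop workload are illustrative assumptions rather than figures from the Green500 list, and real machines do not hold their benchmark efficiency on every workload.

```python
GIGA = 1e9  # one gigaflop = 1e9 floating-point operations per second

# Green500 efficiency figures quoted in the article (gigaflops per watt).
FRONTIER_GFLOPS_PER_WATT = 62.68
MN3_GFLOPS_PER_WATT = 39.38


def efficiency_ratio(new_gflops_per_watt, old_gflops_per_watt):
    """How many times more work per watt the newer machine delivers."""
    return new_gflops_per_watt / old_gflops_per_watt


def power_megawatts(sustained_flops, gflops_per_watt):
    """Power needed to sustain a workload at a given efficiency, in MW."""
    watts = sustained_flops / (gflops_per_watt * GIGA)
    return watts / 1e6


# Hypothetical workload: one exaflop (1e18 operations per second) sustained.
WORKLOAD_FLOPS = 1e18

ratio = efficiency_ratio(FRONTIER_GFLOPS_PER_WATT, MN3_GFLOPS_PER_WATT)
print(f"Frontier does about {ratio:.2f}x the work per watt of MN-3")
print(f"At Frontier's efficiency: {power_megawatts(WORKLOAD_FLOPS, FRONTIER_GFLOPS_PER_WATT):.1f} MW")
print(f"At MN-3's efficiency:     {power_megawatts(WORKLOAD_FLOPS, MN3_GFLOPS_PER_WATT):.1f} MW")
```

Under these assumptions the gap works out to roughly 1.6 times the work per watt, or on the order of 16 MW versus 25 MW for the same hypothetical workload.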

Lumi, a pre-exascale machine located at the IT Centre for Science (CSC) in Kajaani, Finland, also ranks as one of the most energy-efficient supercomputers in the world, at 51.6 gigaflops per watt.

The machine uses technology similar to Frontier’s, with an optimised AMD EPYC™ CPU and four AMD Instinct MI250X accelerators per node. According to the Top500 list, Lumi’s current performance is 151 petaflops, and it has a theoretical peak performance of more than 550 petaflops.

However, what makes these machines particularly interesting is the memory-coherent nature of the optimized 3rd Gen EPYC CPU and MI250X GPU. With coherent CPU-GPU memory, the CPU and GPU work on a single shared copy of the data, so less power is spent reading and writing data to and from system memory, helping top supercomputers run more efficiently.

This is a prime example of an innovative technology that, in just a few years’ time, will likely start to appear in the server market. Not only does it mean that CPUs and GPUs won’t need to waste energy working with two datasets, but it also makes life easier for software developers who will be able to write unified code for both CPUs and GPUs. 

What’s more, Lumi also boasts innovative “free cooling technology”, which enables waste heat to be utilised in the district heating network of Kajaani. This technology reportedly will reduce the entire city’s annual carbon footprint by 12,400 tons. 

Cooling data centres can take up to 40% of their total energy consumption, but by using natural airflows in cooling and avoiding the recirculation of warm air, like Lumi, data centre operators can reduce energy use and help cut associated emissions.   

What we can learn

As Deloitte aptly puts it, “Sustainability isn’t just about optimizing compute, storage, and applications. Many organizations, especially cloud providers, are breaking new ground by using environmental, social, and governance (ESG) metrics used to measure a company’s overall sustainability performance – in addition to power consumption – to determine what impact operations have on the environment.” Like AMD, several organisations are reviewing their supply chains and ensuring that partners and other stakeholders are also aligned with ESG goals.

Additionally, there are lessons to be learnt from the innovative technologies being employed by some of the world’s fastest supercomputers, particularly those developed by AMD. 

For example, AMD’s own research has demonstrated that delivering 1,200 VMs takes ten dual-socket AMD EPYC 7713 servers versus 15 dual-socket Intel Xeon Platinum 8380 based servers, translating to a reduction in power use of approximately 32% and an estimated saving of 70 metric tons of greenhouse gas emissions. What’s more, with more cores per socket and more cores per server, it’s easier to fit more processing power into a nimble dual or single-socket server, reducing server counts and footprints even further.
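The consolidation arithmetic behind that comparison can be sketched as follows. Only the VM and server counts come from the text; the helper names are hypothetical, and the 32% power figure is AMD’s own estimate, which this sketch does not attempt to reproduce.

```python
TOTAL_VMS = 1200
AMD_SERVERS = 10    # dual-socket AMD EPYC 7713 servers, per the article
INTEL_SERVERS = 15  # dual-socket Intel Xeon Platinum 8380 servers, per the article


def vms_per_server(total_vms, servers):
    """Average VM density if the VMs are spread evenly across servers."""
    return total_vms / servers


def servers_saved_fraction(baseline_servers, new_servers):
    """Fraction of the baseline server count that is no longer needed."""
    return (baseline_servers - new_servers) / baseline_servers


print(f"VMs per AMD server:   {vms_per_server(TOTAL_VMS, AMD_SERVERS):.0f}")
print(f"VMs per Intel server: {vms_per_server(TOTAL_VMS, INTEL_SERVERS):.0f}")
print(f"Servers saved: {servers_saved_fraction(INTEL_SERVERS, AMD_SERVERS):.0%}")
```

The server count falls by a third (120 VMs per server rather than 80), which is in line with, though not identical to, the roughly 32% power reduction AMD reports.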

Not only is the company already at the forefront of HPC innovation and energy efficiency, but AMD has even bigger ambitions for the future with a goal to increase energy efficiency for AMD processors and Instinct GPU accelerators by 30x from 2020 to 2025, representing a 2.5x acceleration of the industry trends from 2015-2020, and a 97% reduction in energy use per computation over this period. 
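The link between the 30x goal and the 97% figure is simple arithmetic, sketched below on the assumption that energy per computation is the inverse of energy efficiency. The function names are illustrative, and the annual rate is an even compounding of the five-year target rather than a roadmap AMD has published.

```python
def energy_reduction(efficiency_gain):
    """Fractional drop in energy per computation for a given efficiency gain.

    Assumes energy per computation scales as 1 / efficiency.
    """
    return 1 - 1 / efficiency_gain


def annual_gain(total_gain, years):
    """Constant yearly improvement factor that compounds to total_gain."""
    return total_gain ** (1 / years)


TARGET_GAIN = 30  # the 30x goal quoted in the article
YEARS = 5         # 2020 to 2025

print(f"Energy per computation falls by {energy_reduction(TARGET_GAIN):.1%}")
print(f"Implied yearly improvement: about {annual_gain(TARGET_GAIN, YEARS):.2f}x")
```

A 30x gain leaves one thirtieth of the original energy per computation, a drop of about 96.7% that rounds to the 97% quoted, and is equivalent to efficiency nearly doubling each year for five years.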

Both business and IT leaders can benefit from paying attention to the latest news in the supercomputer world. Today’s supercomputers go beyond high performance and scale: they pave the way for next-generation computing methods and workloads like AI, and they drive the power efficiency that supports environmental sustainability.

While “supercomputing” may not be a term at the forefront of your organisation’s mind, it’s clear that HPC is fast becoming a go-to tool for modern businesses looking to stay ahead in today’s competitive market. And the AMD CPUs behind this technology could shape the future of your IT infrastructure.
