Exascale computers are needed to solve complex problems related to the DOE’s research priorities, which include enhancing renewable energy resources, developing advanced materials and designing experimental facilities. The goal for exascale systems is to perform 1 billion billion calculations per second. (A billion has nine zeroes — 1,000,000,000 — so a billion billion has 18 zeroes: 10^18 calculations per second.)
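To put those numbers in perspective, the arithmetic can be checked directly. This minimal snippet compares the exascale target against today's petascale machines (the "thousandfold" jump mentioned later in this article):

```python
# Scale of an exascale machine: 10**18 operations per second.
exaflops = 10**18   # one billion billion (a quintillion) operations per second
petaflops = 10**15  # scale of current top supercomputers

print(len(str(exaflops)) - 1)       # 18 zeroes in a billion billion
print(exaflops // petaflops)        # exascale is a 1000x jump over petascale
```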
No single processor could ever compute this fast, so researchers depend on parallel computing to partition calculations across many processors working together. Such systems require a complex software stack to direct each processor to solve its specific part of a large-scale problem.
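The partitioning idea can be illustrated with a minimal sketch (this is not HPCToolkit or DOE code; the problem, function names, and worker count are purely illustrative). Each worker computes a partial sum over its own slice of the data, and the partial results are combined:

```python
from multiprocessing import Pool

def partial_sum(bounds):
    # Each worker handles one contiguous slice of the overall problem.
    lo, hi = bounds
    return sum(range(lo, hi))

if __name__ == "__main__":
    n, workers = 10_000_000, 4
    step = n // workers
    # Partition the index range [0, n) into equal chunks, one per worker.
    chunks = [(i * step, (i + 1) * step) for i in range(workers)]
    with Pool(workers) as pool:
        total = sum(pool.map(partial_sum, chunks))
    # The partitioned result matches the serial computation.
    print(total == sum(range(n)))  # True
```

Real exascale applications face the same pattern at vastly larger scale, with the added complications of communication between nodes, load balance, and heterogeneous hardware.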
“Understanding how to tailor applications to exploit massive parallelism and complex hardware resources within and across nodes is a problem that must be solved if we are to harness the power of exascale systems,” said Mellor-Crummey, professor of computer science and electrical and computer engineering.
Energy consumption is a daunting challenge for those designing future generations of supercomputers. To make exascale supercomputers possible, computer scientists must balance the need for increased computational power against the need for energy efficiency. The DOE created the Exascale Initiative to meet growing computing power requirements, with the aim of deploying capable exascale computing systems by 2023. To this end, the DOE’s Exascale Computing Project is funding efforts like Mellor-Crummey’s to develop the technologies necessary to make exascale computing possible.
Mellor-Crummey said his group’s work as part of the exascale computing initiative is focused on extending Rice University’s HPCToolkit software, which is used to analyze application performance around the world on systems ranging from desktops to supercomputers.
“Today, HPCToolkit is used to pinpoint performance and scalability bottlenecks in a diverse set of open science and national security applications,” Mellor-Crummey said. “Our proposal seeks to enhance HPCToolkit to meet the needs of developers working at all levels of the software stack on emerging and future extreme-scale systems.”
Mellor-Crummey and his collaborators at the University of Wisconsin-Madison plan to use the $2 million DOE award to develop a new generation of tools that can cope with the growing complexity of architectures used for extreme-scale computing as well as dramatic changes to the software stack.
“Providing insights into the behaviors of these emerging components will require integration of performance-monitoring interfaces to support observation and interpretation,” he said.
Meeting the goal of a thousandfold increase in processing power (from the current petascale to projected exascale) can be accomplished only by changing approaches to both hardware and software. Mellor-Crummey said, “The extreme parallelism of exascale systems both within and across nodes will require redesign of tool capabilities for monitoring, data collection, data analysis and presentation. Preparing tools for exascale must begin immediately; otherwise, tools will never be able to manage the complexity and scale of the hardware and software of next-generation systems.”