AsianScientist (May. 01, 2024) – Though not originally designed to function in tandem, high-performance computing (HPC) and artificial intelligence (AI) have coalesced to become a cornerstone of the digital era, reshaping industry processes and pushing scientific exploration to new frontiers.
The number-crunching prowess and scalability of HPC systems are fundamental enablers of modern AI-powered software. Such capabilities are particularly useful for demanding applications like planning intricate logistics networks or unravelling the mysteries of the cosmos. Meanwhile, AI helps researchers and enterprises schedule and process workloads more intelligently, making the most of their HPC systems.
“With the advent of powerful chips and sophisticated codes, AI has become nearly synonymous with HPC,” said Professor Torsten Hoefler, Director of the Scalable Parallel Computing Laboratory at ETH Zurich.
A master of stringing various HPC components together—from hardware and software to education and cross-border collaborations—Hoefler has spent decades researching and developing parallel-computing systems. These systems enable multiple calculations to be carried out simultaneously, forming the very bedrock of today’s AI capabilities. He is also the newly appointed Chief Architect for Machine Learning at the Swiss National Supercomputing Centre (CSCS), responsible for shaping the center’s strategy related to advanced AI applications.
Collaboration is central to Hoefler’s mission as a strong AI advocate. He has worked on many projects with various research institutions throughout the Asia-Pacific region, including the National Supercomputing Centre (NSCC) in Singapore, RIKEN in Japan, Tsinghua University in Beijing, and the National Computational Infrastructure in Australia, with research ranging from pioneering deep-learning applications on supercomputers to harnessing AI for climate modeling.
Beyond research, education is always at the top of Hoefler’s mind. He believes in integrating complex concepts like parallel programming and AI processing systems into academic curricula early on. An emphasis on such education could ensure that future generations become not just users, but innovative thinkers in computing technology.
“I’m specifically making an effort to bring these concepts to young students today so that they can better grasp and utilize these technologies in the future,” added Hoefler. “We need to have an education mission—that’s why I’ve chosen to be a professor instead of working in industry roles.”
In his interview with Supercomputing Asia, Hoefler discussed his new role at CSCS, the interplay between HPC and AI, as well as his perspectives on the future of the field.
Q: Tell us about your work.
At CSCS, we’re moving from a traditional supercomputing center to one that is more AI-focused, inspired by leading data center providers. One of the main things we plan to do is scale AI workloads for the upcoming “Alps” machine—poised to be one of Europe’s, if not the world’s, largest open-science, AI-capable supercomputers. This machine will arrive early this year and will run traditional high-performance codes as well as large-scale machine learning for scientific purposes, including language modeling. My role involves assisting CSCS’s senior architect Stefano Schuppli in architecting this system, enabling the training of large language models like LLaMA and foundation models for weather, climate or health applications.
I’m also working with several Asian and European research institutions on the “Earth Virtualization Engines” project. We hope to create a federated network of supercomputers running high-resolution climate simulations. This “digital twin” of Earth aims to project the long-term human impact on the planet, such as carbon dioxide emissions and the distribution of extreme events, which is particularly relevant for regions like Singapore and other Asian countries prone to natural disasters like typhoons.
The project’s scale requires collaboration with many computing centers—and we hope Asian centers will join to run local simulations. A significant aspect of this work is integrating traditional physics-driven simulations, like solving the Navier-Stokes or Euler equations for weather and climate prediction, with data-driven deep learning methods. These methods leverage the vast amounts of sensor data about the Earth collected over decades.
In this project, we’re targeting a kilometer-scale resolution—crucial for accurately resolving clouds, which are a key component of our climate system.
Q: What is parallel computing?
Parallel computing is both straightforward and fascinating. At its core, it involves using more than one processor to perform a task. Think of it like organizing a group effort. Take, for instance, the task of sorting a thousand numbers. This task is challenging for one person but can be made easier by having 100 people sort 10 numbers each. Parallel computing operates on a similar principle, where you coordinate multiple execution units—like our human sorters—to complete a single task.
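As a rough sketch of that sorting analogy (a minimal Python example; the chunk size and worker count are arbitrary choices, not anything Hoefler prescribes), several worker processes each sort a small pile of numbers, and the sorted piles are then merged:

```python
# Minimal parallel-sorting sketch: 1,000 numbers split into piles of 10,
# sorted by a pool of worker processes, then merged back together.
from multiprocessing import Pool
import heapq
import random

def sort_chunk(chunk):
    # Each "worker" sorts its own small pile independently.
    return sorted(chunk)

if __name__ == "__main__":
    numbers = [random.randint(0, 1_000_000) for _ in range(1000)]
    chunks = [numbers[i:i + 10] for i in range(0, len(numbers), 10)]  # 100 piles of 10
    with Pool(processes=8) as pool:  # 8 OS processes stand in for the human sorters
        sorted_chunks = pool.map(sort_chunk, chunks)
    result = list(heapq.merge(*sorted_chunks))  # merge the sorted piles
    assert result == sorted(numbers)
```

The final merge is the coordination step, and paying for such coordination efficiently is exactly what real parallel systems are designed around.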
Essentially, you could say that deep learning is enabled by the availability of massively parallel devices that can train massively parallel models. Today, the workload of an AI system is extremely parallel, allowing it to be distributed across thousands, or even millions, of processing components.
Q: What are some key components for enabling, deploying and advancing AI applications?
The AI revolution we’re seeing today is basically driven by three different components. First, the algorithmic component, which determines the training methods such as stochastic gradient descent. The second is data availability, crucial for feeding models. The third is the compute component, essential for number-crunching. To build an effective system, we engage in a codesign process. This involves tailoring HPC hardware to fit the specific workload, algorithm and data requirements. One such component is the tensor core, a specialized matrix-multiplication engine integral to deep learning. These cores perform matrix multiplications, the central operation in deep learning, at blazingly fast speeds.
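As a small illustration (a sketch assuming PyTorch; on an NVIDIA GPU with tensor cores the half-precision path is dispatched to those matrix units, while on a CPU it simply falls back to 32-bit), this is the kind of operation being accelerated:

```python
# Low-precision matrix multiplication, the workhorse of deep learning.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32  # FP16 maps to tensor cores

a = torch.randn(4096, 4096, device=device, dtype=dtype)
b = torch.randn(4096, 4096, device=device, dtype=dtype)

# On tensor-core GPUs this matmul runs on the specialized matrix engines
# rather than the general-purpose floating-point pipelines.
c = a @ b
print(c.shape, c.dtype)
```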
Another crucial aspect is the use of specialized, small data types. Deep learning aims to emulate the brain, which is essentially a biological circuit. Our brain, this dark and mushy thing in our heads, is teeming with about 86 billion neurons, each with surprisingly low resolution.
Neuroscientists have shown that our brain differentiates around 24 voltage levels, equivalent to just a bit more than 4 bits. Considering that traditional HPC systems operate at 64 bits, that is considerable overkill for AI. Today, most deep-learning systems train with 16 bits and can run with 8 bits—sufficient for AI, though not for scientific computing.
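As a back-of-the-envelope check and illustration (a sketch assuming PyTorch and a CUDA-capable GPU; the toy model and batch size are arbitrary), the snippet below works out the "24 levels is a bit more than 4 bits" figure and shows one mixed-precision, 16-bit training step:

```python
import math
import torch

print(math.log2(24))  # ~4.58: distinguishing 24 levels needs a bit more than 4 bits

# One mixed-precision (16-bit) training step on a toy linear model.
model = torch.nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler()
x = torch.randn(32, 1024, device="cuda")

optimizer.zero_grad()
with torch.cuda.amp.autocast():        # compute-heavy ops run in 16-bit
    loss = model(x).square().mean()
scaler.scale(loss).backward()          # loss scaling avoids FP16 underflow
scaler.step(optimizer)
scaler.update()
```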
Lastly, we look at sparsity, another trait of biological circuits. In our brains, each neuron isn’t connected to every other neuron. This sparse connectivity is mirrored in deep learning through sparse circuits. In NVIDIA hardware, for example, we see 2:4 structured sparsity, meaning that in every group of four elements, only two are nonzero. This approach leads to another level of computational speed-up.
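A minimal sketch of what 2:4 structured sparsity looks like (illustrative PyTorch code; real hardware support also involves specific kernels and data layouts): keep the two largest-magnitude weights in every group of four and zero out the rest.

```python
import torch

w = torch.randn(8, 16)                      # toy weight matrix
groups = w.reshape(-1, 4)                   # view weights in groups of four
topk = groups.abs().topk(2, dim=1).indices  # the two largest-magnitude entries per group
mask = torch.zeros_like(groups, dtype=torch.bool).scatter_(1, topk, True)
w_sparse = (groups * mask).reshape_as(w)    # exactly two nonzeros per group of four

print((w_sparse.reshape(-1, 4) != 0).sum(dim=1))  # prints 2 for every group
```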
Overall, these developments aim to improve computational efficiency—a crucial factor given that companies invest millions, if not billions, of dollars to train deep neural networks.
Q: What are some of the most exciting applications of AI?
One of the most exciting prospects is in the weather and climate sciences. Currently, some deep-learning models can predict weather at a cost 1,000 times lower than traditional simulations, with comparable accuracy. While these models are still in the research phase, several centers are moving toward production. I anticipate groundbreaking advancements in forecasting extreme events and long-term climate trends, such as predicting the probability and intensity of typhoons hitting places like Singapore in the coming decades. This is vital for long-term planning, like deciding where to build along coastlines or whether stronger sea defenses are necessary.
Another exciting area is personalized medicine, which tailors medical care based on individual genetic differences. With the advent of deep learning and big data systems, we can analyze treatment data from hospitals worldwide, paving the way for customized, effective healthcare based on each person’s genetic makeup.
Finally, most people are familiar with generative AI chatbots like ChatGPT or Bing Chat by now. Such bots are based on large language models whose capabilities border on basic reasoning, and they show primitive forms of logic. They’re learning concepts like “not cat”, a simple form of negation but a step toward more complex reasoning. It’s a glimpse into how these models might evolve to compress knowledge and concepts, like how humans developed mathematics as a simplification of complex ideas. It’s a fascinating direction, with potential developments we can only begin to imagine.
Q: What challenges can come up in these areas?
In weather and climate research, the primary challenge is managing the colossal amount of data generated. A single high-resolution, ensemble kilometer-scale climate simulation can produce over an exabyte of data. Handling this data deluge is a significant task and requires innovative strategies for data management and processing.
The shift toward cloud computing has broadened access to supercomputing resources, but this also means handling sensitive data like healthcare records on a much larger scale. Thus, in precision medicine, the main hurdles are security and privacy. There’s a need for careful anonymization to ensure that people can contribute their health records without fear of misuse.
Previously, supercomputers processed highly secure data only in secure facilities that could be accessed by a limited number of individuals. Now, with more people accessing these systems, ensuring data security is vital. My team recently proposed a new algorithm at the Supercomputing Conference 2023 for security in deep-learning systems using homomorphic encryption, which received both the best student paper and the best reproducibility advancement awards. This is a completely new direction that could help solve security challenges in healthcare computing.
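The SC23 paper itself is beyond the scope of this interview, but as a toy illustration of the underlying idea of computing on data that stays encrypted (a sketch using the python-paillier library, which supports addition and plaintext scaling on ciphertexts; this is not the algorithm Hoefler's team proposed):

```python
# Toy demonstration of (additively) homomorphic encryption with python-paillier.
# NOT the SC23 algorithm; it only illustrates computing on encrypted values.
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair()

# A "hospital" encrypts two sensitive values before sharing them.
enc_a = public_key.encrypt(120.5)
enc_b = public_key.encrypt(80.0)

# A compute center can add ciphertexts and scale them by plaintext constants
# without ever seeing the underlying values.
enc_sum = enc_a + enc_b
enc_weighted = enc_a * 0.3 + enc_b * 0.7

# Only the key holder can decrypt the results.
print(private_key.decrypt(enc_sum))       # 200.5
print(private_key.decrypt(enc_weighted))  # 92.15
```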
For large language models, the challenge lies in computing efficiency, specifically in terms of communication within parallel computing systems. These models require connecting thousands of accelerators through a fast network, but current networks are too slow for these demanding workloads.
To address this, we’ve helped initiate the Ultra Ethernet Consortium to develop a new AI-optimized network for large-scale workloads. These are just preliminary solutions in these areas—industry and computing centers need to explore them for implementation and refine them further to make them production-ready.
Q: How can HPC help address AI bias and privacy concerns?
Tackling AI bias and privacy involves two main challenges: ensuring data security and maintaining privacy. The move to digital data processing, even in sensitive areas like healthcare, raises questions about how secure and private our data is. The challenge is twofold: protecting infrastructure from malicious attacks and ensuring that personal data doesn’t inadvertently become part of training datasets for AI models.
With large language models, the concern is that data fed into systems like ChatGPT might be used for further model training. Companies offer secure, private options, but often at a cost. For example, Microsoft’s retrieval-augmented generation technique ensures data is used only during the session and not embedded in the model permanently.
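As a highly simplified sketch of why retrieval keeps private data out of a model's weights (toy Python; `generate` is a hypothetical stand-in for any hosted or local language model): documents are fetched and injected into the prompt at query time only, and nothing is written into the model itself.

```python
# Toy retrieval-augmented generation (RAG) flow: retrieve, assemble prompt, generate.
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    # Toy retriever: rank documents by word overlap with the query.
    q_words = set(query.lower().split())
    scored = sorted(documents, key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:k]

def generate(prompt: str) -> str:
    # Hypothetical placeholder for a language-model call.
    return f"[model answer based on a prompt of {len(prompt)} characters]"

docs = ["Patient record: allergy to penicillin.", "Clinic hours: 9am to 5pm."]
query = "What is the patient allergic to?"
context = "\n".join(retrieve(query, docs))
answer = generate(f"Context:\n{context}\n\nQuestion: {query}")
print(answer)
```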
Regarding AI biases, they often stem from the data itself, reflecting existing human biases. HPC can aid in “de-biasing” these models by providing the computational power needed. De-biasing is a data intensive process that requires substantial computing resources to emphasize less represented data aspects. It’s mostly on data scientists to identify and rectify biases, a task that requires both computational and ethical considerations.
Q: How crucial is international collaboration when it comes to regulating AI?
International collaboration is absolutely crucial. It’s like weapons regulation—if not everyone agrees and abides by the rules, the regulations lose their effectiveness. AI, being a dual-use technology, can be used for beneficial purposes but also has the potential for harm. Technology designed for personalized healthcare, for instance, can be employed in creating biological weapons or harmful chemical compounds.
However, unlike weapons which are predominantly harmful, AI is primarily used for good—enhancing productivity, advancing healthcare, improving climate science and much more. The variety of uses introduces a significant grey area.
Proposals to limit AI capabilities, like those suggested by Elon Musk and others, and the recent US Executive Order requiring registration of large AI models based on compute power, highlight the challenges in this area. This regulation, interestingly defined by computing power, underscores the role of supercomputing in both the potential and regulation of AI.
For regulation to be effective, it absolutely must be a global effort. If only one country or a few countries get on board, it just won’t work. International collaboration is probably the most important thing when we talk about effective AI regulation.
—
This article was first published in the print version of Supercomputing Asia, January 2024.
Copyright: Asian Scientist Magazine.
Disclaimer: This article does not necessarily reflect the views of AsianScientist or its staff.