Exascale Computing & Open Source Solutions

Have you been paying attention to the topic of Exascale Computing? Perhaps you should. Congress recently provided funding to the U.S. Department of Energy (DOE) to develop a new class of supercomputers capable of a quintillion [billion billion] FLoating Point Operations per Second (FLOPS) needed to model nuclear weapons explosions – see 2014 National Defense Authorization Act.

Also, DOE officials laid out a detailed 10-year roadmap for 'exascale' computing at the November 2013 meeting of the Advanced Scientific Computing Advisory Committee in Denver. In addition to weapons research and simulation, exascale computing will be critical for processing the growing number of large-scale data sets related to genomics, climate modeling, medical imaging, and much more. See HPCwire article.

Exascale Computing

A new era in computing is emerging, with a next generation of machines expected to perform at least 1,000 times faster than today's most powerful supercomputers. The first generation of 'exascale' computers is predicted to start coming online by 2020.

"Exa" is a prefix which stands for quintillion - a billion billion. Thus, 'exascale' computers are capable of performing a quintillion FLoating Point Operations per Second (FLOPS).
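To make the scale concrete, here is a minimal arithmetic sketch. The workload size and the helper name `seconds_to_complete` are hypothetical, chosen only for illustration, and the calculation assumes a perfectly sustained rate, which real machines do not achieve.

```python
# Illustrative arithmetic only; the workload below is a made-up example.
PETA = 10**15  # quadrillion operations per second (petascale)
EXA = 10**18   # quintillion operations per second (exascale)

# "A billion billion": 10^9 * 10^9 = 10^18
assert EXA == 10**9 * 10**9


def seconds_to_complete(total_operations: int, flops: int) -> float:
    """Seconds needed to run total_operations at a sustained rate of flops."""
    return total_operations / flops


workload = 10**21  # hypothetical job of 10^21 floating point operations

print(seconds_to_complete(workload, PETA))  # 1000000.0 seconds at 1 petaFLOPS
print(seconds_to_complete(workload, EXA))   # 1000.0 seconds at 1 exaFLOPS
```

Under these assumptions, a job that would occupy a 1 petaFLOPS machine for about eleven and a half days would finish in under 17 minutes at 1 exaFLOPS.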

Background

As reported in a recent Computerworld article, Hewlett-Packard, IBM and Intel currently dominate the global high performance computing (HPC) market. Based on the worldwide ranking of the most powerful supercomputers, HP now has 39% of the systems, IBM 33%, and Cray nearly 10%. Many other countries are busy building their own chips to counter this dominance of the marketplace by U.S. companies.

China, India, Japan, and the European Union are working hard to challenge U.S. dominance over the next decade. For example:

• Europe has begun to develop an exascale computer system, to be delivered by 2020, built on ARM chips designed by British semiconductor firm ARM Holdings. Europe has already committed to spending the equivalent of $1.6 billion.
• China is also striving to build an exascale computer system by 2020. Its Tianhe-2 currently holds the title of the world's most powerful supercomputer.
• Japan also hopes to create an exascale computer system by 2020, according to the RIKEN Advanced Institute for Computational Science. RIKEN is home to the world's fourth largest system, Fujitsu's K system.
• The Indian Government has committed over $2 billion to the Indian Space Research Organization (ISRO) and the Indian Institute of Science to develop an exaflop supercomputer by 2018.

However, PCWorld recently reported that even as these countries rush to develop exascale computing solutions, Intel is making a small change that could have a big impact on system design with its upcoming Xeon Phi chip. Intel has promised big performance and power improvements with the redesigned chip, code-named Knights Landing, which analysts said could ship by 2015.

Key Challenges & Issues

In a recent interview with ComputerWeekly, Thomas Sterling, Chief Scientist for the Center for Research on Extreme Scale Technology at Indiana University, stated that "some of the key computational challenges that face not just individual companies, but civilization as a whole, will be enabled by exascale computing". These include areas such as climate modeling, controlled fusion energy, genomic data analysis, medical imaging, space exploration, and more.

However, achieving exascale compute levels may require fundamentally rethinking virtually the entire computation process. One of the biggest problems standing in the way is power usage and efficiency. Other major issues involve new ways to store and analyze large-scale data sets, new operating systems, application software platforms and tools, standards, and much more. In many cases, the best way to overcome these challenges involves global collaboration, sharing knowledge, and the use of open source software and solutions.

'Open' Exascale Solutions

The push to exascale computing will require not only new hardware but also the optimization of today's operating systems and application software. Open architecture, open standards, and open source software will play a crucial role. The following are some examples of 'open solutions' and initiatives currently underway related to the development of exascale computing systems.

• According to InformationWeek, scientists at Argonne National Laboratory are developing a prototype exascale operating system and runtime software capable of running trillions of calculations per second. The Argo Project is a collaborative effort with scientists from the Lawrence Livermore and Pacific Northwest national laboratories and several universities.
• Modular Assembly Quality Analyzer and Optimizer (MAQAO) and its associated tools form a software suite for performance analysis of HPC applications, co-developed by the Exascale Computing Research Center and the Universities of Versailles St Quentin and Bordeaux. In 2013, the complete version of MAQAO for the Xeon Phi was published as 'open source'. A second significant step was taken with the open source release of the Codelet Tuning Infrastructure (CTI).
• Arvados is a free software platform for managing, analyzing, and sharing genomic and other large-scale biomedical data sets. All of the code is licensed under AGPLv3, except the SDKs, which are Apache 2.0.

Also, according to a DOE report on "Tools for Exascale Computing: Challenges and Strategies", the ALCF BlueGene experience shows a need to focus on open source tools going forward. Vendor-only strategies and proprietary solutions have not proven to be very successful.

Exascale Projects, Organizations & Activities

The following are selected links to exascale computing projects, organizations, and activities that you might also want to check out:

Advanced Scientific Computing Research (ASCR)
CRESTA Project
DOE Exascale Initiative
European Exascale Software Initiative (EESI)
Exascale Computing Research Center
EXA2CT Project
International Supercomputing Conference
Human Brain Project
Mont-Blanc Project
Parallel Runtime Scheduling & Execution Control (PaRSEC)
RIKEN

If you know of other 'open source' Exascale Computing initiatives, please share them with us.