HPC, or high-performance computing, is one of the big focus areas for semiconductors (along with mobile, automotive, and IoT). The highest-performance computing of all is done by supercomputers. Whereas the computers in our laptops, smartphones, and even server farms are measured in thousands of MIPS, supercomputers are measured in thousands of teraflops, also known as petaflops. They are literally millions of times faster than everyday computers. It has been hard to define what is and is not a supercomputer, since the definition changes over time. After all, your smartphone would have been considered a supercomputer just a couple of decades ago. Today, the working definition of a supercomputer is that it must be one of the top 500 fastest computers in the world, meaning that the system appears on the top 500 list.

The Top 500 List Origins

The top 500 list is published twice a year. You might think that the list wouldn't change much in six months, but you'd be wrong. For example, to make the list back in June (and, yes, I mean June 2017) you needed a system with 432 teraflops. By November, you needed 548 teraflops.

The list has been running for 25 years at a twice-per-year cadence, meaning that last month's list, published at the 2017 Supercomputing Conference (SC17), is the 50th. The list was presented for the first time as a one-off at the Mannheimer Supercomputer Seminar (no prizes for guessing that it was held in Mannheim, Germany). It was the start of grading supercomputers on a curve, rather than setting a fixed passing grade for what counts as one. Erich Strohmaier and Professor Hans Meuer compiled the first list. They picked 500 as the number since they had a pretty good idea that there were several hundred supercomputers around, but not 1,000. They also decided to use actual benchmarks, which meant that systems that didn't work, or that performed way below their theoretical limit, would be correctly graded. Strohmaier created a little database on his computer.

On that first list from June 1993, the top position was a Thinking Machines (remember them?) CM-5 at Los Alamos National Laboratory. It had 1,024 processors and delivered 59.7 gigaflops running the Linpack benchmark. Thinking Machines had five of the top ten computers on that original list. Five months later, Meuer and Strohmaier decided to recalculate the list for the 1993 Supercomputing Conference. Little has changed since: the list is still 500 systems, the Linpack benchmark is still the basis for the ranking, and the list still appears twice a year, in June and at the November Supercomputing Conference.

The Current Top Ten

Here is the current top ten (Rmax is the measured Linpack performance, Rpeak the theoretical peak):

1. Sunway TaihuLight (Sunway MPP, Sunway SW26010 260C 1.45GHz, Sunway interconnect; NRCPC), National Supercomputing Center in Wuxi, China. 10,649,600 cores, Rmax 93,014.6 TFlop/s, Rpeak 125,435.9 TFlop/s, 15,371 kW.
2. Tianhe-2 (MilkyWay-2) (TH-IVB-FEP Cluster, Intel Xeon E5-2692 12C 2.200GHz, TH Express-2, Intel Xeon Phi 31S1P; NUDT), National Super Computer Center in Guangzhou, China. 3,120,000 cores, Rmax 33,862.7 TFlop/s, Rpeak 54,902.4 TFlop/s, 17,808 kW.
3. Piz Daint (Cray XC50, Xeon E5-2690v3 12C 2.6GHz, Aries interconnect, NVIDIA Tesla P100; Cray Inc.), Swiss National Supercomputing Centre (CSCS), Switzerland. 361,760 cores, Rmax 19,590.0 TFlop/s, Rpeak 25,326.3 TFlop/s, 2,272 kW.
4. Gyoukou (ZettaScaler-2.2 HPC system, Xeon D-1571 16C 1.3GHz, Infiniband EDR, PEZY-SC2 700MHz; ExaScaler), Japan Agency for Marine-Earth Science and Technology, Japan. 19,860,000 cores, Rmax 19,135.8 TFlop/s, Rpeak 28,192.0 TFlop/s, 1,350 kW.
5. Titan (Cray XK7, Opteron 6274 16C 2.200GHz, Cray Gemini interconnect, NVIDIA K20x; Cray Inc.), DOE/SC/Oak Ridge National Laboratory, United States. 560,640 cores, Rmax 17,590.0 TFlop/s, Rpeak 27,112.5 TFlop/s, 8,209 kW.
6. Sequoia (BlueGene/Q, Power BQC 16C 1.60GHz, custom interconnect; IBM), DOE/NNSA/LLNL, United States. 1,572,864 cores, Rmax 17,173.2 TFlop/s, Rpeak 20,132.7 TFlop/s, 7,890 kW.
7. Trinity (Cray XC40, Intel Xeon Phi 7250 68C 1.4GHz, Aries interconnect; Cray Inc.), DOE/NNSA/LANL/SNL, United States. 979,968 cores, Rmax 14,137.3 TFlop/s, Rpeak 43,902.6 TFlop/s, 3,844 kW.
8. Cori (Cray XC40, Intel Xeon Phi 7250 68C 1.4GHz, Aries interconnect; Cray Inc.), DOE/SC/LBNL/NERSC, United States. 622,336 cores, Rmax 14,014.7 TFlop/s, Rpeak 27,880.7 TFlop/s, 3,939 kW.
9. Oakforest-PACS (PRIMERGY CX1640 M1, Intel Xeon Phi 7250 68C 1.4GHz, Intel Omni-Path; Fujitsu), Joint Center for Advanced High Performance Computing, Japan. 556,104 cores, Rmax 13,554.6 TFlop/s, Rpeak 24,913.5 TFlop/s, 2,719 kW.
10. K computer (SPARC64 VIIIfx 2.0GHz, Tofu interconnect; Fujitsu), RIKEN Advanced Institute for Computational Science (AICS), Japan. 705,024 cores, Rmax 10,510.0 TFlop/s, Rpeak 11,280.4 TFlop/s, 12,660 kW.

The top two on the list are both Chinese. The most powerful computer in the world is (still, for the 4th time in a row) TaihuLight, developed by China's National Research Center of Parallel Computer Engineering & Technology (NRCPC) and installed at the National Supercomputing Center in Wuxi (Taihu is a lake outside Wuxi, hu = lake). Second is Tianhe-2 (tian he = sky river in Chinese, the name for the Milky Way, so it is also known as MilkyWay-2), developed by China's National University of Defense Technology (NUDT) and deployed at the National Supercomputer Center in Guangzhou.

Third is something of a surprise: Switzerland, with Piz Daint, a Cray XC50 system installed at the Swiss National Supercomputing Centre (CSCS) in Lugano, which obviously also makes it the most powerful supercomputer in Europe. I was surprised, reading this, that Cray still exists as a company and is still in business at the very high end of computing performance. It was famous in the late 1970s for the Cray-1, described as the "most expensive love-seat in the world" due to its construction with a central cylinder of electronics and the power supplies around the outside, with cushions on top. In fact, Cray is very strong, delivering about 20% of all the computational power on the entire top 500 list. The most powerful system in the US is Titan, another Cray system, this one a five-year-old XK7 installed at the Department of Energy's (DOE) Oak Ridge National Laboratory (ORNL); obviously, it is also the largest system in the US. It is based on AMD Opteron CPUs and NVIDIA K20x GPUs.

At the very high end of the performance listing, there has been some commentary that the numbers are just for show: systems built to run the Linpack benchmark very well but with an unbalanced architecture for real work. Linpack mostly measures raw CPU performance and doesn't stress the memory and network subsystems as much. The TOP500 list now also incorporates the high-performance conjugate gradient (HPCG) benchmark, which is more representative of real workloads in terms of data access. The highest-performing system on this benchmark is Fujitsu's K computer, which is tenth on the overall listing. Leader TaihuLight drops to fifth on this test, although China's Tianhe-2 is only slightly behind Fujitsu, in second place.
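The difference between the two benchmarks is easy to see even on an ordinary machine. The sketch below is purely illustrative (it is not the actual Linpack or HPCG code): it times a dense matrix multiply, which has high arithmetic intensity like Linpack, against a sparse matrix-vector product, which is bandwidth-bound like the kernels inside HPCG, and reports the achieved gigaflops for each.

```python
# A minimal sketch (not the real Linpack or HPCG benchmarks) contrasting a
# compute-bound dense kernel with a memory-bound sparse kernel.
import time

import numpy as np
import scipy.sparse as sp

# Dense matrix-matrix multiply: about 2*n^3 flops on only 2*n^2 words of input,
# so lots of arithmetic per byte moved. Linpack is built around dense linear algebra.
n = 2000
a = np.random.rand(n, n)
b = np.random.rand(n, n)
t0 = time.perf_counter()
c = a @ b
dense_gflops = 2.0 * n**3 / (time.perf_counter() - t0) / 1e9

# Sparse matrix-vector product: about 2 flops per stored nonzero, so the memory
# system, not the arithmetic units, sets the pace. HPCG is built around kernels like this.
m = 2_000_000
A = sp.diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(m, m), format="csr")
x = np.random.rand(m)
reps = 20
t0 = time.perf_counter()
for _ in range(reps):
    y = A @ x
sparse_gflops = reps * 2.0 * A.nnz / (time.perf_counter() - t0) / 1e9

print(f"dense matmul  : {dense_gflops:7.1f} GFlop/s")
print(f"sparse mat-vec: {sparse_gflops:7.1f} GFlop/s")
```

On a typical laptop the dense kernel reaches a respectable fraction of the machine's peak flops while the sparse one achieves only a few percent, which is the same effect, writ small, that lets an unbalanced supercomputer post a flattering Linpack number.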
Summit

Next year, the US should regain the #1 spot with Summit, under construction at ORNL and due to come online in the summer of 2018. It is planned to have 200 petaflops of peak performance, over twice the performance of TaihuLight, the current #1 at 93 petaflops. It is based on IBM Power9 processors and NVIDIA's latest GPU, the V100. You won't be surprised to know that a lot of work is going on in machine learning, as there is everywhere from chips up to the biggest datacenters; this is seen as one of the most important areas of supercomputing over the next decade. The current status of the system, according to TOP500, is:

"The installation of Summit appears to be proceeding at a steady pace. All of the cabinets are installed, and most of the interconnects are now wired. NVIDIA has been shipping the V100 GPUs for awhile now, so there shouldn't be any holdup on the accelerator side. And even though IBM hasn't officially launched the Power9 processors, the company expects they will be in production before the end of the year."

Silicon

What's under the hood? The above chart shows the manufacturers of the systems in the top 500 over time. Another name I was surprised to see was Bull, a computer company in France that we did a lot of work with at VLSI/Compass twenty years ago but that I thought had gone out of business. It seems that, after many ownership changes and rolling up the rest of the French computer industry, it is still headquartered in Les Clayes-sous-Bois, a suburb of Paris that I visited many times in the late 1980s and early 1990s. IBM, the orange band at the bottom, used to have about 50% market share, but that is now more like 5%, mostly BlueGene/Q supercomputers, most of the rest of their installed base having aged out of the top 500. The growing light blue "others" band mostly reflects the growth of the Chinese, who are not using traditional enterprise computer suppliers like IBM, HPE/SGI, Oracle/Sun, Fujitsu, and so on.

471 systems out of 500 (94%) use Intel processors, slightly up from June. IBM Power processors are down from 21 to 14 in the last six months (but don't forget that the upcoming Summit is Power-based). The Chinese systems use a local Chinese processor called the SW26010 (the SW stands for Sunway, the name of the architecture). TaihuLight uses 40,960 of them, which is a strange number, 10 times a power of 2, a sort of hybrid decimal-binary; at 260 cores per chip, that is where the 10,649,600 cores in the table above come from.

As to networking, out of the 500 systems, 204 use 10G Ethernet, another 21 use Gigabit (1G) Ethernet, and 178 use InfiniBand. The rest use proprietary network architectures. I also took a look at operating systems. There is a menu item that allows you to select the operating system on the interactive table, but the only choice is Linux. What started in Linus Torvalds' bedroom in 1991 has achieved world domination (especially given that Android is a Linux derivative).

The power of these systems varies enormously. Sometimes, in semiconductors, we are looking at IoT devices with milliwatt power requirements or energy harvesting in microwatts. The top two systems, TaihuLight and Tianhe-2, use 15 and 18 megawatts respectively. The Swiss Piz Daint uses only 2 MW, barely enough to melt chocolate. The Top 500 also has a "green list" that ranks systems in gigaflops per watt. The top positions are all taken by Japanese supercomputers, Shoubu, Suiren, and Sakura, all around 17 gigaflops/W. Next is NVIDIA's own DGX SaturnV Volta at 15 gigaflops/W.
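Those green-list numbers can be sanity checked directly against the top-ten table above, since gigaflops per watt is simply Rmax in TFlop/s divided by power in kW (the two factors of 1,000 cancel). The snippet below just does that arithmetic for a few of the systems quoted earlier; Gyoukou, with its PEZY-SC2 accelerators, comes out far ahead of the other big machines, though still short of the roughly 17 gigaflops/W of the green-list leaders.

```python
# Efficiency of a few top-ten systems, using the Rmax (TFlop/s) and power (kW)
# figures from the table above. GFlop/s per watt equals TFlop/s per kW.
systems = {
    "Sunway TaihuLight": (93_014.6, 15_371),
    "Tianhe-2":          (33_862.7, 17_808),
    "Piz Daint":         (19_590.0,  2_272),
    "Gyoukou":           (19_135.8,  1_350),
    "K computer":        (10_510.0, 12_660),
}
for name, (rmax_tflops, power_kw) in systems.items():
    print(f"{name:17s} {rmax_tflops / power_kw:5.2f} GFlop/s per watt")
```

Run it and TaihuLight comes in around 6 GFlop/s per watt, Piz Daint around 8.6, and Gyoukou around 14, while the older K computer manages less than 1, a reminder of how quickly efficiency has been improving.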
When you get below the very top of the list, which includes very specialized one-off designs, it is largely the province of the few companies that can build a system like this for an end user. HP Enterprise (HPE) is the leader with 122 of the 500, which is actually down from 144 in June. Second is Lenovo with 81 systems; I believe much of this is part of the business that they purchased from IBM. IBM is actually next, but with just 19 systems. It gets very fragmented after that.

More Information

There is lots more information. The home page of the Top 500 List is here. The full list is here. There is an interactive version of the list that allows you to limit it in various ways (for example, to show which systems use Gigabit Ethernet). Hyperion offers an interactive map that lets you see where high-performance datacenters in the US are located (not just supercomputers; there are 700 systems listed, and it is US-only). You can get an insight into who pays for many of these systems from the fact that this map lets you search by congressional district. To wrap up this post, here are the datacenters in the city of San Jose, where Cadence is headquartered (Cadence is one of the pins):

Sign up for Sunday Brunch, the weekly Breakfast Bytes email.