Tianhe-2 CPU performance revealed: E5-2692V2 tested

The most exciting news in supercomputing in 2013 was Tianhe-2 taking first place on the TOP500 list. This is the second time a Chinese supercomputer has held the title, after Tianhe-1A in November 2010, and it marks a new peak for Chinese supercomputing.

TOP500 data for November 2013

According to data published by the TOP500 list, Tianhe-2 has a peak speed (Rpeak) of 54,902.4 TFLOPS (trillion floating-point operations per second) and a total storage capacity of 12,400 TB (trillion bytes). As a rough analogy, one hour of Tianhe-2 operation is equivalent to 1.3 billion people working with calculators for one thousand years, and its total storage could hold 60 billion books of 100,000 words each.

Tianhe-2 is located at the National Supercomputing Center in Guangzhou. The system has 16,000 compute nodes, each equipped with two Intel Xeon E5-2692V2 CPUs of the Ivy Bridge-EP platform. For acceleration, each node uses three Intel Xeon Phi 31S1P coprocessors, Intel's latest product based on the MIC architecture. The nodes are interconnected through the TH Express-2 high-speed network with a bandwidth of 160 Gbps.

The editor noted that 13 systems on the November TOP500 list are equipped with Intel Xeon Phi coprocessors, and that the acceleration architecture combining Xeon processors with Xeon Phi coprocessors now has a new name: "micro-heterogeneous architecture". The name reflects that the hardware combines multiple types of compute units yet still supports a common programming model, which simplifies development and optimization. Other heterogeneous architectures do not offer this advantage. Since the release of Tianhe-2, the "micro-heterogeneous architecture" has gradually become an acceleration model widely recognized by the high-performance computing industry.

In terms of network design, Tianhe-2 adopts the Arch interconnect, that is, TH Express-2. The Arch interconnect and two Ivy Bridge-EP nodes sit on the same circuit board. The compute nodes share rack space with the Xeon Phi coprocessors: the left side holds the compute nodes and the right side holds five Xeon Phi coprocessors, and the two halves can be pulled out separately. In addition, Tianhe-2 has made a series of innovations and breakthroughs in high-speed interconnection, a new hierarchical accelerated storage architecture, fault-tolerant design and fault management, integrated energy-efficiency control, and high-density, high-precision structural engineering.

With that brief introduction to Tianhe-2, let us turn to today's topic: a comprehensive analysis of the Tianhe-2 CPU.

As mentioned earlier, Tianhe-2 uses the Intel Xeon E5-2692V2 CPU, a member of Intel's latest Ivy Bridge-EP product family. Ivy Bridge-EP has brought new life to the server field. The Xeon E5-2600V2 series raises the core count from the maximum of 8 on Sandy Bridge-EP to 8-core, 10-core, and 12-core variants, making the V2 Xeon E5 a good choice for building servers. It is manufactured on Intel's most advanced 22-nanometer process. Compared with the previous generation, energy efficiency improves by up to 45%, up to 12 cores can be integrated, and performance increases by 50%.

According to the E5-2600V2 product information published by Intel, the series is divided into four categories: basic, standard, advanced, and industry-optimized, with performance greatly improved across the range. Performance-oriented models remain the choice of most buyers, especially high-performance computing users, who care most about performance. When purchasing processors, users usually weigh overall budget, price/performance ratio, core count, and clock frequency.

The E5-2692V2, which belongs to the industry-optimized category, is the CPU used in Tianhe-2. It runs at 2.2 GHz and has 12 cores. After the release of Tianhe-2, media coverage of the E5-2692V2 whetted our appetite, and the wait only deepened the sense of mystery and expectation. Now the editor has finally obtained an E5-2692V2 CPU for testing.

Intel E5-2692V2 CPU

Like the editor, everyone must be looking forward to seeing how the E5-2692V2 performs. What new and different experience will it bring in real-world applications? With these questions in mind, the editor presents this first evaluation of the Intel E5-2692V2 CPU to all readers who follow Tianhe-2.

To evaluate the performance changes brought by the Ivy Bridge platform, the editor selected the Sandy Bridge Intel E5-2670 CPU as the comparison baseline, and also chose another Ivy Bridge processor, the E5-2680V2, for comparison on the same platform (its theoretical floating-point performance is slightly higher than that of the E5-2692V2).
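The remark about theoretical floating-point performance can be checked with a small calculation. The sketch below is only an illustration under stated assumptions: it takes 8 double-precision FLOPs per core per cycle (256-bit AVX), uses base clocks only, and takes the core counts and base clocks of the E5-2670 and E5-2680V2 from public spec sheets rather than from this article. The function name `rpeak_gflops` is hypothetical.

```python
# Theoretical double-precision peak (Rpeak) of a dual-socket node at base clock.
# Assumption: 8 DP FLOPs per core per cycle (AVX: 4-wide add + 4-wide multiply).
FLOPS_PER_CYCLE = 8

# E5-2692V2 values (12 cores, 2.2 GHz) come from the article;
# the other two rows are spec-sheet values assumed for illustration.
CPUS = {
    "E5-2670":   {"cores": 8,  "base_ghz": 2.6},
    "E5-2680V2": {"cores": 10, "base_ghz": 2.8},
    "E5-2692V2": {"cores": 12, "base_ghz": 2.2},
}

def rpeak_gflops(cores, base_ghz, sockets=2, flops_per_cycle=FLOPS_PER_CYCLE):
    """Return theoretical peak in GFLOPS: sockets x cores x GHz x FLOPs/cycle."""
    return sockets * cores * base_ghz * flops_per_cycle

for name, spec in CPUS.items():
    print(f"{name}: {rpeak_gflops(**spec):.1f} GFLOPS")
```

Under these assumptions the E5-2680V2 node comes out at 448.0 GFLOPS against 422.4 GFLOPS for the E5-2692V2, which matches the "slightly higher" description. Turbo frequencies are ignored here.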


Intel E5-2670 CPU Intel E5-2680V2 CPU

Before the formal test, the editor will take you to take a look at the model specifications of these three CPUs:

As the parameters show, the three processors have the same power consumption, which is an important reason why the editor chose them for this comparison.

CPU-Z is a very common CPU detection tool and one of the most widely used. It supports a comprehensive range of CPUs, and it starts up and detects hardware very quickly. It can also report motherboard and memory information, including the dual-channel memory detection that we commonly use. The editor therefore first examined the three CPUs with CPU-Z.

E5-2692V2 detection

E5-2670 detection

E5-2680V2 detection

Let's enter today's evaluation together.

The hardware platform for this test is two NF5280M3 machines borrowed from a manufacturer; this model supports all three processors. The evaluation is divided into two parts: benchmark tests and application performance tests. Apart from the CPU, the three tested configurations use identical hardware. The tests compare the E5-2692V2, E5-2670, and E5-2680V2, with a focus on verifying the performance changes brought by the Ivy Bridge platform.

During testing, both platforms run with the motherboard settings on Auto, with the energy-saving options and Turbo Boost enabled by default.

First, benchmark data comparison

In the benchmark performance test, we used Linpack and Stream to test the computing performance of the platform.

Linpack test results:

According to the Linpack test data, the advantage of the dual-socket E5-2692V2 platform is very clear. Its measured performance reaches 445.980 GFLOPS, while the E5-2670 measures only 345.513 GFLOPS, so the E5-2692V2 improves Linpack performance by 29%.
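The 29% figure follows directly from the two measured values. A minimal check, with `percent_gain` as a hypothetical helper name:

```python
def percent_gain(new, baseline):
    """Relative improvement of a higher-is-better metric, in percent."""
    return (new / baseline - 1.0) * 100.0

# Measured Linpack results quoted in the article (GFLOPS).
gain = percent_gain(445.980, 345.513)
print(f"Linpack gain of E5-2692V2 over E5-2670: {gain:.1f}%")
```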

At the same time, although the theoretical floating-point performance (Rpeak) of the E5-2680V2 is higher than that of the E5-2692V2, the measured efficiency of the E5-2692V2 is 7% higher, and its actual Linpack result is also slightly higher than that of the E5-2680V2.
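Linpack efficiency is normally defined as measured performance divided by theoretical peak (Rmax / Rpeak). The article does not state which clock it used for Rpeak, so the sketch below is only an assumption-laden illustration: it uses base clocks, 8 double-precision FLOPs per core per cycle, and spec-sheet values for the E5-2670 (8 cores, 2.6 GHz) that do not appear in the article. The measured E5-2680V2 result is not quoted in the text, so it is left out.

```python
FLOPS_PER_CYCLE = 8  # assumption: DP FLOPs per core per cycle with AVX

def linpack_efficiency(rmax_gflops, cores, base_ghz, sockets=2):
    """Return Rmax / Rpeak, with Rpeak computed at base clock."""
    rpeak = sockets * cores * base_ghz * FLOPS_PER_CYCLE
    return rmax_gflops / rpeak

# Rmax values are the measured figures quoted in the article.
eff_2692v2 = linpack_efficiency(445.980, cores=12, base_ghz=2.2)
eff_2670 = linpack_efficiency(345.513, cores=8, base_ghz=2.6)  # spec-sheet values

print(f"E5-2692V2: {eff_2692v2:.1%}")
print(f"E5-2670:   {eff_2670:.1%}")
```

With a base-clock Rpeak, both ratios come out slightly above 100%. One plausible explanation is that Turbo Boost was left enabled during the tests, so the cores ran above their base frequency; the efficiency figures in the article may be based on a different Rpeak.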

Stream test results (memory frequency 1600 MHz):

The Stream test data show that the E5-2692V2 is 5% to 9% faster than the E5-2670 in the memory Copy, Scale, Add, and Triad tests.

Although the E5-2680V2 and E5-2692V2 have the same number of memory channels and support the same memory frequencies on paper, in the Stream results the E5-2692V2 is faster than the E5-2680V2 in Copy, Add, and Triad.
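For readers unfamiliar with Stream, the four tests are simple vector kernels, and bandwidth is the number of bytes moved divided by the elapsed time. The sketch below is not the official benchmark and was not used in this evaluation; it only shows what each kernel computes and how the bytes are counted (two 8-byte words per element for Copy and Scale, three for Add and Triad). Because it is a pure-Python loop, the numbers it prints reflect interpreter overhead rather than hardware memory bandwidth. The function name and parameters are hypothetical.

```python
import time
from array import array

def stream_kernels(n=100_000, q=3.0):
    """Run the four STREAM-style kernels once and return (MB/s dict, a, b, c)."""
    a = array("d", [1.0]) * n
    b = array("d", [2.0]) * n
    c = array("d", [0.0]) * n
    word = a.itemsize  # bytes per double (8)
    results = {}

    def timed(name, words_per_elem, kernel):
        t0 = time.perf_counter()
        kernel()
        dt = max(time.perf_counter() - t0, 1e-9)  # guard against a zero interval
        results[name] = words_per_elem * word * n / dt / 1e6  # MB/s

    def copy():   # c = a          (1 read + 1 write)
        for i in range(n):
            c[i] = a[i]

    def scale():  # b = q * c      (1 read + 1 write)
        for i in range(n):
            b[i] = q * c[i]

    def add():    # c = a + b      (2 reads + 1 write)
        for i in range(n):
            c[i] = a[i] + b[i]

    def triad():  # a = b + q * c  (2 reads + 1 write)
        for i in range(n):
            a[i] = b[i] + q * c[i]

    timed("Copy", 2, copy)
    timed("Scale", 2, scale)
    timed("Add", 3, add)
    timed("Triad", 3, triad)
    return results, a, b, c

rates, a, b, c = stream_kernels()
for name, mb_per_s in rates.items():
    print(f"{name:6s} {mb_per_s:10.1f} MB/s")
```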

Second, HPC application data comparison

1. FLUENT test of typical fluid mechanics application software

Test example:

This comparison uses the truck external-flow test case in Fluent. The three-dimensional mesh has 14 million cells, the DES turbulence model is used, and the pressure-based Navier-Stokes solver (pbns) is run for 100 iterations. The software is Fluent version 14.0.

Test Data:

Table 1: Time data of truck14m test

Figure 1: Completion time comparison of truck 14m

It can be seen from Table 1 and Figure 1:

On a single node, the E5-2692V2 performs 16.6% better than the E5-2670 and 5.3% better than the E5-2680V2.

On two nodes, the E5-2692V2 performs 31.9% better than the E5-2670 and 21.5% better than the E5-2680V2.
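Because the underlying measurements are completion times, a "performance is X% higher" figure has to be derived from two times, and there are two common ways to do it. The article does not say which one it uses. The sketch below shows both with clearly hypothetical times (120 s and 100 s are placeholders, not values from Table 1); the function names are also hypothetical.

```python
def perf_gain_pct(t_baseline, t_new):
    """Throughput-style gain: how much more work per unit time, in percent."""
    return (t_baseline / t_new - 1.0) * 100.0

def time_reduction_pct(t_baseline, t_new):
    """Time-style gain: how much shorter the run is, in percent."""
    return (1.0 - t_new / t_baseline) * 100.0

# Hypothetical completion times in seconds, for illustration only.
t_old, t_new = 120.0, 100.0
print(f"performance gain: {perf_gain_pct(t_old, t_new):.1f}%")
print(f"time reduction:   {time_reduction_pct(t_old, t_new):.1f}%")
```

The two definitions give different percentages for the same pair of times, so it is worth stating which one a comparison uses.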

2. WRF test of typical meteorology application software

Test example:

This test uses a typical meteorological WRF test case with a 48-hour forecast period, four levels of nesting, and a WRFOUT file written every three hours. The software version is WRFV3.4.1.

Test Data:

Table 2: Completion time and speedup ratio of the WRF test case

Figure 2: Comparison of WRF calculation performance improvement

It can be seen from Table 2 and Figure 2:

On a single node, the E5-2692V2 performs 28.1% better than the E5-2670 and 2.3% better than the E5-2680V2.

On two nodes, the E5-2692V2 performs 30.5% better than the E5-2670 and 6.7% better than the E5-2680V2.
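The speedup ratio when going from one node to two is conventionally the single-node time divided by the two-node time, and parallel efficiency is that ratio divided by the number of nodes. A minimal sketch follows, using hypothetical times (100 s and 55 s are placeholders, not values from Table 2) and hypothetical function names:

```python
def speedup(t_single, t_multi):
    """Speedup ratio of a multi-node run relative to the single-node run."""
    return t_single / t_multi

def parallel_efficiency(t_single, t_multi, nodes):
    """Fraction of ideal linear scaling achieved on `nodes` nodes."""
    return speedup(t_single, t_multi) / nodes

# Hypothetical completion times in seconds, for illustration only.
t1, t2 = 100.0, 55.0
print(f"speedup on 2 nodes:  {speedup(t1, t2):.2f}x")
print(f"parallel efficiency: {parallel_efficiency(t1, t2, 2):.1%}")
```

A CPU that scales better across nodes will show a larger lead on two nodes than on one, which is one way the single-node and two-node percentages can differ.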

Third, test summary

The benchmark tests and industry application tests comparing the Ivy Bridge-EP E5-2692V2 and E5-2680V2 with the Sandy Bridge-EP E5-2670 show that the E5-2692V2 performs very well.

The editor also learned from the relevant manufacturers that the E5-2692V2 is a processor dedicated to high-performance computing, with many internal optimizations for high-performance computing applications. Perhaps this is why Tianhe-2 chose this CPU.
