... | ... | @@ -41,7 +41,7 @@ This wiki page gives an overview of the various server machines offered by the A |
|
|
| architecture | [Nehalem](https://en.wikipedia.org/wiki/Nehalem_(microarchitecture)) | [Nehalem](https://en.wikipedia.org/wiki/Nehalem_(microarchitecture)) | [Skylake-H](https://en.wikichip.org/wiki/intel/microarchitectures/skylake_(client)) | [Skylake-SP](https://en.wikichip.org/wiki/intel/microarchitectures/skylake_(server)) | [Skylake-SP](https://en.wikichip.org/wiki/intel/microarchitectures/skylake_(server)) | [Skylake-SP](https://en.wikichip.org/wiki/intel/microarchitectures/skylake_(server)) | [Knights Landing](https://en.wikichip.org/wiki/intel/microarchitectures/knights_landing) |
|
|
|
| vector instructions | SSE4.2 | SSE4.2 | AVX2 | AVX-512 | AVX-512 | AVX-512 | AVX-512 |
|
|
|
| cores per socket | 4 | 4 | 4 | 8 | 16 | 24 | 64 |
|
|
|
| SMT | | | 2 threads per core | | | | |
|
|
|
| simultaneous multithreading | | | 2 threads per core | | | | |
|
|
|
| base frequency | 2.0 GHz | 2.4 GHz | 3.5..3.9 GHz | 2.4..3.0 GHz | 2.8..3.7 GHz | 3.4..3.7 GHz | 1.3..1.5 GHz |
|
|
|
| AVX2 frequency | | | 3.5..3.9 GHz | 2.1..2.9 GHz | 2.4..3.6 GHz | 3.0..3.6 GHz | 1.1 GHz |
|
|
|
| AVX-512 frequency | | | | 1.3..1.8 GHz | 1.9..3.5 GHz | 2.5..3.5 GHz | 1.1 GHz |
|
... | ... | @@ -60,14 +60,15 @@ instruction set used. A tabular overview can be found on the WikiChip pages link |
|
|
value ranges given in the table ("A..B GHz") state the clock frequencies for full load (A) and
|
|
|
single-core load (B).
|
|
|
|
|
|
Simultaneous multithreading ("hyperthreading") is enabled on the mp-media* machines and disabled
|
|
|
on all other machines.
|
|
|
[Simultaneous multithreading](https://en.wikipedia.org/wiki/Simultaneous_multithreading) (also
|
|
|
known as "hyperthreading") is enabled on the mp-media* machines and disabled on all other
|
|
|
machines.
|
|
|
|
|
|
eDRAM refers to the Embedded DRAM in the Skylake-H CPU, a memory side cache shared between CPU
|
|
|
and GPU, cf.
|
|
|
[https://en.wikichip.org/wiki/intel/microarchitectures/skylake_(client)](https://en.wikichip.org/wiki/intel/microarchitectures/skylake_(client)#eDRAM_architectural_changes).
|
|
|
[https://en.wikichip.org/wiki/intel/microarchitectures/skylake_(client)#eDRAM_architectural_changes](https://en.wikichip.org/wiki/intel/microarchitectures/skylake_(client)#eDRAM_architectural_changes).
|
|
|
|
|
|
HBM is the configurable high-bandwidth memory available on the Knights Landing CPUs, cf.
|
|
|
HBM refers to the configurable high-bandwidth memory available on the Knights Landing CPUs, cf.
|
|
|
[https://colfaxresearch.com/knl-mcdram/](https://colfaxresearch.com/knl-mcdram/).
|
|
|
|
|
|
|
... | ... | @@ -75,10 +76,10 @@ HBM is the configurable high-bandwidth memory available on the Knights Landing C |
|
|
|
|
|
| measure | Xeon E5504 | Xeon E5530 | Xeon E3-1585 | Xeon Silver 4110 | Xeon Gold 6130 | Xeon Platinum 8168 | Xeon Phi 7210 |
|
|
|
| ---------------------------- | ----------- | ----------- | ---------------- | ---------------- | ---------------- | ------------------ | ---------------- |
|
|
|
| L1 read per core | | | f/GHz × 128 GB/s | f/GHz × 128 GB/s | f/GHz × 128 GB/s | f/GHz × 128 GB/s | f/GHz × 128 GB/s |
|
|
|
| L1 write per core | | | f/GHz × 64 GB/s | f/GHz × 64 GB/s | f/GHz × 64 GB/s | f/GHz × 64 GB/s | f/GHz × 64 GB/s |
|
|
|
| L1 read+write per core | | | f/GHz × 192 GB/s | f/GHz × 192 GB/s | f/GHz × 192 GB/s | f/GHz × 192 GB/s | f/GHz × 192 GB/s |
|
|
|
| L2 per core | | | f/GHz × 64 GB/s | f/GHz × 64 GB/s | f/GHz × 64 GB/s | f/GHz × 64 GB/s | f/GHz × 64 GB/s |
|
|
|
| L1 read per core | | | $`f`$/GHz × 128 GB/s | $`f`$/GHz × 128 GB/s | $`f`$/GHz × 128 GB/s | $`f`$/GHz × 128 GB/s | $`f`$/GHz × 128 GB/s |
|
|
|
| L1 write per core | | | $`f`$/GHz × 64 GB/s | $`f`$/GHz × 64 GB/s | $`f`$/GHz × 64 GB/s | $`f`$/GHz × 64 GB/s | $`f`$/GHz × 64 GB/s |
|
|
|
| L1 read+write per core | | | $`f`$/GHz × 192 GB/s | $`f`$/GHz × 192 GB/s | $`f`$/GHz × 192 GB/s | $`f`$/GHz × 192 GB/s | $`f`$/GHz × 192 GB/s |
|
|
|
| L2 per core | | | $`f`$/GHz × 64 GB/s | $`f`$/GHz × 64 GB/s | $`f`$/GHz × 64 GB/s | $`f`$/GHz × 64 GB/s | $`f`$/GHz × 64 GB/s |
|
|
|
| eDRAM, single-threaded | | | | | | | |
|
|
|
| eDRAM, concurrent | | | | | | | |
|
|
|
| HBM, single-threaded | | | | | | | |
|
... | ... | @@ -86,9 +87,9 @@ HBM is the configurable high-bandwidth memory available on the Knights Landing C |
|
|
| DRAM, single-threaded | | | 17.1 GB/s | 19.2 GB/s | 21.3 GB/s | 21.3 GB/s | |
|
|
|
| DRAM, concurrent, per socket | | | 34.1 GB/s | 115.2 GB/s | 128 GB/s | 128 GB/s | ~90 GB/s |
|
|
|
|
|
|
Note that all bandwidth indications use decimal unit prefixes, i.e. 1 GB = $10^9$ B.
|
|
|
Note that all bandwidth indications use decimal unit prefixes, i.e. 1 GB = 10⁹ B.
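The DRAM rows of the table are consistent with the usual peak-bandwidth estimate of transfer rate × 8 B per transfer × number of memory channels. The sketch below reproduces two of the table values under that assumption. The memory configurations used (DDR4-2400 with six channels, DDR4-2133 with two channels) and the function name are assumptions for illustration and are not stated on this page.

```python
def dram_bandwidth_gb_per_s(mega_transfers_per_s: float, channels: int = 1,
                            bytes_per_transfer: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in decimal GB/s (1 GB = 10^9 B)."""
    # MT/s * B/transfer gives MB/s; dividing by 1000 converts to decimal GB/s.
    return mega_transfers_per_s * bytes_per_transfer * channels / 1000.0


# Assumed configuration: DDR4-2400, one channel vs. six channels.
single_channel = dram_bandwidth_gb_per_s(2400)          # about 19.2 GB/s
six_channels = dram_bandwidth_gb_per_s(2400, channels=6)  # about 115.2 GB/s

# Assumed configuration: DDR4-2133 (2133.33 MT/s), two channels.
two_channels = dram_bandwidth_gb_per_s(6400 / 3, channels=2)  # about 34.1 GB/s

print(f"{single_channel:.1f} GB/s, {six_channels:.1f} GB/s, {two_channels:.1f} GB/s")
```

These are theoretical peak values; measured bandwidth is typically lower.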
|
|
|
|
|
|
The cache bandwidths scale with the clock frequency $f$ which depends on the number of cores
|
|
|
The cache bandwidths scale with the clock frequency $`f`$, which depends on the number of cores
|
|
|
under load and the instruction set used. Computations are based on the number of 64 B cache lines
|
|
|
the CPUs can fetch or store in each cycle.
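As a rough illustration of this computation, the following sketch evaluates the per-core cache bandwidth for a given clock frequency. The cache-lines-per-cycle values are inferred from the table entries (128 GB/s per GHz corresponds to two 64 B lines per cycle, 64 GB/s per GHz to one); the function and constant names are hypothetical.

```python
CACHE_LINE_BYTES = 64  # cache line size stated in the text

# Cache lines moved per cycle, inferred from the table entries.
LINES_PER_CYCLE = {
    "L1 read": 2,        # f/GHz x 128 GB/s
    "L1 write": 1,       # f/GHz x 64 GB/s
    "L1 read+write": 3,  # f/GHz x 192 GB/s
    "L2": 1,             # f/GHz x 64 GB/s
}


def cache_bandwidth_gb_per_s(freq_ghz: float, measure: str) -> float:
    """Per-core peak bandwidth in decimal GB/s at clock frequency freq_ghz."""
    # GHz * B/cycle = 10^9 B/s = GB/s, so no further unit conversion is needed.
    return freq_ghz * LINES_PER_CYCLE[measure] * CACHE_LINE_BYTES


# Example: the AVX-512 frequency range 1.9..3.5 GHz listed in the table.
for f in (1.9, 3.5):
    print(f"{f} GHz: L1 read {cache_bandwidth_gb_per_s(f, 'L1 read'):.1f} GB/s")
```

Since the frequency depends on the load and the instruction set, the appropriate value of `freq_ghz` has to be taken from the frequency table for the scenario of interest.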
|
|
|
|
... | ... | |