For $400,000 you could buy around 400 iPhone X handsets, 300 Surface Pro laptops, or eleven Tesla Model 3 electric cars. But it would take the full $400K and more to get your hands on just one Nvidia DGX-2 server, which is billed as "the world's most powerful AI system for the most complex AI challenges".
But does the DGX-2 live up to that claim, and is any server really worth such an eye-watering price tag?
To answer those questions you first have to understand that the DGX-2 isn't the first off-the-peg Nvidia server to be aimed at AI. That honour goes to the DGX-1, based on a mix of Intel Xeon processors paired with Nvidia's own AI-optimised Tesla V100 Volta-architecture GPUs. The DGX-2 continues that approach, but instead of eight Tesla V100s linked using Nvidia's NVLink bus, the DGX-2 comes with 16 of these mighty GPUs connected using its more scalable NVSwitch technology. According to Nvidia, this setup allows the DGX-2 to handle deep learning and other demanding AI and HPC workloads up to ten times faster than its smaller sibling.
Although it was announced at the same time as the DGX-1, it has taken a further six months for the larger model to appear. One of the first to make it to the UK was installed in the labs of Nvidia partner Boston Limited. They asked if we'd like to take a look: we did, and here's what we found.
The DGX-2 'unboxed'
As well as performance, size is a big differentiator with the DGX-2, which has the same crackle-finish gold bezel as the DGX-1 but is physically a lot bigger, weighing in at 154.2kg (340lbs) compared to 60.8kg (134lbs) for the DGX-1 and consuming ten rack units rather than three.
It's also worth noting that the DGX-2 draws a lot more power than its little brother, requiring up to 10kW at full tilt, rising to 12kW for the recently announced DGX-2H model (about which more shortly). The photo below shows the power arrangements needed at Boston to keep this little beast happy. Cooling, similarly, will need careful consideration, especially where more than one DGX-2 is deployed or where it's installed alongside other hardware in the same rack.
Distributing that power is a set of six hot-swap, redundant PSUs that slide in at the rear of the chassis along with the various modules that make up the rest of the system. Cooling, meanwhile, is handled by an array of ten fans located behind the front bezel, with room on either side for 16 2.5-inch storage devices in two banks of eight.
Nvidia includes eight 3.84TB Micron 9200 Pro NVMe drives as part of the base configuration, equating to just over 30TB of high-performance storage. This, however, is mainly for local data, with additional storage on the main motherboard for the OS and application code. It also leaves eight bays free to add more storage if needed. In addition, the DGX-2 is bristling with high-bandwidth network interfaces to connect to even more capacity and build server clusters if required.
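As a quick sanity check on those capacity figures, the quoted total is simply the per-drive size multiplied up. A trivial sketch (drive sizes and bay counts taken from the review itself):

```python
# Back-of-envelope check on the DGX-2 storage figures quoted above.
drive_capacity_tb = 3.84   # per Micron 9200 Pro NVMe drive
fitted_drives = 8          # drives in the base configuration
total_bays = 16            # two banks of eight behind the front bezel

fitted_tb = drive_capacity_tb * fitted_drives  # base capacity
max_tb = drive_capacity_tb * total_bays        # if the eight spare bays are filled alike

print(f"Base storage: {fitted_tb:.2f}TB")      # 30.72TB, i.e. "just over 30TB"
print(f"Fully populated: {max_tb:.2f}TB")      # 61.44TB
```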
The Intel bits
Pull out the main server tray and inside you'll find a conventional-looking Intel-based motherboard with two sockets for Xeon Platinum chips. On the system we looked at these were 24-core Xeon Platinum 8168 processors clocked at 2.7GHz, although Nvidia has since announced the DGX-2H model with slightly faster 3.1GHz Xeon Platinum 8174 processors along with newer 450W Volta 100 modules. This comes at the cost of requiring a lot more power (up to 12kW) and will likely add to the overall price, although at the time of writing the cost of this new model had yet to be confirmed.
Regardless of specification, the Xeon processors sit in the middle of the motherboard surrounded by 24 fully populated DIMM slots, giving buyers an impressive 1.5TB of DDR4 RAM to play with. Alongside this is a pair of 960GB NVMe storage sticks configured as a RAID 1 array, both to boot the OS (Ubuntu Linux) and to provide space for the DGX software stack and other applications.
The usual USB and network controllers are also built in, with two RJ-45 Gigabit ports at the back: one for out-of-band remote management and the other for general connectivity. One of the two available PCIe expansion slots also comes ready fitted with a dual-port Mellanox ConnectX-5 adapter that can accommodate Ethernet transceivers up to 100GbE for additional network bandwidth.
The second PCIe expansion slot is normally empty, but even more connectivity is available courtesy of the separate PCIe tray that sits just above the server motherboard. This adds a further eight PCIe interfaces filled, again, with Mellanox adapters that can be used to connect to clustered storage using either 10GbE Ethernet or InfiniBand EDR 100 transceivers.
The Nvidia components
And now the bit you've all been waiting for: the 16 Nvidia Tesla V100 GPUs which, partly because of their huge heatsinks (see below), have to be split across two baseboards.
As a reminder, this is what a Tesla Volta 100 module looks like:
And this is what eight Volta 100 modules look like when installed in one of the GPU trays of a DGX-2:
The GPU boards also hold the NVSwitches, which need to be physically linked in order for the Volta 100 modules to communicate and function as a single GPU. This is done by attaching two custom-designed backplanes to the rear of the baseboards once they have been pushed into the chassis.
The Tesla V100 GPUs themselves are much the same SXM modules as those in the latest DGX-1. Each is equipped with 32GB of HBM2 memory per GPU, so with 16 installed there's double the GPU memory: 512GB in total.
Each GPU also has 5,120 CUDA processing cores as well as 640 of the more specialised AI-optimised Tensor cores. Multiplied by 16, that gives 10,240 Tensor cores in total and a whopping 81,920 CUDA equivalents. All of which makes for a lot of processing power, further enhanced by the 2.4TB/sec of interconnect bandwidth available from the NVSwitch technology, with the capacity to scale even further in the future.
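Those totals follow directly from the per-GPU numbers quoted in the spec sheet; a minimal tally:

```python
# Tally the quoted per-GPU resources across all 16 Tesla V100 modules.
gpus = 16
cuda_cores_per_gpu = 5120
tensor_cores_per_gpu = 640
hbm2_gb_per_gpu = 32

print(gpus * cuda_cores_per_gpu)    # CUDA cores in total
print(gpus * tensor_cores_per_gpu)  # Tensor cores in total
print(gpus * hbm2_gb_per_gpu)       # GB of HBM2 in total
```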
Performance to go
So much, then, for the hardware. On top of this you also get a full stack of preinstalled AI tools, ready to power up and start working.
When reviewing a server, this is the point at which we would normally start talking about performance and the results of the tests we usually run to see how it stacks up. However, running benchmarks on the DGX-2 is a far from trivial task which, given the kind of deep learning and other HPC workloads involved, would require extended sessions over a number of days. So instead we have to rely on Nvidia's claims, along with feedback from the experts at Boston.
To this end, the headline figure for the DGX-2 is an impressive 2 petaFLOPS (PFLOPS) of processing power, delivered mainly by the Tensor cores to handle mixed AI training workloads. This figure rises to 2.1 PFLOPS on the DGX-2H with its faster 450W Tesla V100 modules.
To put that into perspective, this processing power enabled the DGX-2 to complete the FairSeq PyTorch benchmark in just 1.5 days: ten times faster than the 15 days needed for the same test on the DGX-1 just six months earlier. Moreover, Nvidia reckons that getting the same results using x86 technology would require 300 dual-socket Xeon servers, occupying 15 racks and costing around $2.7 million.
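Nvidia's comparison is easy to verify with a little arithmetic. The inputs below are the figures quoted above; the price ratio at the end is our own derivation, not an Nvidia claim:

```python
# Check the claimed FairSeq speedup and the x86 cost comparison.
dgx1_days, dgx2_days = 15, 1.5
print(dgx1_days / dgx2_days)  # 10.0, matching the "ten times faster" claim

dgx2_price = 400_000           # approximate DGX-2 list price
x86_cluster_price = 2_700_000  # 300 dual-socket Xeon servers across 15 racks
print(x86_cluster_price / dgx2_price)  # 6.75, so the x86 route costs ~6.75x as much
```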
All of which makes the DGX-2 look like a bargain at around $400,000 (or the equivalent in GB£), even when you add in the cost of support, which in the UK starts at around £26,000 (ex. VAT) per year. Despite the high price tag, businesses already investing in AI will find this quite affordable compared to the alternatives, which include renting compute time in shared data centres or the cloud. Nvidia is also keen to stress that the DGX-2 can be used to handle less exotic HPC workloads alongside its AI duties.
Bear in mind also that, although the DGX-1 and DGX-2 are breaking new ground, alternatives are on their way from other vendors. Not least Supermicro, which already lists on its website a server based on the same Nvidia HGX-2 reference design as the DGX-2. Others, such as Lenovo, aren't far behind, and these alternatives will inevitably help drive prices down. We'll be following these developments through 2019.
RECENT AND RELATED CONTENT
IBM, Nvidia pair up on AI-optimized converged storage system
IBM Spectrum AI with Nvidia DGX is designed for AI and machine learning workloads.
MLPerf benchmark results showcase Nvidia's top AI training times
For the first release of MLPerf, an objective AI benchmarking suite, Nvidia achieved top results in six categories.
Nvidia aims to run neural nets faster, more efficiently
As data gets bigger and models grow larger, deep learning is once again "completely gated by hardware". At the VLSI Symposia, Nvidia suggested some ways to address this problem.
Nvidia unveils the HGX-2, a server platform for HPC and AI workloads
The platform's unique high-precision computing capabilities are designed for the growing number of applications that combine high-performance computing with AI.
GPU computing: Accelerating the deep learning curve
To build and train deep neural networks you need serious amounts of multi-core computing power. We examine leading GPU-based solutions from Nvidia and Boston Limited.
AI skills reign supreme in the fastest-growing jobs of the year (TechRepublic)
Six of the 15 top emerging jobs in 2018 were related to artificial intelligence, according to LinkedIn.
Nvidia outlines inference platform, lands Japan's industrial giants as AI, robotics customers (TechRepublic)
The news highlights Nvidia's traction in AI and the data center.