Xilinx, Inc., a leader in adaptive computing, recently announced at the Global Supercomputing Conference (SC21) the launch of the Alveo™ U55C data center accelerator card and a standards-based, API-driven clustering solution for deploying FPGAs at scale. The Alveo U55C accelerator delivers excellent performance per watt for high-performance computing (HPC) and database workloads, and it scales out easily through the Xilinx® HPC clustering solution.
Figure: Xilinx Alveo U55C accelerator card
The new Alveo U55C card is purpose-built for HPC and big data workloads. It is the most powerful Alveo accelerator card Xilinx has released, offering the highest computing density and HBM capacity in the Alveo accelerator portfolio. Combined with Xilinx’s new RoCE v2-based clustering solution, it lets customers running large-scale computing workloads build powerful FPGA-based HPC clusters on their existing data center infrastructure and networks.
Salil Raje, executive vice president and general manager of the Data Center Group at Xilinx, said: “Scaling out Alveo’s computing power to target HPC workloads is now easier, more efficient, and more powerful. At the architectural level, FPGA-based accelerators such as the Alveo card can provide the highest performance at the lowest cost for many computationally intensive workloads. We have introduced a standards-based approach that lets customers create Alveo HPC clusters using their existing infrastructure and networks. With this approach, we are bringing these major advantages to any data center at scale. This is a major leap forward for the wider adoption of Alveo and adaptive computing in the data center.”
Designed for HPC and big data applications
The Alveo U55C card incorporates many key features required by current HPC workloads. It provides greater data pipeline parallelism, excellent memory management, optimized data movement across the entire pipeline, and the highest performance per watt in the Alveo product series. The Alveo U55C card adopts a single-slot, full-height, half-length (FHHL) form factor with a maximum power consumption of only 150 W. Compared with the previous-generation dual-slot Alveo U280 card, the Alveo U55C provides superior computing density and doubles the HBM2 capacity to 16 GB. The U55C delivers more computing power in a smaller form factor, which helps to create dense clusters of Alveo accelerators. It is designed for high-density streaming data, high-I/O math, and large-scale computing problems that require scale-out performance, such as big data analytics and AI applications.
By using RoCE v2 and data center bridging, combined with 200 Gbps of bandwidth, this API-driven clustering solution enables an Alveo network to rival InfiniBand networks in performance and latency, without vendor lock-in. MPI (Message Passing Interface) integration lets HPC developers scale out Alveo data pipelines from the Xilinx Vitis™ unified software platform. Using existing open standards and frameworks, performance can now be scaled across hundreds of Alveo cards, regardless of server platform and network infrastructure, with shared workloads and storage.
With high-level programmability for both applications and clusters, software developers and data scientists can use the Vitis platform to unlock the advantages of Alveo and adaptive computing. Xilinx has invested heavily in the Vitis development platform and tool flow to make adaptive computing more accessible to software developers and data scientists without hardware expertise. The Vitis platform supports mainstream AI frameworks such as PyTorch and TensorFlow, as well as high-level programming languages such as C, C++, and Python. Developers can build domain-specific solutions using dedicated APIs and libraries, or use Xilinx software development kits to easily accelerate critical HPC workloads in existing data centers.
HPC customer use cases
CSIRO is Australia’s national research organization and has the world’s largest radio telescope antenna array. CSIRO is currently using the Alveo U55C card for signal processing in its Square Kilometer Array radio telescope. Deploying Alveo cards as network-attached accelerators equipped with HBM delivers large-scale throughput across the entire HPC signal processing cluster. The Alveo-based cluster enables CSIRO to handle the massive computing task of aggregating, filtering, preparing, and processing data from 131,000 antennas in real time. The 420 Alveo U55C cards are fully networked through P4-enabled 100 Gb/s switches, providing 460 GB/s of HBM2 bandwidth across the signal processing cluster. The Alveo U55C cluster can reach a total throughput of 15 Tb/s at lower power consumption and with significant cost savings. CSIRO is now completing an Alveo reference design to help others in radio astronomy and adjacent industries achieve the same success.
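To put the cluster figures quoted above in perspective, the short Python sketch below works out rough per-card averages. It assumes the load is spread evenly across all 420 cards, which the announcement does not state. The function names are hypothetical and serve only as illustration.

```python
# Back-of-the-envelope averages from the figures quoted in the article.
# Assumption (not stated in the source): work is split evenly across cards.

NUM_CARDS = 420                 # Alveo U55C cards in the cluster
NUM_ANTENNAS = 131_000          # antennas feeding the cluster
TOTAL_THROUGHPUT_TBPS = 15.0    # total cluster throughput, terabits per second


def per_card_throughput_gbps(total_tbps: float, cards: int) -> float:
    """Average throughput per card in Gb/s, assuming an even split."""
    return total_tbps * 1000.0 / cards


def antennas_per_card(antennas: int, cards: int) -> float:
    """Average number of antennas served by each card, assuming an even split."""
    return antennas / cards


if __name__ == "__main__":
    gbps = per_card_throughput_gbps(TOTAL_THROUGHPUT_TBPS, NUM_CARDS)
    share = antennas_per_card(NUM_ANTENNAS, NUM_CARDS)
    print(f"Average throughput per card: {gbps:.1f} Gb/s")
    print(f"Average antennas per card:   {share:.1f}")
```

Under that even-split assumption, each card would handle roughly 35.7 Gb/s and the data from about 312 antennas on average. The real distribution depends on CSIRO's pipeline design.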
Ansys LS-DYNA crash simulation software is used by almost every automobile company in the world. The design of safety and structural systems often depends on the performance of the model, because computer-aided finite element method (FEM) simulation can reduce the cost of physical crash testing. The FEM solver is the main algorithm driving simulations with hundreds of millions of degrees of freedom, and these huge algorithms can be broken down into more basic solvers, such as PCG, sparse matrix, and ICCG. By scaling out across a large number of Alveo cards with hyper-parallel data pipelines, LS-DYNA can run more than 5 times faster than on x86 CPUs. The Alveo pipeline gets more work done per clock cycle, allowing LS-DYNA customers to benefit from breakthrough simulation times.
Wim Slagter, director of strategic partnerships at Ansys, said: “In the spirit of relentless innovation, we are very happy to work with Xilinx to greatly accelerate the finite element solver in our LS-DYNA simulation application, which accounts for 90% of the implicit finite element method workload. With the help of Xilinx acceleration, we look forward to fulfilling our mission of supporting innovators in designing the future.”
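For readers unfamiliar with the PCG (preconditioned conjugate gradient) solver named above, the following is a minimal pure-Python sketch of the textbook algorithm with a Jacobi (diagonal) preconditioner. It is not LS-DYNA's solver or an Alveo implementation. The function names and the tiny dense test matrix are assumptions chosen for illustration, and production FEM solvers operate on very large sparse matrices.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mat_vec(a: Matrix, x: Vector) -> Vector:
    """Dense matrix-vector product."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two vectors."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))


def pcg_jacobi(a: Matrix, b: Vector, tol: float = 1e-10,
               max_iter: int = 1000) -> Vector:
    """Solve a @ x = b for a symmetric positive-definite matrix a,
    using conjugate gradient with a Jacobi (diagonal) preconditioner."""
    n = len(b)
    inv_diag = [1.0 / a[i][i] for i in range(n)]   # Jacobi preconditioner M^-1
    x = [0.0] * n                                  # initial guess x = 0
    r = list(b)                                    # residual r = b - a @ x
    z = [inv_diag[i] * r[i] for i in range(n)]     # preconditioned residual
    p = list(z)                                    # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:                 # converged: small residual
            break
        ap = mat_vec(a, p)
        alpha = rz / dot(p, ap)                    # step length along p
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz                         # keeps directions conjugate
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x


if __name__ == "__main__":
    a = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(pcg_jacobi(a, b))   # exact solution is [1/11, 7/11]
```

Each iteration is dominated by one matrix-vector product and a few vector updates. These are regular, streaming operations of the kind that suit deeply pipelined hardware.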
TigerGraph, a leading graph analytics platform provider, is using a cluster of multiple Alveo U55C cards to accelerate the two most efficient algorithms that drive graph-based recommendation and clustering engines. For data scientists, graph databases are a disruptive platform. A graph brings together data from separate silos and focuses on the relationships between data. The next frontier for graph is finding answers in real time. The Alveo U55C shortens the query and prediction time of recommendation engines from a few minutes to a few milliseconds. Compared with CPU-based clusters, scaling out analytics across multiple U55C cards, with the computing power and memory bandwidth they provide, can increase graph query speed by up to 45 times. The quality score has also increased by up to 35%, which significantly increases confidence and reduces the probability of false positives to the low single digits.
Product availability and easy evaluation
The Alveo U55C card is currently available through China.xilinx.com and Xilinx authorized distributors. It can also be evaluated easily through public cloud-based FPGA-as-a-Service (FaaS) providers, and is available for exclusive preview through selected hosted data centers. The clustering solution is now available in exclusive preview and is expected to be generally available in the second quarter of next year.
Xilinx will showcase the Alveo U55C accelerator card and partner solutions at the Global Supercomputing Conference (SC21) this week. Register for SC21 and visit the Xilinx virtual booth.