Of all the technologies in the datacenter, none is evolving more rapidly than network connectivity. Twenty years ago, 1GbE networking was considered exotic, and the infrastructure required to support it (e.g., NICs, switches, and cables) was expensive and rare. Three years ago, 10GbE networks were state-of-the-art, but now 25, 40, and even 100GbE networks have become standard in data centers. Not only has the network gotten faster, it has also become more sophisticated with the addition of virtualization, software-defined networking (SDN), overlay networks, and other technologies that hadn’t even been envisioned ten years ago but are now commonplace and are straining the resources of datacenter servers.
**Learn More about SmartNICs and the composable data center at Xilinx Adapt**
On networks faster than 10GbE, servers start to hit CPU bottlenecks as network packets are passed up to the CPU for processing. With 25GbE networks, a measurable percentage of the CPU’s time is spent processing network packets. To deal with this issue, we have developed techniques to push some networking functions down from the CPU to the network interface controller (NIC). We call the devices that can handle this offloading SmartNICs.
In this article, we will explain what a SmartNIC is, the value that SmartNICs bring to the data center, and why you should start to investigate and invest in them. Finally, we’ll look at a particularly innovative SmartNIC, the Xilinx SN1000.
What is a SmartNIC?
Offloading network operations from the CPU to a NIC has been a focus of major cloud providers, which are always pushing for efficiency in the data center. There is no hard-and-fast rule as to what qualifies a NIC as “smart”; however, at a minimum, a SmartNIC should be able to handle some of the control-plane functions found in virtual switches and some of the capabilities delivered through network function virtualization (NFV), such as firewalling, intrusion detection and prevention, host inspection, and encryption, as well as data-plane tasks such as network quality of service (QoS) and flow reporting and monitoring.
What is Driving the Adoption of SmartNICs?
Public clouds and hyperscalers have driven data center innovation for the past decade and will continue to do so for the foreseeable future. The technology that they use eventually drifts down to enterprise data centers, and the same is true of SmartNICs. In hindsight, SmartNICs simply put network functions where they should have been in the first place: on the NIC, rather than on the host, where they waste CPU cycles and motherboard bandwidth.
It is easy to envision how much network traffic gets discarded or hair-pinned back onto the network without adding any value to the system and CPU that are forced to deal with it, simply because NFV and other functions are implemented on a traditional server’s CPU rather than at the NIC level. Every cycle that can be offloaded from the CPU frees it to do productive work.
To illustrate how a SmartNIC can be beneficial, we can take something as simple as a distributed denial of service (DDoS) attack. Although a DDoS is rare in a modern data center, having a SmartNIC deal with one would allow the system’s CPU to continue with productive work rather than sorting, classifying, and discarding packets. A more modern example would be to have a SmartNIC, rather than the system CPU, handle the encapsulation of network packets used by overlay networks.
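To make the overlay example concrete, here is a minimal Python sketch of the per-packet work involved in VXLAN encapsulation, one of the overlay formats mentioned later in this article. It only builds and parses the 8-byte VXLAN header defined in RFC 7348; the function names are our own illustrative choices, and a real data path would also have to build the outer Ethernet, IP, and UDP headers. It is not how any particular SmartNIC implements the feature.

```python
import struct
from typing import Tuple

VXLAN_UDP_PORT = 4789        # IANA-assigned VXLAN destination port (RFC 7348)
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field carries a valid value


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VXLAN Network Identifier."""
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: 8 flag bits followed by 24 reserved bits.
    # Word 2: 24-bit VNI followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def vxlan_encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Return the UDP payload of a VXLAN packet: header plus inner Ethernet frame."""
    return vxlan_header(vni) + inner_frame


def vxlan_decapsulate(payload: bytes) -> Tuple[int, bytes]:
    """Split a VXLAN UDP payload back into (vni, inner_frame)."""
    if len(payload) < 8:
        raise ValueError("payload is shorter than a VXLAN header")
    flags_word, vni_word = struct.unpack("!II", payload[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return vni_word >> 8, payload[8:]


if __name__ == "__main__":
    frame = bytes(60)  # placeholder inner Ethernet frame (all zeros)
    packet = vxlan_encapsulate(frame, vni=5000)
    print(vxlan_decapsulate(packet)[0], len(packet))
```

Every packet entering or leaving the overlay needs this wrap or unwrap step, which is why doing it in software on the host CPU becomes costly at high packet rates.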
Why Xilinx is a Leader in SmartNIC Technology
With all emerging technologies, there are companies that position themselves at the forefront of the technology. These companies tend to be passionate and laser-focused on technology. They devote their energies to the goal of overcoming the myriad of hurdles that would prevent the new technology from coming into the marketplace. Xilinx is such a company.
Xilinx has a long history as an innovator in emerging technologies. For instance, the company invented the field-programmable gate array (FPGA) and is considered the leader in this technology. Bringing a new technology like a SmartNIC to market is not an inexpensive proposition, and with revenues of over $3 billion in 2020, Xilinx has the financial resources to do so. It also takes serious engineering and management know-how, and Xilinx has both.
In April 2019, Xilinx entered into an agreement to acquire Solarflare Communications, an early developer of ultra-low-latency networking and application acceleration, and a leader in SmartNIC technology. Later that year, Xilinx demonstrated a single-chip, FPGA-based 100G SmartNIC built on Solarflare and Xilinx technologies. This SmartNIC combined Xilinx FPGA, system-on-a-chip (SoC), and adaptive compute acceleration platform (ACAP) technologies with Solarflare’s technology to create a new converged SmartNIC solution, which became the Xilinx SN1000.
The Xilinx Alveo SN1000
The Xilinx SN1000 is a full-height, half-length (FHHL) NIC with a physical PCIe x16 connector (Gen 4 x8 or Gen 3 x16 electrical) and dual 100GbE copper or optical ports. It has a 16-core Cortex-A72 processor and an FPGA with over a million look-up tables (LUTs). A LUT is the basic building block from which an FPGA builds up its logic; the more LUTs an FPGA has, the more powerful and flexible it will be. The card has a total of 12 GB of DDR4 RAM, with 4 GB dedicated to the Arm processor and 8 GB to the FPGA. With this hardware, the SN1000 is able to offload 4 million stateful connections and process 100 million packets per second (PPS).
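For readers unfamiliar with LUTs, the idea can be modeled in a few lines of Python. This is a conceptual sketch only: the 4-input width and the `make_lut` helper are arbitrary choices for illustration, and they say nothing about the SN1000’s actual FPGA fabric. A LUT stores a small truth table, and its inputs simply select which stored bit to output, so loading a different table turns the same hardware into a different logic function.

```python
from typing import Callable


def make_lut(truth_table_bits: int, num_inputs: int = 4) -> Callable[..., int]:
    """Model a LUT: the input bits form an index into a stored truth table."""
    table_size = 1 << num_inputs
    if not 0 <= truth_table_bits < (1 << table_size):
        raise ValueError("truth table does not fit this LUT")

    def lut(*inputs: int) -> int:
        if len(inputs) != num_inputs:
            raise ValueError("wrong number of inputs")
        index = 0
        for position, bit in enumerate(inputs):
            index |= (bit & 1) << position  # input 0 is the least significant bit
        return (truth_table_bits >> index) & 1

    return lut


# "Programming" the LUT just means choosing the stored bits.
and4 = make_lut(1 << 15)  # only index 0b1111 outputs 1: a 4-input AND
xor4 = make_lut(0x6996)   # 1 wherever an odd number of inputs is set: 4-input XOR

if __name__ == "__main__":
    print(and4(1, 1, 1, 1), and4(1, 0, 1, 1))
    print(xor4(1, 0, 0, 0), xor4(1, 1, 0, 0))
```

An FPGA wires large numbers of such tables together, which is why the LUT count is a rough proxy for how much logic the device can implement.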
Application-specific integrated circuits (ASICs), FPGAs, and SoCs can all be used to make SmartNICs. ASICs can be performant; however, they have limited flexibility, and it is difficult to add functionality to them. And while SoCs are extremely flexible, they lack the speed of ASICs and FPGAs.
To provide both flexibility and performance in their SN1000 SmartNIC, Xilinx uses a powerful SoC for control-plane functions due to its inherent flexibility and mates it with an FPGA for data-plane functions for performance reasons. The primary advantage of using an FPGA over an ASIC is that an FPGA can be reprogrammed when new functionality is developed and/or needed, whereas it can take a year or longer to get a new ASIC in the field.
Developing code for an FPGA is not a trivial matter, but Xilinx has some excellent tools to assist with this need. Using the Xilinx development and programming toolset, Xilinx customers can write their own FPGA applications in a high-level programming language that software developers are accustomed to, rather than the hardware description languages that have traditionally been used for FPGA application development.
Xilinx has also developed an application marketplace where solutions developed by Xilinx and third parties can be obtained. This approach allows SN1000 buyers to achieve a faster time to value (TTV) by bypassing the development cycle. The app store has solutions for NFV, network security, image processing, machine learning (ML), and other functions that can and should be done at the SmartNIC level.
Xilinx applications are packaged as Docker containers. They can be evaluated for free and then purchased directly from the store with a credit card.
While Xilinx SmartNICs are a leading-edge product, this isn’t to say that they are so far ahead of the curve that adoption and usage are being held back. On the contrary, Xilinx SmartNICs are already being deployed to address specific use cases in public clouds, hyperscalers, and modern datacenters. A few examples of what they are being used for include VXLAN and NVGRE tunnel encapsulation, Open vSwitch (OVS), Intel DPDK, and virtio-net I/O.
Another interesting use case for SmartNICs is offloading storage functions, such as the Ceph object storage client and NVMe-oF, which is gaining popularity. Furthermore, for high-speed trading, Xilinx has stated that its SmartNIC can achieve nanosecond-scale latency for “tick to trade” algorithmic trading.
Video analytics is another sector where SmartNICs shine. Due to the volume of data involved with video, it is impractical to pass it back to a central repository. As a solution, SmartNICs are being used on edge devices to handle video interpretation functions such as mask detection, people counting and tracking, and virtual fencing, all of which require the kind of AI inference that an FPGA can handle quickly and efficiently.
Why You Need SmartNICs
With the rise in high-bandwidth networking, we are asking more and more from the servers in our datacenters. As network bandwidth grows, so does the number of packets that must be processed, leaving servers with fewer cycles to do profitable work. Some studies have shown that more than 20% of a server’s CPU cycles can be used for packet processing in traditional datacenters that have high-bandwidth networks. For example, a 3 GHz processor has approximately 300 cycles to process a 1,500-byte packet if it is to keep up with line rate.
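The cycle budget quoted above can be reproduced with simple arithmetic. The sketch below is a back-of-the-envelope calculation, not a measurement: the article does not state the link speed behind its figure, so the 100GbE value here is our assumption (it gives roughly 360 cycles for a 1,500-byte packet, in the same range as the number quoted). The function name, the single-core simplification, and the 38-byte and 20-byte per-frame overhead constants (standard Ethernet header, FCS, preamble, and inter-frame gap) are likewise our own additions.

```python
def cycles_per_packet(cpu_hz: float, link_bps: float,
                      packet_bytes: int, overhead_bytes: int = 0) -> float:
    """CPU cycles available per packet if a single core must keep up with line rate."""
    wire_bits = (packet_bytes + overhead_bytes) * 8
    # Time on the wire is wire_bits / link_bps; multiply by clock rate to get cycles.
    return cpu_hz * wire_bits / link_bps


CPU_HZ = 3e9                   # 3 GHz clock, as in the text
FULL_FRAME_OVERHEAD = 38       # 14 B header + 4 B FCS + 8 B preamble + 12 B inter-frame gap
MIN_FRAME_WIRE_OVERHEAD = 20   # preamble + inter-frame gap around a 64 B minimum frame

if __name__ == "__main__":
    for gbps in (10, 25, 100):  # link speeds are illustrative assumptions
        link_bps = gbps * 1e9
        big = cycles_per_packet(CPU_HZ, link_bps, 1500)
        big_wire = cycles_per_packet(CPU_HZ, link_bps, 1500, FULL_FRAME_OVERHEAD)
        small = cycles_per_packet(CPU_HZ, link_bps, 64, MIN_FRAME_WIRE_OVERHEAD)
        print(f"{gbps:>3} GbE: 1500 B -> {big:.0f} cycles "
              f"({big_wire:.0f} with framing), 64 B frame -> {small:.1f} cycles")
```

The budget shrinks in proportion to link speed and packet size, so small packets on a fast link leave a core only a few tens of cycles each, well short of what software packet processing typically needs.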
To free CPUs to do the high-value work that they were designed to carry out, we need to offload unnecessary functions to other devices, ones that are closer to their source. In this case, the right device to deal with networks is a SmartNIC.
While SmartNICs are not solely the purview of Xilinx, the company is positioned at the forefront of this emerging technology. By combining an SoC with an FPGA, Xilinx gets the ease of use and flexibility of a software-defined solution with the performance of a hardware-implemented solution. Xilinx knows that applications are needed to exploit SmartNICs, so it has created a programming environment that allows computer programmers, rather than hardware engineers, to develop the applications that run on its SmartNICs. For users looking for a faster TTV, Xilinx has an app store where third-party applications can be purchased.
For a modern datacenter to be competitive, it needs to free its servers from as many offloadable tasks as possible. These tasks include stateful firewalls, load balancing, IPsec, TLS, NVMe-over-TCP, virtio-blk storage access, data compression, and the myriad other functions that are better managed with a SmartNIC.
Learn More at Xilinx Adapt
Xilinx Adapt is a digital event on March 24-25, 2021 that will cover the relevance of SmartNICs in the datacenter along with important topics like cloud computing, computational storage, and the composable data center. Admission is free and replays will be made available afterward.