Firmware Updates and Initial Performance Data for Data Center Systems
Over the past several days, Intel has made further progress to address the exploits known as “Spectre” and “Meltdown.” We are continuing to support our customers through this process and we remain focused on doing so. As we continue these efforts, I would like to express my appreciation to many of our partners, including Dell, HPE, HPI, Lenovo and Microsoft, for joining our Security-First Pledge.
I’ll be covering two topics in this blog post: first, our progress in rolling out firmware updates for the exploits and in addressing the reboot issue I discussed last week; and second, initial data from the benchmarking we are doing on data center platforms.
We have now issued firmware updates for 90 percent of Intel CPUs introduced in the past five years, but we have more work to do. As I noted in my blog post last week, while the firmware updates are effective at mitigating exposure to the security issues, customers have reported more frequent reboots on firmware-updated systems.
As part of this investigation, we have determined that similar behavior occurs on other products in some configurations, including Ivy Bridge-, Sandy Bridge-, Skylake-, and Kaby Lake-based platforms. We have reproduced these issues internally and are making progress toward identifying the root cause. In parallel, we will be providing beta microcode to vendors for validation by next week.
For those customers looking for additional guidance, we have provided more information on the Intel.com Security Center site. I will also continue to provide regular updates on the status.
Data Center Performance Testing
On Jan. 10, I provided initial performance data for client systems, and today I have initial results to share on the data center side. These results come from industry-standard benchmarks and are helpful, but we understand that what ultimately matters to our customers is their own workloads. To date, we have tested server platforms running two-socket Intel Xeon® Scalable® systems (code-named Skylake), our latest server microarchitecture.
As expected, our testing results to date show a performance impact that varies depending on specific workloads and configurations. Generally speaking, workloads that incorporate a larger number of user/kernel privilege changes and spend a significant amount of time in privileged mode will be more adversely impacted.
To summarize what we’ve seen in testing so far:
- Impacts ranging from 0-2% on industry-standard measures of integer and floating point throughput, Linpack, STREAM, server-side Java and energy efficiency benchmarks. These benchmarks represent several common workloads important to enterprise and cloud customers.
- An online transaction processing (OLTP) benchmark modeling a brokerage firm’s customer-broker-stock exchange interaction showed a 4% impact. More analytics testing is in progress, and the results will depend on system configuration, test setup and benchmark used.
- Benchmarks for storage also showed a range of results depending on the benchmark, test setup and system configuration:
  - For FlexibleIO, a benchmark simulating different types of I/O loads, results depend on many factors, including read/write mix, block size, drives and CPU utilization. When we conducted testing to stress the CPU (100% write case), we saw an 18% decrease in throughput performance because there was no CPU utilization headroom. When we used a 70/30 read/write model, we saw a 2% decrease in throughput performance. When CPU utilization was low (100% read case), as is the case with common storage provisioning, we saw an increase in CPU utilization, but no throughput performance impact.
  - Storage Performance Development Kit (SPDK) tests, which provide a set of tools and libraries for writing high performance, scalable, user-mode storage applications, were measured in multiple test configurations. Using SPDK iSCSI, we saw as much as a 25% impact while using only a single core. Using SPDK vHost, we saw no impact.
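For readers who want to run the same kind of comparison on their own workloads, the percentages above can be read as the relative drop in throughput against a baseline measured before the firmware and software updates. Here is a minimal Python sketch under that assumption. The function name and the sample figures are hypothetical and chosen only for illustration; they are not Intel's measurement data or test methodology.

```python
def percent_impact(baseline: float, patched: float) -> float:
    """Return the throughput decrease as a percentage of the baseline.

    A positive result means the patched system is slower; a negative
    result means it measured faster than the baseline.
    """
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return 100.0 * (baseline - patched) / baseline


# Hypothetical throughput readings in MB/s (illustrative values only).
baseline_mbps = 1000.0
patched_mbps = 820.0

print(f"{percent_impact(baseline_mbps, patched_mbps):.1f}% decrease")
```

Measuring the same workload several times on each configuration and comparing averages would give a steadier number than a single run.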
More details on the specific benchmarks, platforms and results are summarized in the table below. In those areas where we are seeing higher impacts, we are working hard with our partners and customers to identify ways to address them. For example, there are other mitigation options that could yield less impact. More details on some of these options can be found in our white paper and in Google’s post on their “Retpoline” security solution.
I will continue to share information on our progress, including more performance data on additional older platforms, in future updates.
Navin Shenoy is executive vice president and general manager of the Data Center Group at Intel Corporation.