
Intel Microarchitecture White Paper

The white paper introduces Intel's new Nehalem microarchitecture, which enhances energy efficiency and performance for the Xeon processor 3500 and 5500 series. It emphasizes dynamic scalability, allowing for optimized performance across various computing environments while managing power costs effectively. Key features include Intel Turbo Boost and Hyper-Threading technologies, which provide significant performance improvements for multi-threaded applications.


WHITE PAPER
Intel® Xeon® processor 3500 and 5500 series
Intel® Microarchitecture

First the Tick, Now the Tock: Intel® Microarchitecture (Nehalem)

Jeff Casazza
Intel Corporation

Introducing a New Dynamically and Design-Scalable Microarchitecture that Rewrites the Book on Energy Efficiency and Performance

Since the introduction of Intel® Core™ microarchitecture in 2006 and its 2007 45nm enhancements in the Intel Core microarchitecture (Penryn) family of processors, the blistering performance and energy efficiency of Intel® microprocessors have delivered unprecedented capabilities to computer users. Now a new microarchitecture named Nehalem (the foundation of the Intel® Xeon® processor 3500 and 5500 series) builds on these earlier microarchitectural marvels, rewriting the book on processor scalability, performance, and energy efficiency.

The first chapter is all about scalability. Intel® microarchitecture (Nehalem) is a dynamically scalable and design-scalable microarchitecture. At runtime, it dynamically manages cores, threads, cache, interfaces, and power to deliver outstanding energy efficiency and performance on demand. At design time, it scales easily, enabling Intel to provide versions optimized for each server, desktop, and notebook market. Intel will deliver versions differing in the number of cores, caches, interconnect capability, and memory controller capability, as well as in the inclusion of an integrated graphics controller. This allows Intel to deliver a wide range of price, performance, and energy efficiency targets for servers, workstations, desktops, and laptops.

Intel microarchitecture’s (Nehalem’s) energy efficiency and performance come at a critical crossroads in computing. In the past, when a computer’s energy efficiency wasn’t a concern, nearly every architecture feature that could improve processor performance would be included without worrying about the power cost. But in an age of increasing concern for limited resources and increased energy costs, every segment (server, workstation, desktop, and mobile) is power-constrained, and designing a microarchitecture requires a different approach. Processor manufacturers must consider the power cost whether the processor is intended for the home, data center, or ultra-light laptop.

Intel weighed every architectural feature added to Intel microarchitecture (Nehalem) against a strict power/performance efficiency threshold. If the feature couldn’t add more than a one percent performance gain for a less than three percent power cost, Intel wouldn’t add it. By measuring the benefit of the performance gain against the power cost, Intel was able to design Intel microarchitecture (Nehalem) to deliver greater power efficiency at any power envelope.
White Paper First Tick, Now Tock: Intel® Microarchitecture (Nehalem)

Table of Contents
Introducing a New Dynamically and Design-Scalable Microarchitecture that Rewrites the Book on Energy Efficiency and Performance
The Adaptability of the Intel® Xeon® Processor 3500 and 5500 Series
Example: Intel® Xeon® Processor 3500 and 5500 Series
Unlocking All the Power of Intel’s 45nm Hi-k Metal Gate Process Technology
An Overview of Intel’s New Microarchitecture
An In-Depth Look at Top Features
1. Intel® Turbo Boost Technology
2. Intel® Hyper-Threading Technology
3. Other Key Performance Improvement Features
  Intel® Smart Cache Enhancements
  Instructions per Cycle Improvements
  Enhanced Branch Prediction
  New Application Targeted Accelerators and Intel® SSE4
  Improved Virtualization Performance
4. New System Architecture: Intel® QuickPath Technology
  Intel® QuickPath Interconnect Performance
5. Intel® Intelligent Power Technology
6. Intel® Virtualization Technology
Summary

The Adaptability of the Intel® Xeon® Processor 3500 and 5500 Series
• Performance: Maximizes performance by adapting to the workload through Intel® Turbo Boost Technology and Intel® Hyper-Threading Technology.
• Software Adaptable: The processor adapts to the way your application wants to run.
• Energy Efficiency: Automatically puts the CPU into the lowest available power state while still meeting performance requirements.
• IT Adaptable: You can enable automatic operation or selectively configure for manual control.
[Figure labels: Intel® Turbo Boost Technology, Integrated Power Gates, Quad-Core with Hyper-Threading, Automated Lower Power States]

Example: Intel® Xeon® Processor 3500 and 5500 Series
A good example of how Intel microarchitecture (Nehalem) enables the scaling of energy efficiency and performance can be seen in the Intel Xeon processor 3500 and 5500 series. These two server/workstation processor series incorporate a number of Intel’s innovative technologies to deliver intelligent performance. (Each is discussed in more depth later in this paper.)

• Intel® Turbo Boost Technology. This technology (in combination with Intel® Intelligent Power Technology, described below) delivers performance on demand, allowing processors to operate above the rated frequency to speed specific workloads and drop back down to reduce power consumption during low utilization periods.

• Intel® Hyper-Threading Technology.¹ This well-known Intel innovation provides more performance for applications designed for parallel, multi-threaded execution by reducing computational latency and making optimal use of every cycle. Intel® Hyper-Threading Technology benefits from this latest Intel microarchitecture’s larger caches and massive memory bandwidth, delivering greater throughput and responsiveness for multi-threaded applications.

• Intel® QuickPath Technology. This new, scalable, shared memory architecture integrates a memory controller into each microprocessor and connects processors and other components with a new high-speed interconnect. It speeds traffic between processors and I/O controllers for bandwidth-intensive applications, delivering up to 3.5 times the bandwidth for technical computing.²

• Intel® Intelligent Power Technology. This feature enables policy-based control that allows processors to operate at optimal frequency and power. Operating systems can make this determination automatically, or administrators can designate which applications require high-frequency processing and which should be executed at lower frequencies to conserve power.³

• Intel® Virtualization Technology.⁴ This latest generation version of Intel® Virtualization Technology (Intel® VT) enhances virtualization performance by up to 2.1 times⁵ with new hardware-assist capabilities across server and workstation elements.


Unlocking All the Power of Intel’s 45nm Hi-k Metal Gate Process Technology
Intel microarchitecture (Nehalem) marks the next step (a “tock”) in Intel’s rapid “tick-tock” cadence for delivering a new process technology (tick) or an entirely new microarchitecture (tock) every year.

Intel microarchitecture (Nehalem) was designed from the ground up to capitalize on all the advantages of Intel’s industry-leading 45-nanometer (nm) Hi-k metal gate silicon technology. This process technology is one of the biggest advancements in fundamental transistor design in 40 years. It uses a unique material combination of Hi-k gate dielectrics and conductors to enable Intel to continue record-breaking PC, laptop, and server processor performance while reducing the electrical leakage from transistors that can hamper chip and PC design, size, power consumption, and costs.

Intel’s 45nm Hi-k silicon process technology increases transistor switching speeds to enable higher core and bus clock frequencies and thus more performance in the same power and thermal envelope. This performance efficiency is helping Intel extend Moore’s Law (a high-tech industry axiom that transistor counts double about every two years to deliver ever more functionality at exponentially decreasing cost) well into the next decade.

Figure 1. Intel® microarchitecture (Nehalem).

An Overview of Intel’s New Microarchitecture
Intel microarchitecture (Nehalem) continues Intel’s philosophy of focusing on improvements in how the processor uses available clock cycles and power, rather than just pushing up ever higher clock speeds and energy needs. The goal is to do more in the same power envelope, or even in a reduced one. In turn, like its Intel Core microarchitecture predecessor (Penryn), Intel microarchitecture (Nehalem) includes the ability to process up to four instructions per clock cycle on a sustained basis, compared to just three instructions per clock cycle or less processed by other processors. However, the biggest innovations offered by Intel microarchitecture (Nehalem) come from new optimizations of the individual cores and the overall multi-core microarchitecture to increase single-thread and multi-thread performance.

Intel microarchitecture (Nehalem) includes these performance and power management innovations:

• Dynamically managed cores, threads, cache, interfaces, and power

• Intel Hyper-Threading Technology, a capability that enables running two simultaneous threads per core: an amazing eight simultaneous threads per quad-core processor and 16 simultaneous threads for dual-processor, quad-core designs. This feature provides an energy-efficient means of increasing performance for multi-threaded workloads.

• Innovative extensions to the Intel® Streaming SIMD Extensions 4 (SSE4) that center on enhancing Extensible Markup Language (XML), string, and text processing performance

• Superior multi-level cache, including an inclusive shared L3 (last-level) cache

• New high-end system architecture that delivers from two to three times more peak bandwidth and up to four times more realized bandwidth (depending on configuration) as compared to previous Intel Xeon processors

• Performance-enhanced dynamic power management

On the design side, Intel microarchitecture (Nehalem) enables optimal price/performance/energy efficiency for each market segment through:

• Scalable performance from one to 16 (or more) threads and from one to eight (or more) cores

• Scalable and configurable system interconnects and integrated memory controllers

• High-performance integrated graphics engine for client platforms
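
The Hyper-Threading thread counts quoted in this overview follow from simple multiplication. A minimal sketch, assuming two hardware threads per core as stated in the text (the function name `logical_threads` is illustrative and is not an Intel API):

```python
def logical_threads(sockets: int, cores_per_socket: int, threads_per_core: int = 2) -> int:
    """Hardware threads visible to software when Hyper-Threading is enabled."""
    return sockets * cores_per_socket * threads_per_core


print(logical_threads(1, 4))  # one quad-core processor -> 8
print(logical_threads(2, 4))  # dual-processor, quad-core design -> 16
```

Setting `threads_per_core=1` models the same system with Hyper-Threading disabled.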


An In-Depth Look at Top Features
We will now provide a more in-depth look at some of the top features of Intel microarchitecture (Nehalem) that are available in the Intel Xeon processor 3500 and 5500 series.

1. Intel® Turbo Boost Technology
Intel® Turbo Boost Technology is an innovative feature that automatically allows active processor cores to run faster than the base operating frequency when there is available headroom within power, current, and temperature specification limits. This enables the Intel Xeon processor 3500 and 5500 series to deliver extra performance when and where it is needed (see Figure 2). This can be particularly advantageous in speeding up the processing of light or lightly threaded workloads.

Figure 2. Intel® Turbo Boost Technology increases performance by increasing processor frequency and enabling faster speeds when conditions allow.
[Chart “Higher Performance on Demand”: frequency of cores 0 to 3 in three panels. Normal: all cores operate at rated frequency. 4C Turbo: all cores operate at higher frequency. <4C Turbo: fewer cores may operate at even higher frequencies.]

Intel Turbo Boost Technology is activated when the operating system requests the highest processor performance state. Headroom is dynamically assessed by continual measurement of temperature, current draw, and power consumption.

The maximum frequency of Intel Turbo Boost Technology is dependent on the number of active cores. The amount of time the processor spends in the Intel Turbo Boost Technology state depends on the workload and operating environment.

Any of the following can set the upper limit of Intel Turbo Boost Technology on a given workload:

• Number of active cores
• Estimated current consumption
• Estimated power consumption
• Processor temperature

The number of active cores at any given instant dictates the upper limit of Intel Turbo Boost Technology. A core is considered “active” if it is in the “C0” or “C1” state; a core in the “C3” or “C6” state is considered “inactive.” (C-states are the power conservation states of a processor core.) The upper limits vary by processor number. For example, a particular processor may allow up to two frequency steps (266.66 MHz) when just one core is active and one frequency step (133.33 MHz) when two or more cores are active. Therefore, higher C-state residency (“C3” or “C6”) on some cores will generally result in increased core frequency on the active cores.

The upper limits are further constrained by temperature, power, and current. These constraints are managed as a simple closed-loop control system. If measured temperature, power, and current are all below factory-configured limits, and the operating system (OS) is requesting maximum processor performance, the processor automatically steps up core frequency (+133.33 MHz) until it reaches the upper limit dictated by the number of active cores. When temperature, power, or current exceeds factory-configured limits and the processor is running above the base operating frequency, the processor automatically steps down core frequency (-133.33 MHz) in order to reduce temperature, power, and current. The processor then monitors temperature, power, and current, and continuously re-evaluates.

Note: When Intel Turbo Boost Technology is requested by the OS, the processor will commonly operate between the maximum Intel Turbo Boost Technology frequency and the base operating frequency. All active cores in the processor operate at the same frequency and voltage, even at frequencies above the base operating frequency. Due to the way the system firmware and OS communicate, the software may never detect core clock frequencies above the operating frequency.⁶

2. Intel® Hyper-Threading Technology
Many server and workstation applications lend themselves to parallel, multi-threaded execution. Intel Hyper-Threading Technology enables simultaneous multi-threading within each processor core, up to two threads per core or eight threads per quad-core processor (see Figure 3, next page). Hyper-Threading reduces computational latency, making optimal use of every clock cycle. For example, while one thread is waiting for a result or event, another thread is executing in that core to maximize the work from each clock cycle.

An Intel® processor and chipset combined with an operating system and system firmware supporting Intel Hyper-Threading Technology enables:

• Running demanding applications simultaneously while maintaining system responsiveness
• Running multi-threaded applications faster to maximize productivity and performance
• Increasing the number of transactions that can be processed simultaneously
• Providing headroom for new solution capabilities and future needs
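
The Turbo Boost closed-loop control described in section 1 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Intel's implementation: the function names (`is_active`, `max_turbo_steps`, `next_steps`) are hypothetical, the two-step and one-step limits are only the example figures quoted in the text, and the real processor evaluates temperature, power, and current in hardware. The state is tracked as a whole number of 133.33 MHz steps above the base frequency.

```python
STEP_MHZ = 133.33  # one frequency step, as quoted in the text


def is_active(c_state: str) -> bool:
    """A core counts as active in C0 or C1; C3 and C6 count as inactive."""
    return c_state in ("C0", "C1")


def max_turbo_steps(active_cores: int) -> int:
    """Example limits from the text: two steps with one active core,
    one step with two or more. Actual limits vary by processor number."""
    return 2 if active_cores <= 1 else 1


def next_steps(steps: int, c_states: list, os_requests_max: bool, within_limits: bool) -> int:
    """One iteration of the control loop; returns the new number of steps above base."""
    ceiling = max_turbo_steps(sum(is_active(s) for s in c_states))
    if not within_limits and steps > 0:
        # Temperature, power, or current exceeded a limit: step down.
        return min(steps - 1, ceiling)
    if within_limits and os_requests_max and steps < ceiling:
        # Headroom is available and the OS wants maximum performance: step up.
        return steps + 1
    # Otherwise hold, but never stay above the ceiling for the active-core count.
    return min(steps, ceiling)


# Usage: one active core, three idle cores, all measurements within limits.
steps = 0
for _ in range(3):
    steps = next_steps(steps, ["C0", "C6", "C6", "C6"], True, True)
print(steps)  # 2 steps above base
```

The frequency at any point is the base frequency plus `steps * STEP_MHZ`. Waking a second core lowers the ceiling to one step, so the next iteration drops the frequency even when all measurements are within limits.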

When Intel Hyper-Threading Technology and Intel Turbo Boost Technology are combined in processors based on Intel microarchitecture (Nehalem), these intelligent processors deliver better performance by dynamically adapting to the workload. They automatically take advantage of available headroom to increase processor frequency and maximize clock cycles on active cores to get the tasks done quicker.

Figure 3. Intel® Hyper-Threading Technology enables simultaneous multi-threading of eight threads per quad-core processor.

3. Other Key Performance Improvement Features
Intel microarchitecture (Nehalem) includes significant core enhancements to further improve the performance of the individual processor cores. Below we describe some of these enhancements.

Intel® Smart Cache Enhancements
Intel microarchitecture (Nehalem) enhances the Intel® Smart Cache by adding an inclusive shared L3 cache that can be up to eight megabytes (MB) in size. In addition to being shared across all cores, the inclusive shared L3 cache can increase performance while reducing traffic to the processor cores. Some architectures use an exclusive L3 cache, which contains data not stored in other caches. Thus, if a data request misses on the L3 cache, each processor core must still be searched (or snooped) in case its individual caches contain the requested data. This can increase latency and snoop traffic between cores. With Intel microarchitecture (Nehalem), a miss in the inclusive shared L3 cache guarantees the data is outside the processor, so the cache is designed to eliminate unnecessary core snoops, reducing latency and improving performance.

The new three-level cache hierarchy for Intel microarchitecture (Nehalem) consists of:

• Same L1 cache as Intel Core microarchitecture (32 KB Instruction Cache, 32 KB Data Cache)
• New L2 cache per core for very low latency (256 KB per core for handling data and instructions)
• New fully inclusive, fully shared 8 MB L3 cache (all applications can use the entire cache)

A new two-level Translation Lookaside Buffer (TLB) hierarchy is also included in Intel microarchitecture (Nehalem). A TLB is a processor cache that is used by memory management hardware to improve the speed of virtual address translation. The TLB references physical memory addresses in its table. All current desktop and server processors use a TLB, but Intel microarchitecture (Nehalem) adds a new second-level, 512-entry TLB to further improve performance.

Instructions per Cycle Improvements
The more instructions that can be run each clock cycle, the greater the performance. In many cases, by running more instructions in any given clock cycle, the work task can complete sooner, enabling the processor to more quickly return to a lower power state. To run more instructions per cycle, Intel made several key innovations.

• Greater Parallelism. One way to extract more parallelism out of software code is to increase the number of instructions that can be run “out of order.” This enables more simultaneous processing. To be able to identify more independent operations that can be run in parallel, Intel increased the size of the out-of-order window and scheduler, giving them a wider window from which to look for these operations. Intel also increased the size of the other buffers in the core to ensure they wouldn’t become a limiting factor.

• More Efficient Algorithms. With each new microarchitecture, Intel has included improved algorithms in places where previous processor generations saw lost performance due to stalls (dead cycles). Intel microarchitecture (Nehalem) brings many such improved algorithms to increase performance. These include:

– Faster Synchronization Primitives: As multi-threaded software becomes more prevalent, the need to synchronize threads is also becoming more common. Intel microarchitecture (Nehalem) speeds up the common legacy synchronization primitives (such as instructions with a LOCK prefix or the XCHG instruction) so that existing threaded software will see a performance boost.

– Faster Handling of Branch Mispredictions: A common way to increase performance is through the prediction of branches. Intel microarchitecture (Nehalem) optimizes the cases where the predictions are wrong, so that the effective penalty of branch mispredictions overall is lower than on prior processors.
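
The difference between an inclusive and an exclusive L3 cache on a miss, described under Intel® Smart Cache Enhancements, can be illustrated with a toy model. This is a sketch under simplifying assumptions, not a description of the actual hardware: caches are plain Python sets of addresses, the function name `lookup` is hypothetical, and coherence states and every other detail of a real cache are ignored.

```python
def lookup(address, l3, core_caches, inclusive):
    """Return (where the request is served from, number of cores snooped).

    l3          -- set of addresses held in the shared L3 cache
    core_caches -- list of sets, one per core (private L1/L2 contents)
    inclusive   -- True if the L3 also holds everything in the core caches
    """
    if address in l3:
        return "L3", 0
    if inclusive:
        # An inclusive L3 contains every line the cores hold, so a miss
        # proves no core has the data: go straight to memory.
        return "memory", 0
    # Exclusive L3: the line may still live in a private cache,
    # so each core has to be snooped in turn.
    snoops = 0
    for cache in core_caches:
        snoops += 1
        if address in cache:
            return "core cache", snoops
    return "memory", snoops


cores = [{1}, {2}, {3}, {4}]
print(lookup(9, {1, 2, 3, 4}, cores, inclusive=True))  # ('memory', 0)
print(lookup(9, {5}, cores, inclusive=False))          # ('memory', 4)
```

In this model the same miss costs four snoops with the exclusive cache and none with the inclusive one, which is the traffic reduction the text describes.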


– Improved Hardware Prefetch and Better Load-Store Scheduling: Intel microarchitecture (Nehalem) continues the many advances Intel made with the Intel Core microarchitecture (Penryn) family of processors in reducing memory access latencies through prefetch and load-store scheduling improvements.

Enhanced Branch Prediction
Branch prediction attempts to guess whether a conditional branch will be taken or not. Branch predictors are crucial in today’s processors for achieving high performance. They allow processors to fetch and execute instructions without waiting for a branch to be resolved. Processors also use branch target prediction to attempt to guess the target of the branch or unconditional jump before it is computed by parsing the instruction itself. In addition to greater performance, a further benefit of increased branch prediction accuracy is that it can enable the processor to consume less energy by spending less time executing mispredicted branch paths.

Intel microarchitecture (Nehalem) uses several innovations to reduce branch mispredicts that can hinder performance and to improve the handling of branch mispredicts.

• New Second-Level Branch Target Buffer (BTB). To improve branch predictions in applications that have large code footprints (e.g., database applications), Intel added a second-level branch target buffer. BTBs reduce the performance penalty of branches in pipelined processors by predicting the path of the branch and caching information used by the branch.

• New Renamed Return Stack Buffer (RSB). RSBs store forward and return pointers associated with call and return instructions. Intel’s new renamed RSB helps avoid many common return instruction mispredictions.

New Application Targeted Accelerators and Intel® SSE4
Intel microarchitecture (Nehalem) includes all the additional Intel SSE4 instructions Intel included in Intel Core microarchitecture (Penryn) for faster computation/manipulation of media (graphics, video encoding and processing, 3-D imaging, and gaming). In addition, Intel microarchitecture (Nehalem) adds seven new Application Targeted Accelerators for more efficient accelerated string and text processing in applications like XML.

Application Targeted Accelerators extend the capabilities of Intel® architecture by adding performance-optimized, low-latency, lower power fixed-function accelerators on the processor die to benefit specific applications. Such accelerators are the start of a natural evolution where gradually more and more advantageous implementations of fixed-function capabilities will be developed and added to the processor. Just as the evolution of silicon technology from 65nm to 45nm to 32nm enables more transistors for additional cores and cache, so too will it enable more of these fixed-function on-die implementations. The benefit will be greater performance and superior energy efficiency for these specific applications.

The seven Application Targeted Accelerators included in Intel microarchitecture (Nehalem) provide new string and text processing instructions to improve performance of string and text processing operations. For example, they enable parsing of XML strings and text at a much higher speed. These Application Targeted Accelerators will be useful for lexing, tokenizing, regular expression evaluation, virus scanning, and intrusion detection.

Improved Virtualization Performance
Virtualization partitions a computer so that it can run separate operating systems and software in each partition, allowing one computer to act as many. Virtualization enables computers, particularly servers, to better leverage multi-core processing power and increase efficiency. Intel microarchitecture (Nehalem) adds new features that enable software to further improve its performance in virtualized environments. For example, Intel microarchitecture (Nehalem) includes an Extended Page Table (EPT) for reconciling memory type specification in a guest operating system with memory type specification in the host operating system in virtualization systems that support memory type specification.

4. New System Architecture: Intel® QuickPath Technology
To deliver top performance for bandwidth-intensive applications, the Intel Xeon processor 3500 and 5500 series feature Intel® QuickPath Technology (see Figure 4, next page). As mentioned earlier in this paper, this new scalable, shared memory architecture delivers memory bandwidth leadership at up to 3.5 times the bandwidth of previous-generation processors.

Intel QuickPath Technology is a platform architecture that provides high-speed (up to 25.6 GB/s), point-to-point connections between processors, and between processors and the I/O hub. Each processor has its own dedicated memory that it accesses directly through an Integrated Memory Controller. In cases where a processor needs to access the dedicated memory of another processor, it can do so through a high-speed Intel® QuickPath Interconnect that links all the processors.

Intel microarchitecture (Nehalem) complements the benefits of Intel QuickPath Interconnect by enhancing Intel Smart Cache with an inclusive shared L3 cache that boosts performance while reducing traffic to the processor cores.

Intel® QuickPath Interconnect Performance
Intel QuickPath Interconnect’s throughput clearly demonstrates its best-in-class interconnect performance in the server/workstation market segment.

• Intel QuickPath Interconnect uses links of up to 6.4 Gigatransfers/second, delivering up to 25.6 Gigabytes/second (GB/s) of total bandwidth. That’s up to 300 percent greater than any other interconnect solution used previously. (A gigatransfer is one billion data transfers or operations.)

• Intel QuickPath Interconnect’s superior architecture reduces the amount of communication required in the interface of multi-processor systems to deliver faster payloads.

• Intel QuickPath Interconnect Implicit Cyclic Redundancy Check (CRC) with link-level retry ensures data quality and performance by providing CRC without the performance penalty of additional cycles.
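
The relationship between the 6.4 Gigatransfers/second link rate and the 25.6 GB/s total bandwidth quoted for Intel QuickPath Interconnect can be checked with a short calculation. The white paper does not spell out the arithmetic; the sketch below assumes that each transfer carries 2 bytes of data in each direction and that a link runs in both directions at once. The function name `qpi_bandwidth_gb_s` is illustrative.

```python
def qpi_bandwidth_gb_s(gigatransfers_per_s: float,
                       bytes_per_transfer: int = 2,  # assumption: 16 data bits per transfer
                       directions: int = 2) -> float:  # assumption: both directions at once
    """Total link bandwidth in GB/s for a given transfer rate."""
    return gigatransfers_per_s * bytes_per_transfer * directions


print(qpi_bandwidth_gb_s(6.4))                # 25.6 (both directions combined)
print(qpi_bandwidth_gb_s(6.4, directions=1))  # 12.8 (one direction only)
```

Under these assumptions the quoted total is the sum of both directions, so traffic flowing one way sees half of it.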

Figure 4. Intel® QuickPath Technology provides dedicated per-processor memory and point-to-point connectivity.
[Diagram “Intel® QuickPath Technology”: two processors, each with its own memory controller and memory, connected to each other and to an I/O controller.]

5. Intel® Intelligent Power Technology
Intel Intelligent Power Technology is an innovation that monitors power consumption in servers to identify those that are not being fully utilized. It has two main features:

• Integrated Power Gates (see Figure 5) allow individual idling cores to be reduced to near-zero power independent of other operating cores, reducing idle power consumption to 10 watts, versus 16 or 50 watts in prior generations of Intel quad-core processors.⁷

• Automated Low-Power States automatically put processor and memory into the lowest available power states that will meet the requirement of the current workload (see Figure 6). Because processors are enhanced with more and lower CPU power states, and the memory and I/O controllers have new power management features, the degree to which power can be minimized is now greatly enhanced.

Figure 5. Integrated Power Gates enable idle cores to go to near-zero power independently.
[Diagram “Automatic Operation or Manual Core Control”: cores 0 to 3 on one voltage supply (cores); memory, system, cache, and I/O on a separate voltage supply (rest of processor).]

Figure 6. Automated Low-Power States adjust system power consumption based on real-time load.
[Diagram “Automated Low-Power States”: enhanced power management in the processors; new power management in the memory controllers, memory, and I/O controller.]

6. Intel® Virtualization Technology
Next-generation Intel Virtualization Technology (Intel VT) enhances virtualization performance with new hardware-assist capabilities across all elements of your server and workstation.

Processor: Improvements to Intel® Virtualization Technology for IA-32, Intel® 64 and Intel® Architecture (Intel® VT-x) provide hardware-assisted page-table management, allowing the guest OS more direct access to the hardware and reducing compute-intensive software translation from the virtual machine monitor (VMM). Intel VT-x also includes Intel® Virtualization Technology FlexMigration and Intel® Virtualization Technology FlexPriority, which are capabilities for flexible workload migration and performance optimization across the full range of 32-bit and 64-bit operating environments.

Chipset: Intel® Virtualization Technology for Directed I/O (Intel® VT-d) helps speed data movement and eliminates much of the performance overhead by giving designated virtual machines their own dedicated I/O devices, thus reducing the overhead of the VMM in managing I/O traffic.

Network Adapter: Intel® Virtualization for Connectivity (Intel® VT-c) further enhances server I/O solutions by integrating extensive hardware assists into the I/O devices that are used to connect servers to the data center network, storage infrastructure, and other external devices. By performing routing functions to and from virtual machines in dedicated network silicon, Intel VT-c speeds delivery and reduces the load on the VMM and server processors, providing up to two times the throughput of non-hardware-assisted devices.⁸
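
The idea behind the Automated Low-Power States described in section 5, choosing the lowest-power state that still meets the needs of the current workload, can be sketched as a simple selection rule. Everything numeric here is hypothetical: the relative power figures and wake-up latencies in `STATES` are placeholders chosen for illustration, not measurements of any Intel processor, and `pick_state` is an illustrative name rather than a real interface.

```python
# (state name, relative power, wake-up latency in microseconds)
# All numbers are illustrative placeholders, not Intel specifications.
STATES = [
    ("C0", 1.00, 0),
    ("C1", 0.60, 2),
    ("C3", 0.30, 50),
    ("C6", 0.05, 200),
]


def pick_state(max_wake_latency_us: float) -> str:
    """Lowest-power state whose wake-up latency the workload can tolerate."""
    allowed = [s for s in STATES if s[2] <= max_wake_latency_us]
    return min(allowed, key=lambda s: s[1])[0]


print(pick_state(0))     # C0: the workload cannot tolerate any wake-up delay
print(pick_state(100))   # C3: deepest state that wakes within 100 microseconds
print(pick_state(1000))  # C6: a near-idle system can use the deepest state
```

The deeper the state, the lower the power but the longer the wake-up, so the tolerance of the workload decides how far down the list the selection can go.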


Summary
Intel microarchitecture (Nehalem) represents the next level of multi-core performance, offering the latest in processor innovation. First appearing as the Intel® Core™ i7 processor and now the foundation for the Intel Xeon processor 3500 and 5500 series, Intel microarchitecture (Nehalem) intelligently maximizes performance to match workloads. As a microarchitecture for server/workstation processors, it offers energy-efficient performance that scales energy use to performance demands while unleashing parallel processing performance. Its new, scalable, shared memory architecture integrates a memory controller into each microprocessor and connects processors and other components with a new high-speed interconnect that speeds traffic between processors and I/O controllers for bandwidth-intensive applications. Numerous virtualization technologies enable Intel microarchitecture (Nehalem) to offer best-in-class virtualization, making it the obvious choice for consolidation projects and, with its energy-efficient performance, server refreshes.

Learn More
For more information on Intel microarchitecture (Nehalem) including animations and podcasts, visit:
www.intel.com/technology/architecture-silicon/next-gen/index.htm

For more on Intel’s 45nm Hi-k metal gate process technology, see: www.intel.com/technology/45nm

For more on Intel QuickPath Technology, download the Intel QuickPath Architecture white paper at:
www.intel.com/technology/quickpath

For more on Intel Xeon processor 5500 series, download the product brief at:
www.intel.com/Assets/PDF/prodbrief/xeon-5500.pdf

To learn more about Intel Xeon processor 3500 series, visit:
www.intel.com/p/en_US/products/server/processor/xeon3000

For more on Intel SSE4, download the white paper, “Extending the World’s Most Popular Processor Architecture,” at:
download.intel.com/technology/architecture/new-instructions-paper.pdf

www.intel.com

1 Hyper-Threading Technology requires a computer system with a processor supporting Hyper-Threading Technology and an HT Technology enabled chipset, BIOS and operating system. Performance will vary depending on the specific hardware and software you use. See www.intel.com/info/hyperthreading/ for more information including details on which processors support HT Technology.
2 Intel internal measurement. (Feb 2009) Stream-Triad benchmark. Red Hat Enterprise Linux Server 5.3. Intel® Xeon® processor E5472, 3.0 GHz, 2x6 MB L2 cache, 1600 MHz system bus, 16 GB memory (8x2 GB FB DDR2-800) vs. Intel® Xeon® processor X5570, 2.93 GHz, 8 MB L3 cache, 6.4QPI, 24 GB memory (6x4 GB DDR3-1333).
3 Not applicable to Macintosh operating systems. Uses Intel® Turbo Boost Technology which requires a platform with a processor with Intel Turbo Boost Technology capability. Intel Turbo Boost Technology performance varies depending on hardware, software and overall system configuration. Check with your platform manufacturer on whether your system delivers Intel Turbo Boost Technology. For more information, see www.intel.com/technology/turboboost.
4 Intel® Virtualization Technology requires a computer system with an enabled Intel® processor, BIOS, virtual machine monitor (VMM) and, for some uses, certain platform software enabled for it. Functionality, performance or other benefits will vary depending on hardware and software configurations and may require a BIOS update. Software applications may not be compatible with all operating systems. Please check with your application vendor.
5 Performance results on VMmark benchmark. Intel® Xeon processor X5470 data based on published results. Intel® Xeon processor X5570 Intel internal measurement. (Feb 2009): HP Proliant ML370 G5 server platform with Intel Xeon processors X5470 3.33 GHz, 2x6 MB L2 cache, 1333 MHz FSB, 48 GB memory, VMware* ESX* V3.5.0 Update 3 Published at 9.15@ 7 tiles vs. Intel® Xeon® processor X5570, 2.93 GHz, 8 MB L3 cache, 6.4QPI, 72 GB memory (18x4 GB DDR3-800), VMware* ESX* Build 140815. Performance measured at 19.51@ 13 tiles.
6 For a more in-depth discussion of how Intel Turbo Boost technology works, see: http://download.intel.com/design/processor/applnots/320354.pdf
7 Depending on processor SKU.
8 Intel internal measurement. (April 2008) Ixia* IxChariot* 6.4 benchmark. VMware* ESX* v3.5U1. Intel® Xeon® processor E5355, 2.66 GHz, 8 MB L2 cache, 1333 MHz system bus, 8 GB memory (8x1 GB FB DIMM 667 MHz).
All products, platforms, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
All data is based on comparisons of engineering data sheets or measurements using actual hardware or simulators.
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE,
TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH
PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL
PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT,
COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR
INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions
marked “reserved” or “undefined.” Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to
them. The information here is subject to change without notice. Do not finalize a design with this information.
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current
characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies
of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or by visiting Intel’s Web site
at www.intel.com.
Copyright © 2009 Intel Corporation. All rights reserved. Intel, the Intel logo, Intel Core, Xeon Inside, and Xeon are trademarks of Intel Corporation in the U.S. and other countries.
*Other names and brands may be claimed as the property of others. Printed in USA 0809/EB/HBD/PDF Please Recycle 319724-002US
