Architecture Brief

SoC FPGA Main Memory Performance


Introduction
A cursory look at memory specifications does not tell the whole story of how the memory will perform
in an SoC FPGA-based system. It is important to check the measured memory performance, not just the
bus specifications, to ensure that maximum efficiency is realized, with the performance, operational,
and power consumption benefits that follow.
This Architecture Brief looks at memory performance considerations when selecting an SoC FPGA for a
design project.

Key aspects of this Architecture Brief are highlighted in an online video: “System Performance:
How smart is your memory controller?” which can be found at [Link]/socarchitecture.

Top Level Specs


When selecting an SoC FPGA, one would typically assume that the memory bus speed would dominate the
realized system memory performance (see Table 1).

Table 1: External Memory Controller Support Comparison

Function/Feature                       Altera SoC FPGA            Vendor B                   Vendor C
-------------------------------------  -------------------------  -------------------------  -------------------------
Hardened External Memory Controller    Yes                        Yes                        Yes
for Processor System
Maximum Supported Address Space        4G                         1G                         4G
Memory Types Supported                 LPDDR2, DDR2, DDR3L, DDR3  LPDDR2, DDR2, DDR3L, DDR3  LPDDR, DDR2, DDR3
Data Width Configuration Modes         x8, x16, x16+ECC,          x16, x16+ECC, x32          x8, x8+ECC, x16, x16+ECC,
                                       x32, x32+ECC                                          x32, x32+ECC
Integrated ECC Support                 16 bit, 32 bit             16 bit                     8 bit, 16 bit, 32 bit
External Memory Bus Maximum Frequency  400 MHz (Cyclone V SoC),   533 MHz                    333 MHz
                                       533 MHz (Arria V SoC)

Memory Controller Intelligence
However, other factors, such as how intelligently memory data transfers are prioritized, scheduled, and processed, can significantly
impact overall memory performance. Altera SoC FPGAs use Altera’s third-generation memory controller technology, which
includes advanced features in the areas of scheduling, bank management, command and data reordering, and more.
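
To make the effect of command reordering concrete, here is a minimal sketch of a toy DRAM model in which each bank keeps one row open and every switch to a different row costs an activation. The model, the function names, and the request list are assumptions made for illustration; they are not taken from the Altera controller, and a real controller reorders only within a bounded window and must respect ordering between reads and writes to the same address.

```python
def row_misses(requests):
    """Count row activations for a sequence of (bank, row) accesses.

    Toy model (an assumption): each bank holds exactly one open row, and
    touching a different row in that bank forces a new activation.
    """
    open_row = {}
    misses = 0
    for bank, row in requests:
        if open_row.get(bank) != row:
            misses += 1
            open_row[bank] = row
    return misses


def reorder_by_open_row(requests):
    """Group requests that hit the same bank and row.

    Groups keep the order in which their first request arrived
    (dicts preserve insertion order in Python 3.7+).
    """
    groups = {}
    for req in requests:
        groups.setdefault(req, []).append(req)
    return [req for group in groups.values() for req in group]


# Hypothetical request stream that ping-pongs between rows in two banks.
in_order = [(0, 1), (0, 2), (0, 1), (0, 2), (1, 5), (1, 6), (1, 5)]

print("activations, in order: ", row_misses(in_order))
print("activations, reordered:", row_misses(reorder_by_open_row(in_order)))
```

In this toy stream the reordered sequence needs fewer row activations than the in-order one, which is the kind of saving that lets a controller deliver more bandwidth from the same bus.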

Figure 1: Altera Memory Controller Intelligence

[Chart: memory controller features added with each controller generation]
GEN 1: Simple Scheduler, Bank Management, Active Refresh
GEN 2: Round Robin Scheduler, Command and Data Reordering, Priority Management, Power Management
GEN 3 (Altera SoCs): Deficit Weighted Round Robin Scheduling, Bank Management (+hint), User Supplied Profiles, TrustZone Security

Memory Performance Case Study: LMbench


To illustrate the impact of memory controller intelligence on system memory performance, consider two SoC FPGA devices
with different memory bus speeds (shown in Figure 2). The one on the left is the Altera Cyclone V SoC FPGA; the one on the
right is an SoC FPGA from “Vendor B”. Both have a dual-core ARM Cortex-A9 processor running at the same frequency of
667 MHz. However, the Altera device has an external memory operating at 400 MHz, while the Vendor B device uses an
external memory running at 533 MHz. Which one would you expect to have the better system memory performance? Initially,
one would expect the system with 533 MHz memory to exhibit 33% higher performance. However, factors in the memory
controller architecture produce noticeably different results.
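
The 33% expectation is simply the ratio of the two bus clocks. The sketch below reproduces that arithmetic and also estimates a theoretical DDR peak bandwidth. The 32-bit data width is an assumption for illustration only; the brief does not state which width the benchmarked boards used.

```python
ALTERA_BUS_MHZ = 400    # Cyclone V SoC external memory clock (from the text)
VENDOR_B_BUS_MHZ = 533  # Vendor B external memory clock (from the text)


def expected_gain_percent(fast_mhz, slow_mhz):
    """Percent advantage predicted if bandwidth scaled linearly with bus clock."""
    return (fast_mhz / slow_mhz - 1.0) * 100.0


def peak_bandwidth_mb_s(bus_mhz, bus_width_bits=32):
    """Theoretical DDR peak in MB/s: two transfers per clock times bytes per transfer.

    bus_width_bits=32 is an assumed value, not a figure from the brief.
    """
    return bus_mhz * 2 * (bus_width_bits // 8)


gain = expected_gain_percent(VENDOR_B_BUS_MHZ, ALTERA_BUS_MHZ)
print(f"Naive expected gain for Vendor B: {gain:.1f}%")
print(f"Assumed x32 peak at 400 MHz: {peak_bandwidth_mb_s(ALTERA_BUS_MHZ)} MB/s")
print(f"Assumed x32 peak at 533 MHz: {peak_bandwidth_mb_s(VENDOR_B_BUS_MHZ)} MB/s")
```

These are ceilings set by the bus alone; measured bandwidth depends on how much of that ceiling the controller actually sustains.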

Figure 2: SoC FPGA Memory Performance Comparison

[Block diagram of the two systems compared]
A, Altera SoC FPGA: CPU (667 MHz), Hard Memory Controller, DDR Memory (400 MHz), FPGA Logic
B, Vendor B SoC FPGA: CPU (667 MHz), Hard Memory Controller, DDR Memory (533 MHz), FPGA Logic


LMbench ([Link]/lmbench), an industry-standard benchmark well known for exercising memory system performance, helps to
quantify and compare the results. LMbench (ver. 3) consists of several different read/write test cases. The results for the
partial read/write case are shown in Figure 3, as this case is most indicative of transfers in a typical embedded system.

Figure 3: LMbench Partial Read/Write Memory Bandwidth Test Demonstrates Benefits of Advanced Controller

[Line chart. Vertical axis: Memory Bandwidth (MB/s), 0 to 5,000; higher is better. Horizontal axis: Transfer Size (bytes),
512 to 64 M. Series: Altera SoC FPGA (CPU: 667 MHz) and Vendor B (CPU: 667 MHz).]

With its more sophisticated memory controller, a 400-MHz DDR3 memory interface on an Altera SoC FPGA outperforms a
533-MHz DDR3 memory interface on a competing device.

The vertical axis shows the memory bandwidth and the horizontal axis shows the data transfer size. (Higher is better for the
memory bandwidth.) The curve can be grouped into three stages as the data size moves from the L1 cache (32KB data + 32KB
instruction) to the L2 cache (512KB shared) to external memory. Note that the Altera SoC FPGA significantly outperforms the
Vendor B SoC FPGA in the L1 and L2 cache regions. As discussed earlier, one would expect that by the time the transfers reach
the external memory (>512 KB on the curve), the Vendor B solution would outperform the Altera SoC FPGA because of the
533 MHz external bus on SoC FPGA B vs. the 400 MHz memory bus of the Altera Cyclone V SoC FPGA. However, this is not
the case: the Altera SoC FPGA exhibits comparable or better performance, even when accessing main memory at >1MB data
transfer size. These results are due to the L1/L2 cache structure and external memory controller intelligence of the Altera SoC FPGA.
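
The three stages of the curve follow from the cache sizes quoted above. The sketch below maps a transfer size to the level of the memory hierarchy expected to serve it. The hard boundaries are a simplification (a working set exactly equal to a cache size rarely fits cleanly), and the function name is hypothetical.

```python
L1_DATA_CACHE_BYTES = 32 * 1024   # 32KB L1 data cache (from the text)
L2_CACHE_BYTES = 512 * 1024       # 512KB shared L2 cache (from the text)


def memory_region(transfer_bytes):
    """Return the hierarchy level expected to dominate a transfer of this size.

    Simplified model: anything that fits in a cache is assumed to be served
    from it; anything larger is assumed to reach external memory.
    """
    if transfer_bytes <= L1_DATA_CACHE_BYTES:
        return "L1 cache"
    if transfer_bytes <= L2_CACHE_BYTES:
        return "L2 cache"
    return "external memory"


# Transfer sizes on the benchmark's horizontal axis: 512 bytes, doubling to 64 M.
sweep = [512 << i for i in range(18)]

for size in sweep:
    print(f"{size:>10} bytes -> {memory_region(size)}")
```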
Grouping the data into small (512 bytes to 16 KB), medium (16 KB to 1 MB), and large (2 MB and above) data transfer sizes,
as shown in Figure 4, provides a numerical analysis for the three different regions of the curve.

Figure 4: LMbench Memory Bandwidth Difference Grouped by Data Transfer Size

[Bar chart: Ratio of Memory Bandwidth for Altera SoC FPGA vs. SoC FPGA B. Vertical axis: Memory Bandwidth Increase.
Horizontal axis: Data Transfer Size (Bytes). Benchmark: LMbench. Access: Partial Memory Read Write.]

Data Transfer Size       Memory Bandwidth Increase
512 Bytes - 16 KB        17.03%
16 KB - 1 MB             6.28%
2 MB - 67 MB             6.60%

                         Altera SoC FPGA   SoC FPGA B
CPU Frequency            667 MHz           667 MHz
Memory Device Frequency  400 MHz           533 MHz

Across the range of small, medium, and large memory accesses, the Altera SoC FPGA, with a more effective cache structure
and more advanced memory controller, extracts up to 17% more memory bandwidth despite a slower external memory bus
operating frequency.
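
One way to read the large-transfer result is as bandwidth delivered per bus clock. The sketch below does that arithmetic, assuming the 6.60% figure in Figure 4 belongs to the 2 MB - 67 MB group and that both boards use the same memory data width (the brief does not say). It applies only to the external memory region, since the small and medium groups are dominated by the caches.

```python
def per_clock_advantage(bandwidth_gain_percent, own_bus_mhz, other_bus_mhz):
    """Ratio of bandwidth per bus MHz for one device relative to another.

    bandwidth_gain_percent is how much more bandwidth the first device
    measured, even though it runs at own_bus_mhz instead of other_bus_mhz.
    """
    bandwidth_ratio = 1.0 + bandwidth_gain_percent / 100.0
    return bandwidth_ratio * other_bus_mhz / own_bus_mhz


# 6.60% more bandwidth at 400 MHz versus 533 MHz (large transfers, Figure 4).
ratio = per_clock_advantage(6.60, own_bus_mhz=400, other_bus_mhz=533)
print(f"Bandwidth per bus MHz, Altera vs. Vendor B: {ratio:.2f}x")
```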
These results demonstrate that when comparing SoC FPGAs, it is important to check the measured memory system
performance, not just the memory bus specifications. Memory controller algorithms extract maximum bandwidth by managing
transaction priority, reordering commands and data, and scheduling pending transactions using, for example, deficit weighted
round robin algorithms. Additional performance can be achieved by customizing the memory controller in software for the
system’s data profile: setting priorities, assigning ports or transaction channels, and even sharing the bandwidth between them.
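
As a rough illustration of deficit weighted round robin scheduling, here is a minimal sketch of the textbook algorithm. It is not the Altera hardware implementation, and the port names, transaction sizes, and quantum values are hypothetical. Each port earns a quantum of credit per round and may issue transactions while its accumulated credit covers the next one, so a port with a larger quantum receives a proportionally larger share of bandwidth.

```python
from collections import deque


def dwrr_schedule(queues, quanta):
    """Serve pending transactions using deficit weighted round robin.

    queues: dict mapping port name -> deque of transaction sizes in bytes
            (the deques are consumed as transactions are served).
    quanta: dict mapping port name -> credit in bytes added per round (> 0).
    Returns the list of (port, size) pairs in the order they were served.
    """
    deficits = {port: 0 for port in queues}
    order = []
    while any(queues.values()):
        for port, pending in queues.items():
            if not pending:
                deficits[port] = 0  # idle ports do not bank credit
                continue
            deficits[port] += quanta[port]
            # Issue transactions while the accumulated credit covers the head.
            while pending and pending[0] <= deficits[port]:
                size = pending.popleft()
                deficits[port] -= size
                order.append((port, size))
            if not pending:
                deficits[port] = 0
    return order


# Hypothetical setup: the CPU port gets twice the weight of the FPGA port.
queues = {"cpu": deque([64] * 4), "fpga": deque([64] * 4)}
quanta = {"cpu": 128, "fpga": 64}

served = dwrr_schedule(queues, quanta)
print([port for port, _ in served])
```

While both ports have work pending, the CPU port is served two transactions for every one from the FPGA port; once the CPU queue drains, the FPGA port uses the remaining rounds.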

Conclusion
Main memory selection is another example of where architecture matters. Memory controllers today can use sophisticated
algorithms to maximize system memory efficiency. A superior memory controller can extract more bandwidth from system
memory, enabling the memory to run at a lower frequency for the same throughput, thus saving system power and benefiting
the whole system design.

Want to Learn More?


For a more in-depth explanation of the Altera SoC FPGA
architecture and LMbench performance results, tune in to
the EE Journal Chalk Talk entitled: Architecture Matters:
Three Architectural Insights for SoC FPGAs.
For more details on the Altera Cyclone V SoC FPGA memory
controller architecture and settings, consult the SDRAM
Controller Section of the Cyclone V Device Handbook, Vol. 3
Hard Processor System Technical Reference Manual.

Altera Corporation
101 Innovation Drive
San Jose, CA 95134
USA
[Link]

Altera European Headquarters
Holmers Farm Way
High Wycombe
Buckinghamshire
HP12 4XF
United Kingdom
Telephone: (44) 1494 602000

Altera Japan Ltd.
Shinjuku i-Land Tower 32F
6-5-1, Nishi-Shinjuku
Shinjuku-ku, Tokyo 163-1332
Japan
Telephone: (81) 3 3340 9480
[Link]

Altera International Ltd.
Unit 11- 18, 9/F
Millennium City 1, Tower 1
388 Kwun Tong Road
Kwun Tong
Kowloon, Hong Kong
Telephone: (852) 2 945 7000
[Link]

Copyright © 2014 Altera Corporation. All rights reserved. Altera, the stylized Altera logo, specific device designations, and all other words and logos that are identified as trademarks and/or service
marks are, unless noted otherwise, the trademarks and service marks of Altera Corporation in the U.S. and other countries. All other product or service names are the property of their respective
holders. Altera products are protected under numerous U.S. and foreign patents and pending applications, mask work rights, and copyrights. Altera warrants performance of its semiconductor
products to current specifications in accordance with Altera’s standard warranty, but reserves the right to make changes to any products and services at any time without notice. Altera assumes no
responsibility or liability arising out of the application or use of any information, product, or service described herein except as expressly agreed to in writing by Altera. Altera customers are advised
to obtain the latest version of device specifications before relying on any published information and before placing orders for products or services. November 2014 SS-01243
