SW Dev WP
Benefits
• Easy-to-use Xtensa Xplorer IDE based on familiar Eclipse platform
• Small, high-performance code from ‘C’ source
– Compiler offers state-of-the-art inter-procedural and alias analysis
– Automatic vectorization of operations for Xtensa SIMD processors
– Automatic Flexible Length Instruction eXtension (FLIX) instruction bundling for multi-issue Xtensa very long instruction word (VLIW) cores
• Detailed pipeline analysis guides optimizations from cycle/pipeline-accurate ISS
• Fast TurboXim simulation for up to 50 million instructions per second

[Figure 1 labels: Compile (XCC); Profile (ISS); Simulate/Debug (ISS, XTMP, XTSC)]
Figure 1: Tensilica’s Eclipse-based Xtensa Xplorer IDE serves as the cockpit for custom processor development.

The Xtensa Processor Developer’s Toolkit is the integrated design environment that delivers powerful tools to your desktop to guide you through the processor customization process. You’ll find that Tensilica has created the most advanced, powerful, and easy-to-use tools for processor customization.
Figure 3. The editor includes many useful functions to speed up code generation and debugging
www.cadence.com 2
Tensilica Software Development Toolkit (SDK)
[Figure panels: “Original” and “Ported”]
Interprocedural analysis is an optimization method that looks globally across all associated files of an application at link time. Global optimization is a much more powerful method than optimizing locally within an expression or procedure. Interprocedural analysis examines relationships across function calls, and can perform optimizations that cannot be achieved with a local scope. Interprocedural analysis eliminates unneeded computations, improves function inlining, and performs alias analyses that may not be performed by less sophisticated optimization techniques.

The Xtensa C/C++ compiler supports operator overloading on custom data types in the ‘C’ programming language (without the overhead that is often associated with it).

Tensilica is well known for its ability to let designers add custom instructions and data types to improve performance. If an application needs to work on 56-bit data, a designer can define a custom 56-bit data type with a single line of code. The designer can also specify what regular ‘C’ operators, such as ‘+’ and ‘*’, should do when using this data type. The overloading is always done with zero overhead so the resulting binaries are always efficient.

Porting and creating ‘C’ application code that uses custom data types is easier because standard ‘C’ operator syntax can be used. This makes the code easier to read and simpler to port via changes in the ‘C’ header files rather than throughout the source code itself. See Figure 4.

The rest of the software development toolchain is based on standard GNU tools. The compiler front-end remains similar to the preprocessor in the GNU tools, and the flags for the preprocessor remain the same. The assembler and linker also utilize the same flags as the GNU versions of the tools.

Xtensa debugger

The debugger allows you to target either the pipeline-/cycle-accurate ISS or TurboXim when no hardware is available, or external probes to connect with hardware development boards. As shown in Figure 5, the GUI-based debugger allows full system visibility into your project; it controls program execution and provides views to variables, breakpoints, memory, registers, etc. Source and assembly code can be made visible simultaneously while debugging an application, and either code window can be single stepped. The debugger interoperates seamlessly with the other development tools (compiler toolchain, ISS) to allow rapid code development for Xtensa processor systems.

Cores in multi-processor subsystems can be debugged and stepped synchronously or asynchronously with the other cores. With user-defined data formatting, any data value can be re-formatted to display a more user-friendly representation. This is particularly effective when dealing with non-native ‘C’ types such as fixed point or vector data or when certain bits represent status. This data can be displayed in the Xtensa Xplorer IDE however you want using familiar print formatting. Datatypes that are defined by Tensilica in its DSP engines have default formatting that will show the user-friendly representations automatically. See Figure 6 for an example.
Figure 5. The Xtensa debugger allows full visibility into the system
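
To make the idea of a "user-friendly representation" for fixed-point data concrete, the following is a minimal sketch in plain C of how a raw Q15 value might be rendered with familiar print formatting. It is not the Xtensa Xplorer formatting mechanism; the names `q15_t`, `q15_to_double`, and `q15_format` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Q15: signed 16-bit fixed point with 15 fractional bits, range [-1.0, 1.0).
 * Hypothetical type name used only for this illustration. */
typedef int16_t q15_t;

/* Convert a raw Q15 bit pattern to the real value it represents. */
double q15_to_double(q15_t raw)
{
    return (double)raw / 32768.0;
}

/* Write the raw hex pattern followed by the decimal value into buf,
 * for example "0x4000 (0.50000)". Returns the result of snprintf. */
int q15_format(q15_t raw, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "0x%04X (%.5f)",
                    (unsigned)(uint16_t)raw, q15_to_double(raw));
}
```

Calling `q15_format((q15_t)0x4000, buf, sizeof buf)` on a sufficiently large buffer produces the text `0x4000 (0.50000)`, which is easier to read during debugging than the raw integer 16384.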
Figure 7. The profiling window allows performance metric analysis while optimizing code “hot spots”
Profiling tools

Code profiling is an extremely important tool for optimizing the performance of your application code. The Xtensa Xplorer IDE enables you to view profiling results generated by Tensilica’s pipeline-accurate ISS (see Figure 7). Additionally, for much faster and more accurate profiling, you can generate profiling data from hardware instantiated in an FPGA or ASIC. You can track performance data such as instruction execution count, subroutine calls, subroutine total cycles, cache performance, etc. While viewing functions in the profiling view, you can also simultaneously view the assembly code in the disassembly view and the source code in the editor. The call graph view enables you to view the entire application hierarchy’s caller and callee functions. For those inner loop optimizations, the graphical pipeline view (Figure 8) shows any pipeline inefficiencies and bubbles that may be occurring.

Figure 8. The pipeline viewer helps you understand instruction stalls and latency issues

Profiling of multi-processor subsystems shows each core side by side for easy load assessment and re-partitioning guidance. See Figure 9.

[Chart: y-axis “Cycles” (5,000 to 15,000); series: ICache Miss Cycles, DCache Miss Cycles, Uncached Instruction Fetch, Uncached Load Cycles, Interlock Cycles, Branch Delay Cycles, Total Cycles]

Vectorization Assistant

Vectorization is the process of transforming the flow of your code (from the usual handling of one data item at a time) into a parallel loop that operates on multiple data items at once. The Xtensa compiler is capable of performing this transformation automatically, but you can help it exploit implicit parallelism in your code by eliminating certain patterns of data access that prevent successful vectorization.

Figure 10 shows how the Vectorization Assistant finds and displays loops in your code that could be “vectorized” by the compiler if the source were tweaked. Locating areas in the code that have not been vectorized, but could be, can take a long time spent looking at profiles, assembler, and pipeline views, and you then still have to rework the source so that it vectorizes. In a few clicks, the Vectorization Assistant gets you to the loops in your source code that would benefit the most from vectorization.
Figure 10. Vectorization Assistant helps find areas that can be improved
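
The kind of data-access pattern that can block or enable vectorization can be sketched in standard C99. This is a generic illustration, not output from the Vectorization Assistant: the function names are hypothetical, and whether any particular compiler, including the Xtensa compiler, vectorizes the second loop depends on the target configuration and optimization settings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hard to vectorize as written: each iteration reads the element that the
 * previous iteration just wrote (a loop-carried dependence). */
void prefix_sum(int32_t *data, size_t n)
{
    assert(data != NULL);
    for (size_t i = 1; i < n; i++) {
        data[i] += data[i - 1];
    }
}

/* A better candidate: every iteration is independent of the others, and the
 * C99 restrict qualifiers promise the compiler that out, a, and b do not
 * overlap, so it does not have to assume possible aliasing. */
void scale_add(int32_t *restrict out,
               const int32_t *restrict a,
               const int32_t *restrict b,
               int32_t k, size_t n)
{
    assert(out != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] * k + b[i];
    }
}
```

In the second function, several elements can in principle be processed at once because no iteration needs the result of another. Removing dependences and possible pointer overlap is the kind of source tweak the text describes.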
[Diagram labels: Xtensa ISS Libraries, Application Code, and User Device Models feed “Compile and Link on Host”, then “Run”; XTSC system with Device A (SystemC), Device B (RTL, SystemC), Pin-Level XTSC, Producer Core, FIFO, Consumer Core, System Memory, RAM, ROM]
Figure 11. Using the ISS with XTMP or XTSC for modeling
XTMP and XTSC are integrated into the Xtensa Xplorer IDE, which automates the creation and development of multi-processor subsystem simulations. For XTMP, simulations are described in standard C code, which you can modify to allow more complex systems and additional simulator control if required. For XTSC, simulations are described in standard SystemC code. In addition, you have full visibility into all aspects of the simulation through the extensive API. Designers can use a single Xtensa Xplorer IDE to debug all simulated cores for additional visibility. The Xtensa Xplorer IDE manages all of these connections for you, for simplicity and easy viewing of any core.

Modeling of local and system memory

XTMP and XTSC allow memory modeling of both local and system memory. System memory can have programmable latencies specified for different transaction types, allowing an accurate system simulation for analyzing performance tradeoffs. Memory-mapped peripherals may be included in an XTMP/XTSC system simulation, and functions are provided to connect the processor to peripheral devices.
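
The idea of system memory with programmable latencies per transaction type can be sketched in plain C as a small cost model. This is only an illustration under stated assumptions and does not use the XTMP or XTSC API: the type names, the four transaction types, and the first-beat/next-beat latency split are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transaction types; not taken from the XTMP or XTSC API. */
typedef enum {
    TXN_READ,
    TXN_WRITE,
    TXN_BLOCK_READ,
    TXN_BLOCK_WRITE,
    TXN_TYPE_COUNT
} txn_type;

/* Programmable latencies, one entry per transaction type. */
typedef struct {
    uint32_t first_beat_cycles[TXN_TYPE_COUNT]; /* cycles until the first data beat */
    uint32_t next_beat_cycles[TXN_TYPE_COUNT];  /* cycles for each additional beat  */
} mem_latency_model;

/* Cycles charged for one transaction that transfers the given number of beats. */
uint64_t txn_cycles(const mem_latency_model *m, txn_type type, uint32_t beats)
{
    assert(m != NULL && (int)type >= 0 && type < TXN_TYPE_COUNT);
    if (beats == 0) {
        return 0;
    }
    return (uint64_t)m->first_beat_cycles[type] +
           (uint64_t)(beats - 1) * m->next_beat_cycles[type];
}

/* Total cycles for a recorded sequence of transactions. */
uint64_t trace_cycles(const mem_latency_model *m, const txn_type *types,
                      const uint32_t *beats, size_t count)
{
    uint64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += txn_cycles(m, types[i], beats[i]);
    }
    return total;
}
```

Running the same transaction trace against two differently configured models gives a rough view of how memory latency choices change total cycle counts, which is the kind of tradeoff analysis the text describes.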
Summary
The Xtensa Xplorer IDE is a complete GUI-based collection of tools
that allows the software developer to create code for systems
based on Xtensa processors. From project implementation to code
generation to analysis, the Xtensa SDK enables you to achieve
fast time-to-market while employing one of the most efficient
32-bit architectures available today. Xtensa processors lower
total system costs and help design teams construct extremely
high-performance system architectures.
Cadence Design Systems enables global electronic design innovation and plays an essential role in the
creation of today’s electronics. Customers use Cadence software, hardware, IP, and expertise to design
and verify today’s mobile, cloud, and connectivity applications. www.cadence.com
© 2014 Cadence Design Systems, Inc. All rights reserved worldwide. Cadence, the Cadence logo, Tensilica, and Xtensa are registered trademarks
and Xplorer is a trademark of Cadence Design Systems, Inc. in the United States and other countries. All other trademarks are the property of
their respective owners. 08/14 2768 SA/DM/PDF