
3DS Programming Manual: System

Version 1.6

Nintendo Confidential

This document contains confidential and proprietary information of Nintendo, and is protected under
confidentiality agreements as well as the intellectual property laws of the United States and of other
countries. No part of this document may be released, distributed, transmitted, or reproduced in any form, including by any electronic or mechanical means or within information storage and retrieval systems, without written permission from Nintendo.

©2015–2016 Nintendo. All rights reserved.


All company and product names in this document are the trademarks or registered trademarks of their respective companies.
CTR-06-0019-002-F

1. Introduction

This document provides 3DS application developers with an overview of CTR features, functions used,
and programming procedures.

2. System provides hardware block diagrams and overviews of the various features. Refer to this
chapter for an overall grasp of the whole system.

3. Memory and Nintendo 3DS Game Cards describes system memory regions and Nintendo 3DS Game
Card memory regions. Refer to this chapter to learn about memory maps and proper CTR memory
usage.

4. Software Configuration describes the various types of software and libraries, and how they are
organized. Refer to this chapter for information about differences between the hardware and application
types, and about the libraries included in the SDK.

5. Initializing Applications and Handling State Transitions describes the processes required to run an
application and to properly transition through system states such as Sleep Mode. Refer to this chapter
for explicit initialization and state transitioning procedures, and code examples.

6. Input From Input Devices describes how to use the various input devices in an application. Refer to
this chapter for more information about using key input, accelerometer, touch panel, gyro sensor,
microphone, and camera.

7. File System describes how to access media files and directories. Refer to this chapter for details on
accessing files located on a Nintendo 3DS Game Card and its backup memory, or on SD cards.
8. Time describes time-related features, such as ticks, timers, alarms, and the real-time clock (RTC).
Refer to this chapter for more information about measuring the passage of time or using the RTC.

9. Threads describes threads and classes used to synchronize multiple threads. Refer to this chapter
for more information about how to create threads and how to share resources between multiple threads.

10. Sound describes how to use the libraries needed for sound playback. Refer to this chapter for more
information about playing back sounds in your application.

11. System Settings describes how to access settings such as the user's name or the sound output
mode. Refer to this chapter for more information about retrieving such information in your application.

12. Applets describes how to use applets from within an application. Refer to this chapter to learn how
to use applets provided on the 3DS.

13. Supplemental Libraries describes how to use the libraries that provide functionality specific to the
3DS family. Refer to this chapter to learn how to access pedometer information and built-in resources.

14. Infrared Communications Between Systems describes how to use the libraries needed by the
infrared communications module. Refer to this chapter to communicate between systems using infrared
communications.

15. Differences Between TWL Mode and Actual TWL System describes the differences in operations of
a 3DS system operating in TWL emulation mode and an actual TWL system. Refer to the cautions
mentioned in this chapter when developing applications for Nintendo DS-family systems that will also
run on 3DS systems.

Note: For more information about the graphics features, see the separate 3DS Programming
Manual: Basic Graphics and 3DS Programming Manual: Advanced Graphics.


2. System

This section describes the system block diagrams and hardware block information for both SNAKE and
CTR.

2.1. System Block Diagram (SNAKE)

Figure 2-1 shows an overview of the SNAKE system.

Figure 2-1. System Block Diagram (SNAKE)


The 800 pixels for stereoscopic display mode consist of 400 pixels each for the left and right
eyes.

The items with a red background in the diagram are newly added specifications in SNAKE.
Items with an orange background have changed from CTR.

Note: The LCD screen size and other specifications differ depending on the hardware variation.
For information about the specification differences, see 3DS Overview – General.

2.2. System Block Diagram (CTR)

Figure 2-2 shows an overview of the CTR system.

Figure 2-2. System Block Diagram (CTR)


The 800 pixels for stereoscopic display mode consist of 400 pixels each for the left and right
eyes.

Note: FTR (Nintendo 2DS) has different specifications than CTR for the number of speakers and
the upper screen. For information about the specification differences, see 3DS Overview –
General.

Application developers do not need to consider the differences between CTR and FTR.

2.3. SoC

The system-on-chip (SoC) integrates the main components (CPU, GPU, DSP, and VRAM) on a single chip. The specifications of each are as follows.

2.3.1. CPU

This is an ARM11 MPCore processor with multiple cores and their related vector floating-point
(VFP) coprocessors.

Processor Core: ARM11 MPCore (SNAKE and CTR)
Operating Frequency: SNAKE: 268 MHz or 804 MHz. CTR: 268 MHz.
L1 Cache: 4-way set-associative (per core). SNAKE: Core 0-1: 16 KB for instructions, 16 KB for data; Core 2-3: 32 KB for instructions, 32 KB for data. CTR: Core 0-1: 16 KB for instructions, 16 KB for data.
L2 Cache: SNAKE: 16-way set-associative, 2 MB shared command/data, shared by all cores. CTR: None.
Endianness: Little-endian (SNAKE and CTR)

CTR has two CPU cores (Core 0 and Core 1). SNAKE has four CPU cores (Core 0 through 3). An
application can use CPU Core 0 exclusively, while the remaining cores are used by the system. The
core used exclusively by the application is called the application core and the cores used by the
system are called the system cores. Each CPU core is used as follows.

CPU Core Usage


Core 0 Application
Core 1 System (but applications can use up to 30%)
Core 2 Unused (reserved for system)
Core 3 System (controls the Super-Stable 3D feature)

Note: Although part of the system core can be used by the application, there are
disadvantages such as a negative effect on background processes.

2.3.2. GPU

This includes a graphics core developed by Digital Media Professionals Inc. (DMP).

Because graphics processing uses a framebuffer architecture, the BG and OBJ concepts used on the
Nintendo DS do not apply.

Graphics Core: PICA graphics core
Operating Frequency: 268 MHz
Architecture: Framebuffer architecture
Shader: Vertex (programmable), pixel (not programmable)
Graphics Features: OpenGL ES 1.1-based fixed pipeline + independent extensions (partially equivalent to OpenGL ES 2.0). Frequently used 3D graphics processes are built into the hardware.

2.3.3. VRAM

This includes two 3-MB video memory modules, which are read/write accessible only from the GPU.
There is no difference in the performance of the two memory modules.
This is generally used as the location for the color buffers, depth buffers (Z buffers), and stencil
buffers, and for textures and vertex buffers that are frequently used.

VRAM is independent of the main memory and cannot be written to directly by the CPU.

2.3.4. Sound DSP

The CTR system comes equipped with a digital signal processor (DSP). Unlike the DSP in the TWL,
the CTR's DSP cannot be used for other processing.

Number of Channels: 24 ch
Sampling Rate: Roughly 32,728 Hz (precisely, Fs = 16756991 * 16 / (32 * 256) = 32,728.49805… Hz)
Parameter Update Interval: Roughly 4.889 ms (precisely, T = 160 / Fs = 4.888705… ms)
Sampling Format: DSP ADPCM / PCM8 / PCM16
Resampling: Polyphase filter / linear interpolation / disabled
Number of AUX Buses: 2

2.4. Main Memory

Each system has the following memory space available.

Main Memory: SNAKE: 124 MB (development hardware has 178 MB). CTR: 64 MB (development hardware has 96 MB).

Texture and vertex buffers may also be located in main memory and referenced directly from the
GPU.

For more information about the memory map, see 3.1. Memory.

2.5. Operating Modes

The SNAKE hardware has two modes of operation: a standard mode that is equivalent to CTR, and
an extended mode that is unique to SNAKE.

The following table lists the performance differences between standard mode and extended mode.

Table 2-1. Differences Between Standard Mode and Extended Mode

Standard Mode Extended Mode

CPU Operating Frequency 268 MHz 804 MHz (3x)


CPU L2 Cache Disabled Enabled (2 MB)

Main Memory Size 64 MB 124 MB

2.6. LCD

The system has two LCD screens.

Upper screen
    Screen size: SNAKE: 3.88 inches (XL is 4.88 inches). CTR: 3.5 inches (SPR is 4.88 inches). FTR: 3.5 inches.
    Resolution: 400 × 240 pixels. 3D display: 800 × 240 pixels (with one pixel for the left eye and one for the right displayed in the same size as a single pixel in normal 2D display).
    Color depth: 8-bit RGB for approximately 16.77 million colors.
    LCD type: Semi-transmissive (anti-glare).
    Backlight: Built-in active backlight (controlled by turning power-saving mode on and off). Brightness can be adjusted while the HOME Menu is displayed.
    3D display: By setting the mode and designing an application to support it, an application can support autostereoscopic display (stereoscopic 3D without glasses). FTR: None.

Lower screen
    Screen size: SNAKE: 3.33 inches (XL is 4.18 inches). CTR: 3.0 inches (SPR is 4.18 inches). FTR: 3.0 inches.
    Resolution: 320 × 240 pixels.
    Color depth: 8-bit RGB for approximately 16.77 million colors.
    LCD type: Semi-transmissive (no anti-glare).
    Backlight: Built-in active backlight (controlled by turning power-saving mode on and off). Brightness can be adjusted while the HOME Menu is displayed.
    Touch panel: Resistive film (no multi-touch; input coordinates can be obtained in LCD pixels).

Warning: Stereoscopic display does not work when the system is held vertically (with the longer
side of the LCD screens vertical).

Note: For more information about stereoscopic display, see the 3DS Programming Manual:
Advanced Graphics.

2.7. Memory Devices


2.7.1. Game Card Slot

The Game Card slot accepts the following Game Cards.

SNAKE: 3DS Only: ○. DS Only / DSi-Compatible / DSi Only: ○.
CTR: 3DS Only: ○. DS Only / DSi-Compatible / DSi Only: ○.

Note: The system does not have a Game Pak (GBA) slot.

2.7.1.1. Nintendo 3DS Game Cards

Nintendo 3DS Game Cards can be accessed at higher speeds than TWL/NITRO Cards, and
security features have also been revamped.

Two types of cards are available: CARD1 and CARD2. Current specifications support a CARD1
ROM capacity of up to 4 GB (32 gigabits). A backup memory capacity of either 128 KB (1
megabit) or 512 KB (4 megabits) is supported. CARD2 provides a total ROM and backup memory
capacity of up to 2 GB (16 gigabits).

The system uses some of the capacity of both the ROM and backup memory. For more
information, see 3.2. Nintendo 3DS Game Cards and 7.3. Save Data.

2.7.2. System NAND Memory

Each system has the following amount of internal NAND memory. System NAND memory is used for
saving data such as preinstalled application data.

SNAKE CTR

Capacity 1.3 GB 1 GB

2.7.3. SD/microSD Card Slot

Each system includes a microSD or SD card slot that supports the following media types.

The system provides a means of accessing some files using dedicated libraries (for example,
SoundDB and ImageDB). Applications (specifically, downloadable applications other than Nintendo
DSiWare) can be started directly from an SD card.

Supported Media: SNAKE: microSD Memory Card, microSDHC Memory Card (microSDXC Memory Cards are not supported). CTR: SD Memory Card, SDHC Memory Card (SDXC Memory Cards are not supported).
2.8. Input Devices

2.8.1. Key Input

+Control Pad
    SNAKE: Yes. CTR/SPR: Yes. FTR: Yes.

A, B, X, and Y Buttons
    SNAKE: Yes. CTR/SPR: Yes. FTR: Yes.
    When the X Button on SNAKE is pressed with considerable force, downward movement might be detected on the C Stick. To avoid problems, we recommend the following workarounds in applications that use the C Stick.
        Do not assign press-and-hold operations to the X Button.
        Do not read values from the C Stick in scenarios where the X Button is held down.
        Adjust the sensitivity of the C Stick.
    More force is required to reproduce this issue on the development hardware and the New Nintendo 3DS than on the New Nintendo 3DS XL.

L/R Buttons
    SNAKE: Yes. CTR/SPR: Yes. FTR: Yes.

ZL/ZR Buttons
    SNAKE: Yes. CTR/SPR: Available when the Circle Pad Pro is connected. FTR: None.
    The ZL and ZR Buttons on SNAKE are compatible with the ZL and ZR Buttons on the Nintendo 3DS Circle Pad Pro. Applications can handle them as if SNAKE were a CTR system that is always equipped with the Circle Pad Pro.

Circle Pad
    SNAKE: Yes. CTR/SPR: Yes. FTR: Yes.

C Stick/Right Circle Pad
    SNAKE: Yes. CTR/SPR: Available when the Circle Pad Pro is connected. FTR: None.
    The C Stick on SNAKE is compatible with the Right Circle Pad on the Nintendo 3DS Circle Pad Pro. Applications can handle them as if SNAKE were a CTR system that is always equipped with the Circle Pad Pro.

START/SELECT
    SNAKE: Yes. CTR/SPR: Yes. FTR: Yes.
    SELECT has been kept for backward compatibility with Nintendo DS software, and any SELECT press is interpreted by 3DS applications as identical to a START press.

HOME Button
    SNAKE: Yes. CTR/SPR: Yes. FTR: Yes.
    Activates the HOME Menu. This input cannot be used by the application for gameplay purposes.

POWER Button
    SNAKE: Yes. CTR/SPR: Yes. FTR: Yes.
    Controls power features. To prevent unintended operation of the button, the button does not react to momentary presses. Holding it down for a set period of time begins forced shutdown. On SNAKE hardware, the POWER Button can be pressed while the system is closed. This input cannot be used by the application for gameplay purposes.

Wireless switch
    SNAKE: None. CTR/SPR: Yes. FTR: None.
    On SNAKE and FTR, wireless communication is enabled and disabled from the HOME Menu. This input cannot be used by the application for gameplay purposes.

Open/close detection sensor/Sleep Switch
    SNAKE: Open/close detection sensor. CTR/SPR: Open/close detection sensor. FTR: Sleep Switch.
    The open/close detection switch detects whether the system is open or closed. FTR cannot be closed, so it includes a sleep switch "slider" instead of an open/close detection switch. This input cannot be used by the application for gameplay purposes.

Sound volume
    SNAKE: Yes. CTR/SPR: Yes. FTR: Yes.
    Equipped with a volume slider. This input cannot be used by the application for gameplay purposes.

3D Depth Slider
    SNAKE: Yes. CTR/SPR: Yes. FTR: None.
    Adjusts parallax for stereoscopic display. This input cannot be used by the application for gameplay purposes.

2.8.2. Accelerometer

The CTR system includes an accelerometer in the base of the system that measures acceleration
on each of the three axes. This can be used by applications. The accelerometer range is
approximately 1.8 G in both directions on each axis, with noise when stationary of up to ±0.02 G,
sensitivity of approximately 0.002 G, and a sampling rate of 100 Hz (theoretical value for the
device).

2.8.3. Touch Panel

The CTR system has a touch panel over the lower LCD, just like the Nintendo DS/DSi.

The touch panel uses resistive film technology for single-point touch functionality similar to the
Nintendo DSi. As with previous systems, input coordinates can be obtained in terms of LCD pixels.

2.8.4. Microphone

The monaural microphone is located to the side of the group of buttons at the bottom of the lower
screen. Performance is equivalent to the microphone on the Nintendo DSi. As with the Nintendo
DSi, the microphone gain can be adjusted from 10.5 dB to 70.0 dB in increments of 0.5 dB.

Sensitivity variations between individual microphones are within 0.5 dB (1.06x).

2.8.5. Cameras

The CTR system has one inner camera and two outer cameras (left and right), with the same
specifications as the ones used on the Nintendo DSi system. You can either use both outer
cameras at the same time or the left outer camera and the inner camera at the same time, but it is
not possible to use all three cameras at the same time.

The main specifications of the cameras are as follows.

Aperture: F2.8 (fixed).
Angle of view: See the following table (when photographing at maximum resolution).
Photographable range: 20 cm to infinity. (Pan focus. Not equipped with a macro switch.)
Maximum resolution: VGA
Maximum frame rate: 30 fps (fps: frames per second)
Output format: YCrYCb (can also output to RGBA8888, RGB888, RGB565, and RGBA5551 formats by use of the separate YUV to RGB circuit).

Angle of View    Minimum    Average    Maximum
Diagonal         63.0°      66.0°      69.0°
Horizontal       52.2°      54.9°      57.6°
Vertical         40.4°      42.6°      44.8°

2.8.6. Gyro Sensor

The CTR system is equipped with a triaxial gyroscopic sensor in the bottom half of the system that
can detect when the system is tilted and the speed of rotation. The sensor can measure up to
±1,800 degrees per second (DPS), with noise when stationary of up to ±2.28 DPS, sensitivity of
0.07 DPS, and a sampling rate of 100 Hz.

2.9. Output Devices

2.9.1. Speakers

CTR, SPR, SNAKE, and CLOSER have stereo speakers located to the left and right of the upper
screen. FTR has a single monaural speaker located to the upper-left of the upper screen.

The following figure shows the sound pressure frequency characteristics of the CTR, SPR, FTR,
SNAKE, and CLOSER speakers.

Figure 2-3. Sound-pressure Frequency Characteristics of the CTR (SPR/FTR/SNAKE/CLOSER) System Speakers
2.9.2. Audio Jack

The system comes equipped with a stereo output mini-jack. Unlike the Nintendo DS/DSi, there is no
microphone jack.

Headsets that include a microphone are not supported.

Audio jack location: SNAKE, CTR, SPR: Front and center of the lower half of the system. FTR: Front left of the lower half of the system.

2.9.3. LEDs

The system comes with LEDs to indicate the camera state, battery level, charging status, wireless
status, 3D display, and notification status.

Each platform has the following LEDs.

SNAKE CTR SPR FTR


Camera status None Yes Yes None
Battery Yes Yes Yes Yes
Charging Yes Yes Yes Yes
Wireless Yes Yes Yes Yes
3D Display None Yes None None
Notifications Yes Yes Yes Yes

2.10. Communication Devices

2.10.1. Wireless Communication Module

The CTR system has a wireless communication module that transmits in the 2.4 GHz band.

Communications features can be broadly classified as foreground, such as when an application explicitly uses communications features, and background, such as when the system is engaging in automatic communications.

For more information about wireless communication, see the 3DS Programming Manual: Wireless
Communication.

Foreground communication
Infrastructure communication
Local communication
Download Play

Background communication
StreetPass
Download tasks
Presence features

2.10.2. Infrared Communication Device

The CTR system comes with an infrared transceiver and an infrared communication device.

2.10.3. Near Field Communication (NFC)

A contactless near field communication (NFC) antenna is located below the lower screen panel for
SNAKE.
2.11. Other

2.11.1. Real-Time Clock (RTC)

This keeps and measures time.

2.11.2. Compatibility With the Nintendo DSi

The CTR system maintains compatibility with the Nintendo DSi and supports DS Download Play.


3. Memory and Nintendo 3DS Game Cards

This chapter describes memory regions that can be accessed by applications and Nintendo 3DS Game
Cards written to by card-based software.

3.1. Memory

3.1.1. Memory Map

The following figure shows a memory map as seen by applications. This memory map is for release
versions.

Figure 3-1. Memory Map (Release Versions)


Memory access from applications is carried out using virtual addresses.

For security reasons, it is not possible to execute code from data loaded into memory regions other than the region where the program is loaded (such as heap memory or device memory).

Memory above address 0x4000_0000 is reserved.

3.1.2. Device Memory

Device memory is a region in memory for which the operating system guarantees address integrity
when it is accessed by the GPU or other peripheral device. Some of the buffers accessed by the
GPU used for graphics and the DSP used for sound must be allocated from this device memory.
Conversely, some buffers cannot be allocated from device memory. Generally, those buffers must
be 4096-byte aligned and have sizes that are multiples of 4096 bytes.

You can use the nn::os::SetDeviceMemorySize() function to specify the size of the memory
region to allocate as device memory. Specify the size as a multiple of
nn::os::DEVICE_MEMORY_UNITSIZE (4096 bytes). On CTR, you can allocate up to 64 MB for the
main program, heap memory, and device memory combined (96 MB on development hardware). On SNAKE,
you can allocate up to 124 MB (178 MB on development hardware). You can call this function again
to change the size of the memory region to allocate, but the change must be a multiple of
1 MB (1,048,576 bytes). Changing the size to 0 causes no problems as long as the size before the
change is a multiple of 1 MB; similarly, changing from a size of 0 causes no problems as long as
the new size is a multiple of 1 MB. In every case, the size must be a multiple of 4096 bytes.

There is no guarantee that the address of the memory region allocated will be the same for every
call. After specifying the size of the region, first use the nn::os::GetDeviceMemoryAddress
function to get the starting address of the region and then access the device memory. Use the
nn::os::GetDeviceMemorySize function to get the size of the currently allocated memory.
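
The following minimal sketch shows one way these functions might be used together. The exact return types (nn::Result from SetDeviceMemorySize, uptr from GetDeviceMemoryAddress) and the use of NN_ASSERT are assumptions here; check the CTR-SDK API Reference for the actual signatures.

#include <nn/os.h>

void SetUpDeviceMemory()
{
    // Reserve 8 MB of device memory: a multiple of nn::os::DEVICE_MEMORY_UNITSIZE
    // (4096 bytes) and of 1 MB, so that later size changes remain valid.
    const size_t DEVICE_MEMORY_SIZE = 8 * 1024 * 1024;

    nn::Result result = nn::os::SetDeviceMemorySize(DEVICE_MEMORY_SIZE);
    NN_ASSERT(result.IsSuccess());

    // The region's address is not guaranteed to be the same for every call,
    // so always query it after setting or changing the size.
    uptr   deviceMemoryAddress = nn::os::GetDeviceMemoryAddress();
    size_t deviceMemorySize    = nn::os::GetDeviceMemorySize();

    // Buffers for the GPU, DSP, and other devices are then placed inside
    // [deviceMemoryAddress, deviceMemoryAddress + deviceMemorySize).
    (void)deviceMemoryAddress;
    (void)deviceMemorySize;
}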

Table 3-1. Types and Starting Address Alignment for Buffers Allocated From Device Memory

Device, Buffer Type: Alignment; Comments

GPU, texture image: 128 bytes.
GPU, vertex buffers: 1 to 4 bytes. Changes depending on vertex attributes.
GPU, display buffer: 16 bytes.
DSP, sound source data: 32 bytes. The size of the whole region storing sound source data must be a multiple of 32 bytes.
CAMERA, receive buffer: 64 bytes (recommended). If you use a buffer allocated in a space other than device memory, the SDK stops on a panic.
Y2R, image data receive/send buffer: 64 bytes (recommended). If you use a buffer allocated in a space other than device memory, the SDK stops on a panic.

Buffers accessed by the GPU may also be allocated from VRAM.

Table 3-2. Buffers That Cannot Be Allocated From Device Memory

Buffer storing microphone sampling results: Must be 4096-byte aligned, and its size must be a multiple of 4096 bytes.
UDS library’s receive buffer: Must be 4096-byte aligned, and its size must be a multiple of 4096 bytes.
Socket library’s receive buffer: Must be 4096-byte aligned, and its size must be a multiple of 4096 bytes.
HTTP library’s working buffer: Must be 4096-byte aligned, and its size must be a multiple of 4096 bytes.
DLP library’s working buffer: Must be 4096-byte aligned, and its size must be a multiple of 4096 bytes.
ACT library communication buffer: Must be 4096-byte aligned, and its size must be a multiple of 4096 bytes.

3.1.3. VRAM

VRAM consists of the two regions VRAM-A and VRAM-B, each 3 MB, for a total of 6 MB.

Applications must manage the memory regions used by the GX (graphics) library, such as by allocating
color buffers. (Note that when the CPU reads VRAM, the data is not guaranteed to be accurate.)
To allocate memory in VRAM, use the nngxGetVramStartAddr(), nngxGetVramEndAddr(), and
nngxGetVramSize() functions to get the start address, end address, and size of VRAM, and manage the
placement of the requested memory regions yourself.
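
Purely as an illustration of managing VRAM placement yourself, the following hypothetical bump allocator hands out addresses from VRAM-A. The area selector NN_GX_MEM_VRAMA and the exact signatures of the nngxGetVram* functions are assumptions here; consult the GX reference before relying on them.

#include <nn/gx.h>

// Hypothetical bump allocator over VRAM-A (not an SDK facility).
static uptr s_VramACurrent = 0;

void* AllocateFromVramA(size_t size, size_t alignment)
{
    if (s_VramACurrent == 0)
    {
        // Start allocating from the beginning of VRAM-A.
        s_VramACurrent = nngxGetVramStartAddr(NN_GX_MEM_VRAMA);
    }

    // Round the current position up to the requested alignment.
    uptr alignedAddr = (s_VramACurrent + alignment - 1) & ~(alignment - 1);

    if (alignedAddr + size > nngxGetVramEndAddr(NN_GX_MEM_VRAMA))
    {
        return NULL;    // VRAM-A is exhausted
    }

    s_VramACurrent = alignedAddr + size;
    return reinterpret_cast<void*>(alignedAddr);
}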

For more information about the GX library and how to manage memory used by libraries, see 5.5.
Initializing the GX Library (in this manual) and the 3DS Programming Manual: Basic Graphics.
3.1.4. Heap Memory

This memory is freely usable by applications, such as for allocating file read and write buffers. Use
the nn::os::SetHeapSize function to specify the size of heap memory to allocate. The size must
be a multiple of nn::os::HEAP_UNITSIZE (4096 bytes). On CTR, the total size of main program
memory, heap memory, and device memory must be no greater than 64 MB (96 MB for development
hardware). On SNAKE, the heap can use up to 96 MB, and the total size of main program memory, heap
memory, and device memory must be no greater than 124 MB (178 MB for development hardware).
Call this function again to change the size of the memory region to allocate.

When expanding and reducing the size of the heap memory region according to the usage status,
and resizing the device memory region accordingly, take note of the following points when
managing the heap memory region.

When reducing the heap memory region after it was expanded, reduce it by the same amount it
was incremented.
The heap memory region can also be expanded more than once, in stages. In such cases, reduce
its size in stages, so that it returns to the size prior to expansion.

If this condition is violated, it may not be guaranteed that the unassigned region can be secured as
device memory.

Figure 3-2. Example of Expanding Device Memory After Expanding or Reducing Heap Memory

As shown in Figure 3-2, when reducing the heap memory region and expanding the device memory
region from a state where the heap memory region was expanded in stages (Step 1, Step 2, Step
3), to return the heap memory region to the size it was prior to expansion, first reduce it by the
same size it was incremented by in Step 3 (Step 4), and then reduce it by the same size it was
increased by in Step 2 (Step 5).

Warning: Wherever possible, we recommend maintaining the total size required by the heap
memory region and not resizing it. Perform these steps only when a resize is absolutely
required. However, resizing in small units can lead to a considerable drop in
performance when accessing the memory.
There is no guarantee that the address of the memory region allocated will be the same for every
call. After specifying the size, use the nn::os::GetHeapAddr() function to get the starting
address of the region. Use the nn::os::GetHeapSize() function to get the size of the currently
allocated heap.
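
The following is a minimal sketch of the same pattern for heap memory; as with device memory, the return types and the assert macro are assumptions, and error handling is reduced to an assert.

#include <nn/os.h>

void SetUpHeapMemory()
{
    // Allocate 16 MB of heap memory (a multiple of nn::os::HEAP_UNITSIZE).
    const size_t HEAP_SIZE = 16 * 1024 * 1024;

    nn::Result result = nn::os::SetHeapSize(HEAP_SIZE);
    NN_ASSERT(result.IsSuccess());

    // Query the region after every size change; its address may move.
    uptr   heapAddress = nn::os::GetHeapAddr();
    size_t heapSize    = nn::os::GetHeapSize();

    // A heap manager (for example, one of the FND heap classes) can then be
    // constructed on [heapAddress, heapAddress + heapSize).
    (void)heapAddress;
    (void)heapSize;
}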

3.2. Nintendo 3DS Game Cards

There are two types of Nintendo 3DS Game Cards: CARD1 and CARD2. The card most appropriate
for use is determined by the game title’s specifications.

Warning: As a rule, use CARD1 for game titles with a backup memory capacity of 512 KB or less, and
use CARD2 for those with a capacity of 1 MB or more.

3.2.1. CARD1

CARD1 cards come equipped with both read-only memory (ROM) and backup memory (rewritable
non-volatile memory).

3.2.1.1. ROM

The CARD1 ROM transfer speed varies depending on the version of the SDK in use and the size
of the data being transferred. The actual speed may also vary slightly due to differences in
individual cards, so do not design applications that depend on transfer speeds.

There are six different ROM capacities, ranging from 128 MB to 4 GB.

Table 3-3. ROM Capacity (CARD1)

Transfer speed (all capacities): see the paragraph following the table.

Usable space size by market:
128 MB: Japan, North America, Korea: 93.5 MB (98,041,856 bytes); Europe: 91.5 MB (95,944,704 bytes); China, Taiwan: 61.5 MB (64,487,424 bytes)
256 MB: Japan, North America, Korea: 219.0 MB (229,638,144 bytes); Europe: 217.0 MB (227,540,992 bytes); China, Taiwan: 187.0 MB (196,083,712 bytes)
512 MB: Japan, North America, Korea: 470.0 MB (492,830,720 bytes); Europe: 468.0 MB (490,733,568 bytes); China, Taiwan: 438.0 MB (459,276,288 bytes)
1 GB: Japan, North America, Korea: 921.5 MB (966,262,784 bytes); Europe: 919.5 MB (964,165,632 bytes); China, Taiwan: 889.5 MB (932,708,352 bytes)
2 GB: Japan, North America, Korea: 1875.5 MB (1,966,604,288 bytes); Europe: 1873.5 MB (1,964,507,136 bytes); China, Taiwan: 1843.5 MB (1,933,049,856 bytes)
4 GB: Japan, North America, Korea: 3783.0 MB (3,966,763,008 bytes); Europe: 3781.0 MB (3,964,665,856 bytes); China, Taiwan: 3751.0 MB (3,933,208,576 bytes)

Transfer speeds depend on the version of the SDK and the size of the data. For more information,
see 3DS Performance Tips.

3.2.1.2. Backup Memory

The transfer speed for backup memory in CARD1 cards depends on the version of the SDK in use
and the size of the data being transferred. The actual speed may also vary slightly due to
differences in individual cards, so do not design applications that depend on transfer speeds. See
the CTR Guidelines and follow its standards for the number of write operations.

There are two different backup memory capacities, either 128 KB or 512 KB. Use a CARD2 card if
you need 1 MB or more of backup memory.

The content of backup memory at time of shipment is guaranteed to be padded with 0xFF
throughout the entire memory region. Pad all of backup memory with 0xFF when emulating the
memory state at time of shipment.

You can enable automatic redundancy of the backup memory region used for save data files.
However, when automatic redundancy is enabled, the capacity available for saving files is roughly
40% of the total capacity.

Table 3-4. Backup Memory Capacity (CARD1)

Transfer speed: see the paragraph following the table.

Backup Memory Capacity: Usable Region Size When Automatic Redundancy Is Enabled
128 KB: 50 KB or less
512 KB: 239 KB or less

Transfer speeds depend on the version of the SDK and the size of the data. For more information,
see 3DS Performance Tips.

3.2.2. CARD2

CARD2 cards come equipped with writable memory that has features of both ROM and backup
memory.

3.2.2.1. Writable Memory

The transfer speed for writable memory in CARD2 cards depends on the version of the SDK used
and the size of the data involved. The actual speed may vary slightly due to differences in
individual cards, so do not design applications that depend on transfer speeds. See the CTR
Guidelines and follow its standards for the number of write operations.

With writable memory, you can change the boundaries of the ROM (read-only) and backup
(rewritable) regions in 1-MB increments.

You can use the library to enable automatic redundancy for data in backup memory, just as with
CARD1 backup memory. Calculate the usable capacity for saving files in a similar manner. The
maximum file size that you can specify as backup space is half of the writable memory capacity.
There are three sizes of writable memory capacity: 512 MB, 1 GB, and 2 GB.

Table 3-5. Writable Memory Capacity (CARD2)

Transfer speed (all capacities): see the paragraph following the table.

Usable space size by market:
512 MB: Japan, North America, Korea: 412.5 MB (432,537,600 bytes); Europe: 410.5 MB (430,440,448 bytes); China, Taiwan: 412.5 MB (432,537,600 bytes)
1 GB: Japan, North America, Korea: 889.5 MB (932,708,352 bytes); Europe: 887.5 MB (930,611,200 bytes); China, Taiwan: 889.5 MB (932,708,352 bytes)
2 GB: Japan, North America, Korea: 1843.5 MB (1,933,049,856 bytes); Europe: 1841.5 MB (1,930,952,704 bytes); China, Taiwan: 1843.5 MB (1,933,049,856 bytes)

Transfer speeds depend on the version of the SDK and the size of the data. For more information,
see 3DS Performance Tips.


4. Software Configuration
This chapter describes the software configuration of the 3DS system.

4.1. Applications

To switch from one application to another, you must start the second application using the HOME
Menu.

Figure 4-1. Application Switching


4.2. Standard and Extended Applications

The SNAKE hardware offers two operating modes with different CPU performance: standard mode
and extended mode. For more information, see 2.5. Operating Modes. CTR is handled as a platform
that only supports the standard mode.

Consequently, two application types are available: standard applications that run in standard mode
even on SNAKE, and extended applications that run in extended mode on SNAKE.

Table 4-1. Application Types and Operating Modes on Each Platform

Standard application: SNAKE: Standard mode. CTR: Standard mode. Includes applications built with previous SDKs.
Extended application: SNAKE: Extended mode. CTR: Standard mode.

4.2.1. Creating a Standard Application

If you are using the CTR-SDK build system, comment out the following line in the OMakefile if it
exists: "DESCRIPTOR = $(CTRSDK_ROOT)/resources/specfiles/[Link]".

If you are using VSI-CTR (requires VSI-CTR Platform 3.0.0 or later), in Configuration Properties >
General, change Create Extended Application to No.

4.2.2. Creating an Extended Application

If you are using the CTR-SDK build system, add the following line to the OMakefile: "DESCRIPTOR
= $(CTRSDK_ROOT)/resources/specfiles/[Link]".

If you are using VSI-CTR (requires VSI-CTR Platform 3.0.0 or later), in Configuration Properties >
General, change Create Extended Application to Yes.

4.2.3. Things to Note When Using Extended Mode


The L2 cache is enabled in extended mode. However, the performance of functions such as
nngxUpdateBuffer might decrease due to the large capacity of the L2 cache.

To avoid this performance drop, variants of some API functions are provided for specific purposes
in extended mode. Be sure to use the right function when handling graphics in your application
while running in extended mode.

4.2.4. When the Operating Mode Changes

On SNAKE, the operating mode changes when nn::applet::Enable is called. Consequently, even standard applications run in extended mode from the time the 3DS logo is displayed until control is passed to the application and nn::applet::Enable is called.

The system always runs in extended mode while the HOME Menu is displayed, regardless of
whether an application is running. When a standard application is suspended by the HOME Menu,
the system changes from standard mode to extended mode, and then back to standard mode when
control returns to the application.

System applets such as Game Notes and Internet Browser also suspend the application and switch
the operating mode to extended mode in the same manner as the HOME Menu.

Table 4-2. Application Types and Timing of Operating Mode Change

Extended application: Logo Display: Extended mode. nn::applet::Enable: Extended mode. Application: Extended mode. HOME Menu: Extended mode. Library Applet: Extended mode. System Applets: Extended mode.
Standard application: Logo Display: Extended mode. nn::applet::Enable: Standard mode. Application: Standard mode. HOME Menu: Extended mode. Library Applet: Standard mode. System Applets: Extended mode.

4.2.5. Changing Application Behavior on SNAKE and CTR

With the addition of SNAKE to the 3DS family, in some cases it may be necessary to modify the
behavior of an application depending on the type of hardware it is running on.

An application can determine whether it is running on SNAKE hardware by calling the nn::os::IsRunOnSnake() function. This allows hardware-specific implementations, such as omitting the Circle Pad Pro connection check on SNAKE hardware, which has the equivalent input devices built into the system.

An application created as an extended application can determine whether it is running in extended mode on SNAKE by calling the nn::os::IsRunningAsExtApplication() function. Note, however, that this function has a high execution cost and cannot be called inside nninitStartUp.

The nn::os::IsRunOnSnake() and nn::os::IsRunningAsExtApplication() functions return the following values for the various combinations of hardware and application type.

Table 4-3. Return Values for Hardware and Application Combinations

SNAKE, extended application: IsRunOnSnake() returns true; IsRunningAsExtApplication() returns true.
SNAKE, standard application: IsRunOnSnake() returns true; IsRunningAsExtApplication() returns false.
CTR, extended application: IsRunOnSnake() returns false; IsRunningAsExtApplication() returns false.
CTR, standard application: IsRunOnSnake() returns false; IsRunningAsExtApplication() returns false.

Warning: Always use the nn::os::IsRunningAsExtApplication() function to determine whether the application can use the enhanced features in extended mode (such as additional memory and CPU performance). The cache manipulation functions that are added when the L2 cache is enabled, however, can be used even in standard mode.

Use the nn::os::IsRunOnSnake() function to determine whether any other devices with different specifications on SNAKE and CTR can be used.
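
The following sketch shows the intended division of roles between the two functions; the branch bodies are placeholders.

#include <nn/os.h>

void ConfigureForPlatform()
{
    if ( nn::os::IsRunOnSnake() )
    {
        // Running on SNAKE: the C Stick and ZL/ZR Buttons are built in, so the
        // Circle Pad Pro connection check can be skipped here.
    }

    if ( nn::os::IsRunningAsExtApplication() )
    {
        // Running in extended mode: features that rely on the additional
        // memory and CPU performance may be enabled here.
    }
}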

4.2.6. SNAKE-Only Titles

For applications that only operate on SNAKE, set the SNAKEOnly item in the BSF file to True.
SNAKE-only titles do not run on CTR, so you do not need to consider operations on CTR.

Note: For more information about BSF files, see the reference manual for the CTR-SDK tool
ctr_makebanner.

4.3. What the SDK Provides

The 3DS SDK includes not only libraries for handling device input such as key presses, but also
provides applets and libraries for using features such as the HOME Menu and SpotPass.

4.3.1. Applets

Much like the HOME Menu, applets provide features for specific purposes that applications can
use. Using applet features can help reduce the cost of developing applications.

Warning: Only applets provided by Nintendo can be used. Developers cannot create their own
applets.

4.3.2. Libraries

CTR-SDK includes the libraries needed to use the hardware. Each library has its own namespace
derived from the library name and comprises multiple classes and member functions.

Libraries are written mainly in C++, with wrapper functions written in C. For more information about
the C-language wrapper functions, see the CTR-SDK API Reference. The following descriptions all
use C++ for function names and other code.

Table 4-4. Library List

Library Name (Namespace): Description

OS (nn::os): A collection of classes for memory allocation, mutual exclusion, ticks, alarms, threads, and other operating system-related features.
RO (nn::ro): Provides DLL features.
APPLET (nn::applet): Supports starting applets, transitioning the system to sleep when the system is closed, and other functionality.
FS (nn::fs): Used for accessing files on various media.
CX (nn::cx): Handles data compression and decompression.
MATH (nn::math): A collection of mathematical and numeric functions.
CRYPTO (nn::crypto): Handles encryption.
FND (nn::fnd): A collection of heap, time, and other fundamental classes.
FONT (nn::font): Handles character drawing using font data.
HID (nn::hid): Handles input from the digital buttons, Circle Pad, touch panel, accelerometer, gyro sensor, and debug pad.
PTM (nn::ptm): Controls system power and alarms.
GX (nn::gx): Handles GPU and LCD control. Uses the gl and nngx functions for 3D graphics rendering.
GD (nn::gd): A lighter-weight version of the 3D graphics rendering functions in the GX library.
GR (nn::gr): Supports direct generation of 3D graphics commands.
MIC (nn::mic): Handles automatic microphone sampling.
CAMERA (nn::camera): Handles image capture using a camera.
Y2R (nn::y2r): Handles conversion from YUV to RGB format using the YUVtoRGB circuit.
QTM (nn::qtm): Handles face tracking. Available only on SNAKE.
DSP (nn::dsp): Allows use of the DSP for sound playback.
SND (nn::snd): Handles local sound playback.
AACDEC (nn::aacdec): Handles AAC data decoding.
AACENC (nn::aacenc): Handles AAC data encoding.
NDM (nn::ndm): Controls the daemon carrying out network processes.
UDS (nn::uds): Handles wireless communications using the wireless communications module.
RDT (nn::rdt): Allows secure data communication using the UDS library.
CEC (nn::cec): Handles StreetPass [Chance Encounter Communication (CEC)] settings and other features.
DLP (nn::dlp): Supports Nintendo 3DS Download Play.
AC (nn::ac): Handles automatic connections for infrastructure communication.
BOSS (nn::boss): Handles download task registration.
FRIENDS (nn::friends): Supports access to friend information.
NEWS (nn::news): Posts notifications.
ACT (nn::act): Gets information registered to the account system and authenticates it.
EC (nn::ec): For using the EC features.
IR (nn::ir): For using infrared communication between systems.
NFP (nn::nfp): Allows linking between the application and branded character products called amiibo™ figures.
ULCD (nn::ulcd): Calculates the left- and right-eye camera matrices used in stereoscopic display.
JPEG (nn::jpeg): Handles encoding to and decoding from JPEG format.
TPL (nn::tpl): Handles texture collection TPL files.
CFG (nn::cfg): Gets information handled by System Settings.
NGC (nn::ngc): Checks for words that are in the profanity list.
UBL (nn::ubl): Manages the blocked-user list.
PL (nn::pl): Used for features unique to the 3DS (such as the pedometer).
UTIL (nn::util): Utility functions.
ENC (nn::enc): Handles encoding conversions.
ERR (nn::err): Error handling.
ERREULA (nn::erreula): For using the Error/EULA applet.
SWKBD (nn::swkbd): For using the software keyboard applet.
PHTSEL (nn::phtsel): For using the photo selection applet.
VOICESEL (nn::voicesel): For using the sound selection applet.
EXTRAPAD (nn::extrapad): For using the Circle Pad Pro calibration applet.
WEBBRS (nn::webbrs): For using the Internet browser.
OLV (nn::olv): For using the Miiverse application and the Post app.
SOCKET (nn::socket): Handles socket communication.
SSL (nn::ssl): Handles SSL communication.
HTTP (nn::http): Handles HTTP communication.
DBG (nn::dbg): Assists in debugging.
HIO (nn::hio): (Debugging Only) For using Host IO.
MIDI (nn::midi): (Debugging Only) For using MIDI.

Library configurations and names may be changed in the future.

4.4. Error Handling

When the 3DS system is updated, the libraries are also updated. As a result, functions could return
values that were not defined at the time the application was developed. To handle such cases, you
must use the nn::Result::IsSuccess() or nn::Result::IsFailure() function to determine
whether a process has succeeded.

Nintendo does not recommend error handling that only uses value matching to determine errors
rather than using these functions, because doing so could cause the application to erroneously
assume success or failure when an unexpected error occurs.
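
A minimal sketch of this rule follows; DoSomeLibraryCall() is a hypothetical stand-in for any SDK function that returns nn::Result, and the header path is an assumption.

#include <nn/Result.h>

nn::Result DoSomeLibraryCall();    // hypothetical SDK call returning nn::Result

void HandleLibraryResult()
{
    nn::Result result = DoSomeLibraryCall();

    if ( result.IsSuccess() )
    {
        // Normal processing.
    }
    else
    {
        // Treat every failure as an error, including result values that were
        // not defined when the application was developed.
    }
}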

Note: If errors that differ from the symptoms described in the reference occur only on a particular
device, the cause may be a hardware problem.

5. Initializing Applications and Handling State Transitions

This chapter describes what an application must initialize when it is started, and how to handle state
transitions, such as to Sleep Mode.

5.1. Initialization Prior to Calling the Entry Function

For more information about the processes performed prior to calling the application's entry function,
see the System Programming Guide included in the SDK.

5.2. Entry Function

An application's entry function is defined by the nnMain() function.

The entry function must first initialize the libraries used by the application. The details of initializing
the main libraries are provided later in this document.

The application quits when execution returns from the entry function. Within the entry function, construct a main loop,
and ensure that execution does not leave the entry function until the application is ready to quit.
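
A minimal sketch of an entry function follows, assuming the extern "C" void nnMain() signature used by CTR-SDK sample code; library initialization and per-frame processing are left as comments, and close-request handling is described in 5.3.1.

#include <nn/applet.h>

extern "C" void nnMain()
{
    // Initialize the libraries the application uses here,
    // including nn::applet::Enable() (see 5.3).

    // Main loop: execution must not leave nnMain() until the application quits.
    while ( !nn::applet::IsExpectedToCloseApplication() )
    {
        // One frame of input handling, game processing, and rendering.
    }

    // Application-specific shutdown processing, then close the application.
    nn::applet::CloseApplication();
}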

5.3. Initializing the APPLET Library

The APPLET library is not just for using library applets. It is also required for handling HOME Button
and POWER Button events and for transitioning to Sleep Mode when the system is closed. The APPLET
library itself is initialized before execution moves to the entry function; the initialization
processing that the application must perform is to call the nn::applet::Enable() function to
enable each feature of the APPLET library.

Code 5-1. APPLET Library Initialization

void nn::applet::Enable(bool isSleepEnabled = true);

Call this function after setting a sleep-related callback. Use isSleepEnabled when calling this
function to specify whether a sleep-related callback is enabled.
Just before and after calling this function, sleep-related handling must be carefully performed. For
information about sleep-related handling, see 5.3.3. Sleep Handling.

Warning: The nn::applet::Enable() function must be called before the nngxInitialize, nn::dsp::Initialize, or nn::snd::Initialize() functions.

Primarily, this is because there are times when an application must be closed immediately after it starts due to something that occurred before the Enable() function is called, such as the POWER Button being pressed while the application was starting. For this reason, immediately after calling this function, handle any application close request as described in 5.3.1. Handling Application Close Requests. At this time, do not call nngxInitialize until it is verified that no close request has arrived.
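
The startup order implied by this warning might look like the following sketch.

#include <nn/applet.h>

void EnableAppletAndCheckCloseRequest()
{
    nn::applet::Enable();    // enable the APPLET library before GX/DSP/SND initialization

    if ( nn::applet::IsExpectedToCloseApplication() )
    {
        // A close request arrived while the application was starting:
        // close immediately, without calling nngxInitialize().
        nn::applet::CloseApplication();
        return;
    }

    // Only now is it safe to call nngxInitialize, nn::dsp::Initialize,
    // nn::snd::Initialize, and so on.
}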

The following sections explain the application close request, HOME Button, sleep (system closed),
and POWER Button handling that the application must implement. These sections also provide
examples of handling these state changes. The system uses the Applet Manager to notify the application
about all state changes the application must respond to.

Warning: When there has been no response to an application close request or a POWER Button
press, the application is forcibly ended when the POWER Button is held down for longer
than a certain period of time. The application must implement responses to these state
changes so that they are handled by normal processing.

Figure 5-1. Notifications Sent From the Applet Manager

5.3.1. Handling Application Close Requests

The Applet Manager notifies an application when it enters a state where it must close; the application detects this notification through the nn::applet::IsExpectedToCloseApplication() function. This could occur, for example, when another application is started or Close is selected on the HOME Menu while the application is suspended. Applications must call this function periodically (such as once per frame). If nn::applet::IsExpectedToCloseApplication returns true, the application must quickly be closed. It is also possible for the application to enter a state where it has no rendering rights.
In this case, first make the application perform its own shutdown processing (within four seconds), and then call the nn::applet::CloseApplication() function to close the application. Because graphics-related processes cannot be executed from a state where there are no rendering rights, calls to nngxInitialize or nngxWaitCmdlistDone block processing. Also, command request completion interrupts are not generated in this state. You must implement the shutdown processing without waiting for completion of command requests.

The nngxFinalize() function can be called even in a state where the application does not have
rendering rights, and calling it automatically releases the display buffer allocated by the nngx or
gl() functions.

Code 5-2. Functions Used to Close Applications

bool nn::applet::IsExpectedToCloseApplication(void);
nn::Result nn::applet::CloseApplication(const u8* pParam=NULL,
size_t paramSize=0, nn::Handle handle=NN_APPLET_HANDLE_NONE);

In addition to periodically calling the IsExpectedToCloseApplication() function, also make sure to call this function immediately after returning from the HOME Menu or a library applet to determine whether a condition requiring the application to close occurred while waiting for the return. Also check immediately after initializing the APPLET library in case a close request arrived while the application was loading. If this function returns true, also execute the processes that you want to run when the application closes (for example, autosave).
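
A sketch of the periodic check described above follows; FinalizeForShutdown() is a hypothetical application-specific function that performs processing such as autosave.

#include <nn/applet.h>

void FinalizeForShutdown();    // hypothetical application-specific shutdown (autosave and so on)

void CheckCloseRequest()
{
    if ( nn::applet::IsExpectedToCloseApplication() )
    {
        // Perform the application's own shutdown processing within four seconds,
        // without waiting for GPU command request completion.
        FinalizeForShutdown();

        nn::applet::CloseApplication();
    }
}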

5.3.1.1. Cautions When Shutting Down

Although the CTR-SDK is designed to release allocated resources even if an application calls
nn::applet::CloseApplication at an arbitrary time, this feature has not been sufficiently
tested. To safely close an application, take the following measures before CloseApplication is
called.

Note: Revisions in future releases are expected to make the following measures
unnecessary.

Required

When an nn::os::Alarm object has been created, call the Alarm object member functions
Cancel() and Finalize() in order.

When an nn::os::Timer object has been created, call the Timer object member functions
Stop() and Finalize() in order.

Stop FS and UDS library processes, and then finalize the libraries. In particular, for the FS
library, call the Finalize() function for each class being used. Even for NW4C and other non-
SDK packages, be sure to finalize any classes that use the FS library.

Recommended

Finalize all libraries that have been initialized.
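
A sketch of the required shutdown order for alarm and timer objects follows; FS and UDS finalization depends on the application and is only noted in a comment.

#include <nn/os.h>

void PrepareForCloseApplication(nn::os::Alarm& alarm, nn::os::Timer& timer)
{
    alarm.Cancel();      // cancel the alarm first, then finalize it
    alarm.Finalize();

    timer.Stop();        // stop the timer first, then finalize it
    timer.Finalize();

    // Stop FS and UDS library processing here, call Finalize() on each FS
    // class in use, and then finalize any other initialized libraries.
}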


5.3.2. Handling HOME Button Presses

If the HOME Button is pressed, the Applet Manager notifies the application that it has entered a state that requires the HOME Menu to start; the application detects this notification through the nn::applet::IsExpectedToProcessHomeButton() function. Applications must call this function periodically (such as once per frame). If this function returns true, the application must start the HOME Menu. Have the application stop the operation of all devices and start the HOME Menu immediately. If, however, operations cannot be stopped due to a process that cannot be halted, you can display a HOME Menu Disabled icon and cancel starting the HOME Menu. Note that the IsExpectedToProcessHomeButton() function continues to return true unless you use the nn::applet::ClearHomeButtonState() function to invalidate the fact that the HOME Button was pressed.

Code 5-3. Functions Used to Start the HOME Menu

bool nn::applet::IsExpectedToProcessHomeButton(void);
bool nn::applet::ProcessHomeButton(void);
nn::applet::AppletWakeupState nn::applet::WaitForStarting(
nn::applet::AppletId* pSenderId=NULL, u8* pParam=NULL,
size_t paramSize=0, s32* pReadLen=NULL, nn::Handle *pHandle=NULL,
nn::fnd::TimeSpan timeout=NN_APPLET_WAIT_INFINITE);
void nn::applet::ClearHomeButtonState(void);
void nn::applet::ProcessHomeButtonAndWait();

Processing required to start the HOME Menu can be carried out merely by calling the
nn::applet::ProcessHomeButton() function. If the return value is true, immediately call the
nn::applet::WaitForStarting() function and wait for the return from the HOME Menu. Upon
return, there are restrictions on the use of devices for purposes such as getting key input and
rendering. Also, note that only the thread that called the WaitForStarting() function stops.

Do not program an application in a way that causes threads with priority level 16 or higher (a
priority number from 1 to 16) to continue to occupy the CPU even after transitioning to the HOME
Menu. If Game Notes is started while a thread of this type is present, the system will hang. If an
application thread continues to run while the HOME Menu is being displayed, performance for the
HOME Menu and other operations deteriorates. We recommend that you program the application so
that all application threads stop when transitioning to the HOME Menu.

The nn::applet::ProcessHomeButtonAndWait() function is a wrapper function that calls the ProcessHomeButton() function and handles waiting and Sleep Mode.

Warning: At some point before the ProcessHomeButton() function is called, you must call
the nngxWaitCmdlistDone() function or perform equivalent processing to ensure that
all GPU render commands have finished executing.

Note: The ProcessHomeButton() function can only be called during the time that the GX
library can be used (from the completion of nngxInitialize until nngxFinalize is
called).

Before you call the ProcessHomeButton() function, configure the display buffer, swap
buffers, and call the nngxStartLcdDisplay() function to start LCD output. If LCD
output has not been started, there is a chance that the HOME Menu could start up with
black screens. Likewise, if the display buffer has not been configured or the buffers have
not been swapped, there is a chance that undefined content could be displayed on the
screens.
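
Putting the calls above together, a per-frame HOME Button handler might look like the following sketch. It omits Sleep Mode handling (which nn::applet::ProcessHomeButtonAndWait() performs for you), and StopApplicationDevices() is a hypothetical application function.

#include <nn/applet.h>
#include <nn/gx.h>

void StopApplicationDevices();    // hypothetical: stop sound, cameras, local communication, and so on

void HandleHomeButton()
{
    if ( nn::applet::IsExpectedToProcessHomeButton() )
    {
        nngxWaitCmdlistDone();      // all GPU render commands must be finished
        StopApplicationDevices();

        if ( nn::applet::ProcessHomeButton() )
        {
            // Only the thread that calls WaitForStarting() stops here.
            nn::applet::WaitForStarting();

            // A close request may have arrived while the HOME Menu was displayed.
            if ( nn::applet::IsExpectedToCloseApplication() )
            {
                // Shut down and call nn::applet::CloseApplication() here.
                return;
            }

            // Restore GPU register settings after returning from the HOME Menu.
            nngxUpdateState(NN_GX_STATE_ALL);
        }
    }
}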

Table 5-1. Handling Devices During HOME Menu Display

GPU/LCDs: Stop rendering, and do not update the display buffer. Also, by the time the HOME Menu is displayed, the graphics processing must be completed and the GPU must be in a stopped state.
Digital Buttons/Circle Pad: No particular handling is necessary. This input is not applied to the input values obtained by the application.
Touch Panel: No particular handling is necessary. This input is not applied to the sampling values obtained by the application; an invalid sampling value (0) is returned.
Accelerometer: No particular handling is necessary. Input values can be obtained.
Gyro Sensor: Input values can be obtained. If the gyro sensor will not be calibrated after returning from the HOME Menu, the application must get and correct these input values even during HOME Menu display.
Sound: Functions in the SND library are designed to be called from the main thread and the sound thread. When calling functions from these threads, the application does not need to keep track of anything when transitioning with the HOME Button. However, if you are calling functions from other threads, you must provide appropriate handling so that calls are not made while the application is in a suspended state. Because you cannot control the timing at which sound stops, the application must manage playback status or take other such steps to ensure synchronization of sound and graphics when such synchronization is necessary, such as when playing video.
Camera: No particular handling is necessary. Unless the cameras are finalized before HOME Menu startup, HOME Menu features that use the cameras will not work.
Microphone: No particular handling is necessary.
Wireless Communication: For non-local types of communication, no particular handling is necessary. However, if a browser or other application is started and it uses wireless communication while the HOME Menu is displayed, that application’s communication will be disconnected when the system returns from the HOME Menu. Local communication can continue. However, you must consider the possibility that the system will be closed during HOME Menu display, which would result in a transition to Sleep Mode. If local communications are not ended at that point, the local communications status will transition to an error. Also, unless local communication is finalized before HOME Menu startup, any HOME Menu features that use local communication will not work.
NFC: When using the NFP library, call nn::nfp::Finalize to exit the NFP library before transitioning to the HOME Menu or an applet. For more information, see the 3DS Programming Manual: NFP.

From immediately after returning from the HOME Menu until the application calls the
nngxSwapBuffers() function to switch the display buffer, an image captured during the transition to
the HOME Menu is displayed on the screen. Note that although the content of main memory and VRAM
is protected, GPU register settings must be reset by the application. Be sure to reset all register
settings, including the framebuffer, shader binary, and lookup tables, and not just the vertex load
array settings and texture unit settings.

Note: The nngxUpdateState(NN_GX_STATE_ALL) function call can be used to restore register settings as long as they are not directly overwritten.

HOME Button Press Detection


Although detection of HOME Button presses can also be checked using notifications to a callback
function registered using the nn::applet::SetHomeButtonCallback() function, implement this
check by periodically checking the nn::applet::IsExpectedToProcessHomeButton()
function.

Code 5-4. Setting a HOME Button Callback Function

void nn::applet::SetHomeButtonCallback(
nn::applet::AppletHomeButtonCallback callback, uptr arg=0);
typedef bool (*nn::applet::AppletHomeButtonCallback)(
uptr arg, bool isActive, nn::applet::CTR::HomeButtonState state);

Set a callback function by calling the nn::applet::SetHomeButtonCallback() function. Calls to the callback function pass the value of the arg parameter that was originally specified when SetHomeButtonCallback was called. The isActive parameter specifies whether an application is currently running, and the state parameter specifies the state of the HOME Button.

The status of the HOME Button is defined by the nn::applet::HomeButtonState enumerated type.

Table 5-2. HOME Button States

Definition Description

HOME_BUTTON_NONE HOME Button is not pressed.

HOME_BUTTON_SINGLE_PRESSED HOME Button is pressed (held for at least 200 ms).

Use the nn::applet::GetHomeButtonState() function to check the state of the HOME Button.
After a button press is detected, the HOME Button keeps that state until the next call to the
nn::applet::ClearHomeButtonState() function. Calls to ClearHomeButtonState return the
state of the HOME Button to HOME_BUTTON_NONE.

Code 5-5. Checking the State of the HOME Button

nn::applet::AppletHomeButtonState nn::applet::GetHomeButtonState(void);

The application can use the return value from the callback function to control whether a detected
HOME Button press is reflected in the HOME Button state obtained by calling the
nn::applet::GetHomeButtonState() function. A return value of true indicates the press in that
state; a return value of false does not. Implement the callback function to return the value passed
in the isActive parameter. Also implement the function to handle only lightweight processing, such
as flag control; do not start the HOME Menu directly from within the callback function.
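
As a minimal sketch of such a callback (the flag name s_HomeButtonNotified is hypothetical; only
the return-value behavior follows the description above):

static volatile bool s_HomeButtonNotified = false;

bool MyHomeButtonCallback(uptr arg, bool isActive,
                          nn::applet::CTR::HomeButtonState state)
{
    // Lightweight processing only: record that the HOME Button was pressed.
    s_HomeButtonNotified = true;

    // Returning isActive reflects the press in the HOME Button state only
    // while the application is running.
    return isActive;
}

// Registration, typically during initialization:
// nn::applet::SetHomeButtonCallback(MyHomeButtonCallback);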

[Link]. What an Application Can Do Before Starting the HOME Menu

After an application detects a HOME Button press, it has a maximum of 0.5 seconds until it must
start the HOME Menu. Within this time, the application can carry out such operations as pausing
the action or auto-saving. When you do not want to have a static image of the game as the HOME
Menu background (such as for timed puzzle games), you could use this time to process a
different image to hide the game screen. But in general, we recommend using just a static game
screen as the background.
When an application is suspended, the rendering process that creates the capture image displayed
on the upper screen of the HOME Menu is performed after it is confirmed that
nn::applet::IsExpectedToProcessHomeButton returns true. However, to display the
HOME Menu with nn::applet::ProcessHomeButton, GPU processing must be stopped
(graphics processing must be completed). Call nngxWaitCmdlistDone to wait until all graphics
commands have completed. After that, call nngxWaitVSync to ensure that the images displayed
on the LCDs are updated.
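
A minimal sketch of this call order follows. It assumes that rendering for the background frame has
already been submitted on the bound command list, and the display argument passed to
nngxWaitVSync (NN_GX_DISPLAY_BOTH) is an assumption here.

if (nn::applet::IsExpectedToProcessHomeButton())
{
    nngxWaitCmdlistDone();                  // Wait for all queued graphics commands.
    nngxWaitVSync(NN_GX_DISPLAY_BOTH);      // Make sure the LCDs show the final images.
    nn::applet::ProcessHomeButtonAndWait(); // Start the HOME Menu and wait for resumption.
}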

When starting the HOME Menu, consider the possibility that the user will close the application in
the HOME Menu and implement it so that there are no problems even if the application is closed
in the HOME Menu. Although you can usually perform shutdown processing after control returns
to the application, note that shutdown processing becomes impossible if the battery runs down
during Sleep Mode while the HOME Menu is being displayed.

[Link]. Displaying the HOME Menu Disabled Icon

When an application detects a HOME Button press while in the middle of a process that cannot
be interrupted (such as saving), and the HOME Menu cannot be displayed right away, the
application can display the HOME Menu Disabled icon in the middle of the lower screen and
cancel starting the HOME Menu.
Implement the HOME Menu Disabled icon according to the following specifications.

Icon: Image file included in the CTR-SDK (HomeNixSign_Targa.tga)
Display position: Center of the lower screen
Fade-in: 0.083 seconds (5 frames at 60 fps)
Display time: 1 second (60 frames at 60 fps)
Fade-out: 0.333 seconds (20 frames at 60 fps)

If the user presses the HOME Button while the icon is displayed, simply continue displaying the
icon. For example, if the user presses the HOME Button while the icon is fading out, you do not
need to fade in again. Just continue with the fade-out. Even if it does become possible to start
the HOME Menu while the HOME Menu Disabled icon is still displaying, you must not start the
HOME Menu while the icon is displayed.
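
For reference, these timings map to the following frame counts in a 60 fps main loop. The constant
names are illustrative only and are not part of the SDK.

const int HOME_NIX_FADE_IN_FRAMES  = 5;   // 0.083 seconds
const int HOME_NIX_DISPLAY_FRAMES  = 60;  // 1 second
const int HOME_NIX_FADE_OUT_FRAMES = 20;  // 0.333 seconds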

[Link]. Prohibiting the Posting of Screenshots

You can prevent other applications (such as system applets like Miiverse) from posting images
created when transitioning to the HOME Menu.

Code 5-6. Setting and Getting Screen Capture Post Permissions

nn::Result nn::applet::SetScreenCapturePostPermission(
const nn::applet::ScreenCapturePostPermission permission);
nn::Result nn::applet::GetScreenCapturePostPermission(
nn::applet::ScreenCapturePostPermission* pPermission);

You can specify whether posting is allowed with the SetScreenCapturePostPermission()
function. Two values can be specified for the permission argument:
SCREEN_CAPTURE_POST_ENABLE or SCREEN_CAPTURE_POST_DISABLE. Specifying any other
values is prohibited. If the application is restarted with the
nn::applet::RestartApplication() function, the initial values (see Table 5-3) are restored.

You can get the current setting with the GetScreenCapturePostPermission() function.
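
A minimal usage sketch follows. It assumes it runs after nn::applet::Enable() during
initialization, and that the enumerator is in the nn::applet namespace.

nn::Result result = nn::applet::SetScreenCapturePostPermission(
                        nn::applet::SCREEN_CAPTURE_POST_DISABLE); // Prohibit posting.
// Check 'result' for success as needed.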

Table 5-3. Values Used When Setting and Getting Screen Capture Post Permissions

Definition                               Description

SCREEN_CAPTURE_POST_NO_SETTING           (Specifying this value is prohibited.) The default
                                         value, where posting is allowed.

SCREEN_CAPTURE_POST_NO_SETTING_OLD_SDK   (Specifying this value is prohibited.) The default
                                         value obtained by applications released under an old
                                         version of the SDK, where posting is allowed. Posting
                                         screenshots is prohibited when the camera is in use.
                                         The camera is in use after nn::camera::Initialize
                                         is called to initialize the camera, until
                                         nn::camera::Finalize is called to close the camera
                                         from the CAMERA library.

SCREEN_CAPTURE_POST_ENABLE               Posting is allowed.

SCREEN_CAPTURE_POST_DISABLE              Posting is prohibited.

Note: Contact Nintendo support when you want to disable screenshot posting to Miiverse
from an application using a CTR-SDK version earlier than 7.x.

5.3.3. Sleep Handling

As with the DS, on 3DS an application can determine whether to transition to Sleep Mode when the
system is closed while the application is running. However, with 3DS it is possible to transition to
Sleep Mode even if the application is suspended, such as when the HOME Menu is being
displayed.

The following figure indicates sleep state transitions as the relationship between the related user
operations, 3DS state, and callback function association. This section describes how to handle
sleep transitions focused on these callback functions and then provides information associated with
sleep.

Figure 5-2. Sleep State Transitions


[Link]. Sleep Query Callback

When the system is closed while an application is running, the Applet Manager queries the
application about whether to transition to Sleep Mode in response to the system being closed. The
application replies to this query using the registered callback function.

Code 5-7. Setting a Sleep Query Callback Function

void nn::applet::SetSleepQueryCallback(
nn::applet::AppletSleepQueryCallback callback, uptr arg=0);
typedef nn::applet::AppletQueryReply
(*nn::applet::AppletSleepQueryCallback)(uptr arg);

Set a callback function by calling the nn::applet::SetSleepQueryCallback() function.


Calls to the callback function pass the value for the arg parameter that was originally specified to
SetSleepQueryCallback.

Through the value it returns from this query, the application notifies the Applet Manager whether
to transition immediately to Sleep Mode.

Table 5-4. Values Specifiable as the Return Value of the Sleep Query Callback Function

Definition Description

REPLY_REJECT Rejects the transition to sleep.


REPLY_ACCEPT Accepts transition to sleep.

REPLY_LATER Postpones the transition to sleep.

Return REPLY_REJECT when the application continues with wireless communication or sound
playback after the system is closed. Return REPLY_ACCEPT to transition directly to Sleep Mode,
but note that the system does so immediately and the transition may not necessarily occur at a
time that is safe for graphics and other processing that is underway. Return REPLY_LATER to
postpone transitioning to Sleep Mode until the application can safely pause other processing.
However, when doing so, the application must call nn::applet::ReplySleepQuery() and
pass REPLY_ACCEPT as an argument as soon as it is safe to transition to sleep, and then use an
event class or another mechanism to wait for the system to wake up again.

When REPLY_LATER is returned, the system is in a state where it is ready to transition to Sleep
Mode, such as turning off LCD power. Also, because the Applet Manager is in a semi-stopped
state where it cannot perform HOME Button or POWER Button processing, use as little
processing time as possible when stopping other processes. However, there is no problem with
continuing the postponed state for several tens of seconds as long as the sleep cancel callback
(see [Link]. Sleep Cancel Callback) is used to appropriately handle cases where the system is
opened while sleep is postponed. If you do not use a sleep cancel callback to handle such cases,
system behavior is not considered a violation of the CTR Guidelines as long as sleep is
postponed for no longer than four seconds. If you want to prohibit sleep for longer than four
seconds, use a combination of REPLY_REJECT and the nn::applet::EnableSleep() function
(described below), and reissue the sleep query callback.

Warning: Do not accept transitions to Sleep Mode while initializing local communications
(while executing the nn::uds::Initialize() function). Conflicts with wireless
communications state transitions could result in the possibility of Sleep Mode
transition not occurring correctly. Also, when you transition to Sleep Mode without
finalizing local communications, the UDS library forcibly disconnects local
communications and transitions to an error state. Once in this error state, UDS library
functions, with some exceptions, will return errors until finalization takes place.

When calling functions in the SND library from threads other than the main or sound
threads, you must apply appropriate controls so that no SND library functions are
called between when the transition to sleep is accepted with REPLY_ACCEPT and when
the system recovers from the sleep state.

In some cases, sound thread processing can stall during the transition to the sleep
state. In these cases, if the time between calls to the nn::snd::WaitForDspSync()
and nn::snd::SendParameterToDsp() functions (which are normally made every five
milliseconds) exceeds 100 milliseconds, the sound thread sometimes remains stopped
when the system recovers from the sleep state. There have been infrequent reports of
threads stopping in this way in Debug builds only.

If the battery runs out during sleep, the system never recovers from sleep and
finalization is impossible. Also, in some cases the game card will be removed during
sleep. You must keep both of these cases in mind and implement your code so that
there is no problem even if the application terminates on an error during sleep.

If a sleep query callback function has not been registered by the application, status is the same
as when a callback function that returns REPLY_REJECT has been registered.

Callbacks While an Application Is Suspended

The sleep query callback function can be called even if the application is not running, such as
when the HOME Menu is being displayed. The nn::applet::IsActive() function can be used
to check whether an application is running. Return REPLY_ACCEPT when it is determined in the
sleep query callback function that the application is suspended.

Code 5-8. Determining Whether an Application Is Running

bool nn::applet::IsActive(void);

If the return value is true, the application is running. If the value is false, it is suspended.

The state in which application operations are suspended is called the Inactive state (as opposed
to the Active state, when the application is running). While the HOME Menu or a library applet is
being displayed, threads other than the one the application used to call the function that displays
it, or that waits for it to complete, can continue to run. Also, the application is in the Inactive state
from startup until nn::applet::Enable is called, although no sleep-related callbacks are issued
during that interval.

During the Inactive state, the VSync callback and other callbacks, including the sleep query
callback, can be issued as usual. However, because the main thread is usually implemented to
stop in the Inactive state, in some cases an unintentional deadlock occurs because a sleep query
callback cannot be handled properly. For example, when sleep handling is performed by the main
thread and REPLY_LATER is returned while in the Inactive state, there is no opportunity to accept
the transition to the sleep state because the main thread is stopped, and the application stalls in
that state.

Except for the case in which REPLY_REJECT is always returned during communications, we
recommend stopping access to the FS library before entering the Inactive state, such as when
displaying the HOME Menu, and always returning REPLY_ACCEPT to sleep queries while in the
Inactive state. When creating threads that run even during the Inactive state, there is no problem
in using the same processing as in the Active state as long as there is appropriate sleep
handling.

Controlling Sleep Responses

You can control whether the application supports transition to Sleep Mode using the
nn::applet::EnableSleep and nn::applet::DisableSleep() functions.

Code 5-9. Controlling Sleep Mode Support

void nn::applet::EnableSleep(
bool isSleepCheck=nn::applet::SLEEP_IF_SHELL_CLOSED);
void nn::applet::DisableSleep(
bool isReplyReject=nn::applet::REPLY_REJECT_IF_LATER);

The nn::applet::EnableSleep() function makes the return value specified by the sleep
query callback function take effect, allowing control to move to the Sleep Mode transition
sequence. If SLEEP_IF_SHELL_CLOSED is specified in isSleepCheck, the system status is
checked when the function is called, and the sleep query callback function is called if the system
is closed. If NO_SHELL_CHECK is specified in isSleepCheck, operation is the same except that
the system status is not checked and no callback function is called. Note that when sleep-related
callbacks are enabled with nn::applet::Enable(true), EnableSleep(NO_SHELL_CHECK) is
executed internally.

If nn::applet::EnableSleep() is called with SLEEP_IF_SHELL_CLOSED specified in
isSleepCheck, you do not need to consider the case of the system being opened while
transitioning to the sleep state, because a sleep query callback is issued while the system is
closed. If the sleep query callback returns REPLY_REJECT rather than REPLY_LATER, the
application performs whatever processing it needs while sleep is being rejected. In other words,
after REPLY_REJECT is returned from the sleep query callback, no further sleep query callbacks
are issued while the system remains closed; however, by calling
EnableSleep(SLEEP_IF_SHELL_CLOSED) once the application is ready to transition to the sleep
state, the sleep query callback can be issued again while the system is still closed.
EnableSleep() can be called at any time, even if DisableSleep() has not been called.
However, depending on the Applet Manager state, several milliseconds may pass from when the
function is called until it returns, so we do not recommend calling it every frame.

Warning: When the system is closed while an application is running, it is the equivalent of
returning REPLY_REJECT to the sleep query until nn::applet::Enable is called.
For applications that support sleep, call EnableSleep(SLEEP_IF_SHELL_CLOSED)
after Enable to be able to reissue sleep query callbacks even if the system is closed
during operations.

The nn::applet::DisableSleep() function disables the return value specified by the sleep
query callback function, setting the reply to REPLY_REJECT regardless of which return value is
specified. If REPLY_REJECT_IF_LATER is specified in isReplyReject, the
nn::applet::ReplySleepQuery(REPLY_REJECT) function is executed and the request is
rejected if the system has not already attempted to enter Sleep Mode. If NO_REPLY_REJECT is
specified in isReplyReject, operation is the same except that the
ReplySleepQuery(REPLY_REJECT) function is not executed.
A sleep query callback can still be issued when the system is closed after DisableSleep has
been called, but the callback function's return value is ignored and it is treated as if
REPLY_REJECT were always returned. However, note that the return values of sleep query
callbacks called while the application is suspended (the state in which nn::applet::IsActive
returns false) are valid (they can return values other than REPLY_REJECT). Also, DisableSleep
does not have an internal counter, so even if it is called multiple times, calling EnableSleep once
makes the callback function's return value valid again.

When calling a function that suspends the application, such as displaying the HOME Menu
(nn::applet::ProcessHomeButton) or the POWER Menu
(nn::applet::ProcessPowerButton), or starting a library applet, be sure to first reject the
sleep query with DisableSleep before the call, and then call EnableSleep after the application
resumes.
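
A minimal sketch of that ordering, using the HOME Menu as an example (the close-request check
afterward follows the handling described in section 5.3.5):

nn::applet::DisableSleep();               // Reject sleep while the application is suspended.
nn::applet::ProcessHomeButtonAndWait();   // Transfer control to the HOME Menu and wait.
if (nn::applet::IsExpectedToCloseApplication())
{
    // Perform application shutdown processing here.
}
nn::applet::EnableSleep(nn::applet::SLEEP_IF_SHELL_CLOSED);  // Re-enable after resuming.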

Warning: If the DisableSleep(REPLY_REJECT_IF_LATER) function is executed before
performing application shutdown processing, execution may stop during Sleep Mode if
the system is closed during shutdown processing.

Checking the Notification State and Replying to Postponed Queries

To determine whether transition to sleep has been postponed in the application, call
nn::applet::IsExpectedToReplySleepQuery. If this function returns true, the application
must determine whether it is ready to transition to the sleep state and must use the
nn::applet::ReplySleepQuery() function to reply to the postponed query.

In your implementation of handling transitions to sleep state, we recommend that you call the
IsExpectedToReplySleepQuery() function periodically (such as once per frame) and then
transition to sleep state at some predetermined time.

Code 5-10. Checking Sleep Notification State and Replying to Postponed Queries

bool nn::applet::IsExpectedToReplySleepQuery(void);
void nn::applet::ReplySleepQuery(nn::applet::AppletQueryReply reply);
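
A minimal per-frame check might look like the following. _app_isReadyToSleep is a hypothetical
application-side test, and the reply value is used unqualified as in the sample code later in this
section.

if (nn::applet::IsExpectedToReplySleepQuery())
{
    if (_app_isReadyToSleep())
    {
        nn::applet::ReplySleepQuery(REPLY_ACCEPT);  // Safe to sleep now.
    }
    // Otherwise keep postponing and check again next frame.
}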

[Link]. Sleep Recovery Callback

The application uses a callback function to receive notification that the system has opened and
the system has recovered from sleep.

Code 5-11. Setting a Sleep Recovery Callback Function

void nn::applet::SetAwakeCallback(nn::applet::AppletAwakeCallback callback, uptr arg=0);
typedef void (*nn::applet::AppletAwakeCallback)(uptr arg);

Call nn::applet::SetAwakeCallback to set a sleep recovery callback function. Calls to the


callback function pass the value for the arg parameter that was originally specified to
SetAwakeCallback.

Have this callback function perform only simple processing, such as signaling the event class
instance that is waiting for the application to resume. This callback function is also called
immediately if the system does not enter the sleep state even though it was closed, for example
because REPLY_REJECT was returned by the sleep query callback function. Consequently, you
cannot determine from this callback alone whether the system has actually been opened.

Warning: The LCD displays are turned off when the system transitions to Sleep Mode.
Consequently, when the system is ready to display a screen after waking up, the
screens do not show anything until the nn::gx::StartLcdDisplay() function is
called. Note, however, that the screen display may momentarily distort if the
nn::gx::StartLcdDisplay() function is called while recovering from sleep while
library applets are executing. If this happens, make the call after preparations for
screen display have been completed and a state allowing normal display has been
established.

[Link]. Sleep Cancel Callback

You can use the nn::applet::SetSleepCanceledCallback() function to register a callback


function to be called in case sleep is canceled. Sleep cancellation occurs when the system has
been closed and is opened again before the application can transition to Sleep Mode.

Code 5-12. Registering a Sleep Cancel Callback Function

void nn::applet::SetSleepCanceledCallback(
nn::applet::AppletSleepCanceledCallback callback, uptr arg=0);
typedef void (*nn::applet::AppletSleepCanceledCallback)(uptr arg);

Use this callback function in cases such as when halting the sleep process by detecting if the
system has been opened while performing processing that takes time. This can happen in cases
such as saving data before entering Sleep Mode after the system is closed.

If nothing is done inside this callback function, the transition to Sleep Mode remains postponed. In
other words, Sleep Mode will not be canceled unless the ReplySleepQuery(REPLY_REJECT)
function is called explicitly. Also, note that the sleep recovery callback function will be called
even if this callback function is called.
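
A minimal sketch of such a callback follows. s_SleepCanceled is a hypothetical flag that the
application's sleep-handling code checks before replying to the postponed query.

static volatile bool s_SleepCanceled = false;

void MySleepCanceledCallback(uptr arg)
{
    // The system was opened again before the transition to Sleep Mode completed.
    s_SleepCanceled = true;
}

// Registration, typically alongside the other applet callbacks:
// nn::applet::SetSleepCanceledCallback(MySleepCanceledCallback);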

This callback function does not need to be registered when transitioning to Sleep Mode
immediately or when the time required for other processing to complete is short. In these cases,
you may consider not implementing the sleep cancel callback feature.

Normally, a sleep cancel callback is issued only when REPLY_LATER is returned. However, when
the system is quickly opened and closed, a sleep cancel callback may also be issued after
REPLY_ACCEPT or REPLY_REJECT is returned. The case in which the system is opened before a
reply is sent to the sleep query callback may be difficult to reproduce during debugging, because
it only occurs when the cover is opened and closed quickly.

Sleep-related callbacks are not called simultaneously or in parallel. It is guaranteed that the
sleep cancel callback will be issued either zero times or one time between the sleep query
callback and the sleep recovery callback.

Note: The reference for the nn::applet::SetSleepCanceledCallback() function
shows a sample implementation.

[Link]. Prohibited Processes While in Sleep Mode

Note: Currently, no processes are prohibited to applications while the system is in Sleep
Mode.

[Link]. Device State When the System Is Closed

Some devices work differently when the system is closed depending on whether the argument
passed when calling nn::applet::ReplySleepQuery is REPLY_ACCEPT or REPLY_REJECT.

Table 5-5. Device State When the System Is Closed

LCD
    REPLY_ACCEPT: Display buffer updates and VSyncs are both stopped, and LCD displays are
    turned off.
    REPLY_REJECT: Display buffer updates are stopped and LCD backlights are turned off (but
    screen displays are maintained).

Digital buttons
    REPLY_ACCEPT: No input accepted.
    REPLY_REJECT: CTR only accepts input from the L and R Buttons. SNAKE accepts input from
    the L, R, ZL, and ZR Buttons.

Touch Panel
    REPLY_ACCEPT: Sampling values cannot be obtained.
    REPLY_REJECT: Returns an invalid sampling value (0).

Accelerometer
    REPLY_ACCEPT: Functions as a pedometer. Input values cannot be obtained.
    REPLY_REJECT: Functions as a pedometer. Input values can be obtained.

Gyro Sensor
    REPLY_ACCEPT: Suspended.
    REPLY_REJECT: Input values can be obtained.

Sound
    REPLY_ACCEPT: Suspended.
    REPLY_REJECT: Normally output is suspended from the speakers and forcibly output from the
    headphones, but you can make settings so that the output also comes from the speakers.

Camera
    REPLY_ACCEPT: Suspended.
    REPLY_REJECT: Suspended.

Microphone
    REPLY_ACCEPT: Suspended.
    REPLY_REJECT: Suspended.

Wireless Communication
    REPLY_ACCEPT: Suspended. Recommend suspending local communication ahead of time.
    REPLY_REJECT: Not suspended.

Infrared Communication
    REPLY_ACCEPT: Suspended. We recommend running the disconnect process and waiting for it
    to complete before entering Sleep Mode.
    REPLY_REJECT: Not suspended.

NFC
    REPLY_ACCEPT: Suspended.
    REPLY_REJECT: Suspended.

Threads created by an application, including the main thread, are all suspended when the system
transitions to Sleep Mode. If the system has not transitioned to Sleep Mode, the threads continue
running, but will no longer receive events or other notifications from suspended devices.

Warning: Unlike with the DS/DSi systems, note that sound is not normally output from the
speakers when the system is closed. For information about having output come from
the speakers, see 10.5.7. Sound Output When the System Is Closed and Sleep Is
Rejected.
5.3.4. Handling the POWER Button

When the POWER Button is pressed for 160 ms or longer, the Applet Manager notifies the
application, and the application uses the nn::applet::IsExpectedToProcessPowerButton()
function to determine whether it needs to handle the POWER Button press. Applications must call
this function periodically (such as once per frame). If this function returns true, call the
nn::applet::ProcessPowerButton() function as soon as feasible.

Code 5-13. Functions Used to Handle POWER Button Presses

bool nn::applet::IsExpectedToProcessPowerButton(void);
bool nn::applet::ProcessPowerButton(void);
void nn::applet::ProcessPowerButtonAndWait();

After you call the nn::applet::ProcessPowerButton() function, control transfers to the
system in order to display the POWER Menu. Because the POWER Menu is displayed, the
application does not need to display a special screen of its own when closing. Immediately after
calling this function, the application must begin waiting with the nn::applet::WaitForStarting()
function for permission to begin finalization. After processing returns from WaitForStarting,
because nn::applet::IsCloseApplication returns true in this state, perform the application
close process while referring to 5.3.1. Handling Application Close Requests.

The nn::applet::ProcessPowerButtonAndWait() function is a wrapper function that calls the
ProcessPowerButton() function and handles waiting and Sleep Mode.

Note: The ProcessPowerButton() function can only be called while the GX library can be
used (from when nngxInitialize completes until nngxFinalize is called).

Before you call the ProcessPowerButton() function, configure the display buffer,
swap buffers, and call the nngxStartLcdDisplay() function to start LCD output. If
LCD output has not been started, there is a chance that the Power Menu could start up
with black screens. Likewise, if the display buffer has not been configured or the buffers
have not been swapped, there is a chance that undefined content could be displayed on
the screens.

5.3.5. Example of Handling by the Application

Implement sleep or HOME Button handling based on the following points.

Perform the following determinations in locations where the main loop is called periodically.

Evaluate the nn::applet::IsExpectedToProcessHomeButton() function:
When true is returned, call nn::applet::ProcessHomeButton.
(When you do not want to or cannot display the HOME Menu, display the HOME Menu
Disabled icon according to the guidelines.)

Evaluate the nn::applet::IsExpectedToProcessPowerButton() function:
When true is returned, call nn::applet::ProcessPowerButton.

Evaluate the nn::applet::IsExpectedToCloseApplication() function:
When true is returned, close the application.

Before calling the nn::applet::ProcessHomeButton or
nn::applet::ProcessPowerButton() function, you must ensure that all render commands
have finished executing. Use the nngxWaitCmdlistDone() function to wait for currently
executing render commands to finish.

After you call the nn::applet::ProcessHomeButton or
nn::applet::ProcessPowerButton() function, you must always call the
nn::applet::WaitForStarting() function.

When control returns from nn::applet::Enable or nn::applet::WaitForStarting,
be sure to check nn::applet::IsExpectedToCloseApplication.

The following sample code shows how to handle sleeping and the HOME Button.

Code 5-14. Sample Code for Handling Different Situations

nn::os::LightEvent sAwakeEvent(true);
nn::os::LightEvent sTransitionEvent(true);
nn::os::CriticalSection sFileSystemCS(WithInitialize);

// Sleep Query callback


nn::applet::AppletQueryReply mySleepQueryCallback(uptr arg)
{
sAwakeEvent.ClearSignal();
return (nn::applet::IsActive() ? REPLY_LATER : REPLY_ACCEPT);
}

// Sleep wake up callback


void myAwakeCallback(uptr arg)
{
sAwakeEvent.Signal();
}

// Sleep
void sleepApplication()
{
// Lock to prevent Access from the FS library during sleep
// If lock fails, retry in next frame
if (sFileSystemCS.TryEnter())
{
_app_prepareToSleep();
nn::applet::ReplySleepQuery(REPLY_ACCEPT);
sAwakeEvent.Wait();
_app_recoverFromSleep();
sFileSystemCS.Leave();
nn::gx::StartLcdDisplay();
}
}

// Application finalization.
void exitApplication()
{
_app_finalize();
nn::applet::CloseApplication();
}

// Main loop.
void nnMain()
{
// Run normally when an event is in the signaled state.
sAwakeEvent.Signal();
sTransitionEvent.Signal();
// Set callbacks.
nn::applet::SetSleepQueryCallback(mySleepQueryCallback);
nn::applet::SetAwakeCallback(myAwakeCallback);
nn::applet::Enable(true);
// Handle close requests while the application is loading.
if (nn::applet::IsExpectedToCloseApplication())
{
exitApplication();
}
// Handle the system being closed while the application is loading.
nn::applet::EnableSleep(SLEEP_IF_SHELL_CLOSED);

while (true)
{
_app_exec();

// Reply to Sleep Query.


if (nn::applet::IsExpectedToReplySleepQuery())
{
if (_app_isRejectSleep())
{
// Notify of rejection if unable to enter Sleep Mode.
nn::applet::ReplySleepQuery(REPLY_REJECT);
} else {
sleepApplication();
}
}
// Application Close Requests
if (nn::applet::IsExpectedToCloseApplication())
{
exitApplication();
}
// HOME Menu
if (nn::applet::IsExpectedToProcessHomeButton())
{
if (_app_isSuppressedHomeButton())
{
_app_drawSuppressedHomeButtonIcon();
nn::applet::ClearHomeButtonState();
} else {
// During state transitions that stop the main thread,
// such as starting the HOME Menu or a library applet,
// set the event to an unsignaled state.
sTransitionEvent.ClearSignal();
// When the system enters Sleep Mode while the HOME Menu is displayed,
// lock to prevent access by the FS library.
// If unable to acquire lock, try again next frame.
if (sFileSystemCS.TryEnter())
{
_app_prepareToHomeButton();
nn::applet::ProcessHomeButtonAndWait();
sFileSystemCS.Leave();
if (nn::applet::IsExpectedToCloseApplication())
{
exitApplication();
}
sTransitionEvent.Signal();
// The GPU register settings must be restored.
_app_recoverGpuState();
}
}
}
// POWER Button
if (nn::applet::IsExpectedToProcessPowerButton())
{
// Lock to prevent access by the FS library.
// If unable to acquire lock, try again next frame.
if (sFileSystemCS.TryEnter())
{
nn::applet::ProcessPowerButtonAndWait();
sFileSystemCS.Leave();
if (nn::applet::IsExpectedToCloseApplication())
{
exitApplication();
}
// The GPU register settings must be restored.
_app_recoverGpuState();
}
}
}
}
// Lock as demonstrated below before accessing via the FS library
// from outside the main thread.
{
// If in Sleep Mode, stop the thread until the system wakes.
sAwakeEvent.Wait();
// Stop the thread until control returns to the main thread.
sTransitionEvent.Wait();
{
nn::os::CriticalSection::ScopedLock lock(sFileSystemCS);
//
// Write the process to access FS here.
//
}
}

5.3.6. Restarting an Application

You can call the nn::applet::RestartApplication() function to restart an application without
returning to the HOME Menu. You can specify the parameters to pass to the application after it is
restarted, and you can use the nn::applet::GetStartupArgument() function to get these
parameters.

Code 5-15. Functions Used to Restart an Application

nn::Result nn::applet::RestartApplication(
const void* pParam = NULL,
size_t paramSize = NN_APPLET_PARAM_BUF_SIZE);
bool nn::applet::GetStartupArgument(
void* pParam,
size_t paramSize = NN_APPLET_PARAM_BUF_SIZE);

Specify a byte array of parameters in pParam and the size, in bytes, of the parameters in
paramSize. The array cannot be larger than NN_APPLET_PARAM_BUF_SIZE.

The nn::applet::RestartApplication() function usually restarts an application without
returning. To find out whether this function was used to restart an application, check the return
value from the nn::applet::GetStartupArgument() function, which returns true when it can
get the parameters that were passed to the restarted application.
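
A minimal sketch of passing a parameter through a restart follows. RestartParam and its field are
hypothetical, and the parameter is assumed to fit within NN_APPLET_PARAM_BUF_SIZE.

struct RestartParam
{
    s32 stage;   // Hypothetical value carried across the restart.
};

void RestartWithStage(s32 stage)
{
    RestartParam param = { stage };
    nn::applet::RestartApplication(&param, sizeof(param));  // Usually does not return.
}

// After the restart (for example, early in nnMain after nn::applet::Enable):
// RestartParam param;
// if (nn::applet::GetStartupArgument(&param, sizeof(param)))
// {
//     // param.stage holds the value passed to RestartApplication.
// }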

5.3.7. Jump to System Settings

Applications can jump directly to the System Settings screens for Internet Settings, Parental
Controls, or Data Management.

Code 5-16. Jump to System Settings

nn::Result nn::applet::JumpToInternetSetting(void);
nn::Result nn::applet::JumpToParentalControls(
nn::applet::AppletParentalControlsScene scene =
nn::applet::CTR::PARENTAL_CONTROLS_TOP);
nn::Result nn::applet::JumpToDataManagement(
nn::applet::AppletDataManagementScene scene =
nn::applet::CTR::DATA_MANAGEMENT_STREETPASS);
bool nn::applet::IsFromMset(nn::applet::AppletMsetScene* pScene = NULL);

An application can jump to the Internet Settings, Parental Control, or Data Management settings
screens by calling the nn::applet::JumpToInternetSetting,
nn::applet::JumpToParentalControls, or nn::applet::JumpToDataManagement()
function, respectively. The application shuts down before jumping to System Settings, so perform
shutdown processing in advance. Any failure in calls to these functions results in a fatal error. Also,
regardless of whether these functions are successful, control does not return to the application.
The Parental Control screen contains multiple configuration items.
A parameter is provided to allow selection of the jump destination.

Table 5-6. Specifying Jump Destination Screen

Definition Jump Destination Screen


PARENTAL_CONTROLS_TOP Parental control main menu
PARENTAL_CONTROLS_COPPACS Parental control COPPACS authentication processing screen

DATA_MANAGEMENT_STREETPASS The StreetPass screen of the Data Management settings

When the user exits System Settings after one of these jump functions is used, the application is
restarted. During a restart after exiting System Settings, a call to the
nn::applet::IsFromMset() function returns true. To determine which System Settings scene
was jumped to, pass the variable that is to receive the jump target in the pScene parameter. Call
the nn::applet::IsFromMset() function after the nn::applet::Enable() function is called.
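
A minimal check after a restart might look like the following (assumed to run after
nn::applet::Enable()):

nn::applet::AppletMsetScene scene;
if (nn::applet::IsFromMset(&scene))
{
    // The application was restarted after returning from System Settings;
    // 'scene' identifies the settings screen that was jumped to.
}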

5.3.8. Initial Parameters That Can Be Obtained at Startup

In cases where restarts happen from applications or from System Settings after a jump, the
following functions determine how the application was restarted.

Table 5-7. Functions That Get the Initial Parameters When Restarting

Function                         Initial Parameters

nn::applet::GetStartupArgument   Parameters specified with the nn::applet::RestartApplication()
                                 function.

nn::applet::IsFromMset           The System Settings screen jumped to from the application.

nn::news::IsFromNewsList         The type of notification and parameters set. The user is notified
                                 if Start Software is selected from the notifications list.

nn::friends::IsFromFriendList    Friend's friend key. The user is notified if Join Game is
                                 selected.

5.3.9. Jump to Nintendo eShop

You can also make jumps from applications to pages in Nintendo eShop.

Code 5-17. Verifying the Nintendo eShop Installation

bool nn::applet::IsEShopAvailable();

Depending on which system updates have been applied, Nintendo eShop might not be installed on
the user's 3DS system. Be sure to call the nn::applet::IsEShopAvailable() function before
any jump to make sure that Nintendo eShop has been installed.

Note: If Nintendo eShop is not on the user's system, notify the user that the system needs to
be updated on the Internet.
For example, display the message, "Nintendo eShop is unavailable. Please update your
system on the Internet to use Nintendo eShop."

Note: The install status does not need to be checked for downloadable applications.

Code 5-18. Jump to Nintendo eShop (Details Page)

nn::Result nn::applet::JumpToEShopTitlePage(bit32 uniqueId);

The page shown after the jump is the details page for the title with the unique ID specified by the
uniqueId parameter. If the title specified is not one that can be publicly searched (because its
name is not shown in searches), the page is not shown. Also, if the uniqueId parameter is set to a
unique ID for a title that has not been registered in Nintendo eShop or an invalid value, an error is
shown in Nintendo eShop.
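
A minimal sketch of the jump follows. TITLE_UNIQUE_ID is a hypothetical placeholder for the
target title's unique ID, and exit processing is assumed to run before the call as described later in
this section.

const bit32 TITLE_UNIQUE_ID = 0x00000;   // Hypothetical placeholder value.

if (nn::applet::IsEShopAvailable())
{
    // Run application exit processing here; the jump does not return.
    nn::applet::JumpToEShopTitlePage(TITLE_UNIQUE_ID);
}
else
{
    // Prompt the user to update the system over the Internet.
}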

Code 5-19. Jump to Nintendo eShop (Patch Page)

nn::Result nn::applet::JumpToEShopPatchPage(bit32 uniqueId);

When an error indicating that the application must be updated, such as
nn::act::ResultApplicationUpdateRequired, is returned from the authentication server or
account server, use this function to navigate to the patch page for the title in Nintendo eShop.

The page shown after the jump is the patch page for the title with the unique ID specified by the
uniqueId parameter.

There is no patch page in Nintendo eShop for a title whose application updates are handled by
remaster. In this case, a "Title does not exist" error is displayed and the Nintendo eShop startup
screen appears. For titles updated by remaster, use nn::applet::JumpToEShopTitlePage()
instead.

Because these functions that jump to Nintendo eShop end the application when they are called, run
exit processing before calling them. The application is not restarted after Nintendo eShop is shut
down.

Note: When using this function to jump, submit the jump target title to OMAS. For more
information, see the Guidelines: e-Commerce.

5.3.10. Jump to E-manual

It is possible to jump from an application to the e-manual that users start from the HOME Menu.

Code 5-20. Jump to E-manual

void nn::applet::JumpToManual();

After the jump, the screen shows the application's manual.

The function jumps after suspending the application, so after calling the function, use the
nn::applet::WaitForStarting() function to wait for resumption from the HOME Menu. The
HOME Menu resumes the application after the e-manual closes.
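
A minimal sketch of the call sequence follows; the close-request check afterward follows the
general rule for WaitForStarting described in section 5.3.5.

nn::applet::JumpToManual();        // Suspends the application and opens the e-manual.
nn::applet::WaitForStarting();     // Wait until the HOME Menu resumes the application.
if (nn::applet::IsExpectedToCloseApplication())
{
    // Handle a close request issued while the application was suspended.
}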

5.4. Initializing the FS Library

In many cases, the next library to initialize is the FS library, which allows access to files on media.
Call the nn::fs::Initialize() function to initialize the FS library. You must use the classes
provided by the FS library to access files on media.
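
Initialization itself is a single call; a minimal sketch:

// Typically done once, early in nnMain, before any file access.
nn::fs::Initialize();
// Archives (such as the ROM archive or save data) are mounted afterward using
// the FS library; see the File System chapter for details.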

5.5. Initializing the GX Library

Use the GX library to draw to the LCD screen or render 3D graphics. Call the nngxInitialize()
function to initialize the GX library. During initialization, you must specify functions for handling
memory allocation and release requests from the library.

After calling the nngxInitialize() function, do any other required operations, such as creating
command-list objects for executing graphics commands or allocating memory for the display and
render buffers for displaying on the LCD.

Note that the 3DS LCD layouts and resolutions differ from those of the NTR or TWL, as shown in
Figure 5-3.

Figure 5-3. LCD Layouts and Display Resolutions

This figure shows an implementation example in a sample program. For more information about the
features and settings used, see the separate 3DS Programming Manual: Basic Graphics and the
CTR-SDK API Reference. For more information about stereoscopic display, see the 3DS
Programming Manual: Advanced Graphics.

5.5.1. Initializing the Library

When calling the nngxInitialize() function, you must specify the memory allocator and
deallocator functions that will handle memory requests from the library.

Code 5-21. Initializing the GX Library

void SetupNngxLibrary(const uptr fcramAddress, const size_t memorySize)


{
InitializeMemoryManager(fcramAddress, memorySize);

if (nngxInitialize(GetAllocator, GetDeallocator) == GL_FALSE)


{
NN_PANIC("nngxInitialize() failed.\n");
}
}

[Link]. Allocator

The memory accessed by the GPU for the texture image data and rendering buffer needed for
graphics processing is allocated using the allocator function specified in the call to the
nngxInitialize() function.

The allocator takes four arguments.

First Argument (Allocation Memory Space)

The first argument specifies the memory space for allocation, passed as a GLenum type. The
memory space for allocation depends on the value passed.

When passing NN_GX_MEM_FCRAM, the memory region is allocated from main memory, and in
such cases, this memory must be allocated from the device memory portion of main memory.
Device memory is a region in main memory for which the operating system guarantees address
integrity when it is accessed by both peripheral devices and the main process. For more
information, see 3.1.2. Device Memory.

When passing NN_GX_MEM_VRAMA or NN_GX_MEM_VRAMB, the memory region is allocated from


VRAM-A or VRAM-B, respectively. Call the nngxGetVramStartAddr() function to get the
starting address and the nngxGetVramSize() function to get the size, passing
NN_GX_MEM_VRAMA or NN_GX_MEM_VRAMB as appropriate.

Second Argument (Memory Region Use)

The second argument specifies the memory region use, passed as a GLenum type. The memory
region byte alignment depends on the value passed.

When passing NN_GX_MEM_TEXTURE, the memory region is used for texture image data. The
region is 128-byte aligned regardless of the data format.

When passing NN_GX_MEM_VERTEXBUFFER, the memory region is used for vertex buffers.
Depending on the data stored, the memory may be 1-, 2-, or 4-byte aligned, but because the data
type for storage is not passed to the allocator, we recommend an implementation that allocates at
the maximum 4-byte alignment.

When passing NN_GX_MEM_RENDERBUFFER, the memory region is used for render buffers (color,
depth, and stencil). The alignment may be 32-, 64-, or 96-byte aligned depending on the bits per
pixel (16, 24, or 32), but again the format is not passed to the allocator. Consequently, either use
a fixed format for the render buffer used by the application, or use an implementation that
allocates at 192-byte alignment, which is the least common multiple.
When passing NN_GX_MEM_DISPLAYBUFFER, the memory region is used for display buffers. The
region is 16-byte aligned regardless of the data format. When allocating in VRAM, do not
allocate the last 1.5 MB.

When passing NN_GX_MEM_COMMANDBUFFER, the memory region is used for command lists. The
region is 16-byte aligned.

When passing NN_GX_MEM_SYSTEM, the memory region is used for library system memory.
Depending on the allocation size, the memory may be 1-, 2-, or 4-byte aligned, but to simplify
matters, we recommend an implementation that allocates at the maximum 4-byte alignment.

Third Argument (Object Name)

The third argument specifies the name (ID) of the object, passed as a GLuint type. It is passed
when the second argument is a value other than NN_GX_MEM_SYSTEM and is used when
managing memory.

Fourth Argument (Memory Region Size)

The fourth argument specifies the memory region size, passed as a GLsizei type. Allocate
memory of the specified size.

The application allocates memory of the appropriate alignment and size, and passes the starting
address to the library as a void* type. Memory management must be handled by the application,
such as remembering these four arguments and the starting address of the memory region as a
set, and releasing memory using the deallocator function described later.

[Link]. Deallocator

The deallocator function specified in the call to nngxInitialize is called to release the
memory allocated by the allocator. Four arguments are passed to the deallocator. The first three
have the same values as the first three arguments passed to the allocator function, and the last
argument specifies the starting address of the memory region to release.

The application must take the argument values to identify the memory region allocated by the
allocator and release it.
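
A matching deallocator sketch, under the same assumptions as the allocator sketch above:

void GetDeallocator(GLenum area, GLenum aim, GLuint id, void* addr)
{
    (void)aim;
    (void)id;

    if (area == NN_GX_MEM_FCRAM)
    {
        s_AppHeap.Free(addr);   // Assumed Free(void*) on the FND heap class.
    }
    // VRAM regions would be returned to the application's VRAM allocator here.
}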

5.5.2. Command-List Object Creation

After the call to nngxInitialize() completes, the command-list objects required to run the gl
and nngx library functions called for graphics processing must be created.

Create a command-list object using the nngxGenCmdlists() function, specify the current
command list using the nngxBindCmdlist() function, and then use the nngxCmdlistStorage()
function to allocate the memory for the 3D command buffer and for accumulating command
requests.

Code 5-22. Creating Command List Objects


void CreateCmdList()
{
nngxGenCmdlists(1, &m_CmdList);
nngxBindCmdlist(m_CmdList);
nngxCmdlistStorage(256*1024, 128);
nngxSetCmdlistParameteri(NN_GX_CMDLIST_RUN_MODE, NN_GX_CMDLIST_SERIAL_RUN);
}

A 3D command buffer of 256 KB and one command list that can accumulate up to a maximum of
128 command requests are allocated in this code example. However, it is possible to allocate
multiple command lists and use them by switching between them each frame. When doing so, note
that the command lists must be executed in the order that 3D commands were accumulated.

5.5.3. Allocating Memory for the Display Buffer and Render Buffer

Allocate memory for the display buffer, which is used for displaying to the LCD, and for the render
buffer, which is the render target.

To allocate a framebuffer, you must bind each of the render buffers (color, depth, stencil) to a
framebuffer object. During rendering, color data is written to the color buffer, depth data is written
to the depth buffer, and stencil data is written to the stencil buffer. Note that the stencil buffer must
share a buffer with the depth buffer.

If the format is the same for the upper and lower screens and there is no need for rendering in
parallel, we recommend sharing the same framebuffer object and render buffer between the upper
and lower screens to save memory. When doing so, set the buffer width and height large enough to
accommodate both screens.

Code 5-23. Allocating Render Buffers

void CreateRenderbuffers(
GLenum format, GLsizei width, GLsizei height)
{
glGenFramebuffers(1, &m_FrameBufferObject);
glGenRenderbuffers(2, m_RenderBuffer);

glBindRenderbuffer(GL_RENDERBUFFER, m_RenderBuffer[0]);
glRenderbufferStorage(GL_RENDERBUFFER | NN_GX_MEM_VRAMA, format,
width, height);
glBindFramebuffer(GL_FRAMEBUFFER, m_FrameBufferObject);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
GL_RENDERBUFFER, m_RenderBuffer[0]);
glBindRenderbuffer(GL_RENDERBUFFER, m_RenderBuffer[1]);
glRenderbufferStorage(GL_RENDERBUFFER | NN_GX_MEM_VRAMB,
GL_DEPTH24_STENCIL8_EXT, width, height);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT,
GL_RENDERBUFFER, m_RenderBuffer[1]);
}

This code sample allocates the color buffer in VRAM-A and the depth/stencil buffer in VRAM-B.

Under the usual framebuffer architecture, you can display the contents of the color buffer as is, but
the 3DS color buffer data format is a block format and cannot be displayed on the LCD without first
converting to a linear format. Consequently, in the 3DS system, the display buffer is between the
color buffer and the actual display on the LCD. The contents of the color buffer are copied to the
display buffer and then converted. You can also apply hardware-based anti-aliasing and vertical
flipping during this process.

Initialization only allocates memory for the display buffer.


Code 5-24. Allocating Display Buffers

void CreateDisplaybuffers(
GLenum format0, GLsizei width0, GLsizei height0, GLenum area0,
GLenum format1, GLsizei width1, GLsizei height1, GLenum area1)
{
// Upper Screen (DISPLAY0)
nngxActiveDisplay(NN_GX_DISPLAY0);
nngxGenDisplaybuffers(2, m_Display0Buffers);
nngxBindDisplaybuffer(m_Display0Buffers[0]);
nngxDisplaybufferStorage(format0, width0, height0, area0);
nngxBindDisplaybuffer(m_Display0Buffers[1]);
nngxDisplaybufferStorage(format0, width0, height0, area0);
nngxDisplayEnv(0, 0);
// Lower Screen (DISPLAY1)
nngxActiveDisplay(NN_GX_DISPLAY1);
nngxGenDisplaybuffers(2, m_Display1Buffers);
nngxBindDisplaybuffer(m_Display1Buffers[0]);
nngxDisplaybufferStorage(format1, width1, height1, area1);
nngxBindDisplaybuffer(m_Display1Buffers[1]);
nngxDisplaybufferStorage(format1, width1, height1, area1);
nngxDisplayEnv(0, 0);
}

This code example uses two display buffers per LCD for multi-buffering. Display buffers are the
only buffers where multiples must be allocated for multi-buffering.

The display buffers may be allocated from main memory (device memory). If you use a display
buffer format that requires more bits per pixel than the format for the color buffer, copying from the
color buffer causes an error.

This is the extent of initialization required for display to the LCD screens. At this point, run
nngxRunCmdlist once to be sure that each buffer has been allocated.

5.6. Memory Management

Heap memory, device memory, and VRAM allocated for the application must be managed by the
application itself. Classes are provided by CTR-SDK for memory management by applications.

5.6.1. Memory Blocks

The memory block feature is equivalent to the concept of an arena used with the NTR/TWL and
Revolution systems. After a memory region of a particular size is allocated, memory can be cut
from the memory block using the heap classes of the FND library (described later) and used.
Instances of the heap classes defined in the FND library can be created by specifying a memory
block as an argument.

Memory blocks are allocated in units of 4096 bytes. Memory blocks are generally used for
allocating work memory or stacks from the heap memory available to applications, when allocations
from device memory cannot be used. There are also libraries for simplifying the implementation of
applications through the use of memory blocks.

Memory block allocation is handled by creating an instance of the nn::os::MemoryBlock class,
or nn::os::StackMemoryBlock or nn::os::StackMemory class for stacks. Before creating an
instance of these classes, you must call the nn::os::InitializeMemoryBlock() function to
specify the memory region to use as a memory block.

Code 5-25. Specifying a Memory Region to Use as a Memory Block

void nn::os::InitializeMemoryBlock(uptr begin, size_t size);

Specify the start address of the memory region in begin, and the size of the memory region in
size. Both must be aligned to nn::os::MEMORY_BLOCK_UNITSIZE (4096 bytes).

Either create an instance of the nn::os::MemoryBlock class (nn::os::StackMemoryBlock
class for stacks) using a memory region of the size to be allocated for the memory block, or
allocate the memory block using the Initialize() function after creating an empty instance. Size
specifications must be made in units of 4096 bytes.

You can get the start address of the allocated memory block using the GetAddress() function and
the size using the GetSize() function. In addition, the GetStackBottom and GetStackSize
interface functions are provided for stacks so that you can directly pass stacks to threads. The
SetReadOnly and IsReadOnly() functions can be used to get and set read-only attributes.
However, there are no versions of these two functions for use with stacks.

When a memory block is no longer needed, it can be explicitly deallocated by calling the
Finalize() member function.

Note: The nn::os::SetupHeapForMemoryBlock() function remains because it is used in
some sample demos. However, we recommend not calling it when using memory blocks.

We recommend not using the nn::os::MemoryBlock and nn::os::StackMemoryBlock
classes in your application, because future updates could potentially change the classes
in a way that breaks your code.

5.6.2. Frame Heaps

A frame heap is a memory management class used to cut a memory region of a specified size out
of a larger memory region specified at initialization time. In addition to using byte boundaries for
the alignment specification, you can select to allocate memory from the beginning of the heap or
the end of the heap depending on the sign used. Only when allocating memory from the beginning
of the heap can you change the size of the memory region ultimately allocated.

The allocated memory regions cannot be deallocated individually. When deallocating memory, all
allocated memory regions, or only the part allocated from the beginning, or only the part allocated
from the end can be deallocated at the same time.

Instances of a frame heap can be created using a class template
(nn::fnd::FrameHeapTemplate) that locks operations, or with nn::fnd::FrameHeap, which
does not lock operations, or with nn::fnd::ThreadSafeFrameHeap, which is thread-safe (lock
operations are nn::os::CriticalSection).

5.6.3. Unit Heaps

A unit heap is a memory management class used to cut a fixed size memory region (called a unit)
from a larger memory region specified at time of initialization. Alignment is specified at time of
initialization, with 4-byte alignment specified by default. Consecutively allocated memory regions
are not necessarily allocated from a continuous memory region.
Allocated memory regions can be deallocated independently. You can free all memory regions you
have allocated by calling the Invalidate() function followed by the Finalize() function. This
terminates the heap. To continue using the heap, first rebuild it by calling Initialize.

Instances of a unit heap can be created using a class template (nn::fnd::UnitHeapTemplate)


that locks operations, or with nn::fnd::UnitHeap, which does not lock operations, or with
nn::fnd::ThreadSafeUnitHeap, which is thread-safe (lock operations are
nn::os::CriticalSection).

5.6.4. Expanded Heaps

An expanded heap is a memory management class used to cut a memory region of a specified size
from a larger memory region specified at time of initialization. In addition to using byte boundaries
for the alignment specification, you can select to allocate empty memory by searching from the
beginning of the heap or searching from the end of the heap depending on the sign used. You can
also change the size of the allocated memory region.

Allocated memory regions can be deallocated independently. If you repeatedly allocate and
deallocate memory, there is a possibility that it will become impossible to allocate a memory region
even though its size is smaller than the available memory obtained by the GetTotalFreeSize()
function. This problem occurs because there is no longer a contiguous memory region of the
specified size in the heap. You can get the maximum size that can be allocated as contiguous
memory by using the GetAllocatableSize() function.

Instances of an expanded heap can be created using a class template


(nn::fnd::ExpHeapTemplate) that locks operations, or with nn::fnd::ExpHeap, which does
not lock operations, or with nn::fnd::ThreadSafeExpHeap, which is thread-safe (lock
operations are nn::os::CriticalSection).
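
As a sketch only, the following shows the fragmentation check described above. s_ExpHeap is a
hypothetical nn::fnd::ThreadSafeExpHeap that has already been initialized on a memory block,
and the Allocate method name and argument defaults are assumptions.

size_t required = 256 * 1024;

if (s_ExpHeap.GetAllocatableSize() >= required)
{
    void* p = s_ExpHeap.Allocate(required);   // Assumed allocation method.
    // ... use p ...
}
else
{
    // Even if GetTotalFreeSize() >= required, the free memory may be too
    // fragmented to supply a contiguous region of the requested size.
}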

5.7. DLL Features (RO Library)

A dynamic link library (DLL) allows you to use a module that is loaded into memory dynamically. This
reduces the amount of memory that an application uses for code and can make an application start
up faster.

The CTR-SDK provides the RO library so that you can use DLLs.

5.7.1. DLL Glossary

This section explains the terminology related to the DLL features provided by the CTR-SDK. These
terms may have different meanings and usage than they ordinarily would outside of the context of
DLLs.

Module

A logical unit of executable code.


Static module

A module that has an entry function (nnMain) and is loaded at startup. You can make applications
start faster by shrinking their static modules.

Dynamic module

A module that can be loaded and executed dynamically. You can reduce memory usage by splitting
features into dynamic modules that are then switched in and out.

Symbol

Information that indicates the location of a variable or function in a module.

Import (Reference)

The act of using (or a statement that uses) a variable or function indicated by a symbol in another
module.

Resolve

The act of making an imported symbol usable.

Export (Publish)

The act of allowing (or a statement that allows) other modules to use a variable or function
indicated by a symbol.

Export type

The format in which a symbol is exported. There are three export types: names, indices, and
offsets.

5.7.2. Characteristics and Limitations of the RO Library

The RO library has the following characteristics that set it apart from ordinary DLL implementations.

Three different types of symbols can be exported.


You can resolve a symbol by a name, index, or offset. Although it is possible to choose an
export type for each individual symbol, the CTR-SDK build system only supports one export
type per module.
For more information about how to specify export types, see the CTR-SDK Build System Manual
(for DLLs) and the Guide to Developing a Build System (for DLLs).

References between modules are resolved automatically.


The simple act of loading a module resolves inter-module references so that functions and
variables can be used. You can manually get a pointer to any symbol that has been exported as
a name or an index.

Applications load modules into memory.


Applications must load and adjust the location of dynamic modules in memory.

C++ code is supported.


You can use C++ code in a dynamic module. You can also import and resolve C++ symbols
between modules. If you export a symbol as a name, however, you must use the symbol’s
mangled name to manually get a pointer to it.

C++ exceptions can be caught between modules.


Exceptions can be thrown by one module and then caught by another.

The RO library has the following restrictions and limitations.

A single module cannot export more than 65,535 symbol names.


The maximum length of a symbol that can be exported by name is 8,192 characters.
You cannot include functions from the C standard library in dynamic modules.
For static variables or functions defined in a dynamic module, alignment specifications
greater than 8 bytes are ignored.
Up to 64 dynamic modules can be loaded simultaneously.
When using a nonstandard feature with weak symbols in C/C++, the system does not always
operate as intended.

When using the static library as a dynamic module, first confirm with the creator of the library
whether such use is possible.

Warning: Using the static libraries provided in the SDK as dynamic modules is prohibited.

Do not include the same static library in multiple dynamic modules. This could cause
various bugs to occur due to unintended library operation, including the existence of
multiple copies of the same global variables.

5.7.3. Files Used by the RO Library

The RO library implements DLL features using data created with the following file formats.

Table 5-8. Files Used by the RO Library

CRS: This file has import and export information for a static module. It does not contain
executable code. Because an application can only have a single static module, it also only has
one CRS file.

CRR: This file has management data for one or more dynamic modules. It must be placed
directly under a specific directory (/.crr/) within a ROM archive.

CRO: This file has import and export information and executable code for a dynamic module.

The following figure shows the relationship between these files.

Figure 5-4. Relationship Between Files Used by the RO Library

In general, you only need to create and use a single CRR file with information about all of the CRO
files used by an application. Use multiple CRR files if you also want to optimize the amount of
memory used by each CRR file or if you want to distribute additional programs.

Note: For more information about how to create these files, see the CTR-SDK Build System
Manual (for DLLs) or the Guide to Developing a Build System (for DLLs).

5.7.4. Importing and Exporting

If two modules share header files, they can call each other's functions and access each other's
variables with ordinary source code. They do not need to keep track of whether the symbols are
exported.

If two modules do not share header files, they must use explicit import and export statements to
call each other's functions and access each other's variables.

To import and export symbols explicitly, add the definitions in the following table to your source
code.

Table 5-9. Definitions for Explicit Import and Export Statements

NN_DLL_EXPORT: Add this to function and variable declarations that you want to export to other
modules.

NN_DLL_IMPORT: Add this to function and variable declarations that have been exported by other
modules.

Code 5-26. Sample Code for Explicit Import and Export Declarations

// Explicit export declarations


NN_DLL_EXPORT int g_Variable;
NN_DLL_EXPORT void Function()
{
// (omitted)
}

// Explicit import declarations


extern NN_DLL_IMPORT int g_Variable;
extern NN_DLL_IMPORT void Function();
Note: If you export a symbol explicitly, it is always exported. It will not be dead-stripped, even
if it is not used within a module.

5.7.5. A Comparison of Export Types

The following table shows the differences caused by symbol export types.

Table 5-10. Differences Caused by Export Types

Item                                              Name     Index     Offset
What is the size of the import information?       Large    Small     Small
What is the size of the export information?       Large    Small     0
What is a module's required load speed?           Slow     Average   Fast
Can pointers be obtained manually?                Yes      Yes       No
Can the imported module be created in advance?    Yes      No        No
How many build steps?                             Normal   Normal    Many

Symbol names provide richer functionality than symbol indices, which in turn provide richer
functionality than offsets. However, this richer functionality comes at a cost in file size and
processing time.

In general, choose to export symbols as offsets. If you want to get symbol pointers manually or use
dynamic modules like libraries, choose to export symbols as names or indices as necessary.

Note: Export types can also be specified for symbols in static modules.

5.7.6. Special Dynamic Module Functions

If a dynamic module’s source code exports functions with particular names, the RO library calls
those functions when the module is initialized and at other specified times.

Code 5-27. Special Functions

extern "C" NN_DLL_EXPORT void nnroProlog();


extern "C" NN_DLL_EXPORT void nnroEpilog();
extern "C" NN_DLL_EXPORT void nnroUnresolved();

The nnroProlog() function is called after the DoInitialize() function has finished initializing
a module. This allows you to implement module-specific initialization.

The nnroEpilog() function is called after the DoFinalize() function has finished finalizing a
module. This allows you to implement module-specific finalization.

The nnroUnresolved() function is called when a module uses an external symbol that has not
been resolved. This allows you to detect and handle calls to unresolved symbols.

Implement these functions in your application as necessary.


Note: The nnroUnresolved() function may also be used in a static module. However, one
restriction is that it will not work if a symbol whose reference has been resolved later
becomes unresolved.
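
The following is a minimal sketch of how a dynamic module might implement these special
functions. The function bodies are placeholders, and NN_LOG() is used here only as an example
logging call; substitute your own module-specific processing.

// Placeholder implementations of the special functions in a dynamic module.
extern "C" NN_DLL_EXPORT void nnroProlog()
{
    // Module-specific initialization; called after DoInitialize() completes.
}

extern "C" NN_DLL_EXPORT void nnroEpilog()
{
    // Module-specific finalization; called after DoFinalize() completes.
}

extern "C" NN_DLL_EXPORT void nnroUnresolved()
{
    // Called when an unresolved external symbol is used; useful for
    // detecting missing links during development.
    NN_LOG("Unresolved symbol was called.\n");
}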

5.7.7. Basic Workflow

This section explains the basic workflow for using a dynamic module.

5.7.7.1. Initialize the RO Library

Initialize the RO library with the nn::ro::Initialize() function. Initialization requires a CRS
file, which must already be loaded into memory. Note that the content of the CRS file must be
stored in a buffer that has been allocated in a memory region outside of device memory, with a
starting address that is nn::ro::RS_ALIGNMENT (4,096 bytes) aligned and the size a multiple of
nn::ro::RS_UNITSIZE (4,096 bytes).

The RO library is responsible for managing the buffer with the CRS file. Do not overwrite or
release this buffer until you have finalized the RO library.
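
The following is a minimal sketch of this step. AllocateAligned() and LoadFileToBuffer() are
hypothetical application helpers, the archive path is only an example, and the exact signature of
nn::ro::Initialize() is assumed here; check the API reference for the actual declaration.

// Hypothetical helpers: allocate from a heap outside device memory with the
// requested alignment, and read a whole file into the buffer.
void* AllocateAligned(size_t size, size_t alignment);
void  LoadFileToBuffer(const char* path, void* pBuffer, size_t bufferSize);

void InitializeRoLibrary(size_t crsFileSize)
{
    // The buffer must be RS_ALIGNMENT-aligned and its size a multiple of RS_UNITSIZE.
    size_t bufferSize = nn::ro::RS_UNITSIZE *
        ((crsFileSize + nn::ro::RS_UNITSIZE - 1) / nn::ro::RS_UNITSIZE);
    void* pCrsBuffer = AllocateAligned(bufferSize, nn::ro::RS_ALIGNMENT);

    LoadFileToBuffer("rom:/modules/static.crs", pCrsBuffer, bufferSize);   // example path

    // The RO library manages this buffer until nn::ro::Finalize() is called.
    nn::ro::Initialize(pCrsBuffer, bufferSize);
}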

5.7.7.2. Register Management Data

Before using a dynamic module, you must load the CRR file that manages it and then register
management data with the nn::ro::RegisterList() function. Note that the content of the
CRR file must be stored in a buffer that has been allocated in a memory region outside of device
memory, with a starting address that is nn::ro::RR_ALIGNMENT (4,096 bytes) aligned and the
size a multiple of nn::ro::RR_UNITSIZE (4,096 bytes).

The RO library is responsible for managing the buffer with the CRR file. Do not overwrite or
release this buffer until you have unregistered the management data by calling the
Unregister() function on the nn::ro::RegistrationList class pointer returned by the
nn::ro::RegisterList() function.

5.7.7.3. Load Dynamic Modules

Load dynamic modules with the nn::ro::LoadModule() function. The application must load a
module’s CRO file into memory in advance. Note that the content of the CRO file must be stored
in a buffer that has been allocated in a memory region outside of device memory, with a starting
address that is nn::ro::RO_ALIGNMENT_LOAD_MODULE (4,096 bytes) aligned and the size a
multiple of nn::ro::RO_UNITSIZE_LOAD_MODULE (4,096 bytes).

You must also provide a buffer to be used for the .data/.bss sections. This buffer must have a
starting address that is nn::ro::BUFFER_ALIGNMENT (8 bytes) aligned. It must not be smaller
than the value of the bufferSize member variable in the nn::ro::SizeInfo structure that is
obtained when the content of the CRO file (REQUIRED_SIZE_FOR_GET_SIZE_INFO bytes from
the beginning) is passed to the nn::ro::GetSizeInfo() function. The SizeInfo structure
also contains information about memory regions that can be used after modules are loaded. If
bufferSize is smaller than the memory region that is freed after a module is loaded, the buffer
can be allocated from that memory region.

The following table shows how the value specified for fixLevel affects dynamic module features
and the memory regions that are released after modules are loaded. If nothing is specified for
fixLevel, it is treated as if it were FIX_LEVEL_1.

Table 5-11. Effects of fixLevel on Dynamic Module Behavior

Can this module be reused after it is unloaded?
    FIX_LEVEL_0: Yes    FIX_LEVEL_1: No    FIX_LEVEL_2: No    FIX_LEVEL_3: No

When another module is loaded, are its symbols automatically linked with this module?
    FIX_LEVEL_0: Yes    FIX_LEVEL_1: Yes    FIX_LEVEL_2: No    FIX_LEVEL_3: No

When another module is loaded, are this module's symbols automatically linked with that module?
    FIX_LEVEL_0: Yes    FIX_LEVEL_1: Yes    FIX_LEVEL_2: Yes    FIX_LEVEL_3: No

Can you get the address of the symbols in this module?
    FIX_LEVEL_0: Yes    FIX_LEVEL_1: Yes    FIX_LEVEL_2: Yes    FIX_LEVEL_3: No

Are the symbols in all modules that have been loaded until now automatically linked with this module?
    FIX_LEVEL_0: Yes    FIX_LEVEL_1: Yes    FIX_LEVEL_2: Yes    FIX_LEVEL_3: Yes

What is the address at the end of the memory region managed by the library?
    FIX_LEVEL_0: fix0End    FIX_LEVEL_1: fix1End    FIX_LEVEL_2: fix2End    FIX_LEVEL_3: fix3End

A memory region is released after a module is loaded. The size of the region for FIX_LEVEL_2 is
less than or equal to that of FIX_LEVEL_3, the region for FIX_LEVEL_1 is smaller, and the
region for FIX_LEVEL_0 is smaller still. More memory is released when symbol names are
exported than when any other type of symbol is exported.

In general, specify true for doRegister so that any references to unresolved symbols are
automatically resolved when another dynamic module is loaded. If you specify false, references
are resolved to symbols in dynamic modules that have already been loaded but they are not
automatically resolved when another module is loaded. In the latter case, you must manually
resolve symbols by calling Link with the nn::ro::Module pointer returned by the
nn::ro::LoadModule() function.
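
The following is a minimal sketch of this step. The CRO image is assumed to already be in a
correctly aligned buffer, AllocateAligned() is a hypothetical application helper, and the parameter
order of nn::ro::GetSizeInfo() and nn::ro::LoadModule() and the qualified name of the
FIX_LEVEL_1 constant are assumptions; check the API reference for the actual declarations.

void* AllocateAligned(size_t size, size_t alignment);   // hypothetical helper

nn::ro::Module* LoadDynamicModule(void* pCroBuffer)
{
    // Query the .data/.bss buffer size from the first
    // REQUIRED_SIZE_FOR_GET_SIZE_INFO bytes of the CRO image.
    nn::ro::SizeInfo sizeInfo;
    nn::ro::GetSizeInfo(&sizeInfo, pCroBuffer);

    // The .data/.bss buffer only needs BUFFER_ALIGNMENT (8-byte) alignment.
    void* pDataBss = AllocateAligned(sizeInfo.bufferSize, nn::ro::BUFFER_ALIGNMENT);

    // doRegister = true so that unresolved references are linked automatically
    // when other modules are loaded later; FIX_LEVEL_1 matches the default behavior.
    return nn::ro::LoadModule(pCroBuffer, pDataBss, sizeInfo.bufferSize,
                              true /* doRegister */, nn::ro::FIX_LEVEL_1);
}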

5.7.7.4. Start to Use Dynamic Modules

Before you use the functions and variables in a dynamic module, you must call DoInitialize
with the nn::ro::Module pointer returned by the nn::ro::LoadModule() function. This
constructs global objects in the dynamic module and runs the initialization process implemented
by the nnroProlog() function.

You can call the DoInitialize() function immediately after a module is loaded as long as its
initialization process does not reference global objects in another dynamic module. If the
initialization processes for multiple dynamic modules reference each other's global objects,
however, load all of those dynamic modules before calling the DoInitialize() function.
You can use the Module class’s GetName() function to get the name of a dynamic module. This
function returns the same name that was specified when the module was built. (If you are using
the CTR-SDK build system, this is the name specified by TARGET_MODULE). This name is also
used by the nn::ro::FindModule() function to search for loaded dynamic modules.

You can use the Module::IsAllSymbolResolved() function to determine whether all external
references from a dynamic module have been resolved. To unresolve all of a module’s resolved
references, including references to the module itself, call the Module::Unlink() function. To
resolve those references again, call the Module::Link() function.

There are two ways to manually get a pointer to a symbol: the nn::ro::GetPointer()
function, which searches through the symbols in every module for a name, and the
nn::ro::Module::GetPointer() function, which searches through the symbols in a particular
module for a name or index. You can also use the nn::ro::GetAddress() function to get the
address of a symbol that you found by name.
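
The following is a minimal sketch of initializing a loaded module and manually getting a symbol
pointer. The exported symbol name (UpdateStage) is hypothetical, and the nn::ro::GetPointer()
signature is assumed from the description above.

void StartUsingModule(nn::ro::Module* pModule)
{
    // Construct global objects and run nnroProlog() in the module.
    pModule->DoInitialize();

    // Manually get a pointer to a symbol that was exported by name.
    // For C++ symbols, the mangled name must be used instead.
    typedef void (*UpdateFunc)(void);
    UpdateFunc pUpdate =
        reinterpret_cast<UpdateFunc>(nn::ro::GetPointer("UpdateStage"));
    if (pUpdate != NULL)
    {
        pUpdate();
    }
}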

5.7.7.5. Finalize Dynamic Modules

To free the memory region used by a module that is no longer necessary, you must first call the
Module class’s DoFinalize() function to destroy the global objects in that dynamic module and
run the shutdown processing implemented by the nnroEpilog() function. After this is complete,
call the Module class’s Unload() function to unload the dynamic module.

When a dynamic module is unloaded, it returns to its preloaded state: any memory managed by
the library is released and all references are made unresolved. You can then release the buffer
that had the content of the CRO file.

You generally need to reload a dynamic module if you want to reuse it after it has been unloaded.
However, if you loaded the module with FIX_LEVEL_0 (FIX_LEVEL_NONE) specified as
fixLevel, you can call the nn::ro::LoadModule() function to reuse the same buffer and
other settings without reloading the CRO file. To reuse the buffer, either the module must not
initialize any static variables, or all of its static variables must be initialized by the
nnroProlog() function. These restrictions on the use of static variables exist because static
variables that were overwritten while the dynamic module was in use are not restored to their
preloaded state when the module is unloaded.
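
The following is a minimal sketch of this step. FreeAligned() is a hypothetical counterpart to the
allocation helper used earlier.

void FreeAligned(void* pBuffer);   // hypothetical helper

void UnloadDynamicModule(nn::ro::Module* pModule, void* pCroBuffer)
{
    // Destroy global objects and run nnroEpilog() in the module.
    pModule->DoFinalize();

    // Return the module to its preloaded state; the library releases the
    // memory it manages and all references become unresolved.
    pModule->Unload();

    // The application may now release the buffer that held the CRO content.
    FreeAligned(pCroBuffer);
}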

5.7.7.6. Unregister Management Data

To unregister the management data and release the memory region, you must call the
Unregister() function with the nn::ro::RegistrationList class pointer returned by the
nn::ro::RegisterList() function.

After the management data has been unregistered, the buffer with the content of the CRR file is
no longer managed by the library. Even though you cannot load a new dynamic module managed
by that CRR file after the management data has been unregistered, you can still use dynamic
modules that have already been loaded.

5.7.7.7. Finalize the RO Library

Finalize the RO library with the nn::ro::Finalize() function.


After you have called the nn::ro::Finalize() function, you can release all the memory that
was used by the RO library; this includes the buffers with the content of the CRS, CRR, and CRO
files.

5.7.8. Enumeration and Searching of Dynamic Modules

All dynamic modules, regardless of whether they were loaded with automatic linking, can be
enumerated with the nn::ro::Module::Enumerate() function. Pass an instance of a class that
inherits from nn::ro::Module::EnumerateCallback as the argument to the Enumerate()
function. For every dynamic module, the instance's operator() function is called back with a
pointer to that dynamic module as its argument.

Dynamic modules can be searched for with the nn::ro::Module::Find() function. If
nn::ro::Module::Find finds a dynamic module whose name matches the specified string, it
returns a pointer to that module; if it cannot find one, it returns NULL. Note, however, that only
dynamic modules loaded with automatic linking are search targets.
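
The following is a minimal sketch of an enumeration callback. The exact declaration of
EnumerateCallback (the return type of operator(), whether Enumerate() takes a pointer or a
reference, and the return type of GetName()) is assumed here; check the header before use.
NN_LOG() is used only as an example.

class ModulePrinter : public nn::ro::Module::EnumerateCallback
{
public:
    // Called once for every loaded dynamic module.
    virtual void operator()(nn::ro::Module* pModule)
    {
        NN_LOG("Loaded module: %s\n", pModule->GetName());
    }
};

void PrintLoadedModules()
{
    ModulePrinter printer;
    nn::ro::Module::Enumerate(&printer);   // assumed to take the callback's address
}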

5.7.9. Information About the Memory Region Used by Dynamic Modules

Information about the memory region used by dynamic modules can be obtained with the
nn::ro::Module::GetRegionInfo() function. The memory region information is stored in the
nn::ro::RegionInfo structure passed as the pri argument.


6. Input Devices
Use the libraries and classes that support each input device to easily incorporate into your application
the input from the various input devices on the system, such as the digital buttons, Circle Pad, touch
panel, accelerometer, gyro sensor, microphone, and cameras.

6.1. Libraries That Use Devices

To access the input and other devices on the system from your application, you must use the libraries
provided by CTR-SDK. The functions in many of the libraries for using devices return an instance of
the nn::Result class.

Call the nn::Result class member function IsSuccess or IsFailure to find out whether the
function completed processing successfully. If IsSuccess returns true, the function completed
successfully; if it returns false, the function failed. The IsFailure return value has the opposite
meaning: true means the function failed, and false means the function completed
successfully.

The nn::Result class records error severity, a description, and the name of the module that raised
the error. In addition to checking for function success, you may sometimes need to use this detailed
information to find a way to work around or fix an error in your code.
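
The following is a minimal sketch of this pattern. nn::hid::Initialize() is used here only as an
example and is assumed, per the description in 6.2, to return an nn::Result; NN_LOG() is an
example logging call.

nn::Result result = nn::hid::Initialize();
if (result.IsFailure())
{
    // Examine the severity, description, and module recorded in the result
    // if you need to work around or report the error.
    NN_LOG("HID initialization failed.\n");
}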

6.2. HID Library

Using the HID library, you can handle input from the digital buttons (+Control Pad, A/B/X/Y/L/R
Buttons, and START), Circle Pad, touch panel, accelerometer, gyro sensor, debug pad, and Circle
Pad Pro.

Call the nn::hid::Initialize() function to initialize the HID library. Successive calls to the
initialization function do nothing and return an error.

After initialization, input is sampled automatically, and you can obtain input from each device
through the corresponding class. The trigger that starts sampling and the sampling cycle differ
depending on the type of device. To end sampling, destroy the created reader class instance for the
accelerometer and gyro sensor, call the nn::hid::ExtraPad::StopSampling() function for the
Circle Pad Pro, and finalize the HID library for all other devices.

Note: The C Stick, ZL Button, and ZR Button on SNAKE hardware can be handled the same way
as a CTR unit with the Circle Pad Pro always attached. In this case, the C Stick
corresponds to the Right Circle Pad, and the ZL and ZR Buttons correspond to the same
buttons on the Circle Pad Pro.

For information about the C Stick and differences with the Circle Pad Pro, see 6.2.7. C
Stick.

Table 6-1. Types and Inputs

Digital buttons and Circle Pad: nn::hid::PadReader (sampling starts at HID library initialization; cycle 4 ms)
Touch panel: nn::hid::TouchPanelReader (sampling starts at HID library initialization; cycle 4 ms)
Accelerometer: nn::hid::AccelerometerReader (sampling starts at reader class creation; cycle avg. 10 ms)
Gyro sensor: nn::hid::GyroscopeReader (sampling starts at reader class creation; cycle avg. 10 ms)
Debug pad: nn::hid::DebugPadReader (sampling starts at HID library initialization, when connected; cycle 16 ms)
Circle Pad Pro: nn::hid::ExtraPadReader (sampling starts at the ExtraPad class StartSampling() function; cycle 8–32 ms)
C Stick, ZL Button, ZR Button: nn::hid::ExtraPadReader (sampling starts at the ExtraPad class StartSampling() function; cycle 8–32 ms)

Note: The library allows you to set a sampling frequency of 8 to 32 milliseconds for the C Stick,
ZL Button, and ZR Button on SNAKE hardware, but we recommend setting a sampling
frequency of 10 to 21 milliseconds to match the hardware capabilities.
Call the nn::hid::Finalize() function to finalize use of the library. When using the
nn::hid::ExtraPadReader class, you must first call the nn::hid::ExtraPad::Finalize()
function.

6.2.1. Digital Buttons and Circle Pad

Call the nn::hid::PadReader class's Read or ReadLatest member functions to get input from
the digital buttons and Circle Pad as an nn::hid::PadStatus structure. The Read() function
can get sampling results in order of the latest result, but it cannot reacquire sampling results. When
the function is called at a cycle faster than the sampling cycle, it cannot get sampling results. On
the other hand, the ReadLatest() function can get only the latest sampling result. It can also get
sampling results again, so even when it is called at a cycle faster than the sampling cycle, it can
still get sampling results.

The nn::hid::PadStatus structure's hold member variable records the button being held when
input was sampled, the trigger variable records the button that was pressed when input was
sampled, and release variable records the button that was released when input was sampled, all
mapped to bit values. For the ReadLatest() function, both the trigger and release members
are evaluated for their states at that time, so any changes between calls are not necessarily
applied. Input bits for +Control Pad input may in fact be input from the Circle Pad emulating the
+Control Pad. The actual Circle Pad input is recorded to the stick member variable as a biaxial
coordinate value.
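
The following is a minimal sketch of per-frame polling with ReadLatest(). The ReadLatest()
signature (a pointer to the status structure) and the x/y members of the stick value are assumed
from common usage.

nn::hid::PadReader padReader;
nn::hid::PadStatus padStatus;

// Once per frame:
padReader.ReadLatest(&padStatus);

if (padStatus.trigger & nn::hid::BUTTON_A)
{
    // The A Button was pressed in the most recent sample.
}
if (padStatus.hold & nn::hid::BUTTON_EMULATION_UP)
{
    // The Circle Pad is emulating +Control Pad (Up).
}

// Clamped Circle Pad coordinates.
s16 stickX = padStatus.stick.x;
s16 stickY = padStatus.stick.y;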

Table 6-2. Buttons and Definitions

Definition Corresponding Button


BUTTON_UP +Control Pad (Up)

BUTTON_DOWN +Control Pad (Down)


BUTTON_LEFT +Control Pad (Left)

BUTTON_RIGHT +Control Pad (Right)


BUTTON_A A Button
BUTTON_B B Button

BUTTON_X X Button
BUTTON_Y Y Button
BUTTON_L L Button

BUTTON_R R Button
BUTTON_START START or SELECT (for standard operations)
BUTTON_SELECT_FOR_DEBUGGING SELECT (only for debugging)

BUTTON_EMULATION_UP Emulated +Control Pad (Up) input using Circle Pad


BUTTON_EMULATION_DOWN Emulated +Control Pad (Down) input using Circle Pad

BUTTON_EMULATION_LEFT Emulated +Control Pad (Left) input using Circle Pad


BUTTON_EMULATION_RIGHT Emulated +Control Pad (Right) input using Circle Pad

Use the nn::hid::EnableSelectButton() and nn::hid::DisableSelectButton() functions
to enable and disable SELECT sampling.

Code 6-1. Enabling and Disabling SELECT Sampling


bool nn::hid::EnableSelectButton();
void nn::hid::DisableSelectButton();

The EnableSelectButton() function returns true when it has successfully enabled sampling
of SELECT. Unless the system is set to debug mode, the function always returns false and
sampling cannot be enabled. While SELECT sampling is enabled, SELECT input is returned as
BUTTON_SELECT_FOR_DEBUGGING, so you can distinguish the SELECT input from the START
input.

The Circle Pad may return a coordinate value even when the user is not touching it. Consequently,
make sure that you set an appropriate play value and clamp mode with the nn::hid::PadReader
class's SetStickClamp() and SetStickClampMode() member functions, respectively. There
are three clamp modes available: circular (STICK_CLAMP_MODE_CIRCLE), cruciform
(STICK_CLAMP_MODE_CROSS), and minimal (STICK_CLAMP_MODE_MINIMUM). Use the
GetStickClampMode() function to get the current clamp mode, and use the GetStickClamp()
function to get the clamp value. Circular clamping is the default clamp setting. The following
examples show clamped Circle Pad input, with the original input value coordinates specified as (x,
y), the distance from the origin as d, the clamp minimum and maximum values as min and max, the
clamped coordinates as (x', y'), and the clamped distance from the origin as d'.

Circular Clamping

d <= min

(x', y') = (0, 0)

min < d < max

(x', y') = ((d - min) / d * x, (d - min) / d * y)

d > max

(x', y') = ((max - min) / d * x, (max - min) / d * y)
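
As an illustration only (the library applies this clamping internally; you do not normally implement
it yourself), the circular clamping rules above can be written as follows, using the SDK's s16 and
f32 basic types.

#include <cmath>

void ClampCircular(s16 x, s16 y, s16 min, s16 max, s16* pX, s16* pY)
{
    f32 d = std::sqrt(static_cast<f32>(x) * x + static_cast<f32>(y) * y);
    if (d <= min)
    {
        *pX = 0;
        *pY = 0;
        return;
    }
    // Between min and max the radius is reduced by min; at or beyond max it
    // is clamped to a radius of (max - min).
    f32 scale = (d < max) ? (d - min) / d : (max - min) / d;
    *pX = static_cast<s16>(x * scale);
    *pY = static_cast<s16>(y * scale);
}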

Cruciform Clamping

x < 0

(x', y') = (x + min, y) But x + min must be 0 or less.

x >= 0

(x', y') = (x - min, y) But x - min must be 0 or greater.

y < 0

(x', y') = (x, y + min) But y + min must be 0 or less.

y >= 0

(x', y') = (x, y - min) But y - min must be 0 or greater.

d' > (max - min)

(x', y') = ((max - min) / d * x, (max - min) / d * y)

Minimal Clamping

Minimal clamping combines circular and cruciform clamping. When using minimal clamping (minimal
values), unclamped coordinate values are those that lie within the unclamped ranges of both the
circular and cruciform clamping ranges. Coordinates outside this region are clamped to within this
minimum.

Figure 6-1 shows how the ranges for clamped Circle Pad input coordinates change.

Figure 6-1. Valid Input Ranges for Circular, Cruciform, and Minimal Clamping

In STICK_CLAMP_MODE_CROSS, for coordinates close to either the x-axis or y-axis, only the value
for that axis changes (that is, if close to the x-axis, only the x-coordinate value changes).

Clamped input coordinate values for either axis produce output values as shown below in any of the
clamp modes.

Figure 6-2. Relationship Between Input and Output Coordinates

In any of the clamp modes, the output value is 0 until the input value passes the min threshold, and
the output value is ±(max - min) after the input value passes the max threshold, with the
clamped output value ranging from 0 to ±(max - min).

In any of the clamp modes, the obtainable output coordinates describe a circle with radius (max -
min). Choose the appropriate clamp mode for the input you need. STICK_CLAMP_MODE_CIRCLE
preserves the angle of the input coordinates from the origin and is better suited to uses requiring
finer directional control. STICK_CLAMP_MODE_CROSS only outputs the coordinate value for the
closest axis, either the x-axis or the y-axis, and is better suited to uses emphasizing axial
directional input, such as when selecting items from a menu.

Normalizing Circle Pad Coordinate Values

The nn::hid::PadReader class's NormalizeStick() member function takes the Circle Pad’s
coordinate value obtained from the Read() or ReadLatest() member functions and normalizes it
to an f32 floating-point value between –1.0 and +1.0 with (max – min) = 1.0.

Use the NormalizeStickWithScale() function to normalize Circle Pad coordinate values with a
sensitivity adjustment. Configure the sensitivity adjustment by calling the
SetNormalizeStickScaleSetting() function and specifying values for the scale (default of
1.5) and threshold (default of 141) parameters. Normalizing multiplies values that are lower
than threshold by (1/scale), while values equal to threshold or higher gradually approach
±1.0. Use this feature to make the Circle Pad respond as if its movable range were scale times
as large.

Emulating the +Control Pad on the Circle Pad

The +Control Pad emulation bits for Circle Pad input (BUTTON_EMULATION_*) are used to judge
which direction is being indicated by the controller, by using circular clamping with a lower limit of
40 and an upper limit of 145. The values set using the SetStickClamp() and
SetStickClampMode() functions are not used for this purpose. Positive and negative values for
the x-axis and y-axis are interpreted to mean left, right, up, and down, respectively, and the bit for
a particular direction is set if the input direction falls within the 120° range assigned to that
direction. Consequently, the 30° overlap zone between adjacent directions is interpreted as a
diagonal input.

Figure 6-3. BUTTON_EMULATION_* Judgment Range

6.2.1.1. Circle Pad Hardware Performance

The Circle Pad included in the 3DS system is made up of a resistive plate, shaft, and key top.
This composite structure includes some tolerance, resulting in a slight lag in response compared
to the analog sticks for the Nintendo GameCube or Wii consoles when reversing the direction of
input.

Figure 6-4. Delayed Response to Input Inversion Due to the Circle Pad’s Hardware
Actual input values might also differ even when moving the Circle Pad along the same path of
movement. This phenomenon is called path dependence and is also caused due to the Circle Pad
hardware.

For example, when moving the Circle Pad from the left side to the top, going past the center, the
shaft is touching the left edge of the key top, with input values tending more toward -X.
Conversely, when moving the Circle Pad from the right side to the top, going past the center, the
shaft is touching the right edge of the key top, with input values tending more toward +X. Input
values exhibit this same tendency in the vertical direction for ±Y values.

Figure 6-5. Circle Pad Path Dependence

We recommend that your application factor in some tolerance for this phenomenon, such as for
input detection thresholds.

6.2.2. Touch Panel

Call the nn::hid::TouchPanelReader class's Read or ReadLatest member functions to get
input from the touch panel as an nn::hid::TouchPanelStatus structure. The Read() function
can get sampling results in order of the latest result, but it cannot reacquire sampling results. When
the function is called at a cycle faster than the sampling cycle, it cannot get sampling results. On
the other hand, the ReadLatest() function can get only the latest sampling result. It can also get
sampling results again, so even when it is called at a cycle faster than the sampling cycle, it can
still get sampling results.

The nn::hid::TouchPanelStatus structure's x and y members record the touch panel input
coordinates in pixels, with the upper-left corner as the origin (when holding the system so that the
lower screen is toward the user). Note that the coordinate axes on the touch panel differ from the
LCD coordinate axes used in the GX library. Areas in the five-dot region on the outermost of the
screen that is difficult to touch are returned as clamped coordinate values. Accordingly, the x
values actually returned are between 5 and 314, and the y values between 5 and 234.

Code 6-2. Converting Touch Panel Input Coordinates to LCD Coordinates

x = nn::gx::DISPLAY1_WIDTH - touchPanel.y;
y = nn::gx::DISPLAY1_HEIGHT - touchPanel.x;

The touch member records whether the stylus is touching the touch panel. The member value is 0
if the stylus is not touching the panel and 1 if it is.
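
The following is a minimal sketch of reading the touch panel and converting to LCD coordinates
(see Code 6-2). The ReadLatest() signature is assumed from common usage, and NN_LOG() is
only an example.

nn::hid::TouchPanelReader touchReader;
nn::hid::TouchPanelStatus touchStatus;

touchReader.ReadLatest(&touchStatus);

if (touchStatus.touch != 0)
{
    // Convert from touch panel coordinates to LCD coordinates.
    int lcdX = nn::gx::DISPLAY1_WIDTH  - touchStatus.y;
    int lcdY = nn::gx::DISPLAY1_HEIGHT - touchStatus.x;
    NN_LOG("Touched at LCD (%d, %d)\n", lcdX, lcdY);
}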

6.2.3. Accelerometer

Call the Read or ReadLatest member functions of the nn::hid::AccelerometerReader class
to get input from the accelerometer as an nn::hid::AccelerometerStatus structure. The
Read() function can get sampling results in order of the latest result, but it cannot reacquire
sampling results. When the function is called at a cycle faster than the sampling cycle, it cannot get
sampling results. On the other hand, the ReadLatest() function can get only the latest sampling
result. It can also get sampling results again, so even when it is called at a cycle faster than the
sampling cycle, it can still get sampling results.

The accelerometer is turned on when an instance of the AccelerometerReader class is created
and it is turned off when an instance is destroyed. To conserve battery life, only instantiate the
AccelerometerReader class when it is used. However, if you repeatedly create and destroy an
instance of the AccelerometerReader class in a function that is called every frame (for
example), the frequent power cycling wastes battery power and leads to unexpected malfunctions.

The x, y, and z members of the nn::hid::AccelerometerStatus structure record input from
the accelerometer's three axes. The x-axis is equivalent to the left-right axis on the +Control Pad,
the y-axis is equivalent to a line perpendicular to the lower LCD screen, and the z-axis is equivalent
to the up-down axis on the +Control Pad. The raw input values from these three axes do not
accurately describe the acceleration of the system, so make sure that you first convert the raw
values to G units using the nn::hid::AccelerometerReader class's
ConvertToAcceleration() member function before using them in your application.

Figure 6-6. The Three Accelerometer Axes
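
The following is a minimal sketch of reading the accelerometer. The ReadLatest() signature and
the s16 member types are assumed from common usage.

// Creating the reader turns the accelerometer on; keep the instance alive
// for as long as accelerometer input is needed.
nn::hid::AccelerometerReader accelerometerReader;
nn::hid::AccelerometerStatus accelerometerStatus;

accelerometerReader.ReadLatest(&accelerometerStatus);

// x, y, and z are raw sampling values. Convert them with the reader's
// ConvertToAcceleration() member function before treating them as G units.
s16 rawX = accelerometerStatus.x;
s16 rawY = accelerometerStatus.y;
s16 rawZ = accelerometerStatus.z;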

The accelerometer's zero point may vary by up to 0.05 G over time and 0.08 G due to temperature
changes, for a maximum variance of 0.13 G. Applications that use the accelerometer must account
for this variance. Variance over time can be mostly rectified by calibrating, but variance due to
temperature can reoccur shortly after calibrating. This 0.08 G maximum temperature-induced
variance equates to a roughly five-degree tilt. When detecting movements that require finer
precision, we recommend having the user calibrate immediately if they notice anything strange.

Note: Hold the Y or B Button for three seconds while the HOME Menu is displayed to calibrate
the accelerometer.

The accelerometer's maximum sensitivity may also vary by up to ±8%. After factoring in the zero-
point variance offset, the maximum input value is 10% less than the theoretical maximum value of
approximately 1.8 G, at 1.62 G. Applications must not depend on any input values greater than this.

Warning: The output value of the accelerometer has a superimposed at-rest noise of ±0.02 G.
The output value has an additional superimposed noise of ±0.05 G from conductive
vibration when the speaker is outputting sound. The effect of this noise input must be
considered when detecting small acceleration.

The conductive vibration is highest in the 1-kHz range. The effect can be reduced by
adjusting the sound playback volume.

Call the SetSensitivity member function of the nn::hid::AccelerometerReader class to
set the input play and detection sensitivity values, and use the GetSensitivity member function
to get these values. Either of these values may be set in the range from 0 to
MAX_OF_ACCELEROMETER_SENSITIVITY, with the default values of 0 for the play and
MAX_OF_ACCELEROMETER_SENSITIVITY for the detection sensitivity. Setting the detection
sensitivity to 0 means only 0 will be returned. Specifying MAX_OF_ACCELEROMETER_SENSITIVITY
for this setting gives a response based directly on the value read from the device.

Looking at the accelerometer input value as a set of triaxial coordinates, the accelerometer change
value for any axis is interpreted to be within the play range if it is no greater than the play value.
Values within the play range are output as 0, and changes within this range are output as no
change at all.

The following figure shows the relationship between the accelerometer input and output values, and
the play range. Use this play range setting to minimize any false readings caused by small
changes in position, such as being bumped by the user's hands.

Figure 6-7. Relationship Between Input and Output Values Due to the Play Range Setting

Use the nn::hid::AccelerometerReader class to carry out axial rotation on output values
(already adjusted for sensitivity and play) after adjusting for an offset. You can also use axial
rotation of output values to apply any angle tilt to the accelerometer.

Code 6-3. Offsetting Accelerometer Output Values

class nn::hid::AccelerometerReader
{
void EnableOffset();
void DisableOffset();
bool IsEnableOffset() const;
void SetOffset(s16 x, s16 y,s16 z);
void GetOffset(s16* pX, s16* pY, s16* pZ) const;
void ResetOffset();
}

Enable offsetting of output values by calling EnableOffset, and disable by calling
DisableOffset. Call IsEnableOffset to get the current setting; a return value of true
indicates that the offset is enabled. When the offset is enabled, each component of the offset value
is subtracted from each component of the output value.

Set the offset value itself by calling SetOffset. Use SetOffset to specify offset values
individually for each axis. Note that each component of the offset value is specified not as an
acceleration value (in G), but rather as a pre-conversion sampling value. Use the GetOffset()
function to get the current setting. Call ResetOffset to return the offset to the initial value of 0 for
each component.

Note: The SetOffsetFromBaseStatus() function has been removed from CTR-SDK.

Warning: Do not use the offset value settings to calibrate the accelerometer.

Code 6-4. Accelerometer Axial Rotation

class nn::hid::AccelerometerReader
{
void EnableAxisRotation();
void DisableAxisRotation();
bool IsEnableAxisRotation() const;
void SetAxisRotationMatrix(const nn::math::MTX34 mtx);
void GetAxisRotationMatrix(nn::math::MTX34* pMtx) const;
void ResetAxisRotationMatrix();
}

Enable axial rotation of output values by calling EnableAxisRotation, and call
DisableAxisRotation to disable. Rotation is disabled by default. Call IsEnableAxisRotation
to get the current setting; a return value of true indicates that rotation is enabled. When enabled,
the previously set rotation matrix is applied to the output value.

To set the rotation matrix (a 3×4 matrix), call SetAxisRotationMatrix; to get the currently set
matrix, call GetAxisRotationMatrix. Call ResetAxisRotationMatrix to revert to the identity
matrix (no rotation).

When both the offset and axial rotation are enabled, the offset is applied first, and then the
rotation is applied to the resulting value.

Warning: Do not use the axial rotation settings to calibrate the accelerometer.

6.2.3.1. Cautions for Implementing Application-Specific Calibration Routines

Warning: Applications are now prohibited from implementing their own calibration routines.

6.2.4. Gyro Sensor

Call the nn::hid::GyroscopeReader class’s Read or ReadLatest member functions to get the
gyro sensor ’s input value as a nn::hid::GyroscopeStatus structure. The Read() function can
get sampling results in order of the latest result, but it cannot reacquire sampling results. When the
function is called at a cycle faster than the sampling cycle, it cannot get sampling results. On the
other hand, the ReadLatest() function can get only the latest sampling result. It can also get
sampling results again, so even when it is called at a cycle faster than the sampling cycle, it can
still get sampling results.

The gyro sensor is turned on when an instance of the GyroscopeReader class is created, and it is
turned off when an instance is destroyed. To conserve battery life, only instantiate the
GyroscopeReader class when it is used. However, if you repeatedly create and destroy an
instance of the GyroscopeReader class in a function that is called every frame (for example), the
frequent power cycling wastes battery power and leads to unexpected malfunctions.

The speed, angle, and direction members of the nn::hid::GyroscopeStatus structure
record the gyro sensor's angular velocity, angle of rotation, and 3D attitude. The angular velocity
and angle of rotation values are 3D vectors, with the x component recording the pitch direction, the
y component the yaw direction, and the z component the roll direction. An angular velocity of 360
dps is represented by a value of 1.0, and an angle of rotation of 360° by a value of 1.0. Pitch is
the degree of tilt forward and back about the axis running along the width of the lower screen, yaw
is rotation about an axis perpendicular to the lower screen, and roll is tilt to left and right about the
axis running along the height of the lower screen.

Figure 6-8. The Three Components of the Gyro Sensor
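
The following is a minimal sketch of reading the gyro sensor. The ReadLatest() signature and the
x/y/z member access on the vector values are assumed from common usage.

// Creating the reader turns the gyro sensor on (and internally uses the
// accelerometer for attitude correction).
nn::hid::GyroscopeReader gyroscopeReader;
nn::hid::GyroscopeStatus gyroscopeStatus;

gyroscopeReader.ReadLatest(&gyroscopeStatus);

// speed: angular velocity (1.0 == 360 dps), angle: accumulated rotation
// (1.0 == 360 degrees), direction: estimated 3D attitude matrix.
f32 pitchSpeed = gyroscopeStatus.speed.x;
f32 yawSpeed   = gyroscopeStatus.speed.y;
f32 rollSpeed  = gyroscopeStatus.speed.z;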

The gyro sensor's maximum sensitivity to angular velocity can vary by up to ±8%. After factoring in
the zero-point variance offset, the maximum input value is 10% less than the theoretical maximum
value of approximately 1800 dps, at 1620 dps. Applications must not depend on any input values
greater than this.

The 3D attitude is recorded as a 3 × 3 matrix. This matrix is calculated based on the angular
velocity, and you can change the angular velocity multiple referenced in calculation by calling the
nn::hid::GyroscopeReader::SetDirectionMagnification() function. Specify a multiple
of 1.0 to use the output angular velocity without modification in calculation. Specify 2.0 to use a
value of double the detected angular velocity.

The nn::hid::GyroscopeReader class detects when the system is at rest and automatically
calibrates the zero-point offset (zero-point drift correction). The correction setting is applied alike to
all directions.

Code 6-5. Gyro Sensor Zero-Point Drift Correction


class nn::hid::GyroscopeReader
{
void EnableZeroDrift();
void DisableZeroDrift();
bool IsEnableZeroDrift() const;
f32 GetZeroDriftEffect() const;
void ResetZeroDriftMode();
void SetZeroDriftMode(const nn::hid::ZeroDriftMode& mode);
void GetZeroDriftMode(nn::hid::ZeroDriftMode& mode) const;
}

To enable zero-point drift correction, call EnableZeroDrift; to disable it, call
DisableZeroDrift. Correction is enabled by default. Call IsEnableZeroDrift to get the
current setting; a return value of true indicates that drift correction is enabled. When enabled, the
zero-point offset is corrected automatically while the system is detected to be at rest. The return
value from a call to
GetZeroDriftEffect indicates the degree of drift correction being applied. A value of 0 indicates
that no correction has been performed (movement was detected). Values approaching 1.0 indicate
stability (no movement).

The degree of correction and the angular velocity at which the system is considered at rest both
depend on the drift correction mode. Set and get the zero-point drift correction mode using the
SetZeroDriftMode() and GetZeroDriftMode() functions, and use ResetZeroDriftMode to
revert to the initial value (GYROSCOPE_ZERODRIFT_STANDARD).

Table 6-3. Gyro Sensor Zero-Point Drift Correction Mode

GYROSCOPE_ZERODRIFT_LOOSE: Correction is applied more loosely, and constant-velocity
movement might not be detected.

GYROSCOPE_ZERODRIFT_STANDARD: Standard correction. (Default.)

GYROSCOPE_ZERODRIFT_TIGHT: Correction is applied strictly, allowing the detection of more
precise movements.

The zero-point offset may vary by up to 50 dps when zero-point drift correction is disabled. An
application must adequately account for this variance when disabling correction, such as when
detecting extremely small movements.

Note: Unless GYROSCOPE_ZERODRIFT_LOOSE is specified, the correction may not work as
well even when the system is at rest. We recommend implementing your application to
account for this, such as by applying GYROSCOPE_ZERODRIFT_LOOSE correction for
scenes where the user is not expected to move the system, and then returning to the
previously used correction mode during other use.

Set the zero-point play correction to avoid reacting to small changes in angular velocity. The play
correction setting is applied equally to all directions.

Code 6-6. Gyro Sensor Zero-Point Play Correction

class nn::hid::GyroscopeReader
{
void EnableZeroPlay();
void DisableZeroPlay();
bool IsEnableZeroPlay() const;
f32 GetZeroPlayEffect() const;
void SetZeroPlayParam(f32 radius);
void GetZeroPlayParam(f32& radius) const;
void ResetZeroPlayParam();
}

Call the EnableZeroPlay() function to enable zero-point play correction and call
DisableZeroPlay to disable it. Correction is disabled by default. Call IsEnableZeroPlay to get
the current setting; a return value of true indicates that play correction is enabled. The return
value from a call to GetZeroPlayEffect indicates the degree of play correction being applied.
The return value is negative when correction is disabled and 0 or greater when correction is
enabled. As the angular velocity approaches the currently set play value, the return value
approaches 0. A return value of exactly 0 indicates no correction.

To set the absolute value of the angular velocity for which play correction is 0, call the
SetZeroPlayParam() function; to get the current setting, call GetZeroPlayParam. Call
ResetZeroPlayParam to revert to the initial value of 0.005. An absolute angular velocity of 360
dps is represented by a value of 1.0.

If both zero-point drift and play correction are enabled, values are corrected first for drift and then
for play.

Use the nn::hid::GyroscopeReader class to apply axial rotation to angular velocity after
correcting for zero-point drift and play. The class calculates the rotation matrix based on output
values when the system is at rest, allowing you to implement gyro sensor calibration in your
application. You can also use axial rotation of output values to apply any angle tilt to the gyro
sensor.

Code 6-7. Gyro Sensor Axial Rotation

class nn::hid::GyroscopeReader
{
void EnableAxisRotation();
void DisableAxisRotation();
bool IsEnableAxisRotation() const;
void SetAxisRotationMatrix(const nn::math::MTX34 mtx);
void GetAxisRotationMatrix(nn::math::MTX34* pMtx) const;
void ResetAxisRotationMatrix();
}

To enable axial rotation of angular velocity, call EnableAxisRotation; to disable, call
DisableAxisRotation. Rotation is disabled by default. Call IsEnableAxisRotation to get the
current setting; a return value of true indicates that rotation is enabled. When enabled, the
currently set rotation matrix is applied to the angular velocity.

To set the rotation matrix (a 3 × 4 matrix), call SetAxisRotationMatrix; to get the currently set
matrix, call GetAxisRotationMatrix. Call ResetAxisRotationMatrix to revert to the identity
matrix (no rotation).

The nn::hid::GyroscopeReader class uses the accelerometer input value to correct the gyro
sensor ’s 3D attitude. Anytime you use the gyro sensor, you are also using the accelerometer. When
passing NULL as the value of the pAccelerometerReader parameter to the (overloaded)
constructor, the GyroscopeReader class internally creates and uses an instance of the
nn::hid::AccelerometerReader class that has the default settings. Pass a pointer to a
configured instance of the nn::hid::AccelerometerReader class if you want to use different
values for input play and detection sensitivity.

Code 6-8. 3D Attitude Correction Using the Accelerometer

class nn::hid::GyroscopeReader
{
GyroscopeReader(nn::hid::AccelerometerReader* pAccelerometerReader = NULL,
nn::hid::Gyroscope& gyroscope = nn::hid::GetGyroscope());
void EnableAccRevise();
void DisableAccRevise();
bool IsEnableAccRevise() const;
f32 GetAccReviseEffect() const;
void SetAccReviseParam(f32 revise_pw, f32 revise_range);
void GetAccReviseParam(f32& revise_pw, f32& revise_range) const;
void ResetAccReviseParam();
}

To enable correction using the accelerometer, call the EnableAccRevise() function; to disable,
call DisableAccRevise. Call IsEnableAccRevise to get the current setting; a return value of
true indicates that correction is enabled. The return value from a call to GetAccReviseEffect
indicates the degree of correction being applied. The return value is always 0 when correction is
disabled and 0 or greater when correction is enabled. The return value approaches 0 as the
gyroscope's 3D attitude (the direction member) approaches the accelerometer's direction.

To set the accelerometer correction parameters (weight and enabled range), call
SetAccReviseParam; to get the current settings, call GetAccReviseParam. Call
ResetAccReviseParam to revert to the initial settings of 0.03 for weight and 0.4 for enabled
range. For the revise_pw parameter, specify a value of 0.0 through 1.0 for the accelerometer
weight. Higher values mean more severe correction. For the revise_range parameter, specify the
accelerometer range to correct within; the range is 1.0 ± revise_range. For
example, specify a value of 0.4 to apply correction to acceleration values from 0.6 G through 1.4
G. The correction parameters are applied equally to all directions.

When applying angular velocity axial rotation to the gyro sensor with accelerometer correction
enabled, specify the same rotation matrix for use with both the gyro sensor and accelerometer.
When using an instance created with the default settings, the library makes sure internally that the
same matrix is used, but if you use an instance created with any other settings, you must do this
yourself. If different matrices are used, the accelerometer correction might not work properly.

6.2.5. Debug Pad (Control Pad for Debugging)

Call the nn::hid::DebugPadReader class’s Read or ReadLatest() functions to get input from
the digital buttons and two analog sticks on the debug pad as an nn::hid::DebugPadStatus
structure. The Read() function can get sampling results in order of the latest result, but it cannot
reacquire sampling results. Sampling results cannot be obtained if the function is called more
frequently than the sampling frequency of 16 ms. On the other hand, the ReadLatest() function
can get only the latest sampling result. It can also get sampling results again, so even when it is
called at a cycle faster than the sampling cycle, it can still get sampling results.

The nn::hid::DebugPadStatus structure’s hold member records which button was held at the
time of sampling, the trigger member records which button was pressed at the time of sampling,
and the release member records which button was released at the time of sampling. For the
ReadLatest() function, both the trigger and release members are evaluated for their states
at that time, so any changes between calls are not necessarily applied. The leftStickX and
leftStickY members record the left analog stick’s x and y axes, and the rightStickX and
rightStickY members record the right analog stick’s x and y axes as values ranging from -1.0 to
1.0.

Call the SetStickClampMode() function to set the analog stick clamp mode, and call
GetStickClampMode() to get the current setting. This value cannot be set independently for the
left and right analog sticks. You can choose either the STICK_CLAMP_MODE_CIRCLE_WITH_PLAY
or STICK_CLAMP_MODE_CIRCLE_WITHOUT_PLAY clamp mode.
Table 6-4. Debug Pad Digital Buttons and Definitions

Definition Corresponding Button


DEBUG_PAD_BUTTON_UP +Control Pad (Up)

DEBUG_PAD_BUTTON_DOWN +Control Pad (Down)


DEBUG_PAD_BUTTON_LEFT +Control Pad (Left)

DEBUG_PAD_BUTTON_RIGHT +Control Pad (Right)


DEBUG_PAD_BUTTON_A A Button
DEBUG_PAD_BUTTON_B B Button

DEBUG_PAD_BUTTON_X X Button
DEBUG_PAD_BUTTON_Y Y Button
DEBUG_PAD_TRIGGER_L L Trigger

DEBUG_PAD_TRIGGER_R R Trigger
DEBUG_PAD_TRIGGER_ZL ZL Trigger
DEBUG_PAD_TRIGGER_ZR ZR Trigger

DEBUG_PAD_BUTTON_PLUS Plus (+) Button


DEBUG_PAD_BUTTON_MINUS Minus (-) Button

DEBUG_PAD_BUTTON_HOME HOME Button

6.2.6. Circle Pad Pro

The Circle Pad Pro is an optional peripheral device that is attached to the CTR for use. Using the
Circle Pad Pro in addition to the standard input devices provided with the CTR system allows you to
make use of the circle pad installed in the Circle Pad Pro (hereafter called the Right Circle Pad; the
circle pad installed by default in the CTR system is simply called the Circle Pad) and its digital
buttons (the ZL and ZR Buttons).

Input from the input devices installed in the Circle Pad Pro is transmitted to the CTR system using
infrared communication. The sampling cycle can be set in a range from 8 ms to 32 ms in 1-ms
increments. For a detailed description of how to configure this setting, see 6.2.6.4. Sampling Start
and State Acquisition.

Note: For the process flow for using Circle Pad Pro in applications, see Appendix: Process
Flows for Using the Circle Pad Pro.

For more information about the C Stick, ZL Button, and ZR Button on SNAKE and for the
differences from the Circle Pad Pro, see 6.2.7. C Stick.

Note: CTR uses infrared to communicate with the Circle Pad Pro.
If you want to use the Circle Pad Pro from an application while another feature is using
the infrared communication, you must end the other feature first.

Infrared communication is used by the following features.

Infrared communication between systems


NFP (Only CTR)
6.2.6.1. Hardware Internal States

The Circle Pad Pro hardware has only two states: the Active State in which communication with
the CTR system is possible, and the Standby State in which no communication is performed.

Table 6-5. Hardware Internal States

Active State: State in which communication with the CTR is possible. Connection to the CTR can be
made only in this state. The device enters this state as soon as a battery is inserted.

Standby State: No communication with the CTR is performed in this state. When not in use, battery
consumption is kept to a minimum to extend battery life. The device enters this state if button input
and infrared communication are not performed for a period of five minutes. The device returns to the
Active State when one of the digital buttons on the Circle Pad Pro (ZL, ZR, or R) is pressed.

Note: The Circle Pad Pro has no indicator to display its internal state. The device’s internal
state cannot be obtained from the library.

6.2.6.2. Software Internal State

Software (library) internal states transition as shown in the following diagram.

Figure 6-9. Software Internal State Transition (Circle Pad Pro)

Connection status of internal states with the Circle Pad Pro is summarized in the following table.

Table 6-6. Software Internal States

NO_CONNECTION: Connection with the Circle Pad Pro has not been established in this state. This is
the same internal state as immediately after initialization.

CONNECTED: Connection with the Circle Pad Pro is established and sampling is being performed in
this state.

STOPPED: Connection with the Circle Pad Pro, which had been established up to this point, has been
interrupted due to an external cause (such as battery consumption or detachment from the CTR
system). This internal state is also entered when the CTR system transitions to Sleep Mode.

This internal state is defined by the nn::hid::ExtraPad::ConnectionState enumerator,
taking a CONNECTION_STATE_* value.

6.2.6.3. Initialization

Before Circle Pad Pro input sampling can begin, the nn::hid::ExtraPad class initialization
function must be called.

Code 6-9. Circle Pad Pro Initialization Function

static void nn::hid::ExtraPad::Initialize(
    void* workingMemory,
    size_t workingMemorySize);

Specify the buffer for infrared communication in workingMemory. The specified buffer must be of
size nn::hid::CTR::ExtraPad::WORKING_MEMORY_SIZE (12,288 bytes), and the starting
address alignment must be nn::hid::CTR::ExtraPad::WORKING_MEMORY_ALIGNMENT (4096
bytes). Buffers allocated from device memory cannot be used.

After initialization completes, the nn::hid::ExtraPadReader class can be used.
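
The following is a minimal sketch of this initialization. AllocateAligned() is a hypothetical
application helper; the buffer it returns must not come from device memory.

void* AllocateAligned(size_t size, size_t alignment);   // hypothetical helper

void InitializeExtraPad()
{
    void* pWorkingMemory = AllocateAligned(
        nn::hid::CTR::ExtraPad::WORKING_MEMORY_SIZE,
        nn::hid::CTR::ExtraPad::WORKING_MEMORY_ALIGNMENT);

    nn::hid::ExtraPad::Initialize(
        pWorkingMemory,
        nn::hid::CTR::ExtraPad::WORKING_MEMORY_SIZE);
}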

6.2.6.4. Sampling Start and State Acquisition

Use the following functions to begin sampling and get the state.

Code 6-10. Functions Used to Start Sampling and Get States

class nn::hid::ExtraPad
{
static nn::Result StartSampling(s32 samplingThreadPriority, s32 period);
static ConnectionState GetConnectionState();
static bool IsSampling();
}

Circle Pad Pro sampling is performed by calling the nn::hid::ExtraPad::StartSampling()
function. Calling the function before initialization has completed generates an error with the same
value as the nn::hid::MakeResultNotInitialized() function. Processing of this function
requires 50 to 200 ms when connection with the Circle Pad Pro has succeeded, and 100 ms when
connection has failed.

Specify a sampling thread priority level in samplingThreadPriority. The specifiable range is
0 to 31. The higher the priority (the closer to 0), the more stable sampling becomes.

Specify the sampling cycle (in milliseconds) in period. The specifiable range is 8 to 32.

When sampling is started, application resources are consumed and a sampling thread is
created. (The thread priority level is the value specified in samplingThreadPriority.) This
thread receives sampling data from the Circle Pad Pro according to the cycle setting specified in
period, and sends data approximately once every second to maintain continuous communication
with the Circle Pad Pro. The shorter the sampling cycle, the larger the processing volume per
unit of time; the processing burden on the application core and system core grows in inverse
proportion to the length of the cycle. To lighten the load on the system, it is most effective to set
a longer sampling cycle.

Warning: When the load from other processes is especially high, it interferes with sampling
thread operating cycles and may cause delays in input, or disconnection of the Circle
Pad Pro.

Sampling cannot start unless the Circle Pad Pro is in the Active State. It is not possible to
determine whether the device is in the Active State from the library. Prompt the user to enter
input from one of the digital buttons (R/ZR/ZL) on the Circle Pad Pro when the
StartSampling() function sends back a return value indicating that the Circle Pad Pro cannot
be found (the same value as for the nn::hid::MakeResultNoConnection() function). After
the device has returned to the Active State, you can reconnect.

When the StartSampling() function is called, the library disconnects once and then
establishes the connection.

Note: If the StartSampling() function is called approximately once per second to continually
detect whether the Circle Pad Pro is attached, the resulting load is smaller than the load of
sampling at a 32-ms cycle; that is, it is smaller than the sampling processing itself.

The nn::hid::ExtraPad::GetConnectionState() function returns the Circle Pad Pro
connection state as an internal state. When a disconnection occurs due to an external cause and
CONNECTION_STATE_STOPPED is returned, disconnect once and then reconnect.

The nn::hid::ExtraPad::IsSampling() function returns whether sampling is in progress. If
the value true is returned, sampling is being performed and Circle Pad Pro input is applied to
the nn::hid::ExtraPadStatus structure obtained by the nn::hid::ExtraPadReader class.
When the value true is returned, it guarantees that the internal state is also in a connected state
(CONNECTED), but note that if a connection or disconnect is underway, the value false could
be returned even if the internal state is in a connected state.

Note: After Circle Pad Pro sampling begins, the nn::hid::PadReader class can no longer
be used, but even if the Circle Pad Pro is not connected, input from the system’s
digital buttons and circle pad is applied to the nn::hid::ExtraPadStatus structure
obtained by the nn::hid::ExtraPadReader class.

It is unnecessary for the application supporting the Circle Pad Pro to use the
connection state to switch between classes and structures, because it uses the
nn::hid::ExtraPadReader class and the nn::hid::ExtraPadStatus structure.
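
The following is a minimal sketch of starting sampling. The thread priority (16) and 16-ms cycle
are example values only; the comparison against the not-found result is described in a comment
because the exact comparison method depends on the nn::Result operators in your SDK version.

nn::Result result = nn::hid::ExtraPad::StartSampling(16 /* priority */, 16 /* ms */);
if (result.IsFailure())
{
    // If the failure is the value returned by nn::hid::MakeResultNoConnection(),
    // the Circle Pad Pro was not found. Prompt the user to press R, ZR, or ZL
    // on the device to wake it, and retry (for example, about once per second).
}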

6.2.6.5. Stop Sampling

Call the nn::hid::ExtraPad::StopSampling() function to stop sampling.

Code 6-11. Function Used to Stop Sampling


static nn::Result nn::hid::ExtraPad::StopSampling();

After a disconnect has succeeded, the sampling thread is destroyed and the internal state
transitions to a not connected state (NO_CONNECTION). This function always succeeds in
processing as long as initialization has already been completed when it is called.

6.2.6.6. Getting Sampling Results

You can get sampling results as an nn::hid::ExtraPadStatus structure with the
nn::hid::ExtraPadReader class's Read or ReadLatest() functions. The
Read() function can get sampling results in order of the latest result, but it cannot reacquire
sampling results. When the function is called at a cycle faster than the sampling cycle, it cannot
get sampling results. On the other hand, the ReadLatest() function can get only the latest
sampling result. It can also get sampling results again, so even when it is called at a cycle faster
than the sampling cycle, it can still get sampling results.

Code 6-12. nn::hid::ExtraPadStatus Structure

struct nn::hid::ExtraPadStatus {
    AnalogStickStatus stick;
    AnalogStickStatus extraStick;
    bit32 hold;
    bit32 trigger;
    bit32 release;
    u8   batteryLevel;
    bool isConnected;
    NN_PADDING2;
};

Input from the Circle Pad on the CTR system is applied to stick, in the same way as in the
nn::hid::PadStatus structure.

Input from the Circle Pad Pro’s Right Circle Pad is applied to extraStick.

The hold, trigger, and release member variables are the same as those in the
nn::hid::PadStatus structure. The following buttons have been added to the device.

Table 6-7. Buttons Added to the Extra Pad Reader

Definition                  Corresponding Button
BUTTON_ZL                   Circle Pad Pro ZL Button
BUTTON_ZR                   Circle Pad Pro ZR Button
BUTTON_EMULATION_R_UP       Up on the Right Circle Pad emulates up on the +Control Pad
BUTTON_EMULATION_R_DOWN     Down on the Right Circle Pad emulates down on the +Control Pad
BUTTON_EMULATION_R_LEFT     Left on the Right Circle Pad emulates left on the +Control Pad
BUTTON_EMULATION_R_RIGHT    Right on the Right Circle Pad emulates right on the +Control Pad

Note: When the Circle Pad Pro is attached, it is difficult to press the R Button, so the Circle
Pad Pro has been fitted with its own R Button. The application cannot detect whether
the pressed R Button is the one on the CTR system or on the Circle Pad Pro.

When the Circle Pad Pro is not performing sampling, input from the CTR system is
applied to the nn::hid::ExtraPadStatus structure in 4-ms sampling cycles by the
nn::hid::ExtraPadReader class. However, when Circle Pad Pro sampling is being
performed, the sampling cycle for input from the CTR system uses the sampling cycle
specified by the nn::hid::ExtraPad::StartSampling() function (between 8 and
32 ms).

The Circle Pad Pro battery level is stored in batteryLevel as one of two values (0 or 1). Even
when the value 0 is returned, approximately one day of continuous operation remains.

Warning: Even if the value 1 is returned for the remaining battery level, inserting a battery
whose life is already partially consumed may result in the Circle Pad Pro failing to
restart. If communication no longer works after inserting such a battery, replace it with
a new one. This issue occurs only when a battery is inserted; after a new battery has
been inserted, normal operation continues until the remaining battery life reaches 0.

The result of the library's determination of whether sampling is in progress is stored in isConnected.

Note: The time required until the application can get Circle Pad Pro input (the input delay) is
approximately 2 ms or more (2 ms + the sampling cycle) when the volume of other
processing being performed by the application is sufficiently small. Sampling the user's
input on the Circle Pad Pro takes 0 or more sampling cycles, and the subsequent
communication and CTR-side processing require approximately 2 ms.

[Link]. Circle Pad Clamp Processing

Functions used by the CTR system for circle pad clamp processing are defined with the same
arguments as nn::hid::PadReader class member functions. Operation is also the same as
with this class.

Functions used for Right Circle Pad clamp processing have the same specifications as those for
the Circle Pad on the CTR system, except that ExtraStick appears in the function names
instead of Stick. These settings can be configured independently for each stick.

Code 6-13. Functions Used in Right Circle Pad Clamp Processing

class nn::hid::ExtraPadReader
{
    void SetExtraStickClamp(s16 min, s16 max);
    void GetExtraStickClamp(s16* pMin, s16* pMax) const;
    StickClampMode GetExtraStickClampMode() const;
    void SetExtraStickClampMode(StickClampMode mode);
    f32 NormalizeExtraStick(s16 x);
    void NormalizeExtraStickWithScale(
        f32* normalized_x, f32* normalized_y, s16 x, s16 y);
    void SetNormalizeExtraStickScaleSettings(f32 scale, s16 threshold);
    void GetNormalizeExtraStickScaleSettings(f32* scale, s16* threshold) const;
};
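The following is a minimal sketch of how these clamp functions might be combined with the
reading functions described earlier. The clamp values are only illustrative, and it is an
assumption here that ReadLatest() takes a pointer to an ExtraPadStatus structure and that
AnalogStickStatus exposes s16 x and y members; verify the exact declarations in the HID
reference before relying on them.

// Sketch only: clamp and normalize Right Circle Pad input.
// Assumptions: ReadLatest() fills an ExtraPadStatus passed by pointer, and
// AnalogStickStatus has s16 x and y members. The appropriate nn/hid headers are
// assumed to be included.
nn::hid::ExtraPadReader reader;
nn::hid::ExtraPadStatus status;

// Clamp range matching the "min=40, max=145" example used elsewhere in this chapter.
reader.SetExtraStickClamp(40, 145);

reader.ReadLatest(&status);                 // get the latest sampling result

f32 x, y;
reader.NormalizeExtraStickWithScale(&x, &y, status.extraStick.x, status.extraStick.y);
// x and y now hold normalized Right Circle Pad values for use by the application.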

[Link]. Event Notification


You can register events (nn::os::LightEvent class) that notify changes in the connection
state and completion of sampling.

Code 6-14. Functions Used in Event Notification

class nn::hid::ExtraPad
{
    static void RegisterConnectionEvent(nn::os::LightEvent* pLightEvent);
    static void UnregisterConnectionEvent();
    static void RegisterSamplingEvent(nn::os::LightEvent* pLightEvent);
    static void UnregisterSamplingEvent();
};

Event notifications for changes in the connection state are registered and unregistered with the
*ConnectionEvent() functions, and event notifications for sampling completion are registered
and unregistered with the *SamplingEvent() functions. After an event is registered, the
processing performed by the sampling thread increases slightly.

The processing load added by registering the connection-state event is small enough to ignore.
Registering the sampling-completion event, however, adds processing on every sampling cycle,
so the CPU load is higher. We recommend registering these events only in applications that
actually use event notification.
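As a sketch, registering a connection-state event might look like the following. Only the
Register/Unregister functions themselves come from Code 6-14; the LightEvent initialization
shown here (passing a manual-reset flag to Initialize()) is an assumption to be checked
against the OS reference.

// Sketch: be notified when the Circle Pad Pro connection state changes.
// The LightEvent::Initialize(bool) signature is assumed; the appropriate nn/os and
// nn/hid headers are assumed to be included.
static nn::os::LightEvent s_ConnectionEvent;

void RegisterConnectionNotification()
{
    s_ConnectionEvent.Initialize(false);    // assumed: false = auto-reset
    nn::hid::ExtraPad::RegisterConnectionEvent(&s_ConnectionEvent);
}

void WaitForConnectionChange()
{
    s_ConnectionEvent.Wait();               // returns when the connection state changes
    // Check the new state with nn::hid::ExtraPad::GetConnectionState().
}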

Changes in the connection state can also be determined with the nn::hid::ExtraPad class
member functions GetConnectionState() and IsSampling(), and with the isConnected
member variable of the nn::hid::ExtraPadStatus structure.

[Link]. Calibration Applet

A library applet called the Circle Pad Pro calibration applet is provided to calibrate the feel of
the Right Circle Pad controls on the Circle Pad Pro. Follow the guidelines, which give examples
of when applications that support the Circle Pad Pro should call the Circle Pad Pro calibration
applet.

For more information, see 12.1.7. Circle Pad Pro Calibration Applet.

[Link]. Differences Between the Circle Pad and the Right Circle Pad

Input coordinates sampled from the Circle Pad and the Right Circle Pad tend to fluctuate because
of electrical noise, even when the pad is held in a fixed position. The amount by which the values
change when the pad position is fixed, and the probability of such changes occurring, differ
between the Circle Pad and the Right Circle Pad, as shown in the following table. The Right
Circle Pad is especially susceptible to these changes, so caution is required.

Table 6-8. Probability That Circle Pad and Right Circle Pad Input Coordinates Will Change

Amount of    Circle Pad     Circle Pad     Right Circle Pad   Right Circle Pad
Change       x-coordinate   y-coordinate   x-coordinate       y-coordinate
-3           0.000006 %     0.000006 %     0.002 %            0.001 %
-2           0.000178 %     0.000178 %     0.349 %            0.352 %
-1           0.003541 %     0.003541 %     3.485 %            3.273 %
±0           99.949837 %    99.949837 %    92.326 %           92.743 %
+1           0.003541 %     0.003541 %     3.491 %            3.285 %
+2           0.000178 %     0.000178 %     0.346 %            0.345 %
+3           0.000006 %     0.000006 %     0.002 %            0.001 %

The probability of change occurring within the unit of time noted is determined according to the
following formula:

Amount of sampling data obtained during unit of time × Probability from Table 6-8.
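For example, assuming an 8-ms sampling cycle (125 sampling results per second, used here only
for illustration), the expected number of -1 changes per second on the Right Circle Pad
x-coordinate is 125 × 0.03485 ≈ 4.4, whereas the same calculation for the Circle Pad
x-coordinate gives 125 × 0.00003541 ≈ 0.004.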

If the pad returns to the center position after the user's finger is removed, the value will not
change from the 0 position. This is because, even with some variation in value, the minimum
clamping value is never exceeded.

6.2.7. C Stick

The library allows you to use the C Stick on SNAKE without differentiating from the Right Circle
Pad on the Circle Pad Pro, but the actual hardware is different. You must consider hardware
differences when using this approach.

[Link]. Properties of C Stick Hardware

The C Stick on SNAKE is an analog input device that detects minor deformations (strain) when
pressure is applied to a resin stick and reads the magnitude of the deformation as changes in
voltage.

Note the following points because the output value from the C Stick depends on the pressure on
the stick part.

It may be difficult to maintain the maximum input value depending on the strength of the user.
When too much pressure is applied, the output value from the C Stick does not change (the
output value is saturated).
When the C Stick is strongly pressed from directly above, the output value from the C Stick is
undefined.
When the X Button is strongly held down, the C Stick output value might fluctuate due to the
close physical proximity to the X Button.

In addition, due to variances in the performance of the sensors on each system, note the
following.

Sensitivity variances between individual systems have an effect on resolution.

Effect on the Pressure Required to Input the Maximum Value

The maximum pressure detectable by the C Stick is approximately 240 gf. Consequently, it
might be difficult for some users to apply the maximum pressure continuously.
The minimum pressure value differs depending on the clamping method and input direction. The
detectable pressure range of the C Stick is summarized in the following table.

Table 6-9. Detectable Pressure Range of C Stick

                              Minimum Detectable Pressure                        Maximum Detectable
Clamping Method               Up/Down/Left/Right        Diagonal 45°             Pressure
                              (4 directions)            directions
Circular (min=40, max=145)    Approx. 68 gf             Approx. 72 gf            Approx. 240 gf
Cross (min=36, max=145)       Approx. 63 gf             Approx. 90 gf            Approx. 240 gf
Minimal (min=40, max=145)     Approx. 63 gf             Approx. 72 gf            Approx. 240 gf

Effect of Saturated Output Value

When excessive force is applied to the C Stick, the output value of the C Stick no longer
changes, even if more pressure is applied. This condition is called "saturation of the output
value."

C Stick input is measured on two independent axes (the x-axis and y-axis). In a situation where
the output value is saturated for both axes, the system can no longer detect any C Stick
operations by the user.

For example, as illustrated in the following figure, if the C Stick is moved in a large circular
motion with enough force and the C Stick input axes are aligned with the system, the output value
does not change for some regions at ±45° angles from the input axes.

Figure 6-10. Saturation of Output Values (C Stick Input Axes Aligned With System)

The orange x-axis and y-axis show the orientation of the system, and the black x-axis and
y-axis show the input axis of the C Stick. The orange circle shows the input pressure, the
light shaded area (in the middle) shows where a raw output value can be obtained from the
C Stick device, and the dark shaded area shows where the output value does not change for
either axis. The area where the value does not change is drawn larger than the actual area
for illustrative purposes.

The C Stick input axes are rotated 45° from the orientation of the system to minimize problems
from the user's perspective.
Figure 6-11. Saturation of Output Values (C Stick Input Axes Rotated 45° From System Orientation)

The rotation of the input axes is handled by the system. Applications do not need to apply any
rotation to C Stick input retrieved from the library.

Behavior When Strongly Pressed From Above

This action causes the stick part to bend, which may result in the following symptoms.

The C Stick output value fluctuates even though no directional force is applied.
The input is processed as a different direction than what the user intended.

Effect of Holding Down the X Button

When strong pressure is applied to the X Button, the C Stick output value might fluctuate even if
it is not being used. This issue occurs due to the deformation of the housing when the X Button is
strongly pressed, which can also cause the C Stick to bend.

This issue does not occur when the X Button is pressed quickly, and it requires significant force
when the X Button is held down. More force is required to reproduce this issue on the
development hardware and the New Nintendo 3DS than on the New Nintendo 3DS XL.

In practice, Nintendo expects the effect of this issue on the C Stick output value to stay within the
following ranges.

Table 6-10. C Stick Output Value Fluctuation Range Caused by Holding X Button

Clamping Method               x-axis     y-axis
Circular (min=40, max=145)    0 to 10    0 to -35
Cross (min=36, max=145)       0          0 to -38
Minimal (min=40, max=145)     0 to 10    0 to -37

Effect of Sensitivity Variations Between Individual Systems on Resolution

The C Stick has slight differences in sensitivity between individual systems and is calibrated
during the manufacturing process to mitigate these differences. Due to the calibration process,
some systems have higher resolution than others.
Depending on the individual system and the clamping mode, it may not be possible to get all of
the values within the clamping range from the C Stick. The change in value is not guaranteed to
be consecutive on systems with higher resolution, and this tendency becomes even more
pronounced on systems with lower resolution. Output is guaranteed, however, for the center point
and the outer edge.

For this reason, Nintendo does not recommend implementations that require input from a narrow
range or expect a smooth change in value.

The following figure shows the output value distribution on a low-resolution system with circular
clamping applied, in addition to an example of the narrow detection range that is not
recommended. The orange background represents the range of values available to the
application, and the black points represent the actual values that are output. Note that the values
of adjacent black dots are not guaranteed to be consecutive. The green circle on the left side
shows a narrow detection range when the C Stick output value is converted to a vector length.
The green square on the right shows a narrow detection range when using the raw C Stick output
value. In other words, depending on the input direction, there might not be any points that satisfy
the required vector length, and there might not be any points at the raw value being detected.

Figure 6-12. Distribution of C Stick Output Values and Examples of Narrow Detection Ranges (Circular
Clamping, Low Resolution System)

[Link]. Differences From the Circle Pad Pro

When using input to the SNAKE C Stick with the HID library, note the following differences from
the specifications and behavior of the Circle Pad Pro.

The configurable sampling frequency is different.


The device is never disconnected due to external factors.
The behavior when waking from Sleep Mode is different.
The center point and sensitivity are not calibrated in the Circle Pad Pro Calibration Applet.

Different Configurable Sampling Frequency

A sampling frequency of 10 to 21 ms can be set for the SNAKE C Stick, ZL Button, and ZR Button
hardware.
The nn::hid::ExtraPad::StartSampling() function allows you to specify a frequency in
the range from 8 to 32 ms, but the hardware handles values below 10 ms as 10 ms, and values
above 21 ms as 21 ms. The function returns data at that frequency.

We recommend setting a sampling frequency of 10 to 21 ms, in line with the sampling frequency
of the hardware.

No Disconnects due to External Factors

The Circle Pad Pro, which was connected using infrared communication, could be disconnected
by external factors such as being detached or a dead battery. The SNAKE C Stick, ZL Button, and
ZR Button, however, are part of the system itself, so
nn::hid::ExtraPad::GetConnectionState() never returns
nn::hid::ExtraPad::CONNECTION_STATE_STOPPED due to external factors.

Because the battery never runs out, the batteryLevel member of the
nn::hid::ExtraPadStatus structure is always set to 1 while input is being sampled.

Different Behavior When Waking From Sleep Mode

The Circle Pad Pro and the C Stick have the following differences in behavior when the system
enters Sleep Mode without calling nn::hid::ExtraPad::StopSampling.

Circle Pad Pro
Infrared communication stops when the system enters Sleep Mode, which would cause
sampling requests to fail immediately after the system wakes. To prevent this, the sampling
group is removed and the library treats the Circle Pad Pro as having been stopped. However,
because the internal state changes based on the infrared communication status, the internal
state before detecting the stoppage is unstable.
Specifically, the nn::hid::ExtraPad::GetConnectionState() function generally returns
CONNECTION_STATE_STOPPED, but it might sometimes return
CONNECTION_STATE_CONNECTED.
C Stick
Sampling requests succeed immediately after waking from Sleep Mode and the connection
state is retained.
Specifically, the nn::hid::ExtraPad::GetConnectionState() function always returns
CONNECTION_STATE_CONNECTED.

The value returned by the nn::hid::ExtraPad::IsSampling() function also depends on the
internal state, so that value changes at the same time.

To handle this behavior, we recommend calling nn::hid::ExtraPad::StopSampling to stop
the sampling process before entering Sleep Mode. If you call
nn::hid::ExtraPad::StopSampling, resume sampling by calling
nn::hid::ExtraPad::StartSampling after waking from Sleep Mode. Nintendo also
recommends this procedure before and after calling functions where the system could enter Sleep
Mode before control returns to the application, such as the HOME Menu or a library applet.
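A minimal sketch of this recommendation follows. The OnSleep() and OnAwake() hooks are
hypothetical application functions assumed to be called from the application's own Sleep Mode
handling, and the argument passed to StartSampling() (a sampling cycle in milliseconds) is an
assumption based on the description in this section; check the actual declaration in the HID
reference.

// Sketch only. OnSleep()/OnAwake() are hypothetical hooks called from the
// application's Sleep Mode handling; the StartSampling() argument is assumed.
namespace
{
    const int SAMPLING_CYCLE_MILLISECONDS = 16;   // assumed cycle used by the application
}

void OnSleep()
{
    // Stop sampling before the system enters Sleep Mode.
    nn::hid::ExtraPad::StopSampling();
}

void OnAwake()
{
    // Resume sampling after waking from Sleep Mode.
    nn::Result result = nn::hid::ExtraPad::StartSampling(SAMPLING_CYCLE_MILLISECONDS);
    if (result.IsFailure())
    {
        // On CTR this can mean the Circle Pad Pro was not found; prompt the user to
        // press R/ZR/ZL on the Circle Pad Pro and call StartSampling() again later.
    }
}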

No Center Point or Sensitivity Calibration in Circle Pad Pro Calibration Applet

SNAKE works differently from CTR when the Circle Pad Pro Calibration Applet is called. On
SNAKE, the system enters a mode to check the operation of the C Stick and displays a
description of how the C Stick is calibrated.
This is because the C Stick center point is automatically calibrated at the following times.

When power is turned on
If the C Stick is pressed when the power is turned on, it will not be calibrated correctly. Even
so, it is still possible to correctly calibrate the C Stick after waking from Sleep Mode, as
described below.
When waking from Sleep Mode
It is physically impossible to operate the C Stick while the system is closed, so no C Stick
input ever occurs during sleep. You can safely assume that the device was calibrated
correctly in this case.
During sampling (when the user is not using the control)
The application does not need to consider the effects of this calibration because it takes
place when the C Stick output value is below the lower limit of the clamping.

Warning: The center point calibration after Sleep Mode is triggered when the system wakes.
Consequently, the calibration does not take place when the system is reopened if the
application rejects the Sleep Mode request. The message that is displayed in the
calibration applet tells the user to put the system to sleep by closing it.

Note that center point calibration does not take place when an application rejects the
sleep request for one of the following reasons.

The application does not want to disconnect even if the system is closed.
The system does not enter Sleep Mode when closed while in Sound Mode if
headphones are connected.

6.3. MIC Library

The MIC library handles audio input from the microphone obtained by automatic sampling.

Call the nn::mic::Initialize() function to initialize the MIC library. The microphone can be
used after this function successfully completes, but to sample microphone input in an application, you
must take care of other preparations required for sampling, such as allocating the buffers for storing
the sampling results, configuring the microphone gain, and dealing with microphone power control.

6.3.1. Buffer Allocation

The application allocates the buffer for storing sampling results. The starting address of the buffer
must be nn::mic::BUFFER_ALIGNMENT (4096-byte) aligned, its size must be a multiple of
nn::mic::BUFFER_UNITSIZE (4096 bytes), and it must be allocated from non-device memory.
Pass the allocated buffer to the MIC library using the nn::mic::SetBuffer() function. The last 4
bytes of the buffer are set aside for management purposes, so the actual space available for
storing sampling results is the value specified minus these 4 bytes. Use the
nn::mic::GetSamplingBufferSize() function to get the size of the buffer used to store
sampling results.
Calling the nn::mic::SetBuffer() function after a buffer is already allocated causes an error.
To reallocate a buffer, first call the nn::mic::ResetBuffer() function and then call the
nn::mic::SetBuffer() function.
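As a sketch, a buffer that satisfies these requirements might be allocated and registered as
follows. AllocateFromAppHeap() is a hypothetical application-side allocator, and it is assumed
here that SetBuffer() takes the buffer address and size; substitute your own heap code and
verify the declaration in the MIC reference.

// Sketch only: allocate and register a sampling buffer.
// AllocateFromAppHeap() is hypothetical; it must return non-device memory with the
// requested alignment. The appropriate nn/mic headers are assumed to be included.
extern void* AllocateFromAppHeap(size_t size, size_t alignment);   // hypothetical

const size_t MIC_BUFFER_SIZE = nn::mic::BUFFER_UNITSIZE * 8;   // multiple of BUFFER_UNITSIZE

void* pMicBuffer = AllocateFromAppHeap(MIC_BUFFER_SIZE, nn::mic::BUFFER_ALIGNMENT);

nn::Result result = nn::mic::SetBuffer(pMicBuffer, MIC_BUFFER_SIZE);
// The space usable for sampling results is MIC_BUFFER_SIZE minus the 4 management
// bytes; query it with nn::mic::GetSamplingBufferSize() rather than recomputing it.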

6.3.2. Microphone Gain Settings

The microphone gain setting determines the amplification multiple for audio input from the
microphone.

Use the nn::mic::GetAmpGain() and nn::mic::SetAmpGain() functions to get and set the
gain. You can set the gain in the range from 0 to 119 with each increase of 1 representing a 0.5 dB
increment. 0 equates to 10.5 dB (a multiple of roughly 3.4), and 119 equates to 70.0 dB (a multiple
of roughly 3162). As shown in the following table, the four gain levels for the NITRO microphone
can be used by converting to gain setting values.

Table 6-11. NITRO Settable Multiples and Gain Settings

Ratio dB Gain Setting Values


20 x 26.0 31

40 x 32.0 43
80 x 38.0 55
160 x 44.0 67

The microphone gain setting is set to AMP_GAIN_DEFAULT_VALUE when the library is initialized.

6.3.3. Microphone Power Control

Use the nn::mic::SetAmp() function to control power to the microphone (microphone amp).
Pass true as an argument to turn the microphone on. Microphone input is unstable immediately
after turning the microphone on or recovering from sleep, so the first second of sampling is forcibly
muted.

Call the nn::mic::GetAmp() function to get the current setting.
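Putting the gain and power settings together, preparation before sampling might look like the
following sketch; the gain value of 43 (32.0 dB, roughly a 40× multiple) is simply the Table 6-11
example and not a recommendation.

// Sketch: configure microphone gain and turn the microphone amp on.
nn::mic::SetAmpGain(43);     // 32.0 dB, roughly a 40x multiple (see Table 6-11)
nn::mic::SetAmp(true);       // power on; the first second of sampling is forcibly muted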

6.3.4. Start Sampling

The preceding steps have prepared the microphone for sampling. Call the
nn::mic::StartSampling() function to start sampling automatically.

Code 6-15. Start Microphone Sampling

nn::Result nn::mic::StartSampling(
    nn::mic::SamplingType type, nn::mic::SamplingRate rate, s32 offset,
    size_t size, bool loop);

Specify the type of data to get in the type parameter. You can choose from the following four types
of data, depending on factors such as your bit width and sign requirements.

Table 6-12. Microphone Sampling Data Types

Type                         Bit Width   Signed   Sampling Value Range   Sampling Value Indicating Mute
SAMPLING_TYPE_8BIT           8           No       0 to 255               128
SAMPLING_TYPE_16BIT          16          No       0 to 65535             32768
SAMPLING_TYPE_SIGNED_8BIT    8           Yes      -128 to 127            0
SAMPLING_TYPE_SIGNED_16BIT   16          Yes      -32,768 to 32,767      0

Specify the sampling rate in the rate parameter. You can choose from the following four sampling
rates.

Table 6-13. Microphone Sampling Rates

Rate                   Sampling Rate
SAMPLING_RATE_32730    32.73 kHz (32728.498046875 Hz)
SAMPLING_RATE_16360    16.36 kHz (16364.2490234375 Hz)
SAMPLING_RATE_10910    10.91 kHz (10909.4993489583 Hz)
SAMPLING_RATE_8180     8.18 kHz (8182.12451171875 Hz)

Specify the offset from the start of the sampling data storage location (the start of the buffer) in
the offset parameter. The offset must be 0 or greater and aligned to
nn::mic::OUTPUT_OFFSET_ALIGNMENT (2 bytes).

Specify the number of bytes of sampling results to store in the size parameter. This value must
be a multiple of nn::mic::CTR::OUTPUT_UNITSIZE (2 bytes per sampling result) and small
enough that (offset + size) is no larger than the size specified when allocating the buffer.

Specify whether to continue sampling after storing size bytes of sampling results in the loop
parameter. Pass true to continue sampling and treat the buffer as a ring buffer (the region of size
bytes starting from offset).

If you call the nn::mic::StartSampling() function while already sampling, the current sampling
session is terminated and a new sampling session is started. If this function is called while the
system is closed, it returns nn::mic::ResultShellClose and sampling does not begin. If the
system is closed during ongoing sampling, sampling is automatically stopped and resumes when
the system is opened.
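The following sketch starts looped, signed 16-bit sampling into a previously registered buffer.
The size value is only illustrative and is assumed to fit within the usable area of the buffer
registered earlier; the enumerator scoping under nn::mic is also an assumption to verify in the
MIC reference.

// Sketch: start automatic sampling into the previously registered buffer.
const size_t SAMPLING_SIZE = 16 * 1024;    // illustrative; must satisfy offset + size <= buffer size

nn::Result result = nn::mic::StartSampling(
    nn::mic::SAMPLING_TYPE_SIGNED_16BIT,   // signed 16-bit samples
    nn::mic::SAMPLING_RATE_16360,          // 16.36 kHz
    0,                                     // offset from the start of the buffer
    SAMPLING_SIZE,                         // number of bytes to store
    true);                                 // treat the region as a ring buffer

if (result.IsFailure())
{
    // For example, nn::mic::ResultShellClose is returned while the system is closed.
}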

Call the nn::mic::AdjustSampling() function to change the sampling rate during sampling.
When you change the sampling rate this way, the new rate is used for sampling data starting from
the current storage location.

Call the nn::mic::IsSampling() function to check whether the microphone is currently being
sampled. However, this entails a significant processing burden because it sends a request directly
to the device. It is not suitable for operations that involve calling at each frame.

Call the nn::mic::SetLowPassFilter() function and pass true in the enable parameter to
apply a low-pass filter using a microphone input cutoff frequency of 45% of the sampling rate. The
default value is false (no filtering). However, data sampled with a sampling rate of
SAMPLING_RATE_32730 was sampled using a low-pass filter, so applying a low-pass filter using
this function has no effect.
[Link]. Synchronization With Sound Processing

The DSP that processes sound and the processor that handles the microphone are separate
devices. It is not possible to synchronize them perfectly. It is possible, however, to keep them in
near synchronization by correcting differences in timing.

Keep the two in near synchronization by periodically comparing the microphone's sample count
and elapsed time against the audio playback time, and correcting any offset. Also correct
mismatches in the sampling frequency by calling Voice::SetPitch.

6.3.5. Getting Sampling Results

Call the nn::mic::GetLastSamplingAddress() function to get the address of the most recent
sampling results.

The nn::os::Event class instance obtained by calling the nn::mic::GetBufferFullEvent()
function is a manual-reset event that enters the signaled state when the buffer cannot store any
more sampling results. This event enters the signaled state when the buffer has stored size
bytes of sampling results starting from the offset position, and it remains in the signaled state
until the ClearSignal() function is called.

Due to differences in individual CTR systems in terms of the range of values obtained as input, the
guaranteed input value ranges for each sampling type are defined by the
TYPE_*_GUARANTEED_INPUT_MIN(MAX) constants. Do not design applications that expect values
outside of these guaranteed input ranges when determining if there is microphone input.

Table 6-14. Guaranteed Microphone Input Ranges for Each Sampling Type

Sampling Type                 Lower Limit   Upper Limit
SAMPLING_TYPE_8BIT            27            228
SAMPLING_TYPE_16BIT           7105          58415
SAMPLING_TYPE_SIGNED_8BIT     -101          100
SAMPLING_TYPE_SIGNED_16BIT    -25663        25647

The default behavior is for microphone input values to be clamped to within these guaranteed
ranges. If you need input values across a broader range, you can disable clamping by calling the
nn::mic::SetClamp() function, but note that you might still be unable to obtain input values
from outside the guaranteed input ranges.

[Link]. Prohibited Ranges for Microphone Input Determination

Call the nn::mic::GetForbiddenArea() function, passing in the microphone amplifier gain
setting and sampling type, to get the lower and upper limits for sampling results that must not be
recognized as microphone input. The application must not recognize any sampling results with
values between these lower and upper limits as microphone input. The obtainable values are
shown in Table 6-15.

In the ranges in Table 6-15 where the gain is 68 (44.5 dB) or higher, microphone input is subject
to marked static and other noise due to sources such as system speaker output, button clicks,
and the sound of the stylus contacting the touch panel. These ranges are unsuitable for
determining whether there is any microphone input based on amplitude levels. Where the gain is
104 (62.5 dB) or higher, the full range of microphone input is noisy; only use these ranges when
invalid microphone input is not a problem.

Table 6-15. Prohibited Ranges for Microphone Input Determination

Sampling Type                 Gain         dB              Prohibited Range for Microphone Input
                                                           Detection (Noise Component)
SAMPLING_TYPE_8BIT            0 to 31      10.5 to 26.0    125 to 131
                              32 to 43     26.5 to 32.0    123 to 133
                              44 to 55     32.5 to 38.0    119 to 137
                              56 to 67     38.5 to 44.0    112 to 144
                              68 to 80     44.5 to 50.5    96 to 160
                              81 to 91     51.0 to 56.0    77 to 179
                              92 to 103    56.5 to 62.0    18 to 238
                              104 to 119   62.5 to 70.0    0 to 255
SAMPLING_TYPE_16BIT           0 to 31      10.5 to 26.0    32000 to 33536
                              32 to 43     26.5 to 32.0    31488 to 34048
                              44 to 55     32.5 to 38.0    30464 to 35072
                              56 to 67     38.5 to 44.0    28672 to 36864
                              68 to 80     44.5 to 50.5    24576 to 40960
                              81 to 91     51.0 to 56.0    19712 to 45824
                              92 to 103    56.5 to 62.0    4608 to 60928
                              104 to 119   62.5 to 70.0    0 to 65535
SAMPLING_TYPE_SIGNED_8BIT     0 to 31      10.5 to 26.0    -3 to +3
                              32 to 43     26.5 to 32.0    -5 to +5
                              44 to 55     32.5 to 38.0    -9 to +9
                              56 to 67     38.5 to 44.0    -16 to +16
                              68 to 80     44.5 to 50.5    -32 to +32
                              81 to 91     51.0 to 56.0    -51 to +51
                              92 to 103    56.5 to 62.0    -110 to +110
                              104 to 119   62.5 to 70.0    -128 to +127
SAMPLING_TYPE_SIGNED_16BIT    0 to 31      10.5 to 26.0    -768 to +768
                              32 to 43     26.5 to 32.0    -1280 to +1280
                              44 to 55     32.5 to 38.0    -2304 to +2304
                              56 to 67     38.5 to 44.0    -4096 to +4096
                              68 to 80     44.5 to 50.5    -8192 to +8192
                              81 to 91     51.0 to 56.0    -13056 to +13056
                              92 to 103    56.5 to 62.0    -28160 to +28160
                              104 to 119   62.5 to 70.0    -32768 to +32767

6.3.6. Stop Sampling

Use the nn::mic::StopSampling() function to stop sampling. This function only stops sampling.
It does not turn off power to the microphone.

6.3.7. Closing the MIC Library

To stop using input from the microphone, such as when quitting an application, first stop sampling,
and then do the following.

1. Call the nn::mic::SetAmp() function and pass false as an argument to turn off the
microphone power.
2. Call the nn::mic::ResetBuffer() function to free up allocated buffers.
3. Call the nn::mic::Finalize() function to close the MIC library.

If you want to stop sampling for a time and then sample more later, just turn off the microphone
power to save battery life.

The microphone gain setting is set to AMP_GAIN_DEFAULT_VALUE when the library is closed.
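Following the steps above, a minimal shutdown sequence might look like this sketch (sampling is
assumed to have been stopped already if it was still running).

// Sketch: release the microphone when the application no longer needs it.
nn::mic::StopSampling();      // stop sampling (if still running)
nn::mic::SetAmp(false);       // 1. turn off power to the microphone amp
nn::mic::ResetBuffer();       // 2. release the registered sampling buffer
nn::mic::Finalize();          // 3. close the MIC library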

6.4. Camera Library and Y2R Library

The camera library handles operations for the camera on the system. Images captured from the
camera are only in YUV format. For conversion to RGB format, we recommend using the YUVtoRGB
circuit, which also supports conversion to the native GPU format. Use the Y2R library for YUVtoRGB
circuit operations.

6.4.1. Initializing

To initialize the camera library, call the nn::camera::Initialize() function; to initialize the
Y2R library, call the nn::y2r::Initialize() function. However, calling the initializers is not
sufficient to prepare the camera and the YUVtoRGB circuit. You must also configure the settings for
each and ready them to send and receive data.

6.4.2. Photography Environment Settings

This section describes the configurable settings for the photography environment. You must specify
which camera to configure by selecting from the following options.

Table 6-16. Specifying Which Camera to Configure

Setting Value       Target Camera
SELECT_NONE         Specifies no cameras. Used, for example, to configure standby settings.
SELECT_OUT1         Right outer camera.
SELECT_IN1          Inner camera.
SELECT_IN1_OUT1     Right outer camera and inner camera.
SELECT_OUT2         Left outer camera.
SELECT_OUT1_OUT2    Both left and right outer cameras.
SELECT_IN1_OUT2     Left outer camera and inner camera.
SELECT_ALL          All cameras. Both left and right outer cameras and the inner camera.

Using both outer cameras together with SELECT_OUT1_OUT2 is also called stereo mode, with the
images taken by the outer cameras appearing in 3D. When doing so, make sure the photographic
environment settings of the two cameras are the same.

Note: For more information about stereoscopic display and calibration methods using the
stereo camera, see 3DS Programming Manual: Advanced Graphics.

Warning: Configuring the photographic environment settings while capturing could cause
distortion of the captured images. Consequently, first pause capture and then configure
the settings.

When specifying SELECT_ALL, note that the cameras selected differ from in the past
when there was only one outer camera.

Note that configuring the photographic environment settings while the camera is
restarting may cause the library’s processing to block for an extended period.

Image data taken by the camera includes variations based on the camera itself, ambient
light, and color variation in the subject. For example, camera variations include angle-of-
view, resolution, color reproducibility, rotation, and distortion. An image taken with the
same angle and camera setting is different in each system.

Resolution

Call the nn::camera::SetSize() function to configure the resolution of the images captured by
the camera. The resolution is the size of the image before trimming. Choose from the following
resolution options.

Table 6-17. Resolution

Setting Value          Resolution   Description
SIZE_VGA               640×480      VGA
SIZE_QVGA              320×240      QVGA
SIZE_QQVGA             160×120      QQVGA
SIZE_CIF               352×288      CIF (aspect ratio of 11:9)
SIZE_QCIF              176×144      QCIF (aspect ratio of 11:9)
SIZE_DS_LCD            256×192      Resolution of the DS LCD.
SIZE_DS_LCDx4          512×384      Resolution at twice the height and width of the DS LCD.
SIZE_CTR_TOP_LCD       400×240      Resolution of the 3DS upper LCD.
SIZE_CTR_BOTTOM_LCD    320×240      Resolution of the 3DS lower LCD. Same as QVGA.

Images captured using the SIZE_CIF, SIZE_QCIF, and SIZE_CTR_TOP_LCD settings have a 4:3
aspect ratio, with the left and right sides of the images trimmed.

Call the nn::camera::SetDetailSize() function to configure other resolution settings. You
can freely set the resolution of the captured images to crop from the original VGA-sized image by
specifying the width and height of the output image. Make sure that the cropping width and height
are greater than or equal to the output image width and height. If not trimming, the output image
width times the height must be a multiple of 128. To maintain compatibility, specify an even
number for the cropX0 parameter and choose crop coordinates such that (cropX1 - cropX0 + 1)
is a multiple of 4.

To take images in stereo mode, make sure that the two cameras are set to the same resolution.

Sharpness

Call the nn::camera::SetSharpness() function to set a camera's sharpness. Valid sharpness
values range from -4 to +5.

Exposure

Call the nn::camera::SetExposure() function to set the camera's exposure. Valid exposure
values range from -5 to +5.

Call the nn::camera::SetAutoExposure() function to enable or disable the auto exposure (AE)
feature. The exposure value is unstable immediately after starting up the camera, so we
recommend leaving the AE feature enabled. Call the nn::camera::IsAutoExposure() function
to get the camera's current AE setting. Note that the AE feature is enabled when setting the
exposure with the nn::camera::SetExposure() function, even if you have disabled the AE
feature previously.

Call the nn::camera::SetAutoExposureWindow() function to set the region used as the
standard for automatic calculation when in automatic exposure mode. Specify this region as a
portion of the 640 × 480 VGA maximum size of a captured image, with the starting coordinates,
width, and height in the following ranges. The three rightmost columns of the table show the
initial setting values for the inner (IN1), right outer (OUT1), and left outer (OUT2) cameras.

Table 6-18. Specifying the Standard Region for Exposure

Parameter   Setting                            Setting Value Range                        IN1   OUT1   OUT2
startX      Starting coordinate (horizontal)   0 to 600 (in 40-pixel increments)          80    0      0
startY      Starting coordinate (vertical)     0 to 450 (in 30-pixel increments)          60    0      0
width       Width                              40 to 640 (in 40-pixel increments);        480   640    640
                                               640 or less when summed with startX
height      Height                             30 to 480 (in 30-pixel increments);        360   480    480
                                               480 or less when summed with startY

Frame Rate

Call the nn::camera::SetFrameRate() function to set how many frames per second (fps) to
capture. You can choose from the following values. When in stereo mode, use the same frame rate
for both outer cameras.

Table 6-19. Frame Rate

Setting Value          Frame Rate
FRAME_RATE_15          Fixed at 15 fps.
FRAME_RATE_15_TO_5     Automatically varies from 15 to 5 fps depending on the available light.
FRAME_RATE_15_TO_2     Automatically varies from 15 to 2 fps depending on the available light.
FRAME_RATE_10          Fixed at 10 fps.
FRAME_RATE_8_5         Fixed at 8.5 fps.
FRAME_RATE_5           Fixed at 5 fps.
FRAME_RATE_20          Fixed at 20 fps.
FRAME_RATE_20_TO_5     Automatically varies from 20 to 5 fps depending on the available light.
FRAME_RATE_30          Fixed at 30 fps.
FRAME_RATE_30_TO_5     Automatically varies from 30 to 5 fps depending on the available light.
FRAME_RATE_15_TO_10    Automatically varies from 15 to 10 fps depending on the available light.
FRAME_RATE_20_TO_10    Automatically varies from 20 to 10 fps depending on the available light.
FRAME_RATE_30_TO_10    Automatically varies from 30 to 10 fps depending on the available light.

White Balance

Call the nn::camera::SetWhiteBalance() function to set the camera's white balance. Choose
from the following white balance setting values.
Table 6-20. White Balance

Setting Value          Aliases                                  Description
WHITE_BALANCE_AUTO     WHITE_BALANCE_NORMAL                     Automatic white balance.
WHITE_BALANCE_3200K    WHITE_BALANCE_TUNGSTEN                   Tungsten light (incandescent light bulb).
WHITE_BALANCE_4150K    WHITE_BALANCE_WHITE_FLUORESCENT_LIGHT    White fluorescent.
WHITE_BALANCE_5200K    WHITE_BALANCE_DAYLIGHT                   Daylight.
WHITE_BALANCE_6000K    WHITE_BALANCE_CLOUDY                     Cloud cover.
                       WHITE_BALANCE_HORIZON                    Sunset.
WHITE_BALANCE_7000K    WHITE_BALANCE_SHADE                      Shade.

Call the nn::camera::SetWhiteBalance() function and configure the white balance setting to
WHITE_BALANCE_AUTO to enable the auto white balance feature. Any other setting value will
disable the auto white balance feature.

Use the nn::camera::SetAutoWhiteBalance() function to enable or disable the auto white
balance feature after enabling it by using WHITE_BALANCE_AUTO. If a white balance setting other
than WHITE_BALANCE_AUTO was used, you cannot use the
nn::camera::SetAutoWhiteBalance() function. Use the
nn::camera::IsAutoWhiteBalance() function to get the current auto white balance setting.

Call the nn::camera::SetAutoWhiteBalanceWindow() function to set the region used as the
standard when the white balance is calculated automatically. Specify this region as a portion of
the 640 × 480 VGA maximum size of a captured image, with the starting coordinates, width, and
height in the following ranges. The three rightmost columns of the table show the initial setting
values for the inner (IN1), right outer (OUT1), and left outer (OUT2) cameras.

Table 6-21. Specifying the White Balance Standard Region

Parameter   Setting                            Setting Value Range                        IN1   OUT1   OUT2
startX      Starting coordinate (horizontal)   0 to 600 (in 40-pixel increments)          0     0      0
startY      Starting coordinate (vertical)     0 to 450 (in 30-pixel increments)          0     0      0
width       Width                              40 to 640 (in 40-pixel increments);        640   640    640
                                               640 or less when summed with startX
height      Height                             30 to 480 (in 30-pixel increments);        480   480    480
                                               480 or less when summed with startY

Note: The automatic adjustment of the white balance may be degraded if there is little change
in the input images, or the input image has little contrast, as with a plain wall.

Photo Mode

Call the nn::camera::SetPhotoMode() function to set the camera's photo mode to match the
subject of the photo. Choose from the following values.

Table 6-22. Photo Mode

Setting Value          Photo Mode        Description
PHOTO_MODE_NORMAL      No correction     No compensation is made to the camera settings.
PHOTO_MODE_PORTRAIT    Portrait mode     Settings are configured for portrait photography.
PHOTO_MODE_LANDSCAPE   Landscape mode    Settings are configured for landscape photography.
PHOTO_MODE_NIGHTVIEW   Night view mode   Settings are configured for photography in limited light conditions.
PHOTO_MODE_LETTER      Text mode         Settings are configured for photographing a QR Code pattern or
                                         other written characters.

Changing the photo mode overwrites the contrast, gain, sharpness, exposure, and white balance
settings as follows, and changes the standard regions for exposure for the outer and inner
cameras. When set to All, the start coordinate is (0, 0) with a height of 480 and a width of 640.
When set to Center, the start coordinate is (80, 60) with a height of 360 and a width of 480.

Table 6-23. Photography Environment Settings Changed by Changing the Photo Mode

Photo Mode        Contrast   Gain      Sharpness   Exposure   White Balance   Outer Cameras   Inner Camera
No correction     NORMAL     Normal    0           0          NORMAL          All             Center
Portrait mode     LOW        Normal    -2          0          NORMAL          Center          Center
Landscape mode    NORMAL     Normal    +1          0          DAYLIGHT        All             Center
Night view mode   NORMAL     Maximum   -1          +2         NORMAL          All             Center
Text mode         HIGH       Normal    +2          +2         NORMAL          All             Center

Flipping

Call the nn::camera::FlipImage() function to specify how to flip the camera's output images.
Choose from the following values.

Table 6-24. Flipping

Setting Value     Flipping to Apply
FLIP_NONE         No flipping
FLIP_HORIZONTAL   Horizontal flipping
FLIP_VERTICAL     Vertical flipping
FLIP_REVERSE      Rotate image 180°

Effects

Call the nn::camera::SetEffect() function to specify any special effects to apply to the
camera's output images. Choose from the following values.

Table 6-25. Effects

Setting Value     Effect
EFFECT_NONE       No special effects
EFFECT_MONO       Monochrome
EFFECT_SEPIA      Sepia tone (ocher)
EFFECT_NEGATIVE   Negative
EFFECT_NEGAFILM   Film-tone negative. Same as EFFECT_NEGATIVE, but with the U and V values swapped.
EFFECT_SEPIA01    Sepia tone (red ocher)

Applying an effect and then changing other settings may change the effect. To give an image a
softer feel, apply a sepia effect and then reduce the image's sharpness; to give an image a warmer,
redder look, apply the film-tone negative effect and then raise the image's color temperature.

Contrast

Call the nn::camera::SetContrast() function to set the camera's contrast (gamma curve).
Choose from the following values.

Table 6-26. Contrast

Setting Value                        Contrast
CONTRAST_PATTERN_n (n is 01 to 11)   Contrast pattern number n.
CONTRAST_HIGH                        Sets the contrast ratio higher than the default value (pattern number 7).
CONTRAST_NORMAL                      Default setting (pattern number 6).
CONTRAST_LOW                         Sets the contrast ratio lower than the default value (pattern number 5).

Lens Correction

Lens correction is a means of adjusting differences in brightness that may occur between the center
and edges of an image by raising the brightness of the edges to more closely match the center. Call
the nn::camera::SetLensCorrection() function to set the camera's lens correction. Choose
from the following values.

Table 6-27. Lens Correction

Setting Value             Lens Correction
LENS_CORRECTION_OFF       Disables lens correction.
LENS_CORRECTION_ON_70     Enables lens correction at a value of 70.
LENS_CORRECTION_ON_90     Enables lens correction at a value of 90.
LENS_CORRECTION_DARK      Sets edges to be darker than the default setting (same as LENS_CORRECTION_OFF).
LENS_CORRECTION_NORMAL    Default setting (same as LENS_CORRECTION_ON_70).
LENS_CORRECTION_BRIGHT    Sets edges to be brighter than the default setting (same as LENS_CORRECTION_ON_90).

Context
Use contexts to switch multiple photography environment settings at the same time. Use the
nn::camera::SwitchContext() function to switch contexts. Each camera has two contexts, A
and B, for a total of six sets of settings. These contexts can be switched independently, so the
outer camera can be in context A, while the inner camera is in context B.

Each context can specify three settings: resolution, flipping, and special effects.

Noise Filter

When the screen changes from bright to dark or from dark to bright, the camera module applies a
noise filter to remove noise from the automatically captured image. The images captured from one
of the outer cameras may appear fuzzy when in stereo mode with this noise filter feature enabled.
To prevent this, call the nn::camera::SetNoiseFilter() function and pass false for the on
parameter to disable the noise filter. Pass true for the on parameter to enable the noise filter. All
cameras have the noise filter enabled by default.

Configuring Photography Environment Settings in Batch

The API provides a feature to simultaneously configure all of the photography environment settings.

Call the nn::camera::SetPackageParameterWithoutContext() function to configure all of
the photography environment settings that have no context value (that is, everything other than
resolution, flipping, and effects).

Call the nn::camera::SetPackageParameterWithContext() function to configure the
photography environment settings that have context values (resolution, flipping, and effects).

6.4.3. Capture Settings

The images captured by the camera are written line by line in FIFO order, with the YUV-format data
sent to a buffer prepared by the application. The application then sends the data to the YUVtoRGB
circuit and the resulting RGB-format data to a buffer. The following diagram shows this set of
operations.

Figure 6-13. Captured Image Data Flow

The following sections describe the settings required to take YUV-format image data through to the
application buffer.

About Ports
As with the stereo mode, it is possible to use two cameras at the same time, making it necessary to
specify which camera to use for acquiring captured images. Of the three cameras, the inner camera
and the right outer camera are connected to one port, and the left outer camera is connected to its
own port.

Most functions that configure image capture require that a port be specified, using the following
enumerators.

Table 6-28. Port Specifications

Enumerator Description
PORT_NONE No port is specified.
PORT_CAM1 Specifies the port to which the inner and right outer cameras are connected.

PORT_CAM2 Specifies the port to which the left outer camera is connected.
PORT_BOTH Specifies both ports.

Trimming

Trimming crops a captured image to the size required by the application. Trim an image if the
camera's resolution and the required size of the captured image differ.

Call the nn::camera::SetTrimming() function to enable or disable trimming. Call the
nn::camera::SetTrimmingParams() or nn::camera::SetTrimmingParamsCenter()
function to specify the range to trim. Call the nn::camera::IsTrimming() function to check
whether trimming is currently enabled, and call the nn::camera::GetTrimmingParams()
function to get the trimming range.

Warning: Set trimming prior to capturing images.

When setting the trimming position and range using the nn::camera::SetTrimmingParams()
function, specify the position to start trimming as (x1, y1) and the position to stop trimming as (x2,
y2). The (x1, y1) coordinate is included in the trimming operation, but the (x2, y2) coordinate is
excluded. The following limitations apply to these settings.

The x1 and y1 coordinate values for the trimming start position must be even values.
x1 must be less than x2, and y1 must be less than y2.
The post-trimming width (x2 – x1) and height (y2 – y1) must also be even values.
The post-trimming image width times the height must be a multiple of 128.

When setting the trimming position and range using the
nn::camera::SetTrimmingParamsCenter() function, specify the trimming size width and
height (trimWidth and trimHeight) and the camera resolution width and height (camWidth and
camHeight); trimming is performed based on the center of the captured image. The following
code provides an example of how the (x1, y1) and (x2, y2) coordinates are calculated. Make sure
that the calculated positions meet the restrictions described for the SetTrimmingParams()
function.

Code 6-16. Calculating Positions for Centered Trimming

x1 = (camWidth - trimWidth) / 2;
y1 = (camHeight - trimHeight) / 2;
x2 = x1 + trimWidth;
y2 = y1 + trimHeight;

The captured image size sent to the buffer is the trimmed size. The cost of a transfer is still the
same for different camera resolutions, provided you trim the images to the same size. However,
trimming high-resolution images may cause the image to appear compressed due to the narrower
field of view.

Number of Transfer Bytes (Lines)

The images captured by the camera are stored by the hardware FIFO line by line, and then written
with multiple transfers to the buffer prepared by the application. The FIFO capacity is fixed at 10
KB. The number of bytes sent in a single transfer must meet the following conditions.

The number of bytes sent in a single transfer must be a multiple of 256.
The number of bytes sent in a single transfer must be no more than 10 KB (10,240 bytes).
The total number of transfer bytes must be a multiple of the number of bytes sent in a single
DMA transfer.

The total number of transfer bytes is the width times the height of the trimmed image, multiplied by
2, which is the number of bytes per pixel.

Call the nn::camera::SetTransferLines() function to set the number of bytes in a single
transfer in terms of the number of lines. You can simply set the single-transfer line count to the
value returned by the nn::camera::GetMaxLines() function, but note that execution halts in
this function if the number of lines does not meet the conditions described above. If you cannot
set the number of lines, such as when setting an optional resolution using the
nn::camera::SetDetailSize() function, instead call the
nn::camera::SetTransferBytes() function to set the number of bytes in a single transfer. You
can simply set the single-transfer byte count to the value returned by the
nn::camera::GetMaxBytes() function, but note that execution halts in this function if the
number of bytes (width × height × 2) is not a multiple of 256.

Warning: Set the number of transfer bytes (lines) prior to capturing images. To prevent buffer
errors, the GetMaxLines() and GetMaxBytes() functions calculate FIFO capacity as
5 KB.

Call the nn::camera::GetTransferBytes() function to get the currently set number of transfer bytes.
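The following sketch mirrors the conditions described above to compute a transfer unit in lines
for a trimmed image. It is not the SDK implementation of nn::camera::GetMaxLines(); it only
illustrates, under the stated assumptions, how a valid line count can be found (using 5 KB as the
safe FIFO limit, as the GetMaxLines() and GetMaxBytes() functions do).

// Sketch: find the largest line count whose transfer size is a multiple of 256 bytes,
// fits within 5 KB, and divides the total image evenly. Illustrative only.
int CalcTransferLines(int width, int height)
{
    const int bytesPerLine = width * 2;            // 2 bytes per pixel (YUV 4:2:2)
    for (int lines = height; lines >= 1; --lines)
    {
        const int bytes = bytesPerLine * lines;
        if (bytes <= 5 * 1024 &&                   // half the 10-KB FIFO, for safety
            bytes % 256 == 0 &&                    // multiple of 256 bytes
            height % lines == 0)                   // divides the total transfer evenly
        {
            return lines;                          // largest valid line count
        }
    }
    return 0;                                      // no valid setting found
}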

Receive Buffer

Call the nn::camera::GetFrameBytes function to get the size of the buffer required for receiving
one frame of captured video. The starting address of the buffer must be 4-byte aligned; an
alignment of less than 64 bytes may reduce the transfer rate. Only a buffer allocated in device
memory may be specified.

6.4.4. Starting Capture

Before transferring or capturing an image, call the nn::camera::Activate() function to activate
the camera to use for capture. You cannot activate the inner camera and the outer camera (R) at
the same time, because the two cameras are connected to the same port. The outer camera (L) is
connected to a different port. You can activate it at the same time as either the inner camera or the
outer camera (R).

After making sure that the camera is active, call nn::camera::SetReceiving to begin transfer,
and then call nn::camera::StartCapture to begin image capture.

Code 6-17. nn::camera::SetReceiving Function

void nn::camera::SetReceiving(nn::os::Event* pEvent, void* pDst,
                              nn::camera::Port port, size_t imageSize, s16 transferUnit);

Specify the event for receiving notification that transfer has completed in the pEvent parameter.
Specify the starting address of the receiving buffer to pDst aligned to
nn::camera::BUFFER_ALIGNMENT (4 bytes). Specify the port in port. Specify the byte size of a
single frame of captured video (the receive buffer size) in imageSize. In transferUnit, specify
the byte size of the data in a single transfer as returned by nn::camera::GetTransferBytes.

You can call nn::camera::IsFinishedReceiving to check whether a single frame of captured
video has finished transferring.
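As a sketch, the sequence just described might be wrapped as follows. Only the SetReceiving()
signature is taken from Code 6-17; the argument types of Activate() and StartCapture() (a
camera selection and a port, respectively) are assumptions based on this section, and the buffer
size and transfer unit are assumed to come from nn::camera::GetFrameBytes and
nn::camera::GetTransferBytes for the current settings.

// Sketch: activate the right outer camera, begin transfer, then begin capture.
// pYuvBuffer must be in device memory; imageSize and transferUnit are assumed to be
// the values obtained from GetFrameBytes and GetTransferBytes.
void BeginCaptureOut1(nn::os::Event* pTransferEndEvent, void* pYuvBuffer,
                      size_t imageSize, s16 transferUnit)
{
    nn::camera::Activate(nn::camera::SELECT_OUT1);        // assumed argument type
    nn::camera::SetReceiving(pTransferEndEvent, pYuvBuffer,
                             nn::camera::PORT_CAM1, imageSize, transferUnit);
    nn::camera::StartCapture(nn::camera::PORT_CAM1);      // assumed argument type
    // pTransferEndEvent is signaled when one frame has been received.
}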

To check whether an error has occurred during capture, such as during FIFO writing, check whether
the error event returned by nn::camera::GetBufferErrorInterruptEvent is in the signaled
state. This error event is an automatically resetting event of the nn::os::Event class that enters
the signaled state when a buffer error occurs. It also enters the signaled state when a camera
error causes the camera to restart. Recover from an error by restarting the transfer first and then
the capture.

You can call nn::camera::IsBusy to find out whether the camera is currently capturing an
image. You can also call nn::camera::GetVsyncInterruptEvent to get the event that enters
the signaled state when the camera's VSYNC interrupt occurs. Use this function in a process that
synchronizes with camera VSYNC, or in a process that changes the camera environment when a
frame is not being transferred.

Call nn::camera::SynchronizeVsyncTiming to synchronize the camera VSYNC interrupt
timing, such as when you are using the stereo cameras for stereoscopic display. This function
attempts to synchronize the VSYNC interrupt timing of the two cameras, but it does not maintain
the synchronization; call it again when synchronization is lost. The four frames after calling this
function may be very dark, because the function places the cameras into standby before restarting
them. Even when the two cameras have the same configuration, there is a small gap in the timing
at which their captured images are sent. The VSYNC interrupt timing gap becomes large after
calling nn::camera::Activate to start a camera or after changing the camera configuration,
even when capturing video with the same configuration and a fixed frame rate.

The captured image (YUV 4:2:2 format) is output with the following byte ordering.

         + 0 Byte   + 1 Byte   + 2 Byte    + 3 Byte
         Y (n)      U (n)      Y (n + 1)   V (n)

6.4.5. YUVtoRGB Circuit Settings

Use the YUVtoRGB circuit to convert YUV-format data to RGB format at the hardware level. This
circuit also supports output to the native GPU block format. However, the CTR system is only
equipped with one YUVtoRGB circuit. When converting captured images from multiple cameras,
such as when in stereo mode, use a mutex or other mechanism to make sure that multiple
conversion requests are not issued simultaneously.

This section describes settings related to data conversion using the YUVtoRGB circuit.

Input Format

Call the nn::y2r::SetInputFormat() function to set the input YUV data format. Call the
nn::y2r::GetInputFormat() function to get the current setting. Choose from the following
values.

Table 6-29. Input Format

Setting Value            Format
INPUT_YUV422_INDIV_8     Input the individual Y, U, and V values for YUV 4:2:2 as 8 bits each.
INPUT_YUV420_INDIV_8     Input the individual Y, U, and V values for YUV 4:2:0 as 8 bits each.
INPUT_YUV422_INDIV_16    Input the individual Y, U, and V values for YUV 4:2:2 as 16 bits each (padding required).
INPUT_YUV420_INDIV_16    Input the individual Y, U, and V values for YUV 4:2:0 as 16 bits each (padding required).
INPUT_YUV422_BATCH       Input the Y, U, and V values for YUV 4:2:2 all together in 32 bits.

The data formats for each value are as follows.

YUV 4:2:2 / YUV 4:2:0 individual input (8-bit)

Component   + 0 Byte   + 1 Byte    + 2 Byte    + 3 Byte
Y           Y (n)      Y (n + 1)   Y (n + 2)   Y (n + 3)
U           U (n)      U (n + 1)   U (n + 2)   U (n + 3)
V           V (n)      V (n + 1)   V (n + 2)   V (n + 3)

YUV 4:2:2 / YUV 4:2:0 individual input (16-bit with padding)

Component   + 0 Byte   + 1 Byte    + 2 Byte    + 3 Byte
Y           Y (n)      padding     Y (n + 1)   padding
U           U (n)      padding     U (n + 1)   padding
V           V (n)      padding     V (n + 1)   padding

YUV 4:2:2 batch input

Component   + 0 Byte   + 1 Byte    + 2 Byte    + 3 Byte
YUV         Y (n)      U (n)       Y (n + 1)   V (n)

You can only get captured images from the camera in YUV batch format. In most cases, specify
INPUT_YUV422_BATCH when converting captured images.

Line Width and Line Count

Call the nn::y2r::SetInputLineWidth() function to specify the line width of the data to
convert (the input data). Call the nn::y2r::GetInputLineWidth() function to get the current
setting. The line width must be set to a multiple of 8, up to a maximum value of 1024.

Call the nn::y2r::SetInputLines() function to set the number of input lines. Call the
nn::y2r::GetInputLines() function to get the current setting.

Output Format

The data converted by the YUVtoRGB circuit is stored in the output buffer. Call the
nn::y2r::SetOutputFormat() function to set the output data format. Call the
nn::y2r::GetOutputFormat() function to get the current setting. Choose from the following
output formats.

Table 6-30. Output Format

Setting Value        Format
OUTPUT_RGB_32        32-bit RGB (RGBA8888)
OUTPUT_RGB_24        24-bit RGB (RGB888)
OUTPUT_RGB_16_555    16-bit RGB (RGBA5551)
OUTPUT_RGB_16_565    16-bit RGB (RGB565)

The data formats for each value are as follows:

32-bit RGB (OUTPUT_RGB_32)
24-bit RGB (OUTPUT_RGB_24)
16-bit RGB (OUTPUT_RGB_16_555)
16-bit RGB (OUTPUT_RGB_16_565)

Formats that output the alpha component use the alpha value that was set by the
nn::y2r::SetAlpha() function. The OUTPUT_RGB_32 format uses bits 0 to 7 of the set alpha
value, while the OUTPUT_RGB_16_555 format only uses the 7th bit. Call the
nn::y2r::GetAlpha() function to get the current alpha value.

Block Alignment

Call the nn::y2r::SetBlockAlignment() function to set the block alignment for the data stored
in the output buffer. Call the nn::y2r::GetBlockAlignment() function to get the current
setting. Choose from the following block alignments.

Table 6-31. Block Alignment

Setting Value    Block Alignment
BLOCK_LINE       Horizontal line format. The usual linear format. The 24-bit and 32-bit RGB output
                 formats cannot be used directly as OpenGL standard-format textures due to
                 byte-order issues.
BLOCK_8_BY_8     8x8 block format. This is the native GPU block format. Data in this format can be
                 used as native-format textures.

Warning: When using BLOCK_8_BY_8 block alignment, the input image height (vertical line
count) must be a multiple of 8.

Output Buffer

You can call the nn::y2r::GetOutputImageSize() function to get the size of the buffer
required for receiving one frame of output data. The starting address of the buffer must be at
least 4-byte aligned; an alignment of less than 64 bytes may reduce the transfer rate. Only a
buffer allocated in device memory may be specified.

Conversion Coefficient

Choose the coefficient for YUV to RGB conversion from the following standard conversion
coefficients. Call the nn::y2r::SetStandardCoefficient() function to set the conversion
coefficient.

Select the type of conversion coefficient for converting images output by the cameras from those
returned by the nn::camera::GetSuitableY2rStandardCoefficient() function. When
making a selection, consider the possibility of future changes in the camera module.

You can choose from the following four types of standard conversion coefficients.

Table 6-32. Types of Standard Conversion Coefficient

Setting Value                       Conversion Coefficient Type (Value Range)
COEFFICIENT_ITU_R_BT_601            ITU-R BT.601 (0 ≤ Y, U, V ≤ 255)
COEFFICIENT_ITU_R_BT_709            ITU-R BT.709 (0 ≤ Y, U, V ≤ 255)
COEFFICIENT_ITU_R_BT_601_SCALING    ITU-R BT.601 (16 ≤ Y ≤ 235, 16 ≤ U, V ≤ 240)
COEFFICIENT_ITU_R_BT_709_SCALING    ITU-R BT.709 (16 ≤ Y ≤ 235, 16 ≤ U, V ≤ 240)

Rotation

You can rotate an image when converting the format. To specify how many degrees to rotate an
image, call the nn::y2r::SetRotation() function; to get the current rotation degree setting, call
the nn::y2r::GetRotation() function. You can choose from the following four rotation settings,
including no rotation.

Table 6-33. Rotation Angle

Setting Value             Rotation Angle
ROTATION_NONE             No rotation
ROTATION_CLOCKWISE_90     Clockwise 90°
ROTATION_CLOCKWISE_180    180°
ROTATION_CLOCKWISE_270    Clockwise 270° (counterclockwise 90°)

When you rotate an image, the post-conversion data no longer has valid image data ordering.
Consequently, your application must correct the data ordering after receiving each frame.

Note: This may be changed in the future so that valid data ordering is output by transfer.

Figure 6-14. Output Data Ordering Differences Due to Block Alignment and Rotation

Batch Configuring
Call the nn::y2r::SetPackageParameter() function to configure all of the YUVtoRGB circuit
settings at the same time. You can get all the settings at the same time using the
nn::y2r::GetPackageParameter() function.

6.4.6. Starting Format Conversion

Data conversion is carried out in parallel with sending and receiving, so preparations for sending
and receiving data must be completed before conversion is begun.

Use the following functions to prepare to send input data: nn::y2r::SetSendingYuv() for
batched YUV data, nn::y2r::SetSendingY() for just the Y data, nn::y2r::SetSendingU() for
just the U data, and nn::y2r::SetSendingV() for just the V data. These functions take the input
data buffer, the total transfer data size, and the size of one line of input data, specified in bytes,
as arguments. Only a buffer allocated in device memory may be specified. Note that the total
transfer data size must be a multiple of the size of one line. An offset value added each time one
line of input data is transferred (transferStride) may also be specified for any of these functions.

Call nn::y2r::SetReceiving() to prepare to receive output data. Specify the buffer for storing
the output data. (A VRAM buffer cannot be specified; only a buffer allocated in device memory may
be specified, and it must be aligned to nn::y2r::BUFFER_ALIGNMENT (4 bytes).) Specify the total
size of the received data and the received data size per transfer, in bytes. Call the
nn::y2r::GetOutputImageSize() function to get the total size for one frame. To improve
performance, we recommend specifying the transfer size as the size of eight lines of output data;
that is, the size of a single pixel in bytes returned by the nn::y2r::GetOutputFormatBytes()
function, multiplied by the width of a single line, multiplied by 8. You can also specify an offset
value added to each transfer (transferStride). To add an offset to each line, specify the size of
one line of output data as the size of a single transfer.

A good time to prepare to send and receive data is immediately after the
nn::camera::IsFinishedReceiving() function returns true or the event specified with the
nn::camera::SetReceiving() function enters the signaled state, both of which indicate that the
captured image has been fully received and the input data is confirmed. Before preparing to send
or receive data, call the nn::y2r::IsBusyConversion() function to check whether the system is
busy converting formats.

Call the nn::y2r::StartConversion() function to begin format conversion after data
send/receive preparations are complete. Data is sent and received at the same time as format
conversion begins. Call the nn::y2r::StopConversion() function if you must forcibly halt
format conversion, such as when an error occurs during capture operations.

Call the nn::y2r::IsBusyConversion() function to check whether the system is busy
converting formats. To check whether the system has completed sending input data, call the
nn::y2r::IsFinishedSendingYuv() function for batched YUV data, the
nn::y2r::IsFinishedSendingY() function for just the Y data, the
nn::y2r::IsFinishedSendingU() function for just the U data, and the
nn::y2r::IsFinishedSendingV() function for just the V data. To check whether the system
has finished receiving output data, call the nn::y2r::IsFinishedReceiving() function.

You can use a combination of these functions to check whether format conversion and data
transmission has completed for a frame. You can also check whether data transfer is complete by
using the event class (automatically resetting event) returned by the
nn::y2r::GetTransferEndEvent() function. This event class receives an interrupt notification
when transfer has completed. The nn::y2r::SetTransferEndInterrupt() function must be
called to enable interrupt notifications before getting the event class. (The default is to disable
them.) Call the nn::y2r::GetTransferEndInterrupt() function to check whether notifications
are currently enabled.

A hardware bug prevents the transfer completion event obtained by the
nn::y2r::GetTransferEndEvent() function from being signaled, and prevents the
nn::y2r::IsBusyConversion() function from returning true, if the following (rare) sequence
occurs: (1) while using the cameras and YUVtoRGB circuit simultaneously, (2) a buffer error occurs
during camera data transfer, and (3) during recovery from that error, the processing of the
nn::camera::SetReceiving() function overlaps the YUVtoRGB circuit's data transfer processing
at a particular time. For this reason, always make sure to specify a timeout when waiting for the
signaled state of a transfer completion event.

Specify a timeout longer than the time required for conversion. The time required for conversion
depends on the size of the input image and the output format. Output of a VGA image in 16-bit RGB
format takes about 13 milliseconds. Output of a VGA image in 24-bit RGB format takes about 16
milliseconds.

If this bug occurs, forcibly terminate conversion by calling the nn::y2r::StopConversion()
function. This makes the YUVtoRGB circuit usable again.

The probability of occurrence of this bug is proportional to the frequency of camera buffer errors, so
use the camera under conditions where buffer errors are unlikely to arise. When buffer errors occur
frequently, respond by lowering the camera frame rate or with another method. Also, set a higher
priority for the thread that calls the nn::camera::SetReceiving() function.
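
Putting this section together, one frame of conversion might look roughly like the sketch below. The
buffer variables, the byte sizes, the argument order of SetSendingYuv() and SetReceiving(), and
the timed wait on the event object are assumptions for illustration only; confirm the exact
signatures against the CTR-SDK API Reference.

// Sketch: convert one 640x480 frame of batch YUV 4:2:2 data to 24-bit RGB.
// yuvBuffer and rgbBuffer are assumed to be allocated in device memory and aligned to
// nn::y2r::BUFFER_ALIGNMENT (4 bytes); transferEndEvent is assumed to have been obtained
// beforehand with nn::y2r::GetTransferEndEvent() after calling
// nn::y2r::SetTransferEndInterrupt(true).
const s32    width      = 640;
const s32    lines      = 480;
const size_t yuvLine    = width * 2;   // batch YUV 4:2:2: 2 bytes per pixel
const size_t pixelBytes = 3;           // 24-bit RGB; nn::y2r::GetOutputFormatBytes() returns this
const size_t rgbLine    = width * pixelBytes;

if (!nn::y2r::IsBusyConversion())
{
    // Prepare the input transfer (buffer, total size, size of one line).
    nn::y2r::SetSendingYuv(yuvBuffer, yuvLine * lines, yuvLine);
    // Prepare the output transfer (buffer, total size, eight lines per transfer).
    nn::y2r::SetReceiving(rgbBuffer, rgbLine * lines, rgbLine * 8);

    nn::y2r::StartConversion();

    // Wait for the transfer completion event with a timeout (see the hardware bug above).
    // The timed-wait overload and the TimeSpan helper shown here are assumptions.
    if (!transferEndEvent.Wait(nn::fnd::TimeSpan::FromMilliSeconds(30)))
    {
        nn::y2r::StopConversion();  // forcibly terminate so the circuit can be reused
    }
}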

6.4.7. Playing the Shutter Sound

The shutter sound is forcibly played even when the speaker volume is set to 0. While the shutter
sound plays, the camera LED is temporarily turned off.

Note: SNAKE and FTR do not have a camera indicator LED.

Code 6-18. Playing the Shutter Sound

nn::Result nn::camera::PlayShutterSound(nn::camera::ShutterSoundType type);

Specify one of the following shutter sound types for the type parameter.

Table 6-34. Shutter Sound Types

Setting Value                   Shutter Sound Type
SHUTTER_SOUND_TYPE_NORMAL       Normal shutter sound.
SHUTTER_SOUND_TYPE_MOVIE        Sound played when starting video capture.
SHUTTER_SOUND_TYPE_MOVIE_END    Sound played when ending video capture.

6.4.8. Ending Capture

To stop image capture, perform the following steps in order; a brief sketch follows the list.
Stopping image capture without completing these steps can lead to audio noise when the HOME
Menu is displayed.

1. Call the nn::y2r::StopConversion() function to stop format conversion.
2. Call the nn::camera::StopCapture() function to stop capture.
3. Call nn::camera::Activate(SELECT_NONE) to put all cameras into standby mode.
4. Call the nn::camera::Finalize() and nn::y2r::Finalize() functions to close the
   CAMERA and Y2R libraries.
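
A minimal sketch of these steps follows. Whether nn::camera::StopCapture() and related calls
require port arguments depends on the camera configuration in use; the simple forms shown here
are assumptions.

// Sketch: stop capture and close the libraries in the order listed above.
nn::y2r::StopConversion();                      // 1. stop format conversion
nn::camera::StopCapture();                      // 2. stop capture (a port argument may be needed)
nn::camera::Activate(nn::camera::SELECT_NONE);  // 3. put all cameras into standby
nn::camera::Finalize();                         // 4. close the CAMERA library...
nn::y2r::Finalize();                            //    ...and the Y2R library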

6.4.9. Support for Sleep Mode

nn::camera::ResultIsSleeping (or nn::y2r::ResultIsSleeping) is returned when a CAMERA
(or Y2R) library function, including the initialization functions, is called while the system is closed.
Note that this value is returned for as long as the system is closed, even if the system is not in
Sleep Mode.

The camera is designed so that operations stop when the system is closed, regardless of Sleep Mode.
If the system is closed while an image is being captured, capture resumes when the system is
opened. Note, however, that when capture resumes there is a period of time, immediately after the
camera is activated by the Activate() function, during which the image is unstable. This is
because processing equivalent to the nn::camera::Activate(SELECT_NONE) function is performed
inside the library when the system is closed. Depending on the exact timing at which the system was
closed, the system may also enter a state where the IsBusy() function always returns true. This
state is canceled when the system is opened, but do not poll with the IsBusy() function if you have
implemented a process that enters Sleep Mode when the system is closed.

If the system enters sleep in the midst of RGB conversion by the Y2R library, conversion is forcibly
terminated. Conversion does not resume after recovering from sleep. If supporting sleep, do not
enter sleep during conversion (when the IsBusyConversion() function returns true or while
waiting for an event obtained by the GetTransferEndEvent() function). Implement code so that
the system only enters sleep after checking that conversion is complete. Particularly note that
events will not enter signal status after recovering from sleep if sleep is entered while waiting for
an event obtained using the GetTransferEndEvent() function.

6.4.10. Conflicts With Camera Mode

You cannot press the L Button and the R Button simultaneously to enter camera mode from the
HOME Menu while an application is using the cameras. If you attempt to do so, a dialog box is
displayed with a message stating that the cameras are in use.

The cameras are considered to be in use if either the CAMERA or the Y2R library has been
initialized. If an application that uses the cameras initializes either of these libraries when it is
started, the cameras are considered to be in use when the HOME Menu is displayed even if the
application is not actually using them. Even if the Y2R library alone is used to convert YUV images
into RGB images—during movie playback, for example—the cameras are considered to be in use
and the system does not enter camera mode.

Initialize the CAMERA and Y2R libraries just before you use them and shut them down afterwards
whenever you can. Do not leave the CAMERA and Y2R libraries initialized while they are not in use.

7. File System

This chapter describes the FS library used to access media and cautions for each media type.

7.1. FS Library

You must use the FS library to access 3DS system memory devices (Nintendo 3DS Cards and SD
cards).

7.1.1. Initializing

Call the nn::fs::Initialize() function to initialize the FS library. It is not a problem to call this
function again even after the library is already initialized.

Function calls against files or directories cause an error unless they are made after the FS library
is initialized.

7.1.2. Finalizing

To finish using the FS library, close all open files and directories, and unmount all archives.

You must also either finalize or cancel use of the FS library when shutting down the application and
when transitioning to Sleep Mode. For more information, see the Cautions When Shutting Down and
Prohibited Processes While in Sleep Mode sections.

7.1.3. Specifying Paths

All paths (to files or directories) must be specified as absolute paths. The "/" symbol (slash) is
used as the path delimiter.

You can use wide or multiple-byte characters (ASCII characters only) in path specification strings
used with FS library functions. However, keep track of the stack size when using multiple-byte
strings because the library must convert multiple-byte strings to wide-character strings, which
requires the allocation of a large buffer from the stack. Consequently, use wide-character strings
unless you have a good reason not to.

7.1.4. Accessing Files

Choose the best class for your purposes from the following three types for accessing files on
media.
Table 7-1. Classes Used for File Access

Class                       Description
nn::fs::FileInputStream     Opens a file for reading.
nn::fs::FileOutputStream    Opens a file for writing. If the specified file does not exist, one may be
                            created, depending on the settings specified at class initialization.
nn::fs::FileStream          Opens a file for reading and writing, depending on the access mode
                            specified.

Warning: Much as with Initialize() and TryInitialize(), the functions used for file
access come in versions with and without the Try prefix. Both versions operate the
same way, but make sure to use the Try versions within your application.

FS library functions generally do not return until they complete their execution. The
library also processes file-access requests in the order they arrive, so a file access by a
high-priority thread will not be processed until the file accesses requested before it have
completed.

7.1.4.1. nn::fs::FileInputStream Class

Use this class to open a file for reading. Specify the file for reading in the arguments to the
constructor or to the Initialize or TryInitialize() functions. The application halts if the
file specified in the arguments to the constructor or to the Initialize() function does not
exist. Use the return value from the TryInitialize() function instead for error checking. The
application halts if file access is attempted from an instance that did not successfully open the
file or if the file is opened again from an instance that already has it open.

Use the GetSize or TryGetSize() functions to get the size of the file. This can be used to
determine the size to use for the read buffer.

Use the Read or TryRead() functions to read a file. The arguments specify the buffer into which
to copy file contents, and the size of the buffer. The return value is the number of bytes actually
copied to the buffer, or 0 if the end of the file has been reached. If at all possible, allocate the
buffer with a 4-byte alignment. Though dependent on device and memory region, read speeds are
substantially slower for buffers that are not 4-byte aligned and whose current position does not
change in 4-byte units.

Use the GetPosition or TryGetPosition() functions to get, or the SetPosition or
TrySetPosition() functions to set, the file read start position (current position). The current
position is the number of bytes from the start of the file. You can also use the Seek or
TrySeek() functions to set the base position and the offset from there. Note that although read
speeds are dependent on device and memory region, read speeds are substantially slower if the
current position is not set to a multiple of 4 bytes from the start of the file. If the current position
is set before the start of the file or after the end of the file, the application halts on the next
attempt at accessing the file.

Table 7-2. Specifying the Base Position

Setting Value            Description
POSITION_BASE_BEGIN      Sets the current position based on the start of the file.
POSITION_BASE_CURRENT    Sets the current position based on the current position in the file.
POSITION_BASE_END        Sets the current position based on the end of the file.

Call the Finalize() function to close the file after you have finished using it.

The following code sample shows opening a file using TryInitialize(), checking whether it
opened properly, and then reading from it. The file path and the buffer allocation calls are
placeholders.

Code 7-1. File Reading

nn::fs::FileInputStream fis;
nn::Result result = fis.TryInitialize(L"rom:/sample.bin");  // placeholder path
if (result.IsSuccess())
{
    s64 fileSize;
    result = fis.TryGetSize(&fileSize);
    NN_LOG("FileSize=%lld\n", fileSize);
    void* buf = AllocateBuffer(fileSize);  // placeholder for the application's allocator
    s32 ret;
    result = fis.TryRead(&ret, buf, fileSize);
    ...
    FreeBuffer(buf);                       // placeholder for the application's allocator
}
fis.Finalize();

7.1.4.2. nn::fs::FileOutputStream Class

Use this class to open a file for writing. Specify the file for writing in the arguments to the
constructor or to the Initialize or TryInitialize() functions. Use the return value from the
TryInitialize() function instead for error checking. If the specified file does not exist and the
createIfNotExist parameter is set to true, a new file of size zero is created. The application
halts if the file specified in the arguments to the constructor or to the Initialize() function
does not exist and the createIfNotExist parameter is set to false. The application halts if
file access is attempted from an instance that did not successfully open the file or if the file is
opened again from an instance that already has it open.

Use the SetSize() or TrySetSize() functions before writing to set the size of the file. The file
writing position will be adjusted if the file size is reset smaller in the middle of a write operation.
The GetSize or TryGetSize() functions return the current file size.

Note: The TryRead() function successfully reads in the file of size zero created by the
TryInitialize() function. You must be careful if the TryInitialize() function is
called with true passed to the createIfNotExist parameter, and then later the file
size is set using the TrySetSize() function.

When the Game Card is removed or the process is otherwise interrupted between calls
to TryInitialize() and TrySetSize(), the next time TryRead() is called, the
result is that a file of size zero is read.

For handling files of fixed size, if the application is not checking the size using the
TryGetSize() function before calling TryRead, we recommend creating the file
using the TryCreateFile() function, which can set the file size, rather than using
the createIfNotExist parameter to create the file.

Use the Write or TryWrite() functions to write data to the file. The arguments specify the
starting address of a buffer that contains the data to write and the number of bytes to write. The
return value is the number of bytes actually written to the file. When writing past the end of the
file, the file is expanded if possible. If at all possible, allocate the buffer with a 4-byte alignment.
Though dependent on device and memory region, write speeds are substantially slower for
buffers that are not 4-byte aligned. You can specify whether to flush the file cache (write the
cache contents to media) each time the file is written to by passing the appropriate value in the
flush parameter. We recommend specifying false for this value to avoid wearing out the memory
media and to prevent data corruption if the memory card is removed during a write operation, and
instead calling the Flush or TryFlush() functions just before closing the file. If you choose not
to flush on every write, make sure to flush the cache before you close the file.

Use the GetPosition or TryGetPosition() functions to get, or the SetPosition or
TrySetPosition() functions to set, the file write start position (current position). The current
position is the number of bytes from the start of the file. You can also use the Seek or
TrySeek() functions to set the base position and the offset from there. Note that although write
speeds are dependent on device and memory region, write speeds are substantially slower if the
current position is not set to a multiple of 4 bytes from the start of the file. If the current position
is set before the start of the file or after the end of the file, the application halts on the next
attempt at accessing the file.

Call the Finalize() function to close the file after you have finished using it.

The following code sample shows opening a file using TryInitialize(), checking whether it
opened properly, and then writing to it. The file path is a placeholder, and buf is assumed to be a
buffer prepared by the application.

Code 7-2. File Writing

nn::fs::FileOutputStream fos;
nn::Result result = fos.TryInitialize(L"sdmc:/sample.bin", true);  // placeholder path
if (result.IsSuccess())
{
    s32 ret;
    result = fos.TryWrite(&ret, buf, sizeof(buf));
}
fos.Finalize();

7.1.4.3. nn::fs::FileStream Class

Use this class to open a file for both reading and writing. The initialization arguments are
different, but the member functions operate the same way as for the previous two classes.

Specify the file to access and the access mode in the arguments to the constructor or to the
Initialize or TryInitialize() functions. Specify the access mode as a combination of the
following flags.

Table 7-3. Access Mode Flags

Flag                Description
OPEN_MODE_READ      Opens a file for reading.
OPEN_MODE_WRITE     Opens a file for writing. Reading is also possible.
OPEN_MODE_CREATE    When combined with OPEN_MODE_WRITE, if the file specified at initialization
                    does not already exist, it is created.

7.1.5. Accessing Directories

Use the nn::fs::Directory class to get information about directories and directory contents
(subdirectories and files).

7.1.5.1. nn::fs::Directory Class

Use this class to open directories and list their contents. Specify the directory in the arguments to
the constructor or to the Initialize or TryInitialize() functions. The application halts if
the directory specified in the arguments to the constructor or to the Initialize() function does
not exist. Use the return value of the TryInitialize() function instead for error checking. The
application halts if directory access is attempted from an instance that did not successfully open
the directory, or if the directory is opened again from an instance that already has it open.

A slash is needed at the end of the path when specifying the root directory for media, such as the
"/" at the end of the "sdmc:/" path. The ending slash may also be added for non-root directories,
but it is not needed.

Use the Read() function to get directory entries. The arguments specify an array for directory
entry information (an nn::fs::DirectoryEntry structure) and the number of elements in the
array. The return value is the number of entries actually stored in the array. After all entries have
been obtained, calls to the Read() function return 0.

Call the Finalize() function to close the directory after you have finished using it.

The following code sample shows opening a directory using TryInitialize(), checking whether
it opened properly, and then getting its entries.

Code 7-3. Directory Access

nn::fs::Directory dir;
nn::fs::DirectoryEntry entry[ENTRY_MAX];
nn::Result result = dir.TryInitialize(L"sdmc:/TestDirectory");
if (result.IsSuccess())
{
    s32 readCount;
    while (true)
    {
        result = dir.TryRead(&readCount, entry, ENTRY_MAX);
        if (readCount == 0) break;
        ...
    }
}
dir.Finalize();

7.1.6. File and Directory Operations

Use these functions to create, rename, or delete files and directories.

7.1.6.1. Creating Files

Use the nn::fs::CreateFile or nn::fs::TryCreateFile() functions to create a file of the
specified size. The application halts if an error occurs on a call to the nn::fs::CreateFile()
function. Use the return value from the nn::fs::TryCreateFile() function instead for error
checking.

An error occurs if you attempt to create a file with the same full name as an existing file. If an
error occurs for some reason other than an attempt to create a file with the same name as an
existing file, it is possible that an unnecessary file has been created, so delete any such
unnecessary files. If no unnecessary file has been created, the delete operation returns
nn::fs::ResultNotFound.
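
As a hedged sketch of this pattern (the (path, size) parameter order of nn::fs::TryCreateFile(),
the error-class test, and the path itself are assumptions for illustration only):

// Sketch: create a fixed-size file, tolerating the case where it already exists.
nn::Result result = nn::fs::TryCreateFile(L"data:/record.bin", 4096);  // placeholder path/size
if (result.IsFailure())
{
    // Assumed error-class check; see the File System error-handling documentation.
    if (nn::fs::ResultAlreadyExists::Includes(result))
    {
        // The file already exists; nothing to do.
    }
    else
    {
        // Another error: an unnecessary file may have been created, so delete it.
        // nn::fs::ResultNotFound from the delete means no unnecessary file was left behind.
        nn::fs::TryDeleteFile(L"data:/record.bin");
    }
}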

7.1.6.2. Renaming Files

Use the nn::fs::RenameFile or nn::fs::TryRenameFile() functions to rename a file. The
application halts if an error occurs on a call to the nn::fs::RenameFile() function. Use the
return value from the nn::fs::TryRenameFile() function instead for error checking.

An error occurs if you call these functions on a file that is open, or if you attempt to rename a file
to a name already used by another file in the same directory (including attempting to rename a
file and specifying the name it already has).

7.1.6.3. Deleting Files

Use the nn::fs::DeleteFile or nn::fs::TryDeleteFile() functions to delete a file. The
application halts if an error occurs on a call to the nn::fs::DeleteFile() function. Use the
return value from the nn::fs::TryDeleteFile() function instead for error checking.

An error occurs if you call these functions on a file that is open.

7.1.6.4. Creating Directories

Use the nn::fs::CreateDirectory or nn::fs::TryCreateDirectory() functions to create
a directory. The application halts if an error occurs on a call to the
nn::fs::CreateDirectory() function. Use the return value from the
nn::fs::TryCreateDirectory() function instead for error checking.

An error occurs if you attempt to create a directory inside a nonexistent directory, or if you
attempt to create a directory with a name already used by another directory in the same parent
directory.

7.1.6.5. Renaming Directories

Use the nn::fs::RenameDirectory or nn::fs::TryRenameDirectory() functions to rename
a directory. The application halts if an error occurs on a call to the
nn::fs::RenameDirectory() function. Use the return value from the
nn::fs::TryRenameDirectory() function instead for error checking.

An error occurs if you call these functions on a directory that is open, or if you attempt to rename
a directory to a name already used by another directory in the same parent directory (including
attempting to rename a directory and specifying the name it already has).

7.1.6.6. Deleting Directories

Use the nn::fs::DeleteDirectory or nn::fs::TryDeleteDirectory() functions to delete
a directory. A directory cannot be deleted unless it is empty. The application halts if an error
occurs on a call to the nn::fs::DeleteDirectory() function. Use the return value from the
nn::fs::TryDeleteDirectory() function instead for error checking.

An error occurs if you call these functions on a directory that is open.

The nn::fs::TryDeleteDirectoryRecursively() function attempts to completely delete the
specified directory by recursively deleting all of its entries. If an error occurs during this process,
the function returns that error.

7.1.7. Checking the SD Card State

The SDK provides functions to check whether an SD card is inserted, to send notification when an
SD card is inserted or removed, and to check whether an SD card can be written to.

Code 7-4. Checking the SD Card Insertion State, Notifying of Insertion/Removal, and Checking for Writability

bool nn::fs::IsSdmcInserted();
void nn::fs::RegisterSdmcInsertedEvent(nn::os::LightEvent* p);
void nn::fs::UnregisterSdmcInsertedEvent();
void nn::fs::RegisterSdmcEjectedEvent(nn::os::LightEvent* p);
void nn::fs::UnregisterSdmcEjectedEvent();
bool nn::fs::IsSdmcWritable();
nn::Result nn::fs::GetSdmcSize(s64* pTotal, s64* pFree);

Call the nn::fs::IsSdmcInserted() function to check whether an SD card is inserted. The
function returns true if a device is currently inserted in the SD card slot, even if it is a broken SD
card or something other than an SD card, such as an empty SD card adapter. The processing to
check whether an SD card is inserted entails a heavy processor load; we recommend using the
following functions to register an event class that waits for notification when an SD card is inserted
or removed.

Call the nn::fs::RegisterSdmcInsertedEvent() and
nn::fs::UnregisterSdmcInsertedEvent() functions to register and unregister an instance of
the nn::os::LightEvent class to receive notifications of SD card insertions.

Call nn::fs::RegisterSdmcEjectedEvent and nn::fs::UnregisterSdmcEjectedEvent to
register and unregister an instance of the nn::os::LightEvent class to receive notifications of
SD card ejections.

The nn::fs::IsSdmcWritable() function returns true if an SD card is inserted and can be
written to.

The nn::fs::GetSdmcSize() function returns the total capacity of an SD card (pTotal) and the
available capacity (pFree).
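
For example, instead of polling nn::fs::IsSdmcInserted(), an application might register a
LightEvent and wait on it, roughly as sketched below. The LightEvent initialization argument and
wait call are assumptions about the nn::os::LightEvent interface.

// Sketch: be notified when an SD card is inserted instead of polling.
nn::os::LightEvent sdInsertedEvent;
sdInsertedEvent.Initialize(false);                   // assumed: auto-reset event
nn::fs::RegisterSdmcInsertedEvent(&sdInsertedEvent);

if (!nn::fs::IsSdmcInserted())
{
    sdInsertedEvent.Wait();                          // block until a card is inserted
}
// ... access the SD card ...

nn::fs::UnregisterSdmcInsertedEvent();
sdInsertedEvent.Finalize();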
7.1.8. Latency Emulation

You can use the following function to debug changes in access speeds. It emulates file access
latency in an application caused by conflicting file system access with SpotPass or another
background process.

Code 7-5. Initializing Latency Emulation

void nn::fs::InitializeLatencyEmulation(void);

Use Debug Mode in the Config tool’s Debug Settings to enable and disable latency emulation.
While the application is running, file accesses are delayed by the number of milliseconds specified
by FS Latency Emulation only when the corresponding item is set to enable in the Config tool and
this function has enabled latency emulation.

7.1.9. Access Priority Settings

File system access priority is supported. When multiple files are accessed from multiple threads,
the execution order is adjusted appropriately so that accesses with a higher priority setting are
processed sooner.

Appropriate settings are provided for file accesses that, like streaming, require real-time capability
(an established amount of processing must complete within an established number of cycles).
Using these settings keeps the delay resulting from file access to a minimum.

7.1.9.1. Types of Access Priority

The access priorities that can be set are shown in the following table, in priority order.

Table 7-4. Access Priority Types

Type                 Definition               Description
Real-time Priority   PRIORITY_APP_REALTIME    Special priority level used for accessing files which, if
                                              loading is delayed, could detract from the user's
                                              experience. This includes the loading of streaming data.
                                              There are, however, some restrictions on its use.
Normal Priority      PRIORITY_APP_NORMAL      Priority level used for accessing general-purpose files,
                                              such as model data, scene data, and all types of save
                                              data.
Low Priority         PRIORITY_APP_LOW         Priority level used for accessing files that can be
                                              processed whenever the system is available, at a priority
                                              lower than normal. This includes autosave.

The values used to specify the access priority level are defined in the
nn::fs::PriorityForApplication enumeration.

Note: For the restrictions on real-time priority see the CTR-SDK API Reference.
7.1.9.2. Access Priority Setting Targets

The following list shows a range of targets for access priority settings.

Table 7-5. Access Priority Setting Targets and Functions Used in Settings

Setting Target               Function Used in Setting
Overall File System          nn::fs::SetPriority
Archive                      nn::fs::SetArchivePriority
File Stream and Directory    TrySetPriority or SetPriority of each class

The Overall File System setting target refers to general access from the application and to access
of archives performed without specifying an archive name; this includes save data formatting. If an
access priority has not been set explicitly, normal priority is applied.

The Archive setting target refers to an already-mounted archive specified by name. A different
access priority can be set for each archive. If the access priority is not set explicitly, the overall
file system setting at the time the archive was mounted is applied. Even if the overall file system
setting is changed after the archive is mounted, the archive's own setting is not affected. Similarly,
even if the archive setting is changed, the setting for the overall file system is not affected.

The File Stream and Directory setting target applies to file stream objects (such as
nn::fs::FileStream) and directory objects (nn::fs::Directory). Access priority settings can
differ for each object, even for objects that access the same file or directory. Unless the access
priority is set explicitly, the archive setting at the time the object was created is applied. Even if
the archive setting is changed after an object is created, the settings of existing objects are not
affected. Similarly, even if the settings of individual objects are changed, the archive setting is not
affected.
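
As a sketch of how the three levels might be combined (the values follow Table 7-4, but the exact
function signatures, the archive name, and the stream path are assumptions):

// Sketch: keep most accesses at normal priority, but give one streaming file real-time priority.
nn::fs::SetPriority(nn::fs::PRIORITY_APP_NORMAL);                  // overall file system
nn::fs::SetArchivePriority("rom:", nn::fs::PRIORITY_APP_NORMAL);   // a specific archive

nn::fs::FileInputStream stream;
if (stream.TryInitialize(L"rom:/stream.dat").IsSuccess())          // placeholder path
{
    stream.TrySetPriority(nn::fs::PRIORITY_APP_REALTIME);          // this object only
}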

7.1.9.3. Cautions

Access priority changes priorities within the file system only. If the priority of the thread that calls
a file access function is not set high enough, the file access process will not start immediately
(even if the access itself has real-time priority). Set the priority of threads that call file access
functions high enough to meet the demands of their particular processing.

In addition, the order in which file accesses complete is not guaranteed. There is always a
possibility that access to files with a lower priority level may complete before access to files with
a higher priority, even when the lower-priority access was requested at a later time. Do not
implement your application so that it depends on the order in which file accesses complete when
multiple files are accessed in parallel.

Also, do not base your file access performance design on measured performance values. For
estimated access times, see the CTR-SDK API Reference.

7.1.10. Restriction on the Number of Files and Directories Opened Simultaneously

There are limits to the number of files and directories that can be opened at the same time using
the file system.

For safe operation, limit the number of files and directories that an application opens
simultaneously to the following.

Up to four files combined for the archive that directly accesses the SD card and the extended
save data archive
Up to 10 files for the save data archive (including the save data of other applications)
Up to 10 total files

Note: A ROM archive is not subject to these limits. Its limits are in accordance with the
parameters defined by calling the MountRom() function.

7.2. ROM Archives

ROM archives are read-only archives for accessing ROMFS that are created at build time. ROM
archives must be mounted explicitly by applications. The mounting procedure and arguments are the
same regardless of whether the application is card-based software or a downloadable application.

The use of ROM archives requires working memory allocated by the application. The required size of
this working memory depends on the number of files and directories that can be opened
simultaneously. You can get the required working memory size by calling
nn::fs::GetRomRequiredMemorySize. If this function returns a negative value, you cannot mount
a ROM archive. After your application has allocated the required amount of memory (or more) with
nn::fs::WORKING_MEMORY_ALIGNMENT (4 bytes) alignment, call nn::fs::MountRom() to mount
the ROM archive.

Code 7-6. Mounting a ROM Archive

s32 nn::fs::GetRomRequiredMemorySize(size_t maxFile, size_t maxDirectory,
                                     bool useCache = true);
nn::Result nn::fs::MountRom(const char* archiveName, size_t maxFile,
                            size_t maxDirectory, void* workingMemory,
                            size_t workingMemorySize, bool useCache = true);

Specify the number of files and directories that can be opened simultaneously in maxFile and
maxDirectory, respectively. The number of files that can be opened simultaneously only depends
on the amount of working memory. This number is not affected by factors like the length of filenames.

Specify a value of true in the useCache parameter to cache metadata and shorten the time required
to open files or scan directories. Note that this increases the amount of working memory required.

Specify the name of the archive to mount in the archiveName parameter. The archive is mounted to
rom: if you call the overloaded version that omits this parameter.

Pass the working memory and its size to workingMemory and workingMemorySize.

Note: You do not need to handle errors for these functions. If an error occurs in a function, an
error screen displays but an error is not returned to the application.
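
Using the declarations in Code 7-6, a typical mount sequence might look like the following sketch.
AllocateAligned() stands in for whatever allocator the application uses.

// Sketch: mount the ROM archive with room for 16 files and 4 directories open at once.
const size_t maxFile      = 16;
const size_t maxDirectory = 4;
s32 workSize = nn::fs::GetRomRequiredMemorySize(maxFile, maxDirectory, true);
if (workSize >= 0)
{
    // Allocate workSize bytes aligned to nn::fs::WORKING_MEMORY_ALIGNMENT (4 bytes).
    // AllocateAligned() is a placeholder for the application's own allocator.
    void* workingMemory = AllocateAligned(workSize, nn::fs::WORKING_MEMORY_ALIGNMENT);
    nn::fs::MountRom("rom:", maxFile, maxDirectory,
                     workingMemory, workSize, true);
}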

7.2.1. Specifying Archive Names

Archive names may contain only single-byte alphanumeric characters and certain symbols, not
including the colon, which is used as the archive name delimiter. Names are case-sensitive and
must be at least one character and no more than eight characters long, including the colon
delimiter.

Do not use archive names that start with the dollar symbol ($). For information about characters
and words that cannot be used in archive, file, and directory names, see the API Reference.

7.3. Save Data

For card-based software, the application-specific save data region is located in the backup memory.
For downloaded applications, the save data region is located in an archive file on an SD card. The
save data region can be accessed by the FS library as an archive. The function and parameters used
to access this archive are the same regardless of whether the accessing application is stored on a
Nintendo 3DS Card or an SD card.

Code 7-7. Mounting, Formatting, and Committing Save Data Regions

nn::Result nn::fs::MountSaveData(const char* archiveName = "data:");


nn::Result nn::fs::MountSaveData(const char* archiveName, bit32 uniqueId);
nn::Result nn::fs::MountDemoSaveData(const char* archiveName, bit32 uniqueId,
bit8 demoIndex);
nn::Result nn::fs::FormatSaveData(size_t maxFiles, size_t maxDirectories,
bool isDuplicateAll = false);
nn::Result nn::fs::CommitSaveData(const char* archiveName = "data:");

Call the nn::fs::MountSaveData() function to mount a save data region to the archive path name
specified in the archiveName parameter. The save data from another application can be mounted by
calling the overloaded function with the uniqueId parameter. Call the
nn::fs::MountDemoSaveData() function to mount the save data region of the demos. Specify the
index of the demo for demoIndex.

Note: The values to be specified in uniqueId upon mounting the save data region are based on
the unique ID specified in each application’s RSF file. To specify another application’s
unique ID, you must first specify that application’s unique ID in the RSF file of the
application to be mounted.

If an error belonging to the nn::fs::ResultNotFormatted, nn::fs::ResultBadFormat, or
nn::fs::ResultVerificationFailed classes is returned when mounting the save data region,
format the save data region by calling the nn::fs::FormatSaveData() function and then try
mounting it again. When formatting, specify the maximum number of files and directories using the
maxFiles and maxDirectories parameters. There are no restrictions on the values that can be
specified. For more information about specifying archive names, see 7.2.1. Specifying Archive
Names.
Warning: The same error class (nn::fs::ResultNotFound) is returned when mounting the
save data region of card-based software from a downloaded application (including
demos) if a Nintendo 3DS Card is not inserted, or if the inserted 3DS Card does not have
a formatted save data region.

You cannot escape from this state unless the card-based software application is started
and formats its save data region. Be sure to display a message appropriate to the cause
of the error. For example, your message could say something like "Could not find save
data for (media name). If you have never run the game before, run it now, create save
data, and try again."

Note: Depending on the media, some errors are not returned. For example, CARD1 titles may
currently return nn::fs::ResultBadFormat when the save data region is mounted, but
CARD2 titles and downloaded applications will not return nn::fs::ResultBadFormat.
We recommend writing media-independent processing that can adapt to future changes
in error handling even in situations like this.
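
Putting the mount and format handling above together, a mount routine might be sketched as
follows. The Includes() error-class test is an assumption, the file and directory counts are
placeholders, and isDuplicateAll enables the automatic redundancy described below.

// Sketch: mount the save data region, formatting it on first use or after corruption.
nn::Result result = nn::fs::MountSaveData("data:");
if (result.IsFailure())
{
    // Assumed error-class check; see the File System error-handling documentation.
    if (nn::fs::ResultNotFormatted::Includes(result) ||
        nn::fs::ResultBadFormat::Includes(result)    ||
        nn::fs::ResultVerificationFailed::Includes(result))
    {
        nn::fs::FormatSaveData(8, 4, true);      // placeholder counts; enable redundancy
        result = nn::fs::MountSaveData("data:");
    }
}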

The library supports automatic redundancy for the entire save data region. The library mirrors data
over the entire save data region when the isDuplicateAll parameter during formatting was
specified as true. When writing files with automatic redundancy enabled, you must call
nn::fs::CommitSaveData before unmounting the save data region or the file updates will not be
written to the media. Likewise, if the power is cut before the updates are committed to memory, only
the old data will be available the next time the system is booted and the save data is mounted.

Warning: If save data consists of multiple interdependent files, call
nn::fs::CommitSaveData at a point where these dependencies are consistent.

When automatic redundancy is enabled, the memory available for save data files is half of the
physical capacity of the backup region. A portion of this space is also reserved for file system
management. Use the Worksheet for Calculating the Save Data Capacity, found in the API Reference,
to calculate the actual available capacity of the backup region. The current implementation uses
regions of 512 byte blocks when saving files.

The mounted save data region can be treated as an archive. You can create files freely within the
archive. Filenames and directory names can be up to 16 characters long, and the maximum path
length is 253 characters. Only half-width alphanumeric characters and certain symbols can be used
for filenames and directory names. Slashes ("/") are used to delimit folders in paths. Other than the
amount of available capacity for save data, there are no restrictions on the maximum size of the file
that can be created or the maximum size that can be written at one time.

Call the nn::fs::GetArchiveFreeBytes() function to get the available capacity of an archive.

Code 7-8. Getting an Archive’s Available Capacity

nn::Result nn::fs::GetArchiveFreeBytes(s64* pOut, const char* archiveName);

For the pOut parameter, specify a pointer to a variable of type s64 to receive the number of available
bytes. For the archiveName parameter, use the name of the archive specified when it was mounted.

The files within the archive are protected by a tamper detection feature that uses hash values. If the
hash value of the data that is accessed does not match the expected value, the system determines
that the data has been tampered with and returns an error. When automatic redundancy is not
enabled, mismatching hash errors can also occur when trying to access data that was corrupted if the
system was turned off during a save or if the card was removed during a save.

When you finish accessing the save data region, call the nn::fs::Unmount() function, specifying
the archive name as an argument, to unmount the save data.

Code 7-9. Unmounting

nn::Result nn::fs::Unmount(const char* archiveName);

Downloaded applications, much like Nintendo DSiWare titles, create a save data region automatically
as soon as they are imported into the target media. When a downloaded application is deleted, its
save data is also deleted.

7.3.1. Measures Against Rollback of Save Data

Note: See the File System: Save Data Rollback Prevention Feature section in the CTR-SDK
API Reference.

Warning: In general, if an application uses the save data rollback prevention feature, do not
support backing up save data in the banner specifications file. (Set
DisableSaveDataBackup to True.) For more information, see the 3DS Overview
developer manual, included in the CTR Overview package, and the description of
ctr_makebanner, included in the ../documents/tools/ folder of the CTR-SDK.
Switching between using and not using the save data rollback prevention feature when
applying a patch is prohibited.

7.3.2. Handling Errors When Files or Directories Do Not Exist

Even if an application takes various measures to create save data correctly, it is still possible to
reach a state where mounting completes successfully but save data files do not exist, if the
following conditions are met.

The save data was formatted when it was first created.
After formatting completed, while files or directories were being created (and, when automatic
redundancy is enabled, before the save data was committed), the application was interrupted for
one of the following reasons:
  - The Power Button was held down.
  - The card was removed.
  - Power was cut off partway through the process. This occurs particularly in applications that
    allow the system to enter a sleep state, when the system is closed and then left alone.

In cases like these, nn::fs::ResultNotFound could be returned for file or directory operations.
Implement your application so that the user's save data can be restored to its normal state.

7.4. Extended Save Data

The term extended save data refers to data that is created on an SD card and managed separately
from the save data. The extended save data region can only be used on the system it was created
on. We recommend that the extended save data region be used to store any application-specific data
that is linked to (relies on) information that only exists on a specific system (for example, friend
information). Do not use the extended save data region to store data that is required in order to make
progress in the game. The system never automatically creates extended save data.

Code 7-10. Mounting, Creating, and Deleting Extended Save Data

nn::Result nn::fs::MountExtSaveData(const char* archiveName,
                                    nn::fs::ExtSaveDataId id);
nn::Result nn::fs::CreateExtSaveData(nn::fs::ExtSaveDataId id,
                                     const void* iconData, size_t iconDataSize,
                                     u32 entryDirectory, u32 entryFile);
nn::Result nn::fs::DeleteExtSaveData(nn::fs::ExtSaveDataId id);

To use extended save data, you must first mount it by specifying the extended save data ID in a call
to the nn::fs::MountExtSaveData() function. If the function returns a value of
nn::fs::ResultNotFormatted, nn::fs::ResultNotFound, nn::fs::ResultBadFormat, or
nn::fs::ResultVerificationFailed and extended save data has already been created, delete
extended save data using the nn::fs::DeleteExtSaveData() function and create extended save
data by calling the nn::fs::CreateExtSaveData() function and then remount. If an error
belonging to nn::fs::ResultOperationDenied has been returned, it may be that the SD card or
file cannot be written to, or that there is a contact fault in the SD card. See the File System: Error
Handling section in the API Reference.

The extended save data is mounted to the archive path specified in the archiveName parameter. For
more information about specifying archive names, see 7.2.1. Specifying Archive Names.

Specify the extended save data ID in the id parameter. The extended save data ID is a number that
identifies the extended save data. The extended save data ID for extended save data that
applications can access must be set as either ExtSaveDataNumber in the makerom RSF file (only
one ID can be specified) or AccessibleSaveDataIds in AccessControlInfo (where multiple IDs
can be specified). Normally you would specify a unique ID issued by Nintendo for extended save data
IDs created by applications. Consequently, when sharing extended save data among multiple
applications, you must specify the unique ID for one of those applications.

When creating extended save data, the iconData and iconDataSize parameters specify an
icon for display on the data management screen (an ICN file created using the ctr_makebanner32
tool) and the icon's size. Use the entryDirectory and entryFile parameters to specify the
number of directories and files to store in the extended save data.
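
A hedged sketch of this mount-or-create flow is shown below. EXTDATA_ID, iconData,
iconDataSize, and the entry counts are placeholders, and the error-class test is an assumption.

// Sketch: mount extended save data, re-creating it if it is missing or corrupted.
nn::Result result = nn::fs::MountExtSaveData("extdata:", EXTDATA_ID);
if (result.IsFailure())
{
    // Assumed error-class check; see the File System error-handling documentation.
    if (nn::fs::ResultNotFormatted::Includes(result) ||
        nn::fs::ResultNotFound::Includes(result)     ||
        nn::fs::ResultBadFormat::Includes(result)    ||
        nn::fs::ResultVerificationFailed::Includes(result))
    {
        nn::fs::DeleteExtSaveData(EXTDATA_ID);          // discard any corrupted region
        nn::fs::CreateExtSaveData(EXTDATA_ID, iconData, iconDataSize,
                                  4, 16);               // placeholder directory/file counts
        result = nn::fs::MountExtSaveData("extdata:", EXTDATA_ID);
    }
}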

The mounted extended save data region can be treated as an archive. You can create files freely
within the archive. However, the size of files cannot be changed after they are created unless the
nn::fs::TryCreateFile() function is used to create them. Filenames and directory names can
be up to 16 characters long, and the maximum path length is 248 characters. Only half-width
alphanumeric characters and certain symbols can be used for filenames and directory names.
Slashes ("/") are used to delimit folders in paths. Archive sizes are variable.
The files within the archive are protected by a tamper detection feature that uses hash values. If the
hash value of the data that is accessed does not match the expected value, the system determines
that the data has been tampered with and returns an error. Mismatching hash errors can also occur
when trying to access data that was corrupted because the system was turned off or the card was
removed while data was being written. Unlike save data, the library does not support extended
save data mirroring.

There are no data protection features, so removing an SD card while a file is being written will very
likely result in corrupted extended save data. You must re-create any corrupted extended save data
after deleting it using the nn::fs::DeleteExtSaveData() function.

When you finish accessing the extended save data region, call the nn::fs::Unmount() function,
specifying the archive name as an argument, to unmount the extended save data.

Deleting downloaded applications has no effect on extended save data. Extended save data can be
deleted from the Data Management screen of System Settings.

Note: The application is free to use up to a total of 32 MB of extended save data. If you want to
use more than 32 MB, please contact Nintendo.

7.4.1. Uses of Extended Save Data

Extended save data can be used to save information such as the following.

Application-specific data
Data shared by a series (shared among multiple titles or versions of titles in the same series)
Downloaded data (such as additional items or levels)
Contextual CTR banner data (created in the application, or downloaded from a server)

7.4.1.1. Application-Specific Data

This can include relatively unimportant data that is unrelated to the game's progress, data that is
linked to information that is only saved on the system, or large user-created data.

7.4.1.2. Data Shared by a Series

This refers to data that is shared by multiple titles in the same series (including different versions
of a title). You can share data between titles in a series by using the common extended save data
ID throughout the series.

7.4.1.3. Downloaded Data

This refers to additional levels or other such data that are downloaded by the application by
registering a download task. Only the application that registered a download task can access the
data obtained from that task, but you can make this data accessible to other applications that
share the extended save data by moving it into the extended save data.

7.4.1.4. Contextual CTR Banner Data

This refers to data that can replace portions of the data in CTR title banners. The portions that
can be replaced are the scrolling text that is displayed at the bottom of the upper screen and a
single texture for the replacement name. The system does not support replacing icon data.

There are two types of contextual CTR banner data: local contextual banners, which are created
by applications, and downloaded contextual banners, which are downloaded from servers.

Local Contextual Banners

Local data can be used to display a message based on the game's progress or to change part of
the texture that is displayed in the CTR title banner. The text can be up to 255 characters (either
single byte or double byte), and the data size can be up to 128 KB. It is possible to include
multiple images within a single texture and to apply these images to multiple models. Prepare
"COMMON" data used for all languages in addition to any language-specific data that may be
enabled for the target region.

Downloaded Contextual Banners

This data is provided to a server and downloaded by the system, and can be used to change the
display. The text and textures have the same specifications as for local contextual banners. You
can specify an expiration date (year, month, and day) for downloaded contextual banners.

Displaying Contextual CTR Banner Data

The text displayed for the contextual banner data alternates in the following order: (1) text for
downloaded contextual banners, (2) text for local contextual banners. If the texture included
within a local contextual banner has the same name as a texture included within a downloaded
contextual banner, the system prioritizes the downloaded contextual banner's texture.

7.4.2. Access From Multiple Game Cards

More than one person may share a single 3DS system and a single SD card, with each person
owning a game card for the same title. This alone does not represent a problem, because save
data is written to the backup memory of each separate game card. Note, however, that bugs may
arise depending on the configuration of the data saved in extended save data.

Specifically, situations like the following may occur.

The program may run out of control if the integrity between save data and extended save data is lost.
If you use fixed filenames for saved data, files stored in extended save data created for one game
card may overwrite those created for the other, regardless of the user's intention.
Contrary to the developer's intention, as many SD cards as game cards may be required for
multiple game cards to be played on the same 3DS system.

For example, unintentional overwriting of data can be avoided by assigning unique directory and
filenames when creating files in extended save data. However, rather than using a name freely
chosen by the user, such as a character name, as the basis for naming such files, apply a text
string such as one that includes the current time. In addition, when writing information linked with
extended save data in save data, note that the extended save data associated with that information
may not exist on the SD card.

Warning: As for contextual CTR banner data, there is no way to handle situations like this. The
downloaded contextual banner data or local contextual banner data last created is
enabled, regardless of which game card is inserted.

7.4.3. Accessing Multiple Items of Extended Save Data

You can specify multiple unique IDs (up to a maximum of six) for the accessible save data
attributes (AccessibleSaveDataIds in AccessControlInfo) in the RSF file to enable access
to the following data.

Save data with the same unique IDs as the unique IDs specified
Extended save data with the same extended save data IDs as the unique IDs specified

Warning: Individual items of extended save data can be deleted in System Settings. Whenever
possible, avoid storing data that must remain consistent with other data in extended
save data.

For titles that use CTR contextual banners, create extended save data that takes the
title's unique ID as its extended save data ID in addition to the extended save data for
shared access. This is because the files used to display the contextual banner must be
in the ExBanner directory for extended save data created with the title’s unique ID.

Note: For information about how to write the RSF file, see the reference for the ctr_makerom
CTR-SDK tool or the 3DS Programming Manual: Creating Applications.

7.5. Archives That Directly Access SD Cards

Call the nn::fs::MountSdmcWriteOnly() function to mount an archive that directly accesses the
inserted SD card. Note that the FileSystemAccess permission attribute under
AccessControlInfo in the RSF file must specify - DirectSdmcWrite.

Code 7-11. Mounting an Archive That Directly Accesses an SD Card

nn::Result nn::fs::MountSdmcWriteOnly(const char* archiveName = "sdmcwo:");

No files or directories can be read in archives mounted using this function, but written files are not
encrypted, so they can be read from a PC. Unlike any extended save data on the same SD card, you
can change the size of files after they are created.

Fewer errors can occur when mounting this kind of archive than when mounting extended save data,
and you can handle those errors in the same way as for extended save data. However, because reading
is prohibited, you cannot directly check whether a file or directory exists. You must attempt a write
once and then handle any nn::fs::ResultAlreadyExists errors returned by the write function.

Call the nn::fs::Unmount() function and pass the archive name as an argument to unmount an
archive.
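
The following minimal sketch combines these calls. Only nn::fs::MountSdmcWriteOnly() and
nn::fs::Unmount() are taken from this section; WriteLogFile() is a hypothetical application-side
helper that stands in for whichever nn::fs write function is actually used, and the exact idiom for
detecting nn::fs::ResultAlreadyExists depends on the Result API.

// pBuffer and bufferSize are the data to write; WriteLogFile() is a hypothetical helper.
void SaveReplayToSdCard(const void* pBuffer, size_t bufferSize)
{
    // Mount the write-only SD card archive (requires - DirectSdmcWrite in the RSF file).
    nn::Result result = nn::fs::MountSdmcWriteOnly();   // Mounted as "sdmcwo:" by default.
    if (result.IsFailure())
    {
        return;   // Handle mount errors in the same way as for extended save data.
    }

    // Reading is prohibited, so existence cannot be checked beforehand. Attempt the
    // write and handle nn::fs::ResultAlreadyExists if the write function returns it.
    result = WriteLogFile("sdmcwo:/replay/replay.bin", pBuffer, bufferSize);
    if (result.IsFailure())
    {
        // For example, on nn::fs::ResultAlreadyExists, choose another filename and retry.
    }

    // Unmount the archive by name when finished.
    nn::fs::Unmount("sdmcwo:");
}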

Note: Applications are restricted from writing depending on the path. For more information about
restrictions and cautions, see the File System section of the guidelines.

7.6. Media-Specific Cautions

7.6.1. Nintendo 3DS Cards

Verifying Different Access Speeds

The performance of access to ROM archives on Nintendo 3DS Cards by the FS Library can vary
greatly depending on the condition of the media. System updates may also improve performance. If
you plan to sell your application on Nintendo 3DS Card, verify the following to make sure that
changes in access speed do not cause issues.

For CARD1 applications:

Enable DebugMode using the Config tool's Debug Setting (with latency emulation enabled)
and play through all modes.
Set the development card speed to Fast and play through all modes of the Release build of
the application written to a development card.

For CARD2 applications:

Enable DebugMode using the Config tool's Debug Setting (with latency emulation enabled)
and play through all modes.
From the PARTNER-CTR Debugger Card Control dialog box, in the Card Emulation Control
tab, set the transfer speed in the emulation memory speed settings to Fast, and play through
all modes of the application (Release build CCI file) loaded in emulation memory.
Set the development card speed to Fast and play through all modes of the Release build of
the application written to a development card.

Warning: Make sure to test using the latest version of the PARTNER-CTR Debugger/Writer.

7.6.2. SD Memory Cards

Note the following points when you implement an application that accesses an SD card, including
access to extended save data or to a downloadable application stored on the SD card.

Cautions for Directly Accessing SD Memory Cards

Archives for accessing SD cards mounted using the nn::fs::MountSdmc() function (mounted
under sdmc:/ if not otherwise specified) are provided for debugging purposes only. This mount point
cannot be accessed in production (retail) versions of applications.

Note also that SD cards may be swapped out while the system is asleep. To remount an SD card that
has been removed, first call nn::fs::Unmount to unmount it from the file system, and then call the
appropriate mounting function to remount it.

Streaming Playback Cautions

In some situations, SD card file access may become slower, such as when the SD card is almost
full, it is significantly fragmented, or it is near the end of its service life. Most functions that access
SD cards are affected by this and slow down; the effect is particularly large with functions for
creating files (for example, nn::fs::CreateExtSaveData).

Note that calling these functions while the system is streaming sound or movies may cause choppy
audio or dropped frames, because file access blocks until the functions complete executing.

"Read Disturb" Measures

SD cards can tolerate only a limited number of reads of the same region of data. When reads exceed
that limit, the stored data can become degraded in a phenomenon known as read disturb.
Depending on the area where the read disturb phenomenon occurs, all data on the SD card might
become unreadable.

Of particular concern is that this can occur just by opening a file to be loaded from the SD
card. Some SD cards on the market are not sufficiently resistant to the read disturb phenomenon.
The best way to lower the risk of this phenomenon occurring is to reduce, as much as possible, the
amount of file opening and closing that occurs on the application side. For example, when multiple
accesses are required at different offsets to the same file, perform the accesses all within a small
number of open and close processes, rather than opening and closing the file for each access.

In addition, repeatedly reading the same region of the SD card makes data retention unstable.
Problems occur most often when accessing ROM archives and downloadable content that is never
rewritten. Even if the application is card-based software, be careful if there is a possibility of
selling it as a downloadable application in the future.

As a measure to retain data longer when particular data is read repeatedly from the SD card (such as
when repeatedly streaming short waveform data), read the data from the SD card into a buffer in
internal memory and use the data from that buffer. A buffer size of 16 KB or more is effective, and
the situation improves further with larger sizes.

This measure is not a restriction, and you are not required to implement it when memory is limited.
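
As an illustration of this buffering measure, the following sketch reads short waveform data from the
SD card into a work buffer of 16 KB or more once and reuses the buffer afterward. LoadFromSdCard()
and PlayFromBuffer() are hypothetical placeholders for the application's own file-reading and
playback code.

const size_t WORK_BUFFER_SIZE = 64 * 1024;     // 16 KB or more is effective; larger is better.
static u8 s_WorkBuffer[WORK_BUFFER_SIZE];      // Work buffer in internal memory.

void PlayJingleRepeatedly(int repeatCount)
{
    // Read the waveform data from the SD card once (hypothetical helper).
    size_t loadedSize = LoadFromSdCard("jingle.bcwav", s_WorkBuffer, WORK_BUFFER_SIZE);

    // Play repeatedly from the buffer instead of reading the same SD card region again
    // (hypothetical helper).
    for (int i = 0; i < repeatCount; ++i)
    {
        PlayFromBuffer(s_WorkBuffer, loadedSize);
    }
}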

Note: For the developers using the NW4C Sound Library, see the description in the
nw::snd::SoundArchivePlayer class reference.

7.7. Difference Between Save Data and Extended Save Data

When releasing an application, you can now select the 3DS Card (CARD2) as the storage medium
for that application, thereby increasing the storage capacity available for application-specific save
data. This allows you to store data in save data that previously had to be stored as extended save
data due to capacity limitations. With this change in specifications, you must carefully distinguish
between save data and extended save data based on the type and purpose of the data being saved.
This section describes the distinguishing characteristics of save data and extended save data, and
where to save each type of data.

7.7.1. Characteristics of Save Data and Extended Save Data

Save data and extended save data have the following distinguishing characteristics.

Table 7-1. Characteristics of Save Data and Extended Save Data

Redundancy
    Save data: Supported.
    Extended save data: Not supported.

Backup
    Save data: Supported (downloadable applications only). Usually enabled. (See below.)
    Extended save data: Not supported.

Rollback prevention
    Save data: Supported (downloadable applications only). Backup must be disabled if this feature is used.
    Extended save data: Not supported.

Storage media
    Save data: Card-based applications: card backup region. Downloadable applications: SD card.
    Extended save data: SD card.

When storage is allocated
    Save data: Card-based applications: from the first use. Downloadable applications: at time of installation.
    Extended save data: When allocated by the application.

Removal
    Save data: Only from inside the application.
    Extended save data: Can be removed from inside an application or on the Data Management screen in System Settings.

Data loss
    Save data: The risk of losing data can be reduced by using the backup and redundancy features.
    Extended save data: The risk of losing data cannot be reduced.

Lost data recovery
    Save data: With redundancy, lost data needs to be recovered only when an entire archive is lost. Without redundancy, lost data needs to be recovered when an entire archive is lost or when individual files or directories are lost.
    Extended save data: Lost data needs to be recovered when an entire archive is lost or when individual files or directories are lost.

Ensuring sufficient storage capacity
    Save data: Unnecessary. (Save data must fit into a fixed amount of allocated storage.)
    Extended save data: Must be performed each time a file is created.

In some applications, the backup feature has been disabled for the following reasons.

They are using rollback prevention.
Inconsistencies may arise between extended save data and save data if only the save data is rolled
back using the backup feature.
Restoring save data from backups may be problematic. For example, users might repeatedly transfer
downloadable content or rare or hidden characters or items to other users, restoring their save data
each time.

7.7.2. Policy for Selecting the Storage Location for Data

In general, save important data in save data. Important data includes data that, if lost, could make
further progress in the game impossible. Save data that the player can lose without affecting game
progress, and data whose storage requirements vary, in extended save data. Because extended save
data can be deleted on the Data Management screen under System Settings, developers must note
that this data may disappear at times that are beyond the control of the application. In addition,
developers must also consider that packaged versions of applications sold on cards may be played on
other systems, and that an SD card other than the one storing the extended save data that was
accessed last time may be inserted in the system the next time a particular game is played.

Save the following information in save data to protect a user's data.

Data necessary for progress in a game
Data necessary for consistency with game progress data
If data that must stay consistent with game progress is saved in extended save data, consistency
may be lost or reinitialization may be required when the save data is deleted or rolled back using
the backup feature. This can significantly increase the work involved in debugging, such as
requiring you to roll back save data at each location in the game, or to delete save data and then
verify data consistency.

We recommend that the following data also be saved in save data.

Important data required to enjoy the game
Such data includes screenshots taken when a stage is first cleared or congratulatory messages
sent from friends.
Data that requires consistency to be maintained across multiple files
Such data includes command data and textures used to replay scenes.
The chance of losing data at the file level is lower with save data. However, when data is tied to
a single file, the risk of losing that file is the same whether you use extended save data or save
data without redundancy.

We recommend that the following data be saved in extended save data if losing it will not adversely
affect game progress.

Data that can be redownloaded from the network or that can be re-created while playing the
game
Data shared with other applications
We advise developers to save data shared between games, particularly games played in
parallel, in extended save data. Developers must note that using shared data in conjunction
with the save data backup feature may lead to dishonest activity such as generating an
excessive number of items.
Data where the number of files or amount of storage required is not fixed
Such data includes music or stages created by the user or videos recorded by the user.
With extended save data, you specify the number of files and directories that can be saved, but
there is no limit on the size of files. In the case of save data, on the other hand, even though
the number of directories and files that can be saved is similarly specified when save data is
initialized, save data is unsuitable for saving multiple instances of data having an unfixed size
because storage capacity is limited.
Data imported from the save data of another application
Although you can access the save data of other applications, incompatibilities can occur
depending on the particular combination of titles. For this reason, use extended save data even
when transferring data from an earlier version. Save data may be accessed, however, when
transferring save data from a downloadable demo or another download-only application.


8. Time

The 3DS system can handle two kinds of time: ticks as a unit of time, calculated from the system clock;
and the real-time clock (RTC) that comes with a battery backup.

The system provides a timer and alarm feature that uses system ticks to allow applications to measure
the passage of time.

In addition to a feature to get the current time from the RTC, there is also a feature to set off an alarm
at a specified time and date.

8.1. Class for Representing Time

The SDK is designed to use the nn::fnd::TimeSpan class as an argument to time functions, in part
to avoid any confusion about units. Within this class, time is expressed in nanoseconds using 64-bit
integers. To prevent ambiguity in the units, no implicit conversion from integers to this data type is
provided. However, 0 can be converted implicitly to this data type.

The nn::fnd::TimeSpan class has static member functions (the From*Seconds() functions) for
generating instances of this class from integer values expressing the time in various time units
(seconds, milliseconds, microseconds, or nanoseconds). It also has member functions (the
Get*Seconds() functions) that can obtain the time value of the instance (measured in seconds,
milliseconds, microseconds, or nanoseconds) as an s64 type.

You can also perform comparison (==, !=, <, >, <=, >=) and arithmetic (+, -, +=, -=) operations
between instances of this class.

Code 8-1. Instance Generation and Get Functions for the Various Time Units

// From*Seconds()
static nn::fnd::TimeSpan FromSeconds(s64 seconds);
static nn::fnd::TimeSpan FromMilliSeconds(s64 milliSeconds);
static nn::fnd::TimeSpan FromMicroSeconds(s64 microSeconds);
static nn::fnd::TimeSpan FromNanoSeconds(s64 nanoSeconds);
// Get*Seconds()
s64 GetSeconds() const;
s64 GetMilliSeconds() const;
s64 GetMicroSeconds() const;
s64 GetNanoSeconds() const;
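
For example, the functions above can be combined with the comparison and arithmetic operators as
follows; the values are arbitrary.

// Generate instances from different units.
nn::fnd::TimeSpan frame   = nn::fnd::TimeSpan::FromMicroSeconds(16667);  // About one frame.
nn::fnd::TimeSpan timeout = nn::fnd::TimeSpan::FromSeconds(5);

// Arithmetic and comparison between instances.
nn::fnd::TimeSpan total = frame + timeout;
if (total >= nn::fnd::TimeSpan::FromMilliSeconds(5000))
{
    // Retrieve the value in the desired unit as an s64.
    NN_LOG("total = %lld ms\n", total.GetMilliSeconds());
}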

8.2. Ticks

A tick is the time it takes for the CPU to go through one clock cycle while the CPU is operating at 268
MHz (roughly 3.73 nanoseconds). The number of ticks in one second is defined by the
nn::os::Tick::TICKS_PER_SECOND constant.

Warning: When running in extended mode on SNAKE, the tick value remains the same as in
standard mode. In other words, one tick is equivalent to three CPU clock cycles while
in extended mode.

Tick values are available using the nn::os::Tick class, and the conversion constructors and
conversion operators let you convert from this class to the nn::fnd::TimeSpan class (which
represents actual time), and vice versa.

Two constructors are available: one generates an instance from the tick value represented by an s64
type, and the other generates an instance from the tick value converted from the time represented by
an object of the nn::fnd::TimeSpan class.

The conversion operators include one to convert from a tick value to an s64-type value and another
to convert to an object of the nn::fnd::TimeSpan class. You can also perform arithmetic
(+, -, +=, -=) operations between instances of the nn::os::Tick class.

In addition, you can call the ToTimeSpan() function to generate an instance of the
nn::fnd::TimeSpan class that expresses the time span converted from a tick value.

Call the GetSystemCurrent() function to get the time elapsed since the system was started up as
a tick value (an instance of the nn::os::Tick class).
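
A typical use of ticks is measuring elapsed time, as in the following sketch. Only
GetSystemCurrent(), the arithmetic operators, and ToTimeSpan() described in this section are used;
DoSomething() is a placeholder for the processing being measured.

// Record tick values before and after the processing to measure.
nn::os::Tick begin = nn::os::Tick::GetSystemCurrent();

DoSomething();   // Placeholder for the processing being measured.

nn::os::Tick end = nn::os::Tick::GetSystemCurrent();

// The difference between two ticks can be converted to actual time.
nn::fnd::TimeSpan elapsed = (end - begin).ToTimeSpan();
NN_LOG("elapsed = %lld us\n", elapsed.GetMicroSeconds());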

8.3. Timers
The timer feature sends a notification when the specified amount of time has passed. A timer object
may be in the signaled or non-signaled state, and it transitions from the non-signaled to the signaled
state when the specified time has passed. Timer instances are limited to eight at any one time.

Timer objects are defined by the nn::os::Timer class. Generate an instance and then call the
Initialize or TryInitialize() function to initialize. When initializing a timer object, you can
choose whether to manually or automatically reset its signaled state. After a manual-reset timer
enters the signaled state, all threads waiting for that timer to enter the signaled state are released
and the timer remains in the signaled state until it is cleared. After an automatic-reset timer enters
the signaled state, of those threads that are waiting for it to enter the signaled state, only the
highest-priority thread is released, after which the object resets itself to the non-signaled state.

There are single-use timers that only transition to the signaled state once after the allotted time, and
multiple-use periodic timers that transition to the signaled state in cycles of the specified period. Call
the StartOneShot() function to start a single-use timer. Call the StartPeriodic() function to
start a periodic timer. You can stop both timer types by calling the Stop() function.

Call the Wait() function to cause a thread to wait until a timer enters the signaled state. Call the
Signal() function to put a timer in the signaled state without waiting the specified time. After a
manually resetting timer enters the signaled state, it remains that way until you call the
ClearSignal() function.

Call the Finalize() function to explicitly destroy an instance.

For cautions that apply to applications that use timers, see Cautions When Shutting Down.
Be sure to carry out all proper processing at that time, such as destroying object instances.
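
The following minimal sketch ties these calls together for a single-use timer. It assumes that
Initialize() takes a bool selecting manual or automatic reset and that StartOneShot() takes an
nn::fnd::TimeSpan; check the nn::os::Timer reference for the exact signatures.

nn::os::Timer timer;
timer.Initialize(false);   // false: automatic reset (assumed argument).

// Enter the signaled state once, 500 milliseconds from now.
timer.StartOneShot(nn::fnd::TimeSpan::FromMilliSeconds(500));

// Block the calling thread until the timer enters the signaled state.
timer.Wait();

// Stop the timer if it is still running and destroy the instance when it is no longer needed.
timer.Stop();
timer.Finalize();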

8.4. Alarms

An alarm is a feature that calls the registered handler after the specified time has passed. Alarms
generate threads internally to call the handler. Consequently, you must call the
nn::os::InitializeAlarmSystem() function to initialize the alarm system prior to using alarms.

Alarm objects are defined by the nn::os::Alarm class. Generate an instance and then call the
Initialize or TryInitialize() function to initialize.

There are single-use alarms that only call the handler once after the allotted time, and multiple-use
periodic timers that call the handler in cycles of the specified period. Call the SetOneShot()
function to set a single-use alarm. Call the SetPeriodic() function to set a periodic alarm. You can
cancel both alarm types by calling the Cancel() function.

Call the CanSet() function to check whether alarms can be set. This returns false if an alarm is set
and the handler has yet to be called. Even after canceling an alarm by calling Cancel, this check will
not return true until the handler has been called once.

Handler types are defined below.

Code 8-2. Alarm Handler Type Definitions

typedef void(* nn::os::AlarmHandler)(void *param, bool cancelled);

Any arguments needed when setting the alarm are passed in the param parameter. Generally, false
is passed in the cancelled parameter, but true is passed when the handler has been called after
the alarm was canceled with the Cancel() function.
Call the Finalize() function to explicitly destroy an instance.

Warning: In the current implementation, the alarm system uses two threads internally. As a
result, the alarm system becomes unstable if you register many handlers that take a long
time to complete.

For cautions that apply to applications that use alarms, see Cautions When Shutting Down.
Be sure to carry out all proper processing at that time, such as destroying object instances.
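
The following sketch shows the overall flow for a single-use alarm. It assumes that Initialize()
takes no arguments and that SetOneShot() takes the handler, the delay as an nn::fnd::TimeSpan,
and the user parameter, in that order; confirm the actual parameter lists in the nn::os::Alarm
reference.

// Handler matching the nn::os::AlarmHandler type shown in Code 8-2.
static void MyAlarmHandler(void* param, bool cancelled)
{
    (void)param;   // Unused in this sketch.
    if (!cancelled)
    {
        // React to the alarm here. Keep handlers short (see the warning above).
    }
}

void SetThreeSecondAlarm()
{
    nn::os::InitializeAlarmSystem();   // Required once before alarms are used.

    static nn::os::Alarm s_Alarm;
    s_Alarm.Initialize();              // Assumed to take no arguments.

    // Call the handler once, three seconds from now (assumed parameter order).
    s_Alarm.SetOneShot(MyAlarmHandler, nn::fnd::TimeSpan::FromSeconds(3), NULL);

    // Call s_Alarm.Cancel() or, once CanSet() returns true, s_Alarm.Finalize()
    // when the alarm is no longer needed.
}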

8.5. RTC

RTC stands for "real-time clock," the hardware clock included in the system. The clock has its own
battery backup and continues keeping time even if the system loses power.

You can only set the clock from the system settings. The clock can be set forward as far as
2049/12/31, but the clock itself can keep time until it reaches 2099/12/31. Because the clock will not
reach its upper limit until at least 50 years have passed since it was set, there is no need to check
whether the clock has reset itself while an application is running.

This section explains how to use RTC time in an application.

8.5.1. Class for Representing Dates and Times

The CTR-SDK system provides the nn::fnd::DateTime class for representing dates and times.
You can get the current time of the RTC by calling the nn::fnd::DateTime::GetNow() function,
which returns an instance of this class.

The constructor with arguments allows you to specify milliseconds in addition to the year, month,
day, hours, minutes, and seconds. The constructor with no parameters generates an instance
representing 2000/01/01 00:00:00.000.

The dates and times that can be handled by this class are in the range from
nn::fnd::DateTime::MIN_DATE_TIME (1900/01/01 00:00:00.000) to
nn::fnd::DateTime::MAX_DATE_TIME (2189/12/31 23:59:59.999). All years are calculated in
the Gregorian calendar with 2000/01/01 (Sat) as the standard, and the dates and times that are
calculated assume that one day is exactly 86,400 seconds in length.

When subtracting one nn::fnd::DateTime class instance from another, the date and time
difference is returned as an instance of the nn::fnd::TimeSpan class. When adding an
nn::fnd::TimeSpan instance to an nn::fnd::DateTime instance, the resulting date and time is
returned as an instance of the nn::fnd::DateTime class.
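
For example, the current time and this date arithmetic can be used as follows; only members
described in this section appear.

// Get the current time from the RTC.
nn::fnd::DateTime start = nn::fnd::DateTime::GetNow();

// ... some time later ...
nn::fnd::DateTime now = nn::fnd::DateTime::GetNow();

// Subtraction yields an nn::fnd::TimeSpan.
nn::fnd::TimeSpan played = now - start;
NN_LOG("played for %lld seconds\n", played.GetSeconds());

// Adding an nn::fnd::TimeSpan yields a new nn::fnd::DateTime.
nn::fnd::DateTime tomorrow = now + nn::fnd::TimeSpan::FromSeconds(24 * 60 * 60);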

The year, month, day, day of the week, hour (in 24-hour notation), minute, second, and millisecond
are available as date and time parameters. All of these parameters can be obtained using Get*()
functions, and all of them except for the day of the week can be replaced using Replace*()
functions. The Get*() function for the day of the week parameter returns an enumerated type
(nn::fnd::Week), whereas the Get*() functions for all other parameters return s32 values. The
Replace*() functions return new instances that have the replaced parameters. The parameters of
the original instance will not be overwritten.

In the versions of the GetParameters() and FromParameters() functions that take the
nn::fnd::DateTimeParameters structure as an argument, you can get or replace all parameters
at the same time. In calls to FromParameters, the member of the structure that indicates the day
of the week is ignored. The result is indeterminate if the date/time parameters are replaced with
invalid values. You can check whether a particular set of date/time parameters is valid by calling
the IsValidParameters() function. In the overloads that take the structure as an argument, the
day of the week is also checked for validity, so you must be careful when setting the parameters in
the structure. If the day of the week is unknown, check the validity using an overloaded version of
the function that does not take the day of the week as an argument.

The DateToDays() function returns a value indicating how many days have passed between the
reference date (2000/01/01) and the specified date. The result is indeterminate if an invalid date
was specified. To check whether a given date is valid, call the IsValidDate() function. To check
whether a particular year is a leap year, call the IsLeapYear() function, which returns 1 if the
specified year is a leap year.

The DaysToDate() function returns the date corresponding to the specified number of days elapsed
since the reference date. The DaysToWeekday() function returns the day of the week corresponding
to the specified number of days elapsed since the reference date.

8.5.2. RTC Alarm Feature

The system includes an RTC alarm feature. This feature sends a notification to the running
application when the time set for the alarm is reached. Alarm times can be set by the minute.
Depending on the clock settings, the alarm notification may be delayed by around one minute. The
alarm feature is suitable as an alarm clock or for similar uses, but it is not suitable for precise time-
based notification applications, such as an hourglass. For these uses, see 8.3. Timers or 8.4.
Alarms.

Use the RTC alarm feature in the PTM library. You must first initialize the library by calling
nn::ptm::Initialize before you can use it. After you are done using it, call
nn::ptm::Finalize.

Code 8-3. Initializing and Finalizing the PTM Library

nn::Result nn::ptm::Initialize();
nn::Result nn::ptm::Finalize();

The RTC alarm feature notifies the running application that the clock has reached the set time by
setting the specified event to the signaled state.

Warning: The application cannot receive notification while in Sleep Mode. When displaying the
HOME Menu or a library applet, the thread that started the HOME Menu or library applet
remains suspended and the event enters the signaled state. Care must be taken not to
implement an RTC alarm in a standby thread because it will not have permission to
perform graphics or sound operations.

Use the following functions to specify the event to use and to set the alarm time.

Code 8-4. RTC Alarm Feature Functions


nn::Result nn::ptm::RegisterAlarmEvent(nn::os::Event &event);
nn::Result nn::ptm::SetRtcAlarm(nn::fnd::DateTime datetime);
nn::Result nn::ptm::GetRtcAlarm(nn::fnd::DateTime *pDatetime);
nn::Result nn::ptm::CancelRtcAlarm();

Call nn::ptm::RegisterAlarmEvent to specify the event to receive the RTC alarm feature
notification. The application must generate and initialize the instance of the nn::os::Event class
passed in the event parameter.

Call nn::ptm::SetRtcAlarm to set the alarm time. For the datetime parameter, specify an
instance of the nn::fnd::DateTime class with the time specified to the minute. This function
returns nn::ptm::ResultOverWriteAlarm if an alarm time is already set, but it still overwrites
the alarm time with the new value. There is no immediate notification if the specified time is in
the past.

Call nn::ptm::GetRtcAlarm to get the alarm’s current setting. For the pDatetime parameter,
pass a pointer to an instance of the nn::fnd::DateTime class to receive the alarm time. The
function returns nn::ptm::ResultNoAlarm if the alarm is not set.

Call nn::ptm::CancelRtcAlarm to cancel a previously set RTC alarm. The function returns
nn::ptm::ResultNoAlarm if the alarm is not set.
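
The following sketch puts these functions together. The bool argument to
nn::os::Event::Initialize() (selecting manual or automatic reset) is an assumption based on
9.7.3. Events.

nn::ptm::Initialize();

// Generate and initialize the event that receives the notification.
nn::os::Event alarmEvent;
alarmEvent.Initialize(false);                      // false: automatic reset (assumed argument).
nn::ptm::RegisterAlarmEvent(alarmEvent);

// Set the alarm one minute from now (alarm times are handled by the minute).
nn::fnd::DateTime alarmTime =
    nn::fnd::DateTime::GetNow() + nn::fnd::TimeSpan::FromSeconds(60);
nn::Result result = nn::ptm::SetRtcAlarm(alarmTime);
// nn::ptm::ResultOverWriteAlarm is returned if an alarm was already set.

// Elsewhere, a thread waits on the event; it is signaled when the set time is reached.
alarmEvent.Wait();

// Cancel the alarm and finalize the library when they are no longer needed.
nn::ptm::CancelRtcAlarm();
nn::ptm::Finalize();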

8.5.3. Handling Time Modification Offset Values

The system records the cumulative offset value in seconds when the user changes the clock time in
system settings either forward or back. Call nn::cfg::GetUserTimeOffset to get this offset
value. For more information, see 11.2.8. RTC Modification Offset Value.

When starting an application on the same system, you can compare this offset value with the offset
recorded the last time the application was started to check whether the user has changed the clock in
the meantime. You can then use this information in your application. Note, however, that when the
application is started on a different system, it must not impede player progress because of differing
offset values.

8.6. Mixed Use of Ticks and RTCs Prohibited

Ticks (the nn::os::Tick class) and RTCs (the nn::fnd::DateTime class) handle time similarly,
but how time advances and the accuracy is different. If the same time period is measured by tick and
RTC, the results are not necessarily the same. For this reason, values obtained with ticks must not
be used in combination with values obtained using an RTC.

Ticks differ on individual systems and vary significantly with temperature. They can have a margin of
error up to ±300 seconds over a period of a month.

RTCs also differ on individual systems and vary significantly with temperature. They can have a
margin of error up to ±60 seconds over a period of a month. nn::fnd::DateTime::GetNow returns
an interpolated current time value so its speed of advance can vary by up to ±1 s per hour.

In most cases where time is handled within libraries, such as for sound, animation, or streaming
playback, the speed at which time advances differs again from that of ticks and the RTC. Because of
this, when an application must synchronize with a performance or with sound playback, you must
understand the specifications of the individual libraries and use timing that matches those specifications.

9. Threads

The nn::os::Thread class defined in the SDK only defines the basic functions for dealing with
threads, such as how to start them, join them, get parameters, and change parameters. Threads used
by applications must inherit from this thread class.

9.1. Initializing and Starting

After the constructor creates a thread, the thread has yet to be either initialized or started. If you are
managing the stack in your application, call the Start() function to initialize and start a thread; to
have the library automatically manage stack memory, call StartUsingAutoStack(). Calling the
member functions that start with Try (TryStart() and TryStartUsingAutoStack()) returns an
error if the functions fail, such as due to insufficient resources.

If the application manages stack memory, a GetStackBottom member function (that returns the
bottom of the stack as an uptr value) must be defined for the object passed as a stack region to the
member functions. Both the nn::os::StackMemoryBlock class, which allocates stack memory
blocks, and the nn::os::StackBuffer class, which can be located within static memory and
structures, satisfy this condition and can be safely passed as arguments. Make sure that the
application does not allow the stack memory to become invalid (released) until after the threads have
terminated.

9.1.1. Running Application Threads in the System Core

You can allocate some of the system core’s CPU time for applications to run application threads in
the system core.

To start a thread in the system core, use the nn::os::SetApplicationCpuTimeLimit() function
to specify the percentage of CPU time to allocate, and then specify 1 for coreNo when starting the
thread. You must allocate CPU time before you start a thread on the system core; failure to do so
results in an error.

Code 9-1. Functions for Allocating CPU Time on the System Core for an Application

nn::Result nn::os::SetApplicationCpuTimeLimit(s32 limitPercent);


s32 nn::os::GetApplicationCpuTimeLimit();

Use the limitPercent parameter to specify the percentage of the system core’s CPU time to
allocate to your application. You can specify values in the range from 5 to 30. The specified
percentage of CPU time is allocated from the beginning of each 2-millisecond cycle. In other words,
specifying a value of 25 causes 0.5 milliseconds (2 * 25 / 100) to be allocated to the application
and the remaining 1.5 milliseconds to be allocated to the system.

You can use the nn::os::GetApplicationCpuTimeLimit() function to get the current ratio of
CPU allocation. This returns 0 by default because no CPU time is initially allocated to the
application.

Warning: After you have allocated CPU time for your application, you cannot restore this
setting to 0.

Even if no application threads are running, the system will not use the time allocated to
the application. As a result, wireless communication and other system core processes
slow down after CPU time is allocated to the application.
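
A minimal sketch of the allocation calls follows. How coreNo is passed when starting the thread
depends on the nn::os::Thread start functions, so it is only indicated in a comment.

// Allocate 25% of the system core's CPU time to the application
// (0.5 milliseconds out of every 2-millisecond cycle).
nn::Result result = nn::os::SetApplicationCpuTimeLimit(25);
if (result.IsSuccess())
{
    NN_LOG("limit = %d%%\n", nn::os::GetApplicationCpuTimeLimit());

    // A thread can now be started with coreNo set to 1 using one of the
    // nn::os::Thread start functions (see 9.1. Initializing and Starting).
}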

9.1.2. ManagedThread Class

The ManagedThread class adds functionality to the Thread class. Basic operation of the two
classes is the same, except that the functions for initialization and execution are separated. The
following table lists the added features and their associated functions.

Table 9-1. Functions Added to the ManagedThread Class

Added Features: Associated Functions

Retrieving information related to the stack: GetStackBufferBegin, GetStackBufferEnd,
GetStackBufferSize, GetStackBottom, GetStackSize
Name retention: GetName, SetName
Faster ID retrieval: GetId, GetCurrentId
Retrieving the instance corresponding to the current thread: GetCurrentThread
Enumerating threads: Enumerate
Searching for threads: FindById, FindByStackAddress

ManagedThread shares resources with the Thread class, so the total number of threads of both
classes is subject to the limit on the number of threads that can be created. In addition, using
ManagedThread consumes twice the thread-local storage when the
nn::os::ManagedThread::InitializeEnvironment() function is called.

9.2. Thread Functions

The operations performed by a thread are implemented in the thread function passed to the member
function that initializes and starts the thread. Declare a thread function that takes one or no
arguments and does not return a value (a void return type).

Some of the Start() functions of the Thread class can be used by a thread function that does not
take any arguments, and there is an overloaded version that allows you to use a template to specify
the type of argument that is passed to a thread function. Note that because the arguments passed in
are copied to the thread's stack, the template must specify argument types that can be copied, and
the space available on the stack is reduced by the size of the copied arguments.
9.3. Completion and Destruction

A thread completes when its thread function finishes. You can wait for completion with the Wait*()
functions of the parent nn::os::WaitObject class, which release waiting threads from the blocked
state when the thread completes, or with the Join() function, which waits for thread completion
unconditionally. If you need more fine-grained control (for example, to wait with a timeout), wait for
the thread to exit using the nn::os::WaitObject::WaitOne() function and then call Join. Follow
this same approach if you need to avoid blocking in the Join() function.

Call the IsAlive() function to check whether a thread is still alive (not yet completed).

Call the Finalize() function to destroy an unneeded thread. When doing so, you must use the
following procedure.

If the thread was started using the Start or TryStart() functions, you must explicitly call the
Join() function before destroying the thread. Do not call Detach.

If the thread was started using the StartUsingAutoStack or TryStartUsingAutoStack()
functions, you must explicitly call the Join or Detach() functions before destroying the thread. After
calling Detach, you can only call the Finalize() function or the destructor for that thread.

9.4. Scheduling

Threads are scheduled according to their priority, and you can set the priority of any thread. Thread
priorities can be specified as integers between 0 and 31, with 0 indicating the highest priority and 31
the lowest. Standard threads specify the DEFAULT_THREAD_PRIORITY of 16.

Call the Yield() function to yield execution to other threads of the same priority. This has no effect
if there are no threads with the same priority.

Call the Sleep() function to put threads into a sleep state for a specified time.

When scheduling occurs as a result of an interrupt to a currently executing thread, the interrupted
thread is placed at the top of the thread queue and scheduling is controlled so there is as little thread
switching as possible. If you try to create and execute a new thread with a priority lower than or equal
to the currently executing thread, thread switching may not occur if an interrupt from a system
process prevents immediate scheduling. We recommend using events to wait for thread switching if
you need to ensure thread switching.

Warning: Specifying a short time and repeatedly calling Sleep() places a heavy load on the
system core, and reduces the overall system performance.

9.5. Getting and Setting Parameters

Each thread has its own parameters: a thread ID and a priority.

To get a parameter, call the Get* member function to get the instance parameter; to get the
parameter for the current thread, call the GetCurrent* member function. Similarly, call the Change*
or ChangeCurrent*() functions to set a parameter.

Call the GetMainThread() function to get the main thread object.

Thread ID

Each thread is assigned its own ID as a bit32 type. You can get these IDs by calling the GetId or
GetCurrentId() functions, but the IDs cannot be changed.

Priority

You can set the priority for each thread as an s32 type. Use the GetPriority or
GetCurrentPriority() functions to get the current priority. Call the ChangePriority or
ChangeCurrentPriority() functions to change the priority.

9.6. Thread-Local Storage

Each thread has 16 slots of thread-local storage for storing uptr values. You can reserve thread-local
storage for use by generating an instance of the nn::os::ThreadLocalStorage class, but
attempting to reserve more than 16 slots will fail, and the application will be forcibly halted by a
PANIC.

To set a value in thread-local storage, call the SetValue() function; to get a value, call the
GetValue() function. All thread-local storage slots are set to 0 when a thread is started.

Use the nn::os::ThreadLocalStorage class constructor with parameters to register a callback
function to be called when a thread ends. The thread-local storage value is passed as an argument
when the callback function is called.

9.7. Synchronization Objects

When multiple threads access libraries that are not thread safe or resources that are shared, the
application must synchronize access between those threads. The SDK provides synchronization
objects such as the CriticalSection class.

9.7.1. Critical Sections

A CriticalSection is a synchronization object used to provide mutual exclusion. Allowing only
one thread at a time to enter a CriticalSection object effectively prohibits multiple threads from
accessing the same resource at the same time. CriticalSection objects require more memory
than mutexes (described below), but are faster in most cases. There is no upper limit on how many
can be created as long as there is memory available.

CriticalSection objects are defined by the nn::os::CriticalSection class. Generate an
instance and call the Initialize or TryInitialize() function to initialize, and then call the
Enter or TryEnter() functions to enter the CriticalSection and lock its resources. If you call
Enter to lock a CriticalSection object that is already locked, execution of the thread that
called the function is blocked until the existing lock is released. These objects allow recursive
locks. A lock request from the thread that has the original lock will be nested (the nesting level will
be incremented) without the thread being blocked. If you call TryEnter instead, the function
returns only whether it succeeded in entering the object, and the thread is not blocked.

Call the Leave() function to release a lock on a CriticalSection object. If this function is
called from the locking thread, the nesting level is decremented and the lock is released when the
nesting level reaches 0.

Call the Finalize() function to explicitly destroy an instance.

Use the nn::os::CriticalSection::ScopedLock object to lock a CriticalSection object.
The lock is automatically released when the object goes out of scope.
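
For example, a shared counter can be protected as follows. Initialize() is assumed to take no
arguments, and ScopedLock is assumed to take the CriticalSection object in its constructor.

static nn::os::CriticalSection s_Section;
static int s_SharedCount = 0;

void Setup()
{
    s_Section.Initialize();          // Assumed to take no arguments.
}

void IncrementShared()
{
    // The lock is acquired here and released automatically when lock goes out of scope.
    nn::os::CriticalSection::ScopedLock lock(s_Section);
    ++s_SharedCount;
}

void Teardown()
{
    s_Section.Finalize();
}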

9.7.1.1. Thread Priority Inversion

CriticalSection objects do not support thread priority inheritance. Consequently, if
low-priority thread A holds a lock on a CriticalSection object and high-priority thread B
then requests a lock on that object, thread B may be blocked by some thread C whose priority is
higher than A's but lower than B's. In other words, B's priority is effectively lowered, with B's
and C's relative priority levels being inverted.

9.7.2. Mutexes

Mutexes are synchronization objects used for providing mutual exclusion. Much like
CriticalSection objects, these objects effectively prohibit multiple threads from accessing the
same resource at the same time. Unlike CriticalSection objects, they implement thread priority
inheritance, and there is a limit of 32 on how many can be created. Due to priority inheritance, if a
high-priority thread requests a lock on a mutex object that is already locked by a low-priority
thread, the locking thread's priority is temporarily increased to match the priority of the thread that
requested a lock.

Mutex objects are defined by the nn::os::Mutex class. Generate an instance and call the
Initialize or TryInitialize() function to initialize, and then call the Lock or TryLock()
functions to lock a mutex object. If you call Lock to lock a mutex object that is already locked,
execution of the thread that called the function will be blocked until the existing lock is released.
Mutex objects allow recursive locks. Requesting another lock from the same thread causes the new
lock request to be nested. If you call TryLock instead, the function returns only whether it
succeeded in locking the object within the timeout period, and the thread will not be blocked.

Call the Unlock() function to release the lock on a locked mutex object. Unlock can only be
called from the thread that has the current lock. The object will not be unlocked until all recursive
locks have been released.

Call the Finalize() function to explicitly destroy an instance.

Use the nn::os::Mutex::ScopedLock class to lock a mutex object. The lock begins when the
ScopedLock object is created. The lock is automatically released when the object goes out of
scope.
9.7.3. Events

Events are simple synchronization objects that send notification that an event has occurred. An
event object may be in the signaled or non-signaled state, and transitions from the non-signaled to
the signaled state when the specified event has occurred. You can synchronize threads by having
them wait for an event to occur. You can only have 32 instances of the Event class at any one
time.

Event objects are defined by the nn::os::Event class. Generate an instance and then call the
Initialize() function to initialize. During initialization, you can choose whether the event is a
manually resetting event or an automatically resetting event. When manually resetting events enter
the signaled state, they remain in that state until manually cleared, during which time all threads
waiting for that event are released. When automatically resetting events enter the signaled state,
only the highest priority thread of all those threads waiting for it to enter the signaled state is
released, after which the object resets itself to the non-signaled state.

Call the Wait() function to cause threads to wait until an event enters the signaled state. You can
specify the length of the timeout period and also check to determine whether an event has occurred
during the timeout period. If you set the timeout period to 0, control returns immediately, allowing
you to check for event occurrence without execution being blocked.

Call the Signal() function to put an event in the signaled state. When a manually resetting event
enters the signaled state, it remains in that state until manually cleared by calling the
ClearSignal() function. If one or more threads are waiting when an automatically resetting event
enters the signaled state, only the highest priority thread of all those threads waiting for it to enter
the signaled state will be released, after which the object resets itself to the non-signaled state. If
no threads are waiting for events, the object remains in the signaled state.

Call the Finalize() function to explicitly destroy an instance.
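
The following sketch shows one thread notifying another. The bool argument to Initialize()
(selecting manual or automatic reset) is an assumption based on the description above;
PrepareData() and UseData() are placeholders.

static nn::os::Event s_DataReady;

void Setup()
{
    s_DataReady.Initialize(false);   // false: automatic reset (assumed argument).
}

void WorkerThread()
{
    PrepareData();                   // Placeholder for the work being waited on.
    s_DataReady.Signal();            // Wakes the highest-priority waiting thread.
}

void ConsumerThread()
{
    s_DataReady.Wait();              // Blocks until the event enters the signaled state.
    UseData();                       // Placeholder.
}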

9.7.3.1. Light Events

Light events are simple synchronization objects that send flag notifications between threads.
They are functionally no different from standard events.

Light events are defined using the nn::os::LightEvent class. The nn::os::LightEvent
class is better than the nn::os::Event class in most respects. The only exception is that it
cannot wait for multiple synchronization objects like nn::os::WaitObject::WaitAny() can.
We recommend the use of the nn::os::LightEvent class whenever possible. There is no
upper limit on how many can be created as long as there is memory available.

You must call the Initialize() function on instances that were created using the constructor
that has no arguments. During initialization, you can choose whether the event is a manually
resetting event or an automatically resetting event. The behavior is the same as the
nn::os::Event class in either case. You can also specify the type of event using the non-
default constructor. There is no need to call the Initialize() function on instances that were
created using the constructor that takes arguments.

With the exception that the Wait() member function of the nn::os::LightEvent class does
not allow you to specify a timeout, the Wait(), Signal(), ClearSignal(), and Finalize()
member functions of this class work just like the corresponding functions in the nn::os::Event
class.

The nn::os::LightEvent class adds the following member functions.

You can check the flag using the TryWait() member function. This function only returns the
flag's status and does not block execution of the thread. For LightEvents that are configured as
automatically resetting events, the flag is only cleared (that is, set to false to indicate the "non-
signaled" state) if the flag had previously been set (set to true to indicate the "signaled" state).
You can use the TryWait() function to specify a timeout time.

The Pulse() member function releases threads that are waiting for the flag to be set. For
automatically resetting LightEvent objects, it only releases the single highest-priority thread
and clears the flag. Unlike the Signal() member function, it clears the flag even if no threads
are waiting. For manually resetting events, it releases all threads that are waiting, and then
clears the flag.

9.7.4. Semaphores

Semaphores are synchronization objects that have counters. Every time a semaphore is acquired, its
counter is decremented by 1; a thread that requests a semaphore whose counter is 0 waits until the
counter becomes greater than 0. When a semaphore is released, its counter is incremented by 1, and
the semaphore is passed to the highest-priority thread of those waiting on it. Semaphores can be used
to manage resources by limiting the number of threads that can access those resources at the same
time. You can only have eight instances of the Semaphore class at any one time.

Semaphore objects are defined by the nn::os::Semaphore class. Generate an instance and then
call the Initialize or TryInitialize() function to initialize. When initializing, specify the
counter's initial and maximum values.

Call the Acquire or TryAcquire() functions to acquire a semaphore. If the counter is less than 0
when Acquire is called, the calling thread blocks until the semaphore can be obtained. You can
specify the length of the timeout period when calling TryAcquire() and also check to determine
whether the semaphore was obtained during the timeout period. If you set the timeout period to 0,
control returns immediately, allowing you to check for semaphore acquisition without execution
being blocked.

Call the Release() function to release an obtained semaphore. You can specify a counter increment
value other than 1; this is used, for example, to assign an initial value to instances that were created
with an initial value of 0. When simply releasing a semaphore, call the function with no arguments so
that the counter is incremented by 1.

Call the Finalize() function to explicitly destroy an instance.
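
For example, a semaphore can limit the number of threads that simultaneously access a resource.
The order of the initial and maximum counter values passed to Initialize() is an assumption based
on the description above, and AccessResource() is a placeholder.

static nn::os::Semaphore s_Slots;

void Setup()
{
    s_Slots.Initialize(3, 3);        // At most three threads may enter at once (assumed order).
}

void UseLimitedResource()
{
    s_Slots.Acquire();               // Blocks while all three slots are in use.
    AccessResource();                // Placeholder for the protected work.
    s_Slots.Release();               // No arguments: increment the counter by 1.
}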

9.7.4.1. Light Semaphores

Light semaphores are synchronization objects that have the same features as standard
semaphores. Light semaphores are defined using the nn::os::LightSemaphore class. The
nn::os::LightSemaphore class is better than the nn::os::Semaphore class in most
respects. The only exception is that it cannot wait for multiple synchronization objects like
nn::os::WaitObject::WaitAny() can. We recommend the use of
nn::os::LightSemaphore class whenever possible. There is no upper limit on how many can
be created as long as there is memory available.

The difference between light semaphores and standard semaphores is that light semaphores do
not define a TryInitialize member function. The member functions of the
nn::os::LightSemaphore class work just like the corresponding functions in the
nn::os::Semaphore class.
9.7.5. Blocking Queues

Blocking queues are synchronization objects used for safely passing messages between threads.
Thread execution is blocked when attempting to either insert more elements into a queue than the
buffer size allows or extract an element from an empty queue. When a thread is waiting for
messages from multiple other threads, blocking queues can be used to pass messages from one to
many, or from many to many. These blocking queues are equivalent to the "message queue" feature
in the NITRO, TWL, and Revolution SDKs.

Two classes define blocking queues: nn::os::BlockingQueue and
nn::os::SafeBlockingQueue. nn::os::BlockingQueue uses CriticalSection objects
internally for synchronizing between threads, which could lead to deadlocks in situations where
priorities might be inverted. If this problem is a possibility, use nn::os::SafeBlockingQueue,
which uses mutexes to avoid such issues. The member variables and functions are the same for
both classes.

Generate an instance and then call the Initialize or TryInitialize() function to initialize.
When initializing, specify an array of type uptr and the number of array elements to use as the
queue buffer.

Call the Enqueue or TryEnqueue() functions to add an element (of type uptr) to the end of the
queue. If the queue is full when Enqueue is called, the calling thread is blocked until an element
can be added to the queue. When calling TryEnqueue, control is returned whether or not an
element was successfully added, allowing you to attempt to add an element without execution being
blocked.

Call the Jam or TryJam() functions to insert an element at the beginning of the queue. If the
queue is full when Jam is called, the calling thread is blocked until an element can be added to the
queue. When calling TryJam, control is returned regardless of whether an element was
successfully added, allowing you to attempt to add an element without execution being blocked.

Call the Dequeue or TryDequeue() functions to remove an element from the beginning of the
queue. If the queue is empty when Dequeue is called, the calling thread is blocked until an element
can be removed from the queue. When calling TryDequeue, control is returned regardless of
whether an element was successfully removed, allowing you to attempt to remove an element
without execution being blocked.

Call the GetFront or TryGetFront() functions to get the first element in a queue without
removing the element. If the queue is empty when GetFront is called, the calling thread is blocked
until an element can be obtained from the queue (that is, until an element is added to the queue).
When calling TryGetFront, control is returned regardless of whether an element was successfully
obtained, allowing you to attempt to remove an element without execution being blocked.

Call the Finalize() function to explicitly destroy an instance.
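
The following sketch passes a message from one thread to another. The arguments to Initialize()
follow the description above; Dequeue() is assumed here to return the removed element, and the
message value is arbitrary.

static uptr s_QueueBuffer[16];                        // Buffer used by the queue.
static nn::os::SafeBlockingQueue s_Queue;

void Setup()
{
    s_Queue.Initialize(s_QueueBuffer, 16);            // Array of type uptr and element count.
}

void SenderThread()
{
    s_Queue.Enqueue(static_cast<uptr>(42));           // Blocks if the queue is full.
}

void ReceiverThread()
{
    uptr message = s_Queue.Dequeue();                 // Blocks until an element arrives.
    NN_LOG("received %u\n", static_cast<unsigned>(message));
}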

9.7.6. Light Barriers

Light barriers are synchronization objects that wait for the arrival of multiple threads. Until the
number of threads specified during initialization is met, any threads that arrive early are made to
wait. However, this class cannot be used to wait for the arrival of M out of N threads (M < N).

Light barriers are defined using the nn::os::LightBarrier class. You must call the
Initialize() function on instances that were created using the constructor that has no
arguments. During initialization, you specify how many threads to wait for. You can also specify the
number of threads to wait for using the version of the constructor that takes arguments. There is no
need to call the Initialize() function on instances that were created using the constructor that
takes arguments. There is no upper limit on how many light barriers can be created as long as
there is memory available.

The Await() member function waits for the other threads to arrive. Thread execution is blocked until
the number of threads specified during initialization have called this function.

9.7.7. Deadlocks

You must make sure to avoid deadlocking when using critical sections, mutexes, and semaphores
for mutual exclusion between multiple threads.

For example, assume two threads, A and B, that each need to lock both mutex objects X and Y to
access a resource. If A locks X while B locks Y, and then A waits for Y while B waits for X, both
threads A and B will be permanently blocked. This situation is known as being deadlocked.

One simple method to avoid deadlocks is to have all threads that need to access resources request
mutex locks in the same predetermined order.
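
The following sketch illustrates that rule with two mutexes that every thread locks in the same order
(X before Y), using the nn::os::Mutex::ScopedLock described in 9.7.2; its constructor is assumed
to take the mutex object, and UseBothResources() is a placeholder.

static nn::os::Mutex s_MutexX;
static nn::os::Mutex s_MutexY;

void Setup()
{
    s_MutexX.Initialize();
    s_MutexY.Initialize();
}

// Every thread that needs both resources locks X first and then Y. Because the order is
// the same everywhere, the circular wait that causes a deadlock cannot occur.
void ThreadA()
{
    nn::os::Mutex::ScopedLock lockX(s_MutexX);
    nn::os::Mutex::ScopedLock lockY(s_MutexY);
    UseBothResources();              // Placeholder.
}

void ThreadB()
{
    nn::os::Mutex::ScopedLock lockX(s_MutexX);   // Same order as ThreadA.
    nn::os::Mutex::ScopedLock lockY(s_MutexY);
    UseBothResources();
}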

9.8. Upper Limit of Resources

The number of instances of threads, synchronization objects, and timers that can be generated at the
same time is restricted.

The number of instances that can be generated is restricted for the following classes.

nn::os::Thread class (see 9.1. Initializing and Starting through 9.6. Thread-Local Storage.)
nn::os::Event class (see 9.7.3. Events.)
nn::os::Mutex class (see 9.7.2. Mutexes.)
nn::os::Semaphore class (see 9.7.4. Semaphores.)
nn::os::Timer class (see 8.3. Timers.)

Each of these classes defines the following member functions for getting the number of resources
currently in use and the upper limit.

Code 9-2. Member Functions to Get Number of Resources Used and Upper Limit

static s32 GetCurrentCount();


static s32 GetMaxCount();

The GetCurrentCount() function returns the number of instances currently being used.

The GetMaxCount() function returns the upper limit value for the number of instances that can be
generated.
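
For example, the number of event objects in use can be checked as follows; the same static member
functions are assumed to be defined on each of the classes listed above.

// Check how many event objects are in use and how many can exist at most.
s32 used = nn::os::Event::GetCurrentCount();
s32 max  = nn::os::Event::GetMaxCount();
NN_LOG("events: %d / %d\n", used, max);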

Because libraries and other system components begin generating these instances as soon as the
application starts, the number of resources actually available to the application may be further
restricted. For more information, see 8. Upper Limit of Resources in the CTR System Programming
Guide and the explanations for each library.

10. Sound

The CTR-SDK sound library controls sound playback by communicating with the components loaded by
the DSP library. The sound library comes with 24 voice objects. An application can allocate the required
number of voice objects from this selection and then play back sounds by registering a wave buffer
(information about sound source data).

Figure 10-1. How Sound-Related Libraries Interact

10.1. Initialization

You must use the DSP library to play back sound on the system. Consequently, before initializing the
SND library used for sound playback, you must initialize the DSP library and load the components for
sound playback.

Use the nn::dsp::Initialize() function to initialize the DSP library, and then use the
nn::dsp::LoadDefaultComponent() function to load the components for sound playback.

Call the nn::snd::Initialize() function to initialize the SND library.

Code 10-1. Initializing the DSP and SND Libraries

nn::Result result;
result = nn::dsp::Initialize();
result = nn::dsp::LoadDefaultComponent();
result = nn::snd::Initialize();

10.1.1. Setting the Number of Output Buffers


You can use the nn::snd::SetOutputBufferCount() function to specify the number of output
buffers for the sound data that is ultimately output from the DSP.

Code 10-2. Setting the Number of Output Buffers

void nn::snd::SetOutputBufferCount(s32 count);


s32 nn::snd::GetOutputBufferCount();

You can specify a count of either 2 or 3. The default is 2.

Two output buffers minimize the delay until sound data is actually played, but sound may break up
if there are delays in sound thread processing.

Three output buffers cause sound to be delayed by approximately five milliseconds more than two
output buffers, but they also prevent sound from breaking up when there are slight delays in the
sound thread.

This function clears the content of the output buffers. Call it immediately after the library is
initialized or at another time when sound is not being played.

You can get the current setting with the nn::snd::GetOutputBufferCount() function.

10.2. Allocating Voice Objects

The SND library synthesizes and then plays sounds on the basis of the information about the sound
source data bound to the library's 24 voice objects. In other words, the library can play sounds from
up to 24 sound sources at the same time.

Voice objects are limited resources. An application must use the nn::snd::AllocVoice() function
to allocate voice objects to bind to the sound source data for playback.

Code 10-3. Allocating Voice Objects

nn::snd::Voice* nn::snd::AllocVoice(
s32 priority, nn::snd::VoiceDropCallbackFunc callback, uptr userArg);

The priority parameter specifies the priority of the voice object to allocate in the range from 0 to
nn::snd::VOICE_PRIORITY_NODROP (0x7FFF = 32767). The higher the value, the higher the
priority. If you specify a value outside of this range, an assert fails and processing halts.

Specifying a value for priority changes the behavior of the function when the maximum number of
voice objects has already been allocated.

When priority is set to nn::snd::VOICE_PRIORITY_NODROP and the lowest-priority allocated
voice objects are also nn::snd::VOICE_PRIORITY_NODROP, voice object allocation fails and the
function returns NULL. For any other value of priority, the lowest-priority allocated voice object is
dropped. However, if the requested voice object would itself have the lowest priority, allocation fails.

The callback parameter specifies the callback function to be called after the library forcibly drops a
voice object. No callback occurs when this value is NULL. The userArg parameter specifies any
arguments to be passed to the callback. Specify NULL if this is not needed.

The types of the callback functions are defined as follows.

Code 10-4. Voice Object Drop Callback Function


typedef void (* VoiceDropCallbackFunc)(nn::snd::Voice *pVoice, uptr userArg);

The pVoice parameter takes a pointer to the dropped voice object, and the userArg parameter
takes any arguments specified at allocation.
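
A typical allocation with a drop callback looks like the following; the priority value is arbitrary
and the callback body is application specific.

// Called by the library after a voice object has been forcibly dropped.
static void OnVoiceDropped(nn::snd::Voice* pVoice, uptr userArg)
{
    // pVoice points to the dropped voice object; userArg is the value passed at allocation.
    // Mark the corresponding application-side sound as stopped here.
    (void)pVoice;
    (void)userArg;
}

// Allocate a voice object with a mid-range priority and no user argument (NULL).
nn::snd::Voice* pVoice = nn::snd::AllocVoice(100, OnVoiceDropped, 0);
if (pVoice == NULL)
{
    // Allocation failed (for example, all 24 voices are held with priority
    // nn::snd::VOICE_PRIORITY_NODROP).
}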

10.2.1. Dropped-Voice Mode

The SND library releases (drops) low-priority voice objects when an attempt is made to allocate
more than the maximum number of voice objects, and also when the internal DSP processing load is
in danger of becoming too high.

Two modes exist to control this latter restriction; you can use the
nn::snd::SetVoiceDropMode() function to configure them.

Code 10-5. Setting the Dropped-Voice Mode

void nn::snd::SetVoiceDropMode(nn::snd::VoiceDropMode mode);

When mode is VOICE_DROP_MODE_DEFAULT, only predicted values are used to determine whether
to drop voice objects.

When mode is VOICE_DROP_MODE_REAL_TIME, both predicted values and the actual processing
load are used to determine whether to drop voice objects. You must configure three output buffers
to be used when you set this mode.
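
For example (enumerator scoping under nn::snd is assumed from the names above):

// Sketch: enable real-time voice-drop decisions. Three output buffers are
// required for this mode (see 10.1.1).
nn::snd::SetOutputBufferCount(3);
nn::snd::SetVoiceDropMode(nn::snd::VOICE_DROP_MODE_REAL_TIME);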

10.3. Setting Sound Source Data Information

The memory region (buffer region) used to store sound source data must be a contiguous region in device memory with a 32-byte-aligned starting address and a size that is a multiple of 32 bytes. When allocating individual sound source data buffers from this region, each buffer must be a contiguous memory region with a 32-byte-aligned starting address, but the 32-byte-multiple size restriction does not apply.

In some cases, sound source data written to a buffer might only be written to the cache. Be sure to
call the nn::snd::FlushDataCache() function to write the data to memory.

Pass the sound source data to the DSP as an nn::snd::WaveBuffer sound source data
information structure. Sound source data information must be initialized with the
nn::snd::InitializeWaveBuffer function before configuring information about the sound
source data. This is also the case when reusing information about sound source data that has already
been played.

After initializing the sound source data information, set the following member variables: bufferAddress for the starting address of the buffer, sampleLength for the sample length, and loopFlag for marking whether to loop playback. When the sound source data uses the DSP ADPCM sample format, also set the pAdpcmContext member variable. You can specify a value of your own choosing for the user parameter (userParam). Do not change the values of the other members of this structure.

Set other information about the sound source data (number of channels, sample format, sampling
rate, and basic information about ADPCM parameters) in the voice object.

Use the nn::snd::Voice::SetChannelCount() function to set the number of channels. Set this
number to 1 for monaural data, and 2 for stereo data. Any other values are invalid.

Use the nn::snd::Voice::SetSampleFormat() function to set the sample format. The SND library supports 8-bit PCM, 16-bit PCM, and DSP ADPCM formats.

Table 10-1. List of Sample Formats

Value                 Type         Stereo Playback

SAMPLE_FORMAT_PCM8    8-bit PCM    Interleaved
SAMPLE_FORMAT_PCM16   16-bit PCM   Interleaved
SAMPLE_FORMAT_ADPCM   DSP ADPCM    No

When the sample format is DSP ADPCM, set the ADPCM parameters using the
nn::snd::Voice::SetAdpcmParam() function. Set the nn::snd::WaveBuffer member variable
pAdpcmContext to the address of the ADPCM context data structure stored in the sound source
data, and do not change the context data until after playback has finished.

Use the nn::snd::Voice::SetSampleRate() function to set the sampling rate. Use the frequency
of the sound source data.

Use the nn::snd::Voice::SetInterpolationType() function to set the method of interpolating the sound data. Use the nn::snd::Voice::GetInterpolationType() function to get the current setting. Three interpolation methods are supported, as shown in the following table. The default value is INTERPOLATION_TYPE_POLYPHASE.

Table 10-2. List of Interpolation Methods

Value                          Interpolation Method

INTERPOLATION_TYPE_POLYPHASE   Interpolation using four points. The optimal coefficients are chosen based on the specified sampling rate and pitch.

INTERPOLATION_TYPE_LINEAR      Linear interpolation.

INTERPOLATION_TYPE_NONE        No interpolation. Noise occurs in the played-back sound when the sound source data sampling frequency is not 32,728 Hz.

After setting all the information about the sound source, use the
nn::snd::Voice::AppendWaveBuffer() function to register the sound source to the voice
object. Multiple sound sources can be registered consecutively, but 4 is the maximum number of
sound sources that can be played back by one voice object in one sound frame (roughly 4.889 ms).
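
The following sketch pulls the preceding steps together for a monaural 16-bit PCM source. pPcmData, sampleCount, and pVoice are assumed to have been prepared elsewhere; the exact InitializeWaveBuffer and AppendWaveBuffer argument forms and the nn::snd scoping of SAMPLE_FORMAT_PCM16 are assumptions, and the FlushDataCache arguments are not shown.

// Sketch: describe one buffer of sound source data and register it.
// pPcmData must point to 32-byte-aligned device memory that has already been
// written out of the data cache with nn::snd::FlushDataCache().
nn::snd::WaveBuffer waveBuffer;
nn::snd::InitializeWaveBuffer(&waveBuffer);    // required before configuring the structure

waveBuffer.bufferAddress = pPcmData;           // starting address of the sound source data
waveBuffer.sampleLength  = sampleCount;        // length in samples
waveBuffer.loopFlag      = false;              // one-shot playback

pVoice->SetChannelCount(1);                               // monaural data
pVoice->SetSampleFormat(nn::snd::SAMPLE_FORMAT_PCM16);    // 16-bit PCM
pVoice->SetSampleRate(32728);                             // sampling rate of the source data
pVoice->AppendWaveBuffer(&waveBuffer);                    // register to the voice object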

Use the nn::snd::WaveBuffer member variable Status to check the state of a sound source after
registration. This state is STATUS_FREE before a voice object is registered, STATUS_WAIT while
waiting for playback, STATUS_PLAY during playback, and STATUS_DONE after playback has finished.
Sound source data is managed by the voice object when in STATUS_WAIT and STATUS_PLAY, so do
not change any settings when in these states.

You can modify a registered sound source by calling nn::snd::Voice::UpdateWaveBuffer. However, the only information that can be modified is the sample length (sampleLength) and the loop specification flag (loopFlag). Additionally, depending on when the update is applied by SendParameterToDsp, data after the specified sample length might still be played back, or the modification might have no effect because playback has already completed.

You can delete a registered sound source by calling nn::snd::Voice::DeleteWaveBuffer. The deletion takes effect the next time SendParameterToDsp is executed; do not overwrite the data until playback of it is complete.

For streaming playback of DSP ADPCM sound source data, you only have to set the pAdpcmContext
member variable of the first sound source data registered to the voice object. The context is not
updated for sound source data registered later if their pAdpcmContext member variables are set to
NULL.

10.4. Creating Sound Threads

Design your application to create threads (sound threads) and have the threads wait for notifications
from the DSP library to the SND library. Notifications occur roughly every 4.889 ms. Set the priority of
the created threads as high as possible and avoid any intensive processing within the threads to keep
sound playback from being interrupted.

A sound thread essentially loops through the following steps (a sketch follows the list).

1. The sound thread waits for notification from the DSP library using the
nn::snd::WaitForDspSync() function.
2. The sound thread checks the state of the sound source data information registered to the allocated
voice object. After playback has finished, it registers the next sound source data information for
playback using the nn::snd::Voice::AppendWaveBuffer() function. When doing so, it normally
only registers the sound source data information within the sound thread; another thread has
already loaded the sound source data to be used for playback into a buffer.
3. The sound thread calls the nn::snd::SendParameterToDsp() function to send any
parameters that have changed since the previous notification and any newly registered sound
source data information to the DSP to update sound playback.
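
The following is a minimal sketch of such a loop. The s_SoundThreadRunning flag, s_WaveBuffer, and pVoice are illustrative, and the scoping of the status member and STATUS_DONE follows the descriptions in this chapter.

// Sketch of a sound thread main loop (steps 1 through 3 above).
while (s_SoundThreadRunning)
{
    nn::snd::WaitForDspSync();        // 1. wait for the DSP notification (about every 4.889 ms)

    // 2. when playback of the registered data has finished, register the next data
    if (s_WaveBuffer.status == nn::snd::WaveBuffer::STATUS_DONE)
    {
        nn::snd::InitializeWaveBuffer(&s_WaveBuffer);   // required before reuse (see 10.6)
        // ... set bufferAddress, sampleLength, and loopFlag again ...
        pVoice->AppendWaveBuffer(&s_WaveBuffer);
    }

    nn::snd::SendParameterToDsp();    // 3. send changed parameters and new registrations to the DSP
}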

Warning: Functions that manipulate voice objects are not thread-safe, with the exception of
nn::snd::AllocVoice and nn::snd::FreeVoice. You must ensure mutual exclusion
(for example, by using a critical section) when calling these functions from threads
other than the sound thread.

10.5. Sound Playback

Starting, stopping, and pausing sound playback is carried out by changing the state of a voice object.
Use the nn::snd::Voice::SetState and nn::snd::Voice::GetState() functions to set and
get the voice object state. Set the state to STATE_PLAY to start playback, STATE_STOP to stop
playback, and STATE_PAUSE to pause playback.

Table 10-3. Voice Object States

State         Description

STATE_PLAY    Order to start sound playback or an indication that playback is underway.

STATE_STOP    Order to stop sound playback or an indication that playback is stopped.

STATE_PAUSE   Order to pause sound playback or an indication that playback is paused. Set the state to STATE_PLAY to resume playback.

State changes may not be applied immediately. There may be a slight lag until parameters are sent
by the nn::snd::SendParameterToDsp() function (in other words, until the sound frame update
period (roughly 4.889 ms) has passed).

There are various other playback settings aside from states.

Priority

Use the nn::snd::Voice::SetPriority() function to set the priority of a voice object. Use the
nn::snd::Voice::GetPriority function to get the current setting. Sounds are played in the
order of voice object priority (for objects of the same priority, the most recently allocated object is
played first), so there is a chance that low-priority voice objects might not be played if the processing
load is too great.

Master Volume

The master volume is the overall volume for the SND library. Use the
nn::snd::SetMasterVolume() function to set this. Set to a value of 1.0 to play sounds at their
normal volume. Call the nn::snd::GetMasterVolume() function to get the current setting.

Volume

Represents the volume for each voice object. Use the nn::snd::Voice::SetVolume() function to
set the volume. Use the nn::snd::Voice::GetVolume() function to get the current setting. Set
the volume to 1.0 to specify unity gain.

Mix Parameters

Mix parameters control the gain for four channels (FrontLeft/FrontRight/RearLeft/RearRight).

Use the nn::snd::Voice::SetMixParam() function to set these parameters. Use the nn::snd::Voice::GetMixParam() function to get the current settings. The gain values for the left and right channels are stored in the members of the nn::snd::MixParam structure. Set the gain to 1.0 to play the channel at its normal volume.

The sound playback volume is the product of the master volume, volume, and mix parameter values.

Pitch

Pitch is the speed applied to the sampling rate of the sound source data. Use the
nn::snd::Voice::SetPitch() function to set the pitch. Use the nn::snd::Voice::GetPitch
function to get the current setting. Set the pitch to 1.0 to play the data at its normal rate. Setting the
pitch to 0.5 for sound source data with a sampling rate of 32 kHz would cause it to play at 16 kHz.
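
Putting these settings together, a minimal sketch (the nn::snd::Voice::STATE_PLAY scoping is an assumption) might be:

// Sketch: configure basic playback parameters and start playback.
nn::snd::SetMasterVolume(1.0f);                 // overall SND library volume
pVoice->SetPriority(16);
pVoice->SetVolume(1.0f);                        // unity gain for this voice
pVoice->SetPitch(1.0f);                         // play at the source sampling rate
pVoice->SetState(nn::snd::Voice::STATE_PLAY);   // applied at the next sound frame update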

Checking Playback Status

Use the nn::snd::Voice::IsPlaying() function to check whether a sound is currently playing.

Use the nn::snd::Voice::GetPlayPosition() function to get the current playback position. The returned value is the position, measured in samples from the start of the buffer in main memory, of the next sample that the DSP will process.

AUX Buses

The two AUX buses (A and B) can be used to apply effects like delay to the sounds being played. To
set and get the volume of the AUX buses, call nn::snd::SetAuxReturnVolume() and
nn::snd::GetAuxReturnVolume(), respectively. Set the volume to 1.0 to specify unity gain.
Because the mix parameters contain members for setting the gain of the AUX buses, you can adjust
the audio data that is sent to the AUX buses for each channel independently.

To apply an effect to sounds on the AUX buses, set a callback function for each bus. Set callbacks
using nn::snd::RegisterAuxCallback, and clear callbacks using
nn::snd::ClearAuxCallback. To get the callback that is currently set, call
nn::snd::GetAuxCallback. The types of the callback functions are defined as follows.

Code 10-6. Callback Functions for AUX Buses

typedef void (*AuxCallback)(nn::snd::AuxBusData* data, s32 sampleLength,
                            uptr userData);

The sample length (the number of samples, not the size in bytes) is passed in the sampleLength parameter, and the sound data is passed in the data parameter. Buffer addresses are set for each channel; by overwriting the data in these buffers, you can play sound data with effects applied as the output from the AUX buses.

Voice Interpolation

You can set the interpolation method used when converting from the sound source data’s sampling
rate to the desired playback frequency. Call the nn::snd::Voice::SetInterpolationType or
nn::snd::Voice::GetInterpolationType() functions to set or get the interpolation method.

You can set the interpolation method to the default four-point (INTERPOLATION_TYPE_POLYPHASE),
linear (INTERPOLATION_TYPE_LINEAR), or none (INTERPOLATION_TYPE_NONE). With four-point
interpolation, the library chooses the optimal coefficients based on the sampling rate and pitch.

Filters

You can apply a filter to each voice object. Call the nn::snd::Voice::SetFilterType or nn::snd::Voice::GetFilterType function to set or get the filter setting. You can set the filter to monopolar (FILTER_TYPE_MONOPOLE), bipolar (FILTER_TYPE_BIQUAD), or none (FILTER_TYPE_NONE, the default). There are also functions for setting the coefficients for each filter.

Call the nn::snd::Voice::SetMonoFilterCoefficients or nn::snd::Voice::GetMonoFilterCoefficients() function to set or get the monopolar filter coefficients. Instead of setting the coefficients directly, you can also specify a cutoff frequency to have the filter act as a monopolar low-pass filter.

Call the nn::snd::Voice::SetBiquadFilterCoefficients or nn::snd::Voice::GetBiquadFilterCoefficients() function to set or get the bipolar filter coefficients. This filter type only allows the coefficients to be specified.

When setting using coefficients, calculate the coefficients for each filter using a base frequency of
32,728 Hz.

Clipping Mode

Any portion of the final sound output that exceeds 16 bits is clipped. Two clipping methods are used: either normal clipping or soft clipping that reduces high-frequency noise. Call the nn::snd::SetClippingMode() function to set the clipping mode. Specify either CLIPPING_MODE_NORMAL or CLIPPING_MODE_SOFT. Call the nn::snd::GetClippingMode() function to get the current setting. Soft clipping is the default clipping mode.

Soft clipping clips the sound output non-linearly, which reduces distortion, but it also alters sample values that are not at the maximum amplitude. This has almost no effect on normal music and most other audio, but for input such as single-frequency sine waves, harmonics may be added to the output.

The following figure shows the waveforms for a high-amplitude sine wave after clipping in each mode.
The green shows the results of soft clipping.

Figure 10-2. Clipping Mode Differences

Automatic Fade-In

You can automatically fade in audio from a volume of 0 at the start of playback to the previously set
volume over the course of a single sound frame. You can set fade-in processing for each voice object
using the nn::snd::Voice::SetStartFrameFadeInFlag() function, passing true as the
argument to enable fade-in. The default is false (no fade-in).

Using BCWAV Files

Use the SetupBcwav() function of the nn::snd::Voice class to easily use the waveform files
converted by the ctr_WaveConverter tool (BCWAV files) for sound playback.

Code 10-7. Using BCWAV Files

bool nn::snd::Voice::SetupBcwav(uptr addrBcwav,
    nn::snd::WaveBuffer* pWaveBuffer0, nn::snd::WaveBuffer* pWaveBuffer1,
    nn::snd::Bcwav::ChannelIndex channelIndex = nn::snd::Bcwav::CHANNEL_INDEX_L);

For the addrBcwav parameter, specify the starting address of a buffer to load the bcwav file to. The
data loaded into the buffer is used (as is) as the sound source data, so the buffer is also subject to
the requirements for sound source data. Specifically, the memory must be in device memory with the
starting address 32-byte aligned, the buffer size must be a multiple of 32 bytes, and the loaded data
must be written to memory using the nn::snd::FlushDataCache() function.

For the pWaveBuffer0 and pWaveBuffer1 parameters, specify pointers to the nn::snd::WaveBuffer structures used for initial and loop playback. The structures must be created by the application. If you know that the sound source does not have a loop flag set, you can pass NULL for the pWaveBuffer1 parameter.

For the channelIndex parameter, when using stereo sound source data, specify which channel’s
sound source data to allocate to the Voice class. When using monaural sound source data, you must
specify CHANNEL_INDEX_L for this parameter. In the current version, stereo sound source data
converted by the ctr_WaveConverter tool is not interleaved. Consequently, you must prepare two
instances of the Voice class to play stereo BCWAV files.

The function returns true if successful and the data is prepared for playback. Of the parameters not
included in the BCWAV file header, values of 1.0 are used as the defaults for the volume and pitch.
Mix parameters are not set.

Use the nn::snd::Bcwav class to access BCWAV file header information.

Effects (Delay and Reverb)

You can apply effects to sound playback by either implementing effects in your application by using
an AUX bus callback function, or by using the effects provided by the library.

The SND library provides delay (nn::snd::FxDelay class) and reverb (nn::snd::FxReverb
class) effects. To use them, generate an instance of the appropriate class, set the effect parameters,
allocate working memory, and then call the nn::snd::SetEffect() function, specifying the effect
instance and the AUX bus to apply the effect to. You can only apply one type of effect per bus, and if
you call SetEffect multiple times, only the effect from the last call is applied. The CPU rather than
the DSP handles the actual effect processing.

To stop using an effect, call the nn::snd::ClearEffect() function, specifying the AUX bus to
clear. This function only clears effects, and does not cancel any AUX bus callback function.

Note: For more information about the effect parameters, see the API Reference.

10.5.1. Sound Output Mode

The SND library provides three sound output modes: mono, stereo, and surround. Call
nn::snd::SetSoundOutputMode to set the output mode and nn::snd::GetSoundOutputMode
to get the current setting.

Code 10-8. Sound Output Mode Definitions and Getter/Setter Functions

typedef enum
{
OUTPUT_MODE_MONO = 0,
OUTPUT_MODE_STEREO = 1,
OUTPUT_MODE_3DSURROUND = 2
} nn::snd::OutputMode;

bool nn::snd::SetSoundOutputMode(nn::snd::OutputMode mode);


nn::snd::OutputMode nn::snd::GetSoundOutputMode(void);

If nn::snd::SetSoundOutputMode returns true, the operation succeeded, and the sound output mode is set to the value specified by the mode parameter. By default, the sound output mode is set to stereo (OUTPUT_MODE_STEREO).
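
For example, the following sketch requests 3D surround output and falls back if the call fails (enumerator scoping under nn::snd is assumed):

// Sketch: switch the final output to 3D surround.
if (!nn::snd::SetSoundOutputMode(nn::snd::OUTPUT_MODE_3DSURROUND))
{
    // The mode could not be changed; continue with the mode reported below.
}
nn::snd::OutputMode currentMode = nn::snd::GetSoundOutputMode();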

10.5.1.1. Mono

The mix that is output through the left and right speakers is as follows. The four input channels
(FrontLeft, FrontRight, RearLeft, and RearRight) are the result of mixing the outputs of
all voices with the output of the AUX buses.

Output = ( FrontLeft + FrontRight + RearLeft + RearRight ) * 0.5

10.5.1.2. Stereo

The mixes that are output through the left and right speakers are as follows. The four input
channels (FrontLeft, FrontRight, RearLeft, and RearRight) are the result of mixing the
outputs of all voices with the output of the AUX buses.

OutputLeft = ( FrontLeft + RearLeft )

OutputRight = ( FrontRight + RearRight )

10.5.1.3. Surround

The four input channels (FrontLeft, FrontRight, RearLeft, and RearRight) undergo 3D
surround processing to impart spatial widening to the output. The four input channels are the
result of mixing the outputs of all voices with the output of the AUX buses. This 3D surround
operation calculates the speaker output based on the "position" and "depth" parameters of the
virtual speakers. Although the 3D surround operation is performed on the mixed output (as
opposed to individual voices), it is possible to bypass the output of the front channels
(FrontLeft and FrontRight) on a per-voice basis so that they are not affected by the 3D
surround operation.

After 3D surround processing, sound volume tends to seem greater when the pan position is set
on either left or right edges compared to when set in the center. If this difference in volume is a
concern, correct output accordingly to mitigate the effect, such as by limiting the left and right
edge mix parameters to a maximum of roughly 0.8.

Positions of the Virtual Speakers

Use nn::snd::SetSurroundSpeakerPosition to specify the virtual speaker position mode. If this function succeeds in setting the mode, it returns true.

Code 10-9. Setting the Virtual Speaker Position

bool nn::snd::SetSurroundSpeakerPosition(nn::snd::SurroundSpeakerPosition pos);


Specify one of the following modes for the pos parameter.

Table 10-4. Modes for the Virtual Speaker Position

Setting Value                        Description

SURROUND_SPEAKER_POSITION_SQUARE     Square mode. Positions the virtual speakers in a square pattern.

SURROUND_SPEAKER_POSITION_WIDE       Wide mode. Positions the left and right virtual speakers further apart.

The virtual speaker mode determines the angle of symmetry of the four channels. The following
figure shows the angles at which the speakers are positioned for each mode.

Figure 10-3. Virtual Speaker Placement for Each Mode

Surround Depth

You can vary the intensity of the surround effect by specifying a depth value using
nn::snd::SetSurroundDepth. If this function succeeds in setting the depth value, it returns
true.

Code 10-10. Setting the Surround Depth Value

bool nn::snd::SetSurroundDepth(f32 depth);

Set the depth parameter to a value between 0.0 and 1.0, inclusive. Specifying 0.0 produces
the minimum effect, and specifying 1.0 produces the maximum effect. The default value is
currently 1.0.

The 3D surround operation automatically adapts depending on whether the sound is being output
through the speakers or through headphones. This depth value is only valid when sound is output
through the speakers. It is disabled when sound is output through the headphones and has no
effect on the surround effect.

Front Bypass Setting

The output for the rear channels is always affected by the 3D surround operation. In contrast,
each voice object that is output to the front channels can be configured on an individual basis to
bypass the 3D surround effect. You can configure the bypass setting by calling the
SetFrontBypassFlag member function of the nn::snd::Voice class.

Code 10-11. Front Bypass Setting for Voice Objects

void nn::snd::Voice::SetFrontBypassFlag(bool flag);

Set the flag parameter to true to make the output for the front channels bypass the 3D sound
operation. This argument defaults to false (in which case, the data for the front channels does
not bypass the 3D surround operation).

Figure 10-4. Effect of the Front Bypass Setting on the 3D Surround Operation

You can also set the front bypass for the front channel of the AUX bus.

Code 10-12. Setting the AUX Bus Front Bypass

bool nn::snd::SetAuxFrontBypass(nn::snd::AuxBusId busId, bool flag);

Specify the AUX bus ID in the busId parameter, and pass true in the flag parameter to
bypass. The default value for all buses is false (no bypass).

10.5.2. Headphone Connection State

Call the nn::snd::GetHeadphoneStatus or nn::snd::UpdateHeadphoneStatus() function to find out whether headphones are connected to the audio jack. The function returns true when headphones are connected.

The GetHeadphoneStatus() function returns the headphone state from the states regularly
updated by the SendParameterToDsp() function called by the sound thread, and so the results
may be up to 32 sound frames (approximately 160 ms) old. Use the UpdateHeadphoneStatus()
function if you need real-time results. Note that this function entails a heavier processing load, as it
updates the DSP state.

Code 10-13. Headphone Connection State

bool nn::snd::GetHeadphoneStatus();
bool nn::snd::UpdateHeadphoneStatus();

10.5.3. Getting the Output Audio Data

You can load the mixed audio data that will ultimately be played back through the speakers into a
buffer.

Code 10-14. Getting the Output Audio Data

bool nn::snd::GetMixedBusData(s16* pData,
    s32 nSamplesPerFrame = NN_SND_SAMPLES_PER_FRAME);

For the pData parameter, specify a buffer for storing the audio data. The buffer's starting address must be nn::snd::MIXED_BUS_DATA_ALIGNMENT (4 bytes) aligned. Calculate the required size of the buffer as sizeof(s16) * nSamplesPerFrame * 2.

For the nSamplesPerFrame parameter, specify the number of samples per channel (usually
NN_SND_SAMPLES_PER_FRAME).

The function returns true if it successfully loads the audio data. Audio data is stored in stereo 16-
bit PCM format, with the left and right channels interleaved. Header information is not included. The
sampling rate is the sound DSP sampling rate (approximately 32,728 Hz).
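
A minimal sketch follows; the static array is assumed to satisfy the 4-byte alignment requirement (ensure this with a compiler alignment attribute if necessary).

// Sketch: capture one sound frame of the final stereo mix.
static s16 s_MixedData[NN_SND_SAMPLES_PER_FRAME * 2];   // left/right interleaved

if (nn::snd::GetMixedBusData(s_MixedData, NN_SND_SAMPLES_PER_FRAME))
{
    // s_MixedData now holds interleaved 16-bit PCM at approximately 32,728 Hz.
}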

10.5.4. Encoding to DSP ADPCM Format

Call the nn::snd::EncodeAdpcmData() function to encode monaural 16-bit PCM data in the
DSP ADPCM format.

Code 10-15. Encoding to DSP ADPCM Format

s32 nn::snd::GetAdpcmOutputBufferSize(s32 nSamples);

void nn::snd::EncodeAdpcmData(s16* pInput, u8* pOutput, s32 nSamples,
    s32 sampleRate, s32 loopStart, s32 loopEnd,
    nn::snd::DspsndAdpcmHeader* pInfo);

For the pInput parameter, specify the starting address of a buffer storing the monaural 16-bit PCM
data to encode. The function can only encode monaural audio data. Specify the number of samples
and the sampling rate in the nSamples and sampleRate parameters.

For the pOutput parameter, specify a buffer for storing the encoded DSP ADPCM data. The buffer's starting address must be 4-byte aligned. Get the required size of the buffer by calling the nn::snd::GetAdpcmOutputBufferSize() function, passing the number of audio data samples as an argument.

For the loopStart and loopEnd parameters, specify the starting and ending sample positions for looping. Specify sample positions relative to the start of the 16-bit PCM data, where the first sample is position 0.

For the pInfo parameter, specify a pointer to a structure storing the DSP ADPCM data header
information.
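
A sketch of the encoding flow follows. pPcm16 and sampleCount are assumed to describe the monaural source data, and AllocateAligned4() is a hypothetical 4-byte-aligned allocator, not an SDK function.

// Sketch: encode monaural 16-bit PCM data to DSP ADPCM.
nn::snd::DspsndAdpcmHeader adpcmHeader;
s32 outputSize = nn::snd::GetAdpcmOutputBufferSize(sampleCount);
u8* pAdpcm = static_cast<u8*>(AllocateAligned4(outputSize));   // hypothetical allocator

nn::snd::EncodeAdpcmData(pPcm16, pAdpcm, sampleCount,
                         32728,                // sampling rate of the source data
                         0, sampleCount - 1,   // loop start and end sample positions
                         &adpcmHeader);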

10.5.5. Decoding From DSP ADPCM Format

Call the nn::snd::DecodeAdpcmData() function to decode audio data in the DSP ADPCM format to monaural 16-bit PCM data.

Code 10-16. Decoding From DSP ADPCM Format

void nn::snd::DecodeAdpcmData(const u8* pInput, s16* pOutput,
    const nn::snd::AdpcmParam& param,
    nn::snd::AdpcmContext& context, s32 nSamples);

For the pInput parameter, specify the starting address of a buffer storing the DSP ADPCM data to
decode. For the param and context parameters, specify the DSP ADPCM data parameters and
context. For the nSamples parameter, specify the number of audio data samples.

For the pOutput parameter, specify a buffer for storing the decoded monaural 16-bit PCM data. The buffer's starting address must be 4-byte aligned.

After execution completes, context stores the context as of the end of decoding.

10.5.6. Converting Between Sample Positions and Nibbles

The library provides functions for converting between PCM data sample positions and the number
of DSP ADPCM nibbles.

Code 10-17. Converting Between Sample Positions and Nibbles

u32 nn::snd::ConvertAdpcmPos2Nib(u32 nPos);
u32 nn::snd::ConvertAdpcmNib2Pos(u32 nNib);

10.5.7. Sound Output When the System Is Closed and Sleep Is Rejected

If sleep is rejected when the system is closed, sound is forcibly output to the headphones, but you
can control that behavior with the following function.

Code 10-18. Controlling Sound Output When the System Is Closed and Sleep Is Rejected

nn::Result nn::snd::SetHeadphoneOutOnShellClose(bool forceout);

Specify false for the forceout argument to have sound output from the speakers even when the
system is closed (if the headphones are not connected). Specify true to have sound output from
the headphones regardless of whether the headphones are connected.

Warning: If you specify false for the forceout argument, make sure to reject the request to
sleep. If the request is not rejected, sound is output from the speakers in the time
between when the system is closed and when it transitions to the sleep state.

If this function is called from within the callback function that receives the sleep request,
however, that particular sleep request does not change the setting, and the setting takes
effect starting with the next sleep request.

10.5.8. Source of Noise and Countermeasures

The device design may cause noise in the following circumstances.

10.5.8.1. A buzzing noise comes from the speaker

The speaker and physical structure of CTR have shown a tendency for clipping noise to occur around 500 Hz. This noise is caused by the surround process amplifying the playback when a sound containing the 500-Hz range is played with a pan.

It occurs under the following conditions.

The speaker is playing sound.
The playback is a sine wave in the 400-Hz to 600-Hz range, or a sine-wave-like sound, such as a flute or French horn.
The volume is set to a high value.
The pan is set to the maximum value.
Surround is ON.

Possible ways of handling this include:

Do not pan across extremes.
Center the playback, and maintain the overall loudness by playing at a moderate volume from both speakers.
Suppress the 400-Hz to 600-Hz range using the equalizer.
Change the tone (waveform) of the playback.

10.5.8.2. A mechanical buzzing noise comes from the speaker

This noise is caused by vibration from loud playback being transmitted to the 3D depth slider, making the slider and the body rattle against one another.

This noise becomes most noticeable with a simple sine wave tone in the 400-Hz to 1-kHz range. You can conclusively identify it by holding down the 3D depth slider with a finger: if the buzzing is muted by the pressure on the slider, the noise is of this type.

This noise is mitigated by reducing the speaker output volume, or playing a sound that is not a
simple sine wave tone.

10.5.8.3. Light static sound is superimposed on the playback

This noise is caused by the design of the speaker. It is somewhat prominent in single tones using sine waves at about 400 Hz to 1 kHz.

This noise is mitigated by reducing the speaker output volume, or playing a sound that is not a
simple sine wave tone.

10.5.8.4. A snapping or popping noise is heard from the earphones under certain circumstances

In the following cases, a snapping or popping noise is audible from the earphones.

When power is turned on or off
When entering or leaving Sleep Mode
When entering compatibility mode

This noise is due to the design of the system, and there is no mitigation measure. If the noise is heard when the power is turned on with the volume slider set to zero, that unit will always produce the sound under these circumstances.

Both production and development units can produce this noise, but it does not occur with PARTNER-CTR.

10.6. Reusing Sound Source Data Information

Information for sound source data that has been played back can be reused by initializing with the
nn::snd::InitializeWaveBuffer() function. Use the status member of the sound source data
information to check whether sound source data playback has finished.

Table 10-5. Sound Source Data Information States

State                  Description

STATUS_FREE            State immediately after initializing sound source data information.

STATUS_WAIT            State immediately after registering sound source data information to a voice object.

STATUS_PLAY            State when sound source data is being played.

STATUS_DONE            State when sound source data playback is finished. Includes when sound playback has been stopped.

STATUS_TO_BE_DELETED   State when registered sound source data is scheduled for deletion using nn::snd::Voice::DeleteWaveBuffer. The state transitions to STATUS_DONE the next time nn::snd::SendParameterToDsp is executed.

The following diagram shows these state transitions.

Figure 10-5. Sound Source Data Information State Transitions


10.7. Releasing Voice Objects

Use the nn::snd::FreeVoice() function to release voice objects that are no longer needed. However, do not use this function to release voice objects that the library has already dropped to satisfy another allocation.

10.8. Finalizing

Complete the following steps to finalize the DSP and SND libraries used for sound playback.

1. Call the nn::snd::Finalize() function to close the SND library.


2. Call the nn::dsp::UnloadComponent() function to unload the components loaded into the
DSP library and stop the library.
3. Call the nn::dsp::Finalize() function to close the DSP library.

Code 10-19. Finalizing Sound Processing

nn::snd::Finalize();
nn::dsp::UnloadComponent();
nn::dsp::Finalize();


11. System Settings


This chapter describes how applications can get settings, such as user information and sound settings,
from the system settings.

11.1. Initialization

Use the CFG library to access information handled by System Settings or about the 3DS system
itself.

You must call the nn::cfg::Initialize() function to initialize the CFG library before you can use
its functionality, with the exception of a few functions not bound by this restriction. After initialization,
CFG library functions can be called up until finalization occurs.

Code 11-1. Initializing the CFG Library

void nn::cfg::Initialize(void);

11.2. Getting Information

After initializing the library, call the provided functions to access various kinds of information.

11.2.1. User Name

You can get the user name configured in the system settings.

Code 11-2. Getting the User Name

void nn::cfg::GetUserName(nn::cfg::UserName* pUserName);

struct nn::cfg::UserName
{
wchar_t userName[CFG_USER_NAME_LENGTH];
bool isNgUserName;
NN_PADDING1;
};

For the pUserName parameter, specify a pointer to an nn::cfg::UserName structure for storing
the user name.

The user name is stored as a wide-character string in the userName member of the
nn::cfg::UserName structure, and true is stored in the isNgUserName member if the user's
name includes any words that fail a profanity check.

In the Japan region, the profanity check uses Japanese, in the North American region it uses
American English and the System Settings language, and in Europe it uses British English and the
System Settings language.
11.2.2. Birthday

You can get the user's birthday configured in the system settings.

Code 11-3. Getting the User's Birthday

void nn::cfg::GetBirthday(nn::cfg::Birthday* pBirthday);

struct nn::cfg::Birthday
{
s8 month;
s8 day;
};

For the pBirthday parameter, specify a pointer to an nn::cfg::Birthday structure for storing the birthday.

11.2.3. Country Code

You can get the country code for the user's country and region of residence configured in the
system settings.

Code 11-4. Getting the Country Code

nn::cfg::CfgCountryCode nn::cfg::GetCountry(void);

For more information about the country codes defined in the system, see the
nn/cfg/CTR/cfg_CountryCode.h header file.

Use the following function to make conversions between country codes and country name codes
(ISO 3166-1 alpha-2 format).

Code 11-5. Converting Between Country Codes and Country Name Codes

nn::Result nn::cfg::ConvertCountryCodeToIso3166a2(
char* iso3166a2, nn::cfg::CfgCountryCode countryCode);
nn::Result nn::cfg::ConvertIso3166a2ToCountryCode(
nn::cfg::CfgCountryCode* pCountryCode, const char* iso3166a2);

Note: The nn::cfg::GetCountryCodeA2() function will be removed from CTR-SDK.

Warning: Applications that use the function for converting between country codes and country
name codes must be sure to handle the nn::cfg::ResultNotFound return value if it
is returned.
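
For example (the 3-byte output buffer, two letters plus a terminator, is an assumption about how the converted string is returned):

// Sketch: convert the configured country code to its ISO 3166-1 alpha-2 string.
char iso3166a2[3];
nn::Result result = nn::cfg::ConvertCountryCodeToIso3166a2(
    iso3166a2, nn::cfg::GetCountry());

if (result.IsFailure())
{
    // Handle nn::cfg::ResultNotFound as noted in the warning above.
}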

11.2.4. Language Code


You can get the language code for the language used for display configured in the system settings.

Code 11-6. Getting the Language Code

nn::cfg::CfgLanguageCode nn::cfg::GetLanguage(void);

The following table shows the language codes defined in the system.

Table 11-1. Language Codes

Values                      Language                Language Name (ISO 639-1 alpha-2)

CFG_LANGUAGE_JAPANESE       Japanese                ja
CFG_LANGUAGE_ENGLISH        English                 en
CFG_LANGUAGE_FRENCH         French                  fr
CFG_LANGUAGE_GERMAN         German                  de
CFG_LANGUAGE_ITALIAN        Italian                 it
CFG_LANGUAGE_SPANISH        Spanish                 es
CFG_LANGUAGE_SIMP_CHINESE   Chinese (Simplified)    zh
CFG_LANGUAGE_KOREAN         Korean                  ko
CFG_LANGUAGE_DUTCH          Dutch                   nl
CFG_LANGUAGE_PORTUGUESE     Portuguese              pt
CFG_LANGUAGE_RUSSIAN        Russian                 ru
CFG_LANGUAGE_TRAD_CHINESE   Chinese (Traditional)   zh

Use the following function to convert the obtained language code to a language name in ISO 639-1 alpha-2 format.

Code 11-7. Converting From Language Code to Language Name

const char* nn::cfg::GetLanguageCodeA2(CfgLanguageCode cfgLanguageCode);

The function returns NULL if there is no string corresponding to the language code specified in the
cfgLanguageCode parameter.

Note: This function can be called without initialization occurring beforehand.
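
A minimal usage sketch:

// Sketch: get the display language and its ISO 639-1 alpha-2 string.
nn::cfg::CfgLanguageCode language = nn::cfg::GetLanguage();
const char* languageA2 = nn::cfg::GetLanguageCodeA2(language);
if (languageA2 == NULL)
{
    // No string corresponds to the obtained language code.
}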

11.2.5. Simple Address Information

You can get the user's simple address information (country, region, latitude, longitude) configured
in System Settings.

Code 11-8. Getting Simple Address Information

void nn::cfg::GetSimpleAddress(nn::cfg::SimpleAddress* pSimpleAddress);

struct nn::cfg::SimpleAddress
{
u32 id;
wchar_t countryName[CFG_SIMPLE_ADDRESS_NUM_LANGUAGES]
[CFG_SIMPLE_ADDRESS_NAME_LENGTH];
wchar_t regionName[CFG_SIMPLE_ADDRESS_NUM_LANGUAGES]
[CFG_SIMPLE_ADDRESS_NAME_LENGTH];
u16 latitude;
u16 longitude;
};

For the pSimpleAddress parameter, specify a pointer to an nn::cfg::SimpleAddress structure for storing the simple address information.

The countryName and regionName members of the nn::cfg::SimpleAddress structure store the country name and region name as wide-character strings, and the latitude and longitude members store the latitude and longitude.

The latitude and longitude member values are expressed in increments of 360° ÷ 65536 (approximately 0.005°). Northern latitudes from 0° through 90° are stored as values from 0x0000 through 0x4000, southern latitudes from 0.005° through 90° as values from 0xFFFF through 0xC000, eastern longitudes from 0° through 179.995° as values from 0x0000 through 0x7FFF, and western longitudes from 0.005° through 180° as values from 0xFFFF through 0x8000.
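
As a worked example of this encoding, the following sketch converts the stored fields to signed degrees (positive for north latitude and east longitude, negative for south and west) by reinterpreting each field as a signed 16-bit value:

// Sketch: decode the latitude and longitude fields into degrees.
nn::cfg::SimpleAddress address;
nn::cfg::GetSimpleAddress(&address);

f32 latitudeDeg  = static_cast<s16>(address.latitude)  * (360.0f / 65536.0f);
f32 longitudeDeg = static_cast<s16>(address.longitude) * (360.0f / 65536.0f);
// Example: 0x4000 -> 90.0 (90 degrees north), 0xC000 -> -90.0 (90 degrees south).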

11.2.5.1. Simple Address Information ID

The following function gets only the simple address information ID.

Code 11-9. Getting Simple Address Information ID

void nn::cfg::GetSimpleAddressId(nn::cfg::SimpleAddressId* pSimpleAddressId);

struct nn::cfg::SimpleAddressId
{
u32 id;

nn::cfg::CfgCountryCode GetCountryCode(void) const;
u8 GetRegionCode(void) const;
};

For the pSimpleAddressId parameter, specify a pointer to an nn::cfg::SimpleAddressId structure.

The nn::cfg::SimpleAddressId structure has a GetCountryCode member function to get the country code and a GetRegionCode member function to get the region code. For the values available from these respective member functions, see 11.2.3. Country Code.

11.2.5.2. Getting the Simple Address Information From the ID

The simple address information can be obtained from the simple address information ID.

Code 11-10. Getting Simple Address Information From the ID

nn::Result nn::cfg::GetSimpleAddress(nn::cfg::SimpleAddress* pSimpleAddress,
    nn::cfg::SimpleAddressId simpleAddressId,
    uptr pWorkMemory, u32 workMemorySize);

The simple address information obtained based on the ID specified in simpleAddressId is stored in pSimpleAddress.

Working memory is needed for operation of this function. A memory region at least as large as
nn::cfg::CFG_SIMPLE_ADDRESS_WORKMEMORY_SIZE must be reserved and the memory
location and size specified in pWorkMemory and workMemorySize respectively.

11.2.5.3. Converting Simple Address Information IDs Between 3DS and Wii U

Use the following function to convert simple address information IDs between 3DS and Wii U.

Code 11-11. Converting Simple Address IDs Between 3DS and Wii U

nn::cfg::SimpleAddressId nn::cfg::ConvertToWiiUSimpleAddressId(
nn::cfg::SimpleAddressId ctrSimpleAddressId);
nn::cfg::SimpleAddressId nn::cfg::ConvertToCtrSimpleAddressId(
nn::cfg::SimpleAddressId wiiUSimpleAddressId);

11.2.6. Region Codes

Note: The region codes here are those set in the 3DS system at the time of shipment. Region
codes set in Game Cards are specified in a BSF file, and applications will not run if the
system and card region codes do not match or if the card does not have a set region
code. For more information about BSF files, see the reference manual for the CTR-SDK
tool ctr_makebanner.

You can get the region code for the system’s target market.

Code 11-12. Getting the Region Code

nn::cfg::CfgRegionCode nn::cfg::GetRegion(void);

The following table shows the region codes defined in the system.

Table 11-2. Region Codes

Values               Target Market          3-Letter Code

CFG_REGION_JAPAN     Japan                  JPN
CFG_REGION_AMERICA   Americas               USA
CFG_REGION_EUROPE    Europe and Australia   EUR
CFG_REGION_CHINA     China                  CHN
CFG_REGION_KOREA     South Korea            KOR
CFG_REGION_TAIWAN    Taiwan                 TWN

Use the following function to convert the obtained region code to a three-letter string.

Code 11-13. Converting From Region Code to 3-Letter String

const char* nn::cfg::GetRegionCodeA3(CfgRegionCode cfgRegionCode);

The function returns NULL if there is no string corresponding to the region code specified in the cfgRegionCode parameter.

11.2.7. Sound Output Mode

You can get the sound output mode configured in System Settings.

Code 11-14. Getting the Sound Output Mode

nn::cfg::CfgSoundOutputMode nn::cfg::GetSoundOutputMode(void);

The following table shows the sound output modes defined in the system.

Table 11-3. Sound Output Modes

Values Sound Output Mode

CFG_SOUND_OUTPUT_MODE_MONO Monaural
CFG_SOUND_OUTPUT_MODE_STEREO Stereo

CFG_SOUND_OUTPUT_MODE_SURROUND 3D Surround Sound

11.2.8. RTC Modification Offset Value

You can get the offset value saved to the hardware as the cumulative total of user changes to the
RTC time.

Code 11-15. Getting the RTC Modification Offset Value

nn::fnd::TimeSpan nn::cfg::GetUserTimeOffset(void);

The function returns the modification offset in seconds. This is the cumulative total of the absolute
values of all user changes made to the clock with the system settings. For more information about
how to handle these values, see 8.5.3. Handling Time Modification Offset Values.

11.2.9. Parental Controls

An application must check whether restrictions are enabled in Parental Controls before photographs
and images can be exchanged or friends added. Call nn::cfg::IsParentalControlEnabled to
check whether the parental controls are enabled.

Code 11-16. Checking Whether Parental Controls Are Enabled

bool nn::cfg::IsParentalControlEnabled(void);

Table 11-4. Parental Controls Settings and Restricted Functionality

Item                                                Restricted Application Functionality          Section

Age Restriction                                     N/A (restricted to the system side)           -

Use of Internet Browser                             N/A (restricted to the system side)           -

Using Nintendo e-Shopping to Purchase Merchandise   Purchasing Content Using ECDK                 11.2.9.6
and Services

3D Image Display                                    N/A (restricted to the system side)           -

Use of Miiverse                                     Posting to Miiverse and Viewing Miiverse      11.2.9.7

Sending and Receiving Photos, Images, Voice         Sending and Receiving Rich UGC                11.2.9.2
Recordings, Videos, or Text                         Uploading Rich UGC (North America)            11.2.12

Internet Communication With Other Users             Data Exchange With Other Users                11.2.9.4

StreetPass Communication With Other Users           Using StreetPass                              11.2.9.5

Friend Registration                                 Friend Registration Within the Application    11.2.9.3

Use of DS Download Play                             N/A                                           -

Viewing Distributed Videos                          Viewing videos obtained by communication      11.2.9.8

11.2.9.1. Temporarily Suspending Parental Controls by Entering a PIN

When Parental Controls are in force, applications can temporarily suspend Parental Controls by
having the user input a PIN. Parental Controls are then suspended if the entered PIN matches the
PIN in the Parental Controls settings. Note that neither the PIN entered by the user nor the PIN in
the Parental Controls settings are ever displayed on the screen.

You can compare to the PIN in the Parental Controls settings by calling the
nn::cfg::CheckParentalControlPinCode() function.

Code 11-17. Comparing to the Parental Controls PIN

bool nn::cfg::CheckParentalControlPinCode(const char *input);

This function returns true if the string passed to input matches the PIN (four single-byte
digits).

The software keyboard applet includes a mode for temporarily suspending Parental Controls.
When started in this mode, the software keyboard applet operates according to a specialized
sequence. The application can then determine whether to suspend Parental Controls based on
the return value from the applet.

11.2.9.2. Restrictions on the Transmission of Data That May Include Personal Information

Call the nn::cfg::IsRestrictPhotoExchange() function to check whether photo exchanges (Sending and Receiving Photos, Images, Voice Recordings, Videos, or Text) are restricted. If they are restricted, and so long as Parental Controls are not temporarily suspended, the application cannot exchange photos or other images with any other system. This restriction covers not only the sending and receiving of photos, but also screen images, voice recordings, videos, text, and any other data that could include personal information.

Code 11-18. Confirmation of Restrictions on the Transmission of Data That May Include Personal
Information

bool nn::cfg::IsRestrictPhotoExchange(void);

Be careful, as true is returned when the restriction is active.
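
A sketch of gating such a feature follows. pinEnteredByUser is an illustrative string obtained from the user (in practice, the software keyboard applet's dedicated mode described above is typically used for PIN entry).

// Sketch: check the restriction and optionally lift it temporarily by PIN entry.
bool exchangeAllowed = !nn::cfg::IsRestrictPhotoExchange();
if (!exchangeAllowed && nn::cfg::CheckParentalControlPinCode(pinEnteredByUser))
{
    // Correct PIN entered: Parental Controls may be treated as temporarily
    // suspended for this feature.
    exchangeAllowed = true;
}
if (!exchangeAllowed)
{
    // Do not send or receive photos, images, voice recordings, videos, text,
    // or any other data that may include personal information.
}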

11.2.9.3. Restrictions on Adding Friends

Call the nn::cfg::IsRestrictAddFriend() function to check whether adding friends is restricted by the Parental Controls settings. If restricted, friend registration from within an application will end in an error. In principle, the application is not permitted to temporarily cancel this restriction.

Code 11-19. Checking Whether Adding Friends Is Restricted

bool nn::cfg::IsRestrictAddFriend(void);

Be careful, as true is returned when the restriction is active.

11.2.9.4. Restrictions on Internet Communication With Other Users

The nn::cfg::IsRestrictP2pInternet() function can be used to confirm whether exchanging data or online play with other players through Internet communication has been restricted by Parental Controls. If restriction on Internet communication has been enabled, Internet communication must not be performed for the purpose of downloading user-created content or engaging in online play with other players unless the restriction is temporarily lifted.

Code 11-20. Confirmation of Restrictions of Internet Communication With Other Users

bool nn::cfg::IsRestrictP2pInternet(void);

Be careful, as true is returned when the restriction is active.


11.2.9.5. Restrictions on Communication via StreetPass

The nn::cfg::IsRestrictP2pCec() function can be used to confirm whether exchanging data with other players through StreetPass has been restricted by Parental Controls. If restriction on communication through StreetPass has been enabled, StreetPass does not work at all, and an error will occur if an attempt is made to register StreetPass data. The application cannot temporarily cancel this restriction.

Code 11-21. Checking Restrictions on Communication via StreetPass

bool nn::cfg::IsRestrictP2pCec(void);

Be careful, as true is returned when the restriction is active.

11.2.9.6. Restrictions on Nintendo 3DS Shopping Services

You can use the nn::cfg::IsRestrictShopUse() function to determine whether Nintendo 3DS Shopping Services have been restricted by Parental Controls. When they have been restricted, applications cannot add to their balance, purchase content, or take other similar actions unless the restriction is temporarily revoked.

Code 11-22. Checking Restrictions on Nintendo 3DS Shopping Services

bool nn::cfg::IsRestrictShopUse(void);

Be careful, as true is returned when the restriction is active.

11.2.9.7. Miiverse Restrictions

Nintendo provides a function to check whether viewing or posting to Miiverse is restricted in Parental Controls. Call nn::cfg::IsRestrictMiiverseBrowse to check whether viewing Miiverse is restricted. Call nn::cfg::IsRestrictMiiversePost to check whether posting to Miiverse is restricted.

The relationship between the device's Parental Controls configuration values and the functions' return values is described below.

Table 11-5. Relationship Between Miiverse Restrictions Configuration and Return Values

Restriction Name in Device Configuration   IsRestrictMiiverseBrowse   IsRestrictMiiversePost

"Restrict Posting and Browsing"            true                       true
"Posting only"                             false                      true
"Do Not Restrict"                          false                      false

The application cannot get Miiverse posts from other users, or post to Miiverse unless these
restrictions are temporarily lifted.
11.2.9.8. Restriction of Video Content Acquired Through Communication

Call nn::cfg::IsRestrictWatchVideo to check whether Parental Controls is restricting viewing of video content acquired through communication. An application may not play back video acquired through communication when this restriction is active, unless the restriction is temporarily lifted.

Code 11-23. Check for Restriction of Video Content Acquired Through Communication

bool nn::cfg::IsRestrictWatchVideo(void);

Be careful, as true is returned when the restriction is active.

11.2.10. Checking for EULA Acceptance

Users must first accept the terms of the End-User Licensing Agreement (EULA) before the
application can use the Nintendo 3DS Network Service, which includes Nintendo Network and
StreetPass. Call the nn::cfg::IsAgreedEula() function to check whether the user has
accepted the EULA terms. If they have not, the application must not use these networking features.

Code 11-24. Checking for EULA Acceptance

bool nn::cfg::IsAgreedEula(void);

The function returns true if the user has accepted the terms of the EULA. The FS library must be
initialized before calling this function.

For more information about which features require EULA acceptance, see the Guidelines.

11.2.11. System-Specific ID

You can get the system-specific ID used to identify the system.

Code 11-25. Getting the System-Specific ID

bit64 nn::cfg::GetTransferableId(bit32 uniqueId);

For the uniqueId parameter, specify the unique 20-bit ID assigned to the application.

The system-specific ID is a 64-bit value that is guaranteed to be unique to a certain degree. This ID is also transferable in cases such as when a replacement system is purchased. However, the ID cannot be recovered if the system is lost, stolen, or broken. The system-specific ID changes when the user formats system memory, and the original ID cannot be recovered afterward.

Possible uses of this system ID include the following.


Ensuring that initial values of application parameters are different for each system.
Creating data accessible only from one specific system.
The identifier or its seed for use in local communication.

Note: When save data is created, save the system-specific ID within it; that saved ID can then be used to handle changes to the ID, such as those resulting from repairs. In addition, saving a 128-bit value that corresponds to the current time at the same time can prevent the same data from being saved to multiple cards.

If you are using the ID to create data accessible only from that system, note that this data will be
inaccessible if the system is formatted with format system memory, lost, or stolen.

11.2.12. Restrictions Based on COPPACS

The COPPA Compliance System (COPPACS) is a system provided by Nintendo to enable applications created for North America (if the system region is North America, and the country setting is U.S. or Canada) to comply with the Children's Online Privacy Protection Act (COPPA).

Note: For a list of applications subject to COPPACS, see the UGC section of the guidelines.
Plans are to provide details about COPPACS in the future in the System Application and
Applet Specifications.

You can determine whether restrictions based on COPPACS are in effect.

Code 11-26. Getting Restrictions Based on COPPACS

bool nn::cfg::IsCoppacsSupported();
nn::cfg::CfgCoppacsRestriction nn::cfg::GetCoppacsRestriction(void);

You can confirm the support method for COPPACS by calling the
nn::cfg::GetCoppacsRestriction() function.

When CFG_COPPACS_RESTRICTION_NONE is returned, no restrictions are in effect, so COPPACS support is unnecessary.

When CFG_COPPACS_RESTRICTION_NEED_PARENTAL_PIN_CODE or
CFG_COPPACS_RESTRICTION_NEED_PARENTAL_AUTHENTICATION is returned, restrictions are in
effect, so COPPACS support must be provided.

The former can be released by entering the Parental Controls PIN. Enter and check the PIN within the application; if the correct PIN is entered, the restriction can be temporarily suspended (see 11.2.9.1. Temporarily Suspending Parental Controls by Entering a PIN). The latter cannot be released from within the application; the COPPACS authentication procedure within System Settings must be used.
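
A sketch of branching on the restriction state (the nn::cfg scoping of the enumerators is assumed):

// Sketch: determine how COPPACS support must be provided.
nn::cfg::CfgCoppacsRestriction restriction = nn::cfg::GetCoppacsRestriction();
switch (restriction)
{
case nn::cfg::CFG_COPPACS_RESTRICTION_NONE:
    // No restrictions are in effect; COPPACS support is unnecessary.
    break;
case nn::cfg::CFG_COPPACS_RESTRICTION_NEED_PARENTAL_PIN_CODE:
    // Prompt for the Parental Controls PIN within the application (see 11.2.9.1).
    break;
case nn::cfg::CFG_COPPACS_RESTRICTION_NEED_PARENTAL_AUTHENTICATION:
    // Cannot be lifted within the application; direct the user to the COPPACS
    // authentication procedure in System Settings (see 5.3.7).
    break;
default:
    break;
}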

Note that you must temporarily exit the application when performing the authentication procedure
for this setting. You can confirm the setting by restarting the application when returning to the
application from System Settings. When continuing a process which was being performed before
jumping to System Settings, always reconfirm the support method for COPPACS. In some cases,
when reconfirming, you may be asked to perform the authentication procedure within System
Settings.

To simply confirm which countries are subject to COPPACS with the current System Settings, you
can use the nn::cfg::IsCoppacsSupported() function. Note that if the system setting is for
one of the countries on the list, the value true is returned even if COPPACS restrictions have not
been enabled.

Note: A description of how to jump to the COPPACS authentication procedure screen (PARENTAL_CONTROLS_COPPACS) is provided in 5.3.7. Jump to System Settings. For implementation examples, see the sample demo (coppacs in cfg).

11.3. Finalizing

Call the nn::cfg::Finalize() function when done using the CFG library.

Code 11-27. Finalizing the CFG Library

void nn::cfg::Finalize(void);

If this function is called before initialization, nothing will happen.

The number of times the library’s initialization function has been called is recorded. Until the finalize
function is called the same number of times, the library remains in use.


12. Applets

This chapter describes the libraries that are required in order to use the applets provided by the 3DS
system.

The library provided by each applet can be used to start the applet from an application and use its features. The system generally allocates the memory required to start applets, but some applets require that the application pass in working memory. In addition, because applets run on the same CPU as the application and the calling thread is stopped while they run, the application effectively stops while an applet is running.

12.1. Library Applets

Features often used by applications are provided with 3DS as the following library applets.

Software Keyboard Applet
Photo Selection Applet
Mii Selection Applet
Sound Selection Applet
Error/EULA Applet
Circle Pad Pro Calibration Applet
EC Applet
Login Applet

Note: The Mii Selection applet is provided in the CTR Face Library package.

12.1.1. Information Common to All Library Applets

The processing required to start and return from a library applet is basically the same as that used to start and return from the HOME Menu. Likewise, while a library applet is running, there are similar restrictions on using devices such as getting key input and rendering, and only the thread that called the library applet is stopped. Try not to execute unnecessary threads while library applets are running. When creating threads that will continue to operate even while a library applet is running, note that the library applet creates threads with the priority settings shown in the following table. Set your thread priorities so that they do not interfere with these applet threads.

Table 12-1. Priorities of Threads Created by Library Applets

Library Applet                      Priority             Comments

(Common to all)                     15                   Used for notifications such as sleep notifications.

Software Keyboard Applet            17 to 20

Photo Selection Applet              16, 18, 20, 21, 25

Mii Selection Applet                17

Sound Selection Applet              16, 18

Error/EULA Applet                   17, 20

Circle Pad Pro Calibration Applet   17 to 19

EC Applet                           17, 22               Used for communication and related functions.

Login Applet                        17, 22               Used for communication and related functions.

Just as an application must handle close processing when nn::applet::WaitForStarting() returns true while it is being called after the HOME Menu starts, the application must also support close processing when nn::applet::IsExpectedToCloseApplication() returns true after control returns to the application from a library applet. However, when the POWER Button is pressed while the library applet is being displayed, the application transitions to a state where it has no rendering rights and, at the same time, nn::applet::IsExpectedToProcessPowerButton returns true. If the application responds to the close request in this case, the screen refresh halts during close processing. Handle the POWER Button first in your implementation.

12.1.1.1. Recovering From Library Applets

When the HOME Button, the POWER Button, or the software reset button combination (L + R +
START) is pressed while a library applet is starting, the application immediately recovers and the
following behavior results.

Table 12-2. Behavior When Recovering From Library Applets

Button            Behavior
HOME Button       Returns a value as the return code indicating that the HOME Button was pressed. After recovery, nn::applet::IsExpectedToProcessHomeButton() returns true.
POWER Button      Returns a value as the return code indicating that the POWER Button was pressed. After recovery, nn::applet::IsExpectedToProcessPowerButton() returns true.
Software Reset    Returns a value as the return code indicating that there was a software reset.

12.1.1.2. Preloading

Library applets that support preloading can perform processes such as loading in advance. As a
result, you can shorten the time that the screen freezes between the function call that starts the
library applet and its display on the screen.

The names of the functions for preloading differ for each library applet, but they follow the same
basic standards. Functions that begin the preloading process start with Preload. Functions that
wait for preloading to complete start with WaitForPreload. And functions that cancel the
preloading process start with CancelPreload.

When starting a preloaded library applet, you must wait for preloading to complete before
starting it. Note also that if you call a function that waits for completion without having first called
a function that begins preloading, control never returns. However, this caution only applies when
both function calls take place on the same thread. If the thread that waits for completion is
different from the thread that begins preloading, control returns even if the order of the calls is
reversed. Displaying the HOME Menu using nn::applet::ProcessHomeButton() cancels
preloading. If you navigate to the HOME Menu between calls to Preload and WaitForPreload,
control does not return from WaitForPreload.

You cannot preload multiple library applets at the same time. To start a library applet other than
the one you have preloaded, you must first cancel the preload. Note that the preload is released
after a library applet is started, so there is no need to cancel the preload after recovering from a
library applet.

12.1.2. Software Keyboard Applet

This library applet is called when an application requires text input from the user, such as for a user
name or password. It displays a software keyboard on the lower screen. It includes several features
such as entering passwords, restricting the number and type of characters that can be entered, and
filtering prohibited words. On the upper screen, you can either darken or continue displaying the
application screen that was being displayed when the software keyboard was started.

To use this library applet, you must include the nn/swkbd.h header file and add the libnn_swkbd
library file.

Parameters passed when starting the library applet are defined by the nn::swkbd::Parameter
data structure. Detailed operational settings are made with the config (nn::swkbd::Config)
member of the Parameter structure. This member must be initialized before it is configured.

Code 12-1. Initializing Software Keyboard Operational Settings

void nn::swkbd::InitializeConfig(nn::swkbd::Config* pConfig);

The Config structure passed in pConfig is initialized to the default settings. For operational
settings that can be made in members of the Config structure, see the Applet Specifications.

The application must allocate work memory for this library applet. The size of the work memory
required differs depending on the operational settings, and can be obtained using the
nn::swkbd::GetSharedMemorySize() function.

Code 12-2. Getting the Work Memory Size for the Software Keyboard

s32 nn::swkbd::GetSharedMemorySize(
    const nn::swkbd::Config* pConfig,
    const void* pInitialStatusData = NULL,
    const void* pInitialLearningData = NULL);

Pass a pointer to the Config structure used to make operational settings in pConfig.

If the operational settings of the software keyboard being used the last time it was run have been
saved, you can restore that previous state by passing the start address for that data in
pInitialStatusData. If this data does not need to be restored or it was not saved, specify
NULL.

If training data for predictive text input has been saved with the operational settings of the software
keyboard that were used the last time it was run, the training state last in effect can be restored by
passing the start address for that data in pInitialLearningData. If this data does not need to
be restored or it was not saved, specify NULL.

The start address of work memory must be allocated with nn::swkbd::MEMORY_ALIGNMENT
(4096-byte) alignment and its size must be a multiple of nn::swkbd::MEMORY_UNITSIZE (4096
bytes). Memory allocated from device memory must not be specified for the work memory.

After parameters have been set and work memory allocated, you can start the software keyboard
using the nn::swkbd::StartKeyboardApplet() function.

Code 12-3. Starting the Software Keyboard Applet

bool nn::swkbd::StartKeyboardApplet(
    nn::applet::AppletWakeupState* pWakeupState,
    nn::swkbd::Parameter* pParameter,
    void* pSharedMemoryAddr,
    size_t sharedMemorySize,
    const wchar_t* pInitialInputText = NULL,
    const nn::swkbd::UserWord* pUserWordArray = NULL,
    const void* pInitialStatusData = NULL,
    const void* pInitialLearningData = NULL,
    nn::applet::AppTextCheckCallback callback = NULL);
In pParameter, specify a pointer to the Parameter structure, which has the Config structure
used to configure operational settings as one of its members. Information such as the input text
string is stored in the structure specified by this argument.

Specify the start address and size of work memory in pSharedMemoryAddr and
sharedMemorySize.

If a UTF-16LE text string that is not NULL is passed in pInitialInputText, the software
keyboard is run with the specified text string set in the input field.

In pUserWordArray, specify an array of words to register in the user dictionary.

In pInitialStatusData and pInitialLearningData, specify the same values that were passed
when obtaining the work memory size.

In callback, specify the callback function to use when the application is checking an input string.
When the application does not perform a check due to its operation settings, this argument is
ignored.

Startup fails if the return value of this function is false. Startup succeeds if the return value is
true.
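
The following is a minimal sketch of this flow, assuming default operational settings. The work
memory size (8 memory units), the name of the Config member (config), and the alignment macro
follow the descriptions and code samples in this document and should be treated as assumptions;
error handling is omitted.

#include <nn/swkbd.h>

// Work memory for the software keyboard; must not be device memory.
// The number of units (8) is illustrative.
static u8 s_SwkbdWorkMemory[nn::swkbd::MEMORY_UNITSIZE * 8]
    NN_ATTRIBUTE_ALIGN(4096);

void RunSoftwareKeyboard()
{
    nn::swkbd::Parameter param;
    nn::swkbd::InitializeConfig(&param.config); // initialize before configuring
    // ... adjust param.config (input length limits, password mode, etc.) ...

    // The required work memory size depends on the operational settings.
    s32 memSize = nn::swkbd::GetSharedMemorySize(&param.config, NULL, NULL);
    NN_ASSERT(memSize <= static_cast<s32>(sizeof(s_SwkbdWorkMemory)));

    nn::applet::AppletWakeupState wakeupState;
    if (nn::swkbd::StartKeyboardApplet(&wakeupState, &param,
                                       s_SwkbdWorkMemory, memSize))
    {
        // Startup succeeded; the entered string is stored in param
        // after control returns to the application.
    }
}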

The software keyboard thread operates at a priority between 17 and 20. So even while the software
keyboard is running, attention must be paid to the priority settings of threads that continue to
operate.

12.1.3. Photo Selection Applet

This library applet is used to select the data for one photo from among those registered in the
Nintendo 3DS camera album for use by the application. You can specify extraction conditions such
as the capture time and type when starting the applet. You can also filter the data to be listed. Note
that the applet can only select photos that are saved on an SD card.

To use this library applet, you must include the nn/phtsel.h header file and add the
libnn_phtsel library file.

Parameters to be passed when this library applet is started are defined in the
nn::phtsel::Parameter structure. Basic operational settings are made in the m_config
member (nn::phtsel::Config structure). Detailed settings such as extraction conditions are
made in the m_input member (nn::phtsel::PhtselInput structure). The execution result is
stored in the m_output member (nn::phtsel::PhtselOutput structure). For the available
settings and other information, see the sample demo.

The application must allocate work memory when the applet background displays images captured
by the application. The size required by work memory can be obtained using the
nn::phtsel::GetWorkBufferSize() function.

Code 12-4. Getting the Work Memory Size for Photo Selections

size_t nn::phtsel::GetWorkBufferSize();

The start address of work memory must be allocated with 4096-byte alignment and its size must be
a multiple of 4096 bytes. Memory allocated from device memory must not be specified for the
work memory.

After the parameters have been set and the work memory allocated, you can start the photo
selection library applet using the nn::phtsel::StartPhtsel() function. Call the
nn::phtsel::StartPhtselNoCapture() function if the applet background does not display
images captured by the application.

Code 12-5. Starting the Photo Selection Library Applet

nn::applet::AppletWakeupState nn::phtsel::StartPhtsel(
nn::phtsel::Parameter* pParameter, void* pWorkBuffer);
nn::applet::AppletWakeupState nn::phtsel::StartPhtselNoCapture(
nn::phtsel::Parameter* pParameter);

In pParameter, specify the pointer to the Parameter structure for which settings were performed.
Specify the start address of work memory in pWorkBuffer.

Execution results are stored in the m_output member of pParameter upon return to the
application.

12.1.4. Mii Selection Applet

This library applet is called when using Mii characters in an application to select from among
registered Mii characters. You can select only one registered Mii or one guest Mii. (Six are
available by default.) You can also select whether to display a list of guest Mii characters and
whether to display the operating screen in the upper or lower screen.

Note: To use the Mii Selection applet from inside your application, you must use the CTR Face
Library (middleware).

For instructions on using this library, see the CTR Face Library documentation.

12.1.5. Sound Selection Applet

This library applet is called when selecting sound data recorded using Nintendo 3DS Sound for a
sound to be used, such as an application sound effect. Only one instance of sound data can be
selected. This applet cannot be used to record sounds, sort data, delete data, or otherwise
manipulate data. Note that the applet can only select sound data that is saved on an SD card.

To use this library applet, you must include the header file (nn/voicesel.h) and add the library
file (libnn_voicesel).

The nn::voicesel::Parameter structure defines the parameters to pass into the library applet
when starting it. Use the parameter's config member (nn::voicesel::Config structure) to
configure basic operations, and the input member (nn::voicesel::Input structure) to configure
advanced features, such as extraction conditions. The results are stored in the output member
(nn::voicesel::Output structure). For the available settings and other information, see the
sample demo.

After you are finished configuring the parameters, you can start the sound selection library applet
by calling nn::voicesel::StartVoiceSel.

Code 12-6. Starting the Sound Selection Applet

nn::applet::AppletWakeupState nn::voicesel::StartVoiceSel(
nn::voicesel::Parameter* pParameter);
In pParameter, specify the pointer to the Parameter structure for which settings were performed.

When control returns to the application, the results of execution are stored in pParameter's
output member.

12.1.6. Error/EULA Applet

When using wireless communication features such as StreetPass, you must confirm that the user
has agreed to the EULA (the licensing agreement related to Nintendo 3DS network services). If an
application that uses these communication features has not confirmed that the user has agreed to
the latest EULA, it can use this library applet to display the EULA and obtain the user's consent.
Acceptance of the EULA is required in order to use communication features, but applications are
not required to display the EULA themselves. If the application does not display the EULA, display
an error at the time of communication and guide the user to System Settings.

Not only does this library applet display the EULA, it can also display error messages. As long as
infrastructure communication-related libraries (such as AC and FRIENDS) are being used, the
correct corresponding error message is displayed when an error code is passed to the applet. You
can also display proprietary error messages from the application.

To use this library applet, you must include the nn/erreula.h header file and add the
libnn_erreula library file.

Parameters to be passed when this library applet is started are defined in the
nn::erreula::Parameter structure. Error codes, error messages, and operational settings are
specified in the config member (nn::erreula::Config structure). Be sure to configure these
settings after initializing the member with the nn::erreula::InitializeConfig() function. For more information
about parameter settings, see the Applet Specifications and sample demos.

After you set the parameters and allocate the work memory, you can start the Error/EULA applet
using the nn::erreula::StartErrEulaApplet() function.

Code 12-7. Starting the Error/EULA Applet

void nn::erreula::StartErrEulaApplet(
nn::applet::AppletWakeupState* pWakeupState,
nn::erreula::Parameter* pParameter);

In pParameter, specify the pointer to the Parameter structure for which settings were performed.

If the settings specify that the EULA is to be displayed, the agreement sequence for the Network
Services Agreement is shown, and the result (agreement or disagreement) is returned to the
application.

If the user has not agreed to the version of the Network Services Agreement that the application
requires, all of the application's infrastructure communication features and StreetPass
communication return the EULA disagreement error. NEX login, registration of download tasks,
creation of a StreetPass box, and similar operations are all subject to the agreement. The version
of the Network Services Agreement that requires agreement is embedded automatically in the
ROM. Be sure to confirm whether agreement to the Network Services Agreement is still required
by using the nn::cfg::IsAgreedEula() function, and call this applet if the agreement has not
been accepted. "EULA agreement" or "EULA disagreement" is returned correctly only when the
applet is called while the user has not yet agreed to the Network Services Agreement.
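
As a rough sketch of this flow, the following checks whether agreement is still required and starts
the applet only if it is not. The Config member name (config) and the form of the
InitializeConfig() call mirror the software keyboard applet and are assumptions; it also
assumes the CFG library has already been initialized.

#include <nn/cfg.h>
#include <nn/erreula.h>

void ShowEulaIfRequired()
{
    // Assumes nn::cfg::Initialize() has already been called (see Chapter 11).
    if (nn::cfg::IsAgreedEula())
    {
        return; // The required version of the agreement is already accepted.
    }

    nn::erreula::Parameter param;
    nn::erreula::InitializeConfig(&param.config); // initialize before configuring
    // ... configure param.config for the EULA display here ...

    nn::applet::AppletWakeupState wakeupState;
    nn::erreula::StartErrEulaApplet(&wakeupState, &param);

    // The agreement or disagreement result is returned in param
    // after control returns to the application.
}
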
12.1.7. Circle Pad Pro Calibration Applet

This library applet is used for calibrating the feel of controls on the Right Circle Pad installed in the
Circle Pad Pro. Applications supporting the Circle Pad Pro must provide a start scene for this
applet so that users can calibrate the feel of controls on the Right Circle Pad.

Note: The C Stick on SNAKE functions like the Right Circle Pad on a permanently connected
Circle Pad Pro. If an application running on SNAKE calls the Circle Pad Pro Calibration
Applet, a message describing how to calibrate the C Stick is displayed.

To use this library applet, you must include the header file (nn/extrapad.h). In addition, if you
are using CTR-SDK 11.3.x or earlier, you must also add the library file (libnn_extrapad).

The parameter passed when the library applet is started is defined by the
nn::extrapad::Parameter structure. Operational settings, such as whether to support the
HOME Button and software reset, are made in the config member (nn::extrapad::Config
structure) of the parameter. Always initialize this member with the
nn::extrapad::InitializeConfig() function before configuring these settings.

When parameter setting procedures are complete, you can start the library applet with the
nn::extrapad::StartExtraPadApplet() function.

Code 12-8. Starting the Circle Pad Pro Calibration Applet

void nn::extrapad::StartExtraPadApplet(
nn::applet::AppletWakeupState* pWakeupState,
nn::extrapad::Parameter* pParameter);

In pParameter, specify the pointer to the Parameter structure for which settings were performed.

12.1.8. EC Applet

The EC applet is a library applet for purchasing and managing downloadable content and service
items.

Note: For more information about EC features, see the CTR-SDK API Reference.

12.1.9. Login Applet

This library applet communicates with the account server, working on behalf of the application to
authenticate the account and get various service tokens and the like.

Note: For more information about the login applet, see the 3DS Programming Manual: Wireless
Communication.
12.2. System Applets

A system applet is generally an applet that is started from the HOME Menu, but libraries are provided
to start and use the applets from an application. The processing required to start and return from a
system applet from an application is basically the same as that used to start and return from the
HOME Menu.

12.2.1. Internet Browser Applet

The WEBBRS library allows you to start the built-in Internet browser by specifying a URL from an
application. The startup method is similar to that of a library applet, but it differs in that closing the
Internet browser that was started results in returning to the HOME Menu with the application
suspended.

Warning: Depending on which system updates have been applied, there may be systems that
do not have the Internet browser installed. For this reason, you must check whether the
Internet browser is installed on the system before starting it.

Note: For more information about the WEBBRS library, see the CTR-SDK API Reference.

12.2.2. Miiverse Application and Post App

The OLV library enables the use of Miiverse features in an application. This makes it possible to
start the Miiverse application or Post app and receive post data from within an application. These
are referred to as "applications" but are actually included with the system applets.

Note: For more information about the OLV library, see the CTR-SDK API Reference.

12.2.2.1. Starting an Application From the Miiverse Post Page

Applications can be started from a post in Miiverse.

The application must be configured as follows to enable this feature.

Specify EnableMiiverseJumpArgs:True in the BSF file.


Specify the FLAG_APP_STARTABLE flag with the nn::olv::UploadPostDataByPostApp()
function when posting.

Operation in the Miiverse Application

When you view a post with this flag set in the Miiverse application, a Launch button appears in
the post. Any user who owns the application can use this button to start it. Users who do not own
the application can view the post but cannot start the application.

Figure 12-1. Starting an Application From the Post Page


13. Supplemental Libraries

This chapter describes supplemental libraries provided for using internal system resources, such as
power and the pedometer, and shared resources such as internal fonts.

13.1. PTM Library

The PTM library is provided for getting power-related information and using RTC-based alarms.

13.1.1. Initializing and Finalizing

The PTM is initialized and finalized using the nn::ptm::Initialize() function and the
nn::ptm::Finalize() function.

Code 13-1. Initializing and Finalizing the PTM Library

nn::Result nn::ptm::Initialize();
nn::Result nn::ptm::Finalize();

Neither function returns an error as long as the hardware is not damaged.


13.1.2. Getting Power-Related Information

The following power-related information can be obtained: connection status of the power adapter,
charge status of the battery, and the battery level.

Code 13-2. Getting Power-Related Information

nn::ptm::AdapterState nn::ptm::GetAdapterState();
nn::ptm::BatteryChargeState nn::ptm::GetBatteryChargeState();
nn::ptm::BatteryLevel nn::ptm::GetBatteryLevel();

You can get the connection status of the power adapter by calling the
nn::ptm::GetAdapterState function. The following table shows the possible return values.

Table 13-1. nn::ptm::AdapterState Enumerator

Values Description
ADAPTERSTATE_NOCONNECTED Power adapter not connected

ADAPTERSTATE_CONNECTED Power adapter connected

You can get the battery charge status by calling the nn::ptm::GetBatteryChargeState()
function. The following table shows the possible return values.

Table 13-2. nn::ptm::BatteryChargeState Enumerator

Values Description

BATTERYCHARGESTATE_NOCHARGING Not charging


BATTERYCHARGESTATE_CHARGING Charging

You can get the battery level by calling the nn::ptm::GetBatteryLevel() function. The
following table shows the possible return values.

Table 13-3. nn::ptm::BatteryLevel Enumerator

Values Description
BATTERYLEVEL_0 (BATTERYLEVEL_MIN) Battery level is 0%

BATTERYLEVEL_1 Battery level is 1% to 5%


BATTERYLEVEL_2 Battery level is 6% to 10%
BATTERYLEVEL_3 Battery level is 11% to 30%

BATTERYLEVEL_4 Battery level is 31% to 60%


BATTERYLEVEL_5 (BATTERYLEVEL_MAX) Battery level is 61% to 100%
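
The following sketch combines the functions above to read the current power status. The header
name (nn/ptm.h), the enumerator scoping within the nn::ptm namespace, and the use of NN_LOG for
output are assumptions.

#include <nn/ptm.h>

void CheckPowerStatus()
{
    nn::ptm::Initialize();

    nn::ptm::AdapterState       adapter = nn::ptm::GetAdapterState();
    nn::ptm::BatteryChargeState charge  = nn::ptm::GetBatteryChargeState();
    nn::ptm::BatteryLevel       level   = nn::ptm::GetBatteryLevel();

    if (adapter == nn::ptm::ADAPTERSTATE_CONNECTED &&
        charge  == nn::ptm::BATTERYCHARGESTATE_CHARGING)
    {
        // The battery is charging; a charging icon could be displayed here.
    }
    NN_LOG("battery level: %d\n", static_cast<int>(level));

    nn::ptm::Finalize();
}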

13.1.3. Alarms Using the RTC

For information about this feature, see 8.5.2. RTC Alarm Feature.
13.2. PL Library

The PL library is provided so that applications can use features and resources built into the 3DS
system such as the pedometer and internal (shared) fonts. Although there are no functions for
initializing or finalizing the PL library itself, other libraries may need to be initialized in order to use
these features and resources.

13.2.1. Pedometer

Pedometer information is recorded in terms of the number of steps for each hour for up to 120
months (approximately 10 years). Because pedometer information is stored in the power-related
module, you must initialize the PTM library ahead of time in order to access pedometer information.
For information about initializing the PTM library, see 13.1.1. Initializing and Finalizing.

Use the following functions to get pedometer-related information.

Code 13-3. Functions for Getting Pedometer Information

bool nn::pl::GetPedometerState();
u32 nn::pl::GetTotalStepCount();
s8 nn::pl::GetStepHistoryEntry(nn::pl::PedometerEntry* pEntry);
void nn::pl::GetStepHistory(u16 pStepCounts[], s32 numHours,
nn::fnd::DateTime start);
nn::Result nn::pl::GetStepHistoryAll(nn::pl::PedometerHistoryHeader& header,
nn::pl::PedometerHistoryData& data);

The nn::pl::GetPedometerState() function returns whether the accelerometer is functioning
as a pedometer. The accelerometer functions as a pedometer when the power is on and the system
is closed. Because the application has usually transitioned into Sleep Mode in that state,
applications do not normally need to track whether the accelerometer is functioning as a
pedometer.

The nn::pl::GetTotalStepCount() function returns the number of steps accumulated so far.

The nn::pl::GetStepHistoryEntry() function stores the year and month of recorded step
count entries in the nn::pl::PedometerEntry array specified by pEntry and returns the
number of entries stored as its return value. Allocate enough memory for the array specified by
pEntry so that the maximum number of entries (NUM_MONTHHISTORIES) can be stored.

The nn::pl::GetStepHistory() function stores step count information for each hour, for the
number of hours specified by numHours (counting forward only), starting from and including the
hour specified by start, in the array specified by pStepCounts. Zero is stored for hours for which
there is no step count information.

The nn::pl::GetStepHistoryAll() function is used to get all step count information. The step count
information obtained is split into header information and data. The order of data corresponds to the
order of header information, but header information is not necessarily arranged in order of year and
month. The year and month associated with step count information can be confirmed using
monthInfo included in header information. If INVALID_COUNTER is stored in unusedCounter for
an entry, it indicates that the data is invalid and no step count data has been recorded for that
entry.

Note: CTR-SDK includes the PedometerChanger development tool, which lets you view and
manipulate the pedometer information.
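
The following sketch reads the total step count and the hourly counts for 24 hours starting at a
given time. It assumes that the PTM library has already been initialized; the header names and the
NN_LOG output are also assumptions.

#include <nn/pl.h>
#include <nn/fnd.h>

void ReadStepCounts(const nn::fnd::DateTime& start)
{
    // Assumes nn::ptm::Initialize() has already been called (see 13.1.1).
    u32 total = nn::pl::GetTotalStepCount();
    NN_LOG("total steps: %u\n", total);

    // Hourly counts for 24 hours starting at "start" (0 is stored for hours
    // with no step count information).
    u16 hourly[24];
    nn::pl::GetStepHistory(hourly, 24, start);
    for (int i = 0; i < 24; ++i)
    {
        NN_LOG("hour %d: %d steps\n", i, static_cast<int>(hourly[i]));
    }
}
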
13.2.2. Internal Fonts

Internal fonts are provided as shared resources that can be accessed by applications. The type of
internal fonts loaded as standard differs depending on the region set for the system.

The types of internal fonts that can be used are defined as shown below by the
nn::pl::SharedFontType enumerator type.

Table 13-4. nn::pl::SharedFontType Enumerator

Values                  Description
SHARED_FONT_TYPE_STD    European fonts. Standard for the Japanese, European, and United States regions.
SHARED_FONT_TYPE_CN     Chinese fonts. Standard for the Chinese region.
SHARED_FONT_TYPE_KR     Korean fonts. Standard for the Korean region.
SHARED_FONT_TYPE_TW     Taiwanese fonts. Standard for the Taiwanese region.

13.2.2.1. Loading Internal Fonts

Internal fonts are loaded using the nn::pl::InitializeSharedFont() function. Although
fonts for all regions are built into every system, the InitializeSharedFont() function only
loads the fonts associated with the system region.

Code 13-4. Loading Internal Fonts

nn::Result nn::pl::InitializeSharedFont();
nn::pl::SharedFontLoadState nn::pl::GetSharedFontLoadState();

Calling the nn::pl::InitializeSharedFont() function starts loading internal fonts.


However, loading of internal fonts may not yet be complete when the function finishes executing.
The return value (nn::pl::SharedFontLoadState) of the
nn::pl::GetSharedFontLoadState() function can be used to verify that fonts have finished
loading.

Table 13-5. nn::pl::SharedFontLoadState Enumerator

Values Description
SHARED_FONT_LOAD_STATE_NULL Load not started.

SHARED_FONT_LOAD_STATE_LOADING Loading.
SHARED_FONT_LOAD_STATE_LOADED Load complete.

SHARED_FONT_LOAD_STATE_FAILED Load failed.
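
The following sketch starts loading the internal font for the system region and waits for the load to
complete before getting a pointer to the font data. The polling loop is only illustrative (a real
application would typically check the state once per frame), and the header name and enumerator
scoping are assumptions.

#include <nn/pl.h>

void* LoadSystemSharedFont()
{
    nn::Result result = nn::pl::InitializeSharedFont(); // starts loading
    if (!result.IsSuccess())
    {
        return NULL;
    }

    // Loading is asynchronous; poll until it finishes.
    nn::pl::SharedFontLoadState state;
    do
    {
        state = nn::pl::GetSharedFontLoadState();
    } while (state == nn::pl::SHARED_FONT_LOAD_STATE_NULL ||
             state == nn::pl::SHARED_FONT_LOAD_STATE_LOADING);

    if (state != nn::pl::SHARED_FONT_LOAD_STATE_LOADED)
    {
        return NULL; // SHARED_FONT_LOAD_STATE_FAILED
    }

    // The returned data can be registered with nn::font::ResFont for rendering.
    return nn::pl::GetSharedFontAddress();
}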

You can get the start address, size, and font type of loaded font data using the
nn::pl::GetSharedFontAddress(), nn::pl::GetSharedFontSize(), and
nn::pl::GetSharedFontType() functions, respectively.

Code 13-5. Getting Loaded Font Data Information


void* nn::pl::GetSharedFontAddress();
size_t nn::pl::GetSharedFontSize();
nn::pl::SharedFontType nn::pl::GetSharedFontType();

If you want to display characters that are only included in the internal font of a region other than
the system region (for example, to display messages sent from a system in another region), you
must mount the internal font archive using the nn::pl::MountSharedFont() function and have
the application load the corresponding internal font.

Code 13-6. Loading Internal Fonts (Other Regions)

nn::Result nn::pl::MountSharedFont(const char* archiveName,
    nn::pl::SharedFontType sharedFontType, size_t maxFile,
    size_t maxDirectory, void* workingMemory, size_t workingMemorySize);
nn::Result nn::pl::UnmountSharedFont(const char* archiveName);
nn::Result nn::pl::GetSharedFontRequiredMemorySize(
    s32* pOut, nn::pl::SharedFontType sharedFontType, size_t maxFile,
    size_t maxDirectory);

archiveName specifies the archive name of the font archive to be mounted.

sharedFontType specifies the type of internal font to be mounted. The files loaded differ
depending on the type of internal font specified here. If the specified font does not exist,
nn::pl::ResultSharedFontNotFound is returned as the return value.

Table 13-6. Internal Font Types and Associated Filenames (When "font" Is Specified in the Archive Name)

Internal Font Type Filename

SHARED_FONT_TYPE_STD font:/cbf_std.[Link]
SHARED_FONT_TYPE_CN font:/cbf_zh-[Link]
SHARED_FONT_TYPE_KR font:/cbf_ko-[Link]

SHARED_FONT_TYPE_TW font:/cbf_zh-[Link]

Font files are compressed in LZ77 format and should be decompressed using the CX library
before using them as fonts.

Specify the number of files and directories that can be opened simultaneously in maxFile and
maxDirectory, respectively. Pass the working memory and its size to workingMemory and
workingMemorySize. Use the nn::pl::GetSharedFontRequiredMemorySize() function to
get the size of the working memory required.

Unmount the archive using the nn::pl::UnmountSharedFont() function after font files have
finished loading.

13.2.2.2. Using Font Data

The same format as BCFNT files converted using ctr_FontConverter is used for internal font
data. The nn::font::ResFont class of the FONT library must be used to display internal fonts.

For information about how to display fonts using the FONT library, see the API Reference and
sample demos.
13.2.3. Play Coins

Play Coins accumulate as the user walks around with their system. You are free to make use of
these accumulated Play Coins. For example, users might exchange Play Coins for bonus items in
your application.

To use Play Coins, you must include the header file nn/pl/CTR/pl_GameCoin.h and add the
library file libnn_plCoin.

Note: CTR-SDK includes a development tool (PlayCoinSetter) for setting the number of Play
Coins a system has.

13.2.3.1. Initialization and Finalization

The initialization and finalization of the library are performed by calling the
nn::pl::InitializeGameCoin() and nn::pl::FinalizeGameCoin() functions.

Code 13-7. Initializing and Finalizing the Play Coin Library

void nn::pl::InitializeGameCoin();
void nn::pl::FinalizeGameCoin();

Before initializing the Play Coin library, you must initialize the FS and PTM libraries. Because
there is no support for nested initialization calls, even if you call the initialization function multiple
times, calling the nn::pl::FinalizeGameCoin() function once will finalize use of the library.

13.2.3.2. Getting the Number of Play Coins Held

You can get the number of Play Coins held by the system with the
nn::pl::GetGameCoinCount() function.

Code 13-8. Getting the Number of Play Coins Held

nn::Result nn::pl::GetGameCoinCount(u16* pCount);

The currently held number of Play Coins is stored in pCount.

This function has a high load and entails a write to system NAND memory. Other than when
starting the application, call it only when there is a possibility that the number of Play Coins might
have increased or decreased, such as when recovering from Sleep Mode, the HOME Menu, or a
library applet.

When this function returns nn::pl::ResultGameCoinDataReset, it indicates that the Play
Coins have been reset due to corrupt data or some other reason, but it is handled as a success
(meaning that the IsSuccess() function returns true). When informing the user that the Play
Coins have been reset, check whether this value has been returned.
13.2.3.3. Spending Play Coins

Play Coins are spent with the nn::pl::UseGameCoin() function.

Code 13-9. Spending Play Coins

nn::Result nn::pl::UseGameCoin(u16* pCount, u16 useCount);

When the process is successful, the number of Play Coins after spending is stored in pCount.
When the process fails, an undefined value is stored there.

Specify the number of Play Coins to spend in useCount. When the number specified is more
than the number held, no Play Coins are spent, and nn::pl::ResultLackOfGameCoin is
returned to indicate that there were not enough Play Coins.

This function’s load is high and it entails a write to system NAND memory.
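
The following sketch combines these functions to spend Play Coins on a bonus item. It assumes
the FS and PTM libraries have already been initialized; result-value checks beyond IsSuccess()
are left as comments.

#include <nn/pl/CTR/pl_GameCoin.h>

bool ExchangeCoinsForBonus(u16 price)
{
    // Assumes the FS and PTM libraries are already initialized.
    nn::pl::InitializeGameCoin();

    u16 count = 0;
    nn::Result result = nn::pl::GetGameCoinCount(&count);
    // nn::pl::ResultGameCoinDataReset is still a success; check for it here
    // if the user should be told that the Play Coins were reset.

    bool purchased = false;
    if (result.IsSuccess() && count >= price)
    {
        // On success, count is updated to the number of coins remaining.
        // nn::pl::ResultLackOfGameCoin is returned if there are not enough coins.
        result = nn::pl::UseGameCoin(&count, price);
        purchased = result.IsSuccess();
    }

    nn::pl::FinalizeGameCoin();
    return purchased;
}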


14. Infrared Communication Between Systems


This chapter describes communication between systems using the IR library, which allows applications
to use the CTR infrared communications module.

Note: CTR-SDK includes a development tool (IrCommunicationChecker) for checking the
information communicated between systems by infrared communication.

Warning: If you are developing an extended SNAKE application that uses infrared for
communication between systems, make sure that you test communication between CTR and
SNAKE to verify that the processing speed differences do not cause any communication
problems.

Warning: If you want to use infrared communication from an application while another feature is
using infrared communication, you must end the other feature first.

Infrared communication is used by the following features.

Circle Pad Pro


NFP (CTR only)

14.1. nn::ir::Communicator Class


Functions for communication between systems using the infrared communications module are all
defined as static functions of the nn::ir::Communicator class.

Warning: The nn::ir::Communicator class only supports 1:1 communication between
systems. It cannot be used for communicating with devices other than 3DS systems, or for
communicating among three or more CTR systems.

14.1.1. Security

The library handles encryption and guarantees the integrity of transmitted data. Each sent packet
also includes a sequence number that is incremented with each send to prevent replay attacks.

14.1.2. Packets

Packets transmitted by the IR library include an IR library header, which is used for communication
control, and a system communication header, which includes security and other information.

Figure 14-1. IR Library Packet

Use the following functions to get the size of the data after header information has been appended
to the user data.

Code 14-1. Functions to Get User Data Size From Data With Header Information Appended

size_t nn::ir::Communicator::GetPacketSize(size_t dataSize);


size_t nn::ir::Communicator::CalculateBufferSizeToCommunicate(size_t dataSize);

The user data size is specified in dataSize.

The GetPacketSize() function calculates the size required to save the packet, and the
CalculateBufferSizeToCommunicate() function calculates the space required for send and
receive buffers.

14.1.3. Support for Sleep Mode


When transitioning to Sleep Mode, connections are interrupted and any processing is forcibly
stopped. Unsent packets and unreceived data are all discarded.

When a connection is interrupted due to Sleep Mode, no request to disconnect is sent to the
partner. If you need to send a disconnect notification to the partner, do so before transitioning to
Sleep Mode, and only transition to Sleep Mode after the disconnection is confirmed.

14.1.4. Communication Range

The following table shows the infrared communication range for two 3DS systems placed with the
infrared receivers (on the back of the units) directly facing each other.

Table 14-1. Communication Range for Infrared Communication

Item Range
Distance 0 to 20 cm from the infrared receiver when the lower screen (Touch Screen) is horizontal

14.2. Initialization

The IR library is initialized by calling the Initialize() function of the nn::ir::Communicator
class.

Code 14-2. IR Library Initialization Function

nn::Result nn::ir::Communicator::Initialize(
void* pBuf, size_t bufSize,
size_t receiveBufferDataSize, size_t receiveBufferManagementSize,
size_t sendBufferDataSize, size_t sendBufferManagementSize);

Specify the buffer used by the library to manage transmitted packets and its size in pBuf and
bufSize. The buffer passed to the library must be allocated by the application beforehand. Its
starting address must be nn::ir::Communicator::BUFFER_ALIGNMENT (4096-byte) aligned, and
its size must be a multiple of nn::ir::Communicator::BUFFER_UNITSIZE (4096 bytes). Buffers
allocated from device memory cannot be used.

The buffer passed to the library is divided into several regions, and the sizes of each region are
specified in receiveBufferDataSize, receiveBufferManagementSize, sendBufferDataSize,
and sendBufferManagementSize. For more information about specifying sizes, see
14.2.1. Buffer Passed to the Library.

The communication speed is fixed at 115200 bits per second. In 3DS infrared communication, a start
bit and an end bit are added to each byte sent. Consequently, the transmission rate for raw data is
the baud rate value multiplied by 0.8, which is 92160 bits per second (11520 bytes/s).

14.2.1. Buffer Passed to the Library


IR library transmission processes are implemented with asynchronous functions. The buffer passed
to the library during initialization is used mainly for temporary storage of sent packets.

The buffer passed to the library is divided into the regions shown in the following figure.

Figure 14-2. Buffer Used by the IR Library

The size of each region is calculated as follows.

14.2.1.1. Reserved Region

This region is fixed size, and is required for operation of the library.

You can get its size with the following function.

Code 14-3. Function for Getting the Size of the Reserved Region

size_t nn::ir::Communicator::GetReservedSize();

14.2.1.2. Send and Receive Packet Management Region

This region is used for storing management information for transmitted packets.

Its size is calculated based on the maximum number of packets that can be maintained at any
time. For connection processing, specify a size returned by GetManagementSize(1) for
both send and receive.

Get the management information size using the following function.

Code 14-4. Function to Get the Packet Management Information Size

size_t nn::ir::Communicator::GetManagementSize(s32 dataNum);

Specify the maximum number of packets maintained at any time in dataNum.


14.2.1.3. Send and Receive Packet Save Region

This region is used to store sent and received packets. For connection processing, the size
must be at least 32 bytes for both send and receive.

If the data sizes are variable, specify a size equal to or greater than that calculated using the
maximum packet size.

If the data sizes are fixed, specify the size of a single data packet multiplied by the maximum
number of packets.

Calculate the packet size from the data size using the following function.

Code 14-5. Function to Get the Size of a Packet

size_t nn::ir::Communicator::GetPacketSize(size_t dataSize);

Specify the size of the data in dataSize.

Notes About the Receive Packet Save Region

Due to noise, packets stored in the received packet storage area could be invalid. Because of
this, we recommend allocating a larger area than the actual number of packets that will be
received to ensure that normal packets received after such an invalid packet will be saved
reliably. For example, one approach would be to allocate the unused region to the receive packet
save region after deciding the sizes for the other regions.

The following code example sends and receives fixed-size data and maximizes the allocation of
the receive packet save region.

Code 14-6. Code Sample for Calculating Region Sizes

// Buffer, send/receive data sizes, and number of packets to save.
static u8 buffer[4096] NN_ATTRIBUTE_ALIGN(4096);
size_t sendDataSize = 100;
size_t sendPacketNum = 10;
size_t recvDataSize = 50;
size_t recvPacketNum = 20;

// Calculate required sizes.
size_t sendBufferSize =
    nn::ir::Communicator::GetPacketSize(sendDataSize) * sendPacketNum;
size_t sendManagementSize =
    nn::ir::Communicator::GetManagementSize(sendPacketNum);
size_t receiveBufferSize =
    nn::ir::Communicator::GetPacketSize(recvDataSize) * recvPacketNum;
size_t receiveManagementSize =
    nn::ir::Communicator::GetManagementSize(recvPacketNum);
size_t reservedSize = nn::ir::Communicator::GetReservedSize();

// Check whether the total region size is less than the buffer size.
NN_ASSERT((sendBufferSize + sendManagementSize + receiveBufferSize +
           receiveManagementSize + reservedSize) <= sizeof(buffer));

// Add the unused region size to the size of the receive packet save region.
receiveBufferSize = sizeof(buffer) -
    (sendManagementSize + sendBufferSize +
     receiveManagementSize + reservedSize);
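
Continuing the preceding code sample, the calculated sizes can then be passed to the
initialization function; the error handling here is only illustrative.

// Initialize the IR library with the buffer and the region sizes calculated above.
nn::Result result = nn::ir::Communicator::Initialize(
    buffer, sizeof(buffer),
    receiveBufferSize, receiveManagementSize,
    sendBufferSize, sendManagementSize);
NN_ASSERT(result.IsSuccess());
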
14.3. Connections

To start infrared communications between 3DS systems, the systems must authenticate each other as
communication devices. The called functions are divided into those for the system waiting for a
connection request, and those for the system sending a connection request. The waiting system must
call the WaitConnection() function, and the sending system must call the
RequireConnection() function. Both of these functions are asynchronous calls, so control returns
before connection processing has completed.

Code 14-7. Connection Processing Functions

static nn::Result nn::ir::Communicator::WaitConnection();
static nn::Result nn::ir::Communicator::WaitConnection(
    nn::fnd::TimeSpan sendReplyDelay);
static nn::Result nn::ir::Communicator::RequireConnection();
static nn::ir::ConnectionStatus nn::ir::Communicator::GetConnectionStatus();
static nn::ir::TryingToConnectStatus
    nn::ir::Communicator::GetTryingToConnectStatus();

Processing is asynchronous, so the GetConnectionStatus() function is provided to confirm the
overall state of the library and whether connection processing has completed. It returns
CONNECTION_STATUS_CONNECTED if connection processing has completed. The
GetTryingToConnectStatus() function provides processing state details. For other possible
return values, see 14.7. Getting the Connection Status.

The following table shows a list of return values for the GetTryingToConnectStatus() function.

Table 14-2. GetTryingToConnectStatus Function Return Values

Return Value Description

TRYING_TO_CONNECT_STATUS_NONE Not processing a connection.


TRYING_TO_CONNECT_STATUS_SENDING_REQUEST Sending a connection request.
TRYING_TO_CONNECT_STATUS_WAITING_REPLY Waiting for a reply to a connection request.

TRYING_TO_CONNECT_STATUS_WAITING_REQUEST Waiting for a connection request.


TRYING_TO_CONNECT_STATUS_SENDING_REPLY Sending a reply to a connection request.
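
The following is a minimal sketch of the waiting side. The polling loop is illustrative (a real
application would typically check the status once per frame), and the enumerator scoping within
the nn::ir namespace is an assumption.

// Wait for a connection request and poll until the connection is established.
nn::Result result = nn::ir::Communicator::WaitConnection();
NN_ASSERT(result.IsSuccess());

nn::ir::ConnectionStatus status;
do
{
    status = nn::ir::Communicator::GetConnectionStatus();
    // GetTryingToConnectStatus() can be used here to show progress to the user.
} while (status != nn::ir::CONNECTION_STATUS_CONNECTED &&
         status != nn::ir::CONNECTION_STATUS_FATAL_ERROR);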

14.3.1. Connecting Automatically

When it is not clear which of two 3DS systems to use to receive a connection request, such as
when IR communications is requested from the menu on both systems at the same time, use
automatic connection.

With automatic connection, both systems automatically switch between waiting for a connection and
requesting a connection with different timing in order to connect. As a result, which system ends up
in which connection role depends on the timing combination when the connection occurs.

Code 14-8. Automatic Connection Functions

static nn::Result nn::ir::Communicator::AutoConnection();
static nn::Result nn::ir::Communicator::AutoConnection(
    nn::fnd::TimeSpan sendReplyDelay,
    nn::fnd::TimeSpan waitRequestMin, nn::fnd::TimeSpan waitRequestMax,
    nn::fnd::TimeSpan waitReplyMin, nn::fnd::TimeSpan waitReplyMax);

The following figure shows an example of the operation of automatic connection.

Figure 14-3. Operation of Automatic Connection

If called without parameters, values to be used are calculated by the library based on the baud rate
set at initialization. Automatic connection may not succeed in some cases, when it is called without
specifying parameters, due to other processing by the application. In such cases, specify the
parameters while considering the following points.

sendReplyDelay is the interval between receiving a connection request packet and sending a
connection reply packet. The default value is 3 ms. The value must be at least 3 ms, to allow the
partner to switch between sending and receiving.

waitRequestMin and waitRequestMax are the minimum and maximum times for sending a
connection request packet and waiting to receive a connection reply packet. waitRequestMin
must be larger than the total of the connection request packet send time, the connection reply
packet receive time, and the time for switching between sending and receiving (3 ms).

waitReplyMin and waitReplyMax are the minimum and maximum times to wait for a connection
request packet and to send a connection reply packet. waitReplyMin must be greater than the
total of the time to receive a connection request packet, the time to send a connection reply packet,
and the interval for switching between sending and receiving (3 ms).

The size of a connection request packet is defined by CONNECTION_REQUEST_PACKET_SIZE (8
bytes), and the size of a connection reply packet is defined by CONNECTION_REPLY_PACKET_SIZE
(5 bytes). The time required to send or receive can be calculated from the actual data transmission
speed. Packet sizes are defined in bytes. Use care when calculating from transmission speeds,
which are in bits.

14.3.2. Connection Roles

You can obtain the connection role using the GetConnectionRole() function. Operation of the IR
library does not differ based on the connection role. Use the role as a guideline for determining
each system's role in the infrared communication.

Code 14-9. Function for Getting a Connection Role

static nn::ir::ConnectionRole nn::ir::Communicator::GetConnectionRole(void);

The following table shows the return values for the GetConnectionRole() function.

Table 14-3. GetConnectionRole Function Return Values


Return Value                 Description
CONNECTION_ROLE_NONE         The connection role is not set because connection processing has not started or is in progress.
CONNECTION_ROLE_TO_REQUIRE   Connected after requesting a connection. The system received a connection reply packet after a RequireConnection() or AutoConnection() function call.
CONNECTION_ROLE_TO_WAIT      Connected after receiving a connection request. The system sent a connection reply packet after a WaitConnection() or AutoConnection() function call.

14.4. Confirming the Communication ID

To prevent communication with any other application, each system must confirm the communication
ID of the other system after connection processing has completed. Any send or receive-related
function that is called before the communication IDs are confirmed results in an error.

Use the CreateCommunicationId() function to get a communication ID to use for confirmation.

Code 14-10. Function for Getting a Communication ID

static bit32 nn::ir::Communicator::CreateCommunicationId(
    bit32 uniqueId, bool isDemo = false);

Specify the unique ID allocated by Nintendo for each title for uniqueId. When communicating
between different titles, specify the unique ID for either one of them.

To prevent infrared communication between the retail version and the downloaded demo version of a
title, specify true for isDemo in the demo version. Always set this parameter to false in the retail
version, regardless of whether you want to communicate with demo versions.

Note: For test programs and titles that use infrared communication and have not been allocated
a unique ID by Nintendo, specify a prototype software code in the range from 0xFF000 to
0xFF3FF for uniqueId. In retail software, you must always specify the unique ID
assigned by Nintendo.

To confirm communication IDs, one 3DS calls the RequireToConfirmId() function and the other
calls the WaitToConfirmId() function. Both of these functions complete successfully if the
communication IDs, communication mode IDs, and passphrases specified by each system match.
Send and receive functions can only be used after this point.

Code 14-11. Functions for Confirming Communication IDs

static nn::Result nn::ir::Communicator::RequireToConfirmId(
    u8 subId, bit32 communicationId,
    const char passphrase[], size_t passphraseLength);
static nn::Result nn::ir::Communicator::WaitToConfirmId(
    u8 subId, bit32 communicationId,
    const char passphrase[], size_t passphraseLength,
    nn::fnd::TimeSpan timeout);
static bool nn::ir::Communicator::IsConfirmedId();

Use the communication ID generated by CreateCommunicationId for communicationId.

When an application using infrared communication has multiple scenes, communication between
different scenes can be prevented by specifying a communication mode identifier in subId. Specify a
different value for each scene.

The passphrase specified in passphrase is used as the key for encrypting infrared communications
packets. Avoid using strings that are easy to guess. The length of a passphrase must be in the range
between IR_PASSPHRASE_LENGTH_MIN and IR_PASSPHRASE_LENGTH_MAX.

Specify the length of the passphrase in passphraseLength.

Note that both the RequireToConfirmId() and WaitToConfirmId() functions implicitly consume
one of the nn::os::Event class instances assigned to the application.

Communication IDs must be reconfirmed after a disconnect and reconnection, even if they have been
confirmed earlier.

The IsConfirmedId() function can be used to check whether ID confirmation has already been
completed.
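
The following sketch confirms the communication ID on the side that requested the connection.
The unique ID, sub ID, and passphrase are placeholder values only; the passphrase length must
fall within the range described above.

// Confirm the communication ID after the connection has been established.
// UNIQUE_ID, SUB_ID, and PASSPHRASE are placeholders.
const bit32 UNIQUE_ID    = 0xFF000;         // prototype software code range
const u8    SUB_ID       = 0;               // one value per communication scene
const char  PASSPHRASE[] = "ExamplePass01"; // use a hard-to-guess string

bit32 communicationId = nn::ir::Communicator::CreateCommunicationId(UNIQUE_ID);

nn::Result result = nn::ir::Communicator::RequireToConfirmId(
    SUB_ID, communicationId, PASSPHRASE, sizeof(PASSPHRASE) - 1);

if (result.IsSuccess() && nn::ir::Communicator::IsConfirmedId())
{
    // Send() and Receive() can be used from this point on.
}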

14.5. Transmission

Data can be sent by infrared communications using the Send() function after completing
confirmation of communication IDs.

Code 14-12. Functions for Transmitting Data

static nn::Result nn::ir::Communicator::Send(
    void *pBuffer, size_t dataSize, size_t bufferSize,
    bool restore = false);
static void nn::ir::Communicator::GetSendEvent(nn::os::Event* pEvent);
static nn::Result nn::ir::Communicator::GetLatestSendErrorResult(bool clear);
static void nn::ir::Communicator::GetSendSizeFreeAndUsed(
    size_t *pSizeFree, s32 *pCountFree,
    size_t *pSizeUsed, s32 *pCountUsed);

Specify the send buffer and its size in pBuffer and bufferSize respectively. The send buffer must
be at least the size obtained by passing the size of the raw data to be sent to the
CalculateBufferSizeToCommunicate() function. In addition, this buffer must have a starting
address that is nn::ir::Communicator::SEND_BUFFER_ALIGNMENT (4 bytes) aligned.

Specify the data size in dataSize. Data in the range from 0 to 16316 bytes can be sent.

The data starting at the beginning of pBuffer and of length dataSize is sent. After data is sent, the
buffer contains the encrypted data with headers appended. If the original data is needed after it is
sent, specify true for the restore parameter. Note that this consumes more processing time than
only sending.

The actual transmission is done asynchronously and the Send() function returns control to the caller
after the packet has been saved in the buffer being used by the library. An event notifies when the
data is actually sent, and can be obtained using the GetSendEvent() function. The event passed to
pEvent is initialized as an automatically resetting event. The Result class instance returned by the
GetLatestSendErrorResult() function indicates whether a send error occurred. If called with the
clear parameter set to true, the error in the library’s internal Result class instance is cleared.

The library monitors the state of the buffer, and transmits send packets as quickly as possible
through the IR communications module. If the send function is called repeatedly in a short period of
time, the library could temporarily store multiple packets in the buffer at the same time. These
packets are sent in the order they were saved in the buffer.

If the total size of packets saved exceeds the size of the buffer when sending large packets or
sending packets repeatedly, the library discards any further packets without saving them. For this
reason, we recommend using GetSendSizeFreeAndUsed to check the available space in the send
packet management region and the save region before sending a packet. In particular, note that
available space in the save region of at least the packet size obtained from the GetPacketSize()
function is required in order to save the packet.
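
A minimal sketch of sending one block of data is shown below. The data size and buffer size are
illustrative, and the alignment macro follows the code sample in 14.2.1.

// Send one block of user data after the communication IDs have been confirmed.
// The buffer and data sizes are illustrative.
static u8 s_SendBuffer[1024] NN_ATTRIBUTE_ALIGN(4); // SEND_BUFFER_ALIGNMENT
const size_t DATA_SIZE = 100;
// ... write DATA_SIZE bytes of user data to the start of s_SendBuffer ...

NN_ASSERT(nn::ir::Communicator::CalculateBufferSizeToCommunicate(DATA_SIZE)
          <= sizeof(s_SendBuffer));

// Check that the library's send regions have room for one more packet.
size_t sizeFree, sizeUsed;
s32 countFree, countUsed;
nn::ir::Communicator::GetSendSizeFreeAndUsed(&sizeFree, &countFree,
                                             &sizeUsed, &countUsed);
if (countFree > 0 &&
    sizeFree >= nn::ir::Communicator::GetPacketSize(DATA_SIZE))
{
    nn::Result result =
        nn::ir::Communicator::Send(s_SendBuffer, DATA_SIZE, sizeof(s_SendBuffer));
    NN_ASSERT(result.IsSuccess());
}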

14.6. Receipt

Data can be received by calling the Receive() function after completing confirmation of the
communications IDs. After the connection process, the library can continue to receive data unless it
is maintaining a packet to be sent. So strictly speaking, the Receive() function is not a function that
performs receipt. It retrieves data that has already been received from the buffer. It is not possible to
determine whether received data is valid until it has been retrieved using the Receive() function.

Code 14-13. Functions Using Receive Processes

static nn::Result nn::ir::Communicator::Receive(
    void* pDst, size_t size, size_t *pReceiveSize,
    s32 *pRemainCount);
static void nn::ir::Communicator::GetReceiveEvent(nn::os::Event* pEvent);
static nn::Result nn::ir::Communicator::GetLatestReceiveErrorResult(
    bool clear);
static nn::Result nn::ir::Communicator::GetNextReceiveDataSize(size_t *pSize);
static nn::Result nn::ir::Communicator::DropNextReceiveData(s32 *pRemainCount);
static void nn::ir::Communicator::GetReceiveSizeFreeAndUsed(
    size_t *pSizeFree, s32 *pCountFree,
    size_t *pSizeUsed, s32 *pCountUsed);

Specify the receive buffer and buffer size in pDst and size, respectively. The receive buffer's
starting address must be nn::ir::Communicator::RECEIVE_BUFFER_ALIGNMENT (4 bytes)
aligned. Because the data is temporarily saved in the buffer in encrypted form with headers
appended, the buffer must be at least as large as the size returned by the
CalculateBufferSizeToCommunicate() function when it is passed the size of the raw data to
be received. The required buffer size can be determined beforehand using the
GetNextReceiveDataSize() function.

One packet of data is retrieved for each call to the Receive() function. The size of the retrieved
data is stored in pReceiveSize, and the number of packets remaining to be retrieved is stored in
pRemainCount. To read all of the stored data, call the Receive() function repeatedly until the
remaining count reaches 0 (zero).

The GetReceiveEvent() function returns an event containing notification of when received packets
were saved in the buffer, so it can be used if data must be retrieved as soon as it has been received.
The event passed to pEvent is initialized as an automatically resetting event. The
GetLatestReceiveErrorResult() function returns an instance of the Result class, indicating
whether an error has occurred during receive processing. If called with the clear parameter set to
true, the error in the library’s internal Result class instance is cleared.

If the GetNextReceiveDataSize() function returns a data size that is clearly different from the
expected data (such that the data in the next packet is not needed), one packet’s worth of data can
be discarded by using the DropNextReceiveData() function.
If packets are received more frequently than data is retrieved and the total size of packets to be
saved exceeds the size of the buffer, the library discards any further packets without saving them.
Use the GetReceiveSizeFreeAndUsed function to determine the available space for the receive
packet management region and the save region, and use it to manage receipt.

If the current connection is interrupted while received data remains, the remaining data can be
received in the interval until the next connection begins.
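
A minimal sketch of retrieving all packets that have already been received is shown below; the
receive buffer size is illustrative and the alignment macro follows the code sample in 14.2.1.

// Retrieve all packets that the library has already received.
// The receive buffer size is illustrative.
static u8 s_ReceiveBuffer[1024] NN_ATTRIBUTE_ALIGN(4); // RECEIVE_BUFFER_ALIGNMENT

s32 remainCount = 0;
do
{
    size_t receivedSize = 0;
    nn::Result result = nn::ir::Communicator::Receive(
        s_ReceiveBuffer, sizeof(s_ReceiveBuffer),
        &receivedSize, &remainCount);
    if (!result.IsSuccess())
    {
        break; // nothing left to retrieve, or a receive error occurred
    }
    // ... process receivedSize bytes of user data in s_ReceiveBuffer ...
} while (remainCount > 0);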

14.7. Getting the Connection Status

Use the GetConnectionStatus() function to get the status of the connection. In addition, the
GetConnectionStatusEvent() function returns an event that indicates when the connection
status changed. The event passed to pEvent is initialized as an automatically resetting event.

Code 14-14. Functions Used to Get the Connection Status

static void nn::ir::Communicator::GetConnectionStatusEvent(
    nn::os::Event* pEvent);
static nn::ir::ConnectionStatus nn::ir::Communicator::GetConnectionStatus();

The following table shows the return values from the GetConnectionStatus() function.

Table 14-4. GetConnectionStatus Return Values

Return Value Description


CONNECTION_STATUS_STOPPED The infrared module is stopped.
CONNECTION_STATUS_TRYING_TO_CONNECT Indicates that a connection is being established.

CONNECTION_STATUS_CONNECTED A connection has been established.


CONNECTION_STATUS_DISCONNECTING Indicates that a connection is being disconnected.
CONNECTION_STATUS_FATAL_ERROR A fault or other fatal error has occurred.

14.7.1. Handling Faults

When the connection status is CONNECTION_STATUS_FATAL_ERROR, a fault may have occurred in
the infrared communication module. In this state, all functions that return a result, except for
Finalize(), always return nn::ir::ResultFatalError. Infrared communication cannot be
used in any further processing. Immediately finalize the IR library.

This state could indicate a fault in the infrared module, but power-cycling the system may also
recover the module. We recommend displaying a message to the user indicating the possibility of a
fault and how to handle it.

14.8. Disconnecting
To disconnect infrared communication, call the Disconnect() function.

Code 14-15. Functions Using the Disconnect Processes

static nn::Result nn::ir::Communicator::Disconnect(void);


static nn::Result nn::ir::Communicator::ClearSendBuffer(void);
static nn::Result nn::ir::Communicator::ClearReceiveBuffer(void);

The Disconnect() function sends a disconnect request to the communication partner. If a
connection is being established, the process is interrupted. Disconnection is handled
asynchronously, so control is returned before disconnection processing has completed.

If Disconnect is called while some send packets are still unprocessed, the send packets are
discarded and the connection is disconnected. If necessary, confirm that all packets have been sent
before calling this function. To explicitly discard packets not sent, call the ClearSendBuffer()
function.

Packets received before disconnection can be retrieved using the Receive() function after
disconnecting. Note, however, that they will be discarded when processing the next connection and
connection authentication begins. To explicitly discard all received packets, call the
ClearReceiveBuffer() function.

14.9. Finalization

To finalize use of the IR library, call the Finalize() function.

Code 14-16. Functions Using the Finalization Processes

static nn::Result nn::ir::Communicator::Finalize(void);

If this function is called before the library is initialized, it returns the
nn::ir::ResultNotInitialized error.


15. Differences Between TWL Mode and Actual TWL Systems

An application works differently on a 3DS system in TWL mode (DSi-compatible mode) than it does on
an actual Nintendo DSi system. This chapter describes those differences. Consider these differences
when you are developing applications for systems in the Nintendo DS family that will also run on 3DS
systems. (In particular, refer to this information when developing a Nintendo DSiWare service or game
that is available on Nintendo eShop.)

Note: For information about how to import a Nintendo DSiWare service or game, see the API
Reference page for ctr_makecia.

15.1. Display-Related Differences

15.1.1. LCD Screens

Because the LCD screen layouts differ (see 5.5 Initializing the GX Library), the direction of
scanning on the 3DS screens is different from that on the TWL screens. Also note that because
emulation is used, the images are rendered with a delay of approximately 1.3 frames.

15.2. Input-Related Differences

15.2.1. Touch Panel

On actual TWL or NITRO systems, it is normally difficult for applications to detect touches at the very
edge of the screen. When the application is started in 1:1 pixel display mode on the 3DS, however,
the application’s display area is a subset of the physical touch panel, so the SDK (TWL or NITRO)
functions might report touches at the very edges of the application’s screen when the user touches
the panel outside of the application’s display area.

15.2.2. Key Input

When the +Control Pad is being emulated by the Circle Pad (see 6.2.1. Digital Buttons and Circle
Pad), although the state of the +Control Pad is updated in real time when the Circle Pad is not
being moved (when no input is detected), it updates roughly once per frame when the Circle Pad is
being moved. For this reason, input from the +Control Pad may not always be reflected accurately
when the user is operating the Circle Pad.

When the Circle Pad and the +Control Pad are both being used and the user simultaneously
presses Up and Down (or Left and Right) on the +Control Pad, the application is not notified of the
prohibited operation. The input is treated as Up when Up and Down are pressed simultaneously,
and as Left when Left and Right are pressed simultaneously. If your application uses such prohibited
input to trigger entry to debug mode or some other mode, your application will not operate correctly,
because it will never receive the prohibited input.

15.2.3. Closing the System


For the application, the system state while displaying the HOME Menu is no different from when the
system is closed. The PAD_DetectFold() function returns TRUE just like it normally does when
the system is closed, but the application is not notified of key input or touch panel input.

15.2.4. Sound Volume

The 3DS system has a slider for adjusting the sound volume. This allows users to change the volume
faster than is physically possible with the volume buttons on the Nintendo DSi.

15.3. OS-Related Differences

15.3.1. Reset and Shutdown

On CTR, processing of the OS_RebootSystem() function takes much longer than it does on a
TWL system.

15.3.2. Application Jumps

On CTR, the filename and content of the system menu version are different from TWL. If your
application incorporates routines that check this information, it might operate incorrectly when
running in TWL mode on a CTR system.

15.4. System Settings-Related Differences

15.4.1. Region Codes

On 3DS, the Australia region is included in the Europe region. Applications intended for the TWL
Australia region can also operate in the 3DS Europe region.

Table 15-1. TWL Application Regions and Corresponding Operable 3DS System Regions

TWL Application Region                                  Operable 3DS System Regions

JP (Japan)                                              JP (Japan)
US (United States)                                      US (United States)
EU (Europe)                                             EU (Europe)
AU (Australia)                                          EU (Europe)
EU (Europe) and US (United States)                      EU (Europe) or US (United States)
EU (Europe) and AU (Australia)                          EU (Europe)
EU (Europe), AU (Australia), and US (United States)     EU (Europe) or US (United States)
CN (China)                                              CN (China)
KR (Korea)                                              KR (Korea)

15.4.2. Country Setting

If the country setting of the 3DS system is set to a value that would not be available in the region of
the TWL application, the value of the country member of the OSOwnerInfoEx structure is set to
254 (OTHER).

15.4.3. Language Setting

If the language setting of the 3DS system is set to a value that would not be available in the region
of the TWL application, the language member of the OSOwnerInfoEx structure is set to either
OS_LANGUAGE_JAPANESE for the Japan region or to OS_LANGUAGE_ENGLISH for all other regions.


16. Appendix: Process Flows for Using the Circle Pad Pro

Because the Circle Pad Pro has no built-in status indicators, it is not possible to tell from the outside
whether it is active, whether its battery level is low, or whether it is connected to the CTR system. To
ensure ease of use, the application needs some mechanism for the detection, connection, and
redetection processes of the Circle Pad Pro.

Note: Because the C Stick, ZL Button, and ZR Button are built into the SNAKE hardware, the
system is treated as always having a Circle Pad Pro attached. The reported remaining battery life
also never decreases.

Consequently, your application can omit some of the process flows described in this section
if you verify the application is running on SNAKE hardware in advance.

This chapter provides Nintendo’s recommended process flows. Use them as a reference upon
implementation. Yellow backgrounds indicate an example of the message to be displayed for that
process.

Refer to the following when deciding whether to use "Circle Pad Pro" or "C Stick" in user-facing
messages.

Display as "Circle Pad Pro"
    CTR applications
    SNAKE-compatible titles, when the Circle Pad Pro is used on CTR and the C Stick on SNAKE

Display as "C Stick"
    SNAKE-only titles
    SNAKE-compatible titles, when the C Stick is used on SNAKE but the Circle Pad Pro is not used on CTR

16.1. Process Flow for Initial Startup

To allow the save data to be saved when using the Circle Pad Pro, we recommend conforming to the
following process flow upon initial startup.

Figure 16-1. Process Flow for Initial Startup

16.2. Process Flow for Normal-Usage Startup

As shown in 16.1. Process Flow for Initial Startup, to allow the save data to be saved when using the
Circle Pad Pro, we recommend conforming to the following process flow upon normal-usage startup.

Parts of the flow that are not necessary on SNAKE are indicated with a dotted red line around a light
red background.
Figure 16-2. Process Flow for Normal-Usage Startup

16.3. Process Flow for Detecting the Circle Pad Pro

Note: The following process flow does not apply to SNAKE.


We recommend the following process flow for detecting the Circle Pad Pro, except for when it is
detected after initial startup. The following figure assumes that the Circle Pad Pro is detected from a
thread other than the main thread. To detect explicitly, or to set up an option for connection, refer to
16.4. Options Screen Process Flows.

Figure 16-3. Process Flow for Detecting the Circle Pad Pro

16.4. Options Screen Process Flows

When using the Options screen to enable the Circle Pad Pro and transition to
connection/reconnection, we recommend the following process flows.

16.4.1. Transition to Enabling the Circle Pad Pro

When using the Options screen to transition to enabling the Circle Pad Pro, we recommend the
following process flow.

Figure 16-4. Process Flow for Enabling the Circle Pad Pro
Note: On SNAKE, the system is always connected when sampling begins, so the following
process flow is unnecessary.

When setting up the option to connect to the Circle Pad Pro, use the following process flow.

Figure 16-5. Process Flow for Connecting the Circle Pad Pro
16.4.2. Process Flow for Disconnecting the Circle Pad Pro

When setting up the option to disconnect from the Circle Pad Pro, we recommend the following
process flow.

Figure 16-6. Process Flow for Disconnecting the Circle Pad Pro
16.5. Process Flow for Using the Circle Pad Pro

Note: The following process flow does not apply to SNAKE.

We recommend the following process flow for using the Circle Pad Pro.

In the figures below, detection is handled either by providing an option for Circle Pad Pro detection
in the application or by monitoring for the Circle Pad Pro full-time, before the result is incorporated
into the main thread. The figure on the left shows the case where the application provides an option
for redetecting the Circle Pad Pro, and the figure on the right shows the case where the application
polls for the Circle Pad Pro full-time. Refer to Figure 16-3. Process Flow for Detecting the Circle Pad
Pro for cases where the Circle Pad Pro is monitored full-time in a thread other than the main thread.

Figure 16-7. Process Flow for Circle Pad Pro Detection in Application (left) or Full-Time (right)
17. Appendix: Testing the Operations of Standard
Applications on SNAKE

SNAKE has new features, including Super-Stable 3D, an additional input device, and better camera
performance. To realize these features, the SNAKE hardware is partially incompatible with CTR.
System processes also differ. For more information about SNAKE's added features, see the 3DS
Overview.

There may be differences in behavior when running standard applications on CTR and when running
them on SNAKE, so please test standard applications on both CTR and SNAKE.

When testing your titles, note the following.

Performance-pursuing scenes
    SNAKE has a higher memory access latency than the CTR, so waiting for memory access
    could cause some application processes to take longer. Application processing has been
    found to take as much as around 5% longer in some scenes. If your application has been
    tuned to the limits on the CTR, there is a chance of processing slowdowns on SNAKE.
    You can handle this in your application in two ways: lighten the processing burden, or
    build the application as an extended application that runs three times faster (at 804 MHz)
    on SNAKE. For more information, see 4.2. Standard and Extended Applications.

Local communication
    SNAKE processes local communications differently from the CTR, so unexpected
    behaviors are possible when local communications take place among a mix of the two
    systems.
    To verify the operations of local communications, use a mix of CTR and SNAKE systems
    and test operations with each acting as the host.

Starting and ending the cameras
    The Super-Stable 3D feature of SNAKE uses input from the cameras, so this feature is
    disabled when the application uses the cameras.
    Note that on SNAKE, initializing and finalizing the camera library involves the processes
    to disable and enable the Super-Stable 3D feature, so it takes a little longer to initialize
    and finalize the library.

HOME Menu
    On SNAKE, the HOME Menu operates in extended mode. When a standard application is
    started, the mode switches to standard mode.
    Strictly speaking, the standard application runs in extended mode until
    nn::applet::Enable() is called, at which time the application switches to standard
    mode. For more information, see 4.2.4. When the Operating Mode Changes.
    Note that when the HOME Button is pressed, the entire system including the suspended
    application operates in extended mode.

Input interface differences
    The inner camera on SNAKE has a wider angle than the inner camera on CTR systems.
    The inner and outer cameras on SNAKE have less noise than those on CTR systems.
    SNAKE has an automatic LCD brightness adjustment function. (Because the auto-
    brightness adjustment feature uses camera input, it is disabled when an application
    is using the camera; the immediately prior screen brightness level is maintained.)
    Using the C Stick on SNAKE has a different feel than the Circle Pad on the Circle
    Pad Pro.
    The C Stick on SNAKE and the Circle Pad on the Circle Pad Pro change their state
    differently upon waking up from Sleep. For more information, see [Link].
    Differences From the Circle Pad Pro.
    The SNAKE NFC has a different feel than the NFC Reader/Writer.

18. Appendix: Important Notes on 3DS Application Development Using the IS-SNAKE DevKit

Although the IS-SNAKE DevKit is a tool for developing SNAKE applications, it can also be used for
developing standard applications for CTR.

As noted in 4.2. Standard and Extended Applications, standard applications run in standard mode for
both CTR and SNAKE. However, when extended applications are run on CTR and SNAKE, the CPU's
running speed and usable memory size are different. Extended applications are normally run in
extended mode in the IS-SNAKE DevKit, but by using forced CTR-compatibility mode, you can test in
standard mode even for extended applications.

However, as shown in the table in 17. Appendix: Testing the Operations of Standard Applications on
SNAKE, operations in the IS-SNAKE DevKit may differ from behavior on CTR because some of the
specifications are incompatible. For both standard applications and extended applications, always
perform testing in the CTR environment.

Note: For more information about forced CTR-compatibility mode, see IS-CTR-DEBUGGER Help.


Revision History

Version 1.6 (2016-06-24)

Changes

17. Appendix: Testing the Operations of Standard Applications on SNAKE

Added that the auto-brightness adjustment feature is disabled during camera use.

Version 1.5 (2016-05-10)


Additions

18. Appendix: Important Notes on 3DS Application Development Using the IS-SNAKE DevKit

Changes

3.1.2. Device Memory

Added conditions to use when resizing device memory.

3.1.4. Heap Memory

Added conditions to use when resizing heap memory.

17. Appendix: Testing the Operations of Standard Applications on SNAKE

Moved operations testing with the IS-SNAKE DevKit to a separate page.

Version 1.4 (2015-11-05)

Additions

12.1.9. Login Applet

Changes

Overall

Appended a definition name for the alignment restriction.

4.3.2. Libraries

Added aacdec, aacenc, act, nfp, and qtm to the library list.

6.2.6. Circle Pad Pro

Added an instruction to end any other infrared communication functions in advance if they
are being used.

[Link]. Initialization
Changed the buffer size specified by the Circle Pad Pro initialization function to 12,288
bytes.

9.4. Scheduling

Added a warning describing the drop in system performance caused by excessive repeated
calls of Sleep().

12.1.1. Information Common to All Library Applets

Added information describing the login applet.

12.1.7. Circle Pad Pro Calibration Applet

Including libnn_extrapad is no longer needed with CTR-SDK 11.4 or higher.

14. Infrared Communication Between Systems

Added an instruction to end any other infrared communication functions in advance if they
are being used.

Version 1.3 (2015-04-28)

Changes

2.7.1. Game Card Slot

Revised the description of which cards can be inserted in the Game Card slot.

2.9.1. Speakers

Added information about the system speaker sound pressure frequency characteristics.

5.3.2. Handling HOME Button Presses

Added the NFC to the descriptions for handling devices during HOME Menu display.

5.3.9. Jump to Nintendo eShop

Added information about jumping to the patch page.

7.1.8. Latency Emulation

Fixed the hierarchy of the Config tools menu to match the current tools.
7.3. Save Data

Added to the errors that require handling when mounting a save data region.
Added that some errors are not returned for some media.

7.4. Extended Save Data

Added to the errors that require handling when mounting an extended save data region.

7.6.1. Nintendo 3DS Cards

Deleted the descriptions of error handling relating to save data.


Fixed the hierarchy of the Config tools menu to match the current tools.

17. Appendix: Testing the Operations of CTR Titles on SNAKE

Added a note about the forced CTR-compatibility mode of the IS-SNAKE DevKit.
Added information about input interface differences.

Version 1.2 (2015-01-15)

Additions

12. Applets

Merged the information about library applets and system applets into a single chapter.

12.2. System Applets

[Link]. Starting an Application From the Miiverse Post Page

17. Appendix: Testing the Operations of CTR Titles on SNAKE

Changes

1. Overview

Revised the link descriptions according to the addition of the Applets page.

4.3.2. Libraries

Revised the description of the OLV library.


5.7.2. Characteristics and Limitations of the RO Library

Added an upper limit to the number of dynamic modules that can be loaded simultaneously.

[Link]. Differences From the Circle Pad Pro

Added information about system behavior when waking from Sleep Mode.

12.1. Library Applets

Removed redundant parts from the applet description.


Moved the chapter.
Removed unnecessary library information.

12.2.1. Internet Browser Applet

Moved the chapter.


Changed the chapter name from WEBBRS Library (Internet Browser Applet).
Added the library name.

12.2.2. Miiverse Application and Post App

Changed the chapter name from "OLV Library (Miiverse Post App)."
Revised the descriptions pertaining to the OLV library.
Moved the chapter.

13. Supplemental Libraries

Moved the chapter about system applets.

Deletions

Platform Notation

Deleted this page because the information about platform notation was moved to the
Readme file.

Version 1.1 (2014-11-10)

Additions

5.3.10. Jump to E-manual


12.2.2. Miiverse Application and Post App

Changes

2.3.1. CPU

Added descriptions of the core configuration and use cases.

4.2.6. SNAKE-Only Titles

Added a description of SNAKE-only titles.

4.3.2. Libraries

Added a description of adding the OLV library.

9.6. Thread-Local Storage

Added a description of the feature to call the thread local storage destructor function.

Version 1.0 (2014-09-04)

Additions and Changes

Initial version.

3DS Programming Manual: Basic Graphics

Version 1.6

Nintendo Confidential

This document contains confidential and proprietary information of Nintendo, and is protected under
confidentiality agreements as well as the intellectual property laws of the United States and of other
countries. No part of this document may be released, distributed, transmitted, or reproduced in any form,
including by any electronic or mechanical means and by including within information storage and retrieval
systems, without written permission from Nintendo.

©2015–2016 Nintendo. All rights reserved.


All company and product names in this document are the trademarks or registered trademarks of their respective companies.
CTR-06-0077-002-F

1. Introduction
This document explains basic information and programming procedures necessary to use 3DS graphics
features. 3D graphics features for 3DS are based on OpenGL ES 1.1. To read this document, you must
have an understanding of OpenGL in addition to matrix and vector math.

Warning: Even though the 3DS system supports the use of OpenGL, for performance reasons it is
often advisable to avoid calling GL functions when you make heavy use of 3D graphics
features. Instead, write directly to PICA registers using 3D commands and command
caches, as described in the 3DS Programming Manual: Advanced Graphics. That manual
also explains many other features; review it after you read this document.

Four libraries are provided for drawing graphics: GL, GD, GR, and the collection of graphics
libraries included in NintendoWare for CTR (NW4C), distributed by Nintendo.

GL is a basic library that provides functions conforming to OpenGL ES specifications and a
rich variety of features, including error handling and state delta management. Accordingly,
this library places a heavy processing load on the system. We do not recommend using the
GL library when high performance is required.

GD is a lighter-weight library than GL. Although it is not compatible with the GL library, it
has equivalent features and provides an easy-to-use API. The GD library requires a certain
amount of CPU cycles for internal processing.

GR is a library intended to support the direct creation of 3D commands. Although this library
executes the fastest, its use requires in-depth knowledge of how registers are set, meaning
that error handling is the responsibility of the developer.

NW4C provides libraries and source code files that use command cache-related features, so
be sure to look into the possible use of NW4C. Although NW4C has been implemented to
take fullest advantage of optimized performance when used without modification, developers
who want to create customized processing can use the GR and GD libraries in combination
with NW4C.

Note: There are alignment and size restrictions on the memory passed from the application to the
graphics library. Within the programming manual you will find numerical references to the
alignment and size restrictions, but they are defined by constants. For more information
about the defined constants, see the API reference.

1.1. Document Structure

2. GPU provides an overview of the GPU built into the 3DS system, in addition to a brief summary of
its features. This chapter also explains the order in which graphics are processed. Read this chapter
to get a grasp of what you can render using 3DS graphics, in addition to an overall picture of
graphics processing.

3. LCD explains the relationship between the GX library and the LCD screens built into the 3DS
system. Read this chapter to understand the close connection between the GX library and the images
displayed on the LCDs.

4. Command Lists explains command lists, which are required to execute 3D graphics commands.

5. Shader Programs describes the types of shader programs and explains how to use them.

6. Vertex Buffers explains how to create and use vertex buffers to input batches of vertex data to the
vertex shader.

7. Textures describes the types of texture images that can be handled by the 3DS system, and the
CTR native format.

8. Vertex Shaders describes input vertex data, and the vertex attributes that are output for use in
later-stage processing.

9. Geometry Shaders provides an overview of and explains how to use the shaders provided by the
SDK for generating basic geometry, such as points and lines, and for other useful features, such as
silhouettes and subdivision.

10. Rasterization describes the processing that immediately precedes per-fragment operations. The
3DS system runs the scissor test at this stage.

11. Texture Processing explains how texture units, combiners, and combiner buffers determine texel
colors; how to combine lights with colors; and how to create procedural textures.

12. Reserved Fragment Shaders describes per-fragment operations that can be controlled using
reserved uniforms, such as fragment lighting, shadows, fog, and gas rendering.

13. Per-Fragment Operations describes the operations performed on fragments, before they are
written to the render buffer. Refer to this chapter for more information about blending, and the alpha
test and other tests.
14. Framebuffer Operations describes framebuffer processing and explains how to clear buffers.

15. Miscellaneous summarizes the differences between the CTR graphics libraries and the OpenGL
ES specifications.

Note: 3DS Programming Manual: Advanced Graphics presents PICA register information,
graphics command caching, and sample implementations of lighting models. It also
introduces features that require special settings such as stereoscopic display, block mode,
early depth tests, and so on.

1.2. Graphics Processing Errors

3D graphics for 3DS are based on OpenGL ES. Behavior follows the same specifications as OpenGL
ES if an error occurs during a call to a 3D graphics function.

The processing of the function is ignored, except when there is a GL_OUT_OF_MEMORY error.
You can get error codes with the glGetError() function.
Only a single error code is recorded. The error code is not updated until it is read by the
glGetError() function, even if another error is generated in the meantime.

The following error codes are recorded.

Table 1-1. Error Code List

Error Code              Conditions for Generating an Error

GL_INVALID_ENUM         An out-of-range value was specified as a GLenum parameter.
GL_INVALID_VALUE        An out-of-range value was specified as a numeric argument (and similar cases).
GL_INVALID_OPERATION    A function was called when it was prohibited to do so (and similar cases).
GL_OUT_OF_MEMORY        There is not enough memory to execute the function.
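
As a minimal sketch of this error-checking pattern (NN_LOG is assumed here to be the SDK logging
macro; substitute your own output if needed):

// Check for an error after issuing GL calls. Only a single error code is
// recorded, so read it promptly after the calls you want to verify.
GLenum error = glGetError();
if (error != GL_NO_ERROR)
{
    NN_LOG("GL error: 0x%04X\n", error);
}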

1.3. Single-Threaded Model

The 3D graphics library was designed under the assumption that it would be called from a single
thread. Operation is not guaranteed when functions are called from multiple threads. Functions must
also be called on only one rendering context.

CONFIDENTIAL

2. GPU

A PICA graphics core (268 MHz) developed by Digital Media Professionals (DMP) is installed as the
3DS GPU. It employs a framebuffer architecture. There are no equivalents of the OBJ and BG
concepts previously employed on the Nintendo DS. Unlike rendering with the 3D engine for the
Nintendo DS (which was closely connected with the VSync), on CTR rendering is possible until buffers
are swapped. In other words, you can control the trade-off between the frame rate and rendering
quality.

3D graphics features are based on OpenGL ES 1.1, but some of them correspond to OpenGL ES 2.0.
Frequently used 3D graphics features are built into the hardware. Shader programs are supported:
programmable vertex shaders and non-programmable pixel shaders are provided.

Note: A simple comparison of the GPU reveals that its specifications do not match the Wii or
Nintendo GameCube GPU in terms of vertex operations or fill rate. However, built-in
hardware features allow richer imagery to be shown for some scenes than is possible with
the Wii or Nintendo GameCube.

2.1. Built-In Hardware Features

Frequently used graphics features, including some that are not stipulated by the OpenGL ES
standard, are built into the hardware. This makes it possible to express varied imagery in a small
number of processing steps.

The following features have been provided.

Silhouette generation
Particle generation
Procedural textures
Fragment lighting
Gas rendering
Self-shadowing and soft shadowing
Polygon subdivision

Shader programs and reserved uniform settings can use these features for graphics processing.

2.1.1. Silhouette Generation

Silhouette generation is a feature that uses the GPU to detect object edges and render only those
edges. You can use this feature to outline a selected object for emphasis or to render soft shadows.

2.1.2. Particle Generation

Particle generation is a feature that renders a large number of random particles. You can use this
feature to render explosions, snow, and other effects.
2.1.3. Procedural Textures

Procedural textures represent a feature that automatically generates textures using a combination
of random noise and geometric patterns. This feature is a fast, low-cost way to generate textures
with regular geometric patterns, in addition to textures that have some randomness in addition to a
regular pattern, such as wood grain and marble.

2.1.4. Fragment Lighting

Fragment lighting is used on the 3DS system to light scenes on a per-fragment basis. This feature
allows you to use bump mapping, apply environment maps, and run lighting calculations quickly.

2.1.5. Gas Rendering

Gas rendering is a feature used to render gaseous objects, while accounting for intersections and
foreground/background relationships with polygon objects. This feature allows you to render
surface boundaries with other objects more naturally than, for example, gaseous objects rendered
by particles.

2.1.6. Self-Shadowing and Soft Shadows

Self-shadowing and soft shadows are supported by the 3DS system. These features allow an object
to cast shadows on itself, and soft shadows to be rendered more naturally around the contours of
an object.

2.1.7. Polygon Subdivision

Polygon subdivision is a feature that takes a group of vertices that are input according to fixed
rules and automatically splits them into smooth polygons, using the GPU. This feature allows you to
render objects with curved surfaces, given a small number of input vertices.

2.2. Rendering Pipeline

3D graphics for 3DS are processed in a series of steps known as the rendering pipeline.

Figure 2-1 shows a block view of the rendering pipeline and Table 2-1 shows what each process
does. For more information, see the relevant pages for each process.

Figure 2-1. Schematic View of the Rendering Pipeline


Table 2-1. Actions Performed by Each Process

Process                     Description

Vertex input                Processes vertex data input from the application.
Vertex operations           Runs the vertex shaders.
Geometry generation         Runs the geometry shaders.
Triangle generation         Converts all primitives into triangles.
(Triangle setup)
Clipping                    Uses a clipping volume to clip primitives.
Rasterization               Converts primitives into fragments.
Scissor test                Eliminates unnecessary fragments via scissoring.
Early depth test            Runs depth tests before per-fragment operations.
Fragment lighting           Processes per-fragment lighting.
Texel generation            Determines texel colors from texture coordinates.
Texture combiners           Combines fragment light colors, texture colors, and so on.
Fog                         Renders fog and gas.
Per-fragment operations     Runs tests, blending, logical operations, and other processing on fragments.
Framebuffer operations      Copies data from the framebuffer, runs pixel-read operations, and so on.


3. LCD
On the 3DS system, the GX (graphics) library is closely involved with the images displayed on the LCD
screens. Figure 3-1 shows the workflow starting from rendering and continuing until display on the
LCD.

Figure 3-1. Workflow From Rendering to Display on the LCD.

The process is roughly split into the following steps.

1. Render.
2. Copy data from the render buffer to the display buffer and convert its format.
3. Swap buffers to update the regions displayed on the LCDs.

The rendering in step 1 is described in the documentation, starting from 4. Command Lists. This
chapter describes the specifications for the output destination, the LCD, and the initialization needed
when using the GX library. It also describes the allocation and transfer of buffers, synchronizing screen
updates, and finalization, in that order.

Note: For more information about stereoscopic display, see the 3DS Programming Manual:
Advanced Graphics.

3.1. LCD Resolution, Placement, and Orientation

The LCDs on the 3DS system differ from the LCDs on the Nintendo DS (NITRO) and Nintendo DSi
(TWL) systems, both in terms of resolution and arrangement. Table 3-1 shows the differences in
resolution and Figure 3-2 shows the differences in placement and orientation.

Table 3-1. LCD Resolution

LCD NITRO/TWL 3DS


Upper screen (width × height) 256 × 192 240 × 400
Lower screen (width × height) 256 × 192 240 × 320

Figure 3-2. LCD Placement and Orientation


The standard LCD arrangement on the NITRO and TWL systems places the long edges of the LCDs
(the screen width) horizontally, with the lower screen in front. The standard LCD arrangement on the
3DS system places the short edges of the LCDs horizontally, with the lower screen to the left of the
upper screen (as when the front of the controller is rotated 90 degrees to the right).

3.2. Required Initialization Process

A framebuffer architecture has been adopted for 3DS. Because the images displayed on the LCDs are
based on the framebuffer content, the GX library must be initialized first.

3.2.1. Initializing the GX Library

Call the nngxInitialize() function to initialize the GX library.

Code 3-1. Initialization Function for the GX Library

GLboolean nngxInitialize(
    GLvoid* (*allocator)(GLenum, GLenum, GLuint, GLsizei),
    void (*deallocator)(GLenum, GLenum, GLuint, GLvoid*));
GLboolean nngxGetIsInitialized();

An allocator and a deallocator are specified to the nngxInitialize() function, using the
allocator and deallocator parameters, respectively. The allocator is called when a display
buffer or other memory region is allocated, and the deallocator is called when the same memory
region is freed. For more information about the allocator and deallocator, see [Link]. Allocator and
[Link]. Deallocator.

A value of GL_TRUE is returned when initialization succeeds, and GL_FALSE is returned when
initialization fails. A value of GL_FALSE is also returned if this function is called after the library
has already been initialized but before the nngxFinalize() function has been called to shut down
the library. Behavior is not guaranteed if any other gl* or nngx* functions are called before using
this function to initialize the library. The default render buffer, among others, is not allocated during
initialization. These buffers must be allocated by the application after initialization.

The nngxGetIsInitialized() function returns GL_TRUE if the nngxInitialize() function has
initialized the GX library.

[Link]. Allocator

The first allocator argument indicates the memory from which to allocate a region. The following
values can be passed.

NN_GX_MEM_FCRAM
Allocate from main (device) memory.
NN_GX_MEM_VRAMA
Allocate from VRAM-A.
NN_GX_MEM_VRAMB
Allocate from VRAM-B.

Given a value of NN_GX_MEM_FCRAM, the allocator allocates from main memory, but the region to
allocate must also be located in device memory. Device memory is a memory region that
guarantees address consistency when it is accessed by peripheral devices. Applications are
responsible for memory management. You can use the nn::os::GetDeviceMemoryAddress()
and nn::os::GetDeviceMemorySize() functions to get the starting address and size of the
memory region, allocated as device memory.

Note: For more information about device memory, see the 3DS Programming Manual:
System.

If a value of NN_GX_MEM_VRAMA or NN_GX_MEM_VRAMB is specified for the allocator, use the
nn::gx::GetVramStartAddr(), nn::gx::GetVramEndAddr(), and nn::gx::GetVramSize()
functions to get the starting address, ending address, and size of VRAM, respectively. As with
main memory, the application is responsible for memory management.

The second allocator argument is passed a value indicating the usage (buffer type) of the memory
to allocate. Because addresses are aligned differently for each type, implement your application's
allocator to comply with the following rules.

Table 3-2. Differences in Alignment Between Buffer Types

Argument Value              Buffer Type           Alignment

NN_GX_MEM_TEXTURE           Texture (2D,          128 bytes for all formats.
                            environment map)

NN_GX_MEM_VERTEXBUFFER      Vertex buffer         Depends on the vertex attribute:
                                                  4 bytes (GLfloat type),
                                                  2 bytes (GLshort type, GLushort type),
                                                  1 byte (GLbyte type, GLubyte type).

NN_GX_MEM_RENDERBUFFER      Color buffer          64 bytes.
                            Depth buffer          32 bytes (16-bit depth),
                            (stencil buffer)      96 bytes (24-bit depth),
                                                  64 bytes (24-bit depth + 8-bit stencil).

NN_GX_MEM_DISPLAYBUFFER     Display buffer        16 bytes.

NN_GX_MEM_COMMANDBUFFER     3D command buffer     16 bytes.

NN_GX_MEM_SYSTEM            System buffer         4 bytes (when the memory size to allocate is a multiple of 4),
                                                  2 bytes (when it is a multiple of 2 but not a multiple of 4),
                                                  1 byte (when none of the above).

In addition to these rules, your implementation must comply with the following hardware
specifications.

All six faces of a cube map texture must fit within the same 32 MB boundaries.
All six faces of a cube map texture must have addresses that share the same value for the
most-significant 7 bits.
The GL_TEXTURE_CUBE_MAP_POSITIVE_X face of a cube map texture must have a smaller
address, or the same address, as every other face.
Data must not be placed so that it spans VRAM-A and VRAM-B.

Warning: Do not allocate a display buffer in the last 1.5 MB of VRAM-A or VRAM-B.

If texture addresses are not correctly aligned, the GPU may hang, rendered results
may be corrupted, or other problems may arise.

If NN_GX_MEM_VERTEXBUFFER, NN_GX_MEM_RENDERBUFFER, NN_GX_MEM_DISPLAYBUFFER, or
NN_GX_MEM_COMMANDBUFFER was specified for the second argument, the third allocator
argument is passed the name (ID) of that object.

The fourth allocator argument is passed the size of the region to allocate.

The application must account for address alignment and allocate the specified memory region
based on these arguments, and then return the region's starting address. The application must
return a value of 0 if it fails to allocate the region.

[Link]. Deallocator

The first, second, and third deallocator arguments are passed the same values that were used
when the memory region was allocated. The fourth argument is passed the starting address of the
memory region. The application must use these arguments to release the memory region
allocated by the allocator.
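
The following is a rough sketch of an allocator/deallocator pair that could be passed to the
nngxInitialize() function. The parameter names and the helpers MyAlignedAllocFromDeviceMemory(),
MyAlignedAllocFromVram(), and MyFree() are hypothetical; a real implementation must manage
device memory and VRAM itself while observing the alignment rules in Table 3-2 and the hardware
restrictions above.

static GLvoid* MyAllocator(GLenum area, GLenum aim, GLuint id, GLsizei size)
{
    (void)aim;
    (void)id;  // This simple sketch ignores the buffer type and object name.

    // 384 bytes is a common multiple of every alignment listed in Table 3-2,
    // so it is safe (if slightly wasteful) for every buffer type.
    const size_t alignment = 384;

    switch (area)
    {
    case NN_GX_MEM_FCRAM:
        return MyAlignedAllocFromDeviceMemory(size, alignment);  // hypothetical
    case NN_GX_MEM_VRAMA:
    case NN_GX_MEM_VRAMB:
        return MyAlignedAllocFromVram(area, size, alignment);    // hypothetical
    default:
        return 0;  // Return 0 when allocation fails or the area is unknown.
    }
}

static void MyDeallocator(GLenum area, GLenum aim, GLuint id, GLvoid* addr)
{
    (void)aim;
    (void)id;
    MyFree(area, addr);  // hypothetical: release from the heap that matches area
}

// Register the pair when initializing the GX library.
nngxInitialize(MyAllocator, MyDeallocator);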

[Link]. Getting an Allocator

You can get the allocator that was configured when the GX library was initialized by calling the
nngxGetAllocator() function.

Code 3-2. Getting an Allocator

void nngxGetAllocator(
    GLvoid* (**allocator)(GLenum, GLenum, GLuint, GLsizei),
    void (**deallocator)(GLenum, GLenum, GLuint, GLvoid*));
The allocator and deallocator parameters specify pointers that will be assigned pointers to
the allocator and deallocator, respectively. If NULL is specified for either argument, the
corresponding pointer is not obtained.
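
For example, a minimal sketch that retrieves the registered pair:

GLvoid* (*allocator)(GLenum, GLenum, GLuint, GLsizei) = 0;
void (*deallocator)(GLenum, GLenum, GLuint, GLvoid*) = 0;
nngxGetAllocator(&allocator, &deallocator);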

[Link]. Getting an Initialization Register Setting Command

You can get an executable initialization register setting command by calling the
nngxGetInitializationCommand() function after the nngxInitialize() function has been called.

Note: This function was added for applications that write register setting commands directly,
without using the graphics library provided by the SDK, to handle the rendering that takes
place when returning to the application from the HOME Menu. Consequently, it is not
normally necessary to use this function.

Code 3-3. Getting an Initialization Register Setting Command

GLsizei nngxGetInitializationCommand(GLsizei datasize, GLvoid* data);

For the data parameter, specify a pointer to the buffer that receives the register setting command.
For the datasize parameter, specify the size in bytes of the buffer indicated by the data parameter.

This function returns the size in bytes of the initialization register setting command. If 0 (NULL) is
specified for data, the function does not store the register setting command; it only returns the
buffer size to allocate. After calling the function with data set to 0, allocate a buffer of that size
and call the function again to get the command.

If this function is called before the nngxInitialize() function (that is, before initialization), it
returns 0 without getting the command.

Table 3-3. Error Generated by the nngxGetInitializationCommand Function

Error Code            Cause

GL_ERROR_80B4_DMP     The value specified for datasize is smaller than the obtained register
                      setting command.
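
The following is a minimal sketch of the two-step size query described above; MyAlloc() is a
hypothetical application-side allocation routine.

// First call: query the required buffer size without receiving the command.
GLsizei commandSize = nngxGetInitializationCommand(0, 0);

// Second call: pass a buffer of that size to receive the command itself.
GLvoid* commandBuffer = MyAlloc(commandSize);  // hypothetical allocation
nngxGetInitializationCommand(commandSize, commandBuffer);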

3.2.2. Command-List Object Creation

You must create command list objects after initializing the GX library. Command list objects are a
concept specific to the Nintendo 3DS and are handled as execution units for 3D graphics
processing. This section only explains how to create them. For more information about command
lists and command list objects, see 4. Command Lists.

First use the nngxGenCmdlists() function to create one or more command list objects,
individually bind them to the GPU with the nngxBindCmdlist() function, and then allocate
individual memory regions with the nngxCmdlistStorage() function.

Command list objects comprise a 3D command buffer and command requests. Set bufsize and
requestcount for the nngxCmdlistStorage() function to the size of the 3D command buffer
and the number of command requests that can be queued. When you create multiple command list
objects, you must call the nngxBindCmdlist() and nngxCmdlistStorage() function on each of
them.

The following code sample creates a single command list object that has a 3D command buffer size
of 256 KB and can queue 128 command requests.

Code 3-4. Creating a Command List Object

GLuint commandList;

nngxGenCmdlists(1, &commandList);
nngxBindCmdlist(commandList);
nngxCmdlistStorage(256 * 1024, 128);

3.3. Allocating Buffers

Two of the buffers used in graphics processing are involved in LCD output: the framebuffer, which is
the GPU render target, and the display buffer, which is used to copy rendering results and display
them on the LCDs. Operations on the framebuffer and the display buffer are performed through the
framebuffer object and the display buffer object, respectively. For the configuration, see Figure 3-3.

Figure 3-3. Configuration of the Framebuffer and the Display Buffer

3.3.1. Allocating Render Buffers

First, you must use the glGenFramebuffers() function to create a framebuffer object to specify
as the render target. You can then bind the various render buffers (color, depth, or stencil) to the
framebuffer object to specify those render buffers as the render target. The 3DS system has two
screens (an upper and a lower screen), but if they have the same format and do not need to be
rendered in parallel, we recommend that they share an object to conserve memory resources. In
this case, specify the dimensions of the upper screen as the width and height of the buffer.

Code 3-5. Creating a Framebuffer Object

GLuint frameBufferObject;
glGenFramebuffers(1, &frameBufferObject);

Next, use the glGenRenderbuffers() function to create the render buffer objects. If the render
target includes a depth buffer, stencil buffer, or both, in addition to a color buffer, you need two
render buffer objects for the framebuffer object.

Code 3-6. Creating Render Buffer Objects

GLuint renderBuffer[2];
glGenRenderbuffers(2, renderBuffer);

Use the glBindRenderbuffer() function to specify the render buffer object to bind to the
framebuffer object, and then use the glRenderbufferStorage() function to allocate a render
buffer.

Code 3-7. Definition of the glRenderbufferStorage Function

void glRenderbufferStorage(GLenum target, GLenum internalformat,
                           GLsizei width, GLsizei height);

Set width and height to the width and height of the render buffer. Neither the width nor the
height can be greater than 1024 pixels.

Warning: Block-shaped noise may be rendered on some pixels if the render buffer has 262,128
or more pixels (the product of its width and height). For more information, see 15.8.
Block-Shaped Noise Is Rendered on Certain Pixels.

By setting target to the bitwise OR of GL_RENDERBUFFER and the following bitmasks, you can
specify the memory from which to allocate the buffer. If no bitmask is specified, the buffer is
allocated as if NN_GX_MEM_VRAMA was specified.

Table 3-4. Specifiable Bitmasks for the target Parameter of glRenderbufferStorage

Bitmask Buffer Is Allocated In


NN_GX_MEM_VRAMA VRAM-A
NN_GX_MEM_VRAMB VRAM-B

Specify the buffer type (format) in the internalformat parameter. You can choose from the
following formats on the 3DS system.

Table 3-5. Specifiable Formats for the internalformat Parameter of glRenderbufferStorage

Format                      Bits    Description of Format

GL_DEPTH_COMPONENT16        16      16-bit depth.
GL_DEPTH_COMPONENT24_OES    24      24-bit depth.
GL_RGBA4                    16      The R, G, B, and alpha components are 4 bits each.
GL_RGB5_A1                  16      The R, G, and B components are 5 bits each, and the alpha
                                    component is 1 bit.
GL_RGB565                   16      5-bit RB components and 6-bit G component. No alpha component.
GL_RGBA8_OES                32      The R, G, B, and alpha components are 8 bits each.
GL_DEPTH24_STENCIL8_EXT     32      24-bit depth and 8-bit stencil value.
GL_GAS_DMP                  32      Density values used during gas rendering (the rendered results
                                    cannot be copied to the display buffer).
To specify an allocated render buffer as the render target, use the
glFramebufferRenderbuffer() function to bind it to the framebuffer object that was specified
by the glBindFramebuffer() function.

Code 3-8. Definitions of glBindFramebuffer and glFramebufferRenderbuffer Functions

void glBindFramebuffer(GLenum target, GLuint framebuffer);
void glFramebufferRenderbuffer(GLenum target, GLenum attachment,
                               GLenum renderbuffertarget, GLuint renderbuffer);

Set target to GL_FRAMEBUFFER for both functions.

Set framebuffer to the framebuffer object and renderbuffer to the render buffer object. Set
renderbuffertarget to GL_RENDERBUFFER.

The value to specify for attachment depends on the render buffer format.

Table 3-6. Render Buffer Formats and the attachment Values to Use With Them

Format                      Value of attachment

GL_DEPTH_COMPONENT16        GL_DEPTH_ATTACHMENT
GL_DEPTH_COMPONENT24_OES    GL_DEPTH_ATTACHMENT

GL_RGBA4                    GL_COLOR_ATTACHMENT0
GL_RGB5_A1                  GL_COLOR_ATTACHMENT0
GL_RGB565                   GL_COLOR_ATTACHMENT0
GL_RGBA8_OES                GL_COLOR_ATTACHMENT0
GL_GAS_DMP                  GL_COLOR_ATTACHMENT0

GL_DEPTH24_STENCIL8_EXT     GL_DEPTH_STENCIL_ATTACHMENT

You can call the glCheckFramebufferStatus() function to get the state of the render buffer
object that is bound to the framebuffer object. Set the function's target parameter to
GL_FRAMEBUFFER. A GL_INVALID_ENUM error is generated if you specify any other value.

If GL_FRAMEBUFFER_COMPLETE is returned, a proper render buffer object has been bound.

If GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT is returned, neither the color buffer nor
the depth (stencil) buffer has been attached.

If GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT is returned, either memory has not been
allocated for the buffer that is bound, or the same render buffer object has been bound to both the
color buffer and the depth buffer.

If GL_FRAMEBUFFER_INCOMPLETE_DIMENSIONS is returned, buffers with different sizes have been
bound.

The following sample code allocates render buffers. Note that settings differ between the color and
depth (stencil) buffers. The render buffer is shared by the upper and lower screens, so its width and
height are specified using the dimensions of the upper screen.

Code 3-9. Allocating Render Buffers (Color, Depth, Stencil)

// FrameBuffer
glBindFramebuffer(GL_FRAMEBUFFER, frameBufferObject);
// Color
glBindRenderbuffer(GL_RENDERBUFFER, renderBuffer[0]);
glRenderbufferStorage(GL_RENDERBUFFER | NN_GX_MEM_VRAMA, GL_RGBA8_OES,
nn::gx::DISPLAY0_WIDTH, nn::gx::DISPLAY0_HEIGHT);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
GL_RENDERBUFFER, renderBuffer[0]);
// Depth / Stencil
glBindRenderbuffer(GL_RENDERBUFFER, renderBuffer[1]);
glRenderbufferStorage(GL_RENDERBUFFER | NN_GX_MEM_VRAMB,
GL_DEPTH24_STENCIL8_EXT, nn::gx::DISPLAY0_WIDTH, nn::gx::DISPLAY0_HEIGHT);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT,
GL_RENDERBUFFER, renderBuffer[1]);
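
After setting up the framebuffer as in Code 3-9, the setup can be verified with the
glCheckFramebufferStatus() function described above; the error handling here is only a sketch.

// Verify that the render buffers were attached correctly.
if (glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE)
{
    // Handle an incomplete framebuffer (missing attachment, unallocated
    // buffer, or mismatched buffer sizes).
}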

3.3.2. Allocating Display Buffers

Before you allocate a display buffer, you must call the nngxActiveDisplay() function to specify
whether to use it for the upper or lower screen.

Code 3-10. Definition of the nngxActiveDisplay Function

void nngxActiveDisplay(GLenum display);

Specify either the upper or lower screen with the value you pass to display. A
GL_ERROR_801F_DMP error occurs when display is set to a value other than the ones listed in
Table 3-7.

Table 3-7. Values to Set for display

Value of display        Specified Screen

NN_GX_DISPLAY0          Upper screen (during stereoscopic display, this is the image for the left eye).
NN_GX_DISPLAY0_EXT      Upper screen (specifiable only during stereoscopic display, when this is the
                        image for the right eye).
NN_GX_DISPLAY1          Lower screen.

Note: For more information about stereoscopic display, see the 3DS Programming Manual:
Advanced Graphics.

Next, use the nngxGenDisplaybuffers() function to create the display buffer objects.

Code 3-11. Definition of the nngxGenDisplaybuffers Function

void nngxGenDisplaybuffers(GLsizei n, GLuint* buffers);

Set n to the number of display buffer objects to be created. Set buffers to point to an array in
which to store the display buffer objects. If you are using more than one display buffer for a single
screen through multi-buffering, create only as many objects as you need.

Table 3-8. Errors Generated by the nngxGenDisplaybuffers Function

Error Code Cause

GL_ERROR_801C_DMP A negative value was specified for n.


GL_ERROR_801D_DMP The internal buffer failed to be allocated.

Call the nngxBindDisplaybuffer() function on the generated display buffer object to set it as
the target display buffer.

Code 3-12. Definition of the nngxBindDisplaybuffer Function

void nngxBindDisplaybuffer(GLuint buffer);

An object is created as long as an unused object name is specified for buffer. A
GL_ERROR_8020_DMP error occurs when the internal buffer failed to be allocated.

Use the nngxDisplaybufferStorage() function to allocate a display buffer.

Code 3-13. Definition of the nngxDisplaybufferStorage Function

void nngxDisplaybufferStorage(GLenum format, GLsizei width, GLsizei height,
                              GLenum area);

For format specify one of the following buffer formats. You cannot specify a format that uses more
bits per pixel than the format set when the color buffer was allocated.

Table 3-9. Specifiable Formats for format of nngxDisplaybufferStorage

Format Bits Description of Format

GL_RGBA4 16 The R, G, B, and alpha components are 4 bits each.


GL_RGB5_A1 16 The R, G, and B components are 5 bits each, and the alpha component is 1 bit.

GL_RGB565 16 5-bit RB components and 6-bit G component. No alpha component.


GL_RGB8_OES 24 8-bit RGB components. No alpha component.

For width and height, specify the display buffer size. Both dimensions must be positive numbers
that are multiples of the block size. The block size is either 8 or 32, depending on the render
buffer's current block mode. For information about block mode settings, see the 3DS Programming
Manual: Advanced Graphics.

For area, specify the memory in which to allocate the buffer.

Table 3-10. Specifiable Flags for area of nngxDisplaybufferStorage Function

Flag Buffer Is Allocated In


NN_GX_MEM_FCRAM Main (device) memory.

NN_GX_MEM_VRAMA VRAM-A
NN_GX_MEM_VRAMB VRAM-B

To capture and save a screen image, the display buffer must be allocated in main memory where it
can be accessed by the CPU.

If a display buffer object already has a display buffer allocated and you allocate a new display
buffer for it, the old memory region is freed and a new memory region is allocated.

Table 3-11. Errors Generated by the nngxDisplaybufferStorage Function

Error Code Cause

GL_ERROR_8021_DMP The target is a display buffer whose object name is 0.


GL_ERROR_8022_DMP An invalid value is specified for width or height.
GL_ERROR_8023_DMP An invalid value is specified for format.

GL_ERROR_8024_DMP An invalid value is specified for area.


GL_ERROR_8025_DMP The memory region failed to be allocated.

You can allocate multiple display buffers. The following sample code uses double-buffering.
Code 3-14. Allocating Display Buffers

GLuint display0Buffers[2];
GLuint display1Buffers[2];
// Displaybuffer for UpperLCD
nngxActiveDisplay(NN_GX_DISPLAY0);
nngxGenDisplaybuffers(2, display0Buffers);
nngxBindDisplaybuffer(display0Buffers[0]);
nngxDisplaybufferStorage(GL_RGB8_OES, nn::gx::DISPLAY0_WIDTH,
nn::gx::DISPLAY0_HEIGHT, NN_GX_MEM_FCRAM);
nngxBindDisplaybuffer(display0Buffers[1]);
nngxDisplaybufferStorage(GL_RGB8_OES, nn::gx::DISPLAY0_WIDTH,
nn::gx::DISPLAY0_HEIGHT, NN_GX_MEM_FCRAM);
nngxDisplayEnv(0, 0);
// Displaybuffer for LowerLCD
nngxActiveDisplay(NN_GX_DISPLAY1);
nngxGenDisplaybuffers(2, display1Buffers);
nngxBindDisplaybuffer(display1Buffers[0]);
nngxDisplaybufferStorage(GL_RGB8_OES, nn::gx::DISPLAY1_WIDTH,
nn::gx::DISPLAY1_HEIGHT, NN_GX_MEM_FCRAM);
nngxBindDisplaybuffer(display1Buffers[1]);
nngxDisplaybufferStorage(GL_RGB8_OES, nn::gx::DISPLAY1_WIDTH,
nn::gx::DISPLAY1_HEIGHT, NN_GX_MEM_FCRAM);
nngxDisplayEnv(0, 0);

[Link]. Getting Parameters

You can use the nngxGetDisplaybufferParameteri() function to get information about the
target display buffer.

Code 3-15. Function for Obtaining Display Buffer Parameters

void nngxGetDisplaybufferParameteri(GLenum pname, GLint* param);

You can set pname to one of the following values. Specifying any other value results in a
GL_ERROR_8033_DMP error.

Table 3-12. Display Buffer Parameters

pname Parameter Information to Get


NN_GX_DISPLAYBUFFER_ADDRESS Starting address of the display buffer.

NN_GX_DISPLAYBUFFER_FORMAT The display buffer format.


NN_GX_DISPLAYBUFFER_WIDTH The display buffer width.
NN_GX_DISPLAYBUFFER_HEIGHT The display buffer height.
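
For example, a minimal sketch that queries the currently bound display buffer:

GLint bufferAddress = 0;
GLint bufferFormat  = 0;
nngxGetDisplaybufferParameteri(NN_GX_DISPLAYBUFFER_ADDRESS, &bufferAddress);
nngxGetDisplaybufferParameteri(NN_GX_DISPLAYBUFFER_FORMAT,  &bufferFormat);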

3.4. Copying From the Color Buffer to the Display Buffer

When rendering has finished, the content of the color buffer is in block format. This data cannot be
output to the LCDs, which can only display data in a linear format. You must call the
nngxTransferRenderImage() function to convert the data into a format that can be output to the
LCDs. This function also copies the data to the display buffer from the color buffer that is bound to
the framebuffer object specified by glBindFramebuffer().
Code 3-16. Function for Copying Data From the Color Buffer to the Display Buffer

void nngxTransferRenderImage(GLuint buffer, GLenum mode, GLboolean yflip,
                             GLint colorx, GLint colory);

The buffer parameter specifies the display buffer object into which to copy the data. If the 3D
command buffer has accumulated unsplit commands, the function adds a split command and a
transfer command request to the command requests. There is no guarantee that any added transfer
commands have completed running after this function has executed. Wait until the command list has
finished executing before performing other tasks such as deleting color buffers or changing buffer
contents.

The mode parameter specifies the degree of anti-aliasing to apply when the rendered results are
copied to the display buffer.

Table 3-13. Anti-Aliasing Specifications

mode                            Anti-Aliasing Specification     Width       Height

NN_GX_ANTIALIASE_NOT_USED       Off (none).                     Equal       Equal
NN_GX_ANTIALIASE_2x1            2×1 anti-aliasing.              2 times     Equal
NN_GX_ANTIALIASE_2x2            2×2 anti-aliasing.              2 times     2 times

The value of yflip specifies whether to apply a y-flip (a vertical flip) when the rendered results are
copied to the display buffer. It is applied if the argument is set to GL_TRUE. Any value other than 0 is
treated as GL_TRUE.

The values of colorx and colory specify the offsets to use when copying the data from the color
buffer. (The function assumes that the origin is at the lower-left corner and that the positive axes are
pointing up and right.) You must specify offsets that are positive multiples of the block size, which is
8 in block-8 mode and 32 in block-32 mode. (For more information about block mode settings, see the
3DS Programming Manual: Advanced Graphics.)

Starting from the offset position in the color buffer, this function copies a region with the same width
and height as the display buffer to the display buffer. Subtract the offset values from the color buffer's
width and height to find the dimensions of the region that can be copied from the color buffer.

The height and width of the region to copy, as measured in pixels, must be at least as big as the
minimum allowed. The minimum height and width when copying from a color buffer is 128. The
minimum height and width when copying to a display buffer depends on the anti-alias setting. If anti-
aliasing is disabled, the minimum for both height and width is 128. If 2x1 anti-aliasing is enabled, the
height minimum is 128 and the width minimum is 64. If 2x2 anti-aliasing is enabled, the minimum for
both height and width is 64.

Table 3-14. Errors Generated by the nngxTransferRenderImage Function

Error Code           Cause

GL_ERROR_8027_DMP    Called when the bound command list's object name is 0.
GL_ERROR_8028_DMP    The command request has already reached the maximum number of accumulated
                     command requests allowable.
GL_ERROR_8029_DMP    A valid display buffer has not been bound.
GL_ERROR_802A_DMP    A valid color buffer has not been bound.
GL_ERROR_802B_DMP    An invalid value was specified for mode.
GL_ERROR_802C_DMP    The display buffer is larger than the region that can be copied. (When
                     anti-aliasing, apply the width and height ratios from Table 3-13.)
GL_ERROR_802D_DMP    Invalid values were specified for colorx and colory.
GL_ERROR_802E_DMP    The display buffer uses more bits per pixel than the color buffer.
GL_ERROR_802F_DMP    The 3D command buffer is full because of commands added by this function.
GL_ERROR_8059_DMP    The width and height of the color buffer and display buffer are not multiples
                     of 32 in block-32 mode.
GL_ERROR_805A_DMP    The width of the color buffer and display buffer are not multiples of 16 when
                     the display buffer's pixel size is 24 bits and the block mode is block 8.
GL_ERROR_80B5_DMP    The specified height or width for copying from a color buffer was smaller
                     than the minimum.
GL_ERROR_80B6_DMP    The specified height or width for copying to a display buffer was smaller
                     than the minimum.

3.5. Updating the LCD Render Regions by Swapping Buffers

After data has finished being copied to the display buffer, the buffer-swap function displays the
rendered results to one or both LCDs.

Use the nngxActiveDisplay() function to specify a display and the nngxBindDisplaybuffer()
function to bind the display buffer from which to display. You can then call the nngxDisplayEnv()
function to specify the offsets to use when data is output from the display buffer to the LCDs. The
function assumes that the origin is at the lower-left corner, that the positive axes point up and
right, and that the offsets are always positive values. Set the offsets to (0, 0) if the display
buffer is the same size as the LCD. Specifying a negative value for an offset results in a
GL_ERROR_8026_DMP error.

Code 3-17. Functions for Specifying the Display and the Display Buffer

void nngxActiveDisplay(GLenum display);
void nngxBindDisplaybuffer(GLuint buffer);
void nngxDisplayEnv(GLint displayx, GLint displayy);

Next, use the buffer-swap function to switch the buffer that is output to the LCD.

Code 3-18. Buffer-Swap Function

void nngxSwapBuffers(GLenum display);

The nngxSwapBuffers() function swaps the buffers during the next VSync. This function can be
called at any time, but if it is called more than once before a VSync, only the last call is valid.

Use display to specify which display is affected when the buffers are swapped. Specify
NN_GX_DISPLAY0 to target only the upper screen, NN_GX_DISPLAY1 to target only the lower screen,
or NN_GX_DISPLAY_BOTH to target both screens.

This function configures the GPU with the address of the display buffer to display, switching the
image that is displayed on the LCDs. The display buffer address that is ultimately set in the GPU is
calculated from the starting address of the buffer allocated by the nngxDisplaybufferStorage()
function. The calculation takes into account several variables, including the display buffer resolution,
the pixel size or number of bits per pixel (bpp), the LCD resolution, and the offsets configured by the
nngxDisplayEnv() function.

The following equation calculates the address to configure.

BufferAddress + PixelSize × (DisplayBufferWidth × (DisplayBufferHeight - LcdHeight - DisplayY) + DisplayX)
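
The following is a minimal sketch of this calculation, assuming PixelSize is given in bytes; the
function and variable names are illustrative only and are not part of the GX API.

// Illustrative helper only (not a GX API function). Computes the address that the
// driver configures in the GPU, per the equation above. pixelSize is in bytes.
static const void* CalcDisplayedAddress(const void* bufferAddress, int pixelSize,
                                        int displayBufferWidth, int displayBufferHeight,
                                        int lcdHeight, int displayX, int displayY)
{
    const unsigned char* base = static_cast<const unsigned char*>(bufferAddress);
    return base + pixelSize *
        (displayBufferWidth * (displayBufferHeight - lcdHeight - displayY) + displayX);
}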

Table 3-15. Errors Generated by the nngxSwapBuffers Function

Error Code            Cause
GL_ERROR_8030_DMP     An invalid value was specified for display.
GL_ERROR_8031_DMP     A valid display buffer has not been bound.
GL_ERROR_8032_DMP     The display region that takes the offset into account falls outside of the
                      display buffer.
GL_ERROR_8053_DMP     The display buffer address set in the GPU is not 16-byte aligned.
GL_ERROR_9000_DMP     The display buffer bound to the upper screen for the right eye during
                      stereoscopic display (NN_GX_DISPLAY0_EXT) is 0, or that region has not been
                      allocated.
GL_ERROR_9001_DMP     The nngxDisplayEnv() function specifies a display region that is outside the
                      display buffer.
GL_ERROR_9002_DMP     The display buffers bound to the two upper screens (NN_GX_DISPLAY0 and
                      NN_GX_DISPLAY0_EXT) have different resolutions, formats, or memory regions.

In the system's initial state, a black screen is forced to be displayed on the LCDs. After a valid
display buffer for LCD output has been prepared with the buffer-swap function, call the
nngxStartLcdDisplay() function in time with a VSync to start LCD output. You only need to call
this function once, the first time.

Code 3-19. Function for Starting LCD Output

void nngxStartLcdDisplay( void );
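
As a rough sketch, the first frame's presentation might look like the following (assuming the
display buffers for both screens have already been bound and that rendered results have already
been transferred to them):

static void PresentFirstFrame(void)
{
    // Schedule the swap for both screens; it takes effect at the next VSync.
    nngxSwapBuffers(NN_GX_DISPLAY_BOTH);

    // Wait for that VSync, then lift the forced black screen (first frame only).
    nngxWaitVSync(NN_GX_DISPLAY_BOTH);
    nngxStartLcdDisplay();
}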

3.5.1. Swapping Buffers for a Specified Address

By calling the nngxSwapBuffersByAddress() function, you can specify the address of a buffer
to display (on the LCDs) without using a display buffer object.

Code 3-20. Swapping Buffers for a Specified Address

void nngxSwapBuffersByAddress(GLenum display,
                              const GLvoid* addr, const GLvoid* addrB,
                              GLsizei width, GLenum format);

Buffers are swapped during the first VSync after this function (like the nngxSwapBuffers()
function) is called. If this function is called more than once on the same display before a VSync
occurs, only the last function call is valid.

Use display to specify which display is affected when the buffers are swapped. Specify
NN_GX_DISPLAY0 for the upper screen or NN_GX_DISPLAY1 for the lower screen.

Use addr to specify the address of the buffer to display. If stereoscopic display is enabled and you
have specified the upper screen with display, this argument provides the address of images to
display for the left eye. You must specify an address that is 16-byte aligned.

Use addrB to specify the address of images to display for the right eye when stereoscopic display
is enabled. This argument is only valid when you have specified the upper screen with display. It is
ignored if stereoscopic display is disabled or if you have specified NN_GX_DISPLAY1 in display.
You must specify an address that is 16-byte aligned.
The display position specified by the nngxDisplayEnv() function is ignored when the buffer
specified by this function is displayed. Consequently, consider offsets and other factors in the
addresses that you specify in addr and addrB. For more information about how addresses are
calculated, see 3.5. Updating the LCD Render Regions by Swapping Buffers.

Use width to specify the width (in pixels) of the buffer to display. Note that width specifies the
width of the buffer rather than the width of the LCD. Both the upper and lower screens have a width
of 240 pixels, but if you want to partially display a buffer that is wider than that, specify the width of
the entire buffer (in pixels), including the parts that will not be displayed. width must be a multiple
of 8 and cannot be less than 240.

Use format to specify the buffer format. You can specify the same formats that can be specified
when allocating a display buffer (see Table 3-9).

Table 3-16. Errors Generated by the nngxSwapBuffersByAddress Function

Error Code            Cause
GL_ERROR_8087_DMP     An invalid value was specified for display.
GL_ERROR_8088_DMP     The address specified in addr is not 16-byte aligned.
GL_ERROR_8089_DMP     The address specified in addrB is not 16-byte aligned.
GL_ERROR_808A_DMP     An invalid value was specified for width.
GL_ERROR_808B_DMP     A value other than a specifiable format was specified for format.

3.5.2. Details of Swapping Buffers

The nngxSwapBuffers() and nngxSwapBuffersByAddress() functions do not show the content of the
display buffer after they swap buffers. These functions simply schedule the display buffer address
(used when the screen is rendered after a VSync) to be changed during a VBlank.

The GPU displays the content of the display buffer to the LCD. Because the GPU reads a single
line of data from the display buffer for each scan line, memory is accessed frequently outside of a
VBlank. This causes tearing to occur if the content of the display buffer is overwritten outside of a
VBlank.

The GPU accesses memory at the highest priority when it transfers (displays) the content of the
display buffer to the LCD. By placing the display buffer in VRAM that can only be accessed by the
GPU, you can avoid problems caused by memory access conflicts. If, however, you place the
display buffer in main memory (device memory), there may be memory access conflicts with the
CPU and other devices.

You can use the nngxSetMemAccessPrioMode() function to adjust the priority at which the GPU,
CPU, and other devices access main memory.

Code 3-21. Function for Adjusting the Priority of Main Memory Access

void nngxSetMemAccessPrioMode(nngxMemAccessPrioMode mode);

Table 3-17. Differences in Main Memory Access Priorities

mode GPU CPU Other Devices

NN_GX_MEM_ACCESS_PRIO_MODE_0 Uniform priority.


NN_GX_MEM_ACCESS_PRIO_MODE_1 High priority.
NN_GX_MEM_ACCESS_PRIO_MODE_2 Extremely high priority.

NN_GX_MEM_ACCESS_PRIO_MODE_3 High priority. High priority.


NN_GX_MEM_ACCESS_PRIO_MODE_4 High priority.

NN_GX_MEM_ACCESS_PRIO_MODE_1 is the default setting.

By raising the priority of memory accesses from the CPU, you can reduce the effect of the GPU and
other devices on the time taken by processes that involve accessing main memory from the CPU.

Warning: If you place the display buffer in main memory and specify
NN_GX_MEM_ACCESS_PRIO_MODE_2, a large number of memory accesses from the CPU
may cause noise resembling vertical lines to appear on the screen because of
insufficient bandwidth for transferring images to be displayed on the LCD. To avoid this,
either place the display buffer in VRAM or specify another mode.

Note: If there is insufficient bandwidth for displaying images on the LCD, the
nngxGetCmdlistParameteri() function returns a bit array with a value of 1 stored in
both bit 17 and bit 18 when NN_GX_CMDLIST_HW_STATE is specified for param. For
information about the nngxGetCmdlistParameteri() function, see 4.1.10. Getting
Command List Parameters.

3.6. Screen Update Synchronization

You can use the following functions for processing that needs to line up with vertical LCD
synchronization (VSync).

Code 3-22. VSync Functions

GLint nngxCheckVSync(GLenum display);
void nngxWaitVSync(GLenum display);
void nngxSetVSyncCallback(GLenum display, void (*func)(GLenum));

The nngxCheckVSync() function returns the value of the VSync counter updated within the library.
By checking for changes in this value you can asynchronously check for VSync updates. Because the
return value is an update counter used internally by the library, it wraps around to 0 after it exceeds
the implementation's maximum value. (This may change in future implementations.)

The nngxWaitVSync() function waits for a VSync update on the specified LCD. Control does not
return until the VSync update.

The nngxSetVSyncCallback() function registers the callback function that is invoked when VSync
is updated. If this function is called with func set to 0 (NULL), it unregisters the callback function.
The registered callback function is called from a different thread than the main thread, so mutual
exclusion is required when referencing any data shared with the main thread. However, mutual
exclusion is not required for data shared with any interrupt handlers registered using the
nngxSetCmdlistCallback() function even if they are for the same graphics processing.

Warning: You can call the nngx() functions from within a VSync callback, but note that
command request completion interrupts are forced to wait until the callback function
completes. Consequently, minimize any calls within callback functions to functions that
issue command requests.
Setting display to NN_GX_DISPLAY0, NN_GX_DISPLAY1, or NN_GX_DISPLAY_BOTH in any of these
functions causes it to operate on the upper screen, lower screen, or both, respectively. Passing any
other value causes a GL_ERROR_8019_DMP error in calls to nngxCheckVSync, a
GL_ERROR_801A_DMP error in calls to nngxWaitVSync, and a GL_ERROR_801B_DMP error in calls
to nngxSetVSyncCallback.

The system tries to prevent extreme time delay between screens, but there are approximately 100
microseconds between VSync updates for the upper and lower screens. Avoid creating code that
strongly depends on this delay because heavy processing in the VSync callback for the upper screen
or the start of an extremely high-priority thread may force the VSync callback for the lower screen to
wait.

The LCD VSync rate is 59.831 Hz for both the upper and lower screens. This does not
change when stereoscopic display (the parallax barrier) is enabled or disabled.
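
The following sketch registers a VSync callback for the upper screen that only bumps a counter;
the variable and function names are illustrative. As noted above, any data shared with the main
thread would need mutual exclusion.

// Illustrative VSync callback; keep the body short because command request
// completion interrupts wait until it returns.
static volatile GLint s_UpperVSyncCount = 0;

static void OnUpperScreenVSync(GLenum display)
{
    (void)display;          // NN_GX_DISPLAY0 in this registration.
    s_UpperVSyncCount++;    // Protect anything shared with the main thread.
}

static void RegisterVSyncCallback(void)
{
    nngxSetVSyncCallback(NN_GX_DISPLAY0, OnUpperScreenVSync);
}

static void UnregisterVSyncCallback(void)
{
    nngxSetVSyncCallback(NN_GX_DISPLAY0, 0);    // 0 (NULL) unregisters the callback.
}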

3.7. Finalizing

Call the nngxFinalize() function when you stop using the GX library (for example, when the
application shuts down). This releases all remaining unreleased objects.

Code 3-23. Definition of the nngxFinalize Function

void nngxFinalize(void);

To destroy the framebuffer objects, render buffers, and display buffers allocated for displaying
graphics on the LCDs, call glDeleteFramebuffers, glDeleteRenderbuffers, and
nngxDeleteDisplaybuffers, respectively.

Code 3-24. Functions for Destroying Framebuffer Objects, Render Buffers, and Display Buffers

void glDeleteFramebuffers(GLsizei n, const GLuint* framebuffers);
void glDeleteRenderbuffers(GLsizei n, const GLuint* renderbuffers);
void nngxDeleteDisplaybuffers(GLsizei n, GLuint* buffers);

In each function, n specifies the number of object names in the array passed as the second
parameter. A GL_ERROR_801E_DMP error occurs when n is set to a negative value in the
nngxDeleteDisplaybuffers() function.

If any of the specified framebuffer objects or render buffers is currently in use, that object is
left unaffected; all of the other specified objects are deleted. For display buffers, any buffer
that is in use is first replaced by the display buffer with an object name of 0, and then the
object is destroyed.
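
As a sketch of a typical shutdown sequence (assuming the application tracked the object names it
created; all names below are hypothetical):

static GLuint s_Framebuffer;
static GLuint s_ColorBuffer;
static GLuint s_DepthBuffer;
static GLuint s_DisplayBuffers[2];

static void ShutdownGraphics(void)
{
    nngxWaitCmdlistDone();                          // Make sure nothing is still executing.
    glDeleteRenderbuffers(1, &s_ColorBuffer);
    glDeleteRenderbuffers(1, &s_DepthBuffer);
    glDeleteFramebuffers(1, &s_Framebuffer);
    nngxDeleteDisplaybuffers(2, s_DisplayBuffers);
    nngxFinalize();                                 // Releases any remaining objects.
}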

3.8. Specifying Display Portions

Figure 3-4 shows which portions are specified for the transfer from the color buffer to the display
buffer, and then for output from the display buffer to the LCD. The transfer from the color buffer
to the display buffer assumes that anti-aliasing is disabled.

Figure 3-4. Specifying Display Portions


Table 3-18. Specifying Display Portions

Variable Name   Description
cw, ch          Width and height of the color buffer.
                Values specified by width and height in the glRenderbufferStorage() function.
                (See Code 3-7.)
cx, cy          Offset used when copying from the color buffer to the display buffer.
                Values specified by colorx and colory in the nngxTransferRenderImage() function.
                (See Code 3-16.)
dw, dh          Width and height of the display buffer.
                Values specified by width and height in the nngxDisplaybufferStorage() function.
                (See Code 3-13.)
dx, dy          Offset used when producing output from the display buffer to the LCD.
                Values specified by displayx and displayy in the nngxDisplayEnv() function.
                (See Code 3-17.)
lw, lh          Width and height of the output destination LCD. The width and height differ for the
                upper screen and lower screen.
                See 3.1. LCD Resolution, Placement, and Orientation.


4. Command Lists
Command lists are new to the Nintendo 3DS system. The gl and nngx functions called during 3D
graphics processing can be recorded as commands and then executed all at once. Command list
processing uses command list objects. On the Nintendo 3DS, command lists are handled by the 3D
graphics rendering execution unit.

Command lists include commands that write to registers using direct GPU execution (3D commands),
and command requests for communicating instructions from the CPU to the GPU. 3D commands
accumulate in the 3D command buffer as gl() and nngx() functions carry out rendering work and
other tasks. Command requests are queued by the specific gl() and nngx() functions that called
them. For information about the types of command requests, see 4.2. Command Request Types.

Figure 4-1. Command Lists


When a render command request queued in the command list is processed, the GPU loads the 3D
commands from the 3D command buffer and executes them. Multiple 3D commands are handled and run
together as a single command set. Each command set ends with a split command, so the GPU can
track where to stop loading from the 3D command buffer.

Figure 4-2. Command Sets

Note: The function issuing command requests can only be called in the application core (Core 0).

4.1. How to Use

The Nintendo 3DS GPU renders 3D graphics by running commands in units of command lists.
Applications create command list objects into which gl() functions and other functions accumulate
3D commands, which are then executed as one batch when the GPU runs the command list.

4.1.1. Creating Objects

First, use the nngxGenCmdlists() function to create the command list objects.

Code 4-1. Function for Creating Command Lists

void nngxGenCmdlists(GLsizei n, GLuint* cmdlists);

This code creates n command list objects and stores their object names in cmdlists.

Command lists have their own namespace. The command list with an object name of 0 is reserved
by the system.
Table 4-1. Errors Generated by the nngxGenCmdlists Function

Error Code Cause


GL_ERROR_8000_DMP A negative value was specified for n.

GL_ERROR_8001_DMP The internal buffer failed to be allocated.

4.1.2. Binding Command Lists

Next, use the nngxBindCmdlist() function to bind a generated command list object to the GPU.
3D commands are accumulated in the bound command list's 3D command buffer.

Code 4-2. Function for Binding Command Lists

void nngxBindCmdlist(GLuint cmdlist);

If cmdlist is set to an unused object name, that object is created.

Table 4-2. Errors Generated by the nngxBindCmdlist() Function

Error Code            Cause
GL_ERROR_8004_DMP     The internal buffer failed to be allocated.
GL_ERROR_8005_DMP     Called while command caches and command lists were in a state of being saved.
                      (For more information, see the 3DS Programming Manual: Advanced Graphics.)

4.1.3. Allocating Memory Regions

Use the nngxCmdlistStorage() function to allocate a memory region for the bound command
list.

Code 4-3. Allocating a Memory Region for a Command List

void nngxCmdlistStorage(GLsizei bufsize, GLsizei requestcount);

Set bufsize to the size of the 3D command buffer and requestcount to the number of command
requests that can be queued.

You must call the nngxBindCmdlist() and nngxCmdlistStorage() functions on each command list object
that you create. Calls to these functions are ignored when a command list with an object name of 0
is bound. If you call this function again on an object that already has an allocated region, that
region is freed and reallocated.

A GL_ERROR_COMMANDBUFFER_FULL_DMP error is generated by the relevant function if 3D commands have
accumulated past this allocated 3D command buffer size or if the 3D command buffer has not been
set. A GL_ERROR_COMMANDREQUEST_FULL_DMP error is generated by the relevant function if command
requests have been queued past this maximum queue size or if the buffer has not been set.

Table 4-3. Common Errors When Allocating a Memory Region for a Command List

Error Code Cause


GL_ERROR_8006_DMP Failed to allocate a memory region.

GL_ERROR_8007_DMP Called on a command list that is being executed.


GL_ERROR_8008_DMP A negative value was specified as an argument.

4.1.4. Running Command List Objects

Call the nngxRunCmdlist() function to start executing command requests that have been queued
in the bound command list.

Code 4-4. Function for Executing Command Lists

void nngxRunCmdlist(void);
void nngxRunCmdlistByID(GLuint cmdlist);

Execution is ignored if the bound command list has an object name of 0. Likewise, attempts to bind
a different command list and run this function are ignored while command requests are executing.

After command requests have started executing, you can accumulate more commands in that same
list or you can bind another command list and accumulate commands there. However, commands
must be executed in the same order in which they were accumulated.

Accumulation of commands in an executing command list must occur in the application core.
Behavior is undefined if commands are accumulated outside the application core.

The nngxRunCmdlistByID() function runs the command list specified by cmdlist rather than
the currently bound command list. Besides running the specified command list, it works the same
as the nngxRunCmdlist() function.
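
Putting 4.1.1 through 4.1.4 together, a minimal setup and per-frame flow might look like the
following sketch (the buffer size and request count are placeholder values):

static GLuint s_Cmdlist = 0;

static void InitCommandList(void)
{
    nngxGenCmdlists(1, &s_Cmdlist);       // 4.1.1: create the object.
    nngxBindCmdlist(s_Cmdlist);           // 4.1.2: bind it.
    nngxCmdlistStorage(0x40000, 128);     // 4.1.3: 256 KB of 3D commands, 128 requests.
}

static void RenderFrame(void)
{
    // ... issue gl()/nngx() calls here; they accumulate 3D commands ...
    nngxSplitDrawCmdlist();               // Queue the accumulated 3D commands as a request.
    nngxRunCmdlist();                     // 4.1.4: start executing the queued requests.
    nngxWaitCmdlistDone();                // Wait for completion (see 4.1.12).
    nngxClearCmdlist();                   // Reuse the same object next frame (see 4.1.8).
}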

Table 4-4. Common Errors When Running Command Lists

Error Code            Cause
GL_ERROR_8009_DMP     Called on a command list for which a memory region has not been allocated
                      (nngxRunCmdlist()).
GL_ERROR_80B1_DMP     Called on a command list for which a memory region has not been allocated
                      (nngxRunCmdlistByID()).

4.1.4.1. Getting the State of an Executing Command List

Determine whether a command list is running using the nngxGetIsRunning() function.

Code 4-5. Function for Determining Command List Execution State

GLboolean nngxGetIsRunning(void);

This function returns GL_TRUE if a command list is currently running, regardless of whether the
command list is currently bound.

Similarly, you can determine whether the currently bound command list is running by passing
NN_GX_CMDLIST_IS_RUNNING for pname to the nngxGetCmdlistParameteri() function. For information
about the nngxGetCmdlistParameteri() function, see 4.1.10. Getting Command List Parameters.
4.1.5. Destroying Command List Objects

You can call the nngxDeleteCmdlists() function to free command list objects that are no longer
necessary.

Code 4-6. Function for Destroying Command Lists

void nngxDeleteCmdlists(GLsizei n, const GLuint* cmdlists);

This destroys the command list objects specified by the n object names in cmdlists. If any of the
specified command list objects is currently being executed, a GL_ERROR_8003_DMP error occurs;
all of the other specified command list objects are still freed.

Table 4-5. Errors Generated by the nngxDeleteCmdlists() Function

Error Code Cause

GL_ERROR_8002_DMP A negative value was specified for n.

GL_ERROR_8003_DMP A command list included in cmdlists is still being executed.

4.1.6. Stopping a Command List

Call either of the following functions to stop a command list that is being executed.

Code 4-7. Functions for Stopping Command Lists

void nngxStopCmdlist(void);
void nngxReserveStopCmdlist(GLint id);

When the nngxStopCmdlist() function is called, it waits for any executing command request to
complete and then stops the command list. You cannot stop a command request after it has started
executing (or is waiting to commence execution).

The nngxReserveStopCmdlist() function stops the command list immediately after the id-th
accumulated command request finishes executing.

Call the nngxRunCmdlist() function to resume a stopped command list. Note, however, that this
function will be ignored if it is called after the instruction to stop the command list but before the
executing command requests finish executing.

Table 4-6. Errors Generated by the nngxReserveStopCmdlist() Function

Error Code            Cause
GL_ERROR_800A_DMP     Called on a command list that is being executed.
GL_ERROR_800B_DMP     Zero, a negative number, or a value that exceeds the maximum number of command
                      requests was specified for id.

4.1.7. Splitting the 3D Command Buffer

Use the nngxSplitDrawCmdlist() function to add a buffer loading complete command to the 3D
command buffer and start queuing render command requests. If a command list accumulates 3D
commands while it is executing, its 3D commands are run up to the point at which they are split by
this function.

Code 4-8. Function for Splitting the 3D Command Buffer

void nngxSplitDrawCmdlist(void);

Render command requests are not queued until the buffer loading complete command is added. In
addition to this function, other functions also queue render command requests. Because functions
such as glClear() and glTexImage2D() must stop 3D command execution, they each add a
buffer loading complete command, and then queue the render command requests.

Table 4-7. Common Errors Generated by the nngxSplitDrawCmdlist() Function

Error Code            Cause
GL_ERROR_800C_DMP     Called when the bound command list's object name is 0.
GL_ERROR_800D_DMP     The command request has already reached the maximum number of accumulated
                      command requests allowable.
GL_ERROR_800E_DMP     The 3D command buffer is full because of commands added by this function.

These errors may also be generated by other functions that call this one internally.

4.1.7.1. Flushing the Accumulated 3D Command Buffer

When the nngxSplitDrawCmdlist() function is called, the buffer loading complete command is added
and a render command request is queued even if no 3D commands have accumulated in the 3D command
buffer. In other words, this function can potentially add unneeded commands. We recommend instead
calling the nngxFlush3DCommand() function or the nngxFlush3DCommandNoCacheFlush() function, which
split the 3D command buffer only when 3D commands have actually accumulated. If cache flushes
would otherwise occur multiple times, you can reduce CPU overhead by using the
nngxFlush3DCommandNoCacheFlush() function and then reflecting the cache contents all at once later
with the nngxUpdateBufferLight() function.

Code 4-9. Functions for Flushing the 3D Command Buffer

void nngxFlush3DCommand(void);
void nngxFlush3DCommandNoCacheFlush(void);

These functions do not add a buffer loading complete command and render command request if no 3D
commands have accumulated in the 3D command buffer of the bound command list since the buffer was
last split. They only add a buffer loading complete command and render command request when 3D
commands have accumulated in the buffer. If 3D commands are accumulating while the command list is
executing, the 3D commands execute up to the point at which the buffer was split by these
functions.

Table 4-8. Errors Generated by the nngxFlush3DCommand() and nngxFlush3DCommandNoCacheFlush() Functions

Error Code                                Cause
GL_ERROR_8084_DMP / GL_ERROR_80AE_DMP     Called when the bound command list's object name is 0.
GL_ERROR_8085_DMP / GL_ERROR_80AF_DMP     The command request has already reached the maximum number
                                          of accumulated command requests allowable.
GL_ERROR_8086_DMP / GL_ERROR_80B0_DMP     The 3D command buffer is full because of commands added by
                                          this function.

4.1.7.2. Partially Flushing the Accumulated 3D Command Buffer

The nngxFlush3DCommandPartially() function is provided for executing 3D commands of a specified
size. This function is an extension of the features provided by the nngxFlush3DCommand() function
and can be called to correctly execute 3D commands, including the command buffer execution
register kick command added by functions such as nngxAdd3DCommand(). For more information, see
8.8.23. Command Buffer Execution Registers (0x0238 - 0x023D) in the 3DS Programming Manual:
Advanced Graphics.

Code 4-10. Function for Partially Flushing the Command Buffer

void nngxFlush3DCommandPartially(GLsizei buffersize);

Specify the size, in bytes, of the command buffer to be executed in buffersize. The number
must be a multiple of 16.

The size must be correctly specified in buffersize from the address following the previous
command flush to the first kick command (and including the kick command). If the wrong value is
specified, the command is executed in an unintended order and the operation may not be able to
complete properly.

The application must accurately flush the cache for the 3D commands accumulated between the
previous command flush and the point at which this function is called. Flush the entire cache
after calling the function, because commands that generate an interrupt are added within this
function. In addition, to avoid commands being executed before the cache is flushed, this function
cannot be called on a command list whose execution is in progress. Functions such as glClear(),
nngxTransferRenderImage(), and glCopyTexImage2D() perform flushes in the same way as the
nngxFlush3DCommand() function, so always perform a flush using this function before calling those
functions.

When a kick command is added using the nngxAddJumpCommand() or nngxAddSubroutineCommand()
function, the driver adjusts the size so that the command kicks the execution size up to the first
kick command. It is unnecessary to call the nngxFlush3DCommandPartially() function when using
those functions.

Note that when a partial flush is performed for a command buffer with a kick command added with
the nngxAddSubroutineCommand() function, the execution size used is specified in
buffersize instead of an execution size calculated by the driver.

Table 4-9. Errors Generated by the nngxFlush3DCommandPartially() Function

Error Code            Cause
GL_ERROR_80A9_DMP     Called when the bound command list's object name is 0.
GL_ERROR_80AA_DMP     The command request has already reached the maximum number of accumulated
                      command requests allowable.
GL_ERROR_80AB_DMP     The 3D command buffer is full because of commands added by this function.
GL_ERROR_80AC_DMP     A value less than 0 or that is not a multiple of 16 was specified in buffersize.
GL_ERROR_80AD_DMP     Called on a command list that is being executed.


4.1.8. Clearing the Command List

The following function clears the command list and sets both the 3D command buffer and command
request queue to the unused state (the state immediately after their memory regions are allocated).

Code 4-11. Function for Clearing the Command List

void nngxClearCmdlist(void);

Table 4-10. Errors Generated by the nngxClearCmdlist Function

Error Code Cause

GL_ERROR_800F_DMP Called on a command list that is being executed.

4.1.8.1. Clearing the Command List and Filling Its 3D Command Buffer

The following function clears the command list and initializes the 3D command buffer with the
specified data. Both the 3D command buffer and command request queue enter the unused state.

Code 4-12. Function for Clearing the Command List and Filling Its 3D Command Buffer

void nngxClearFillCmdlist(GLuint data);

Table 4-11. Errors Generated by the nngxClearFillCmdlist() Function

Error Code Cause

GL_ERROR_8065_DMP Called on a command list that is being executed.

4.1.9. Setting Command List Parameters

You can call the nngxSetCmdlistParameteri() function to set command list parameters.

Code 4-13. Function for Setting Command List Parameters

void nngxSetCmdlistParameteri(GLenum pname, GLint param);

Table 4-12. Command List Parameters That Can Be Set

pname                       Setting
NN_GX_CMDLIST_GAS_UPDATE    This parameter is set for individual command list objects and can have
                            one of the following values.
                            GL_TRUE: Update the additive blend results for rendering gas density
                            information.
                            GL_FALSE: Normal behavior (default).
                            If this parameter is GL_TRUE, when the nngxSplitDrawCmdlist() or
                            nngxFlush3DCommand() function is called, additive blend results are
                            updated for rendered gas density information after the accumulated render
                            command requests have finished running.
                            If this parameter is GL_FALSE, normal behavior is restored; that is,
                            commands that update gas density information are only accumulated when
                            necessary.
                            This setting takes effect depending on whether it is GL_TRUE when
                            nngxSplitDrawCmdlist() or nngxFlush3DCommand() is called. A setting of
                            GL_TRUE does not have any effect when render command requests are
                            executed. This setting also does not affect render command requests
                            accumulated by any function call other than nngxSplitDrawCmdlist() or
                            nngxFlush3DCommand().
                            For more information about how additive blend results are updated for
                            rendered gas density information, see the Gas Control Setting Registers
                            section in the 3DS Programming Manual: Advanced Graphics.

Table 4-13. Errors Generated by the nngxSetCmdlistParameteri() Function

Error Code Cause

GL_ERROR_8015_DMP Called on a command list that is being executed.

GL_ERROR_8016_DMP An invalid value was specified for pname or param.

4.1.10. Getting Command List Parameters

You can call the nngxGetCmdlistParameteri() function to get command list parameters.

Code 4-14. Function for Getting Command List Parameters

void nngxGetCmdlistParameteri(GLenum pname, GLint* param);

Table 4-14. Getting Command List Parameters

pname                             Parameter Obtained

NN_GX_CMDLIST_IS_RUNNING          The command list execution state.
                                  GL_TRUE: The command list is currently being executed.
                                  GL_FALSE: The command list is not currently being executed.

NN_GX_CMDLIST_USED_BUFSIZE        The number of bytes accumulated in the 3D command buffer.

NN_GX_CMDLIST_USED_REQCOUNT       The number of accumulated command requests.

NN_GX_CMDLIST_MAX_BUFSIZE         The maximum 3D command buffer size. This value is specified as
                                  bufsize to the nngxCmdlistStorage() function.

NN_GX_CMDLIST_MAX_REQCOUNT        The maximum number of command requests. This value was specified in
                                  the requestcount parameter to the nngxCmdlistStorage() function.

NN_GX_CMDLIST_TOP_BUFADDR         The starting address of the 3D command buffer.

NN_GX_CMDLIST_BINDING             The object name of the command list that is currently bound.

NN_GX_CMDLIST_RUN_BUFSIZE         The number of executed bytes in the 3D command buffer.

NN_GX_CMDLIST_RUN_REQCOUNT        The number of executed command requests.

NN_GX_CMDLIST_TOP_REQADDR         The starting address of the data region used for the command request
                                  queue.

NN_GX_CMDLIST_NEXT_REQTYPE        The type of command request that is currently executing or will be
                                  executed next. The value returned in param depends on the state of
                                  the command list that is currently bound. If the command list is
                                  currently stopped, the return value is the type of command request
                                  that will be executed next. If the command list is executing, the
                                  return value is the type of command request that is currently
                                  executing. NULL is returned when all command requests have finished
                                  executing.
                                  Command request types are defined by the following macros.
                                  NN_GX_CMDLIST_REQTYPE_DMA: DMA transfer command request.
                                  NN_GX_CMDLIST_REQTYPE_RUN3D: Render command request.
                                  NN_GX_CMDLIST_REQTYPE_FILLMEM: Memory fill command request.
                                  NN_GX_CMDLIST_REQTYPE_POSTTRANS: Post transfer command request.
                                  NN_GX_CMDLIST_REQTYPE_COPYTEX: Copy texture command request.

NN_GX_CMDLIST_NEXT_REQINFO        The command buffer's address and byte size. The command buffer's
                                  address is stored in the first element of param and its size (in
                                  bytes) is stored in the second element. You must pass param a pointer
                                  to an array of at least two GLint values.
                                  If the bound command list is currently stopped, parameter information
                                  is returned for the command request that will be executed next. If
                                  the bound command list is executing, parameter information is
                                  returned for the command request that is currently executing. Nothing
                                  is returned if all command requests have finished executing.
                                  The function only returns information when a render command request
                                  is the command request that is currently executing or that will be
                                  executed next. Nothing is returned for any other type of command.

NN_GX_CMDLIST_HW_STATE            32 bits of data indicating the hardware state. Each of the following
                                  bits is set to 1 to indicate that the command list is in the
                                  specified state.
                                  bit 20: A post transfer command request is executing.
                                  bit 19: A memory fill is in progress.
                                  bit 18: An underrun error has occurred in the FIFO for the lower
                                  screen.
                                  bit 17: An underrun error has occurred in the FIFO for the upper
                                  screen.
                                  bit 16: The post vertex cache is busy.
                                  bit 15: Bits [1:0] of register 0x0252 are 1.
                                  bit 14: Vertex processor 3 is busy.
                                  bit 13: Vertex processor 2 is busy.
                                  bit 12: Vertex processor 1 is busy.
                                  bit 11: Vertex processor 0 (which can also be used as the geometry
                                  shader processor) is busy.
                                  bit 10: Bits [1:0] of register 0x0229 are nonzero.
                                  bit 9: Input to the module that loads command buffers and vertex
                                  arrays is busy.
                                  bit 8: Output from the module that loads command buffers and vertex
                                  arrays is busy.
                                  bit 7: The early depth test module is busy.
                                  bit 6: The per-fragment operation module is busy processing data from
                                  the module in the previous stage.
                                  bit 5: The per-fragment operation module is busy accessing the
                                  framebuffer.
                                  bit 4: The texture combiners are busy.
                                  bit 3: Fragment lighting is busy.
                                  bit 2: The texture units are busy.
                                  bit 1: The rasterization module is busy.
                                  bit 0: Triangle setup is busy.

NN_GX_CMDLIST_CURRENT_BUFADDR     Buffer address of the next 3D command stored in the currently bound
                                  command list.

Table 4-15. Common Errors When Getting Command List Parameters

Error Code            Cause
GL_ERROR_8017_DMP     An invalid value was specified for pname or param.
GL_ERROR_8018_DMP     NN_GX_CMDLIST_BINDING is not specified for pname when the bound command list's
                      object name is 0.

4.1.11. Command Completion Interrupts

You can cause interrupts to occur and call interrupt handlers, when the command requests in a
command list finish. You can register an interrupt handler with the nngxSetCmdlistCallback()
function.

Code 4-15. Function for Registering an Interrupt Handler

void nngxSetCmdlistCallback(void (*func)(GLint));

An interrupt handler is valid only for the bound command list. If this function is called with func set
to 0 (NULL), the handler is unregistered.

The interrupt handler is called from a different thread than the main thread, so mutual exclusion is
needed when referencing any data shared with the main thread. However, mutual exclusion is not
needed for data shared with any callback functions for the same graphics processing registered
using the nngxSetVSyncCallback() function.

Table 4-16. Common Errors When Setting Command List Callbacks

Error Code Cause

GL_ERROR_8010_DMP Called on a command list that is being executed.

Use the nngxEnableCmdlistCallback() function to specify a command request that triggers an
interrupt when it ends. The nngxDisableCmdlistCallback() function can disable interrupts.

Code 4-16. Interrupt Control Functions

void nngxEnableCmdlistCallback(GLint id);
void nngxDisableCmdlistCallback(GLint id);

An interrupt occurs upon completion of the id-th accumulated command request. You can call this
function on a single command list several times with separate id values to cause multiple
interrupts to occur. Note that id indicates a command request in the order that it was accumulated,
not in the order that it was executed. You can call nngxGetCmdlistParameteri() with pname
set to NN_GX_CMDLIST_USED_REQCOUNT to get a value to specify for id. If id is -1, an interrupt
occurs when all command requests accumulated in the command list have finished.

The command list is still executing when an interrupt handler is called. This is the case for
every interrupt except the one for the last command request accumulated in the command list.
Consequently, the interrupt handler cannot, itself, call any functions that cannot be called while
a command list is executing.

Even without registering an interrupt handler, you can determine when a command request has
finished executing by calling nngxGetCmdlistParameteri() passing pname as
NN_GX_CMDLIST_IS_RUNNING, and then waiting until you get a value of GL_FALSE.
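
As a sketch, the following registers a completion handler and requests an interrupt after the last
accumulated command request (the handler name is illustrative):

// Illustrative completion handler; it runs on a separate thread, so protect any
// data it shares with the main thread.
static void OnCmdlistDone(GLint id)
{
    (void)id;
}

static void EnableCompletionInterrupt(void)
{
    nngxSetCmdlistCallback(OnCmdlistDone);
    nngxEnableCmdlistCallback(-1);   // -1: interrupt when all accumulated requests finish.
}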

Table 4-17. Common Errors When Enabling and Disabling Command List Callbacks

Error Code                                Cause
GL_ERROR_8012_DMP / GL_ERROR_8014_DMP     Zero, a negative number other than -1, or a value equal to
                                          the maximum number of command requests was specified for id.

4.1.12. Waiting for Command Execution to Complete

You can call nngxWaitCmdlistDone to wait for all of the command requests accumulated in the
command list to complete.

Code 4-17. Function That Waits for Commands to Complete

void nngxWaitCmdlistDone(void);

Render command requests are executed until the point at which they are split. To execute all of the
accumulated render command requests, call nngxSplitDrawCmdlist() before this function.

This function does not return until command execution is complete. However, you can use the
nngxSetTimeout() function to set a timeout period.

Code 4-18. Function for Setting a Timeout When Waiting for Command Execution to Complete

void nngxSetTimeout(GLint64EXT time, void (*callback)(void));

Set time to the number of ticks to wait before the nngxWaitCmdlistDone() function times out.
Timeouts do not occur when a value of 0 is specified.

Set callback to the callback function to invoke when a timeout occurs. If this is NULL, a callback
function is not invoked when the timeout occurs.

No timeouts occur by default because the initial values for time and callback are 0 and NULL
respectively.
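
For example, a wait guarded by a timeout might look like the following sketch (the callback name
is a placeholder; convert a wall-clock interval to ticks with the OS time facilities described in
the Time chapter):

// Illustrative timeout handler; called if nngxWaitCmdlistDone() waits longer than
// the configured number of ticks.
static void OnGpuTimeout(void)
{
    // For example, log the hang or dump NN_GX_CMDLIST_HW_STATE for debugging.
}

static void WaitForGpuWithTimeout(GLint64EXT timeoutTicks)
{
    nngxSetTimeout(timeoutTicks, OnGpuTimeout);
    nngxWaitCmdlistDone();
    nngxSetTimeout(0, NULL);    // Restore the default (no timeout).
}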

4.1.13. Adding a DMA Transfer Command Request

When the nngxAddVramDmaCommand() or nngxAddVramDmaCommandNoCacheFlush() function is called, a
command request that runs a DMA transfer to VRAM is accumulated in the command list. The former
flushes the source cache, but the latter does not. These functions can only perform DMA transfers
from main memory to VRAM.

Code 4-19. Function for Adding a DMA Transfer Command Request

void nngxAddVramDmaCommand(
const GLvoid* srcaddr, GLvoid* dstaddr, GLsizei size);
void nngxAddVramDmaCommandNoCacheFlush(
const GLvoid* srcaddr, GLvoid* dstaddr, GLsizei size);

An amount of data specified by size is transferred from the address specified by srcaddr to the
address specified by dstaddr.
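
As a brief sketch, queuing an upload of vertex data into VRAM might look like this (the source and
destination pointers are assumed to come from the application's allocators, and the destination
must be a VRAM address):

static void QueueVramUpload(const GLvoid* srcMainMem, GLvoid* dstVram, GLsizei size)
{
    // Flushes the source range from the CPU cache, then queues the DMA transfer.
    nngxAddVramDmaCommand(srcMainMem, dstVram, size);

    // If the source data is already known to be written back to memory, the
    // NoCacheFlush variant avoids the redundant flush:
    // nngxAddVramDmaCommandNoCacheFlush(srcMainMem, dstVram, size);
}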

When calling the nngxAddVramDmaCommand() function, a GL_ERROR_8062_DMP error indicates that this
function was called when no valid command list was bound, while a GL_ERROR_8064_DMP error
indicates that size is negative.

When calling the nngxAddVramDmaCommandNoCacheFlush() function, a GL_ERROR_8090_DMP error indicates
that this function was called when no valid command list was bound, and a GL_ERROR_8091_DMP error
indicates that size is negative.

4.1.14. Adding an Anti-Aliasing Filter Transfer Command Request

When the nngxFilterBlockImage() function is called, a command request that transfers an image
with an anti-aliasing filter applied is accumulated in the command list. (This is one kind of
post-filter command request.) The image is transferred in block format, unconverted. The only
supported anti-aliasing specification is 2×2.

Code 4-20. Function for Adding an Anti-Aliasing Filter Transfer Command Request

void nngxFilterBlockImage(const GLvoid* srcaddr, GLvoid* dstaddr,
                          GLsizei width, GLsizei height, GLenum format);

An image with a width, height, and format specified by width, height, and format respectively is
transferred from the address specified by srcaddr to the address specified by dstaddr.

The width and height arguments are restricted as follows by the value specified for format.

Table 4-18. Format Restrictions on the Width and Height of Images to Be Transferred

format                              width                                height
GL_RGBA8_OES, GL_RGB8_OES           A multiple of 64, greater than or    A multiple of 16, greater than
                                    equal to 64.                         or equal to 64.
GL_RGBA4, GL_RGB5_A1, GL_RGB565     A multiple of 128, greater than or   A multiple of 16, greater than
                                    equal to 128.                        or equal to 128.

If the transfer source and destination memory regions overlap, the function works properly when
the srcaddr and dstaddr values are the same, or when the srcaddr value is bigger than the dstaddr
value. The transfer results could be corrupted if the srcaddr value is smaller than the dstaddr
value.

When the value for srcaddr specifies an address in device memory, the transfer results could be
incorrect if the destination memory cache has not been flushed.

Table 4-19. Errors Generated by the nngxFilterBlockImage() Function

Error Code            Cause
GL_ERROR_8068_DMP     Called when a command list with an object name of 0 is bound or when there is
                      no space in the command request queue.
GL_ERROR_8069_DMP     The address specified for srcaddr or dstaddr is not 8-byte aligned.
GL_ERROR_806A_DMP     A width or height value is specified that violates the restrictions.
GL_ERROR_806B_DMP     A format value is specified that is not listed in the restrictions.

4.1.15. Adding an Image Transfer Command Request

When the nngxTransferLinearImage() function is called, a command request that transfers an image
to a render buffer or texture is accumulated in the command list. (This is one kind of
copy-texture command request.) If the current 3D command buffer has accumulated unsplit commands,
a split command is added, and then the transfer command request is added.

Although images are converted from linear format to block format while they are transferred, this
conversion only affects addressing. If this function is called on a render buffer, the block mode
setting automatically determines whether a conversion to 8 block addressing or 32 block addressing
is applied during the transfer. If this function is called on a texture, a conversion to 8 block
addressing is applied. In either case, you must flip an image in the V direction and convert its byte
order before you transfer it.

Note: For information about block mode, see Block Mode Settings in the 3DS Programming
Manual: Advanced Graphics.

Code 4-21. Function for Adding an Image Transfer Command Request

void nngxTransferLinearImage(const GLvoid* srcaddr, GLuint dstid,
                             GLenum target);

For srcaddr, specify the starting address of the image to transfer. The image must have the same
format, width, and height as the render buffer or texture to which it is transferred. However, the
source pixel format must be 32-bit when the target pixel format is 24-bit because the hardware
does not support transfers between 24-bit pixel formats. In this case, for each 4 bytes that are
transferred, the first byte (the internal format's alpha component) is truncated.

The image is transferred to the render buffer or texture that has the object ID specified by dstid
and the object type specified by target.

Table 4-20. Values to Specify for target and dstid

When target is:                            Set dstid to:
GL_RENDERBUFFER                            The object ID of a render buffer. If a value of 0 is
                                           specified, data is transferred to the color buffer that is
                                           attached to the current framebuffer.
GL_TEXTURE_2D                              The object ID of a 2D texture.
GL_TEXTURE_CUBE_MAP_POSITIVE_X{,Y,Z},      The object ID of a cube map texture.
GL_TEXTURE_CUBE_MAP_NEGATIVE_X{,Y,Z}

The width and height of the target render buffer must be multiples of 8 in block-8 mode, or
multiples of 32 in block-32 mode. Both the width and height must be at least 128.

Table 4-21. Errors Generated by the nngxTransferLinearImage() Function

Error Code            Cause
GL_ERROR_805B_DMP     Called when the bound command list's object name is 0.
GL_ERROR_805C_DMP     The maximum number of command requests has already accumulated.
GL_ERROR_805D_DMP     The 3D command buffer is full because of commands added by this function.
GL_ERROR_805E_DMP     The render buffer or texture specified for dstid does not exist, or it does not
                      have an allocated memory region.
GL_ERROR_805F_DMP     There is a violation of the width and height restrictions for the target render
                      buffer.
GL_ERROR_8060_DMP     An invalid value was specified for target.
GL_ERROR_8067_DMP     The target render buffer or texture does not use 32-bit, 24-bit, or 16-bit pixel
                      sizes.
4.1.16. Adding a Block-to-Linear Image Conversion and Transfer Command Request

A command request for converting a block image to a linear image and transferring the result can
be added to the command list by calling the nngxAddB2LTransferCommand() function. (This is
one kind of post-filter command request.) Although the nngxTransferRenderImage() function
provides the same functionality, the nngxAddB2LTransferCommand() function is more versatile.
They also differ in that the latter function adds only a transfer request command and does not add a
split command.

Code 4-22. Function for Adding a Command Request for Converting From a Block Image to a Linear Image
and Transferring

void nngxAddB2LTransferCommand(
    const GLvoid* srcaddr, GLsizei srcwidth, GLsizei srcheight, GLenum srcformat,
    GLvoid* dstaddr, GLsizei dstwidth, GLsizei dstheight, GLenum dstformat,
    GLenum aamode, GLboolean yflip, GLsizei blocksize);

The srcaddr parameter specifies the transfer source (block image) address. The dstaddr
parameter specifies the transfer destination (linear image) address. Both srcaddr and dstaddr
must be 16-byte aligned.

The srcwidth, srcheight, dstwidth, and dstheight parameters specify the transfer source
image width and height and transfer destination width and height, in pixels. The height and width of
the source image and destination image must be a multiple of the block size (8 or 32). Finally, if the
pixel size of the destination image is 24 bits and the block size is 8, the width of the source image
and width of the destination image must be a multiple of 16. If 0 is specified for srcwidth,
srcheight, dstwidth, or dstheight, the command is not issued. The height and width of the
destination image in pixels must be equal to, or less than, that of the source image.

The height and width of the source and destination images, as measured in pixels, must be at least
as big as the minimum allowed. The minimum height and width for source images is 128. The
minimum height and width for destination images depends on the anti-alias setting. If anti-aliasing
is disabled, the minimum for both height and width is 128. If 2x1 anti-aliasing is enabled, the height
minimum is 128 and the width minimum is 64. If 2x2 anti-aliasing is enabled, the minimum for both
height and width is 64.

The srcformat and dstformat parameters specify the pixel format of the source and destination
image. The five types of pixel formats that can be specified are listed in the following table.

Table 4-22. Pixel Format Specifications

Definition       Bits   Description of Format
GL_RGBA4         16     The R, G, B, and alpha components are 4 bits each.
GL_RGB5_A1       16     The R, G, and B components are 5 bits each, and the alpha component is 1 bit.
GL_RGB565        16     5-bit R and B components and a 6-bit G component. No alpha component.
GL_RGB8_OES      24     8-bit R, G, and B components. No alpha component.
GL_RGBA8_OES     32     The R, G, B, and alpha components are 8 bits each.

Conversion to a pixel format with a higher pixel depth is not supported. For example, you cannot
convert from a 24-bit format to a 32-bit format, or from a 16-bit format to the 24-bit or 32-bit format.

aamode specifies the anti-alias filter mode. The three modes that can be specified are listed in the
following table. The widths and heights indicate the minimum dimensions of the source image
relative to the destination image.

Table 4-23. Anti-Aliasing Specifications

Definition Anti-Aliasing Width Height

NN_GX_ANTIALIASE_NOT_USED No anti-aliasing. Equal Equal


NN_GX_ANTIALIASE_2x1 Transferred using 2x1 anti-aliasing. 2 times Equal

NN_GX_ANTIALIASE_2x2 Transferred using 2x2 anti-aliasing. 2 times 2 times

yflip specifies whether vertical flipping is enabled during image transfer. Flipping is performed if
GL_TRUE (or a value other than 0) is specified. Flipping is not performed if GL_FALSE (or 0) is
specified.

For blocksize, specify the block size used for the transfer source image (8 or 32).

Table 4-24. Errors Generated by the nngxAddB2LTransferCommand() Function

Error Code            Cause
GL_ERROR_807C_DMP     A command list with object name 0 was bound, or there is no space in the
                      command request queue.
GL_ERROR_807D_DMP     Either srcaddr or dstaddr is not 16-byte aligned.
GL_ERROR_807E_DMP     A value other than 8 or 32 is specified in blocksize.
GL_ERROR_807F_DMP     An invalid value is specified in aamode.
GL_ERROR_8080_DMP     An invalid value is specified in either srcformat or dstformat.
GL_ERROR_8081_DMP     The pixel size of srcformat is greater than that of dstformat.
GL_ERROR_8082_DMP     An invalid value is specified for srcwidth, srcheight, dstwidth, or dstheight.
GL_ERROR_8083_DMP     The specified width or height of the destination image is greater than the
                      width or height in pixels of the source image.
GL_ERROR_80B7_DMP     The specified height or width of the source image was smaller than the minimum.
GL_ERROR_80B8_DMP     The specified height or width of the destination image was smaller than the
                      minimum.

4.1.17. Adding a Linear-to-Block Image Conversion and Transfer Command Request

A command for converting from a linear image to a block image and then transferring the result can
be added to the command list by calling the nngxAddL2BTransferCommand() function. (This is
one kind of post-filter command request.) The nngxTransferLinearImage() function also
provides the same functionality, but the nngxAddL2BTransferCommand() function is more
versatile. They also differ in that the latter function adds only a transfer request command and does
not add a split command.

Code 4-23. Function for Adding a Command Request for Converting From a Linear Image to a Block Image
and Transferring

void nngxAddL2BTransferCommand(
const GLvoid* srcaddr, GLvoid* dstaddr,
GLsizei width, GLsizei height, GLenum format, GLsizei blocksize);
srcaddr specifies the transfer source (linear image) address. dstaddr specifies the transfer
destination (block image) address. Both srcaddr and dstaddr must be 16-byte aligned.

width and height specify the height and width, in pixels, of the transfer source and transfer
destination images. The transfer source and transfer destination images must have the same width
and height, and each dimension must be 128 or greater and a multiple of the block size (8 or 32).
Finally, if the bit depth of the source image is 24 bits, the image width must be a multiple of 32,
even if the block size is 8. The command is not added if 0 is specified for either width or height.

format specifies the pixel format of the image being transferred. The specifiable pixel format is the
same as that for the nngxAddB2LTransferCommand() function (Table 4-22). The source and
destination images must have the same pixel format. Note, however, that if the format is 24-bit, the
source image must be in 32-bit format because hardware does not support 24-bit to 24-bit transfer.
In this case, the last byte of every 4 bytes of source data is thrown away.

The blocksize parameter specifies the block size of the source image as either 8 or 32.
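
As a sketch, queuing the conversion of a 256×256 linear RGBA8 image into a block-format region
might look like the following (the buffer pointers are assumed to be 16-byte aligned and sized
appropriately):

static void QueueLinearToBlockTransfer(const GLvoid* pSrcLinear, GLvoid* pDstBlock)
{
    nngxAddL2BTransferCommand(pSrcLinear, pDstBlock,
                              256, 256,        // width and height: >= 128 and a multiple of the block size
                              GL_RGBA8_OES,    // same pixel format for source and destination
                              8);              // block size: 8 or 32
}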

Table 4-25. Errors Generated by the nngxAddL2BTransferCommand() Function

Error Code            Cause
GL_ERROR_806F_DMP     A command list with object name 0 was bound or there is no space in the command
                      request queue.
GL_ERROR_8070_DMP     Either srcaddr or dstaddr is not 16-byte aligned.
GL_ERROR_8071_DMP     A value other than 8 or 32 is specified in blocksize.
GL_ERROR_8072_DMP     An invalid value is specified in either width or height.
GL_ERROR_8073_DMP     An invalid value is specified in format.

4.1.18. Adding a Block Image Transfer Command Request

A command request for transferring a block image is added to the command list by calling the
nngxAddBlockImageCopyCommand() function. The added command request allows you to copy
graphics between textures and render buffers that contain rendered images. Because transfer is
performed by specifying a combination of transfer size and skip size, you can clip part of the source
image region or paste to part of the destination image region. The main purpose of this function is
to transfer block format images. It can be used for transfer of various types of data because it does
not perform format conversion.

Code 4-24. Function for Adding a Block Image Transfer Command Request

void nngxAddBlockImageCopyCommand(
const GLvoid* srcaddr, GLsizei srcunit, GLsizei srcinterval,
GLvoid* dstaddr, GLsizei dstunit, GLsizei dstinterval,
GLsizei totalsize);

Use the srcaddr parameter to specify the transfer source start address. dstaddr specifies the
transfer destination start address. Both srcaddr and dstaddr must be 16-byte aligned.

totalsize specifies the total amount of data to be transferred, in bytes, and must be a multiple
of 16.

srcunit and srcinterval specify the unit size used for reading each transfer and the skip size,
respectively. srcunit bytes of data are transferred, and then srcinterval bytes in the address
being read are skipped, repeating alternately. Transfer ends when the amount of data transferred
reaches totalsize. If srcinterval is 0, memory is read continuously from the start address
until totalsize is reached. If srcinterval is any value other than 0, srcunit bytes of data are
read and then srcinterval bytes are skipped, repeatedly. This operation allows part of the
source image to be clipped.

dstunit specifies the write unit size of the transfer destination, and dstinterval specifies the
skip size, in bytes. dstunit bytes of data are written and dstinterval bytes in the address
being written are skipped, repeating alternately. Transfer ends when the amount of data transferred
reaches totalsize. If dstinterval is 0, memory is written continuously from the start address
until totalsize is reached. If dstinterval is any value other than 0, writing and skipping are
repeated, allowing the image to be inserted into a portion of the memory region for the transfer
destination image.

Figure 4-3. Sample Block Image Transfer

The srcunit, srcinterval, dstunit, and dstinterval parameters must be multiples of 16.
Negative values and values greater than or equal to 0x100000 cannot be specified.

When transferring rendering results, such as block images, note that the start address of the
transfer image (at both the source and destination) is normally the upper-left corner of the image
(or the lower-left corner in OpenGL ES), and that data is arranged in block units of 8×8 pixels when
using a format with a block size of 8. For more information about the block format, see 7.10. Native
PICA Format.
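
For example, clipping the left half of a 256×256 block-8 RGBA8 image into a 128×256 destination
could be set up as in the following sketch. It assumes the common block-8 layout in which one row
of 8×8 blocks spanning the full image width is stored contiguously; verify the layout against
7.10. Native PICA Format before relying on it.

static void QueueClipLeftHalf(const GLvoid* src, GLvoid* dst)
{
    const GLsizei bpp       = 4;                                 // GL_RGBA8_OES
    const GLsizei srcWidth  = 256;
    const GLsizei clipWidth = 128;
    const GLsizei height    = 256;

    const GLsizei srcUnit   = clipWidth * 8 * bpp;               // one block row of the clipped region
    const GLsizei srcSkip   = (srcWidth - clipWidth) * 8 * bpp;  // rest of the source block row
    const GLsizei total     = clipWidth * height * bpp;          // all clipped pixels

    nngxAddBlockImageCopyCommand(src, srcUnit, srcSkip,
                                 dst, srcUnit, 0,                // write continuously at the destination
                                 total);
}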

Table 4-26. Errors Generated by the nngxAddBlockImageCopyCommand() Function

Error Code            Cause
GL_ERROR_8074_DMP     A command list with object name 0 was bound or there is no space in the command
                      request queue.
GL_ERROR_8075_DMP     Either srcaddr or dstaddr is not 16-byte aligned.
GL_ERROR_8076_DMP     totalsize is not a multiple of 16.
GL_ERROR_8077_DMP     An invalid value was specified in srcunit, srcinterval, dstunit, or dstinterval.

4.1.19. Adding a Memory Fill Command Request

A command request for filling the specified region of memory with the specified data can be added
to the command list by calling the nngxAddMemoryFillCommand() function. The command
request added by this function can be used for purposes such as clearing the color buffer or depth
buffer (stencil buffer). The glClear() function provides the same functionality, but this function is
more versatile. Two memory regions of different sizes can be cleared simultaneously by making
settings for two channels with independently specifiable parameters.

Code 4-25. Function for Adding a Memory Fill Command Request

void nngxAddMemoryFillCommand(
GLvoid* startaddr0, GLsizei size0, GLuint data0, GLsizei width0,
GLvoid* startaddr1, GLsizei size1, GLuint data1, GLsizei width1);

startaddr0, size0, data0, and width0 represent settings for Channel 0. startaddr1, size1,
data1, and width1 represent settings for Channel 1. Memory is filled simultaneously for both
Channel 0 and Channel 1. If the memory regions specified for Channel 0 and Channel 1 overlap, the
fill data that is ultimately applied to the overlapping part is undefined.

startaddr0 and startaddr1 specify the start addresses of the memory regions. Addresses must
be 16-byte aligned. If 0 is specified for an address, that channel is not used. If 0 is specified for
startaddr0, no error checking is performed for size0, data0, or width0. If 0 is specified for
startaddr1, no error checking is performed for size1, data1, or width1.

size0 and size1 specify the sizes of the memory regions, in bytes. Sizes must be multiples of 16.

data0 and data1 specify the fill pattern data. The specified fill pattern is repeatedly inserted into
the memory region until it is full.

width0 and width1 specify the bit width of the fill pattern. The values 16, 24, or 32 can be
specified for the bit width. If 16 is specified, the memory region is filled in 16-bit units using bits
[15:0] of the data. If 24 is specified, the memory region is filled in 24-bit units using bits [23:0] of
the data. If 32 is specified, the memory region is filled in 32-bit units using bits [31:0] of the data.
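For example, the following sketch clears a 400×240 GL_RGBA8_OES color buffer to opaque black on
Channel 0 and a 400×240 GL_DEPTH24_STENCIL8_EXT buffer to the maximum depth value and a stencil
value of 0 on Channel 1, using the fill-pattern bit assignments for these formats (see Table 4-27). The
names colorBuf and depthBuf are hypothetical 16-byte-aligned buffer addresses.

/* A minimal sketch: colorBuf and depthBuf are hypothetical 16-byte-aligned
   addresses of 400x240 buffers (384,000 bytes each, a multiple of 16). */
const GLsizei bufferSize = 400 * 240 * 4;

nngxAddMemoryFillCommand(
    colorBuf, bufferSize, 0x000000FF, 32,    /* Channel 0: R=0, G=0, B=0, A=255                */
    depthBuf, bufferSize, 0x00FFFFFF, 32);   /* Channel 1: stencil [31:24]=0, depth [23:0]=max */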

The following table provides fill pattern specifications (bit width and various parameter values)
according to the render buffer format being used.

Table 4-27. Fill Pattern by Render Buffer Format

Render Buffer Format (Bit Width)   Fill Pattern Components

GL_RGBA8_OES (32)                  R [31:24], G [23:16], B [15:8], A [7:0]; each component 0 through 255
GL_RGB8_OES (24)                   R [23:16], G [15:8], B [7:0]; each component 0 through 255
GL_RGBA4 (16)                      R [15:12], G [11:8], B [7:4], A [3:0]; each component 0 through 15
GL_RGB5_A1 (16)                    R [15:11], G [10:6], B [5:1], A [0:0]; R, G, and B 0 through 31; A 0 through 1
GL_RGB565 (16)                     R [15:11], G [10:5], B [4:0]; R and B 0 through 31; G 0 through 63
GL_DEPTH24_STENCIL8_EXT (32)       D [23:0], S [31:24]
GL_DEPTH_COMPONENT24_OES (24)      D [23:0]
GL_DEPTH_COMPONENT16 (16)          D [15:0]

Table 4-28. Errors Generated by the nngxAddMemoryFillCommand() Function

Error Code          Cause

GL_ERROR_8078_DMP   A command list with object name 0 was bound, or there is no space in the
                    command request queue.

GL_ERROR_8079_DMP   startaddr0 or startaddr1 is not 16-byte aligned.

GL_ERROR_807A_DMP   size0 or size1 is not a multiple of 16.

GL_ERROR_807B_DMP   An invalid value is specified in width0 or width1.

4.1.20. Moving the 3D Command Buffer Pointer

Call the nngxMoveCommandbufferPointer() function to move the 3D command buffer pointer
(the position in the 3D command buffer from which the 3D commands start running) of the currently
bound command list.

Code 4-26. Function for Moving the 3D Command Buffer Pointer

void nngxMoveCommandbufferPointer(GLint offset);

Specify the amount by which to move the pointer (in bytes) as the offset parameter.

A GL_ERROR_8061_DMP error occurs when no command list is bound, or this operation would move
the pointer outside of the 3D command buffer region.

4.1.21. Adding Jump Commands

Call the nngxAddJumpCommand() function to add to the currently bound command list a jump
command that executes a 3D command in the specified memory region. Use a jump command to
move execution to a different command list without causing any interrupts.

This function uses the command buffer execution PICA registers. Because it uses only channel 0, the
contents of two registers (0x0238 and 0x023A) are both written when this function is run. For more
information, see 8.8.23. Command Buffer Execution Registers (0x0238 – 0x023D) and [Link].
Consecutive Execution of Command Buffers in 3DS Programming Manual: Advanced Graphics.

Code 4-27. Function for Adding Jump Commands

void nngxAddJumpCommand(const GLvoid* bufferaddr, GLsizei buffersize);

In bufferaddr and buffersize, specify the address and size of the command buffer to move
execution to. Both bufferaddr and buffersize must be multiples of 16.

The content of the destination command buffer (the command list specified by bufferaddr and
buffersize) is not copied to the command buffer of the currently bound command list. A jump
command changes the execution address of a command buffer and directly executes the
destination command buffer. Consequently, the application must ensure that the jump destination
memory cache has been flushed.

The last command executed at the jump destination must be a split command (a command to write
to the split command setting register, added by the nngxSplitDrawCmdlist() function).
Alternatively, this command could be another jump command. When using multiple jump commands,
the last command in the last command buffer in the chain must be a split command.

This function adds a command request for a 3D execution command. A GL_ERROR_809A_DMP error
occurs when this function is called immediately after the command buffer has been flushed (for
example, by a call to the nngxFlush3DCommand() function) because doing so is meaningless. To
add a 3D command to the command buffer immediately after a flush, call the
nngxAdd3DCommand() function.
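For example, the following sketch executes a prebuilt command buffer through a jump. The names
subCmdBuffer and subCmdSize are hypothetical; the buffer is assumed to end with a split command
added by nngxSplitDrawCmdlist() and to have already been flushed from the CPU data cache.

/* A minimal sketch: jump to a prebuilt 3D command buffer without copying it.
   subCmdBuffer and subCmdSize are hypothetical; both must be multiples of 16. */
nngxAddJumpCommand(subCmdBuffer, subCmdSize);
/* This call also adds a command request for a 3D execution command, so the
   jump target runs when the command list is executed. */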

Table 4-29. Errors Generated by the nngxAddJumpCommand() Function

Error Code Cause

GL_ERROR_8096_DMP The bound command list’s object name is 0.


GL_ERROR_8097_DMP buffersize is 0 or less.

GL_ERROR_8098_DMP buffersize is not a multiple of 16.

GL_ERROR_8099_DMP bufferaddr is not a multiple of 16.

GL_ERROR_809A_DMP This function was called immediately after the command buffer was flushed.
GL_ERROR_809B_DMP The command request added by this function makes the queue overflow.

GL_ERROR_809C_DMP The command added by this function makes the command buffer overflow.

4.1.22. Adding Subroutine Commands

Call the nngxAddSubroutineCommand() function to add both a jump command to execute a 3D
command in the specified memory region and a command to set the address for returning to the
command buffer jumped from, to the currently bound command list. Use a subroutine command to
execute another command list without causing any interrupts, as if it were a subroutine.

This function uses the command buffer execution PICA registers. Because it uses all channels, the
contents of four registers (0x0238 through 0x023B) are written when this function is run. For more
information, see 8.8.23. Command Buffer Execution Registers (0x0238 – 0x023D) and [Link].
Consecutive Execution of Command Buffers in 3DS Programming Manual: Advanced Graphics.

Code 4-28. Function for Adding Subroutine Commands

void nngxAddSubroutineCommand(const GLvoid* bufferaddr, GLsizei buffersize);

In bufferaddr and buffersize, specify the address and size of the command buffer to move
execution to. Both bufferaddr and buffersize must be multiples of 16.

The content of the destination command buffer (the command list specified by bufferaddr and
buffersize) is not copied to the command buffer of the currently bound command list. A jump
command changes the execution address of a command buffer and directly executes the
destination command buffer. Consequently, the application must ensure that the jump destination
memory cache has been flushed.

The jump command is executed on channel 0, and the command to return to the command buffer
jumped from is executed on channel 1. Consequently, the last command executed at the jump
destination must be a kick command for channel 1 (a command to write to the command buffer
execution register 0x023D). Alternatively, this command could be a jump command to another
command buffer, but the channel used by the jump must not be channel 0, and the last command in
the last command buffer in the chain must be a kick command for channel 1. In addition, you must
not write to the channel 1 address setting registers (0x0239 and 0x023B). This function adds a
jump command (channel 0) and an address setting (channel 1). The application must place the
channel 1 kick command and the jump commands within the subroutine.

This function does not add a command request for a 3D execution command. After calling this
function, continue accumulating commands, and then execute them after flushing the command
buffer, such as by using the nngxFlush3DCommand() function. Values written to the channel 1
size setting register (0x023B) added by this function are undefined until the command buffer is
flushed. Operation is similarly undefined if you reuse the copied content of this register until the
command buffer is flushed.

Table 4-30. Errors Generated by the nngxAddSubroutineCommand() Function

Error Code Cause

GL_ERROR_809D_DMP The bound command list’s object name is 0.


GL_ERROR_809E_DMP buffersize is 0 or less.

GL_ERROR_809F_DMP buffersize is not a multiple of 16.

GL_ERROR_80A0_DMP bufferaddr is not a multiple of 16.

GL_ERROR_80A1_DMP The command added by this function makes the command buffer overflow.
4.2. Command Request Types

The following command requests are queued in a command list.

DMA Transfer Command Requests

These command requests use DMA transfers to send texture images and vertex buffers from main
memory into VRAM.

These command requests are queued by glTexImage2D() and other functions that allocate texture
regions, and by glBufferData() and other functions that allocate vertex buffer regions.

Render Command Requests

These command requests execute a single command set of 3D commands accumulated in the 3D
command buffer.

When glClear(), glTexImage2D(), and other functions are called, they write a buffer loading
complete 3D command and then queue the accumulated 3D command buffer as a single render
command request.

The nngxSplitDrawCmdlist() function allows you to queue render command requests at any time.

Memory-Fill Command Requests

These command requests use the GPU memory-fill feature to clear a region allocated in VRAM using
a specified data pattern.

These command requests specify a render buffer and are queued when the glClear() function is
called. The glClear() function also requires a 3D command other than a memory-fill command
request to be executed. In other words, when the glClear() function is called, it first writes 3D
commands for the glClear() function and a buffer loading complete 3D command, and then it
queues a render command request and a memory-fill command request.

Post-Transfer Command Requests

These command requests use the GPU post-filter feature to convert images rendered in PICA block
format into a linear format that can be read by the LCDs.

These command requests are queued when the nngxTransferRenderImage() function is called. If
the nngxSplitDrawCmdlist() function has not been called in advance to stop reading from the 3D
command buffer, these command requests are queued after a buffer loading complete command is
written and a render command request is queued.

Copy Texture Command Requests

These command requests copy GPU rendering results into memory as texture images.

These command requests are queued when glCopyTexImage2D() or glCopyTexSubImage2D() is
called.

If the nngxSplitDrawCmdlist() function has not been called in advance to stop reading from the
3D command buffer, these command requests are queued after a buffer loading complete command is
written and a render command request is queued.
4.3. Methods for Optimizing 3D Command Buffer Performance

The following information describes methods for optimizing performance during 3D command buffer
execution.

4.3.1. Changes in Load Speed due to Address and Size

The address and size of a 3D command buffer can have an effect on load speed at run time.

There are two types of command buffer execution: executing 3D execution commands queued in a
command request, and executing the command buffer execution register.

When executing 3D execution commands, execution is affected by the size from the address
immediately after a split command added by nngxFlush3DCommand() or
nngxSplitDrawCmdlist(), up to the next split command added. You can get the address of 3D
commands being accumulated in the 3D command buffer by calling the
nngxGetCmdlistParameteri() function and passing NN_GX_CMDLIST_CURRENT_BUFADDR for
pname.

When executing using the command buffer execution register, execution is affected by the address
and size of the following command buffers: added by nngxAddJumpCommand(), added as
subroutines by nngxAddSubroutineCommand(), or executed to return from a subroutine to the
calling location.

If the 3D command buffer address is 128-byte aligned, and the size is a multiple of 256 bytes (256,
512, 768, and so on), transfer speed may be faster.

If the 3D command buffer address is not 128-byte aligned and the size starting from the previous
128-byte aligned address to the end of the 3D command buffer is a multiple of 256, speed may be
increased. For example, if the 3D command buffer address and size are 0x20000010 and 0x1F0
respectively, the preceding 128-byte aligned address is only 0x10 earlier, at 0x20000000. The
distance from there to the end is 0x1F0 + 0x10, which is 0x200 (and a multiple of 256).

Although the address and size of the 3D command buffer can influence loading speed as just
described because of the way the GPU is implemented, you may not see large benefits due to
factors such as the location of the buffer, the content of 3D commands, and memory access
conflicts with other modules.

4.3.2. Using Subroutine Execution

It may be possible to improve performance by using 3D command buffer subroutine execution.

[Link]. Overview

3D command buffer subroutine execution uses the command buffer execution register for
execution. In contrast to the ordinary method of storing 3D commands in a sequence of 3D
command buffers and executing them, a command buffer stored in a different location is executed
by using the command buffer address jump feature. This method is called command buffer
subroutine execution because of the control flow it performs: an address jump that specifies the
address of a 3D command buffer, execution of the 3D command buffer at that location, and a
return to the calling location.
For more information about using command buffer subroutine execution, see 4.1.22. Adding
Subroutine Commands and Command Buffer Execution Registers in the 3DS Programming
Manual: Advanced Graphics.

[Link]. Effect on Behavior

Command buffer subroutine execution has the following advantages.

Only a jump command to the subroutine command buffer needs to be stored, eliminating the
CPU processing needed to copy the 3D commands. The technique is effective for tasks that
are quite large and configured frequently, such as loading reference table data or shader
programs.

The subroutine command buffer is not copied to the current 3D command buffer, but is
referenced directly by the GPU, allowing the total size of the command buffer to be reduced.

If the subroutine command buffer is stored in VRAM, GPU access to the command buffer is
faster than if it is in main memory (device memory). If memory access to the command buffer
is a performance bottleneck, this technique could improve overall system processing speed.

On the other hand, it has the following disadvantage.

Switching the address due to a jump command incurs memory access overhead. If the
granularity of subroutines in the implementation is small and they are called frequently, a
decrease in GPU processing speed could result.

The effect of converting to subroutines on processing performance is heavily influenced by issues
such as memory access conflicts, so it is strongly dependent on the actual implementation of the
application.

[Link]. Storage Location

Command buffer access speed is faster in VRAM than in main memory (device memory), so we
recommend storing subroutine command buffers in VRAM.

There is some memory access overhead when executing a subroutine command buffer using a
jump command, but if the executed command buffer is stored in VRAM, this overhead is
decreased.

To store a command buffer in VRAM, it must first be generated in device memory and then
transferred to VRAM by DMA using nngxAddVramDmaCommand(). For information about DMA
transfers to VRAM, see 4.1.13. Adding a DMA Transfer Command Request.

[Link]. Balance Between Execution and Access Processes

Depending on the content of subroutine command buffers, the processing bottleneck could move
between accessing and executing 3D commands.

If the 3D command is the register write command of the rasterization module or a later module
(including the rasterization module), each 3D command requires 2 cycles to process, so it is
relatively processor-intensive. When 3D commands are composed of burst commands, execution
is even more processor-intensive relative to access processing. In this case, the bottleneck is in
command execution, and the processing cost of memory access due to conversion to subroutines
is hidden.

If the 3D command is the register write command of a module before the rasterization module
(not including the rasterization module), each 3D command requires only one cycle to process, so
the processing load is light relative to the commands discussed in the previous paragraph. In this
case, the bottleneck is more likely to be access processing, and the memory access processing
cost incurred by conversion to subroutines is more likely to affect the overall performance.

For information about the relative positioning of each module, see 2.2. Rendering Pipeline.


5. Shader Programs

Shader programs can customize the 3D graphics pipeline and control various graphics effects on the
3DS system.

There are three types of 3DS shader programs: one processes vertices, one creates geometry, and one
processes fragments.

The shader program that processes vertices (the vertex shader) can be a unique shader programmed
by developers.

The shader program that creates geometry (the geometry shader) is provided by the SDK. Geometry
shaders and vertex shaders can be used in conjunction with each other.

The shader program that processes fragments (the fragment shader) is not programmable. Fragment
processing is implemented as a fixed pipeline, but it can be controlled through reserved uniforms. This
document uses the terms reserved fragment processing and reserved fragment shader to refer to the
fragment pipeline and shader program, respectively.

5.1. Creating Shaders

Vertex shaders are the only shader programs that can be created by developers. The series of
procedures related to vertex processing follow the OpenGL ES 2.0 specifications, but some features
are not supported.

Shader programs are written in an assembly language that is unique to the PICA graphics core. For
an application to use a shader program, it must load and then attach a binary generated by a special-
purpose assembler and linker. The glShaderSource() and glCompileShader() functions in the
OpenGL ES 2.0 specifications are not implemented.

For more information about how to create a vertex shader, see 8. Vertex Shader and the Vertex
Shader Reference.

5.2. Loading Shaders

As explained in 5.1. Creating Shaders, you must load and then attach a shader program's binary
data. Use the glCreateShader() function first to create a shader object.

Code 5-1. Definition of the glCreateShader Function

GLuint glCreateShader(GLenum type);

The value to pass as type depends on the shader program being loaded. The reserved fragment
shader uses a fixed implementation and does not need to be loaded.

Table 5-1. Shader Object Types

Type Generated Object

GL_VERTEX_SHADER A developer-created vertex shader object.


GL_GEOMETRY_SHADER_DMP A geometry shader object provided by the SDK.

Load the shader program binary into memory and then bind it to the GPU with the
glShaderBinary() function.

Code 5-2. Definition of the glShaderBinary Function

void glShaderBinary(GLint n, const GLuint* shaders, GLenum binaryformat,
                    const void* binary, GLint length);

For shaders, specify an array of shader objects, and for n, specify the number of array elements.
Because you can only load a binary that was created by the assembler and linker, set
binaryformat to GL_PLATFORM_BINARY_DMP. Set binary to the address at which the shader
program binary was loaded and set length to the binary's size (in bytes).

The loaded shader programs are bound to the array in the order that they were passed to the linker.
You can use the map file output by the linker to check the number of elements in the array and the
type of shader objects to set. For more information, see the Vertex Shader Reference.
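For example, the following sketch loads a binary that is assumed to have been linked from one vertex
shader followed by one geometry shader. The names shaderBinary and shaderBinarySize are
hypothetical names for the loaded binary and its size in bytes.

/* A minimal sketch, assuming the binary was linked from a vertex shader
   followed by a geometry shader. */
GLuint shaders[2];
shaders[0] = glCreateShader(GL_VERTEX_SHADER);
shaders[1] = glCreateShader(GL_GEOMETRY_SHADER_DMP);

glShaderBinary(2, shaders, GL_PLATFORM_BINARY_DMP, shaderBinary, shaderBinarySize);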

5.3. Attaching Shaders

To use a loaded shader program, an application must attach the shader program to a program object
created by glCreateProgram, and then link that program object.

Code 5-3. Definition of the glCreateProgram Function

GLuint glCreateProgram(void);

Unlike in OpenGL ES 2.0, program objects have a 13-bit namespace that is independent of shader
objects. Consequently, up to 8191 objects can be created simultaneously. If any of these objects are
deleted by glDeleteProgram, they can be re-created.

Use the glAttachShader() function to attach a shader program to a program object.

Code 5-4. Definition of the glAttachShader Function

void glAttachShader(GLuint program, GLuint shader);


Set program to the return value from glCreateProgram() and set shader to the shader object.

Both the vertex shader and geometry shader are attached from loaded binaries. The reserved
fragment shader, on the other hand, does not need to be loaded and is attached by setting shader to
GL_DMP_FRAGMENT_SHADER_DMP.

You can attach one vertex shader, one geometry shader, and one reserved fragment shader to a
single program object. In other words, if a point shader and line shader (both geometry shaders) are
attached one after the other, only the line shader is activated.

Use the glLinkProgram() function to link a program object.

Code 5-5. Definition of the glLinkProgram Function

void glLinkProgram(GLuint program);

You can link more than one program object. However, if you use glAttachShader to attach a
different shader program to a linked program object, you must use glLinkProgram to relink the
program object. The program object fails to be linked when the vertex shader and geometry shader
use a total of more than 2048 uniforms.
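For example, the following sketch builds a program object from a loaded vertex shader object (the
name vertexShader is hypothetical) and the reserved fragment shader.

/* A minimal sketch: vertexShader is a hypothetical shader object that has
   already been loaded with glShaderBinary(). */
GLuint program = glCreateProgram();
glAttachShader(program, vertexShader);
glAttachShader(program, GL_DMP_FRAGMENT_SHADER_DMP);   /* reserved fragment shader */
glLinkProgram(program);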

5.4. Using Shaders

Call the glUseProgram() function to apply a linked shader program to the 3D processing pipeline.

Code 5-6. Definition of the glUseProgram Function

void glUseProgram(GLuint program);

This function allows you to switch between several linked shader programs.

OpenGL allowed you to call the glValidateProgram() function to validate a shader program, but
this function does nothing when it is called on a 3DS system.

Code 5-7. Definition of the glValidateProgram Function

void glValidateProgram(GLuint program);

5.5. Detaching Shaders

You can use the glDetachShader() function to detach shader programs that are no longer
necessary.

Code 5-8. Definition of the glDetachShader Function

void glDetachShader(GLuint program, GLuint shader);


5.6. Destroying Shaders

You can use the glDeleteShader() function to destroy shader objects that are no longer
necessary.

Code 5-9. Definition of the glDeleteShader Function

void glDeleteShader(GLuint shader);

5.7. Querying Shaders

You can query program objects and shader objects to determine whether they are valid or invalid, and
to get parameters and other shader-related information.

5.7.1. Validation

You can use the glIsProgram() and glIsShader() functions to determine whether program
objects and shader objects are valid.

Code 5-10. Definition of the glIsProgram and glIsShader Functions

GLboolean glIsProgram(GLuint program);


GLboolean glIsShader(GLuint shader);

These functions return a value of GL_TRUE if the program object or shader object passed as an
argument is valid and GL_FALSE if it is not.

5.7.2. Getting Attached Shader Objects

You can use the glGetAttachedShaders() function to get the shader objects attached to a
program object.

Code 5-11. Definition of the glGetAttachedShaders Function

void glGetAttachedShaders(GLuint program, GLsizei maxcount, GLsizei* count,
                          GLuint* shaders);

A list of shader objects attached to the program object specified by program is stored in the array
specified by shaders. For maxcount, specify the size of the array specified in shaders. A
GL_INVALID_VALUE error is generated if program is invalid or maxcount is negative.

The count parameter holds the number of shader objects that were saved. This value is not saved
if NULL is specified, but you can get the number of attached shader objects by calling the
glGetProgramiv() function (described later) with GL_ATTACHED_SHADERS passed as an
argument.
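For example, the following sketch (using a hypothetical program object name program) first queries
the number of attached shader objects and then retrieves them.

/* A minimal sketch: program is a hypothetical linked program object. */
GLint attachedCount = 0;
glGetProgramiv(program, GL_ATTACHED_SHADERS, &attachedCount);

GLuint shaders[3];   /* at most one vertex, one geometry, and one reserved fragment shader */
GLsizei count = 0;
glGetAttachedShaders(program, 3, &count, shaders);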
5.7.3. Getting Program Object and Shader Object Parameters

You can use the glGetProgramiv() and glGetShaderiv() functions to get parameters for
program objects and shader objects.

Code 5-12. Definition of the glGetProgramiv and glGetShaderiv Functions

void glGetProgramiv(GLuint program, GLenum pname, GLint* params);


void glGetShaderiv(GLuint shader, GLenum pname, GLint* params);

These functions store parameter values in params that correspond to the parameter name
specified by pname. A GL_INVALID_ENUM error is generated if an invalid value is specified for
pname. A GL_INVALID_VALUE error is generated if an invalid value is specified for program or
shader.

The following table shows the parameter names that can be specified for pname and the
parameters that are stored in params for the glGetProgramiv() function.

Table 5-2. Specifiable Parameter Names and Values Saved for the glGetProgramiv Function

pname                              Values Saved in params

GL_DELETE_STATUS                   GL_TRUE if the program object is in the "waiting to be deleted"
                                   state, and GL_FALSE otherwise. A program object in use by
                                   glUseProgram transitions to the "waiting to be deleted" state
                                   when glDeleteProgram is called on it.

GL_LINK_STATUS                     GL_TRUE if glLinkProgram has successfully linked the program
                                   object, and GL_FALSE otherwise.

GL_VALIDATE_STATUS                 The same value as when GL_LINK_STATUS is specified.

GL_INFO_LOG_LENGTH                 Always 0.

GL_ATTACHED_SHADERS                The number of shader objects attached to the program object.

GL_ACTIVE_ATTRIBUTES               The number of vertex attributes in the active state.

GL_ACTIVE_ATTRIBUTE_MAX_LENGTH     The number of characters in the longest name of the active
                                   vertex attributes. The character count includes the terminating
                                   character (NULL).

GL_ACTIVE_UNIFORMS                 The number of uniforms in the active state.

GL_ACTIVE_UNIFORM_MAX_LENGTH       The number of characters in the longest name of the active
                                   uniforms. The character count includes the terminating
                                   character (NULL).

The following table shows the parameter names that can be specified for pname, and the
parameters that are stored in params for the glGetShaderiv() function.

Table 5-3. Specifiable Parameter Names and Values Saved for the glGetShaderiv Function

pname                      Values Saved in params

GL_SHADER_TYPE             The shader type.

GL_DELETE_STATUS           GL_TRUE if the shader object is in the "waiting to be deleted" state, and
                           GL_FALSE otherwise. The shader object attached to a program object
                           transitions to the "waiting to be deleted" state when glDeleteShader is
                           called on it.

GL_COMPILE_STATUS          Always GL_FALSE.

GL_INFO_LOG_LENGTH         Always 0.

GL_SHADER_SOURCE_LENGTH    Always 0.

6. Vertex Buffers

Vertex buffers store vertex coordinates, colors, and indices, in addition to texture coordinates and other
information. Use vertex buffers to process certain kinds of models, such as those with a large number
of vertices, in the vertex shader. If you do not use vertex buffers, heavy CPU processing (such as
vertex array sorting) could cause a considerable drop in performance.

6.1. Creating Objects

Use the glGenBuffers() function to create the buffer objects.

Code 6-1. Definition of the glGenBuffers Function

void glGenBuffers(GLsizei n, GLuint* buffers);

This code creates n buffer objects and stores their object names in buffers.

6.2. Specifying Objects

Specify buffer objects to bind as vertex buffers with the glBindBuffer() function. After this
function is called, processing for each type of vertex buffer is run on the specified buffer objects.

Code 6-2. Definition of the glBindBuffer Function

void glBindBuffer(GLenum target, GLuint buffer);

Set target to the vertex buffer type. buffer specifies a buffer object created by the
glGenBuffers() function. If an object name that has not yet been created is specified for buffer,
a buffer object is created for that name.

Table 6-1. Vertex Buffer Types

target Value                     Vertex Buffer Types

GL_ARRAY_BUFFER                  Buffer for vertex coordinates, vertex colors, normals, and so on.

GL_ELEMENT_ARRAY_BUFFER          Index buffer used by the glDrawElements() function.

GL_VERTEX_STATE_COLLECTION_DMP   Vertex state collection (see 6.6. Vertex State Collections).
6.3. Allocating Buffers

Use the glBufferData() function to allocate a buffer region, and then load vertex data.

Code 6-3. Definition of the glBufferData Function

void glBufferData(GLenum target, GLsizeiptr size, const void* data,
                  GLenum usage);

For target specify the same value specified in the glBindBuffer() function (Table 6-1).

For data and size, specify the vertex data to store and its size, respectively. When data is 0
(NULL) the region is simply allocated and no data is stored.

The value of usage must be GL_STATIC_DRAW.

Note: To configure the GPU access targets and processing to use when allocating the buffer
region, pass a bitwise OR of some specific flag values for target for the
glBufferData() function. For more information, see 3. Using Data Located in Main
Memory in 3DS Programming Manual: Advanced Graphics.

6.4. Rewriting Buffers

Use the glBufferSubData() function to rewrite part of the buffer that was allocated by
glBufferData.

Code 6-4. Definition of the glBufferSubData Function

void glBufferSubData(GLenum target, GLintptr offset, GLsizeiptr size,
                     const void* data);

For target specify the same value specified in the glBindBuffer() function (Table 6-1).

For offset, specify the offset to the section to rewrite.

For data and size, specify the data to write and its size, respectively.

6.5. Freeing Buffers

Use the glDeleteBuffers() function to destroy buffer objects that are no longer necessary.

Code 6-5. Definition of the glDeleteBuffers Function

void glDeleteBuffers(GLsizei n, const GLuint* buffers);

This destroys the buffer objects specified by the n object names in buffers.
6.6. Vertex State Collections

Vertex state collections are new to the Nintendo 3DS. A vertex state collection records which buffer
objects are bound as vertex buffers, together with the vertex attribute settings, so that the whole
group of bindings and settings can be applied again in a single operation.

Vertex state collections share a namespace with buffer objects and can be created, specified, and
destroyed using glGenBuffers, glBindBuffer, and glDeleteBuffers, respectively.

6.6.1. Creating Vertex State Collections

A vertex state collection operates as a special buffer object. Use the glGenBuffers() function to
generate objects that can be used as vertex state collections, just as you would for buffer objects.

6.6.2. Specifying Vertex State Collections

Use the glBindBuffer() function to specify a buffer object to use as a vertex state collection.
For target, specify GL_VERTEX_STATE_COLLECTION_DMP. By default, the object with a name of
0 (with a value of 0 passed into buffer) is a vertex state collection.

After this function is called, the vertex state collection records which buffer objects are bound by
glBindBuffer as vertex buffers (using GL_ARRAY_BUFFER or GL_ELEMENT_ARRAY_BUFFER),
and also records which vertex attribute values are set by glEnableVertexAttribArray,
glDisableVertexAttribArray, glVertexAttrib{1234}{fv}, or glVertexAttribPointer.
When a buffer object is bound as a vertex buffer, it overwrites any other existing bindings to the
same vertex buffer target. Settings continue to be recorded in a vertex state collection until it is
switched.

When the vertex state collection is switched, all buffer objects recorded in the new vertex state
collection are bound as the new vertex buffers.
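For example, the following sketch records a set of bindings in a vertex state collection and later
restores them with a single bind. The object names stateID, arrayID, and indexID are hypothetical
names created with glGenBuffers().

/* A minimal sketch with hypothetical object names. */
glBindBuffer(GL_VERTEX_STATE_COLLECTION_DMP, stateID);   /* start recording into stateID */

glBindBuffer(GL_ARRAY_BUFFER, arrayID);                  /* recorded in the collection */
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, indexID);          /* recorded in the collection */
glEnableVertexAttribArray(0);                            /* recorded in the collection */
glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, 0, 0);   /* recorded in the collection */

glBindBuffer(GL_VERTEX_STATE_COLLECTION_DMP, 0);         /* switch to the default collection */

/* Later, switching back to stateID rebinds all of the recorded vertex buffers
   and vertex attribute settings in one operation. */
glBindBuffer(GL_VERTEX_STATE_COLLECTION_DMP, stateID);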

6.6.3. Destroying Vertex State Collections

As with a buffer object, you can use the glDeleteBuffers() function to destroy vertex state
collections. Even if a vertex state collection is destroyed, it does not affect the binding between the
vertex buffer target and the buffer objects recorded in that vertex state collection.

The glDeleteBuffers() function does not immediately destroy a vertex state collection that is in
use. A vertex state collection remains in use until it is switched with another one. You cannot
destroy the default vertex state collection. Calls to glDeleteBuffers on the default vertex state
collection are ignored.

6.7. Sample Vertex Buffer Usage

The following code samples show how to render a single triangle using a vertex buffer and the
glDrawElements() function. There are three steps: defining the arrays, allocating the buffers, and
rendering the triangle.

Code 6-6. Defining the Arrays

GLuint triIndexID, triArrayID;


GLushort triIndex[1 * 3] = { 0, 1, 2 };
GLfloat triVertex[3 * 4] =
{
0.5f, 0.0f, 0.0f, 1.0f,
-0.5f, 0.5f, 0.0f, 1.0f,
-0.5f, -0.5f, 0.0f, 1.0f
};
GLfloat triVertexColor[3 * 3] =
{
1.0f, 0.0f, 0.0f,
0.0f, 1.0f, 0.0f,
0.0f, 0.0f, 1.0f
};

Code 6-7. Allocating the Buffers

glGenBuffers(1, &triArrayID);
glBindBuffer(GL_ARRAY_BUFFER, triArrayID);
glBufferData(GL_ARRAY_BUFFER, sizeof(triVertex) + sizeof(triVertexColor), 0,
GL_STATIC_DRAW);
glBufferSubData(GL_ARRAY_BUFFER, 0, sizeof(triVertex), triVertex);
glBufferSubData(GL_ARRAY_BUFFER, sizeof(triVertex), sizeof(triVertexColor),
triVertexColor);
glGenBuffers(1, &triIndexID);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, triIndexID);
glBufferData(GL_ELEMENT_ARRAY_BUFFER, 3 * sizeof(GLushort), triIndex,
GL_STATIC_DRAW);

Code 6-8. Rendering With glDrawElements

glBindBuffer(GL_ARRAY_BUFFER, triArrayID);
glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, 0, 0);
glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, 0, (GLvoid*)sizeof(triVertex));

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, triIndexID);
glDrawElements(GL_TRIANGLES, 3, GL_UNSIGNED_SHORT, 0);

6.8. Restrictions on Vertex Data Placement

The following hardware restrictions apply to the vertex data placement configured by the
glVertexAttribPointer() function when rendering using vertex data stored in a vertex buffer. A
GL_INVALID_OPERATION error is generated when the glDrawArrays() or glDrawElements()
function is called in violation of these restrictions.

All vertex data must be aligned to its own data type's size.
The stride for all vertex data in a single structure must be a multiple of the size of that structure's
largest vertex data type.
If any padding (beyond the minimum amount required to meet the two aforementioned
restrictions) is inserted after the vertex data, it must start at the end of the vertex data and
continue until the second-closest 4-byte boundary.

The compiler automatically inserts padding, so there will be no conflicts with the first two restrictions
even if you do not pay attention to them when coding.

The following sample code would conflict with the last restriction, if it did not insert extraPadding2.

Code 6-9. Sample Code for Working Around Restrictions on Vertex Data Placement

struct tagVertex
{
    GLshort position[3];
    GLshort extraPadding1;
    GLshort extraPadding2[2];
    GLshort color[4];
};

You cannot mix vertex attributes and indices that use vertex buffers with attributes and indices that
do not use them in a single call to the glDrawArrays or glDrawElements() function. A
GL_INVALID_OPERATION error is generated if they are mixed together.

6.8.1. Restrictions Affecting Only glDrawElements

A GL_INVALID_OPERATION error is generated when all of the following conditions are met, due to
restrictions associated with the method of vertex array storage.

The vertex buffer is being used.


There are 12 vertex attributes being used.
All vertex attributes are being used as vertex arrays.
(glEnableVertexAttribArray has been called for all vertex attributes.)

Rendering is performed using glDrawElements.

When all these conditions are present, at least two vertex attributes must be placed as interleaved
arrays. In other words, 12 independent arrays cannot be simultaneously used as vertex attributes.

Note: An interleaved array is a group made up of multiple vertex attributes defined as a
structure. An individual array made up of one vertex attribute is called an independent
array.


7. Textures

In addition to cube map textures and 2D textures that are pasted onto polygon models, the 3DS system
can handle shadow textures used for shadows, gas textures used for rendering gases, and lookup
tables used by the reserved fragment shader (these lookup tables cannot be used as textures).

This chapter explains the necessary procedures for using these textures and describes how the native
PICA format differs from the OpenGL specifications.
7.1. Creating Texture Objects

Use the glGenTextures() function to create texture objects to bind as textures.

Code 7-1. Definition of the glGenTextures Function

void glGenTextures(GLsizei n, GLuint* textures);

7.2. Specifying Texture Objects

Use the glBindTexture() function to specify a texture object to bind as a texture. After this
function is called, the various texture processes are performed using the specified texture object.
Texture images loaded for (and the results of processing performed on) a texture object are
preserved until the texture object is deleted. Consequently, you can switch texture objects to change
textures without reloading texture images.

Code 7-2. Definition of the glBindTexture Function

void glBindTexture(GLenum target, GLuint texture);

Specify the texture type for target, and the texture object to bind for texture. A
GL_INVALID_ENUM error is generated if you set target to a value that is not in the following table.

Table 7-1. Texture Types

target Value                       Type of Texture

GL_TEXTURE_2D                      2D textures, shadow textures, and gas textures.

GL_TEXTURE_CUBE_MAP_POSITIVE_X     Cube-map texture.
GL_TEXTURE_CUBE_MAP_NEGATIVE_X     Cube-map texture.
GL_TEXTURE_CUBE_MAP_POSITIVE_Y     Cube-map texture.
GL_TEXTURE_CUBE_MAP_NEGATIVE_Y     Cube-map texture.
GL_TEXTURE_CUBE_MAP_POSITIVE_Z     Cube-map texture.
GL_TEXTURE_CUBE_MAP_NEGATIVE_Z     Cube-map texture.

GL_LUT_TEXTUREi_DMP                Lookup tables (used by the reserved fragment shader), where i is
                                   a value between 0 and 31.

GL_TEXTURE_COLLECTION_DMP          Texture collections (see 7.9. Texture Collections).

7.3. Loading Texture Images

In addition to normal texture images (2D textures), the glTexImage2D() function loads the following
special textures: cube map textures, shadow textures, and gas textures. Partial texture loading with
the glTexSubImage2D() function is not supported.

Code 7-3. Definition of the glTexImage2D Function

void glTexImage2D(GLenum target, GLint level, GLenum internalformat,
                  GLsizei width, GLsizei height, GLint border, GLenum format,
                  GLenum type, const void* pixels);

Set target to one of the following values.

Table 7-2. target Values in glTexImage2D

target Value                       Texture Usage

GL_TEXTURE_2D                      2D textures, shadow textures, and gas textures.

GL_TEXTURE_CUBE_MAP_POSITIVE_X     Cube map +X plane textures.
GL_TEXTURE_CUBE_MAP_NEGATIVE_X     Cube map -X plane textures.
GL_TEXTURE_CUBE_MAP_POSITIVE_Y     Cube map +Y plane textures.
GL_TEXTURE_CUBE_MAP_NEGATIVE_Y     Cube map -Y plane textures.
GL_TEXTURE_CUBE_MAP_POSITIVE_Z     Cube map +Z plane textures.
GL_TEXTURE_CUBE_MAP_NEGATIVE_Z     Cube map -Z plane textures.

Combinations of format and type specify the format of texture images stored in the region specified
by pixels. The following table lists the formats that can be handled by the 3DS system. The Bytes
column indicates the number of bytes per texel. An asterisk (*) that follows the byte count indicates a
native PICA format, which has a different byte order than the standard OpenGL format. For more
information about the native PICA format, see 7.10. Native PICA Format.

Table 7-3. Texture Image Formats Specified by format and type

format type Format Bytes


GL_RGBA GL_UNSIGNED_SHORT_4_4_4_4 RGBA4 2
GL_RGBA GL_UNSIGNED_SHORT_5_5_5_1 RGBA5551 2

GL_RGBA GL_UNSIGNED_BYTE RGBA8 4


GL_RGB GL_UNSIGNED_SHORT_5_6_5 RGB565 2
GL_RGB GL_UNSIGNED_BYTE RGB8 3
GL_ALPHA GL_UNSIGNED_BYTE A8 1

GL_ALPHA GL_UNSIGNED_4BITS_DMP A4 0.5


GL_LUMINANCE GL_UNSIGNED_BYTE L8 1
GL_LUMINANCE GL_UNSIGNED_4BITS_DMP L4 0.5
GL_LUMINANCE_ALPHA GL_UNSIGNED_BYTE LA8 2

GL_LUMINANCE_ALPHA GL_UNSIGNED_BYTE_4_4_DMP LA4 1


GL_SHADOW_DMP GL_UNSIGNED_INT - 4
GL_GAS_DMP GL_UNSIGNED_SHORT - 4
GL_HILO8_DMP GL_UNSIGNED_BYTE - 2

GL_RGBA_NATIVE_DMP GL_UNSIGNED_SHORT_4_4_4_4 RGBA4 2


GL_RGBA_NATIVE_DMP GL_UNSIGNED_SHORT_5_5_5_1 RGBA5551 2
GL_RGBA_NATIVE_DMP GL_UNSIGNED_BYTE RGBA8 4*

GL_RGB_NATIVE_DMP GL_UNSIGNED_SHORT_5_6_5 RGB565 2


GL_RGB_NATIVE_DMP GL_UNSIGNED_BYTE RGB8 3*
GL_ALPHA_NATIVE_DMP GL_UNSIGNED_BYTE A8 1
GL_ALPHA_NATIVE_DMP GL_UNSIGNED_4BITS_DMP A4 0.5

GL_LUMINANCE_NATIVE_DMP GL_UNSIGNED_BYTE L8 1
GL_LUMINANCE_NATIVE_DMP GL_UNSIGNED_4BITS_DMP L4 0.5
GL_LUMINANCE_ALPHA_NATIVE_DMP GL_UNSIGNED_BYTE LA8 2*
GL_LUMINANCE_ALPHA_NATIVE_DMP GL_UNSIGNED_BYTE_4_4_DMP LA4 1

GL_SHADOW_NATIVE_DMP GL_UNSIGNED_INT - 4*
GL_GAS_NATIVE_DMP GL_UNSIGNED_SHORT - 4*
GL_HILO8_DMP_NATIVE_DMP GL_UNSIGNED_BYTE - 2*

R, G, and B represent colors (red, green, and blue, respectively).


A represents the alpha value.
L represents the luminance value.

As set forth in the OpenGL ES 1.1 specifications, a value of 1 is output for the alpha component if the
texture combiner references a texture without an alpha component. This is also true for compressed
textures.

The combination of format set to GL_RGB or GL_RGB_NATIVE_DMP and type set to
GL_UNSIGNED_BYTE can only be used when target is GL_TEXTURE_2D. When format is
GL_*_NATIVE_DMP, pixels must specify data in the native PICA format. When GL_GAS_DMP is
specified, pixels must be 0 (NULL).

For width and height, specify the width and height of the texture image. Both numbers must be
powers of 2, from 8 through 1024.

If target is GL_TEXTURE_CUBE_MAP_*, width and height must have the same value. Every
surface must have the same settings, except for pixels (and target).

If pixels is set to 0 (NULL), a region is allocated but image data is not loaded.

For internalformat, specify the basic internal format. A GL_INVALID_OPERATION error is
generated when internalformat and format are not the same value. The following table shows
how an image's RGBA components correspond to the components in the basic internal format.

Table 7-4. Correspondence Between RGBA Formats and Basic Internal Formats

Basic Internal Format RGBA Internal Format

GL_ALPHA A A
GL_LUMINANCE R L
GL_LUMINANCE_ALPHA R, A L, A

GL_RGB R, G, B R, G, B
GL_RGBA R, G, B, A R, G, B, A
GL_HILO8_DMP R, G Nx, Ny

The GL_HILO8_DMP format outputs 0.0 for the B component and 1.0 for the A component.

Unlike in the OpenGL specifications, level specifies the number of mipmap levels as a negative
value. For example, -2 indicates two mipmap levels, and both -1 and 0 indicate one mipmap level.
You cannot separately specify which textures to use for each mipmap level. Instead, set pixels to a
series of textures, starting with the texture to use as the largest mipmap and ending with the texture
to use as the smallest mipmap.

You must set border to 0.

Note: To configure the GPU access targets and processing to use when allocating the buffer
region, pass a bitwise OR of certain specific flag values for target for the
glTexImage2D() function. For more information, see 3. Using Data Located in Main
Memory in 3DS Programming Manual: Advanced Graphics.
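For example, the following sketch loads a 64×64 RGB565 2D texture together with three mipmap
levels (64×64, 32×32, and 16×16) stored consecutively in one buffer. The names textureID and
texelData are hypothetical.

/* A minimal sketch: textureID is a hypothetical object from glGenTextures(),
   and texelData is a hypothetical buffer holding the 64x64, 32x32, and 16x16
   RGB565 images stored consecutively (standard OpenGL byte order). */
glBindTexture(GL_TEXTURE_2D, textureID);
glTexImage2D(GL_TEXTURE_2D, -3, GL_RGB, 64, 64, 0,
             GL_RGB, GL_UNSIGNED_SHORT_5_6_5, texelData);   /* level -3: three mipmap levels */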

7.3.1. Formats With 4-Bit Components

Texture formats with a type of GL_UNSIGNED_BYTE_4_4_DMP or GL_UNSIGNED_4BITS_DMP use
4 bits for a single component and require a special data ordering.

In a format with a type of GL_UNSIGNED_BYTE_4_4_DMP, two components are stored in a single
byte. You can combine this with a format of GL_LUMINANCE_ALPHA or
GL_LUMINANCE_ALPHA_NATIVE_DMP. The luminance is stored in the most-significant 4 bits and
the alpha component is stored in the least-significant 4 bits.

In a format with a type of GL_UNSIGNED_4BITS_DMP, two texels are stored in a single byte. You
can combine this with a format of GL_LUMINANCE, GL_LUMINANCE_NATIVE_DMP, GL_ALPHA, or
GL_ALPHA_NATIVE_DMP. Viewed as a row of texels, the first component is stored in the least-
significant 4 bits, the second component is stored in the most-significant 4 bits, and so on.

Warning: If you enable a texture format with a type of GL_UNSIGNED_4BITS_DMP (a 4-bit
format) at the same time as another texture format (a non-4-bit format, which includes
ETC1 compressed textures) and then use them as a multitexture, there are restrictions
on the placement of the textures.

If you are placing 4-bit textures in VRAM, you must place 4-bit textures and non-4-bit
textures in separate memory. In such cases, VRAM-A and VRAM-B are treated as
separate memory. Behavior is undefined when textures having different bit formats are
placed in the same memory.

There is no restriction on texture arrays when 4-bit textures are placed in main memory.

Figure 7-1. Bit Layout of Texture Formats That Comprise 4-Bit Components

7.4. Loading Compressed Textures

You can load compressed image data as texture images. Partial texture loading, with the
glCompressedTexSubImage2D() function, is not supported.

Code 7-4. Definition of the glCompressedTexImage2D Function


void glCompressedTexImage2D(GLenum target, GLint level, GLenum internalformat,
GLsizei width, GLsizei height, GLint border,
GLsizei imageSize, const void* data);

For target specify the same value specified in the glTexImage2D() function (Table 7-2). However,
shadow textures and gas textures cannot use the regions allocated by this function.

For width and height, specify the width and height of the texture image. Both numbers must be
powers of 2, from 16 through 1024.

The values specified for level and border, and the restrictions on cube map textures and mipmap
textures, are the same as in the glTexImage2D() function. For more information, see 7.3. Loading
Texture Images.

The hardware supports only one compressed texture format: ETC1 (Ericsson Texture Compression).
You can set internalformat to either GL_ETC1_RGB8_NATIVE_DMP or
GL_ETC1_ALPHA_RGB8_A4_NATIVE_DMP.

The ETC1 format takes blocks of 4×4 texels in the 24-bit RGB format and compresses them
each into 64 bits. An alpha channel is not supported by GL_ETC1_RGB8_NATIVE_DMP, but it is
supported by GL_ETC1_ALPHA_RGB8_A4_NATIVE_DMP, which adds 4 bits of alpha component data
for each of the 16 texels.

A value of 1 is output for the alpha component when the texture combiner references a compressed
texture with a format that does not include an alpha channel.

For imageSize specify the number of bytes in the image data. If the original texture image has a
width of w and a height of h, imageSize can be found using the following equation. The value of
blockSize is either 8 when there is no alpha channel or 16 when there is.

imageSize = (w / 4) * (h / 4) * blockSize

The ETC1 format handled by the 3DS system is different from the standard OpenGL specifications.
7.10. Native PICA Format explains the differences between this format and the standard
specifications. For more information about formats, see the 3DS Programming Manual: Advanced
Graphics.

Note: To configure the GPU access targets and processing to use when allocating the buffer
region, pass a bitwise OR of certain specific flag values for target for the
glCompressedTexImage2D() function. For more information, see 3. Using Data Located
in Main Memory in 3DS Programming Manual: Advanced Graphics.
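For example, the following sketch loads a 128×128 ETC1 texture with a single mipmap level. The
name etc1Data is hypothetical and the data is assumed to already be in the native block order;
blockSize is 8 because this format has no alpha channel.

/* A minimal sketch: etc1Data is a hypothetical buffer holding 128x128 ETC1
   data (no alpha channel, so blockSize is 8). */
GLsizei imageSize = (128 / 4) * (128 / 4) * 8;   /* 8,192 bytes */
glCompressedTexImage2D(GL_TEXTURE_2D, 0, GL_ETC1_RGB8_NATIVE_DMP,
                       128, 128, 0, imageSize, etc1Data);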

7.5. Copying From the Framebuffer

You can get (copy) an image of the color buffer and depth buffer, which are bound to a framebuffer
object as a texture.

7.5.1. Copying From the Color Buffer

You can get (copy) an image of the color buffer as a texture.

Code 7-5. Definition of the glCopyTexImage2D Function

void glCopyTexImage2D(GLenum target, GLint level, GLenum internalformat,
                      GLint x, GLint y, GLsizei width, GLsizei height,
                      GLint border);

For target specify the same value specified in the glTexImage2D() function (Table 7-2). All
other argument values are also the same, except for the following differences.

For internalformat specify either GL_RGB or GL_RGBA, but data cannot be converted from
the color buffer format during the copy (only formats with the same pixel sizes are allowed).
The values of x and y specify the starting point of the data region to copy from the color buffer
(with the origin at the lower-left corner and the positive axes pointing up and right). The values
of width and height specify the width and height of the region to copy. The values of x and y
must be multiples of 8.
Only 0 can be specified for level.

Note: To configure the GPU access targets and processing to use when allocating the buffer
region, pass a bitwise OR of certain specific flag values for target for the
glCopyTexImage2D() function. For more information, see 3. Using Data Located in
Main Memory in 3DS Programming Manual: Advanced Graphics.

7.5.2. Copying a Partial Region From the Color Buffer

You can also copy a partial texture image region from the color buffer.

Code 7-6. Definition of the glCopyTexSubImage2D Function

void glCopyTexSubImage2D(GLenum target, GLint level,
                         GLint xoffset, GLint yoffset, GLint x, GLint y,
                         GLsizei width, GLsizei height);

Data must be copied to a texture image region that was allocated in advance by the
glTexImage2D() function.

This function is the same as glCopyTexImage2D except that xoffset and yoffset specify the
coordinates of the region to be copied to (where the origin is the lower-left corner and the positive
axes point up and toward the right), and width and height must be multiples of 8 (but not
necessarily powers of two). For more information, see 7.5.1. Copying From the Color Buffer.

7.5.3. Copying From the Depth Buffer

If you call the glEnable() function with GL_DEPTH_STENCIL_COPY_DMP passed as an argument,
the glCopyTexImage2D() and glCopyTexSubImage2D() functions copy the content of the depth
(and stencil) buffer, rather than the color buffer, to a texture as long as depth/stencil copy
operations are enabled.

The format of the current depth buffer determines the format of the texture to specify as the copy
target. Because the format is not converted during the copy operation, a GL_INVALID_OPERATION
error occurs if you attempt to copy data to a texture with an unsupported format. The copied data is
identical for both native and non-native texture formats.

Table 7-5. Depth Buffer Formats and the Corresponding Texture Format and Type

Depth Buffer Format        Texture Format             Texture Type       Components

GL_DEPTH24_STENCIL8_EXT    GL_RGBA                    GL_UNSIGNED_BYTE   R: Stencil, G: D [23:16],
                           GL_RGBA_NATIVE_DMP                            B: D [15:8], A: D [7:0]

GL_DEPTH_COMPONENT24_OES   GL_RGB                     GL_UNSIGNED_BYTE   R: D [23:16], G: D [15:8],
                           GL_RGB_NATIVE_DMP                             B: D [7:0]

GL_DEPTH_COMPONENT16       GL_HILO8_DMP               GL_UNSIGNED_BYTE   R: D [15:8], G: D [7:0]
                           GL_HILO8_DMP_NATIVE_DMP

Note: The Components column shows which bits of the depth value are set in each
component. (For example, Depth [15:8] indicates that bits 8 through 15 of the depth value are
set for that component.)

You can call the glEnable, glDisable, and glIsEnabled() functions and pass
GL_DEPTH_STENCIL_COPY_DMP as an argument to respectively enable, disable, and determine the
current status of depth and stencil copy operations.
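For example, the following sketch copies a 256×256 region of a GL_DEPTH_COMPONENT24_OES depth
buffer into the bound 2D texture. The name depthTexID is hypothetical; GL_RGB matches the 24-bit
depth format, as shown in Table 7-5.

/* A minimal sketch: depthTexID is a hypothetical texture object, and the
   current depth buffer is assumed to use GL_DEPTH_COMPONENT24_OES. */
glBindTexture(GL_TEXTURE_2D, depthTexID);
glEnable(GL_DEPTH_STENCIL_COPY_DMP);
glCopyTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, 0, 0, 256, 256, 0);   /* x and y must be multiples of 8 */
glDisable(GL_DEPTH_STENCIL_COPY_DMP);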

Note: You cannot use the glCopyTexImage2D() and glCopyTexSubImage2D() functions in
block 32 mode. No errors are generated, but images are not transferred properly. For
information about block mode settings, see the 3DS Programming Manual: Advanced
Graphics.

7.6. Specifying a Texture as the Render Target

To write rendering results directly to a texture ("render to texture"), take a texture image allocated by
glTexImage2D() and bind it to a framebuffer object using the glFramebufferTexture2D()
function. Because rendering results are written directly to each of the special textures used by
shadows and gases, you do not need to use this function to specify the special textures as rendering
targets.

Code 7-7. Definition of the glFramebufferTexture2D Function

void glFramebufferTexture2D(GLenum target, GLenum attachment, GLenum textarget,
                            GLuint texture, GLint level);

For target, you can only specify GL_FRAMEBUFFER.

For attachment, specify the data to write to the texture. Specify GL_COLOR_ATTACHMENT0 for a
color buffer, or GL_DEPTH_ATTACHMENT for a depth buffer.

For textarget specify the same value specified in the glTexImage2D() function (Table 7-2).

For texture, specify the texture object to bind to the framebuffer object.

Only 0 can be specified for level.

When a texture is specified as the render target for the depth (and stencil) buffer, the texture format
to specify is determined by the depth buffer format. The correspondence is the same as that indicated
in 7.5.3. Copying From the Depth Buffer (Table 7-5). When the format of the depth buffer is
GL_DEPTH24_STENCIL8_EXT, you can set attachment to GL_DEPTH_STENCIL_ATTACHMENT, but
it is the same as setting GL_DEPTH_ATTACHMENT.
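For example, the following sketch binds a 256×256 RGBA8 texture as the color render target. The
object names fboID and texID are hypothetical, and the standard glBindFramebuffer() call is
assumed to be available for binding the framebuffer object.

/* A minimal sketch: fboID and texID are hypothetical object names. The
   texture region is allocated first with pixels set to NULL. */
glBindTexture(GL_TEXTURE_2D, texID);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 256, 256, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, NULL);

glBindFramebuffer(GL_FRAMEBUFFER, fboID);   /* assumes the framebuffer object was created beforehand */
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                       GL_TEXTURE_2D, texID, 0);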
7.7. Loading Lookup Tables

The glTexImage1D() function loads one-dimensional textures in OpenGL, but it is used to load
lookup tables on the 3DS system. A lookup table is a one-dimensional table accessed by procedural
textures, fragment lighting, fog, and gases. It cannot be used as a texture.

Code 7-8. Definition of the glTexImage1D Function

void glTexImage1D(GLenum target, GLint level, GLint internalformat,
                  GLsizei width, GLint border, GLenum format, GLenum type,
                  const GLvoid *pixels);

For target, specify GL_LUT_TEXTUREi_DMP to specify the lookup table to load. The lookup table
number is specified in i as a number between 0 and one less than the value obtained by passing
GL_MAX_LUT_TEXTURES_DMP to pname in the glGetIntegerv() function. In other words, you can
specify a lookup table number in the range from 0 through 31. GL_LUT_TEXTUREi_DMP is defined as
GL_LUT_TEXTURE0_DMP + i.

You can only specify 0 for level, GL_FLOAT for type, and GL_LUMINANCEF_DMP for format and
internalformat. A GL_INVALID_VALUE error is generated if level is nonzero. A
GL_INVALID_ENUM error is generated if type, format, or internalformat is not one of these
values.

For width, specify the number of table elements, and for pixels, specify the table elements. The
maximum value that can be specified for width is 512, which is the same value obtained from the
glGetIntegerv() function when GL_MAX_LUT_ENTRIES_DMP is passed into pname. However, the
various reserved fragment processes have unique restrictions on the number of table elements and
on the table elements.

To get the ID of the texture object bound to GL_LUT_TEXTUREi_DMP, call glGetIntegerv() and
specify GL_TEXTURE_BINDING_LUTi_DMP for the pname parameter (where i is a value from 0
through 31).
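For example, the following sketch loads a 256-element table into lookup table number 4. The names
lutObjectID and lutValues are hypothetical.

/* A minimal sketch: lutObjectID is a hypothetical texture object from
   glGenTextures(), and lutValues is a hypothetical array of 256 GLfloat
   entries. GL_LUT_TEXTURE0_DMP + 4 selects lookup table number 4. */
glBindTexture(GL_LUT_TEXTURE0_DMP + 4, lutObjectID);
glTexImage1D(GL_LUT_TEXTURE0_DMP + 4, 0, GL_LUMINANCEF_DMP, 256, 0,
             GL_LUMINANCEF_DMP, GL_FLOAT, lutValues);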

Use the glTexSubImage1D() function to rewrite a subset of the lookup table data that has been
loaded.

Code 7-9. Definition of the glTexSubImage1D Function

void glTexSubImage1D(GLenum target, GLint level, GLint xoffset,
                     GLsizei width, GLenum format, GLenum type,
                     const GLvoid *pixels);

Data must be copied to a lookup table region that was allocated in advance by the glTexImage1D()
function.

This function is identical to glTexImage1D except for xoffset and width, which specify the
starting element number and the number of elements, respectively. A GL_INVALID_VALUE error is
generated if the sum of width and xoffset exceeds the number of table elements.
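
The following is a minimal sketch of loading and then partially updating a lookup table under the
restrictions above. The table size of 256 entries, the element values, and the use of glBindTexture()
with the GL_LUT_TEXTURE0_DMP target for a generated texture object (implied by the binding query
described above) are illustrative assumptions.

GLuint  lutObject;
GLfloat lutData[256];   /* table elements (GL_FLOAT, GL_LUMINANCEF_DMP)   */
GLfloat patch[16];      /* replacement values for elements 32 through 47 */
int     i;

for (i = 0; i < 256; ++i) { lutData[i] = (GLfloat)i / 255.0f; }
for (i = 0; i < 16; ++i)  { patch[i] = 0.0f; }

glGenTextures(1, &lutObject);
glBindTexture(GL_LUT_TEXTURE0_DMP, lutObject);

/* level must be 0, format/internalformat must be GL_LUMINANCEF_DMP,
   and type must be GL_FLOAT. The maximum width is 512.               */
glTexImage1D(GL_LUT_TEXTURE0_DMP, 0, GL_LUMINANCEF_DMP,
             256, 0, GL_LUMINANCEF_DMP, GL_FLOAT, lutData);

/* Rewrite a subset of the table; xoffset + width must not exceed 256. */
glTexSubImage1D(GL_LUT_TEXTURE0_DMP, 0, 32,
                16, GL_LUMINANCEF_DMP, GL_FLOAT, patch);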

The glCopyTexImage1D() and glCopyTexSubImage1D() functions have not been implemented.
Texture parameter settings are also unsupported.

7.8. Destroying Texture Objects


You can use the glDeleteTextures() function to destroy texture objects that are no longer
necessary.

Code 7-10. Definition of the glDeleteTextures Function

void glDeleteTextures(GLsizei n, const GLuint* textures);

7.9. Texture Collections

Texture collections are new to the Nintendo 3DS. A texture collection records which texture objects
are bound to each texture type, so you can later rebind that whole group of texture objects in one
operation. Texture collections share their namespace with texture objects and can be created,
specified, and destroyed using glGenTextures, glBindTexture, and glDeleteTextures,
respectively.

7.9.1. Creating Texture Collections

A texture collection operates as a special texture object. Consequently, use the glGenTextures()
function to generate objects that can be used as texture collections, just as you would for texture
objects.

7.9.2. Specifying Texture Collections

Use the glBindTexture() function to specify a texture object to use as a texture collection. Call
this function, and for the target parameter, specify GL_TEXTURE_COLLECTION_DMP. By default,
the object with a name of 0 (with a value of 0 passed into texture) is a texture collection.

After this function is called, the texture collection records the texture objects bound by
glBindTexture to each texture type (2D texture, cube map texture, and lookup table). Each new
binding overwrites any binding previously recorded for the same texture type. Settings recorded in the
texture collection remain until the collection is switched.

When the texture collection is switched, all texture objects recorded in the new texture collection
are bound as the new textures.
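
As a minimal sketch of the recording and switching behavior described above (the texture object
names and the use of GL_TEXTURE_2D bindings are illustrative):

GLuint collections[2];
GLuint texA, texB;   /* ordinary 2D texture objects, created elsewhere */

glGenTextures(2, collections);

/* Record texA in the first collection. */
glBindTexture(GL_TEXTURE_COLLECTION_DMP, collections[0]);
glBindTexture(GL_TEXTURE_2D, texA);

/* Record texB in the second collection. */
glBindTexture(GL_TEXTURE_COLLECTION_DMP, collections[1]);
glBindTexture(GL_TEXTURE_2D, texB);

/* Switching back to the first collection rebinds texA (and any other
   texture types recorded in it) in a single operation.               */
glBindTexture(GL_TEXTURE_COLLECTION_DMP, collections[0]);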

7.9.3. Destroying Texture Collections

Texture collections can be destroyed with the glDeleteTextures() function, just like texture
objects. Destroying a texture collection does not affect the bindings of the texture objects
recorded in that texture collection.

The glDeleteTextures() function does not immediately destroy a texture collection that is in
use. A texture collection remains in use until it is switched with another one. You cannot destroy the
default texture collection. Calls to glDeleteTextures on the default texture collection are
ignored.
7.10. Native PICA Format

The GPU's texture unit supports a different texture format than the OpenGL specifications. The
format that is actually supported by the texture unit is called the native PICA format. Because
libraries do not convert textures loaded in native PICA format, it is more efficient than the standard
format.

There are three major differences between this format and the OpenGL specifications.

Byte order
Bytes are written in a different order because of the way internal addresses are processed.

V-Flip
The two formats use v-axes that run in opposite directions, when placing texels using UV
coordinates.

Addressing
Texels and compressed data blocks are written in different orders, because of differences
between linear addressing (OpenGL) and block addressing (native PICA format).

In the CTR system, v-flip, addressing, and byte-order conversions are run, in that order, on an
uncompressed texture (loaded by glTexImage2D), to convert it from the OpenGL format into the
native PICA format. The system converts a compressed texture (loaded by
glCompressedTexImage2D) by running V-flip, ETC compression, addressing, and finally, byte-order
conversions. Note that v-flip conversion must precede ETC compression.

7.10.1. Byte Order Swapping

When the format uses more than one byte to represent a single texel and the texture was defined
by calling glTexImage2D() and passing GL_UNSIGNED_BYTE for type, the number of bytes
swapped (replaced) is the same as the number of bytes that represent a texel.

Data in a compressed format is swapped one block (8 bytes for 4×4 texels) at a time. However, if
the format has an alpha channel (the first 8 bytes), the alpha portion is not byte-swapped.

Figure 7-2. Four-Byte Swap

Figure 7-3. Three-Byte Swap

Figure 7-4. Two-Byte Swap


Figure 7-5. Byte Swap in a Compressed Format (No Alpha Channel)

Figure 7-6. Byte Swap in a Compressed Format (With Alpha Channel)
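
As an illustration of the uncompressed case, the following sketch reverses the byte order within each
texel group, which is one way to implement the swap; treat the exact patterns shown in Figures 7-2
through 7-4 as authoritative. The function name is hypothetical, and the compressed-format case
(including the alpha exception) is not handled here.

#include <stddef.h>

/* Reverse the byte order within every texel of an uncompressed image.
   bytesPerTexel is, for example, 4 for an RGBA8 texture or 3 for RGB8. */
static void SwapTexelBytes(unsigned char *data, size_t texelCount,
                           size_t bytesPerTexel)
{
    size_t i;
    for (i = 0; i < texelCount; ++i)
    {
        unsigned char *texel = data + i * bytesPerTexel;
        size_t a, b;
        for (a = 0, b = bytesPerTexel - 1; a < b; ++a, --b)
        {
            unsigned char tmp = texel[a];
            texel[a] = texel[b];
            texel[b] = tmp;
        }
    }
}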

7.10.2. V-Flip Differences

In the OpenGL specifications, images are converted into data starting with the texel at UV
coordinates (0.0, 0.0). In the native PICA format, images are converted into data, starting with the
texel at UV coordinates (0.0, 1.0). This is the same for both uncompressed and compressed
textures.

7.10.3. Differences in Addressing

In the OpenGL specifications, texels are stored consecutively in the U direction (linear addressing).
In the native PICA format, blocks of texels are stored, consecutively, in the U direction, but texels
within those blocks are stored in a zigzag pattern (block addressing).

7.10.3.1. Uncompressed Textures

Texels are stored in a zigzag pattern within each block of 8×8 texels (as shown by the red lines in
the figure), but the blocks are stored consecutively in the U direction.

Figure 7-7. Texel Order in an Uncompressed Texture
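
As a sketch of what block addressing means for an uncompressed texture, the following computes a
texel's offset inside its 8×8 block. The specific zigzag used here is a Morton (Z-order) interleave of
the low three bits of u and v; this is a common block-addressing scheme and is shown only as an
illustrative assumption, not as a statement of the exact hardware pattern.

/* Offset (0-63) of texel (u, v) inside its 8x8 block, assuming a Morton
   (Z-order) interleave of the low three bits of u and v.               */
static unsigned MortonOffsetInBlock(unsigned u, unsigned v)
{
    unsigned x = u & 7;
    unsigned y = v & 7;
    return  (x & 1)       | ((y & 1) << 1) |
           ((x & 2) << 1) | ((y & 2) << 2) |
           ((x & 4) << 2) | ((y & 4) << 3);
}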


7.10.3.2. Compressed Textures

A compressed texture compresses each block of 4×4 texels. In the OpenGL specifications, blocks
are stored consecutively in the U direction (shown by the blue line in the figure). In the native
PICA format, blocks are stored in a zigzag pattern in meta-blocks of 2×2 blocks. The meta-blocks
themselves are stored consecutively in the U direction (shown by the red line in the figure).

Figure 7-8. Texel Order in a Compressed Texture

For more information about compressed textures, see the 3DS Programming Manual: Advanced
Graphics.


8. Vertex Shaders

A vertex shader converts coordinates, applies shadows, and performs other operations on vertex
attributes that are input. The 3DS system has four vertex processors, each of which can handle vector
data comprising four 24-bit floating-point values. The input vertex data is processed, in parallel, by the
four vertex processors. Although the processors can run general-purpose calculations, they cannot
read data from or write data to VRAM.

Like OpenGL ES 2.0, the 3DS system does not make a clear distinction between coordinates, normals,
and other attributes stored with vertex data. Attributes determine how a vertex shader processes and
outputs its input data. In general, the following types of attributes are input as vertex data for 3D
graphics processing.

Vertex coordinates
Normal vectors
Tangent vectors
Texture coordinates
Vertex color

Vertex shaders are written in an assembly language that is unique to the PICA graphics core. We
recommend that you refer to the Vertex Shader Reference as you continue to read this chapter.

8.1. Input Vertex Data

Vertex data input by the application is passed to the vertex shader through input registers that are
bound to vertex attribute numbers. Using #pragma bind_symbol, the vertex shader specifies the
names and registers for input data.

Code 8-1. Binding Data Names and Registers (in Shader Assembly)

#pragma bind_symbol(AttribPosition, v0, v0)

In this code sample, the xyzw component for the v0 input register is bound to the AttribPosition
data name. You cannot bind more than one data name to the same input register. The second and
third arguments must also take the same value (or the third argument must be omitted).

The application uses the glBindAttribLocation() function to bind vertex attribute numbers and
data names, the glEnableVertexAttribArray() function to enable bound vertex attribute
numbers, and glVertexAttribPointer or another function to input vertex data.

Code 8-2. Binding Vertex Attribute Numbers and Inputting Vertex Data

glBindAttribLocation(program, 0, "AttribPosition");
glEnableVertexAttribArray(0);
glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, 0, pointer);

In this code sample, the vertex attribute number 0 is bound to data having the name
AttribPosition, and vertex attributes for four components are specified. The register number
and the vertex attribute number do not have to be the same.

Input registers that are not bound by #pragma bind_symbol have undefined values. Any input
vertex data that is not a GL_FLOAT value is automatically converted into a GL_FLOAT value.
However, note that the data is not normalized.

When a vertex buffer is used, the vertex attribute data type and data size combination affects transfer
speed of the vertex data. For more information, see 15.13. Effect of Vertex Attribute Combinations on
Vertex Data Transfer Speed.

Warning: If vertex data is input through the glVertexAttribPointer() function, the fourth
argument (specifying whether to normalize values) is ignored because it is not supported
by the hardware. The vertex shader must explicitly normalize values because the specified
value has no effect.

You cannot specify GL_FIXED or GL_UNSIGNED_SHORT for the type parameter of the
glVertexAttribPointer() function. If GL_FLOAT or GL_SHORT is specified, ptr must
be a pointer that is 4-byte aligned or 2-byte aligned, respectively.

While a vertex shader is processing a single vertex, it must load vertex data from at least one input
register (even a single component), or it may not run properly.

8.2. Vertex Data Output

The vertex shader's processing results are written to the output registers that are mapped to output
vertex attributes, and in this way the results are passed on to the next processing stage. Using
#pragma output_map, the vertex shader specifies the vertex attribute names and registers for
output data.

Code 8-3. Mapping Registers to Output Vertex Attributes and Writing to the Registers (in Shader Assembly)

#pragma output_map(position, o0)


mov o0, v0

In this code sample, vertex coordinates are mapped to the o0 output register. Data is output as
vertex coordinates by writing data to o0.

Because only a reserved shader can be used for fragment processing, the vertex shader's output
vertex attributes will be fixed in advance. Output vertex attributes have the following names. The
components that the vertex shader must set for the output attributes are fixed.

Table 8-1. Output Vertex Attributes

Attribute Name | Output Attribute           | Required Components
position       | Vertex coordinates.        | 4 components: x, y, z, and w.
color          | Vertex color.              | 4 components: R, G, B, and A.
texture0       | Texture coordinate 0.      | 2 components: u and v.
texture0w      | Texture coordinate 0.      | 1 component: w.
texture1       | Texture coordinate 1.      | 2 components: u and v.
texture2       | Texture coordinate 2.      | 2 components: u and v.
quaternion     | Quaternions.               | 4 components: x, y, z, and w.
view           | View vector.               | 3 components: x, y, and z.
generic        | General-purpose attribute. | Any number of components that are used by the geometry shader.

The vertex shader must write some value to every component (x, y, z, and w) of registers that have
been mapped. Dummy values must be written to any unmapped (unused) components in the registers
specified by #pragma output_map.

The vertex shader forces processing to end when values have been written to all mapped registers. It
then moves on to process the next vertex (the end instruction must be called). In other words, after
the last attribute data has been written to a register, later instructions might not be executed.

You can only write to (each component of) a single output register once while processing a single
vertex. Correct operation is not guaranteed if data is written to the same component (of the same
register) more than once.

Except for the generic attribute, output vertex attributes can be mapped to no more than seven
output registers. To map eight or more non-generic output vertex attributes, you must map multiple
attributes to a single register. In this case, multiple vertex attributes (up to a total of four
components) must be packed into a single register. For example, you could map texture1 and
texture2 to the xy and zw components of the same output register, respectively.

8.3. Uniform Settings

When #pragma bind_symbol binds data names to non-input registers (floating-point constant
registers, Boolean registers, or integer registers), the application can use the glUniform*()
functions to set values in each register. You can transpose matrices within the glUniformMatrix*
() functions (convert the matrices from column-major to row-major order), by specifying GL_TRUE for
transpose.

Sample code follows for both the vertex shader and the application.

Code 8-4. Binding Data Names to Non-Input Registers (in Shader Assembly)

#pragma bind_symbol ( ModelViewMatrix , c0 , c3 )
#pragma bind_symbol ( LoopCounter0 , i1 , i1 )
#pragma bind_symbol ( bFirst , b2 , b2 )
#pragma bind_symbol ( Scalar.x , c4, c4 )

Code 8-5. Setting Uniforms

uniform_location = glGetUniformLocation ( program , "ModelViewMatrix" );

GLfloat matrix[4][4];   // model-view matrix values, filled in elsewhere
glUniform4fv ( uniform_location , 4 , &matrix[0][0] );

GLfloat scalar_value;   // value for the x component of Scalar
uniform_location = glGetUniformLocation ( program , "Scalar" );
glUniform1f ( uniform_location , scalar_value );

uniform_location = glGetUniformLocation ( program , "bFirst" );
glUniform1i ( uniform_location , GL_TRUE );

GLint loop_setting[3] = { 4 , 0 , 1 };  // loop_count - 1 , init , step
uniform_location = glGetUniformLocation ( program , "LoopCounter0" );
glUniform3iv ( uniform_location , 1 , loop_setting );

The sample code specifies component x for the data name Scalar. Components can be specified
this way when binding a floating-point constant register. You must specify components consecutively,
in xyzw order. In other words, xy, zw, or yzw can be specified but not xz, yw, or xyw.

Integer registers are used to control the loop instruction in a shader program. In a register that is 24
bits wide, the loop count is assigned to bits 0 through 7, the initial value is assigned to bits 8 through
15, and the increment value is assigned to bits 16 through 23. The loop instruction initializes the
loop counter register to the specified initial value, and then executes the instructions between loop
and endloop one time more than the specified loop count value. The loop counter register is incremented
by the increment value each time through the loop.
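
For example, with the loop_setting value { 4, 0, 1 } from Code 8-5, the loop body executes five
times and the loop counter takes the values 0, 1, 2, 3, and 4. The following host-side sketch only
simulates the documented counter behavior for illustration.

GLint loop_setting[3] = { 4, 0, 1 };   /* loop_count - 1, init, step */
GLint counter = loop_setting[1];       /* initial value              */
GLint i;
for (i = 0; i <= loop_setting[0]; ++i)
{
    /* The instructions between loop and endloop see the counter value here. */
    counter += loop_setting[2];        /* incremented each time through the loop */
}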

8.4. Notes for the Clip Coordinate System

The vertex shader outputs Z components in a clip coordinate system that differs from the one used in
OpenGL ES.

OpenGL ES clips Z coordinates between -Wc and Wc, but the 3DS system clips them between 0 and
-Wc (the sign is reversed). To use projective transformation matrices that are compatible with OpenGL
ES, applications must convert the range from -Wc to Wc into the range from 0 to -Wc.

Converting Projective Transformation Matrices in the Application

Make the following conversion and set the resulting projection matrix as a uniform.

Code 8-6. Converting to an OpenGL ES-Compatible Projective Transformation Matrix

// projection holds the column-major OpenGL ES projection matrix to convert;
// its Z row (elements 2, 6, 10, 14) is remapped in place.
GLfloat projection[16];
projection[2]  = (projection[2]  + projection[3])  * (-0.5f);
projection[6]  = (projection[6]  + projection[7])  * (-0.5f);
projection[10] = (projection[10] + projection[11]) * (-0.5f);
projection[14] = (projection[14] + projection[15]) * (-0.5f);

Converting in a Vertex Shader

Apply the projection conversion as follows.

Code 8-7. Converting to an OpenGL ES-Compatible Projective Transformation Matrix (in Shader Assembly)

#pragma output_map(position, o0)
#pragma bind_symbol(attrib_position, v0)
#pragma bind_symbol(modelview, c0, c3)
#pragma bind_symbol(projection, c4, c7)
def c8, -0.5, -0.5, -0.5, -0.5

// Model View Transformation
dp4 r0.x, v0, c0
dp4 r0.y, v0, c1
dp4 r0.z, v0, c2
dp4 r0.w, v0, c3
// Projective Transformation
dp4 o0.x, r0, c4
dp4 o0.y, r0, c5
mov r1, c6
add r1, r1, c7
mul r1, r1, c8
dp4 o0.z, r0, r1
dp4 o0.w, r0, c7

8.5. Vertex Cache

Some of the vertex data created or processed by the vertex shader is saved in a cache. The vertex
shader does not process input vertex data that is determined to be the same as the original vertex
data saved in the cache, based on its vertex indices. Instead, the processed data in the cache is sent
to the next process. The same vertex data is often processed more than once when it is input using
GL_TRIANGLES, but this can be avoided if there is processed vertex data in the cache already.

The following conditions must be satisfied to use the vertex cache.

Vertex data must be input in a format that accesses vertex indices. In short, glDrawElements
must be called to input vertex data.
Input vertex data must use a vertex buffer.
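
A minimal sketch that satisfies both conditions (indexed drawing from a vertex buffer); the buffer
contents (vertexData, indexData, indexCount) and the single four-component attribute are illustrative.

GLuint vbo, ibo;

glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, sizeof(vertexData), vertexData, GL_STATIC_DRAW);

glGenBuffers(1, &ibo);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ibo);
glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(indexData), indexData,
             GL_STATIC_DRAW);

glEnableVertexAttribArray(0);
glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, 0, 0);

/* Indexed drawing from a vertex buffer lets the vertex cache reuse the
   processed results of vertices that share the same index.             */
glDrawElements(GL_TRIANGLES, indexCount, GL_UNSIGNED_SHORT, 0);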

The vertex cache can save 32 vertex data entries. It is implemented with a proprietary algorithm that
resembles the functionality of the LRU (least recently used) algorithm.

When the repeatedly accessed vertex data contains no more than 32 vertices, there is a higher
chance of a cache hit. But the efficiency of the vertex cache is affected by conditions other than index
order, including the usage state of the memory holding the index array, and the length of the shader
executing as the vertex shader. For these reasons, the optimal index order depends on the content, and
there may not be a definitive answer.

8.6. Querying the Vertex Shader

You can query the vertex shader for information about active vertex attributes and uniforms.

8.6.1. Getting Vertex Attribute Information

You can use the glGetActiveAttrib() function to get attribute information for the vertex data
input to the vertex shader.

Code 8-8. Definition of the glGetActiveAttrib Function

void glGetActiveAttrib(GLuint program, GLuint index, GLsizei bufsize,
                       GLsizei* length, GLint* size, GLenum* type, char* name);

A GL_INVALID_OPERATION error is generated if program specifies an unlinked or otherwise
invalid program object.

For index, specify a value between 0 and one less than the number of active vertex attributes, which
is obtained by calling the glGetProgramiv() function on the program object specified by program
with GL_ACTIVE_ATTRIBUTES specified for pname. A GL_INVALID_VALUE error is generated if the
specified value is negative or greater than or equal to the number of vertex attributes.

For bufsize, specify the size of the array specified in name. A GL_INVALID_VALUE error is
generated if a negative value is specified.

The vertex attribute's type is returned in type. The vertex attribute's size is returned in size. This
is the number of values indicated by type that are required to represent the vertex attribute.

The vertex attribute's name is returned in name. If there are more than bufsize characters in the
vertex attribute's name, up to bufsize - 1 characters are stored with a terminating character
(NULL) added at the end. The number of characters in name is returned in length (excluding the
terminating null character).
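
A sketch that enumerates the active vertex attributes of a linked program object (the 64-byte name
buffer is an arbitrary choice):

GLint attribCount = 0;
GLint i;
glGetProgramiv(program, GL_ACTIVE_ATTRIBUTES, &attribCount);

for (i = 0; i < attribCount; ++i)
{
    char    name[64];
    GLsizei length = 0;
    GLint   size   = 0;
    GLenum  type   = 0;

    glGetActiveAttrib(program, i, sizeof(name), &length, &size, &type, name);
    /* name now holds the attribute's name (for example, "AttribPosition"),
       and type and size describe how the attribute is represented.        */
}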

8.6.2. Getting Uniform Information

You can use the glGetActiveUniform() function to get uniform information registered with a
program object. This is not limited to the vertex shader; you can also get information about
uniforms used by the geometry shader and reserved fragment shader, which are described later.

Code 8-9. Definition of the glGetActiveUniform Function

void glGetActiveUniform(GLuint program, GLuint index, GLsizei bufsize,
                        GLsizei* length, GLint* size, GLenum* type, char* name);

A GL_INVALID_OPERATION error is generated if program specifies an unlinked or otherwise
invalid program object.

For index, specify a value between 0 and one less than the uniform information count obtained
from the glGetProgramiv() function when it is called on the program object specified by
program and when pname is GL_ACTIVE_UNIFORMS. A GL_INVALID_VALUE error is generated if
the specified value is negative or greater than or equal to the uniform information count.

For bufsize, specify the size of the array specified in name. A GL_INVALID_VALUE error is
generated if a negative value is specified.

The type of value for the uniform setting is returned in type. The value's size is returned in size.
This is the number of elements indicated by type that are required to represent the uniform setting.
For example, GL_FLOAT_VEC4 is stored in type and 4 is stored in size for a 4×4 matrix, such as
the modelview matrix.

The uniform's name is returned in name. If there are more than bufsize characters in the
uniform's name, up to bufsize - 1 characters are returned with a terminating character (NULL)
added at the end. The number of characters in name is returned in length (excluding the
terminating null character).

8.6.3. Setting Categories

The glGetActiveAttrib() and glGetActiveUniform() functions can get the following types
of values.

Table 8-2. List of Setting Categories

Category       | Type  | Number of Components | Major Use
GL_FLOAT       | float | 1 | Bias and scale values.
GL_FLOAT_VEC2  | float | 2 | Viewport settings.
GL_FLOAT_VEC3  | float | 3 | Colors (RGB) and directional vectors.
GL_FLOAT_VEC4  | float | 4 | Colors (RGBA) and transformation matrices.
GL_INT         | int   | 1 | Mode settings.
GL_INT_VEC3    | int   | 3 | Combiner source input.
GL_BOOL        | bool  | 1 | Enabling and disabling features.
GL_SAMPLER_1D  | int   | 1 | Specifying lookup tables.

The table lists the setting categories that are actually used by uniforms in each shader.

8.7. Getting and Setting the Values of Multiple Uniforms

Functions are provided for getting and setting multiple uniforms, concurrently, on the 3DS system.

By calling the glUniformsDMP() function, you can concurrently set values in multiple uniforms for
the program object that is currently bound.

Code 8-10. Setting a Group of Uniforms

void glUniformsDMP(GLuint n, GLint* locations, GLsizei* counts,
                   const GLuint* value);

For n, specify the number of uniforms to set.

For locations, specify a pointer to an array storing n uniform locations (which can be obtained by
glGetUniformLocation). For counts, specify a pointer to an array storing the number of elements
in the n uniforms. count for the glUniform*() functions corresponds to the number of uniform
elements. Fill the array specified by counts with the number of elements to set for each array
uniform and 1 for each non-array uniform.

For value, specify a pointer to an array storing the values to set in the uniforms. Because each
uniform has a different amount of data, the indices for the values to store in value are not
necessarily the same as the indices in locations and counts for the corresponding uniforms. You
can mix both GLfloat and GLuint data in the uniforms that you set. Store 32-bit GLfloat data for
the values to set in GLfloat uniforms.

This function does not perform any error-checking. Behavior is undefined if you specify an invalid
value for any argument.
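
A minimal sketch of setting two uniforms in a single call, reusing the uniform names from Code 8-4.
The layout of the packed value array (sixteen 32-bit floats for the matrix followed by one integer for
the Boolean) is an assumption based on the description above.

#include <string.h>

GLint   locations[2];
GLsizei counts[2] = { 4, 1 };   /* 4 vec4 elements, then 1 element        */
GLuint  values[17];             /* 16 floats + 1 integer, packed together */
GLfloat matrix[4][4];           /* model-view matrix, filled in elsewhere */
GLint   first = GL_TRUE;

locations[0] = glGetUniformLocation(program, "ModelViewMatrix");
locations[1] = glGetUniformLocation(program, "bFirst");

memcpy(values, matrix, sizeof(matrix));      /* 32-bit GLfloat data    */
memcpy(values + 16, &first, sizeof(first));  /* Boolean register value */

glUniformsDMP(2, locations, counts, values);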

By calling the glGetUniformsDMP() function, you can concurrently get values from multiple
uniforms for a specified program object.

Code 8-11. Getting a Group of Uniforms

void glGetUniformsDMP(GLuint program, GLuint n, GLint* locations,
                      GLsizei* counts, GLuint* params);

For program, specify the program object for which to get uniform values.

For n, specify the number of uniforms to get.

For locations, specify a pointer to an array storing n uniform locations (which can be obtained by
glGetUniformLocation). For counts, specify a pointer to an array storing the number of elements
in the n uniforms. The number of uniform elements is either 1 (for a non-array uniform) or the number
of array elements from which to get values for an array uniform.

For params, specify a pointer to an array used to get the uniform values. Because each uniform has
a different amount of data, the indices for the values stored in params are not necessarily the same
as the indices in locations and counts for the corresponding uniforms. Both GLfloat and
GLuint data can be mixed in the uniforms from which values are obtained. The GLfloat values
obtained from GLfloat uniforms are stored as 32-bit data in params.

This function does not perform any error-checking. Behavior is undefined if you specify an invalid
value for any argument.
8.8. Other Notes and Cautions

You can access uniform values with the glUniform*() and glGetUniform*() functions, using
uniform locations that can be obtained with the glGetUniformLocation() function. By adding an
offset to a location, you can specify and access a specific element in an array uniform. For example,
the second element of an array uniform is accessed when 1 is added to the uniform's location.
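
For example, assuming the array uniform ModelViewMatrix from Code 8-4 (bound to four constant
registers), the following writes only its second vec4 element; the values are illustrative.

GLint   base = glGetUniformLocation(program, "ModelViewMatrix");
GLfloat row1[4] = { 0.0f, 1.0f, 0.0f, 0.0f };

/* base + 1 addresses the second element of the array uniform. */
glUniform4fv(base + 1, 1, row1);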

A uniform's location is fixed when the glLinkProgram() function is called, and the value generally
differs for each program object. The glUniform*() functions generate an error if the location is not
related to the current program object. The glGetUniform*() functions generate an error if the
location is related to a program object other than program. However, the glUniform* and
glGetUniform*() functions do not generate errors if the programs being queried are reserved
fragment shaders and the location is specified using a bitwise OR with 0xFFF80000.

Do not include vertex attributes in #pragma output_map definitions when they do not need to be
output. If you do not follow this advice, useless instructions will be required because values must be
written to all defined vertex attributes (output registers) when the vertex shader is run. Some output
attributes also cause parts of the GPU circuitry to be clocked, and outputting them can needlessly
run down the battery.

The maximum number of vertex attributes when rendering without using a vertex buffer is 16, and the
maximum number of vertex attributes that can use a vertex buffer is 12. However, if 12 vertex
attributes are rendered using a vertex buffer, you must take care about the limitations in vertex data
placement (see 6.8.1. Restrictions Affecting Only glDrawElements).

In the vertex shader assembler, you can define up to 16 vertex attributes, but if the maximum number
as determined according to the conditions above is exceeded, a GL_INVALID_OPERATION error may
be generated when the rendering function is called.


9. Geometry Shaders

Geometry shaders use vertex shader output to generate primitives and output an arbitrary number of
vertices.

In OpenGL ES 2.0, you can specify GL_POINTS, GL_LINES, GL_LINE_STRIP, or GL_LINE_LOOP to
the glDrawElements or glDrawArrays() function to render primitives, but on the 3DS system, you
must use the geometry shader to generate non-triangle primitives. Triangles are rendered with
GL_TRIANGLES, GL_TRIANGLE_STRIP, and GL_TRIANGLE_FAN because they are included in OpenGL
ES 2.0, but multisample rendering is not supported.

As described in 5. Shader Programs, the SDK provides geometry shaders for the Nintendo 3DS system.
You can use the following geometry shaders.

Point shaders
Line shaders
Silhouette shaders
Catmull-Clark subdivision shaders
Loop subdivision shader
Particle system shader

These geometry shaders cannot be used alone. Always load a binary that also contains a linked vertex
shader, and use that vertex shader together with the geometry shader. When a geometry shader is used,
one of the four vertex processors is used as the geometry processor. The Boolean register b15 is
reserved for the geometry shader.

Figure 9-1. Processor Configurations When the Geometry Shader Is Used and Not Used

Vertex shader output is used as input data to the geometry shader. Each geometry shader requires
specific vertex attributes and a specific input order and can only use some vertex attributes. The vertex
shader must output these values correctly. Data is input to the geometry shader in order, starting with
the smallest register number output by the vertex shader. Output vertex attributes are defined by
#pragma output_map, but any generic attribute names are only used by the geometry shader,
without being handled by some later process (such as fragment processing). Geometry shaders have
reserved uniforms.

The point shader, for example, uses reserved uniforms to configure the viewport and other settings.
Because all of these reserved uniforms initially have undefined values, they must be set by the
application.

9.1. Point Shaders

The point shader generates two triangle primitives to draw a square (a point primitive) centered at the
specified vertex coordinates, with sides as long as the specified point size. The point sprite shader is
a point shader that outputs texture coordinates s, t (shown in the following figure), r (fixed at 0.0),
and q (fixed at 1.0), for applying textures to point primitives. Neither shader supports grid
adjustments or multisample rendering.

Figure 9-2. Drawing Point Primitives


9.1.1. Shader Files

The shader file to link with the vertex shader is determined by two factors: the number of vertex
attributes output for fragment processing that are also not required by the point shader, and the
number of texture coordinates output by the point sprite shader.

Points: DMP_pointN.obj

Point sprites: DMP_pointSpriteN_T.obj

Where N is the number of vertex attributes that are not required by the point (or point sprite)
shader, and T is the number of texture coordinates.

9.1.2. Reserved Uniform

There are reserved uniforms for configuring the viewport, and for enabling or disabling distance
attenuation on the point size. These reserved uniforms must be set by the application because they
initially have undefined values.

Viewport

Use the glUniform2fv() function to set the width and height of the viewport in the reserved
uniform dmp_Point.viewport.

Distance Attenuation

Use the glUniform1i() function to set the reserved uniform for distance attenuation on the point
size, dmp_Point.distanceAttenuation. A value of GL_TRUE enables distance attenuation, and
GL_FALSE disables it. The point size is multiplied by the clip coordinate Wc when distance
attenuation is disabled. This cancels out the division by Wc during the conversion to window
coordinates and prevents the size of displayed point primitives from being affected by distance.

Table 9-1. Reserved Uniforms for Point Shaders

Reserved Uniform              | Type | Value to Set
dmp_Point.viewport            | vec2 | The viewport, using the following equation: (1 / [Link], 1 / [Link])
dmp_Point.distanceAttenuation | bool | Specifies whether the point size is affected by distance attenuation. Specify GL_TRUE to enable distance attenuation, and GL_FALSE otherwise.
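
A sketch of setting these reserved uniforms from the application; the vec2 value must be computed
from the viewport dimensions according to the expression in Table 9-1, so only a placeholder array
is shown here.

GLint   loc;
GLfloat viewportValue[2];   /* fill in per the expression in Table 9-1 */

loc = glGetUniformLocation(program, "dmp_Point.viewport");
glUniform2fv(loc, 1, viewportValue);

loc = glGetUniformLocation(program, "dmp_Point.distanceAttenuation");
glUniform1i(loc, GL_TRUE);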

9.1.3. Vertex Shader Settings

The point shader requires vertex coordinates and a point size to render point primitives. The vertex
shader outputs the vertex coordinates and then the point size in order (starting with the smallest
output register number). This is followed by the texture coordinates for the point sprite shader. If
multiple texture coordinate pairs are used, they must be packed two at a time into the xy and zw
components of a single output register.

Output vertex attributes must be set as follows: position for the vertex coordinates, generic for
the point size, and texture0 through texture2 for the texture coordinates.
If the vertex shader outputs the vertex color in addition to the vertex coordinates and point size
required by the point shader, link the DMP_point1.obj shader file and use the following #pragma
output_map statements in the vertex shader.

Code 9-1. Sample Output Register Settings When the Point Shader Is Used (in Shader Assembly)

#pragma output_map ( position , o0 )
#pragma output_map ( generic , o1 )
#pragma output_map ( color , o2 )

If the vertex shader outputs the vertex color in addition to the vertex coordinates, point size, and
two texture coordinate pairs required by the point sprite shader, link the
DMP_pointSprite1_2.obj shader file and use the following #pragma output_map statements
in the vertex shader.

Code 9-2. Sample Output Register Settings When the Point Sprite Shader Is Used (in Shader Assembly)

#pragma output_map ( position , o0 )
#pragma output_map ( generic , o1 )
#pragma output_map ( texture0 , o2.xy )
#pragma output_map ( texture1 , o2.zw )
#pragma output_map ( color , o3 )

The point sprite shader outputs texture coordinates for point sprites that replace the input texture
coordinates. Note that the vertex shader must write dummy values to the output registers for the
texture coordinates that are replaced by the point sprite shader.

9.1.4. Input Vertex Data

To generate primitives with the point shader, call the glDrawElements or glDrawArrays()
function and specify GL_GEOMETRY_PRIMITIVE_DMP for mode.
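
For example (the index buffer ibo and indexCount are illustrative):

/* Point primitives are generated by the geometry shader, so the draw call
   uses GL_GEOMETRY_PRIMITIVE_DMP rather than GL_POINTS.                   */
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ibo);
glDrawElements(GL_GEOMETRY_PRIMITIVE_DMP, indexCount, GL_UNSIGNED_SHORT, 0);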

9.2. Line Shaders

A line shader generates two triangle primitives to draw lines (a line primitive) that connect two points
specified by vertex coordinates. You can use a reserved uniform to set the line width. The line shader
does not support grid adjustments or multisample rendering.

Figure 9-3. Drawing Line Primitives

Coordinates are generated for the four vertices that make a rectangle (parallelogram) from the slope
and width of the line segment connecting the specified vertex coordinates. The coordinates of the
four vertices are generated by offsetting the input vertex coordinates in the Y direction (or in the
X direction, depending on the slope of the line segment).

9.2.1. Shader Files

The shader file to link with the vertex shader is determined by the number of vertex attributes
output for fragment processing that are also not required by the line shader.

Vertex coordinates can be specified for separate lines or line strips. Separate lines are drawn using
two sets of vertex coordinates per line. The coordinates of the first two vertices are the same for
both a line strip and separate lines, but the next separate line is drawn from the coordinates of the
second and third vertices. In other words, starting with the third vertex, each vertex is used
together with the previous one to draw a single, connected line.

Separate lines: DMP_separateLineN.obj

Line strips: DMP_stripLineN.obj

Where N is the number of vertex attributes not required by the line shader.

9.2.2. Reserved Uniform

There is a reserved uniform for setting the line width. The reserved uniform for the line width must
be set by the application because its initial value is undefined.

Line Width

Use the glUniform4fv() function to set the reserved uniform for the line width,
dmp_Line.width, using values calculated from both the line width and the width and height of the
viewport.

Table 9-2. Reserved Uniforms for the Line Shader

Reserved Uniform | Type | Value to Set
dmp_Line.width   | vec4 | The line width, calculated using the following expression: ([Link] / [Link], [Link] / [Link], [Link] * [Link], 2 / [Link])

9.2.3. Vertex Shader Settings

The line shader requires vertex coordinates to render line primitives. The vertex shader outputs the
vertex coordinates, starting with the smallest output register number.

One output vertex attribute must be set: the position attribute of the vertex coordinates.

If the vertex shader outputs the vertex color in addition to the vertex coordinates required by the
separate line shader, link the DMP_separateLine1.obj shader file and use the following
#pragma output_map statements in the vertex shader.
Code 9-3. Sample Output Register Settings When a Line Shader Is Used (in Shader Assembly)

#pragma output_map ( position , o0 )
#pragma output_map ( color , o1 )

The line strip shader works the same way as the separate line shader. Its shader file is
DMP_stripLine1.obj.

9.2.4. Input Vertex Data

To generate primitives with the line shader, call the glDrawElements() or glDrawArrays()
function and pass GL_GEOMETRY_PRIMITIVE_DMP for the mode parameter.

9.3. Silhouette Shaders

Silhouette shaders generate and render silhouettes around object edges. You can use silhouette
edges to render object contours and, when combined with the shadow feature, soft shadows.

To generate silhouette edges, the silhouette shader needs a primitive called a triangle with
neighborhood, or TWN for short.

9.3.1. Triangle With Neighborhood

One of the triangles that make up the object used to render silhouette edges is called a center
triangle. A triangle with neighborhood (TWN) comprises a center triangle and the three (adjacent)
triangles that share an edge with it.

Figure 9-4. Sample Triangles With Neighborhood


One TWN comprises the center triangle with vertices 3, 1, and 4, in addition to the three triangles
defined by vertices 0, 1, and 3; 2, 4, and 1; and 6, 3, and 4. When vertices 3, 4, and 6 form the
center triangle, vertices 3, 1, and 4 represent a triangle in a separate TWN.

TWNs are used to detect the silhouette edges of center triangles. You can make an object from
TWNs to render silhouette edges on it.

9.3.2. Shader Files

TWN vertices can be input to a silhouette shader in two ways: (1) either as silhouette triangles, one
TWN at a time (just like normal triangles), or (2) as continuous silhouette strips of adjoining TWNs.

Silhouette triangles: DMP_silhouetteTriangle.obj

Silhouette strips: DMP_silhouetteStrip.obj

9.3.3. Reserved Uniforms

Silhouette shaders have the following reserved uniforms. These reserved uniforms must be set by
the application because they initially have undefined values.

Polygon Facing

Use the glUniform1i() function to configure how the silhouette shader determines whether a
polygon is front-facing (dmp_Silhouette.frontFaceCCW). Specify GL_TRUE or GL_FALSE if
GL_CCW or GL_CW, respectively, have been passed to the glFrontFace() function for object
vertex input.

Silhouette Edge Width

Use the glUniform2fv() function to set the silhouette edge width (dmp_Silhouette.width) to
a value calculated by multiplying the normal vector's x direction by a coefficient.

You can configure the effect of a vertex's w component (dmp_Silhouette.scaleByW). Multiplying
the silhouette edge width by this component disables distance attenuation. Call the
glUniform1i() function and specify a value of GL_TRUE to multiply by the w component, or
specify GL_FALSE to treat the w component as fixed at 1.0.

Silhouette Edge Colors

Use the glUniform4fv() function to set the color of the silhouette edge
(dmp_Silhouette.color) with R, G, B, and A values.

Open Edges

An open edge is an edge of the center triangle that is not shared with any other triangles. You can
configure open edges (dmp_Silhouette.acceptEmptyTriangles) to always be drawn
(GL_TRUE) or not (GL_FALSE).

Open edges differ from silhouette edges in that they are drawn like line primitives, without using
vertex normals. As a result, they may look different from silhouette edges at some angles. Some
settings are specific to open edges.
Open Edge Width

Use the glUniform4fv() function to set the open edge width
(dmp_Silhouette.openEdgeWidth) using values calculated from a specified width and the
viewport's width and height (just like a line shader).

Open Edge Colors

Use the glUniform4fv() function to set the open edge color
(dmp_Silhouette.openEdgeColor) using R, G, B, and A values.

Open Edge Bias Toward the Viewpoint

Use the glUniform1fv() function to set the bias toward the viewpoint
(dmp_Silhouette.openEdgeDepthBias). A negative value indicates movement toward the
viewpoint, and a positive value indicates movement away from it. Normal vectors are not used to
generate open edges, so this bias value adjusts their appearance.

Multiplying an Open Edge's w Component

You can multiply the open edge's width and bias values by a vertex's w component. Use the
glUniform1i() function to set dmp_Silhouette.openEdgeWidthScaleByW and
dmp_Silhouette.openEdgeDepthBiasScaleByW to GL_TRUE or GL_FALSE for each setting.

Table 9-3. Reserved Uniforms for Silhouette Shaders

Reserved Uniform                         | Type  | Value to Set
dmp_Silhouette.width                     | vec2  | Specifies the silhouette edge width using the following expression: (xscale_f, xscale_f * [Link] / [Link]). xscale_f is the factor to multiply by the normal vector's x component.
dmp_Silhouette.scaleByW                  | bool  | Specifies whether to apply a vertex's w component to silhouette edges. Specify GL_TRUE to apply the vertex's w component, and GL_FALSE otherwise.
dmp_Silhouette.color                     | vec4  | Specifies the silhouette edge color.
dmp_Silhouette.frontFaceCCW              | bool  | Specifies how a polygon is determined to be front-facing or back-facing. Specify GL_TRUE to set CCW (counterclockwise is front-facing), and GL_FALSE to set CW (clockwise is front-facing).
dmp_Silhouette.acceptEmptyTriangles      | bool  | Specifies whether to render open edges. Specify GL_TRUE to render open edges, and GL_FALSE otherwise.
dmp_Silhouette.openEdgeColor             | vec4  | Specifies the open edge color.
dmp_Silhouette.openEdgeWidth             | vec4  | Specifies the open edge width using the following expression: ([Link] / [Link], [Link] / [Link], [Link] / [Link], 2 / [Link])
dmp_Silhouette.openEdgeDepthBias         | float | Specifies the open edge bias toward the viewpoint. A negative value indicates movement toward the viewpoint, and a positive value indicates movement away from it.
dmp_Silhouette.openEdgeWidthScaleByW     | bool  | Specifies whether to multiply the open edge width by a vertex's w component. Specify GL_TRUE to multiply, and GL_FALSE otherwise.
dmp_Silhouette.openEdgeDepthBiasScaleByW | bool  | Specifies whether to multiply the open edge depth bias by a vertex's w component. Specify GL_TRUE to multiply, and GL_FALSE otherwise.
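
A sketch of setting a few of these reserved uniforms from the application (the color is illustrative;
the width values must be computed per the expression in Table 9-3):

GLint   loc;
GLfloat edgeColor[4] = { 0.0f, 0.0f, 0.0f, 1.0f };
GLfloat edgeWidth[2];   /* fill in per the expression in Table 9-3 */

loc = glGetUniformLocation(program, "dmp_Silhouette.frontFaceCCW");
glUniform1i(loc, GL_TRUE);

loc = glGetUniformLocation(program, "dmp_Silhouette.color");
glUniform4fv(loc, 1, edgeColor);

loc = glGetUniformLocation(program, "dmp_Silhouette.width");
glUniform2fv(loc, 1, edgeWidth);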

9.3.4. Vertex Shader Settings

A silhouette shader requires three items to render silhouette edges: vertex coordinates, vertex
colors, and normal vectors. The vertex shader outputs the vertex coordinates, vertex color, and
then the normal vector in that order, starting with the smallest output register number.

The output vertex attributes must be set using the vertex coordinates in the position attribute,
the vertex color in the color attribute, and the normal vector in the generic attribute.

To output silhouette triangles, link the DMP_silhouetteTriangle.obj shader file, and use the
following #pragma output_map statements in the vertex shader.

Code 9-4. Sample Output Register Settings When a Silhouette Shader Is Used (in Shader Assembly)

#pragma output_map ( position , o0 )
#pragma output_map ( color , o1 )
#pragma output_map ( generic , o2 )

The vertex shader runs a modelview transformation on the normal vectors input from the
application, and must output normalized values for the x and y components. In other words, from the
normal vector (nx, ny, nz) in the viewpoint coordinate system, the vertex shader must output the
normal vector (nx / sqrt(nx^2 + ny^2), ny / sqrt(nx^2 + ny^2)).

This is shown by the following shader assembly code. aNormal specifies the normal vector input
from the application, and vNormal specifies the normal vector output to the silhouette shader.

Code 9-5. Normalizing Input Normal Vectors for the Silhouette Shader (in Shader Assembly)

mov TEMP_NORM, CONST_0
dp3 TEMP_NORM.x, aNormal, MATRIX_ModelView[0]
dp3 TEMP_NORM.y, aNormal, MATRIX_ModelView[1]
mul TEMP, TEMP_NORM, TEMP_NORM
add TEMP, TEMP.x, TEMP.y
rsq TEMP, TEMP.x
mul vNormal, TEMP_NORM, TEMP

To output silhouette strips, use the same procedure as you would for silhouette triangles, but link
the DMP_silhouetteStrip.obj shader file instead.
9.3.5. Input Vertex Data

To render silhouette edges with the silhouette shader, call the glDrawElements or
glDrawArrays() function, and specify GL_GEOMETRY_PRIMITIVE_DMP for the mode parameter.
You must also disable culling by calling glDisable(GL_CULL_FACE) before rendering.

We also recommend using the vertex indices to input vertex data to the silhouette shader, in
consideration of TWN characteristics. The following description assumes that you are using the
vertex indices. If you are not using the vertex indices, you must order the vertex data according to
the TWN vertex input rules.

9.3.6. Silhouette Triangle Indices

Silhouette triangles are input to the shader, one TWN at a time, using six vertices per TWN.

Vertices are input in the following order.

1. The first vertex of the center triangle.


2. The second vertex of the center triangle.
3. The remaining vertex of the adjacent triangle that shares the edge created by the first and
second vertices of the center triangle.
4. The third vertex of the center triangle.
5. The remaining vertex of the adjacent triangle that shares the edge created by the first and third
vertices of the center triangle.
6. The remaining vertex of the adjacent triangle that shares the edge created by the second and
third vertices of the center triangle.

The object in the following figure shows how to specify indices.

Figure 9-5. Specifying Indices With TWNs

A setting of CCW is assumed for determining polygon facing.

Having selected the center triangles (1,4,3) and (3,4,6) so that they are front-facing, specify the
indices as follows.

Triangle (1,4,3): 1, 4, 2, 3, 0, 6.
Triangle (3,4,6): 3, 4, 1, 6, 5, 7.

Silhouette edges are not generated for a degenerate center triangle. Triangles that have an edge
shared by three or more triangles are not supported.
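
As a concrete sketch, the two TWNs above can be drawn from a single index array (vertex and index
buffer setup is omitted; in practice the indices would normally be placed in a vertex buffer):

/* Indices for the TWNs whose center triangles are (1,4,3) and (3,4,6). */
static const GLushort silhouetteIndices[] = {
    1, 4, 2, 3, 0, 6,   /* TWN around center triangle (1,4,3) */
    3, 4, 1, 6, 5, 7    /* TWN around center triangle (3,4,6) */
};

glDisable(GL_CULL_FACE);   /* culling must be disabled for silhouette edges */
glDrawElements(GL_GEOMETRY_PRIMITIVE_DMP,
               sizeof(silhouetteIndices) / sizeof(silhouetteIndices[0]),
               GL_UNSIGNED_SHORT, silhouetteIndices);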

9.3.7. Silhouette Strip Indices


Silhouette strips are input to the shader as consecutive TWNs, using six vertices for the first TWN
and two vertices for each subsequent TWN. You can expect this to be at least twice as efficient as
using silhouette triangles to render the same model.

Silhouette strips can be used as input when the second and third vertices of a center triangle share
an edge with another vertex to form the center triangle of the next TWN. In other words, the last
adjacent triangle of a TWN is the center triangle of the next TWN.

Vertices are input in the following order.

1. The first six vertices are the same as those used for silhouette triangles. The second and third
vertices of the center triangle and the last specified vertex represent the first, second, and third
vertices of the next center triangle.
2. The remaining vertex of the adjacent triangle that shares the edge created by the first and third
vertices of the center triangle.
3. The remaining vertex of the adjacent triangle that shares the edge created by the second and
third vertices of the center triangle.
4. The second and third vertices of the center triangle and the vertex specified in step 3 form
vertices 1 through 3 of a new center triangle. Repeat steps 2 and 3.

To stop entering silhouette strips or to enter the next silhouette strip, give the third vertex of the
center triangle as input before the vertex specified in step 3.

Figure 9-6. Sample Indices for Silhouette Strips

Assuming (3,8,7) as the final center triangle in the figure, the silhouette strip indices would be
specified in the following order: 1, 4, 2, 3, 0, 8, 6, 7, 5, 7, 9.

The second-to-last index, 7, is the vertex used to indicate the end of the silhouette strip.

To continue entering silhouette strips after you have specified the end of one strip, you must first
enter the six vertices that start a new TWN. If the first center triangle of the new silhouette strip
faces in the opposite direction from the one specified using glFrontFace, enter its first vertex
twice. In other words, if the silhouette strip in the previous example were initially back-facing, its
indices would be specified in the following order: 1, 1, 4, 2, 3, 0, 8, 6, 7, 5, 7, 9.

The end of a silhouette strip is specified as a delimiter when inputting multiple silhouette strips.
However, if the end of the last silhouette strip is not specified, further input triangles connected to
the silhouette strip may cause duplicate silhouette edges to be rendered. The end of each input
silhouette strip must be specified when alpha blending is used on silhouettes.

Note that a degenerate center triangle is considered to specify the end of a strip.

9.3.8. Open Edges


As explained earlier, an open edge is an edge of the center triangle that is not shared with any
other triangle. Because there is no adjacent triangle sharing that edge, the indices are specified by
reusing the remaining vertex of the center triangle in its place. You can visualize an open edge as
an adjacent triangle that has been folded precisely over the center triangle.

The indices for the object in Figure 9-5 would be specified as follows, if vertex 0 did not exist.

The indices (1, 4, 3) for the silhouette triangle would be specified in the order: 1, 4, 2, 3, 4, 6.

The silhouette strip indices would be specified in the following order: 1, 4, 2, 3, 4, 6, 5, 6, 7.

Open edges must be configured differently than silhouette edges. For more information, see 9.3.3.
Reserved Uniforms.

Figure 9-7. Sample Indices for Open Edges

9.3.9. Generating Silhouette Edges

A silhouette edge is rendered by generating a new rectangular polygon on the edge shared by a front-
facing center triangle and a back-facing adjacent triangle in a TWN. Two vertices are added along
the normal vectors (n1 and n2) of the two vertices (1 and 2) shared by the center and adjacent
triangles to generate a rectangular polygon (made of two triangular polygons).

Figure 9-8. Rectangular Polygon That Forms a Silhouette Edge

The coordinates (x', y', z', w') of the vertices to add are calculated by the following equations, where
(x, y, z, w) are the coordinates of a vertex on the center triangle, and (nx, ny, nz) represent its
normal vector.

x' = x + x_factor * nx * w_scale
y' = y + y_factor * ny * w_scale
z' = z
w' = w

The reserved uniform for silhouette edge width sets the values applied to x_factor, y_factor,
and w_scale.
9.4. Catmull-Clark Subdivision Shaders

The Catmull-Clark subdivision shader uses quadrilateral polygons and their surrounding vertices to
split groups of vertices into smooth polygons. Note: The word subdivision always indicates Catmull-
Clark subdivision within this section.

9.4.1. Subdivision Patches

To subdivide polygons, give the shader a set of polygons comprising only quadrilaterals (a
Catmull-Clark subdivision patch, or subdivision patch for short). A subdivision patch is made up of
the target (center) quadrilateral and the group of quadrilaterals that share edges formed by that
center quadrilateral's four vertices.

Figure 9-9. Sample Catmull-Clark Subdivision Patch

A subdivision patch can only be applied to a polygon model that is entirely made up of
quadrilaterals.

Each vertex in the central quadrilateral usually forms four edges, like vertices 5, 6, and 10.
However, a vertex that forms three or five edges like vertex 9 is called an extraordinary point, and
its edge count is called its valence. A subdivision patch can only have one extraordinary point.
It must be in the central quad and have a valence from 3 through 12.

If a subdivision patch's central quad has an extraordinary point, you must start specifying indices
from that point. Any other subdivision patches whose central quads contain the same extraordinary
point must be entered consecutively; if they are not, holes may appear in the mesh in some cases.

9.4.2. Shader Files

The shader file to link with the vertex shader is determined by the number of vertex attributes
output for fragment processing that are not required by the subdivision shader.

Subdivision: DMP_subdivisionN.obj

Where N is the number of vertex attributes (1 through 6) that are not required by the subdivision
shader.

9.4.3. Reserved Uniforms


Subdivision shaders have the following reserved uniforms. These reserved uniforms must be set by
the application because they initially have undefined values.

Subdivision Level

Use the glUniform1f() function to set the subdivision level (dmp_Subdivision.level), which
controls how finely the shader subdivides polygons. Higher levels (larger numbers) indicate finer
subdivision. At the smallest value of 0, a single vertex is added to the center of the subdivision
patch, and the patch's original vertex coordinates are adjusted.

Using Quaternions

The subdivision shader generates new output vertices that have interpolated vertex attributes for
their attributes that are not vertex coordinates. Because quaternions must be subdivided in a
particular way, the shader must be notified when it is given quaternions.

Use the glUniform1i() function to either enable (GL_TRUE) or disable (GL_FALSE) quaternions
(dmp_Subdivision.fragmentLightingEnabled).

Table 9-4. Reserved Uniforms for the Catmull-Clark Subdivision Shader

Reserved Uniform                        | Type  | Value to Set
dmp_Subdivision.level                   | float | Specifies the subdivision level: 0 (the lowest subdivision level), 1, or 2 (the highest subdivision level).
dmp_Subdivision.fragmentLightingEnabled | bool  | Specifies whether to use the quaternions required for fragment lighting. Specify GL_TRUE to use quaternions, and GL_FALSE otherwise.
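
A sketch of setting these reserved uniforms (the level value of 2.0 is just an example):

GLint loc;

loc = glGetUniformLocation(program, "dmp_Subdivision.level");
glUniform1f(loc, 2.0f);

loc = glGetUniformLocation(program, "dmp_Subdivision.fragmentLightingEnabled");
glUniform1i(loc, GL_FALSE);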

9.4.4. Vertex Shader Settings

The subdivision shader requires one vertex attribute: the vertex coordinates. The vertex shader
outputs the vertex coordinates, starting with the smallest output register number.

Two output vertex attributes must be set: the position attribute of the vertex coordinates and one
other attribute.

If the vertex shader outputs the vertex color in addition to the vertex coordinates required by the
subdivision shader, link the DMP_subdivision1.obj shader file and use the following #pragma
output_map statements in the vertex shader.

Code 9-6. Sample Output Register Settings When the Catmull-Clark Subdivision Shader Is Used (in Shader
Assembly)

#pragma output_map ( position , o0 )
#pragma output_map ( color , o1 )

When quaternions are required for fragment lighting, they must be output to the register with the
smallest number following the register that outputs the vertex coordinates.

Code 9-7. Sample Output Register Settings When the Subdivision Shader Uses Quaternions (in Shader
Assembly)
#pragma output_map ( position , o0 )
#pragma output_map ( quaternion , o1 )
#pragma output_map ( color , o2 )

9.4.5. Input Vertex Data

To use Catmull-Clark subdivision, call the glDrawElements() function, and specify
GL_GEOMETRY_PRIMITIVE_DMP for the mode parameter. You cannot use the glDrawArrays()
function. Vertex indices must also be used through the vertex buffer.

9.4.6. Subdivision Patch Indices

The number of vertices in a subdivision patch depends on the valence (from 3 through 12) of its
extraordinary point. A subdivision patch does not have a fixed number of input vertices.

Enter the size of the subdivision patch first. Specify the size of the subdivision patch as the number
of vertices that it includes. To find this number, double the extraordinary point's valence, and add 8.
A subdivision patch's size must be in the range from 14 through 32. It is 16 when the patch does
not have an extraordinary point. Behavior is undefined for input patch sizes larger than 32.

Specify a subdivision patch's indices in the following order.

1. The four vertices in the central quad (following the order specified when glFrontFace was
called, starting from the extraordinary point, if it exists).
2. The vertices around the central quad (following the order specified when glFrontFace was
called).

Assuming that polygons with a counterclockwise (CCW) winding are front-facing, the indices for the
subdivision patch in Figure 9-9 would be specified in the following order.

Subdivision Patch Indices: 18, 9, 5, 6, 10, 8, 4, 0, 1, 2, 3, 7, 11, 17, 16, 15, 14, 13, 12.

The first number, 18, is the size of the subdivision patch. It is followed by the central quad's
vertices, starting from the extraordinary point, vertex 9, and then by the surrounding vertices.

For the object in the following figure, the subdivision patch indices would be specified as 18, 9,
10, 16, 15, 5, 6, 7, 11, 17, 21, 20, 19, 18, 14, 13, 12, 8, 4.

Figure 9-10. Sample Subdivision Patch Indices
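
Expressed as an index array for glDrawElements() (using the first patch above; the indices are
placed in an element array buffer as required by 9.4.5, and counting the size prefix in the draw
count is an assumption here):

/* Size prefix (18), the central quad starting at the extraordinary point
   (vertex 9), and then the surrounding vertices.                          */
static const GLushort patchIndices[] = {
    18,
    9, 5, 6, 10,
    8, 4, 0, 1, 2, 3, 7, 11, 17, 16, 15, 14, 13, 12
};

GLuint ibo;
glGenBuffers(1, &ibo);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ibo);
glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(patchIndices), patchIndices,
             GL_STATIC_DRAW);

glDrawElements(GL_GEOMETRY_PRIMITIVE_DMP,
               sizeof(patchIndices) / sizeof(patchIndices[0]),
               GL_UNSIGNED_SHORT, 0);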


9.5. Loop Subdivision Shader

The Loop subdivision shader uses triangular polygons and their surrounding vertices to split groups
of vertices into smooth polygons. The word subdivision always indicates Loop subdivision within this
section.

9.5.1. Subdivision Patches

To subdivide polygons, give the shader a Loop subdivision patch (referred to later simply as a
subdivision patch). A subdivision patch comprises the target (center) triangle and the group of
vertices that share edges with that triangle's three vertices.

Figure 9-11. Sample Loop Subdivision Patch

Each vertex in the center triangle forms a number of edges that is called its valence.

With the vertices 0, 1, and 2 in Figure 9-11 set as the center triangle, the valence of vertex 0 is 6,
the valence of vertex 1 is 7, and the valence of vertex 2 is 6. For a patch to be accepted as a
subdivision patch, the valence of each vertex in the center triangle must be from 3 through 12, and
the total of the valences of the three vertices of the center triangle must be 29 or less.

You can increase the number of virtual vertices to include a vertex with a valence of 2 (a vertex that
only shares an edge with other vertices in the center triangle) in a subdivision patch.

9.5.2. Shader Files

The number of output registers configured to send non-valence vertex attributes to the subdivision
shader determines which shader file to link with the vertex shader. There must be at least one
output register because vertex coordinates are required output.

Loop subdivision: DMP_loopSubdivisionN.obj

Where N is the number of output registers (from 1 through 4) configured with vertex attributes other
than the valence.

9.5.3. Reserved Uniform

The Loop subdivision shader has the same reserved uniforms as the Catmull-Clark subdivision
shader. For more information, see 9.4.3. Reserved Uniforms. The reserved uniform for the loop
subdivision shader must be set by the application because its initial value is undefined.

Although new vertices are not added when the subdivision level is 0, the subdivision patch's
original vertex coordinates are adjusted.

9.5.4. Vertex Shader Settings

The subdivision shader requires two attributes: the vertex coordinates and the valence. The vertex
shader outputs these attributes, starting at the smallest output register number. The vertex
coordinates are first, followed by any other vertex attributes that exist, and finally, followed by the
valence.

Two output vertex attributes must be set: the vertex coordinates using the position attribute, and
the valence using the generic attribute.

If the vertex shader outputs the vertex color in addition to the vertex coordinates required by the
subdivision shader, link the DMP_loopSubdivision2.obj shader file (because the vertex
coordinates are also included in the number of output registers) and use the following #pragma
output_map statements in the vertex shader.

Code 9-8. Sample Output Register Settings When the Loop Subdivision Shader Is Used (in Shader Assembly)

#pragma output_map ( position , o0 )
#pragma output_map ( color , o1 )
#pragma output_map ( generic , o2 )

When quaternions are required for fragment lighting, they must be output to the register with the
smallest number following the register that outputs the vertex coordinates. Because the number of
non-valence output registers is restricted to 4 or less, multiple attributes must be packed into a
single register, when there are five or more vertex attributes other than the valence. Quaternions,
however, cannot be packed with other vertex attributes.

Code 9-9. Sample Output Register Settings When the Subdivision Shader Uses a Large Number of Vertex
Attributes (in Shader Assembly)

#pragma output_map ( position , o0 )
#pragma output_map ( quaternion , o1 )
#pragma output_map ( color , o2 )
#pragma output_map ( texture0 , o3.xy )
#pragma output_map ( texture1 , o3.zw )
#pragma output_map ( generic , o4 )

9.5.5. Input Vertex Data

To use Loop subdivision, call the glDrawElements() function and, for the mode parameter, pass
GL_GEOMETRY_PRIMITIVE_DMP. You cannot use the glDrawArrays() function. Vertex indices
must also be used through the vertex buffer.

9.5.6. Subdivision Patch Indices

The number of vertices in a subdivision patch depends on the total valence of the center triangle. A
subdivision patch does not have a fixed number of input vertices.
Enter the size of the subdivision patch first. Specify the size of the subdivision patch as three plus
the total valence of the vertices that form the center triangle.

Specify a subdivision patch's indices in the following order.

1. The indices of the three vertices (v0, v1, and v2) that form the center triangle (following the
order specified when glFrontFace was called).
2. All vertices that share an edge with v0 (in any order, although, the same order must also be
used by any other subdivision patch that includes the same vertex).
3. All vertices that share an edge with v1 (in any order, although, the same order must also be
used by any other subdivision patch that includes the same vertex).
4. All vertices that share an edge with v2 (in any order, although, the same order must also be
used by any other subdivision patch that includes the same vertex).
5. A fixed value of 12 and the center triangle's three vertices (12, v0, v1, and v2).
6. A vertex (e00) that forms a triangle with v0 and v2 and is not in the center triangle.
7. A vertex (e10) that forms a triangle with v0 and v1 and is not in the center triangle.
8. A vertex (e20) that forms a triangle with v1 and v2 and is not in the center triangle.
9. The vertex that shares an edge with v0 and is next to e00, in counterclockwise order.
10. The vertex that shares an edge with v1 and is next to e10, in counterclockwise order.
11. The vertex that shares an edge with v2 and is next to e20, in counterclockwise order.
12. The vertex that shares an edge with v0 and is next to e10, in clockwise order.
13. The vertex that shares an edge with v1 and is next to e20, in clockwise order.
14. The vertex that shares an edge with v2 and is next to e00, in clockwise order.

Assuming that polygons with a counterclockwise (CCW) winding are front-facing, the indices for the
subdivision patch in Figure 9-11 would be specified in the following order.

Subdivision Patch Indices:

22, 0, 1, 2, 1, 2, 12, 3, 4, 5, 2, 0, 5, 6, 7, 8, 9, 0, 1, 9, 10, 11, 12,


12, 0, 1, 2, 12, 5, 9, 3, 6, 10, 4, 11, 8.

. The number 22 on the first line is the size of the subdivision patch. The number 12 on the second
line is a fixed value.

When another subdivision patch uses v0 in its center triangle, the vertices that share an edge with
it must be specified in the same order: (1, 2, 12, 3, 4, 5). The same applies to v1 and v2.

Although some vertices will be specified more than once when specifying a subdivision patch, they
are read from the cache after vertex processing, and do not actually impose a performance penalty.

Figure 9-12. Sample Subdivision Patch Indices


9.6. Particle System Shaders

Particle system shaders are used by particle systems that render a large number of point sprites
(particles) along a Bézier curve.

Particles are rendered along a Bézier curve that is defined by four control points input to the shader.
Each control point is randomly placed within its own bounding box, changing the Bézier curve.

A particle's color, size, angle of texture coordinate rotation, and other attributes are interpolated
based upon its position on the Bézier curve.

9.6.1. Shader Files

Link the vertex shader to the correct particle system shader file, based on the features that you
want to support.

Particle system shader: DMP_particleSystem_X_X_X_X.obj

Where each X represents a value of 0 or 1 that controls a particle system feature. These features
are, in order, particle time clamping, texture coordinate rotation, the use of RGBA components or
the alpha component alone, and output of texture coordinate 2. A feature is not necessarily
disabled by a value of 0, nor enabled by a value of 1. Refer to the following table to determine
which shader file to link.

Table 9-5. Shader Filenames and the Particle System Features They Support

Filename        Time Clamping   Texture Coordinate Rotation   RGBA Colors    Texture Coordinate 2
*_0_0_0_0.obj   Yes             Yes                           (Alpha only)   No
*_0_0_0_1.obj   Yes             Yes                           (Alpha only)   Yes
*_0_0_1_0.obj   Yes             Yes                           Yes            No
*_0_0_1_1.obj   Yes             Yes                           Yes            Yes
*_0_1_0_0.obj   Yes             None                          (Alpha only)   No
*_0_1_0_1.obj   Yes             None                          (Alpha only)   Yes
*_0_1_1_0.obj   Yes             None                          Yes            No
*_0_1_1_1.obj   Yes             None                          Yes            Yes
*_1_0_0_0.obj   None            Yes                           (Alpha only)   No
*_1_0_0_1.obj   None            Yes                           (Alpha only)   Yes
*_1_0_1_0.obj   None            Yes                           Yes            No
*_1_0_1_1.obj   None            Yes                           Yes            Yes
*_1_1_0_0.obj   None            None                          (Alpha only)   No
*_1_1_0_1.obj   None            None                          (Alpha only)   Yes
*_1_1_1_0.obj   None            None                          Yes            No
*_1_1_1_1.obj   None            None                          Yes            Yes

Each asterisk (*) in the table stands for DMP_particleSystem.

9.6.2. Reserved Uniform


Particle system shaders have the following reserved uniforms. These reserved uniforms must be
set by the application because they initially have undefined values.

Color

Use the glUniformMatrix4fv() function to set the particle color (dmp_PartSys.color). A 4x4
matrix represents the particle color of the first, second, third, and fourth control point, using an
RGBA value in each of the corresponding rows. This setting is valid only for a shader file that uses
RGBA colors (DMP_particleSystem_X_X_1_X.obj).

Performance is worse when RGBA colors are used than when alpha components alone are used. If
you are not using color components, we recommend that you link to a shader file that uses alpha
components alone (DMP_particleSystem_X_X_0_X.obj).
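
As an illustration only (the program object variable and the color values below are hypothetical), the color matrix might be set as follows.

// Hypothetical RGBA colors for control points 1 through 4 (one row each).
GLfloat partSysColor[16] = {
    1.0f, 0.0f, 0.0f, 1.0f,   // control point 1
    1.0f, 1.0f, 0.0f, 1.0f,   // control point 2
    0.0f, 1.0f, 0.0f, 0.5f,   // control point 3
    0.0f, 0.0f, 1.0f, 0.0f    // control point 4
};
glUniformMatrix4fv(glGetUniformLocation(program, "dmp_PartSys.color"),
                   1, GL_FALSE, partSysColor);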

Aspect

Use the glUniformMatrix4fv() function to set the particle aspect (dmp_PartSys.aspect) with
a 4x4 matrix. Each row corresponds to a control point (from 1 through 4) and configures the particle
size, texture coordinate rotation, texture coordinate scaling, and alpha component.

Size

Set the particle size (the first aspect column) to a value of 1.0 or greater.

Use the glUniform2fv() function to set the minimum and maximum particle size
(dmp_PartSys.pointSize). Use the glUniform2fv() function to set the reciprocal of the
viewport's width and height (dmp_PartSys.viewport) because particle rendering does not
account for the screen size. If distance attenuation is being applied to the particle size, call the
glUniform3fv() function to set the distance attenuation factor
(dmp_PartSys.distanceAttenuation), and specify the attenuation coefficients that calculate
the attenuated size, as shown in the following equation.

derived_size is the size with distance attenuation applied, size is the original size, and d is
the distance from the viewpoint.
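
The equation itself is not reproduced here. Assuming the conventional OpenGL ES point-size attenuation model, with (a, b, c) being the three coefficients set in dmp_PartSys.distanceAttenuation, it would take a form such as:

$$\mathrm{derived\_size} = \mathrm{size} \times \sqrt{\frac{1}{a + b\,d + c\,d^{2}}}$$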

Texture Coordinates

You can set the rotation (the second aspect column) and scaling (the third aspect column) of
texture coordinates at each control point.
The particle system outputs two texture coordinates: 0 and 2. Both texture coordinates support
rotations, but only texture coordinate 2 supports scaling. The linked shader file enables or disables
these settings.

Output texture coordinate 2 (DMP_particleSystem_X_X_X_1.obj) or do not output it
(DMP_particleSystem_X_X_X_0.obj).
Rotate and scale the texture coordinates (DMP_particleSystem_X_0_X_X.obj), or neither
rotate nor scale them (DMP_particleSystem_X_1_X_X.obj).

Rotations are specified in radians. Clockwise rotations are specified by positive values.

Texture coordinate 0 is output with the particle's lower-left, lower-right, upper-left, and upper-right
corners at (0,0), (1,0), (0,1), and (1,1) respectively. Texture coordinate 2 is output with the
particle's lower-left, lower-right, upper-left, and upper-right corners at (-1,-1), (1,-1), (-1,1), and
(1,1) respectively.

Given a rotation of A and a scaling value of R, texture coordinate 0 is calculated as follows.

Lower-left = ( 0.5×(1.0 + (-cosA + sinA)), 0.5×(1.0 + (-cosA - sinA)) )
Lower-right = ( 0.5×(1.0 + (cosA + sinA)), 0.5×(1.0 + (-cosA + sinA)) )
Upper-left = ( 0.5×(1.0 + (-cosA - sinA)), 0.5×(1.0 + (cosA - sinA)) )
Upper-right = ( 0.5×(1.0 + (cosA - sinA)), 0.5×(1.0 + (cosA + sinA)) )

Texture coordinate 2 is calculated as follows.

Lower-left = ( R×(-cosA + sinA), R×(-cosA - sinA) )
Lower-right = ( R×(cosA + sinA), R×(-cosA + sinA) )
Upper-left = ( R×(-cosA - sinA), R×(cosA - sinA) )
Upper-right = ( R×(cosA - sinA), R×(cosA + sinA) )

Alpha Component

Set the particle's alpha component (the fourth aspect column) as a value between 0.0 and 1.0. This
setting is used with the linked shader files that only use the alpha component
(DMP_particleSystem_X_X_0_X.obj).

Emission Count

Use the glUniform1fv() function to set the maximum particle emission count
(dmp_PartSys.countMax). Set this value to one less than the actual number of particles you want
to emit. You must set a value of 0.0 or greater. However, because the shader program
implementation limits the maximum number of emitted particles to 255, no more than 255 particles
will be emitted, even when this reserved uniform is set to a value greater than 256.

Execution Time and Speed

A particle system has a concept of time. Use the glUniform1fv() function to set the particle
system time (dmp_PartSys.time) to the current time. This current time is randomly converted into
each particle's execution time, during which the particle travels from the first control point to the
fourth control point. A particle is emitted at the first control point when its execution time is 0.0, and
reaches the fourth control point when its execution time is 1.0.

If you link to a shader file that clamps the execution time (DMP_particleSystem_0_X_X_X.obj),
particles with an execution time of 1.0 or greater cease to be rendered. Consequently, particles will
cease to be emitted at some point, if the application simply lets time pass without resetting the
execution time.
If you link to a shader file that does not clamp the execution time
(DMP_particleSystem_1_X_X_X.obj), the execution time loops between 0.0 and 1.0. In other
words, particles that reach the fourth control point are re-emitted from the first control point.

Use the glUniform1fv() function to set the particle speed (dmp_PartSys.speed).
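
As an illustration only (the values and the program variable below are hypothetical), these reserved uniforms might be set as follows.

GLfloat partTime  = 0.25f;   // current particle system time
GLfloat partSpeed = 1.0f;    // particle movement speed (0.0 or greater)
GLfloat countMax  = 99.0f;   // one less than the desired number of particles
glUniform1fv(glGetUniformLocation(program, "dmp_PartSys.time"), 1, &partTime);
glUniform1fv(glGetUniformLocation(program, "dmp_PartSys.speed"), 1, &partSpeed);
glUniform1fv(glGetUniformLocation(program, "dmp_PartSys.countMax"), 1, &countMax);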

Random Values

The position of control points within their bounding boxes and the execution time of particles are
determined using a function that generates pseudorandom numbers. The application can specify a
random seed and coefficient to use in this random function.

The implementation of the random function is similar to the following algorithm for a pseudorandom
number generator.
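
Equation 9-8 is not reproduced here. Given that the coefficients are specified as a, b, m, and 1/m, the generator presumably has the standard linear congruential form (this form is an assumption, not the official definition):

$$X_{n+1} = (a X_n + b) \bmod m$$

with 1/m presumably supplied so that results can be normalized to the range from 0.0 through 1.0 without a division.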

Use the glUniform4fv() function to set the random seed (dmp_PartSys.randSeed) with an
array of values (the x, y, and z components of the Bézier curve and the particle execution time, in
that order) corresponding to X0 in Equation 9-8.

For the random function's coefficients (dmp_PartSys.randomCore), use the glUniform4fv()
function to specify the values a, b, m, and 1/m in Equation 9-8.

Table 9-6. Reserved Uniforms for Particle System Shaders

Reserved Uniform                  Type    Value to Set
dmp_PartSys.color                 mat4    The color at each control point. (R, G, B, A) × 4.
                                          Each component has a value in the range from 0.0 through 1.0.
dmp_PartSys.aspect                mat4    The aspect of each control point.
                                          (particle_size, rotation_angle, scale, alpha) × 4.
                                          particle_size is 1.0 or greater and alpha is in the range from 0.0 through 1.0.
dmp_PartSys.time                  float   The current particle system time.
dmp_PartSys.speed                 float   The speed of particle movement. 0.0 or more.
dmp_PartSys.countMax              float   A number that is one less than the maximum number of particles to emit. 0.0 or more.
dmp_PartSys.randSeed              vec4    The random seed for each of the random functions.
                                          (Multiplicand with the Bézier curve's x component, multiplicand with the y component,
                                          multiplicand with the z component, multiplicand with the particle execution time.)
dmp_PartSys.randomCore            vec4    The random function's coefficients. (a, b, m, 1/m)
dmp_PartSys.distanceAttenuation   vec3    The distance attenuation factor.
dmp_PartSys.viewport              vec2    The reciprocal of the viewport's width and height:
                                          (1 / viewport_width, 1 / viewport_height).
dmp_PartSys.pointSize             vec2    The minimum and maximum particle size. Each has a value of 0.0 or greater.

9.6.3. Vertex Shader Settings


A particle system shader requires a 4×4 matrix that contains the following values converted into clip
coordinates: a single control point's vertex coordinates and the radius for the x, y, and z
components of its bounding box (centered on said vertex coordinates). Starting at the smallest
output register number, the vertex shader outputs the vertex coordinates followed by the first,
second, third, and fourth rows of the conversion matrix, in that order.

Two output vertex attributes must be set: the vertex coordinates using the position attribute and
the converted matrix using the generic attribute.

Code 9-10. Sample Output Register Settings When a Particle System Shader Is Used (in Shader Assembly)

#pragma output_map ( position , o0 )
#pragma output_map ( generic , o1 )
#pragma output_map ( generic , o2 )
#pragma output_map ( generic , o3 )
#pragma output_map ( generic , o4 )

The following equation shows the conversion into clip coordinates. The radii for the bounding box's
x, y, and z components are: Rx, Ry, and Rz respectively. The projection matrix is Mproj and the
modelview matrix is Mmodelview.
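
The equation is not reproduced here; judging from Code 9-11 below, the output matrix appears to take the following form (an inference from the code, not the official formula):

$$M_{out} = M_{proj}\,M_{modelview}\begin{pmatrix} R_x & 0 & 0 & 0 \\ 0 & R_y & 0 & 0 \\ 0 & 0 & R_z & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$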

This is shown by the following shader assembly code. aBoundingBox is a vector with the x, y, and
z components of the radius input by the application, and vBoundingBox1 through
vBoundingBox4 represent the matrix output to the particle system shader. The bounding box's
radius is input as attributes in this sample code, but because the particle system shader requires
data for only four vertices, you can use an implementation that sets all of this along with the vertex
coordinates in uniforms.

Code 9-11. Bounding Box Radius and Clip Coordinate Conversion (in Shader Assembly)

mov TEMP_BOX[0], CONST_0
mov TEMP_BOX[1], CONST_0
mov TEMP_BOX[2], CONST_0
mov TEMP_BOX[3], CONST_0
mov TEMP_BOX[0].x, aBoundingBox.x
mov TEMP_BOX[1].y, aBoundingBox.y
mov TEMP_BOX[2].z, aBoundingBox.z
m4x4 TEMP_MAT, MATRIX_Project, MATRIX_ModelView
m4x4 TEMP_MAT, TEMP_MAT, TEMP_BOX
mov vBoundingBox1, TEMP_MAT[0]
mov vBoundingBox2, TEMP_MAT[1]
mov vBoundingBox3, TEMP_MAT[2]
mov vBoundingBox4, TEMP_MAT[3]

9.6.4. Input Vertex Data

Calls from the glDrawArrays() function are not supported. When using a particle system shader,
call the glDrawElements() function, and specify GL_GEOMETRY_PRIMITIVE_DMP for the mode
parameter.

10. Rasterization

All primitives, even those created by geometry shaders such as the point and line shaders, are
ultimately converted into triangle primitives (through triangle generation and triangle setup).
The generated triangles are culled, clipped, converted into window coordinates, and finally rasterized
into a collection of fragments. Unlike in OpenGL ES 2.0, the scissor test is performed during the
rasterization stage. All further processing operates on the generated fragments.

Figure 10-1. Process From the Vertex and Geometry Shaders to Rasterization

All processing after rasterization is implemented as a fixed pipeline, so there are fixed vertex attributes
that can be assigned to fragments. The following are the major vertex attributes.

Window coordinates
Depth values
Texture coordinates and partial differential values
Quaternions
View vectors
Vertex colors (rasterized from absolute values).

10.1. Culling

The order in which vertices are specified for the generated triangles (polygons) determines which
face, front or back, is facing the viewer. Culling is a feature that determines whether to rasterize a
polygon based on whether it is back-facing.

10.1.1. Determining the Front and Back Faces

A polygon's front and back faces are determined by the order in which its triangles' vertices are
specified in window coordinates. Either a clockwise (CW) or a counterclockwise (CCW) winding can
be considered to be front-facing. You can specify this using the glFrontFace() function.

Code 10-1. Definition of the glFrontFace Function

void glFrontFace(GLenum mode);

For mode, specify GL_CW for clockwise winding or GL_CCW for counterclockwise winding. A
counterclockwise winding (GL_CCW) is configured by default.
10.1.2. How to Use

Culling is used in the same way as in OpenGL.

Enabling or Disabling Culling

To enable or disable culling, call glEnable or glDisable, respectively, and specify
GL_CULL_FACE for cap. Call the glIsEnabled() function, and specify GL_CULL_FACE for cap to
get the current setting. Culling is disabled by default.

Specifying the Culled Face

Use the glCullFace() function to specify which face is not rasterized (the culled face).

Code 10-2. Definition of the glCullFace Function

void glCullFace(GLenum mode);

You can choose from the following mode values to specify the culled face.

Table 10-1. Specifying the Culled Face

Setting Value Culled Face

GL_FRONT Front
GL_BACK (default) Back
GL_FRONT_AND_BACK Both
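
As an illustration, the following calls enable culling with the winding and culled face made explicit; the values shown are simply the defaults described above.

glEnable(GL_CULL_FACE);   // culling is disabled by default
glFrontFace(GL_CCW);      // counterclockwise polygons are front-facing
glCullFace(GL_BACK);      // do not rasterize back faces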

10.2. Clipping

Clipping is a feature that clips (removes) primitives from the regions (clip volumes) at which the view
volume intersects the half-spaces defined by the specified clipping planes. 3DS clipping features
correspond to the clipping features in OpenGL ES 1.1, but they are all controlled using reserved
uniforms. The clipping implementation also generates new vertices and triangles for each clipped
triangle, splitting it into multiple triangles.

Because GPU vertex processing converts coordinates using 24-bit floating-point numbers, clipping
may not be performed correctly at the far clipping plane, if the ratio of the near clipping plane to the
far clipping plane is large. Either configure the clip volume to keep the ratio of the near clipping plane
to the far clipping plane small, or avoid placing polygons near the far clipping plane, whenever
possible.

10.2.1. Reserved Uniform

The following reserved uniforms are used for clipping.

Enabling or Disabling Clipping


To enable clipping, set the reserved uniform dmp_FragOperation.enableClippingPlane to
GL_TRUE, using the glUniform1i() function. This setting is disabled (set to GL_FALSE) by
default.

Clipping Plane

To specify the clipping plane, set four coefficients in the reserved uniform
dmp_FragOperation.clippingPlane using the glUniform4f() function. If p1, p2, p3, and p4
are the four coefficients, the clip volume is a collection of points that satisfy the following equation.
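
The equation is not reproduced here. Assuming the standard OpenGL clip-plane formulation in clip coordinates (Xc, Yc, Zc, Wc), the clip volume would consist of the points satisfying:

$$p_1 X_c + p_2 Y_c + p_3 Z_c + p_4 W_c \ge 0$$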

Because these coefficients must be defined in the clip coordinate system, you must specify values
that have been through a modelview transformation and perspective projection for the clipping
plane used by the OpenGL ES standard. Unlike the OpenGL ES specifications, the 3DS system
clips z-coordinates between 0 and -Wc. OpenGL ES-compatible matrices cannot be used
unchanged for perspective projections. All coefficients are set to 0.0 by default.

Note: For cautions related to using OpenGL ES-compatible matrices with projection
transformations, see 8.4. Notes for the Clip Coordinate System.

Table 10-2. Reserved Uniforms Used for Clipping

Reserved Uniform                        Type   Value to Set
dmp_FragOperation.enableClippingPlane   bool   Enables or disables clipping. GL_TRUE or GL_FALSE (default).
dmp_FragOperation.clippingPlane         vec4   Specifies the four coefficients for the clipping plane.
                                               This is (0.0, 0.0, 0.0, 0.0) by default.
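
As an illustration only (the program variable and the coefficient values below are hypothetical), clipping might be configured as follows.

glUniform1i(glGetUniformLocation(program,
            "dmp_FragOperation.enableClippingPlane"), GL_TRUE);
glUniform4f(glGetUniformLocation(program,
            "dmp_FragOperation.clippingPlane"), 0.0f, 0.0f, -1.0f, 0.0f);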

10.3. Transformation to Window Coordinates

A polygon model's vertices are transformed from object coordinates into window coordinates through
the following four steps.

Modelview transformation:
Transforms object coordinates into eye coordinates.

Projection transformation:
Transforms eye coordinates into clip coordinates. Note that the converted values (Zc) are
between -Wc and 0.

Perspective division:
Transforms clip coordinates into normalized device coordinates using w values.
Viewport transformation:
Transforms normalized device coordinates into window coordinates following the viewport
settings.

The vertex shader (and geometry shader) must output vertex attributes in clip coordinates. As a
result, the vertex shader generally performs modelview and projection transformations internally.
Perspective division and viewport transformation determine the position of each generated triangle in
the display region. These triangles are converted into fragments during rasterization, and are then
used by processes such as fragment lighting.

The following figure illustrates the process of transforming coordinates.

Figure 10-2. Process of Transforming Vertices From Object Coordinates Into Window Coordinates

10.3.1. Configuring the Viewport

Normalized device coordinates are transformed into window coordinates by the following equation.

Xw, Yw, and Zw are window coordinates.
Xd, Yd, and Zd are normalized device coordinates.
px and py are the viewport's width and height.
ox and oy are the viewport's center point.
n and f are the depth values of the near and far planes, respectively, in clip space.
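
The equation is not reproduced here. Assuming the standard OpenGL ES viewport transformation, which is consistent with the variables defined above, it would be:

$$X_w = \frac{p_x}{2} X_d + o_x, \qquad Y_w = \frac{p_y}{2} Y_d + o_y, \qquad Z_w = \frac{f - n}{2} Z_d + \frac{n + f}{2}$$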

You can use the glDepthRangef() function to set the values applied to n and f in this equation.

Code 10-3. Definition of the glDepthRangef Function

void glDepthRangef(GLclampf zNear, GLclampf zFar);

The values set for zNear and zFar are clamped between 0.0 and 1.0. By default, zNear is 0.0
and zFar is 1.0.

px, py, ox, and oy can all be calculated from the viewport settings. The glViewport() function
configures the viewport.

Code 10-4. glViewport Function


void glViewport(GLint x, GLint y, GLsizei width, GLsizei height);

For x and y specify the coordinates of the viewport's starting point (lower-left corner). A
GL_INVALID_VALUE error is generated if a negative value is specified. If the specified value is not
a multiple of 4, processing efficiency drops (to half for an even number and to one third for an odd
number). In this case, extend the viewport so that it is a multiple of 4, adjust the perspective
projection matrix so that it renders correctly on the extended viewport, and then apply the scissor
test to avoid rendering unnecessary regions.

For width and height, specify the width and height of the viewport. A GL_INVALID_VALUE error
is generated if a negative value is specified. The maximum width and height are both 1024.

The value of width is applied to px, height is applied to py, x + width / 2 is applied to ox,
and y + height / 2 is applied to oy.

A Viewport Larger Than 1023×1016 Causes Incorrect Rendering

A hardware bug prevents images from being rendered properly if the following conditions are met
when the glViewport() function sets the viewport.

When width is greater than 1023, entire polygons are not rendered if they contain pixels
whose window x-coordinate is at least 1023 greater than the x parameter of the
glViewport() function (taking the left side of the window to have an x-coordinate of 0).
When height is greater than 1016, the GPU stops responding if any polygon contains pixels
whose window y-coordinate is at least 1016 greater than the y parameter of the
glViewport() function (taking the bottom of the window to have a y-coordinate of 0).

To work around this hardware bug, avoid rendering pixels at the problematic coordinates by
setting the viewport’s width and height to values that are not greater than 1023 and 1016,
respectively.

When using the render-to-texture technique with a 1024×1024 texture, you must only render to a
1023×1016 region, and you must adjust texture coordinates so that only a 1023×1016 texture
region is valid for use.

If you want to render to the entire 1024×1024 region, you must keep the viewport’s width from
exceeding 1023 and its height from exceeding 1016, while changing the viewport’s offset to
render the texture in sections. For example, the bug described in this section does not occur if
you split rendering into the following four function calls: glViewport(0, 0, 512, 512),
glViewport(512, 0, 512, 512), glViewport(0, 512, 512, 512), and
glViewport(512, 512, 512, 512).

You cannot work around this bug by using the scissor test to prevent the pixels at the problematic
coordinates from being rendered.

10.3.2. Polygon Offset

Polygon offset is a feature that adds an offset to depth values when polygons are rasterized and
converted into fragments. This resolves the situation in which a lack of depth value resolution
prevents the fragments' front-to-back order from being determined, as for overlapping coplanar
polygons. Polygon offset occurs during window coordinates conversion.
Enabling or Disabling Polygon Offset

To enable or disable polygon offset, call glEnable or glDisable, respectively, and specify
GL_POLYGON_OFFSET_FILL for cap. Call the glIsEnabled() function, and specify
GL_POLYGON_OFFSET_FILL for cap to get the current setting. Polygon offset is disabled by
default.

Specifying Offset Values

You can use the glPolygonOffset() function to specify the offset for depth values when polygon
offset is enabled.

Code 10-5. Definition of the glPolygonOffset Function

void glPolygonOffset(GLfloat factor, GLfloat units);

OpenGL uses the values of factor and units to determine the offset, but the 3DS system uses
only units. The value of factor is set, but it is irrelevant to the offset.

The offset value is set with the product of units and the minimum (fixed) value at which a
difference in the depth value appears in window coordinates. Because the z-value for vertex
coordinates is implemented as a 24-bit floating-point number after vertex processing, a units
value that is not a multiple of 128 has no effect when a polygon’s z-value is close to 1.0. To get a
definite effect, set units to a multiple of 128.

Depth values are written to the depth buffer after an offset is added to them.
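
As an illustration, the following calls enable polygon offset with a units value that is a multiple of 128, as recommended above; the specific value is arbitrary.

glEnable(GL_POLYGON_OFFSET_FILL);
glPolygonOffset(0.0f, 128.0f);   // factor is ignored on this system; only units is used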

10.3.3. W-Buffer

The w-buffer is a feature that calculates depth values in window coordinates without using a
perspective projection. You can control the w-buffer with the following reserved uniforms. Settings
made by the glDepthRangef() function are disabled when the w-buffer is enabled.

Depth Value Scale

Depth values are calculated by the following equation, when the w-buffer is enabled.

Zw = -scale_w × Zc

Zw is the depth value in window coordinates, Zc is the z-value in clip coordinates, and scale_w is the
scale value. This scale value is a floating-point number set by the reserved uniform
dmp_FragOperation.wScale. As long as it is not 0.0, the w-buffer is enabled. Set the scale such
that Zw remains in the range from 0.0 to 1.0.

Table 10-3. Reserved Uniforms Used for the W-Buffer

Reserved Uniform           Type    Setting Value
dmp_FragOperation.wScale   float   Specifies the depth value scale. 0.0 by default.

When Polygon Offset Is Enabled


If both the w-buffer and polygon offset features are enabled, the value obtained by multiplying Wc
(the w value in clip coordinates) by units, as specified by the glPolygonOffset() function,
is applied as the polygon offset value.

10.4. Scissor Test

The scissor test is a feature that rejects fragments outside the specified range of window coordinates
to reduce the number of fragments handled by further processing.

10.4.1. How to Use

Although its position in the pipeline is different, the process has the same specifications as the
OpenGL scissor test.

Enabling or Disabling the Scissor Test

To enable or disable the scissor test, call glEnable or glDisable respectively, and specify
GL_SCISSOR_TEST for cap. Call the glIsEnabled() function and specify GL_SCISSOR_TEST for
cap to get the current setting. The scissor test is disabled by default. When it is disabled,
fragments are not rejected.

Specifying the Scissor Box

Use the glScissor() function to specify the range (scissor box) through which fragments are
allowed to pass.

Code 10-6. Definition of the glScissor Function

void glScissor(GLint x, GLint y, GLsizei width, GLsizei height);

x and y are coordinates that specify the starting point (lower-left corner) of the scissor box in the
window coordinates. width and height specify the width and height of the scissor box. A
GL_INVALID_VALUE error is generated if width or height is 0. Because there are no default
values, always specify the scissor box when the scissor test is enabled.

The scissor box includes the fragment at the starting coordinates, but not the fragment with an x-
coordinate of (x + width) or a y-coordinate of (y + height).
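
As an illustration only (the box dimensions below are arbitrary), the scissor test might be configured as follows.

glEnable(GL_SCISSOR_TEST);
glScissor(0, 0, 400, 240);   // allow fragments only inside this box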

10.5. Rasterization

Polygon rasterization (fragment generation) by the PICA graphics core adheres to the following rules.

The center coordinates of the pixel (x + 0.5, y + 0.5; x and y are integers) must lie inside the
polygon.
According to the lower-left rule, a fragment is generated if the lower or left edge of a polygon
passes through the center coordinate of a pixel, but no fragment is generated if the upper or right
edge passes through the center coordinate.
Note: This assumes that the x-axis is negative toward the left and the y-axis is negative toward
the lower edge.

Figure 10-3 shows how the lower-left rule is applied based on the rasterization results of two
polygons. The two polygons are the one defined by the three vertices (5.5, 0.5), (5.5, 5.5), and (0.5,
5.5), and the one defined by the three vertices (5.5, 0.5), (0.5, 0.5), and (0.5, 5.5). The former is
shown in red, while the latter is shown in blue.

Figure 10-3. Example Demonstrating Rasterization Rules (the Lower-Left Rule)

The rasterization rules are applied as follows.

On the boundary of the two polygons, the left edge of the red polygon passes through the center of
the pixels, so these pixels are colored red.
Because the blue polygon passes through the center of pixels with a center coordinate x value of 0.5,
these pixels are colored blue.
Because the right side of the red polygon passes through the center of the pixels with a center
coordinate x value of 5.5, these pixels are not colored.
Because the bottom of the blue polygon passes through the center of the pixels with a center
coordinate y value of 0.5, these pixels are colored blue.
And because the top of the red polygon passes through the center of the pixels with a center
coordinate y value of 5.5, these pixels are not colored.


11. Texture Processing


The 3DS system can perform texture processing operations equivalent to those in OpenGL ES 2.0, but
there are several CTR-specific restrictions.

11.1. Texture Unit


Four texture units (TEXTURE0 through TEXTURE3) are installed on the 3DS system, but each one can
handle different types of textures. A rasterizer can independently output up to three sets of texture
coordinates to the texture units. To output texture coordinates to all four texture units, TEXTURE2
or TEXTURE3 must share the same texture coordinates with other texture units.

Table 11-1. Textures Supported by the Texture Units

Texture Unit   2D Textures   Cube Map Textures   Shadow Textures   Projection Textures   Procedural Textures
TEXTURE0       ✓             ✓                   ✓                 ✓
TEXTURE1       ✓
TEXTURE2       ✓
TEXTURE3                                                                                 ✓

One- and three-dimensional textures are not supported. Cube map textures, shadow textures,
projection textures, and other textures that require the w component can only be processed by
TEXTURE0. TEXTURE3 is a unit used exclusively for procedural textures.
GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS defines the number of installed texture units to be 4.

11.1.1. Texture Coordinate Input

Only texture unit 0 can accept the w component as input; the other texture units only allow two
input components, u and v. A projection texture is simply a 2D texture with its w component
enabled, but care must be taken with the coordinates output by the vertex shader because the
generated UV coordinates are divided by the w component.

To send texture coordinates from the vertex shader, map output registers to the attribute names
texture0, texture0w, texture1, and texture2. texture0w must be output for cube map
textures, shadow textures, projection textures, and anything else that requires the w component. If
it is not output, texture coordinate 0's output is undefined. texture0w output is ignored for
textures that do not require the w component.

Table 11-2. Attribute Names and Corresponding Texture Coordinates

Attribute Name   Attributes Sent From the Vertex Shader
texture0         UV components of texture coordinate 0.
texture0w        W component of texture coordinate 0.
texture1         UV components of texture coordinate 1.
texture2         UV components of texture coordinate 2.

Texture Coordinate Precision

Within a texture unit, texture coordinates are represented as 16-bit values combining an integer
and decimal component. The number of decimal bits decreases as the absolute value of the
integer component increases.

The accuracy of texture sampling depends on the decimal bit precision. A texture can be
sampled optimally if there are enough decimal bits to represent its width and height in texels.
Allocate another six decimal bits for bilinear filtering.
11.1.2. How to Use

To enable or disable a texture unit in OpenGL, you pass an argument such as GL_TEXTURE_2D to
the glEnable() and glDisable() functions. On the 3DS system, however, set the reserved
uniform dmp_Texture[i].samplerType (where i is the texture unit number). Behavior is
undefined if you call glEnable() or glDisable() using the GL_TEXTURE_2D argument.

To disable a texture unit, call glUniform1i, and pass GL_FALSE to set a reserved uniform value.
Each texture unit is enabled by different reserved uniform settings.

Table 11-3. Settings for the Reserved Uniforms dmp_Texture[i].samplerType

Reserved Uniform             Value to Set                  Supported Textures
dmp_Texture[0].samplerType   GL_FALSE                      Disabled (default).
                             GL_TEXTURE_2D                 2D textures.
                             GL_TEXTURE_CUBE_MAP           Cube map textures.
                             GL_TEXTURE_SHADOW_2D_DMP      Shadow textures.
                             GL_TEXTURE_SHADOW_CUBE_DMP    Cube map shadow textures.
                             GL_TEXTURE_PROJECTION_DMP     Projection textures.
dmp_Texture[1].samplerType   GL_FALSE                      Disabled (default).
                             GL_TEXTURE_2D                 2D textures.
dmp_Texture[2].samplerType   GL_FALSE                      Disabled (default).
                             GL_TEXTURE_2D                 2D textures.
dmp_Texture[3].samplerType   GL_FALSE                      Disabled (default).
                             GL_TEXTURE_PROCEDURAL_DMP     Procedural textures.

Texture units 0 and 1 both have fixed texture coordinate input: texture coordinates 0 and 1,
respectively. Texture units 2 and 3 must have their texture coordinate input specified by the
reserved uniform dmp_Texture[i].texcoord (where i is the texture unit number 2 or 3). Each
texture unit allows different reserved uniform settings. When all four texture units are used, texture
unit 2 or 3 must share its input texture coordinates with another texture unit.

Color values input to the texture combiner are undefined when the texture combiner accesses a
disabled texture unit.

Table 11-4. Settings for the Reserved Uniforms dmp_Texture[i].texcoord

Reserved Uniform          Value to Set   Texture Coordinate Used as Input
dmp_Texture[2].texcoord   GL_TEXTURE1    Texture coordinate 1.
                          GL_TEXTURE2    Texture coordinate 2 (default).
dmp_Texture[3].texcoord   GL_TEXTURE0    Texture coordinate 0 (default).
                          GL_TEXTURE1    Texture coordinate 1.
                          GL_TEXTURE2    Texture coordinate 2.
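
As an illustration only (the program variable and the chosen units are hypothetical), texture units 0 and 2 might be configured as 2D textures, with texture coordinate 1 fed to unit 2, as follows.

glUniform1i(glGetUniformLocation(program, "dmp_Texture[0].samplerType"),
            GL_TEXTURE_2D);
glUniform1i(glGetUniformLocation(program, "dmp_Texture[2].samplerType"),
            GL_TEXTURE_2D);
glUniform1i(glGetUniformLocation(program, "dmp_Texture[2].texcoord"),
            GL_TEXTURE1);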

Unless they are configured using reserved uniforms, texture unit settings apply to the texture unit
specified by the glActiveTexture() function.

Code 11-1. Definition of the glActiveTexture Function

void glActiveTexture(GLenum texture);


For texture you can specify GL_TEXTURE0, GL_TEXTURE1, or GL_TEXTURE2. A
GL_INVALID_ENUM error is generated when some other value, or GL_TEXTURE3 (texture unit 3), is
specified.

Reserved uniforms are used for all procedural texture settings.

11.1.3. Specifying Textures to Use

To specify the texture to use with a texture unit, first specify the texture unit using the
glActiveTexture() function, and then specify the texture object using the glBindTexture()
function.

If you want to use a different texture for each texture unit, call the glActiveTexture() and
glBindTexture() functions, as shown in the following sample code.

Code 11-2. Specifying a Texture for Each Texture Unit

// Texture Unit0
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, imageTexID);
// Texture Unit1
glActiveTexture(GL_TEXTURE1);
glBindTexture(GL_TEXTURE_2D, bumpTexID);

11.1.4. Texture Parameters

Use the glTexParameter*() functions to add parameters to a texture, such as the texture
wrapping mode and filters.

Code 11-3. Definition of the glTexParameter* Functions

void glTexParameterf(GLenum target, GLenum pname, GLfloat param);
void glTexParameterfv(GLenum target, GLenum pname, const GLfloat* params);
void glTexParameteri(GLenum target, GLenum pname, GLint param);
void glTexParameteriv(GLenum target, GLenum pname, const GLint* params);

The glTexParameteri() function adds parameters passed in as integers and the
glTexParameterf() function adds parameters passed in as floating-point numbers. Functions
with names that end in v are used to add parameters that must be passed as vectors (arrays).

For target specify the same value specified in the glTexImage2D() function (Table 7-2). pname
is the name of the parameter to add and param is the parameter value. Unique parameters have
been added for 3DS.

The 3DS system gets up to eight texels for a single fragment when filtering.

Table 11-5. Parameters Added for pname

pname                     int / float / vector   Added Parameter
GL_TEXTURE_WRAP_S         int                    Wrapping mode in the S direction (Table 11-6).
GL_TEXTURE_WRAP_T         int                    Wrapping mode in the T direction (Table 11-6).
GL_TEXTURE_MIN_FILTER     int                    Minification filter (Table 11-7).
GL_TEXTURE_MAG_FILTER     int                    Magnification filter (Table 11-8).
GL_TEXTURE_BORDER_COLOR   vec4 (float)           The border color to use when the wrapping mode is GL_CLAMP_TO_BORDER
                                                 (each component is between 0.0 and 1.0).
GL_TEXTURE_LOD_BIAS       float                  LOD bias (-16.0 through 16.0; 0.0 by default).
GL_TEXTURE_MIN_LOD        int                    Minimum LOD (-1000 by default).
GL_GENERATE_MIPMAP        int (bool)             Automatic mipmap texture generation (GL_FALSE by default).
                                                 For more information, see Automatic Generation of Mipmap Textures.

The following values specify the wrapping mode in the S and T directions.

Table 11-6. Specifying the Wrapping Mode in the S and T Directions

param                 Description
GL_REPEAT             Repeat (default).
GL_MIRRORED_REPEAT    Flip and repeat.
GL_CLAMP_TO_EDGE      Use the color at the edge of the texture image for texture coordinates that are not in the range from 0.0 through 1.0.
GL_CLAMP_TO_BORDER    Use the border color for texture coordinates that are not in the range from 0.0 through 1.0.

The following values specify the filters to use when rendering texture images that have been scaled
down.

Table 11-7. Specifying Minification Filters

param                       Description
GL_NEAREST                  Use the color of the nearest texel (default).
GL_LINEAR                   Use bilinear sampling (the average of four samples) to determine the color.
GL_NEAREST_MIPMAP_NEAREST   Select the nearest mipmap texture, and then use the color of the nearest texel.
GL_NEAREST_MIPMAP_LINEAR    Select two mipmap texture levels, and then interpolate between the nearest colors at each level.
GL_LINEAR_MIPMAP_NEAREST    Select the nearest mipmap texture, and then use bilinear sampling to determine the color.
GL_LINEAR_MIPMAP_LINEAR     Select two mipmap texture levels, and then interpolate between the bilinearly sampled colors at each level.

The following values specify the filters to use when rendering texture images that have been scaled
up.

Table 11-8. Specifying Magnification Filters

param        Description
GL_NEAREST   Use the color of the nearest texel (default).
GL_LINEAR    Use bilinear sampling (the average of four samples) to determine the color.
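
As an illustration (reusing the hypothetical texture object from Code 11-2), the wrapping mode and filters for a 2D texture might be set as follows.

glBindTexture(GL_TEXTURE_2D, imageTexID);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);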

Automatic Generation of Mipmap Textures


When the texture parameter GL_GENERATE_MIPMAP is GL_TRUE and a value of -2 or less is
passed to level in the glTexImage2D(), glCopyTexImage2D(), or
glCopyTexSubImage2D() function, mipmap textures are automatically generated for all but the
lowest mipmap level. However, texture formats (combinations of the format and type
parameters) apply the following restrictions to the minimum width and height of mipmap textures
that are automatically generated.

Table 11-9. Minimum Width and Height of Automatically Generated Mipmap Textures

Format     format                        type                        Minimum Width and Height
RGBA4      GL_RGBA, GL_RGBA_NATIVE_DMP   GL_UNSIGNED_SHORT_4_4_4_4   64
RGBA5551   GL_RGBA, GL_RGBA_NATIVE_DMP   GL_UNSIGNED_SHORT_5_5_5_1   64
RGBA8      GL_RGBA, GL_RGBA_NATIVE_DMP   GL_UNSIGNED_BYTE            32
RGB565     GL_RGB, GL_RGB_NATIVE_DMP     GL_UNSIGNED_SHORT_5_6_5     64
RGB8       GL_RGB, GL_RGB_NATIVE_DMP     GL_UNSIGNED_BYTE            32

Even if two textures have the same width and height, the range of values that can be specified for
level differ, if the texture formats have different minimum values. A GL_INVALID_OPERATION
error is generated when you specify the automatic generation of a mipmap texture smaller than
the minimum size. For a texture that is 128×128 texels, for example, you can specify a level of
-2 or -3 when the format is RGB8, but you can only specify -2 for level when the format is
RGB565.

When enabled, automatically generated mipmap textures take priority. The mipmap texture data
loaded with a texture image is ignored.

With automatic generation enabled, level for the glCopyTexImage2D() and
glCopyTexSubImage2D() functions must have the same value as was passed to level for the
glTexImage2D() function when the texture was loaded. A GL_INVALID_OPERATION error is
generated if these values differ. However, even if a nonzero value is passed to level, data is
copied to the texture with the lowest level, and the mipmap texture is not replaced.

With automatic generation disabled, level for the glCopyTexImage2D() and
glCopyTexSubImage2D() functions must be 0, regardless of the value that was passed to
level for the glTexImage2D() function when the texture was loaded. A
GL_INVALID_OPERATION error is generated if a nonzero value is specified.

Cautions for Specifying GL_NEAREST as the Filter

If you set the GL_TEXTURE_MIN_FILTER and GL_TEXTURE_MAG_FILTER texture parameters to
GL_NEAREST and use a texture image that contains vertical or horizontal lines (including color
boundaries that form straight lines), those straight lines may appear as broken lines when the
texture is applied to polygons that place the stripes parallel or perpendicular to the screen's
scan lines.

This phenomenon is caused by the computational precision of texture coordinate interpolation
within polygons. Consider a row of fragments that is parallel or perpendicular to the scan lines.
If each fragment samples texels in a single vertical or horizontal line, the straight lines on the
texture image are rendered unchanged on the polygons. If each of these fragments samples
around the boundary between two adjacent rows of texels, however, each fragment may sample a
different texel row due to texture coordinate errors. As a result, the straight line on the texture
image is rendered as a broken line on the polygons. You can work around this by adjusting the
texture coordinates and rendering area so that each fragment samples the texel centers, just as
when a rectangular polygon with texture coordinates of 0 and 1 at its edges is rendered at the
same size as the texture.

Cautions for Specifying GL_XXX_MIPMAP_LINEAR as a Minification Filter

Trilinear filtering is enabled when GL_XXX_MIPMAP_LINEAR is set for the texture parameter
GL_TEXTURE_MIN_FILTER. This interpolates colors from two mipmap texture levels and then
renders with the resulting color. However, this interpolation is subject to errors caused by
computational precision. For example, even if two colors with the same component values are
processed, a different color could be rendered due to these minor errors.

Colors are interpolated when, as a result of LOD calculations, they must be obtained from two
mipmap texture levels. Colors are not interpolated when they are obtained from only one mipmap
texture level. A texture is slightly darker where colors are interpolated than where colors are not.
This causes differences in hue to be rendered as edges along mipmap level boundaries.

You can mitigate this effect by using fixed component values in a texel color. For example, a
texture in the GL_RGB format has a fixed alpha component of 1.0, but even this fixed component
value decreases slightly for texels that have been interpolated by trilinear filtering. Nonetheless,
this component retains its value of 1.0 in texels that have not been interpolated. To correct a
texture color, multiply it by the change in alpha value caused by interpolation, and then add this
product to the texture color.

Texture combiners can make this correction, as shown in the following sample code.

Code 11-4. Compensating for Trilinear Filtering

glUniform1i(glGetUniformLocation(program, "dmp_TexEnv[0].combineRgb"),
GL_MULT_ADD_DMP);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[0].operandRgb"),
GL_SRC_COLOR, GL_ONE_MINUS_SRC_ALPHA, GL_SRC_COLOR);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[0].srcRgb"),
GL_TEXTURE0, GL_TEXTURE0, GL_TEXTURE0);

Wherever colors are not interpolated, these settings result in a product of 0 and output the
original texture color. Wherever colors are interpolated, these settings multiply the texture color
and the difference in the alpha component, add this product to the texture color, and finally output
the corrected color. You can only apply these settings to the first texture combiner (combiner 0)
because they specify a texture color for every input source.

If you are using a texture format without fixed component values, such as GL_RGBA, prepare a
separate texture with fixed component values and use that as a multitexture to perform the
correction. This texture must have the same size, number of mipmap levels, and UV input values
as the original texture. We recommend using an ETC1-compressed texture for data size and
cache efficiency reasons.

Getting Texture Level (Mipmap) Parameters

You can use the following functions to get parameters for each mipmap level of textures bound to
texture units that are currently active. However, you cannot get information about procedural
textures because the texture unit for procedural textures (GL_TEXTURE3) cannot be specified to
the glActiveTexture() function.

Code 11-5. Getting Texture Level Parameters

void glGetTexLevelParameterfv(GLenum target, GLint level, GLenum pname,
                              GLfloat *params);
void glGetTexLevelParameteriv(GLenum target, GLint level, GLenum pname,
                              GLint *params);

Both functions get the same values, even though they are saved as different types.

Specify the type of texture in target. You can specify the following values.

Table 11-10. Specifying the Types of Textures to Get Texture Level Parameters For

target Value                     Type of Texture
GL_TEXTURE_2D                    2D texture (including shadow and gas textures).
GL_TEXTURE_CUBE_MAP_POSITIVE_X   Cube map texture (in the positive X direction).
GL_TEXTURE_CUBE_MAP_NEGATIVE_X   Cube map texture (in the negative X direction).
GL_TEXTURE_CUBE_MAP_POSITIVE_Y   Cube map texture (in the positive Y direction).
GL_TEXTURE_CUBE_MAP_NEGATIVE_Y   Cube map texture (in the negative Y direction).
GL_TEXTURE_CUBE_MAP_POSITIVE_Z   Cube map texture (in the positive Z direction).
GL_TEXTURE_CUBE_MAP_NEGATIVE_Z   Cube map texture (in the negative Z direction).

Specify the mipmap level to get in level. When level is 0, you get the parameter for the
texture at the lowest mipmap level (the largest texture). You can get the next level and the one
after that by specifying 1 and 2, respectively, for level.

Specify the type of parameter to get in pname. The following table shows how the value of pname
corresponds to the parameter stored in params.

Table 11-11. Specifying Texture Level Parameters to Get

pname Value                        Parameters
GL_TEXTURE_WIDTH                   Texture width (in texels).
GL_TEXTURE_HEIGHT                  Texture height (in texels).
GL_TEXTURE_DEPTH                   Unsupported; has a fixed value of 0.
GL_TEXTURE_INTERNAL_FORMAT         A texture's internal format.
GL_TEXTURE_BORDER                  Unsupported; has a fixed value of 0.
GL_TEXTURE_RED_SIZE                Number of bits in the red component (per texel).
GL_TEXTURE_GREEN_SIZE              Number of bits in the green component (per texel).
GL_TEXTURE_BLUE_SIZE               Number of bits in the blue component (per texel).
GL_TEXTURE_ALPHA_SIZE              Number of bits in the alpha component (per texel).
GL_TEXTURE_LUMINANCE_SIZE          Number of bits in the luminance component (per texel).
GL_TEXTURE_INTENSITY_SIZE          Number of bits in the intensity component (per texel). Only for shadow textures.
GL_TEXTURE_DEPTH_SIZE              Number of bits in the depth component (per texel). Only for shadow textures.
GL_TEXTURE_DENSITY1_SIZE_DMP       Number of bits (per texel) of density information (density value 1) that does not account for intersections. Only for gas textures.
GL_TEXTURE_DENSITY2_SIZE_DMP       Number of bits (per texel) of density information (density value 2) that does account for intersections. Only for gas textures.
GL_TEXTURE_COMPRESSED              Whether a texture is compressed. GL_TRUE: compressed texture. GL_FALSE: uncompressed texture.
GL_TEXTURE_COMPRESSED_IMAGE_SIZE   Number of bytes in the texture at the mipmap level specified by level. Only for compressed textures. Specifying an uncompressed texture results in a GL_INVALID_OPERATION error.

The following table shows how the number of bits in each component for a single texel
corresponds to the internal format of textures obtained when GL_TEXTURE_INTERNAL_FORMAT is
specified for pname.

Table 11-12. Internal Formats and the Number of Bits in Components Making Up Each Texel

Internal Format                    Bits per Texel Component
GL_RGBA4                           Red 4, Green 4, Blue 4, Alpha 4
GL_RGB5_A1                         Red 5, Green 5, Blue 5, Alpha 1
GL_RGBA                            Red 8, Green 8, Blue 8, Alpha 8
GL_RGB565                          Red 5, Green 6, Blue 5
GL_RGB                             Red 8, Green 8, Blue 8
GL_ALPHA                           Alpha 8
GL_ALPHA4_EXT                      Alpha 4
GL_LUMINANCE                       Luminance 8
GL_LUMINANCE4_EXT                  Luminance 4
GL_LUMINANCE_ALPHA                 Luminance 8, Alpha 8
GL_LUMINANCE4_ALPHA4_EXT           Luminance 4, Alpha 4
GL_SHADOW_DMP                      Intensity 8, Depth 24
GL_GAS_DMP                         Density 1: 16
GL_HILO8_DMP                       Red 8, Green 8
GL_ETC1_RGB8_NATIVE_DMP            Red 8, Green 8, Blue 8
GL_ETC1_ALPHA_RGB8_A4_NATIVE_DMP   Red 8, Green 8, Blue 8, Alpha 4

A GL_INVALID_ENUM error is generated when an invalid value is specified for target or
pname. A GL_INVALID_VALUE error is generated when the mipmap level specified for level has
not been loaded.
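
As an illustration, the following calls query the width and internal format of the base (level 0) image of the 2D texture bound to the currently active texture unit.

GLint texWidth = 0;
GLint texFormat = 0;
glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &texWidth);
glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_INTERNAL_FORMAT, &texFormat);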

11.1.5. Effects of Texture Settings on Performance

Graphics performance is affected by texture format and size, and by various settings. The following
is a list of common tendencies.

Compressed textures are processed the fastest, followed by formats that use a small number of
bytes per texel.
Processing speed increases as the size decreases.
Contention for memory access causes processing to slow as the number of textures used
simultaneously increases.
The following pairs of minification filter settings are each processed at the same speed:
GL_NEAREST and GL_LINEAR; GL_NEAREST_MIPMAP_NEAREST and GL_LINEAR_MIPMAP_NEAREST;
GL_NEAREST_MIPMAP_LINEAR and GL_LINEAR_MIPMAP_LINEAR. However, because
GL_NEAREST(_XXX) fetches 1 texel per pixel and GL_LINEAR(_XXX) fetches 4 texels per pixel,
GL_NEAREST(_XXX) uses less memory.
It is faster to apply scaled-down textures using mipmaps. Even when mipmaps are used,
however, the processing load depends on the filter settings. GL_*_MIPMAP_LINEAR may entail
approximately twice the processing load of GL_*_MIPMAP_NEAREST.
Although the GL_NEAREST and GL_LINEAR magnification filters have nearly the same
performance, GL_NEAREST is slightly faster in some cases.
Gas and shadow textures cannot use mipmaps and are processed more slowly than normal
textures.
Shadow textures make use of special filters for shadows, so there is a processing load
comparable to trilinear filtering (when GL_*_MIPMAP_LINEAR is set as the minification filter for
normal textures). The processing load is around twice that for a normal texture (excluding
trilinear filtering).
There are no differences caused by conditions for setting procedural textures. They are faster
to process than normal 2D textures.
When multiple textures are used, they are processed more quickly if they are all placed in
VRAM-A or VRAM-B, rather than split between the two.
Textures created to match the upward direction of the framebuffer can sometimes be processed
faster than textures created to match the upward direction of the rendering results. This is
because the direction in which fragments are generated matches the direction in which textures
are loaded, improving the texture cache hit rate. Flipping a texture vertically has no effect on
performance. Fragments are processed horizontally in 8×8-pixel units, whereas textures
are loaded in 8×4-texel units. Note also that the short sides of the 3DS's LCD screen are
used as the top and bottom.

11.1.6. Texture Cache

There is a 256-byte L1 texture cache and an 8-KB L2 texture cache. Within a cache, only
compressed textures (in the ETC format) are handled unchanged. All other textures (including
those in the alpha ETC format) are converted into a 32-bit format.

There is a separate L1 cache for each texture unit, but the L2 cache is shared by all texture units.

There is a 5-cycle penalty for missing the L1 cache and, instead, getting data from the L2 cache.
There is an additional penalty of approximately 30 cycles for missing the L2 cache and instead
getting data from VRAM. However, the hardware is implemented to prefetch texels to hide these
delays.

Texture caches have a 4-way set-associative format. There are 16 cache lines. The L2 cache is 8
KB, with 512 bytes per cache line.

A cache line is selected by the lower 4 bits of an 8x4 block address (a 4x4 texel unit address divided
by 2) calculated from the texture coordinate values. For ETC1, it is selected by bits [5:2] of the
address of the block unit in which 2x2 of the 4x4 texel blocks are arranged. Cache thrashing occurs
when the same cache line is accessed continuously.
11.2. Combiners

There are six (texture) combiners installed on the 3DS system. They can combine the primary and
secondary colors for fragment lighting, in addition to the colors output by texture units, such as the
texture color, vertex color, and constant color. If you have experience developing applications for the
Nintendo GameCube and Wii, you can more easily understand this effect if you imagine it as
combining color and alpha values by using the TEV.

OpenGL ES 1.1 uses TexEnv for combiner settings, but the 3DS system uses reserved uniforms. The
following table shows the reserved uniforms that correspond to TexEnv parameters.

Table 11-13. Reserved Uniforms Corresponding to TexEnv Parameters in OpenGL ES 1.1

TexEnv               Reserved Uniform              Setting
COMBINE_RGB          dmp_TexEnv[i].combineRgb      Color combiner function.
COMBINE_ALPHA        dmp_TexEnv[i].combineAlpha    Alpha combiner function.
SRCn_RGB             dmp_TexEnv[i].srcRgb          Color source.
SRCn_ALPHA           dmp_TexEnv[i].srcAlpha        Alpha source.
OPERANDn_RGB         dmp_TexEnv[i].operandRgb      Color operands.
OPERANDn_ALPHA       dmp_TexEnv[i].operandAlpha    Alpha operands.
RGB_SCALE            dmp_TexEnv[i].scaleRgb        Color scaling value.
ALPHA_SCALE          dmp_TexEnv[i].scaleAlpha      Alpha scaling value.
TEXTURE_ENV_COLOR    dmp_TexEnv[i].constRgba       Constant color (with an alpha component).

Where n is the source (from 0 through 2) and i is the combiner number (from 0 through 5).

Each combiner processes its three source inputs as operands in its combiner function, multiplies the
calculated result by a scaling value, and clamps the value between 0.0 and 1.0 before outputting it.
Inputs are also clamped between 0.0 and 1.0 before the combiner computation; for the vertex color
(primary color), the clamped value is the absolute value produced by rasterization.

Color operations are performed on all components (red, green, and blue) using a single setting, and
alpha operations are performed on the alpha component using a separate setting. Color values input
to the texture combiner are undefined when the texture combiner accesses a disabled texture unit.

Figure 11-1. Combiner Options

A combiner accepts three input sources. Each input source must be one of the following types (a
single type can be used for more than one input).
Texture color output by a texture unit.
Constant color.
Primary color.
Primary color for fragment lighting.
Secondary color for fragment lighting.
Output from the previous combiner stage (except for combiner 0).
Output from the previous combiner buffer stage (except for combiner 0).

Figure 11-2. Combiner Input Sources

11.2.1. Reserved Uniforms for Combiner Functions

There are two reserved uniforms for combiner functions: dmp_TexEnv[i].combineRgb and
dmp_TexEnv[i].combineAlpha. Use the glUniform1i() function to set a value at the reserved
uniform location obtained by glGetUniformLocation.

The following values are used to set the reserved uniforms for combiner functions. The same
values can be set for both dmp_TexEnv[i].combineRgb and dmp_TexEnv[i].combineAlpha.

If GL_DOT3_RGBA is set for the combiner, it must have the same combiner function
(GL_DOT3_RGBA) for both the color (combineRgb) and alpha (combineAlpha) components.

Table 11-14. Reserved Uniform Values That Can Be Set for Combiner Functions

Setting Value      Combiner Function
GL_REPLACE         Src0 (default)
GL_MODULATE        Src0 * Src1
GL_ADD             Src0 + Src1
GL_ADD_SIGNED      Src0 + Src1 - 0.5
GL_INTERPOLATE     Src0 * Src2 + Src1 * (1 - Src2)
GL_SUBTRACT        Src0 - Src1
GL_DOT3_RGB        4 * ((Src0_Red - 0.5) * (Src1_Red - 0.5) + (Src0_Green - 0.5) * (Src1_Green - 0.5) + (Src0_Blue - 0.5) * (Src1_Blue - 0.5))
GL_DOT3_RGBA       4 * ((Src0_Red - 0.5) * (Src1_Red - 0.5) + (Src0_Green - 0.5) * (Src1_Green - 0.5) + (Src0_Blue - 0.5) * (Src1_Blue - 0.5))
GL_ADD_MULT_DMP    (Src0 + Src1) * Src2 (Note: The sum is clamped between 0.0 and 1.0 before it is multiplied.)
GL_MULT_ADD_DMP    (Src0 * Src1) + Src2
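
For example, the GL_DOT3_RGBA requirement above means that both function uniforms of a combiner
must be set together. The following minimal sketch (not taken from the SDK samples) assumes that
program is a linked program object that uses the reserved fragment shader, and configures combiner
0 for a DOT3 operation.

glUniform1i(glGetUniformLocation(program, "dmp_TexEnv[0].combineRgb"), GL_DOT3_RGBA);
glUniform1i(glGetUniformLocation(program, "dmp_TexEnv[0].combineAlpha"), GL_DOT3_RGBA);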

11.2.2. Reserved Uniforms for the Input Sources

There are two reserved uniforms for the input sources: dmp_TexEnv[i].srcRgb and
dmp_TexEnv[i].srcAlpha.

Use the glUniform3i() function to set values at the reserved uniform location obtained by the
glGetUniformLocation() function. Source 0 is first, followed by source 1 and source 2,
respectively.

The following values are used to set the reserved uniforms for the input sources. The same values
can be set for dmp_TexEnv[0].srcRgb and dmp_TexEnv[0].srcAlpha, and for
dmp_TexEnv[i].srcRgb and dmp_TexEnv[i].srcAlpha.

Warning: Every combiner except for combiner 0 must have GL_CONSTANT, GL_PREVIOUS, or
GL_PREVIOUS_BUFFER_DMP specified as one of its three input sources.

Table 11-15. Reserved Uniform Values That Can Be Set for Input Sources

Setting Value                       Input Source
GL_TEXTURE0                         Texture color from texture unit 0.
GL_TEXTURE1                         Texture color from texture unit 1.
GL_TEXTURE2                         Texture color from texture unit 2.
GL_TEXTURE3                         Texture color from texture unit 3.
GL_CONSTANT                         Constant color. (This is the default for combiner 0 and is set by dmp_TexEnv[i].constRgba.)
GL_PRIMARY_COLOR                    Primary color (vertex color).
GL_PREVIOUS                         Output from the previous combiner stage. (Cannot be set for combiner 0. Default for combiners other than 0.)
GL_PREVIOUS_BUFFER_DMP              Output from the previous combiner buffer stage. (Cannot be set for combiner 0.)
GL_FRAGMENT_PRIMARY_COLOR_DMP       Primary color for fragment lighting.
GL_FRAGMENT_SECONDARY_COLOR_DMP     Secondary color for fragment lighting.

11.2.3. Reserved Uniforms for Operands

There are two reserved uniforms for operands: dmp_TexEnv[i].operandRgb and


dmp_TexEnv[i].operandAlpha.

Use the glUniform3i() function to set values at the reserved uniform location obtained by the
glGetUniformLocation() function. Source 0 is first, followed by source 1 and source 2,
respectively.

The following values are used to set the reserved uniforms for operands.
Table 11-16. Reserved Uniform Values That Can Be Set for Operands

Setting Value               Operands
GL_SRC_COLOR                Color. (Cannot be set for operandAlpha. This is the default for operandRgb.)
GL_ONE_MINUS_SRC_COLOR      1 - Color. (Cannot be set for operandAlpha.)
GL_SRC_ALPHA                Alpha. (This is the default for operandAlpha.)
GL_ONE_MINUS_SRC_ALPHA      1 - Alpha
GL_SRC_R_DMP                Color_Red
GL_ONE_MINUS_SRC_R_DMP      1 - Color_Red
GL_SRC_G_DMP                Color_Green
GL_ONE_MINUS_SRC_G_DMP      1 - Color_Green
GL_SRC_B_DMP                Color_Blue
GL_ONE_MINUS_SRC_B_DMP      1 - Color_Blue

11.2.4. Reserved Uniforms for Scaling Values

There are two reserved uniforms for scaling values: dmp_TexEnv[i].scaleRgb and
dmp_TexEnv[i].scaleAlpha.

Use the glUniform1f() function to set a value at the reserved uniform location obtained by the
glGetUniformLocation() function.

The following values are used to set the reserved uniforms for scaling.

Table 11-17. Reserved Uniform Values That Can Be Set for Scaling

Setting Value Scaling


1.0 Unchanged combiner output (default).

2.0 Double the combiner output (clamp between 0.0 and 1.0).
4.0 Quadruple the combiner output (clamp between 0.0 and 1.0).

11.2.5. Reserved Uniforms for Constant Colors

There is one uniform for constant colors: dmp_TexEnv[i].constRgba. Use the glUniform4f()
function to set values at the reserved uniform location obtained by the glGetUniformLocation()
function. The first value is R, followed by G, B, and A, respectively.

By default, 0.0 is set for the R, G, B, and A values in the reserved uniforms for constant colors.
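
For example, the constant color for combiner 0 could be set as follows. This is a minimal sketch
(not from the SDK samples) that assumes program is the linked program object; the color values
are arbitrary.

glUniform4f(glGetUniformLocation(program, "dmp_TexEnv[0].constRgba"),
            1.0f, 0.5f, 0.25f, 1.0f);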

11.2.6. Sample Combiner Settings

In the following example, combiner 2 references the primary color and outputs (renders) it
unchanged.

To output only the primary color from the combiners without being affected by anything else,
combiner 2's input source 0 is set to the primary color (GL_PRIMARY_COLOR), operand 0 is set to
the unchanged input source color (GL_SRC_COLOR), the combiner function is configured to output
input source 0 unchanged (GL_REPLACE), and the scale is set to 1.0. With these settings, the
combiner outputs only the primary color and combiners 0 and 1 do not affect the output results.

This is illustrated by the following connection diagram.

Figure 11-3. Sample Combiner Settings 1

The following sample code shows how to make these settings in a program.

Code 11-6. Code for Sample Combiner Settings 1

glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[2].srcRgb"),
GL_PRIMARY_COLOR, GL_PREVIOUS, GL_PREVIOUS);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[2].srcAlpha"),
GL_PRIMARY_COLOR, GL_PREVIOUS, GL_PREVIOUS);
glUniform1i(glGetUniformLocation(program, "dmp_TexEnv[2].combineRgb"),
GL_REPLACE);
glUniform1i(glGetUniformLocation(program, "dmp_TexEnv[2].combineAlpha"),
GL_REPLACE);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[2].operandRgb"),
GL_SRC_COLOR, GL_SRC_COLOR, GL_SRC_COLOR);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[2].operandAlpha"),
GL_SRC_ALPHA, GL_SRC_ALPHA, GL_SRC_ALPHA);
glUniform1f(glGetUniformLocation(program, "dmp_TexEnv[2].scaleRgb"), 1.0);
glUniform1f(glGetUniformLocation(program, "dmp_TexEnv[2].scaleAlpha"), 1.0);

In a more complex example, combiner 1 is configured to add output from texture 0 and texture 1,
and combiner 2 is configured to multiply the result by the primary color.

Figure 11-4. Sample Combiner Settings 2

Code 11-7. Code for Sample Combiner Settings 2

glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[1].srcRgb"),
GL_TEXTURE0, GL_TEXTURE1, GL_PREVIOUS);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[1].srcAlpha"),
GL_TEXTURE0, GL_TEXTURE1, GL_PREVIOUS);
glUniform1i(glGetUniformLocation(program, "dmp_TexEnv[1].combineRgb"),
GL_ADD);
glUniform1i(glGetUniformLocation(program, "dmp_TexEnv[1].combineAlpha"),
GL_ADD);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[1].operandRgb"),
GL_SRC_COLOR, GL_SRC_COLOR, GL_SRC_COLOR);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[1].operandAlpha"),
GL_SRC_ALPHA, GL_SRC_ALPHA, GL_SRC_ALPHA);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[2].srcRgb"),
GL_PREVIOUS, GL_PRIMARY_COLOR, GL_PREVIOUS);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[2].srcAlpha"),
GL_PREVIOUS, GL_PRIMARY_COLOR, GL_PREVIOUS);
glUniform1i(glGetUniformLocation(program, "dmp_TexEnv[2].combineRgb"),
GL_MODULATE);
glUniform1i(glGetUniformLocation(program, "dmp_TexEnv[2].combineAlpha"),
GL_MODULATE);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[2].operandRgb"),
GL_SRC_COLOR, GL_SRC_COLOR, GL_SRC_COLOR);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[2].operandAlpha"),
GL_SRC_ALPHA, GL_SRC_ALPHA, GL_SRC_ALPHA);

11.3. Combiner Buffers

In the CTR system, each combiner, except for the last one (combiner 5), has a combiner buffer
configured in parallel to it. A combiner buffer can select the output of the previous combiner stage or
combiner buffer stage as its input source. By preserving the output of a previous combiner buffer
stage, a combiner buffer can allow a later combiner stage's input to come from the output of a
combiner stage earlier than the previous one.

11.3.1. Reserved Uniforms for Combiner Buffers

Because combiner buffer 0 has no input, its initial value is a constant color set by the reserved
uniform dmp_TexEnv[0].bufferColor. Use the glUniform4f() function to set the constant
color at the reserved uniform location obtained by the glGetUniformLocation() function. The
first value is R, followed by G, B, and A, respectively.

By default, 0.0 is set as the R, G, B, and A values of combiner buffer 0's constant color.
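
For example, combiner buffer 0's initial value could be set to opaque white as follows. This is a
minimal sketch assuming program is the linked program object.

glUniform4f(glGetUniformLocation(program, "dmp_TexEnv[0].bufferColor"),
            1.0f, 1.0f, 1.0f, 1.0f);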

You can choose output from either the previous combiner stage or the previous combiner buffer
stage as the input source to combiner buffers 1 through 4. Input from the color and alpha
components can be selected separately through the reserved uniforms
dmp_TexEnv[i].bufferInput (where i is 1 through 4). Use the glUniform2i() function to set
a value at the reserved uniform location obtained by glGetUniformLocation. The color
components are first, followed by the alpha component.

The following reserved uniform values are used to configure the combiner buffer input sources.

Table 11-18. Reserved Uniform Values That Can Be Set for Combiner Buffer Input Sources

Setting Value Input Source


GL_PREVIOUS Output from the previous combiner stage (default).
GL_PREVIOUS_BUFFER_DMP Output from the previous combiner buffer stage.

11.3.2. Sample Combiner Buffer Settings


The following sample settings demonstrate how to connect combiner buffers so that texture 0 is
multiplied with the primary color for fragment lighting, texture 1 is multiplied with the secondary
color for fragment lighting, and then these results are added together and output.

Figure 11-5. Sample Combiner Buffer Settings

The following sample code shows how to make these settings in a program.

Code 11-8. Sample Code for Setting Up the Combiner Buffers

// Combiner 0
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[0].srcRgb"),
GL_TEXTURE0, GL_FRAGMENT_PRIMARY_COLOR_DMP, GL_PREVIOUS);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[0].srcAlpha"),
GL_TEXTURE0, GL_PREVIOUS, GL_PREVIOUS);
glUniform1i(glGetUniformLocation(program, "dmp_TexEnv[0].combineRgb"),
GL_MODULATE);
glUniform1i(glGetUniformLocation(program, "dmp_TexEnv[0].combineAlpha"),
GL_REPLACE);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[0].operandRgb"),
GL_SRC_COLOR, GL_SRC_COLOR, GL_SRC_COLOR);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[0].operandAlpha"),
GL_SRC_ALPHA, GL_SRC_ALPHA, GL_SRC_ALPHA);
// CombinerBuffer 1
glUniform2i(glGetUniformLocation(program, "dmp_TexEnv[1].bufferInput"),
GL_PREVIOUS, GL_PREVIOUS);
// Combiner 1
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[1].srcRgb"),
GL_TEXTURE1, GL_FRAGMENT_SECONDARY_COLOR_DMP, GL_PREVIOUS);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[1].srcAlpha"),
GL_TEXTURE1, GL_PREVIOUS, GL_PREVIOUS);
glUniform1i(glGetUniformLocation(program, "dmp_TexEnv[1].combineRgb"),
GL_MODULATE);
glUniform1i(glGetUniformLocation(program, "dmp_TexEnv[1].combineAlpha"),
GL_REPLACE);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[1].operandRgb"),
GL_SRC_COLOR, GL_SRC_COLOR, GL_SRC_COLOR);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[1].operandAlpha"),
GL_SRC_ALPHA, GL_SRC_ALPHA, GL_SRC_ALPHA);
// Combiner 2
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[2].srcRgb"),
GL_PREVIOUS_BUFFER_DMP, GL_PREVIOUS, GL_PREVIOUS);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[2].srcAlpha"),
GL_PREVIOUS_BUFFER_DMP, GL_PREVIOUS, GL_PREVIOUS);
glUniform1i(glGetUniformLocation(program, "dmp_TexEnv[2].combineRgb"),
GL_ADD);
glUniform1i(glGetUniformLocation(program, "dmp_TexEnv[2].combineAlpha"),
GL_REPLACE);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[2].operandRgb"),
GL_SRC_COLOR, GL_SRC_COLOR, GL_SRC_COLOR);
glUniform3i(glGetUniformLocation(program, "dmp_TexEnv[2].operandAlpha"),
GL_SRC_ALPHA, GL_SRC_ALPHA, GL_SRC_ALPHA);

11.4. Procedural Textures


Procedural textures are unlike conventional textures, in that they determine texel colors procedurally
rather than by referencing images. Procedural textures are most effective when used for perfectly
regular patterns and patterns that are regular, but have some randomness. Because they do not
access texture images, they avoid memory access conflicts and reduce data (content) sizes.

Texture unit 3 is exclusively used for procedural textures, and is also the only texture unit that can
handle procedural textures. Even though procedural textures calculate texel colors, they are similar
to normal textures, in that they still determine which texel colors correspond to UV texture
coordinates.

As a part of the reserved fragment shaders, procedural textures have parameters that are configured
through reserved uniforms by the glUniform*() functions.

11.4.1. Procedural Texture Unit

The procedural texture unit comprises three computational components. In order of process flow,
these are random number generation, clamping, and mapping. Random number generation adds
noise to UV texture coordinates; clamping determines wrapping and mirror symmetry for patterns;
and mapping calculates texel colors from UV coordinates.

Figure 11-6. Structure of the Procedural Texture Unit

Input texture coordinates are processed as shown in the figure above. Use the following procedure
to set parameters to get the desired image.

1. Enable Procedural Textures


2. Set shared RGBA mode or independent alpha mode.
This corresponds to settings for G(u, v) and F(g).
3. Select a basic shape.
This corresponds to the selection in G(u, v).
4. Set colors in the color lookup tables. This corresponds to settings for Color(f).
5. Set the relationship between the basic shape and color lookup table.
This determines how the basic shape in step 3 corresponds to the color lookup table in step 4.
This corresponds to settings for F(g).
6. Select random number parameters.
If necessary, enable random numbers and determine the size of their effect. If not necessary,
disable them. This corresponds to random number generation. The UV coordinates are made
into absolute values and output to the clamp portion, regardless of whether they are enabled or
disabled.
7. Configure wrapping and symmetry.
This corresponds to clamp settings.

11.4.2. Enable Procedural Textures

To enable or disable texture unit 3, which is used by procedural textures, set a value for the
dmp_Texture[3].samplerType reserved uniform. Note that using glActiveTexture to select
a unit or glEnable to select a texture type results in an error.
Using glUniform1i, set the value to GL_TEXTURE_PROCEDURAL_DMP to enable texture unit 3 or
to GL_FALSE to disable it. The only supported texture type is GL_TEXTURE_PROCEDURAL_DMP.

Code 11-9. Enabling Procedural Textures

glUniform1i(
glGetUniformLocation(s_PgID, "dmp_Texture[3].samplerType"),
GL_TEXTURE_PROCEDURAL_DMP);

11.4.3. Setting Shared RGBA Mode or Independent Alpha Mode

Choose whether the alpha component is mapped using the same functions as the RGB components
(shared RGBA mode) or using functions that are set separately for the alpha component
(independent alpha mode). If independent alpha mode is selected, two G functions and two F
functions must be set for mapping. Shared RGBA mode is easier to set if you just want to output an
image to see what it looks like.

Using the glUniform1i() function, set the reserved uniform dmp_Texture[3].ptAlphaSeparate
to GL_FALSE to select shared RGBA mode or to GL_TRUE to select independent alpha mode. Shared
RGBA mode is the default.
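
For example, independent alpha mode could be selected as follows (a minimal sketch assuming
program is the linked program object).

glUniform1i(glGetUniformLocation(program, "dmp_Texture[3].ptAlphaSeparate"), GL_TRUE);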

Figure 11-7. Mapping in Independent Alpha Mode

11.4.4. Selecting a Basic Shape

A procedural texture's basic shape is determined by the G function during mapping.

The basic shapes in Table 11-19 use the following color assignments.

Table 11-19. G Function Settings, Selected Functions, and Basic Shapes

Setting Value                 Selected Function (Basic Shape)
GL_PROCTEX_U_DMP (default)    u
GL_PROCTEX_V_DMP              v
GL_PROCTEX_U2_DMP             u²
GL_PROCTEX_V2_DMP             v²
GL_PROCTEX_ADD_DMP            (u + v) / 2
GL_PROCTEX_ADD2_DMP           (u² + v²) / 2
GL_PROCTEX_ADDSQRT2_DMP       sqrt(u² + v²)
GL_PROCTEX_MIN_DMP            min(u, v)
GL_PROCTEX_MAX_DMP            max(u, v)
GL_PROCTEX_RMAX_DMP           ((u + v) / 2 + sqrt(u² + v²)) / 2

The basic shape is rendered using texture coordinates between -1.0 and 1.0 (with 0.0 at
the center) and GL_MIRRORED_REPEAT specified for wrapping.

The reserved uniforms used for selection with the G function are dmp_Texture[3].ptRgbMap for
RGB components and dmp_Texture[3].ptAlphaMap for the alpha component. Use the
glUniform1i() function to set each of these reserved uniforms.

dmp_Texture[3].ptAlphaMap settings are only valid in independent alpha mode. Choose the
shape that is closest to the desired texture image.

For example, choose GL_PROCTEX_U_DMP and GL_PROCTEX_V_DMP for wood grain, or
GL_PROCTEX_ADDSQRT2_DMP for the annual rings in a tree.
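
A minimal sketch of selecting the basic shape for the RGB components (assuming program is the
linked program object; the chosen shape is arbitrary):

glUniform1i(glGetUniformLocation(program, "dmp_Texture[3].ptRgbMap"),
            GL_PROCTEX_ADDSQRT2_DMP);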

11.4.5. Color Lookup Table Settings


Color lookup tables are used to convert values calculated by the G and F functions into actual texel
colors, and can set each of the RGBA components separately.

The content of the color lookup table depends on whether LOD is used.

Figure 11-8. Color Lookup Table Differences Caused by the Use of LOD

When LOD is not used, multiple color tables can be stored as partial arrays in a color lookup table.
By changing the offset and table width, you can also render textures with different coloring from the
same calculation results.

A color lookup table can hold up to 512 elements. You can store the color table in the first 256
elements and the differences between the color table values in the last 256 elements. Because the
delta values must start at the 257th element, the number of color lookup table entries is defined as
the sum of 256, the number of color lookup table elements that are actually referenced, and the
starting offset to the color tables. When multiple color tables are stored, the number of elements is
calculated for the last color table.

Use the glUniform1i() function to set the color table width that is actually referenced
(dmp_Texture[3].ptTexWidth). Set this value to a power of 2 that is no greater than 128. Use
the glUniform1i() function to set the color table's starting offset
(dmp_Texture[3].ptTexOffset) to an integer between 0 and 128.

When LOD is used, the color table width and offset must be 128 and 0 respectively. The level of
detail determines which color table is actually referenced. The maximum number of elements in a
color lookup table is fixed at 512.

Table 11-20. Level of Detail and Corresponding Color Tables

Level of Detail Starting Position Width


0 0 128
1 128 64
2 192 32

3 224 16
4 240 8
5 248 4
6 252 2

As explained in 7.7. Loading Lookup Tables, lookup tables are loaded from arrays by calls to the
glTexImage1D() function. Prepare an array of floating-point numbers with as many elements as
the color lookup table, storing values 0.0 through 1.0 for the color table elements (T) in the first
half of the array and the differences between the first half’s 256 elements (ΔT) in the last half of the
array. The following equations calculate the elements and delta values, given size as the number of
elements in the color table, offset as the starting offset, C_i as each element, and func as the
conversion function.

Set the last difference value to either 0.0 or the difference between the color table's last element
and the convergence value.
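
The equations themselves are not reproduced here, but the following sketch illustrates one way an
application might fill such an array. It is not taken from the manual or the SDK: the names
BuildColorLut, size, offset, and func are hypothetical, and the normalization of func's argument is
an assumption. The resulting data array can then be loaded as shown in Code 11-10 below.

// Fills 'data' for glTexImage1D(): color table elements at [offset, offset + size),
// delta values starting at index 256 + offset (the 257th element onward).
// size: a power of 2 no greater than 128; offset: an integer between 0 and 128.
static void BuildColorLut(float data[512], int size, int offset, float (*func)(float))
{
    for (int i = 0; i < size; ++i)
    {
        data[offset + i] = func((float)i / (float)(size - 1));           // color table element C_i
    }
    for (int i = 0; i < size - 1; ++i)
    {
        data[256 + offset + i] = data[offset + i + 1] - data[offset + i]; // delta value ΔC_i
    }
    data[256 + offset + size - 1] = 0.0f;  // last delta: 0.0 or the difference to a convergence value
}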

You can use the following code to call the glTexImage1D() function and configure 512 as the
maximum number of elements in a color lookup table, data as the array storing the color lookup
table, and 0 as the number of the lookup table to set.

Code 11-10. Loading a Color Lookup Table

glTexImage1D(GL_LUT_TEXTURE0_DMP, 0, GL_LUMINANCEF_DMP, 512, 0,


GL_LUMINANCEF_DMP, GL_FLOAT, data);

The color lookup table used by each RGBA component is specified by lookup table number through
the following reserved uniforms, using the glUniform1i() function. Note that the lookup table
number specified here, GL_LUT_TEXTUREi_DMP, where i represents a number from 0 to 31, does
not specify the name (ID) of the texture nor GL_LUT_TEXTUREi_DMP directly.

Table 11-21. Reserved Uniforms That Specify Color Lookup Tables

Reserved Uniform               Value to Set
dmp_Texture[3].ptSamplerR      Specifies the lookup table number to use as the color lookup table for the red component.
dmp_Texture[3].ptSamplerG      Specifies the lookup table number to use as the color lookup table for the green component.
dmp_Texture[3].ptSamplerB      Specifies the lookup table number to use as the color lookup table for the blue component.
dmp_Texture[3].ptSamplerA      Specifies the lookup table number to use as the color lookup table for the alpha component.

Each reserved uniform specifies a value between 0 and 31.

The value set for dmp_Texture[3].ptSamplerA is ignored in independent alpha mode.

You can apply the same minification filters to a procedural texture's color lookup tables as you can
to a normal texture. Choose a value from the following table to set the reserved uniform
dmp_Texture[3].ptMinFilter with the glUniform1i() function.

Table 11-22. Color Lookup Table Filters

Value to Set Filter to Apply


GL_NEAREST Nearest in the UV directions without LOD.
GL_LINEAR Linear in the UV directions without LOD (default).

GL_NEAREST_MIPMAP_NEAREST Nearest in the UV directions with the nearest LOD.


GL_NEAREST_MIPMAP_LINEAR Nearest in the UV directions with a linear LOD.
GL_LINEAR_MIPMAP_NEAREST Linear in the UV directions with the nearest LOD.

GL_LINEAR_MIPMAP_LINEAR Linear in the UV directions with a linear LOD.


You can apply an LOD bias when referencing the color lookup table. Use the glUniform1f()
function to set the reserved uniform dmp_Texture[3].ptTexBias to a value from 0.0 through
6.0. This is disabled with a value of 0.0, and has a default value of 0.5.
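
A minimal sketch of these settings (assuming program is the linked program object and that lookup
table 0 was loaded as shown in Code 11-10; the filter and bias values are arbitrary):

glUniform1i(glGetUniformLocation(program, "dmp_Texture[3].ptSamplerR"), 0);
glUniform1i(glGetUniformLocation(program, "dmp_Texture[3].ptSamplerG"), 0);
glUniform1i(glGetUniformLocation(program, "dmp_Texture[3].ptSamplerB"), 0);
glUniform1i(glGetUniformLocation(program, "dmp_Texture[3].ptSamplerA"), 0);
glUniform1i(glGetUniformLocation(program, "dmp_Texture[3].ptMinFilter"), GL_LINEAR);
glUniform1f(glGetUniformLocation(program, "dmp_Texture[3].ptTexBias"), 0.5f);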

11.4.6. Setting the Relationship Between the Basic Shape and Color
Lookup Table

The F function configures how the G function, which selects the basic shape, corresponds to the
color lookup table, which sets the basic color. The F function uses a lookup table to map output
from the G function (0.0 through 1.0) into lookup values for the color lookup table (0.0 through 1.0).
The lookup table has 256 elements. A mapping table is stored in the first 128 elements and the
differences between the mapping table values are stored in the last 128 elements. Because the
mapping table configures the relationship between shapes and colors, by changing the mapping
table you can render textures that have different appearances even though they use the same
shape and color lookup table. By modifying the F function, a wide variety of outputs is possible. For
example, you could use a simple F function to calculate results, such as F(x) = x or F(x) = x², or you
could have the F function use discontinuous values and operate like an index.

Like color lookup tables, the mapping table for the F function is loaded from an array into a lookup
table by a call to the glTexImage1D() function. Prepare an array of floating-point numbers with as
many elements (256) as the mapping table, storing the mapping table elements (0.0 through 1.0) in
the first half of the array and the differences between those elements in the last half of the array.
The following equations calculate the mapping table elements and delta values, assuming F_i is a
mapping table element and func is the conversion function.

You can use the following code to call the glTexImage1D() function and configure 256 as the
number of mapping table elements, data as the array storing the mapping table, and 0 as the
number of the lookup table to set.

Code 11-11. Loading a Mapping Table

glTexImage1D(GL_LUT_TEXTURE0_DMP, 0, GL_LUMINANCEF_DMP, 256, 0,


GL_LUMINANCEF_DMP, GL_FLOAT, data);

The mapping table to use as the F function is specified by lookup table number through the
following reserved uniforms, using the glUniform1i() function. Note that the specified lookup
table number, GL_LUT_TEXTUREi_DMP, where i represents a number from 0 to 31, does not
specify the name (ID) of the texture nor GL_LUT_TEXTUREi_DMP directly.

Table 11-23. Reserved Uniforms That Specify Mapping Tables

Reserved Uniform                       Value to Set
dmp_Texture[3].ptSamplerRgbMap         Specifies the lookup table number to use as the F function for RGB values.
dmp_Texture[3].ptSamplerAlphaMap       Specifies the lookup table number to use as the F function for alpha values.
Each reserved uniform specifies a value between 0 and 31.
dmp_Texture[3].ptSamplerAlphaMap is only valid in independent alpha mode.

11.4.7. Selecting Random Number Parameters

As a random element in a procedural texture, noise can be added to the UV texture coordinates
that are input to the G function. Noise affects the basic shape. When the G function is
GL_PROCTEX_U_DMP and noise affects U texture coordinates, for example, it becomes possible to
render wood grain with natural warping. Ordinarily, wood grain can only be rendered in a straight
line.

To enable or disable noise, set a value in the reserved uniform dmp_Texture[3].ptNoiseEnable.
Using glUniform1i, specify GL_TRUE to enable it, and GL_FALSE to disable it.

Code 11-12. Enabling Noise

glUniform1i(
glGetUniformLocation(s_PgID, "dmp_Texture[3].ptNoiseEnable"), GL_TRUE);

The function that adds noise is a black box, but it can be controlled through three parameters from
the application: the frequency (F), the phase (P), and the amplitude (A). The F parameter adjusts
the speed of the fluctuations (noise) so that large values create jagged waves and small values
create gentle undulations. The P parameter changes the starting location of the noise. When
rendering a texture of the ocean surface, for example, you can represent changing waves by
modifying only the P parameter. When increased, the A parameter magnifies the effect of the noise
and further destroys the basic shape.

The three parameters F, P, and A can each be set separately for the U and V components.

Table 11-24. Reserved Uniforms for Noise

Reserved Uniform             Value to Set
dmp_Texture[3].ptNoiseU      Specifies the F, P, and A parameters for the u component (F-parameter, P-parameter, A-parameter). Only parameter A is clamped to the range from -8.0 to 8.0. These are (0.0, 0.0, 0.0) by default.
dmp_Texture[3].ptNoiseV      Specifies the F, P, and A parameters for the v component (F-parameter, P-parameter, A-parameter). Only parameter A is clamped to the range from -8.0 to 8.0. These are (0.0, 0.0, 0.0) by default.
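
A minimal sketch of enabling noise and setting the parameters. It assumes that program is the
linked program object and that these three-component uniforms are set with glUniform3f (the
manual does not name the function here); the parameter values are arbitrary.

glUniform1i(glGetUniformLocation(program, "dmp_Texture[3].ptNoiseEnable"), GL_TRUE);
glUniform3f(glGetUniformLocation(program, "dmp_Texture[3].ptNoiseU"), 0.3f, 0.0f, 0.3f); // F, P, A
glUniform3f(glGetUniformLocation(program, "dmp_Texture[3].ptNoiseV"), 0.3f, 0.0f, 0.3f); // F, P, A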

Apart from the noise parameters, you can control changes in the continuity of random numbers
(called noise modulation) in the function that adds noise. Noise modulation (a continuous noise
function) is specified by a lookup table called the noise modulation table. The noise function takes
a noise modulation table and uses it to create natural noise values from the discrete values that
arise from calculations alone. A suitable continuous noise function, such as 3x² - 2x³, generates
values that change gradually when x is near 0.0 and 1.0.

Like the color lookup table, the noise modulation table is loaded from an array by a call to the
glTexImage1D() function. The following equations calculate the noise modulation table elements
and delta values, assuming N_i is a table element and func is the conversion function.
You can use the following code to call the glTexImage1D() function and configure 256 as the
number of noise modulation table elements, data as the array storing the noise modulation table,
and 0 as the number of the lookup table to set.

Code 11-13. Loading a Noise Modulation Table

glTexImage1D(GL_LUT_TEXTURE0_DMP, 0, GL_LUMINANCEF_DMP, 256, 0,


GL_LUMINANCEF_DMP, GL_FLOAT, data);

The noise modulation table to use as the continuous noise function is specified by lookup table
number through the following reserved uniform, using the glUniform1i() function. Note that the
specified lookup table number, GL_LUT_TEXTUREi_DMP, where i represents a number from 0 to 31,
does not specify the name (ID) of the texture nor GL_LUT_TEXTUREi_DMP directly.

Table 11-25. Reserved Uniforms That Specify Noise Modulation Tables

Reserved Uniform                     Value to Set
dmp_Texture[3].ptSamplerNoiseMap     Specifies the lookup table number to use as the continuous noise function. 0 through 31.

To illustrate the effects of the three noise parameters F, P, and A on the output results, consider the
difference between a procedural texture that is rendered as a concentric circle when it is unaffected
by noise, and the same texture when its parameters are changed in both the U and V directions.
These are rendered with F(x)=x as the continuous noise function.

Figure 11-9 shows the effect of changing only the A parameter. The other parameters, F and P, are
set to 0.3 and 0.0 respectively. Although the waves become more prominent as A gets larger, note
that most points along the circumference of the circle are unaffected.

Figure 11-9. Effect of the A Parameter

Figure 11-10 shows the effect of changing only the F parameter. The other parameters, A and P, are
set to 0.3 and 0.0 respectively. Note that as F gets larger, the frequency of the noise
(fluctuations) increases, and the affected locations along the circumference of the circle get closer
to each other. Also note that the absolute value is used for the F parameter when calculating noise,
so reversing the sign does not change the result.

Figure 11-10. Effect of the F Parameter

Figure 11-11 shows the effect of changing only the P parameter. The other parameters, A and F, are
both set to 0.3. Note that when P changes, only the shape of the noise changes. By modifying the
P parameter, you can change a procedural texture so that it appears animated.

If you set the P parameter equal to a large value while using it to animate a texture, small changes
to the P parameter will affect the shape of the noise. This is caused by the accuracy of calculations
in the hardware. For example, if you animate a texture by adding a constant value to the P
parameter every frame (which changes the noise), you must restore the P parameter to a small
value before it gets too large. One characteristic of the F and P parameters is that when they are
both positive and have a product that is a multiple of 16, they have the same effect as when the P
parameter is 0.0. In other words, by changing the P parameter back to 0 when the product of the P
and F parameters is 16, you can maintain the animation’s continuity. However, you may not get the
same shape when the P parameter is 0.0 and when the product of the F and P parameters is 16, if
the F parameter is large.

If the values for the F and A parameters are fixed, and the P parameter varies in a range where the
sign does not change for the phase |u| + u (or for the phase |v| + v), you can get the same noise
result for X + 16 when F×P is some arbitrary value X. However, depending on the accuracy of the
noise calculation process, you may not be able to get the same random value this way if you set a
large value for the F parameter. Also, if you set a large value for the P parameter, changes in small
values may not be applied to the noise result.

Figure 11-11. Effect of the P Parameter


11.4.8. Wrapping and Symmetry Settings

Procedural textures have a feature equivalent to the wrapping mode that can be set for normal
textures. This feature is called clamp calculation, and can be configured with dedicated modes such
as pulse and zero-clamp. There are also shift calculations that shift blocks of texture coordinates
that have the same integer values during wrapping.

Clamp calculations use the clamp mode to determine how to convert texture coordinates that are
less than 0.0 or greater than 1.0 into values between 0.0 and 1.0.

Table 11-26. Clamp Modes

Clamp Mode Coordinate Clamping

GL_SYMMETRICAL_REPEAT_DMP

GL_MIRRORED_REPEAT

GL_PULSE_DMP

GL_CLAMP_TO_EDGE (default)

GL_CLAMP_TO_ZERO_DMP
You can set different clamp modes for the U and V texture coordinates. To set these modes, call the
glUniform1i() function on the reserved uniform dmp_Texture[3].ptClampU or
dmp_Texture[3].ptClampV.

GL_SYMMETRICAL_REPEAT_DMP lines up the same image on a grid. GL_MIRRORED_REPEAT uses a
mirror reflection at even-numbered values. GL_PULSE_DMP uses the pixel closest to the edge of the
texture for each pixel that is used for rendering. GL_CLAMP_TO_EDGE uses a texture's internal
image for values between -1.0 and 1.0. Outside of that range, it uses pixels at the edge of the
texture. GL_CLAMP_TO_ZERO_DMP uses the texture image for values between -1.0 and 1.0
(excluding these two values). Outside of that range, it uses the image at coordinate 0 (including
the values -1.0 and 1.0).

Shift calculations determine the shift coordinates based on the shift mode. The shift width depends
on the clamp mode. This process is applied before clamp calculations, allowing you to avoid
rendering the same image over and over.

Table 11-27. Shift Modes

Shift Mode               Shift Calculation                                                                            Shift Width
GL_NONE_DMP (default)    No shift calculation.                                                                        None.
GL_ODD_DMP               Shifts coordinates when their integer value changes from an odd number to an even number.    1.0 for GL_MIRRORED_REPEAT only; 0.5 otherwise.
GL_EVEN_DMP              Shifts coordinates when their integer value changes from an even number to an odd number.    1.0 for GL_MIRRORED_REPEAT only; 0.5 otherwise.

You can set different shift modes for the U and V texture coordinates. To set these modes, call the
glUniform1i() function on the reserved uniform dmp_Texture[3].ptShiftU or
dmp_Texture[3].ptShiftV.
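
A minimal sketch of the clamp and shift settings (assuming program is the linked program object;
the chosen modes are arbitrary):

glUniform1i(glGetUniformLocation(program, "dmp_Texture[3].ptClampU"), GL_MIRRORED_REPEAT);
glUniform1i(glGetUniformLocation(program, "dmp_Texture[3].ptClampV"), GL_CLAMP_TO_EDGE);
glUniform1i(glGetUniformLocation(program, "dmp_Texture[3].ptShiftU"), GL_ODD_DMP);
glUniform1i(glGetUniformLocation(program, "dmp_Texture[3].ptShiftV"), GL_NONE_DMP);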

Figure 11-12. How Shift Modes and Clamping Modes Affect Shift Widths

12. Reserved Fragment Shaders


A reserved fragment shader can process lighting and other effects for fragments output from the
previous shader.

As described in 5. Shader Programs, the reserved fragment shader does not need to be loaded from a
binary. To use it, attach it to the same program object as the vertex shader and geometry shader, using
the special name GL_DMP_FRAGMENT_SHADER_DMP. The reserved fragment shader is a collection of
fragment processing features. These features include the following.

Fragment lighting
Shadows
Fog
Gas
Miscellaneous (alpha test, w buffer)

There are reserved uniforms for each fragment process. These reserved uniforms have default values
and must be set, as necessary, by the application.

12.1. Fragment Operation Mode

Reserved fragment processing replaces the standard OpenGL fragment pipeline (from the alpha test
onward) with independent processing that can handle the special rendering passes required by
shadows and gas. To switch this fragment operation mode, set the reserved uniform
(dmp_FragOperation.mode) to the desired mode using the glUniform1i() function.

Table 12-1. Fragment Operation Modes

Fragment Operation Mode               Pipeline
GL_FRAGOP_MODE_GL_DMP (default)       Standard OpenGL fragment pipeline (standard mode).
GL_FRAGOP_MODE_SHADOW_DMP             Fragment pipeline for the shadow accumulation pass (shadow mode).
GL_FRAGOP_MODE_GAS_ACC_DMP            Fragment pipeline for rendering density information (gas mode).
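
For example, shadow mode could be selected before rendering a shadow accumulation pass (a
minimal sketch assuming program is the linked program object):

glUniform1i(glGetUniformLocation(program, "dmp_FragOperation.mode"),
            GL_FRAGOP_MODE_SHADOW_DMP);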

12.2. Fragment Lighting

The 3DS system uses fragment lighting, which calculates the primary and secondary colors for each
fragment, rather than for each vertex. It also has the following features: bump mapping, which
references a texture to perturb normal vectors; shadows, which involve the creation of shadow
textures and color calculations; and attenuation, which is calculated from the distance to a spotlight
or another light.

The primary and secondary colors are determined by first combining multiple functions that output
lookup table values based on the dot product of two vectors, and then using bitwise AND/OR
operations to combine those output values. With 3DS, although you cannot fully customize the
method that combines the lookup tables and vectors for the dot products, you can select different
configurations from preset settings.

Nintendo expects the CTR system to use eye coordinates for several vectors, because lighting
equations are considered to be in eye coordinates. Although eye coordinates are not necessarily
required, all the vectors that are used must be in the same coordinate system.

To use fragment lighting, you must set dmp_FragmentLighting.enabled to GL_TRUE with the
glUniform1i() function and enable at least one light. In the vertex shader, the normal vector, view
vector, and tangent vector (when required for lighting) must be converted into a quaternion and
output as a single vertex attribute.

Table 12-2. Enabling and Disabling Fragment Lighting

Reserved Uniform                 Type    Setting Value
dmp_FragmentLighting.enabled     bool    GL_TRUE: Enable lighting. GL_FALSE: Disable lighting (default).
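
A minimal sketch of enabling fragment lighting together with light 0 (assuming program is the
linked program object; dmp_FragmentLightSource[0].enabled is described in Table 12-5):

glUniform1i(glGetUniformLocation(program, "dmp_FragmentLighting.enabled"), GL_TRUE);
glUniform1i(glGetUniformLocation(program, "dmp_FragmentLightSource[0].enabled"), GL_TRUE);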

12.2.1. Quaternion Conversion

Lighting calculations require all vectors to use the same coordinate system. In other words, bump
mapping (described later) must convert the perturbation normals referenced in a texture from
surface-local coordinates (with the vertex at the origin and the normal vector along the positive z-
axis) into eye coordinates.

The following 3×3 rotation matrix converts surface-local coordinates into eye coordinates. It
comprises a normal, tangent, and binormal vector, and can be converted into a quaternion (Qx, Qy,
Qz, Qw).

(E represents the eye coordinates, T is the tangent, N is the normal, B is the binormal, and S
represents the surface-local coordinates.)

Fragment lighting is implemented to generate a quaternion for each fragment from a quaternion for
each vertex, rather than generate a vector for each fragment from the normal, tangent, and
binormal (which can be calculated from the normal and tangent) vectors input for each vertex. The
quaternion is converted into the original rotation matrix during fragment light processing. To use
fragment lighting, you must convert each vector into a valid quaternion in the vertex shader.

Generating Normal-Only Quaternions

The CTR-SDK sample demos include a vertex shader (l_position_view_quaternion) that
generates quaternions using only the vertex normal information. To generate a quaternion that
transforms normals in surface-local coordinates to normals in perspective coordinates, the vertex
transforms normals in surface-local coordinates to normals in perspective coordinates, the vertex
shader uses the half angle vector between the unit normal vector (0, 0, 1) and a normal vector that
has been transformed to perspective coordinates (Nx, Ny, Nz) as the axis of rotation, and derives a
quaternion that performs 180° rotations.

Figure 12-1. The Rotation Axis and Angle of a Quaternion

For an axis of rotation (α, β, γ) and an angle of rotation θ, the derived quaternion Q would be
computed as follows.

Note: In the vertex shader, the real component of the quaternion is set to the w component.

For a θ value of 180°, this becomes:

Q = ( 0; α, β, γ )

Taking the half-angle vector as the axis of rotation, this becomes:

The orientation of the half-angle vector is undefined only when (Nx, Ny, Nz) is (0, 0, –1). In this
case, assume that (α, β, γ) = (1, 0, 0).
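
The equations referenced above are not reproduced here, but they follow from the standard
axis-angle form of a quaternion. A sketch of the reconstruction, under the assumption that
(Nx, Ny, Nz) is a unit vector:

Q = ( cos(θ/2); α·sin(θ/2), β·sin(θ/2), γ·sin(θ/2) )

For θ = 180°, this reduces to Q = ( 0; α, β, γ ). Taking the normalized half-angle vector between
(0, 0, 1) and (Nx, Ny, Nz) as the axis of rotation gives:

Q = ( 0; Nx, Ny, Nz + 1 ) / sqrt(Nx² + Ny² + (Nz + 1)²)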

12.2.2. Lighting Overview

3DS fragment lighting always calculates the primary and secondary colors.

The primary color is calculated first by accumulating each light's effect (with shadows, spotlight
attenuation, and distance attenuation applied) on a fragment's ambient and diffuse light. The
fragment's emissive light and the effect of the scene's ambient light is then added to this
accumulated value. This becomes the fragment's base color.

The secondary color is calculated by accumulating each light's effect on a fragment's second
specular light with shadows, spotlight attenuation, and distance attenuation applied. This color is
mainly used for fragment highlights.

Like OpenGL, an object's color is determined from its ambient, diffuse, emissive, and specular light.
Unlike OpenGL, however, lighting uses per-fragment calculations and second specular light, making
it possible to calculate the specular light in a variety of ways. The second specular light in
particular can be used to represent materials with colors that change depending on the angle.

Figure 12-2. Fragment Lighting

12.2.3. Scene Settings

Fragment lighting can handle scene sizes from -2¹⁶ through 2¹⁵. Do not allow the distance between
the viewpoint and any fragment or light in the scene to be greater than or equal to 2¹⁶.

The scene affects fragments through its ambient light (the global ambient light). To specify the
scene's global ambient light, set the reserved uniform dmp_FragmentLighting.ambient to an
RGBA color using the glUniform4fv() function.

Table 12-3. Reserved Uniforms for Scene Settings

Reserved Uniform                 Type    Setting Value
dmp_FragmentLighting.ambient     vec4    Specifies the scene's global ambient light (R, G, B, A). Each component has a value between 0.0 and 1.0. This is (0.2, 0.2, 0.2, 1.0) by default.
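
For example (a minimal sketch assuming program is the linked program object; the color is
arbitrary):

const GLfloat sceneAmbient[] = { 0.2f, 0.2f, 0.2f, 1.0f };
glUniform4fv(glGetUniformLocation(program, "dmp_FragmentLighting.ambient"), 1, sceneAmbient);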

12.2.4. Material Settings

Material settings can be described simply as settings that use color information, such as the
ambient and specular light, to represent a fragment's materials and texture. Specify material-
related settings in the reserved uniforms dmp_FragmentMaterial.*.

12.2.7. Equations for the Primary Color and 12.2.8. Equations for the Secondary Color describe
how the settings are used in lighting calculations.
Table 12-4. Reserved Uniforms for Material Settings

Reserved Uniform                                        Type    Setting Value
dmp_FragmentMaterial.ambient                            vec4    Specifies the ambient light (R, G, B, A). Each component has a value between 0.0 and 1.0. This is (0.2, 0.2, 0.2, 1.0) by default.
dmp_FragmentMaterial.diffuse                            vec4    Specifies the diffuse light (R, G, B, A). Each component has a value between 0.0 and 1.0. This is (0.8, 0.8, 0.8, 1.0) by default.
dmp_FragmentMaterial.emission                           vec4    Specifies the emissive light (R, G, B, A). Each component has a value between 0.0 and 1.0. This is (0.0, 0.0, 0.0, 1.0) by default.
dmp_FragmentMaterial.specular0                          vec4    Specifies specular light 0 (R, G, B, A). Each component has a value between 0.0 and 1.0. This is (0.0, 0.0, 0.0, 1.0) by default.
dmp_FragmentMaterial.specular1                          vec4    Specifies specular light 1 (R, G, B, A). Each component has a value between 0.0 and 1.0. This is (0.0, 0.0, 0.0, 1.0) by default.
dmp_FragmentMaterial.samplerXX (XX=D0,D1,RR,RG,RB,FR)   int     Specifies the lookup table numbers to use for lighting calculations. Each factor is a number between 0 and 31.
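
A minimal sketch of setting a material color. It assumes program is the linked program object and
that, like the scene ambient color, the vec4 material uniforms are set with glUniform4fv; the values
are arbitrary.

const GLfloat materialDiffuse[] = { 0.8f, 0.6f, 0.4f, 1.0f };
glUniform4fv(glGetUniformLocation(program, "dmp_FragmentMaterial.diffuse"), 1, materialDiffuse);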

12.2.5. Light Settings

There are two types of light settings. One configures the effect of light on a material and the other
configures the light itself. Fragment lighting can handle eight lights. Specify light-related settings in
the reserved uniforms (dmp_FragmentLightSource[i].*, where i is the light number between 0
and 7).

12.2.7. Equations for the Primary Color and 12.2.8. Equations for the Secondary Color describe
how the settings are used in lighting calculations.

Table 12-5. Reserved Uniforms for Light Settings

Reserved Uniform                  Type    Setting Value
*.enabled                         bool    Enables or disables a light. GL_TRUE: Enable light. GL_FALSE: Disable light (default).
*.ambient                         vec4    Specifies an ambient light (R, G, B, A). Each component has a value between 0.0 and 1.0. This is (0.0, 0.0, 0.0, 0.0) by default.
*.diffuse                         vec4    Specifies a diffuse light (R, G, B, A). Each component has a value between 0.0 and 1.0. By default, only light number 0 is (1.0, 1.0, 1.0, 1.0); all of the others are (0.0, 0.0, 0.0, 0.0).
*.specular0                       vec4    Specifies specular light 0 (R, G, B, A). Each component has a value between 0.0 and 1.0. By default, only light number 0 is (1.0, 1.0, 1.0, 1.0); all of the others are (0.0, 0.0, 0.0, 0.0).
*.specular1                       vec4    Specifies specular light 1 (R, G, B, A). Each component has a value between 0.0 and 1.0. This is (0.0, 0.0, 0.0, 0.0) by default.
*.position                        vec4    Specifies the position of a light source (x, y, z, w). The vector does not need to be normalized. The w component is used to distinguish between directional (0.0) and positional light sources. This is (0.0, 0.0, 1.0, 0.0) by default.
*.spotDirection                   vec3    Specifies a spotlight direction (x, y, z). The vector does not need to be normalized. This is (0.0, 0.0, –1.0) by default.
*.shadowed                        bool    Specifies whether a light is affected by shadows. GL_TRUE: Affected by shadows. GL_FALSE: Not affected by shadows (default).
*.geomFactor0                     bool    Specifies whether to use geometry factor 0 in lighting calculations. GL_TRUE: Use geometry factor 0. GL_FALSE: Do not use geometry factor 0 (default).
*.geomFactor1                     bool    Specifies whether to use geometry factor 1 in lighting calculations. GL_TRUE: Use geometry factor 1. GL_FALSE: Do not use geometry factor 1 (default).
*.twoSideDiffuse                  bool    Specifies whether to use two-sided lighting. GL_TRUE: Use two-sided lighting. GL_FALSE: Do not use two-sided lighting (default).
*.spotEnabled                     bool    Specifies whether to attenuate a spotlight based on its light distribution. GL_TRUE: Apply spotlight attenuation. GL_FALSE: Do not apply spotlight attenuation (default).
*.distanceAttenuationEnabled      bool    Specifies whether to attenuate a light over distance. GL_TRUE: Apply distance attenuation. GL_FALSE: Do not apply distance attenuation.
*.distanceAttenuationBias         float   Specifies the distance attenuation bias. 0.0 by default.
*.distanceAttenuationScale        float   Specifies the distance attenuation scale. 1.0 by default.
*.samplerXX (XX=SP,DA)            int     Specifies the lookup table number to use for calculating spotlight attenuation and distance attenuation. Each factor is a number between 0 and 31.

The table uses asterisks (*) to indicate dmp_FragmentLightSource[i], where i is a light number
between 0 and 7.
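
A minimal sketch of configuring light 0 as a positional light source. It assumes program is the
linked program object and that the vec4 light uniforms are set with glUniform4fv; the values are
arbitrary.

const GLfloat lightPosition[] = { 0.0f, 10.0f, 5.0f, 1.0f };  // w = 1.0: positional light source
const GLfloat lightDiffuse[]  = { 1.0f, 1.0f, 1.0f, 1.0f };
glUniform1i(glGetUniformLocation(program, "dmp_FragmentLightSource[0].enabled"), GL_TRUE);
glUniform4fv(glGetUniformLocation(program, "dmp_FragmentLightSource[0].position"), 1, lightPosition);
glUniform4fv(glGetUniformLocation(program, "dmp_FragmentLightSource[0].diffuse"), 1, lightDiffuse);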

12.2.6. Lighting Environment

The reserved uniforms dmp_LightEnv.* configure settings related to general lighting, including
shadow texture selection, bump map settings, and lookup table input.

12.2.7. Equations for the Primary Color and 12.2.8. Equations for the Secondary Color describe
how the settings are used in lighting calculations.

Table 12-6. Reserved Uniforms for the Lighting Environment

Reserved Uniform                              Type    Setting Value
*.absLutInputXX (XX=D0,D1,RR,RG,RB,FR,SP)     bool    Specifies whether to convert lookup table input for each factor into absolute values. GL_TRUE: Convert to absolute values. GL_FALSE: Do not convert to absolute values (default).
*.lutInputXX (XX=D0,D1,RR,RG,RB,FR,SP)        int     Specifies the cosine of the angle between two vectors to use as lookup table input for each factor. GL_LIGHT_ENV_NH_DMP: The normal and half vectors (default). GL_LIGHT_ENV_VH_DMP: The view and half vectors. GL_LIGHT_ENV_NV_DMP: The normal and view vectors. GL_LIGHT_ENV_LN_DMP: The light and normal vectors. GL_LIGHT_ENV_SP_DMP: The light and spotlight vectors. GL_LIGHT_ENV_CP_DMP: The tangent vector and the projection of the half vector onto the tangent plane.
*.lutScaleXX (XX=D0,D1,RR,RG,RB,FR,SP)        float   Specifies the scale to apply to each factor's lookup table output. After applying a scale value to the value output by the lookup table, clamping is performed within a range from -2.0 through 2.0. Valid values: 0.25, 0.5, 1.0 (default), 2.0, 4.0, 8.0.
*.shadowSelector                              int     Specifies the texture unit to use for shadows. GL_TEXTURE0 (default), GL_TEXTURE1, GL_TEXTURE2, GL_TEXTURE3.
*.bumpSelector                                int     Specifies the texture unit to use for bump maps. GL_TEXTURE0 (default), GL_TEXTURE1, GL_TEXTURE2, GL_TEXTURE3.
*.bumpMode                                    int     Specifies the perturbation mode for normal and tangent vectors. GL_LIGHT_ENV_BUMP_NOT_USED_DMP: No bump mapping (default). GL_LIGHT_ENV_BUMP_AS_BUMP_DMP: Perturb normal vectors. GL_LIGHT_ENV_BUMP_AS_TANG_DMP: Perturb tangent vectors.
*.bumpRenorm                                  bool    Specifies whether to regenerate the third component of the normal vector. GL_TRUE: Regenerate. GL_FALSE: Do not regenerate (default).
*.config                                      int     Specifies the configuration for each factor. GL_LIGHT_ENV_LAYER_CONFIG0_DMP (default) through GL_LIGHT_ENV_LAYER_CONFIG7_DMP.
*.invertShadow                                bool    Specifies whether to invert the shadow term (1.0 - shadow). GL_TRUE: Invert. GL_FALSE: Do not invert (default).
*.shadowPrimary                               bool    Specifies whether to apply shadows to the primary color. GL_TRUE: Apply. GL_FALSE: Do not apply (default).
*.shadowSecondary                             bool    Specifies whether to apply shadows to the secondary color. GL_TRUE: Apply. GL_FALSE: Do not apply (default).
*.shadowAlpha                                 bool    Specifies whether to apply shadows to the alpha component. GL_TRUE: Apply. GL_FALSE: Do not apply (default).
*.fresnelSelector                             int     Specifies the output mode for the Fresnel factor. GL_LIGHT_ENV_NO_FRESNEL_DMP (default), GL_LIGHT_ENV_PRI_ALPHA_FRESNEL_DMP, GL_LIGHT_ENV_SEC_ALPHA_FRESNEL_DMP, GL_LIGHT_ENV_PRI_SEC_ALPHA_FRESNEL_DMP.
*.clampHighlights                             bool    Specifies whether to clamp the specular output value. GL_TRUE: Clamp. GL_FALSE: Do not clamp (default).
*.lutEnabledD0                                bool    Specifies whether to apply output values from the lookup table for distribution 0 (D0). GL_TRUE: Apply. GL_FALSE: Do not apply (default).
*.lutEnabledD1                                bool    Specifies whether to apply output values from the lookup table for distribution 1 (D1). GL_TRUE: Apply. GL_FALSE: Do not apply (default).
*.lutEnabledRefl                              bool    Specifies whether to apply output values from the lookup tables for reflection (RR, RG, RB). GL_TRUE: Apply. GL_FALSE: Do not apply (default).

The table uses asterisks (*) to indicate dmp_LightEnv.

12.2.7. Equations for the Primary Color

The following formula summarizes how the primary color is calculated.

Color_primary = ∑( (Diffuse × DP_LN × Shadow + Ambient) × Spot × DistAtt ) + Ambient_global + Emission

The following paragraphs provide more information about how each term is calculated.

The product of the material and light color components is applied to the diffuse and ambient light.

Diffuse = Diffuse_material × Diffuse_light
Ambient = Ambient_material × Ambient_light

The diffuse light is affected by shadows and the angle of incident light.

DP_LN = max{ 0, L·N } or abs( L·N )

The effect of the angle of incident light is specified by DP_LN in the equation. It is the dot product of
a normalized light vector and normal vector. For one-sided lighting, it is the larger of 0 and the dot
product, and for two-sided lighting it is the absolute value of the dot product. To enable one-sided
or two-sided lighting, set dmp_FragmentLightSource[i].twoSideDiffuse to GL_FALSE or
GL_TRUE respectively.

The effect of shadow attenuation is represented by Shadow in the equation. If either


dmp_LightEnv.shadowPrimary or dmp_FragmentLightSource[i].shadowed is GL_FALSE,
shadows have no effect and a value of 1.0 is applied. If either reserved uniform is GL_TRUE, the
value used is sampled from the texture unit specified by dmp_LightEnv.shadowSelector. If
dmp_LightEnv.invertShadow is GL_TRUE, the value used is actually the sampled value
subtracted from 1.0. If any texture other than a shadow texture is bound to the specified texture
unit, color components are applied unchanged to the sampled value.

The effect of spotlights is represented by Spot in the equation. You can set a lookup table for each
light and configure whether it is used. Use dmp_FragmentLightSource[i].spotEnabled to
configure whether spotlights are used and dmp_FragmentLightSource[i].samplerSP to set
the lookup tables used by spotlights. Use dmp_FragmentLightSource[i].spotDirection to
set the spotlight direction vector. A value of GL_LIGHT_ENV_SP_DMP is usually specified as the
lookup table input for spotlights (dmp_LightEnv.lutInputSP).
The effect of distance attenuation is represented by DistAtt in the equation. Directional light
sources are unaffected by distance attenuation.

The reserved uniform dmp_FragmentLightSource[i].distanceAttenuationEnabled


configures whether distance attenuation is applied. This can be controlled for each light. However,
the effect of distance attenuation is disabled (1.0 is applied) when dmp_LightEnv.config is
GL_LIGHT_ENV_LAYER_CONFIG7_DMP.

The lookup table to use is set by dmp_FragmentLightSource[i].samplerDA. Lookup table
input is affected by the values set for the scale
(dmp_FragmentLightSource[i].distanceAttenuationScale) and bias
(dmp_FragmentLightSource[i].distanceAttenuationBias).

This section has so far described the effect of each light on a fragment's primary color. This effect
is calculated only for valid lights and then it is added to the material's emissive color and the global
ambient color, which are unaffected by lights, to compute the fragment's final primary color.

The global ambient light is the product of the material's ambient light and the scene's ambient light.

Ambient_global = Ambient_material × Ambient_scene

When light source 0 is disabled (dmp_FragmentLightSource[0].enabled is GL_FALSE),
however, a value of 0.0 is applied to the global ambient light and the material's emissive light.

12.2.8. Equations for the Secondary Color

The following formula summarizes how the secondary color is calculated.

Color_secondary = ∑( (Specular0 + Specular1) × f × Shadow × Spot × DistAtt )

The following paragraphs provide more information about how each term is calculated.

Except that dmp_LightEnv.shadowSecondary settings are used instead of


dmp_LightEnv.shadowPrimary settings, Shadow, Spot, and DistAtt are each calculated in the
same way that they are calculated for the primary color.

Specular0 and Specular1 are each calculated as follows.

Specular0 = Specular0_material × Specular0_light × Distribution0 × Geometry0
Specular1 = Reflection_RGB × Specular1_light × Distribution1 × Geometry1

The specular term is usually calculated as the product of the specular property of a material, the
specular color of a light, a distribution function, and a geometry factor. Some settings allow the term
that corresponds to the material's specular color 1 to be replaced by output from lookup tables that
define a different reflection for each color component. By adjusting the reflection and distribution
lookup tables, you can represent fragments with a variety of different textures.

Distribution functions (factors) are represented by Distribution0 and Distribution1 in the equation.
The distribution functions are configured through lookup tables for distribution 0 (D0) and
distribution 1 (D1).

The reserved uniform dmp_LightEnv.lutEnabledD0 (lutEnabledD1) controls whether these
functions are used. The reserved uniform dmp_FragmentMaterial.samplerD0
(samplerD1) specifies the lookup table number to use. The reserved uniforms
dmp_LightEnv.lutInputD0 and dmp_LightEnv.lutInputD1 specify the lookup table input.
dmp_LightEnv.absLutInputD0 and dmp_LightEnv.absLutInputD1 specify the absolute
value of the input. Lookup table output accounts for the scale values specified by
dmp_LightEnv.lutScaleD0 and dmp_LightEnv.lutScaleD1.
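
For example, a minimal sketch that enables only the distribution 0 function (assuming its curve has
been loaded into lookup table 0) might set these reserved uniforms as follows.

// Enable distribution 0, sample lookup table 0, and use N.H as input,
// treating the input as an absolute value (illustrative choices).
glUniform1i(glGetUniformLocation(progID, "dmp_LightEnv.lutEnabledD0"), GL_TRUE);
glUniform1i(glGetUniformLocation(progID, "dmp_FragmentMaterial.samplerD0"), 0);
glUniform1i(glGetUniformLocation(progID, "dmp_LightEnv.lutInputD0"), GL_LIGHT_ENV_NH_DMP);
glUniform1i(glGetUniformLocation(progID, "dmp_LightEnv.absLutInputD0"), GL_TRUE);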

Reflections are represented by Reflection RGB in the equation. They use lookup tables (RR, RG,
RB) to set functions that calculate the reflection for each RGB component, instead of using the
material's specular color 1. The reserved uniform dmp_LightEnv.lutEnabledRefl controls whether this
feature is used. If GL_FALSE is specified, the material's specular color 1 is applied. The reserved
uniform dmp_FragmentMaterial.samplerRR (samplerRG, samplerRB) specifies the
lookup table number to use. The reserved uniforms dmp_LightEnv.lutInputXX specify the
lookup table input and dmp_LightEnv.absLutInputXX specify the absolute value of the input
(where XX is RR, RG, or RB).

Geometry factors are represented by Geometry0 and Geometry1 in the equation. They are used by
the Cook-Torrance lighting model. The reserved uniforms
dmp_FragmentLightSource[i].geomFactor0 and
dmp_FragmentLightSource[i].geomFactor1 control whether they are used. A value of 1.0 is
applied for a setting of GL_FALSE and an approximation of the geometry factors used in the Cook-
Torrance lighting model is applied for a setting of GL_TRUE.
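
A minimal sketch that enables both geometry factors for light 0 (for use with a Cook-Torrance style
material) might look like this.

// Apply the Cook-Torrance geometry factor approximation to both specular
// terms for light 0.
glUniform1i(glGetUniformLocation(progID,
    "dmp_FragmentLightSource[0].geomFactor0"), GL_TRUE);
glUniform1i(glGetUniformLocation(progID,
    "dmp_FragmentLightSource[0].geomFactor1"), GL_TRUE);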

f is a function that uses the dot product of the normalized light vector and normal vector to
determine whether lighting is enabled for a fragment. A value of 1.0 is always applied when
dmp_LightEnv.clampHighlights is set to GL_FALSE. This setting is used to represent
translucent objects that allow light to pass through them to areas where it would not otherwise
reach. If GL_TRUE is specified, a value of 0.0 is applied when the dot product is 0.0 or less and a
value of 1.0 is applied when the dot product is greater than 0.0. In this case, unlit areas have no
specular light.

A fragment's final secondary color is calculated by adding together the effect of each valid light on it.

12.2.9. Alpha Component Lighting

The previous sections described how to calculate the primary and secondary colors with the alpha
component fixed at 1.0. Fragment lighting allows you to apply Fresnel factors and shadows to the
alpha component.

Fresnel factors were originally intended to be used as lookup tables (FR) for Fresnel reflections in
translucent objects, but by replacing the alpha component with lookup table output, they can also
be used for other purposes.

The reserved uniform dmp_FragmentMaterial.samplerFR specifies the number of the
lookup table to use for the Fresnel factors. The reserved uniform dmp_LightEnv.lutInputFR
specifies the lookup table input and dmp_LightEnv.absLutInputFR specifies the absolute
values of the input. If multiple lights are enabled, the light with the largest number has its
light vector used in the dot product that is input to the lookup table. The reserved uniform
dmp_LightEnv.fresnelSelector uses the following values to control the extent to which
Fresnel factors are applied.

Table 12-7. Scope of the Applied Fresnel Factor

Setting Value                              Applies To

GL_LIGHT_ENV_NO_FRESNEL_DMP (default)      Nothing. (The alpha component is fixed at 1.0.)
GL_LIGHT_ENV_PRI_ALPHA_FRESNEL_DMP         Only the alpha component for the primary color.
GL_LIGHT_ENV_SEC_ALPHA_FRESNEL_DMP         Only the alpha component for the secondary color.
GL_LIGHT_ENV_PRI_SEC_ALPHA_FRESNEL_DMP     The alpha component for both the primary and secondary colors.

Shadows, like the Fresnel factor, can also be applied to the alpha component. When GL_TRUE is
specified for dmp_LightEnv.shadowAlpha, which controls whether shadows affect the alpha
component, the shadow alpha component is multiplied into the alpha component computed above. If
dmp_LightEnv.invertShadow is GL_TRUE, the shadow alpha value is subtracted from 1.0 (as it
is for colors) before being multiplied.
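
For example, a minimal sketch that applies the Fresnel factor to only the primary color's alpha
component (assuming the Fresnel curve has been loaded into lookup table 6 and N·V is the desired
input) might look like the following.

// Sample lookup table 6 with N.V as input and apply the result to the
// primary color's alpha component only (illustrative choices).
glUniform1i(glGetUniformLocation(progID, "dmp_FragmentMaterial.samplerFR"), 6);
glUniform1i(glGetUniformLocation(progID, "dmp_LightEnv.lutInputFR"), GL_LIGHT_ENV_NV_DMP);
glUniform1i(glGetUniformLocation(progID, "dmp_LightEnv.fresnelSelector"),
    GL_LIGHT_ENV_PRI_ALPHA_FRESNEL_DMP);
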
12.2.10. Creating and Specifying Lookup Tables

Lighting equations use the following eight types of lookup tables.

Reflections (three types: RR, RG, and RB)
Distribution factors (two types: D0 and D1)
Fresnel factors (FR)
Spotlights (SP)
Distance attenuation of light (DA)

The lookup tables for reflections (RR, RG, RB), distribution factors (D0, D1), and Fresnel factors
(FR) are all material settings, and are common to all lights. The lookup tables for spotlights (SP) and
the distance attenuation of light (DA) can be set differently for each light.

The layer configuration (dmp_LightEnv.config) can control which lookup tables are used for
each term in the secondary color's lighting equation, except for the distance attenuation of light.

Table 12-8. Lookup Tables and Number of Cycles for Each Layer Configuration

Layer Configuration                  Rr  Rg  Rb  D0  D1  Fr  Sp  Cycles

GL_LIGHT_ENV_LAYER_CONFIG0_DMP       RR  RR  RR  D0  -   -   SP  1
GL_LIGHT_ENV_LAYER_CONFIG1_DMP       RR  RR  RR  -   -   FR  SP  1
GL_LIGHT_ENV_LAYER_CONFIG2_DMP       RR  RR  RR  D0  D1  -   -   1
GL_LIGHT_ENV_LAYER_CONFIG3_DMP       -   -   -   D0  D1  FR  -   1
GL_LIGHT_ENV_LAYER_CONFIG4_DMP       RR  RG  RB  D0  D1  -   SP  2
GL_LIGHT_ENV_LAYER_CONFIG5_DMP       RR  RG  RB  D0  -   FR  SP  2
GL_LIGHT_ENV_LAYER_CONFIG6_DMP       RR  RR  RR  D0  D1  FR  SP  2
GL_LIGHT_ENV_LAYER_CONFIG7_DMP       RR  RG  RB  D0  D1  FR  SP  4

The table shows which lookup tables are used to get values for a reflection's RGB components, the
distribution factors, the Fresnel factors, and the spotlight term. A value of 1.0 is applied to the
lighting equation for any cell that contains a hyphen (-). In other words, the corresponding term
disappears from the equation. The Cycles column shows the number of hardware cycles required
for lighting calculations. To speed up lighting calculations, choose a layer configuration that
minimizes this number.

Warning: When only write access to the color buffer has been configured (when the
glColorMask() function has set a value of GL_TRUE for all color buffer components
and the glDisable() function has disabled GL_BLEND and GL_COLOR_LOGIC_OP),
layer configurations from GL_LIGHT_ENV_LAYER_CONFIG4_DMP through
GL_LIGHT_ENV_LAYER_CONFIG6_DMP require three rather than two cycles to process a
single pixel.

A setting of GL_LIGHT_ENV_LAYER_CONFIG7_DMP disables the effect of distance
attenuation when the primary and secondary colors are calculated.

For example, when GL_LIGHT_ENV_LAYER_CONFIG0_DMP is set, all of the reflection RGB
components are obtained from the RR lookup table, the distribution 0 values are obtained from the
D0 lookup table, and the spotlight values are obtained from the SP lookup table. A fixed value of
1.0 is applied for distribution 1 and the Fresnel factor.

As described in 7.7. Loading Lookup Tables, lookup tables are prepared by the glTexImage1D()
function. The lookup tables used for fragment lighting have a fixed width of 512 elements. The
first 256 elements store the lookup table's sampling values and the last 256 elements store the
differences between each of the sampling values.

The order that sampling values are stored in depends on whether lookup table input is between 0.0
and 1.0 or between –1.0 and 1.0. The reserved uniforms dmp_LightEnv.absLutInputXX
(where XX is D0, D1, RR, RG, RB, FR, or SP) set the range of input values from 0.0 to 1.0 when
GL_TRUE is specified or from –1.0 to 1.0 when GL_FALSE is specified.

Procedures for getting sampling values are described next, followed by the corresponding
procedures for storing them.

If the range of input values is between 0.0 and 1.0, each input value is multiplied by 256 and then
clamped to 255. The integer portion of this number is the index for getting values. The index is first
used to get a sampling value from the lookup table, and then it is incremented by 256 to get a
difference value. The difference value is multiplied by the fractional portion of the input value, and
then added to the original sampling value to find the final sampling value.

Code 12-1. Procedure to Get Sampling Values for Input Between 0.0 and 1.0 (Pseudocode)

index = min(floor(input * 256), 255);
samplingValue = LUT[index] + LUT[index + 256] * (input * 256 - index);

If the range of input values is between –1.0 and 1.0, each input value is multiplied by 128 and
then its integer portion is converted to a two's complement index. The index is first used to get a
sampling value from the lookup table, and then it is incremented by 256 to get a difference value.
The difference value is multiplied by the fractional portion of the input value, and then added to the
original sampling value to find the final sampling value.

Code 12-2. Procedure to Get Sampling Values for Input Between –1.0 and 1.0 (Pseudocode)

if (input < 0.0) {
    flooredInput = floor(input * 128);
    index = 255 + flooredInput;
    samplingValue = LUT[index] + LUT[index + 256] * (input * 128 - flooredInput);
} else {
    index = min(floor(input * 128), 127);
    samplingValue = LUT[index] + LUT[index + 256] * (input * 128 - index);
}

Figure 12-3 shows the order that sampling values are stored in.

Figure 12-3. Order of Sampling Values Stored in the Lookup Table

Lookup tables are usually created by storing the sampling value that results from dividing the index
by 256 or 128. The difference between each sampling value and the next is stored 256 indices
later. Note that the lookup table is discontinuous for input values between –1.0 and 1.0.

The following pseudocode samples illustrate this process using func as the function that calculates
sampling values from input values.

Code 12-3. Creating a Lookup Table for Input Between 0.0 and 1.0 (Pseudocode)

for (i = 0; i < 256; i++) LUT[i] = func((float) i / 256.0f);
for (i = 0; i < 255; i++) LUT[i + 256] = LUT[i + 1] - LUT[i];
LUT[511] = (func(1.0f) - LUT[255]) * 16.0f / 15.0f;

The difference value for an input of 1.0 (the last difference element) is multiplied by 16.0/15.0
because the GPU has a fractional precision of 4 bits: an input of 1.0 maps to index 255 with a
hardware fraction of 15/16, so storing (func(1.0) - LUT[255]) × 16/15 makes
LUT[255] + LUT[511] × 15/16 equal func(1.0). If the original difference value were stored,
the highest input value would not produce the sampling value for 1.0.

Code 12-4. Creating a Lookup Table for Input Between –1.0 and 1.0 (Pseudocode)

for (i = 0; i < 128; i++) {
    LUT[i] = func((float) i / 128.0f);
    LUT[255 - i] = func((float) (i + 1) * -1.0f / 128.0f);
}
for (i = 0; i < 127; i++) {
    LUT[i + 256] = LUT[i + 1] - LUT[i];
    LUT[i + 384] = LUT[i + 129] - LUT[i + 128];
}
LUT[383] = (func(1.0f) - LUT[127]) * 16.0f / 15.0f;
LUT[511] = LUT[0] - LUT[255];

In this example, the value is multiplied by 16.0/15.0 because, if the original difference value were
stored, the highest input value would not produce the sampling value for 1.0.

Load the created lookup table using the glTexImage1D() function. For the target parameter of the
glBindTexture() function, specify GL_LUT_TEXTUREi_DMP to select the lookup table to load.
To reference the lookup table during lighting calculations, use the glUniform1i() function to
set the lookup table number in the appropriate reserved uniform. Note that the value set here is the
number i (from 0 to 31) in GL_LUT_TEXTUREi_DMP; it is neither the texture name (ID) nor the
GL_LUT_TEXTUREi_DMP enumerant itself. The reserved uniform that takes the lookup table number
is a material setting when referring to reflections, distribution factors, or Fresnel factors
(dmp_FragmentMaterial.samplerXX, where XX = D0, D1, RR, RG, RB, or FR) and a light setting
when referring to spotlights and the distance attenuation of light
(dmp_FragmentLightSource[i].samplerXX, where XX = SP or DA).

Code 12-5. Specifying the Reflection Lookup Table (for the R Component)

glBindTexture(GL_LUT_TEXTURE2_DMP, lutTextureID);
glTexImage1D(GL_LUT_TEXTURE2_DMP, 0, GL_LUMINANCEF_DMP, 512, 0,
GL_LUMINANCEF_DMP, GL_FLOAT, LUT);
glUniform1i(glGetUniformLocation(progID, "dmp_FragmentMaterial.samplerRR"), 2);

12.2.11. Lookup Table Input

All terms (D0, D1, RR, RG, RB, FR, SP), except for the distance attenuation of light (DA), take as
input the cosine of the angle between two vectors (the dot product of two normalized vectors).

The input vectors are the normal vector (N), light vector (L), view vector (V), half vector (H),
tangent vector (T), binormal vector (B), spotlight direction vector, and the projection of the half
vector onto the tangent plane.
Figure 12-4. Vectors Used as Lookup Table Input

To specify the values used as lookup table input for each factor, set the reserved uniforms for the
lighting environment, dmp_LightEnv.lutInputXX (where XX is D0, D1, RR, RG, RB, FR, or SP), to
one of the following values using the glUniform1i() function. The cosine of the angle between
the two specified vectors is used as lookup table input for each factor.

Table 12-9. Settings for Lookup Table Input

Setting Value           Input Vector Pair

GL_LIGHT_ENV_NH_DMP     The normal and half vectors (default).
GL_LIGHT_ENV_VH_DMP     The view and half vectors.
GL_LIGHT_ENV_NV_DMP     The normal and view vectors.
GL_LIGHT_ENV_LN_DMP     The light and normal vectors.
GL_LIGHT_ENV_SP_DMP     The inverse light vector and the spotlight vector (cannot be used with RR, RG, RB, and FR).
GL_LIGHT_ENV_CP_DMP     The tangent vector and the projection of the half vector onto the tangent plane (cannot be used with RR, RG, RB, and FR).

12.2.12. Distance Attenuation of Light

The following equation calculates input values for the distance attenuation of light.

Input = | f position - l position | × Scale + Bias

f position is the fragment position and l position is the light position. Both are expected to use eye
coordinates. Position only has meaning for a point light source. It means nothing for a directional
light source.

This equation shows that the distance between a fragment and a light is multiplied by a scale value,
and then added to a bias value to calculate an input value for the distance attenuation of light.

Create lookup tables that take input between 0.0 and 1.0 (absolute values). Use the
glUniform1f() function to set the scale value in the reserved uniform
dmp_FragmentLightSource[i].distanceAttenuationScale and the bias value in
dmp_FragmentLightSource[i].distanceAttenuationBias.
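
For example, to map light-to-fragment distances between two illustrative values attNear and attFar
onto the lookup table input range 0.0 to 1.0, you could compute the scale and bias as shown in this
sketch (assuming light 0).

// input = distance * scale + bias, so choose scale and bias that map
// [attNear, attFar] onto [0.0, 1.0]. attNear and attFar are illustrative.
GLfloat attNear = 1.0f;
GLfloat attFar = 20.0f;
GLfloat scale = 1.0f / (attFar - attNear);
GLfloat bias = -attNear * scale;
glUniform1f(glGetUniformLocation(progID,
    "dmp_FragmentLightSource[0].distanceAttenuationScale"), scale);
glUniform1f(glGetUniformLocation(progID,
    "dmp_FragmentLightSource[0].distanceAttenuationBias"), bias);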

When GL_LIGHT_ENV_LAYER_CONFIG7_DMP is specified as the layer configuration's setting value,
you must disable distance attenuation. Set
dmp_FragmentLightSource[i].distanceAttenuationEnabled to GL_FALSE.

12.2.13. Texture Combiner Settings


The primary and secondary colors calculated by fragment lighting can each be used as an input
source to a texture combiner.

To use them, set the reserved uniforms for the input source (dmp_TexEnv[i].srcRgb and
dmp_TexEnv[i].srcAlpha) to GL_FRAGMENT_PRIMARY_COLOR_DMP for the primary color, and to
GL_FRAGMENT_SECONDARY_COLOR_DMP for the secondary color.

The output value is (0.0, 0.0, 0.0, 1.0) when lighting is disabled (when
dmp_FragmentLighting.enabled is GL_FALSE).
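
For example, a minimal sketch that adds the two colors in texture combiner 0 (assuming the standard
GL_ADD combine function is available, as in OpenGL ES combiners) might look like the following.

// Feed the fragment primary and secondary colors into combiner 0 and add them.
glUniform1i(glGetUniformLocation(progID, "dmp_TexEnv[0].combineRgb"), GL_ADD);
glUniform3i(glGetUniformLocation(progID, "dmp_TexEnv[0].srcRgb"),
    GL_FRAGMENT_PRIMARY_COLOR_DMP, GL_FRAGMENT_SECONDARY_COLOR_DMP,
    GL_PRIMARY_COLOR);
glUniform3i(glGetUniformLocation(progID, "dmp_TexEnv[0].operandRgb"),
    GL_SRC_COLOR, GL_SRC_COLOR, GL_SRC_COLOR);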

12.3. Bump Mapping

Bump mapping is a feature of fragment lighting that perturbs (alters) a fragment's normal and tangent
vectors according to a normal map that is input as a texture. Bump mapping can make an object
appear to have shadows caused by surface irregularities. This allows you to render a simple model
that looks complex but actually has a small polygon count.

12.3.1. Reserved Uniform

The following reserved uniforms are used for bump mapping.

Normal Maps

For the normal map texture for bump mapping (dmp_LightEnv.bumpSelector), specify the
texture unit to which the texture is bound. A normal map texture is created with the x, y, and z
components of the perturbation vectors encoded in the R, G, and B components, respectively. A
vector value of -1.0 is encoded as the minimum luminance (0 in an 8-bit format), and 1.0
is encoded as the maximum luminance (255 in an 8-bit format).
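
The following hypothetical helper (not part of the SDK) illustrates this encoding by packing one unit
perturbation vector into 8-bit RGB components.

// Packs a unit perturbation vector into 8-bit RGB using the encoding above
// (-1.0 maps to 0 and 1.0 maps to 255).
static void PackNormalTexel(float nx, float ny, float nz, unsigned char rgb[3])
{
    rgb[0] = (unsigned char)((nx * 0.5f + 0.5f) * 255.0f + 0.5f);
    rgb[1] = (unsigned char)((ny * 0.5f + 0.5f) * 255.0f + 0.5f);
    rgb[2] = (unsigned char)((nz * 0.5f + 0.5f) * 255.0f + 0.5f);
}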

Perturbation Mode

To enable bump mapping, set the perturbation mode (dmp_LightEnv.bumpMode) to any value
other than GL_LIGHT_ENV_BUMP_NOT_USED_DMP. The following table shows the perturbation
modes.

Table 12-10. Perturbation Modes

Perturbation Mode                    Perturbed Vectors

GL_LIGHT_ENV_BUMP_NOT_USED_DMP       None.
GL_LIGHT_ENV_BUMP_AS_BUMP_DMP        Normal vectors (bump mapping).
GL_LIGHT_ENV_BUMP_AS_TANG_DMP        Tangent vectors (tangent mapping).

Normal Recalculation

If dmp_LightEnv.bumpRenorm is set to GL_TRUE to enable recalculation of normal vectors, the
z-component of the normal vector is not obtained from the B component sampled from a texture.
Instead, the z-component is recalculated from the x-component and the y-component as follows.

z = sqrt(1.0 - x² - y²)

Note: If the expression inside the square root is negative, the result is 0.

In most cases, recalculating values yields better results than sampling them from a texture. This
recalculation feature must be enabled, if bump mapping (of normals) uses a texture that only has R
and G components (GL_HILO8_DMP). However, if you have selected tangent mapping as the
perturbation mode for use with a technique such as anisotropic reflections, we recommend that you
avoid using this feature. This is because tangent mapping (for fragment lighting) expects input
perturbation tangents that do not have a z-component. If the recalculation feature is enabled,
nonzero values may be generated in the z-component.

If recalculation is disabled, the perturbation normal vectors sampled from the texture are not
normalized before they are used. Make sure that you normalize values before storing them in the
texture. Enable recalculation when point sampling (GL_NEAREST) is not configured as the
texture filter mode, because filtering can cause non-normalized values to be used as the
perturbation normals.

Table 12-11. Reserved Uniforms Used for Bump Mapping

Reserved Uniform            Type   Setting Value

dmp_LightEnv.bumpSelector   int    Specifies the texture unit to use for the normal map.
                                   GL_TEXTURE0 (default)
                                   GL_TEXTURE1
                                   GL_TEXTURE2
                                   GL_TEXTURE3

dmp_LightEnv.bumpMode       int    Specifies the perturbation mode for normal and tangent vectors.
                                   GL_LIGHT_ENV_BUMP_NOT_USED_DMP: No bump mapping (default).
                                   GL_LIGHT_ENV_BUMP_AS_BUMP_DMP: Perturb normal vectors.
                                   GL_LIGHT_ENV_BUMP_AS_TANG_DMP: Perturb tangent vectors.

dmp_LightEnv.bumpRenorm     bool   Specifies whether to regenerate the third component of the normal vector.
                                   GL_TRUE: Regenerate.
                                   GL_FALSE: Do not regenerate (default).
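
For example, a minimal sketch that enables bump mapping with a normal map bound to texture unit 2
(an illustrative choice) and regenerates the z-component might set the reserved uniforms as follows.

// Perturb normal vectors with the normal map bound to texture unit 2 and
// recalculate the z-component from x and y.
glUniform1i(glGetUniformLocation(progID, "dmp_LightEnv.bumpSelector"), GL_TEXTURE2);
glUniform1i(glGetUniformLocation(progID, "dmp_LightEnv.bumpMode"),
    GL_LIGHT_ENV_BUMP_AS_BUMP_DMP);
glUniform1i(glGetUniformLocation(progID, "dmp_LightEnv.bumpRenorm"), GL_TRUE);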

12.4. Shadows

3DS shadows are rendered in two passes. First, the shadow accumulation pass creates a shadow
buffer (containing the scene's depth values with the light source as the origin), which is then
referenced by the shadow lookup pass to cast shadows. The shadow intensity information collected
along with the depth values in the first pass allows you to represent soft shadows.

12.4.1. Shadow Accumulation Pass

The shadow accumulation pass requires that the fragment operation mode
(dmp_FragOperation.mode) be switched to shadow mode (GL_FRAGOP_MODE_SHADOW_DMP),
and that the shadow information (depth values and shadow intensity) be stored in a shadow texture
(with a format of GL_SHADOW_DMP and a type of GL_UNSIGNED_INT). Note that only texture unit 0
(GL_TEXTURE0) can write shadow information to a shadow texture. Also note that mipmaps cannot
be applied to shadow textures.

When the fragment pipeline switches to shadow mode, shadow information is output to the
attachment point for the color buffer rather than the depth or stencil buffer. As a result, a shadow
texture must be attached to the color buffer's attachment point (GL_COLOR_ATTACHMENT0). Render
targets attached to depth and stencil attachment points are ignored in shadow mode. The alpha and
stencil tests are skipped.

You can use the following procedure to create a shadow texture and specify a render target.

Code 12-6. Creating a Shadow Texture and Specifying a Render Target

glActiveTexture(GL_TEXTURE0);
glGenTextures(1, &shadowTexID);
glBindTexture(GL_TEXTURE_2D, shadowTexID);
glTexImage2D(GL_TEXTURE_2D, 0, GL_SHADOW_DMP, shadowWidth, shadowHeight, 0,
GL_SHADOW_DMP, GL_UNSIGNED_INT, 0);
glGenFramebuffers(1, &shadowFboID);
glBindFramebuffer(GL_FRAMEBUFFER, shadowFboID);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D,
shadowTexID, 0);

Shadow information is accumulated using the coordinate system of the light source. Shadow
information comprises the depth from the light source (the depth values) and the shadow intensity.
When rendering shadows, there are no shadows wherever the G component of the color information
is 1.0 (the R, B, and A components have no effect), there are opaque hard shadows wherever the
G component of the color information is 0.0, and there are non-opaque shadows (soft shadows)
everywhere else.

When opaque hard shadows are rendered, only the shadow depth values are updated. The shadow
intensity does not change. The depth of a fragment and its corresponding pixel in the shadow buffer
are compared (using GL_LESS). If the fragment has a smaller value, the depth value in the shadow
buffer is updated.

When non-opaque soft shadows are rendered, only the shadow intensity is updated. The shadow
depth values do not change. The depth of a fragment and its corresponding pixel in the shadow
buffer are compared (using GL_LESS). If the fragment has a smaller value, the shadow intensity is
also compared (using GL_LESS) and then, if the fragment still has a smaller value, the shadow
intensity in the shadow buffer is updated.

When you initialize the color buffer in the shadow accumulation pass you must set the clear color to
(1.0, 1.0, 1.0, 1.0) by using the glClearColor() function and specify GL_COLOR_BUFFER_BIT to
the glClear() function. Note that you must set all color components (R, G, B, and A)—not just the
G component—equal to 1.0 in the clear color.

The next shadow lookup pass is processed in eye coordinates, so the depth values must be created
by using linear interpolation in eye space. (In most cases this differs from OpenGL, which uses
non-linear relationships.) As a result, you must set a value in the reserved uniform for the w-
buffer's scale factor (dmp_FragOperation.wScale), using the glUniform1f() function. This
has an initial value of 0.0, which results in the same non-linear relationship as OpenGL. Depth
values have a lower valid precision around the far clipping plane, when the near clipping plane is
close to the viewpoint. To use linear interpolation, with f set as the clip value for the far clipping
plane, set the scale factor to 1.0/f for a perspective projection or to 1.0 for an orthographic
projection.
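
For example, a minimal sketch for a perspective projection (assuming farClip holds the far clipping
plane value used when rendering from the light) might look like this.

// Use linearly interpolated depth values during the shadow accumulation pass.
// farClip is an illustrative value; use 1.0f instead of 1.0f / farClip for an
// orthographic projection.
GLfloat farClip = 100.0f;
glUniform1f(glGetUniformLocation(progID, "dmp_FragOperation.wScale"),
    1.0f / farClip);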

An object casts a hard shadow if it is rendered with a G color component of 0.0. This is generally
implemented by disabling textures, and then implementing a vertex shader that outputs those
vertex colors with G components of 0.0. Every other rendered color is treated as a soft shadow.

Several rendering passes may be necessary to accumulate the required shadow information.
Information for non-opaque shadows must be accumulated after information for opaque shadows.
Results are not guaranteed if information is accumulated in the opposite order or in alternating
order. When light sources do not move, you can generate shadow textures more efficiently by
rendering motionless objects alone to a shadow texture ahead of time. Simplifying object shapes
and decreasing polygon counts are also effective ways to improve performance.

Texture unit 0 and the texture combiners are configured as follows, when shadows are rendered
using a vertex shader implementation that outputs vertex colors.

Code 12-7. Sample Settings for Texture Units and Texture Combiners When Rendering Shadows

glUniform1i(glGetUniformLocation(progID, "dmp_Texture[0].samplerType"),
GL_FALSE);
glUniform1i(glGetUniformLocation(progID, "dmp_TexEnv[0].combineRgb"),
GL_REPLACE);
glUniform1i(glGetUniformLocation(progID, "dmp_TexEnv[0].combineAlpha"),
GL_REPLACE);
glUniform3i(glGetUniformLocation(progID, "dmp_TexEnv[0].operandRgb"),
GL_SRC_COLOR, GL_SRC_COLOR, GL_SRC_COLOR);
glUniform3i(glGetUniformLocation(progID, "dmp_TexEnv[0].operandAlpha"),
GL_SRC_ALPHA, GL_SRC_ALPHA, GL_SRC_ALPHA);
glUniform3i(glGetUniformLocation(progID, "dmp_TexEnv[0].srcRgb"),
GL_PRIMARY_COLOR, GL_PRIMARY_COLOR, GL_PRIMARY_COLOR);
glUniform3i(glGetUniformLocation(progID, "dmp_TexEnv[0].srcAlpha"),
GL_PRIMARY_COLOR, GL_PRIMARY_COLOR, GL_PRIMARY_COLOR);

12.4.2. Shadow Lookup Pass

The shadow lookup pass requires that the fragment operation mode be switched to normal mode
(GL_FRAGOP_MODE_GL_DMP), and that texture unit 0 (GL_TEXTURE0) be configured to reference
the shadow texture that has accumulated the shadow information. To reference a shadow texture,
you must set the reserved uniform dmp_TexEnv[0].samplerType to
GL_TEXTURE_SHADOW_2D_DMP (which specifies shadow textures), and bind the texture that has
accumulated shadow information to GL_TEXTURE_2D. When fragment lighting is enabled, its
primary and secondary colors are used as texture combiner input. Otherwise, vertex colors and
output from texture unit 0 are used.

The reserved uniform dmp_Texture[0].perspectiveShadow indicates whether a perspective
projection or orthographic projection was applied when the shadow accumulation pass was run.
Specify GL_TRUE for perspective projection or GL_FALSE for orthographic projection.

Code 12-8. Sample Settings When Fragment Lighting Is Disabled

glUniform1i(glGetUniformLocation(progID, "dmp_Texture[0].samplerType"),
GL_TEXTURE_SHADOW_2D_DMP);
glUniform1i(glGetUniformLocation(progID, "dmp_Texture[0].perspectiveShadow"),
GL_TRUE);
glUniform1i(glGetUniformLocation(progID, "dmp_TexEnv[0].combineRgb"),
GL_MODULATE);
glUniform1i(glGetUniformLocation(progID, "dmp_TexEnv[0].combineAlpha"),
GL_MODULATE);
glUniform3i(glGetUniformLocation(progID, "dmp_TexEnv[0].operandRgb"),
GL_SRC_COLOR, GL_SRC_COLOR, GL_SRC_COLOR);
glUniform3i(glGetUniformLocation(progID, "dmp_TexEnv[0].operandAlpha"),
GL_SRC_ALPHA, GL_SRC_ALPHA, GL_SRC_ALPHA);
glUniform3i(glGetUniformLocation(progID, "dmp_TexEnv[0].srcRgb"),
GL_TEXTURE0, GL_PRIMARY_COLOR, GL_PRIMARY_COLOR);
glUniform3i(glGetUniformLocation(progID, "dmp_TexEnv[0].srcAlpha"),
GL_TEXTURE0, GL_PRIMARY_COLOR, GL_PRIMARY_COLOR);
glUniform1i(glGetUniformLocation(progID, "dmp_FragOperation.mode"),
GL_FRAGOP_MODE_GL_DMP);
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, shadowTexID);

Both 3DS and OpenGL reference shadow information from textures. The texture unit compares
texture coordinates and shadow information obtained from texels. For this to work properly, the
correct texture coordinates must be specified. Note that a shadow texture is referenced using
texture coordinates (s/q, t/q, r/q) in OpenGL, and (s/r, t/r, r - bias) on 3DS. The texture
transformation matrix has already been applied to texture coordinates (s, t, r, q), and the reserved
uniform dmp_Texture[0].shadowZBias specifies the bias value. If shadow information is
accumulated by an orthographic projection, texture coordinates s and t are referenced directly. A
perspective projection, however, requires adjustments to the texture transformation matrix and bias
value.

The following texture transformation matrix and bias value could be used to compare the texture
coordinates and shadow information accumulated by a perspective projection.

Texture Transformation Matrix and Bias for Perspective Projection

Texture Transformation Matrix and Bias for Parallel Projection

n is the clip value at the near clipping plane, f is the clip value at the far clipping plane, and r
and t are the right and top side values of the frustum.

When calculating the value that is compared to the shadow buffer depth (that is, r – bias ), if the
texture coordinate r is not within the range from 0.0 to 1.0, r is clamped to between 0.0 and 1.0
before bias is subtracted. To compare values correctly, specify a bias of 0 for any objects placed
beyond the far clipping plane in the coordinate system of the light source used during the shadow
accumulation pass.

You can create a texture transformation matrix with the following OpenGL code.

Code 12-9. Creating a Texture Transformation Matrix With OpenGL Code

glMatrixMode(GL_TEXTURE);
glLoadIdentity();
// (glFrustum(-r, r, -t, t, n, f))
glFrustumf(r/n, -3r/n, t/n, -3t/n, 1.0f, 0.0f);
glScalef(-1.0f/(f-n), -1.0f/(f-n), -1.0f/(f-n));
// (glOrtho(-r, r, -t, t, n, f))
glOrthof(-3r, r, -3t, t, 2n-f, f);

By setting the border color and texture wrapping mode, you can control the sampling results for
texture coordinates that are less than 0.0 or greater than 1.0. The texture wrapping mode
guarantees that when GL_CLAMP_TO_BORDER is configured for the s and t texture coordinates, out-
of-range sampling values are set to the border color (which must have a value of 0.0 or 1.0 for all
components). Sampling results are undefined in the current implementation if the wrapping mode is
not GL_CLAMP_TO_BORDER or if the border color is neither (0.0, 0.0, 0.0, 0.0) nor (1.0, 1.0, 1.0,
1.0).

Code 12-10. Sample Settings for the Wrapping Mode and Border Color

glBindTexture(GL_TEXTURE_2D, shadowTexID);
GLfloat bcolor[] = {1.0f, 1.0f, 1.0f, 1.0f};
glTexParameterfv(GL_TEXTURE_2D, GL_TEXTURE_BORDER_COLOR, bcolor);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_BORDER);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_BORDER);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_LOD, 0);

If the depth values in light source coordinates are greater than the depth values in shadow texels,
the shadow intensity will be 0.0 after the comparison. Otherwise, the shadow intensity in the
shadow texels is used. Texture units output the shadow intensity as a value for each RGBA
component.

12.4.3. Omnidirectional Shadow Mapping

You can combine cube mapping with shadow textures to implement omnidirectional shadow
mapping.

This implementation requires that you bind a cube map texture (one of six types:
GL_TEXTURE_CUBE_MAP_POSITIVE_{X,Y,Z} or GL_TEXTURE_CUBE_MAP_NEGATIVE_{X,Y,Z})
to the texture unit during the shadow accumulation pass, render six times, and then specify
GL_TEXTURE_SHADOW_CUBE_DMP as the texture to reference during the shadow lookup pass
(dmp_Texture[0].samplerType). Also, you must specify GL_CLAMP_TO_EDGE as the texture
coordinate wrapping mode to use in both the S and T directions.

12.4.4. Soft Shadows Using the Silhouette Shader

Shadow textures contain depth values and shadow intensity. The sampling value output by a
texture unit that references a shadow texture is either the color black (0.0, 0.0, 0.0), within a
shadow region, or the shadow intensity saved in the shadow texture (between 0.0 and 1.0), outside
of a shadow region. You can represent soft shadows by setting this shadow intensity to the
appropriate values.

The 3DS silhouette shader can render silhouette primitives (edges) as soft shadow regions, whose
shadow intensity changes gradually until they disappear. Opaque hard shadows are rendered
before the silhouette shader is used for rendering. Set the G component of the color at the
silhouette edge to 1.0, and set the G component of the vertex color output by the vertex shader to
0.0. All other settings are the same as those used to render opaque hard shadows. This renders
the rectangle for a silhouette edge so that its G component gradually changes from 0.0 on the
object side to 1.0 on the outer side.

During the shadow accumulation pass for these soft shadow regions, you can apply additive
modulation to the shadow intensity, according to the relative distance to an object. This is called an
attenuation factor, and it can adjust the width of the soft shadow region through the reserved
uniforms for the soft shadow bias (dmp_FragOperation.penumbraBias) and scale
(dmp_FragOperation.penumbraScale). By adjusting these values, you can represent more
natural soft shadows that narrow close to the object.

The attenuation factor is calculated by the following equation. The equation uses Z frag to represent
a fragment's depth value and Z rec to represent the depth value stored in a shadow texture. The
shadow intensity is not attenuated properly if objects have not already been rendered and depth
values have not already been saved in a shadow texture.

12.4.5. Handling Shadow Artifacts

12.4.5.1. Self-Shadow Aliasing

Various problems can occur during multiple-pass shadow rendering. For example, by accidentally
casting a shadow on itself (called self-shadow aliasing) a fragment can cause a moiré pattern to
be rendered. This occurs when depth values from the shadow accumulation pass are slightly
smaller than depth values from the shadow lookup pass (measured from the light source).
Unnatural shadows such as these are called shadow artifacts.

One way to suppress shadow artifacts is to apply a negative offset (bias) to depth values during
the shadow lookup pass. You can set the bias value in the reserved uniform
dmp_Texture[0].shadowZBias.

Code 12-11. Sample Settings to Suppress Self-Shadow Aliasing

glUniform1f(glGetUniformLocation(progID, "dmp_Texture[0].shadowZBias"),
1.2f*n/(f-n));

12.4.5.2. Silhouette Shadow Artifacts

As described previously, the shadow intensity in a shadow texture is output as a sampling value
for any location that is determined to be outside of a shadow region during the shadow lookup
pass. Although non-shadow regions usually have a shadow intensity of 1.0 and no brightness
attenuation, the soft shadow regions rendered by the silhouette shader can have an output
shadow intensity that is not 1.0, and thereby have brightness attenuation. This causes shadow
artifacts to occur for some objects.

The method used to suppress self-shadow aliasing does not work for silhouette shadow artifacts,
but these artifacts can still be suppressed through adjustments to shadow settings and texture
combiner settings.

Because the main cause of these artifacts is the brightness attenuation of a light source by soft
shadows on a parallel plane, the shadow texture output (shadow attenuation) can be calculated
by using the following equation.

ShadowAttenuation = 1.0 – f (1.0 – ShadowIntensity)

Where f is a function that returns a value close to 0.0 when the fragment normals are
perpendicular to the light source and 1.0 when the fragment normals are parallel to the
light source.

This equation yields a shadow attenuation of 1.0 (shadows have no effect) when a fragment's
normals are nearly perpendicular to the light source and yields the shadow intensity itself when
the normals are nearly parallel to the light source.

This shadow attenuation term can be implemented by using the Fresnel factor (lookup table FR)
as the f function, reserved uniforms for the lighting environment (dmp_LightEnv.shadowAlpha
and dmp_LightEnv.invertShadow) as the inverse of the shadow intensity, and texture
combiner settings as the inverse of the final value.

Lighting is configured as follows. Note that the Fresnel factor affects only the alpha component.
The alpha component is multiplied by the texture combiners. The f function is configured to return
the square of the input value.

Code 12-12. Sample (Lighting) Settings to Suppress Silhouette Shadow Artifacts

glUniform1i(glGetUniformLocation(progID, "dmp_FragmentLighting.enabled"),
GL_TRUE);
// ..other code..
glUniform1i(glGetUniformLocation(progID, "dmp_LightEnv.lutInputFR"),
GL_LIGHT_ENV_LN_DMP);
glUniform1i(glGetUniformLocation(progID, "dmp_LightEnv.config"),
GL_LIGHT_ENV_LAYER_CONFIG1_DMP);
glUniform1i(glGetUniformLocation(progID, "dmp_LightEnv.fresnelSelector"),
GL_LIGHT_ENV_PRI_SEC_ALPHA_FRESNEL_DMP);
glUniform1i(glGetUniformLocation(progID, "dmp_LightEnv.shadowAlpha"),
GL_TRUE);
glUniform1i(glGetUniformLocation(progID, "dmp_LightEnv.invertShadow"),
GL_TRUE);

GLfloat lut[512];
int j;
memset(lut, 0, sizeof(lut));
for (j = 1; j < 128; j++)
{
lut[j] = powf((float)j/127.0f, 2.0f);
lut[j+255] = lut[j] - lut[j-1];
}
glTexImage1D(GL_LUT_TEXTURE0_DMP, 0, GL_LUMINANCEF_DMP, 512, 0,
GL_LUMINANCEF_DMP, GL_FLOAT, lut);
glUniform1i(glGetUniformLocation(progID, "dmp_FragmentMaterial.samplerFR"), 0);

In this code, f (1.0 – ShadowIntensity) is output as the fragment's primary and secondary alpha
component.

The final shadow attenuation factor is calculated next from the texture combiner settings and is
multiplied by the fragment's primary color. Note that the fragment's primary alpha value is
inverted here (GL_ONE_MINUS_SRC_ALPHA).

Code 12-13. Sample (Texture Combiner) Settings to Suppress Silhouette Shadow Artifacts

glUniform1i(glGetUniformLocation(progID, "dmp_TexEnv[0].combineRgb"),
    GL_MODULATE);
glUniform1i(glGetUniformLocation(progID, "dmp_TexEnv[0].combineAlpha"),
    GL_REPLACE);
glUniform3i(glGetUniformLocation(progID, "dmp_TexEnv[0].operandRgb"),
    GL_SRC_COLOR, GL_ONE_MINUS_SRC_ALPHA, GL_SRC_COLOR);
glUniform3i(glGetUniformLocation(progID, "dmp_TexEnv[0].operandAlpha"),
    GL_SRC_ALPHA, GL_SRC_ALPHA, GL_SRC_ALPHA);
glUniform3i(glGetUniformLocation(progID, "dmp_TexEnv[0].srcRgb"),
    GL_FRAGMENT_PRIMARY_COLOR_DMP, GL_FRAGMENT_PRIMARY_COLOR_DMP,
    GL_PRIMARY_COLOR);
glUniform3i(glGetUniformLocation(progID, "dmp_TexEnv[0].srcAlpha"),
    GL_PRIMARY_COLOR, GL_PRIMARY_COLOR, GL_PRIMARY_COLOR);

12.4.6. Reserved Uniform


The following table lists the reserved uniforms used for shadows.

Table 12-12. Reserved Uniforms Used for Shadows

Reserved Uniform                    Type    Setting Value

dmp_Texture[0].samplerType          int     Specifies the type of texture to reference. The following
                                            two types can be used for shadows.
                                            GL_TEXTURE_SHADOW_2D_DMP
                                            GL_TEXTURE_SHADOW_CUBE_DMP

dmp_Texture[0].perspectiveShadow    bool    Specifies whether a perspective projection has been
                                            applied as the projection transformation.
                                            GL_TRUE: Has been applied (default).
                                            GL_FALSE: Has not been applied.

dmp_Texture[0].shadowZBias          float   Specifies the bias value for the negative offset to apply
                                            to depth values during the lookup pass.
                                            0.0 (default)

dmp_FragOperation.mode              int     Specifies the fragment operation mode. Specify shadow
                                            mode during the accumulation pass and standard mode
                                            during the lookup pass.
                                            GL_FRAGOP_MODE_GL_DMP (standard mode)
                                            GL_FRAGOP_MODE_SHADOW_DMP (shadow mode)

dmp_FragOperation.wScale            float   Specifies the scale factor for the w-buffer.
                                            0.0 (default)

dmp_FragOperation.penumbraScale     float   Specifies the scale value for soft shadows.
                                            1.0 (default)

dmp_FragOperation.penumbraBias      float   Specifies the bias value for soft shadows.
                                            0.0 (default)

12.4.7. Checking Shadow Texture Content

To check the image that was rendered to a shadow texture, attach the shadow texture to the current
color buffer, and then read the texel data by calling the glReadPixels() function and specifying
GL_RGBA for the format parameter and GL_UNSIGNED_BYTE for the type parameter. When
accessed through a u32 pointer, the least-significant 8 bits of texel data store the shadow intensity,
and the most-significant 24 bits store the depth value. Shift this to the right by 8 bits to get only the
depth value.
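
The following hypothetical helper (not part of the SDK) illustrates splitting one texel value read back
this way into its two fields.

// Splits a shadow texel accessed through a u32 pointer into the 8-bit
// shadow intensity and the 24-bit depth value described above.
static void DecodeShadowTexel(unsigned int texel,
                              unsigned int* intensity, unsigned int* depth)
{
    *intensity = texel & 0xFFu;  /* least-significant 8 bits */
    *depth = texel >> 8;         /* most-significant 24 bits */
}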

Because shadow textures store an 8-bit shadow intensity and a 24-bit depth value, and both of
these represent values in the range from 0.0 to 1.0, the shadow intensity and depth value have
different precisions.

The shadow intensity takes a value between 0x00 and 0xFF, where 0xFF indicates the absence of
a soft shadow region and all other values indicate the presence of a soft shadow region. In other
words, a value of 0x00 indicates a shadow intensity of 0.0 and 0xFF indicates a shadow intensity
of 1.0.

The depth value is scaled so that 0x000000 is the near value and 0xFFFFFF is the far value. In
other words, a value of 0x000000 indicates a depth value of 0.0 and 0xFFFFFF indicates a depth
value of 1.0. Note that this scaling is uniform if the w buffer is enabled during the shadow
accumulation pass. For more information, see 10.3.3. W-Buffer.

A shadow intensity of 0xFF and a depth value of 0xFFFFFF are written to the shadow texture
wherever there are no shadows. A shadow intensity of 0xFF and a depth value other than
0xFFFFFF are written to the shadow texture wherever there are only hard shadows. A shadow
intensity other than 0xFF and a depth value other than 0xFFFFFF are used in regions that have
both hard and soft shadows.

Table 12-13. Types of Shadows and the Values Written to Shadow Textures

Shadow Type          Shadow Intensity        Depth Values

No shadows           0xFF                    0xFFFFFF
Only hard shadows    0xFF                    Value other than 0xFFFFFF
Soft shadows         Value other than 0xFF   Value other than 0xFFFFFF

12.5. Fog

Although 3DS fog has nearly the same features as fog defined in OpenGL ES 1.1, the effect of 3DS
fog is determined by projection-corrected depth values, whereas the effect of OpenGL fog is
determined by the distance from the viewpoint. Another difference is that fog coefficients are
specified by lookup tables. There are also no settings for fog properties, such as the beginning, end,
and density.

As in OpenGL, the following equation determines a fragment's color after fog has been applied.

Color = f × C fragment + (1 - f) × C fog

f is the fog coefficient (between 0.0 and 1.0).

12.5.1. Reserved Uniform

The following reserved uniforms are used for fog.

Fog Mode

To enable fog, specify a fog mode (dmp_Fog.mode) of GL_FOG using the glUniform1i()
function. To disable it, specify GL_FALSE.

Fog Color

Use the glUniform3f() function to set the fog color (dmp_Fog.color). Only the RGB
components are set. (The alpha component is not.)

Fog Coefficients

Fog coefficients are configured through a lookup table that takes depth values in window coordinates
as input. Call the glUniform1i() function to set the number of the lookup table to use in the
reserved uniform dmp_Fog.sampler. Note that the value set here is the number i (from 0 to 31) in
GL_LUT_TEXTUREi_DMP; it is neither the texture name (ID) nor the GL_LUT_TEXTUREi_DMP
enumerant itself.

Whether to Invert Input Depth Value

You can choose whether to invert input values (changing z to 1 - z) for the fog coefficient lookup
table. To invert values, set the reserved uniform dmp_Fog.zFlip to GL_TRUE, using the
glUniform1i() function.
Table 12-14. Reserved Uniforms Used for Fog

Reserved Uniform   Type   Setting Value

dmp_Fog.mode       int    Specifies the mode for processing the fog pipeline (the fog mode).
                          GL_FOG
                          GL_GAS_DMP
                          GL_FALSE (default)

dmp_Fog.color      vec3   Specifies the fog color. There is no alpha component.
                          Each component has a value between 0.0 and 1.0.
                          (0.0, 0.0, 0.0) by default.

dmp_Fog.zFlip      bool   Specifies whether to invert the depth values used as input to the
                          lookup table for fog coefficients.
                          GL_TRUE or GL_FALSE (default).

dmp_Fog.sampler    int    Specifies the lookup table to use for fog coefficients.
                          0 through 31
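
For example, a minimal sketch that enables fog with a gray fog color (assuming the fog coefficient
curve has already been loaded into lookup table 7) might look like the following.

// Enable fog, set a gray fog color, keep the input depth values uninverted,
// and sample lookup table 7 for the fog coefficients (illustrative values).
glUniform1i(glGetUniformLocation(progID, "dmp_Fog.mode"), GL_FOG);
glUniform3f(glGetUniformLocation(progID, "dmp_Fog.color"), 0.5f, 0.5f, 0.5f);
glUniform1i(glGetUniformLocation(progID, "dmp_Fog.zFlip"), GL_FALSE);
glUniform1i(glGetUniformLocation(progID, "dmp_Fog.sampler"), 7);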

12.5.2. Creating and Specifying Lookup Tables

A lookup table for fog coefficients takes input values between 0.0 and 1.0, and has a fixed width
of 256. Like other lookup tables, it stores the output values in the first 128 elements and the
differences between output values in the last 128 elements.

You must be careful about how you convert input values, when you create lookup tables to
implement OpenGL fog coefficients on a 3DS system. The input depth values are specified in
window coordinates, with the near clipping plane at the minimum value (0.0) and the far clipping
plane at the maximum value (1.0). For a perspective projection, however, these depth values have
a non-linear relationship with depth values in eye coordinates. As a result, lookup table output must
be calculated using depth values that were converted from window coordinates into eye
coordinates.

Considering that input in window coordinates (between 0.0 and 1.0) is mapped to clip coordinates
(between 0.0 and –1.0), the following equation is used to convert values into eye coordinates.
Note that the sign of the input is reversed.

(Xe Ye Ze We) = (0.0 0.0 -Zw 1.0) × M projection -1

Fog coefficients are a function of the distance to a fragment from the origin in eye coordinates.
Input to this function can be approximated as the distance -Ze / We between the xy plane and the
fragment in eye coordinates.

This is shown by the following sample code. FogCoef is the fog coefficient function.

Code 12-14. Creating a Fog Lookup Table

float Fog_LUT[256], Fog_c[128 + 1];
int i;
Matrix44 invPM;
Vector4 v_eye, v_clip(0.0f, 0.0f, 0.0f, 1.0f);

MTX44Inverse(&invPM, &projMatrix);
Vector4 v0(invPM.m[0]);
Vector4 v1(invPM.m[1]);
Vector4 v2(invPM.m[2]);
Vector4 v3(invPM.m[3]);
for (i = 0; i <= 128; i++) {
    v_clip.z = -(static_cast<f32>(i)) / 128;
    v_eye.x = VEC4Dot(&v0, &v_clip); v_eye.y = VEC4Dot(&v1, &v_clip);
    v_eye.z = VEC4Dot(&v2, &v_clip); v_eye.w = VEC4Dot(&v3, &v_clip);
    Fog_c[i] = -(v_eye.z / v_eye.w);
}
for (i = 0; i < 128; i++) {
    Fog_LUT[i] = FogCoef(Fog_c[i]);
    Fog_LUT[128 + i] = FogCoef(Fog_c[i + 1]) - FogCoef(Fog_c[i]);
}

Note: OpenGL fog is affected by the eye coordinates' z value, but 3DS fog is affected by depth
values that have been corrected with a perspective projection. As a result, fog changes
when the near and far clipping planes change, and the same lookup table can produce
different effects depending on whether the w-buffer or a normal depth buffer is used. For
more information, see 10.3.3. W-Buffer.

12.6. Gas Rendering

Gas rendering uses the fog feature, configured in gas mode, to render gaseous bodies from density
information, which is itself generated based on depth values of polygon objects. Gaseous bodies
require three rendering passes: the polygon object rendering pass, which generates depth values for
the polygon objects; the density rendering pass, which accumulates density information in a gas
texture; and the shading pass, which renders gaseous bodies by referencing a gas texture.

12.6.1. Polygon Object Rendering Pass

This pass renders polygon objects as usual, generating depth information that is used to determine
where the polygon objects and gaseous bodies intersect. The content of the depth buffer is the only
rendering result that is used by the next pass.

12.6.2. Density Rendering Pass

This pass uses a means such as point sprites to render gas particles (textures with density
patterns), the smallest units that make up a gaseous body, and it accumulates the gas's density
information in the color buffer. The content of the color buffer is copied to the gas texture and is
used by the next pass.

Preparing Color Buffers and Gas Textures

Prepare a color buffer with an internal format of GL_GAS_DMP and a texture (gas texture), both of
which are used to copy the rendering results. Create the color buffer with the same size as the
depth buffer, but the gas texture must have a width and height (in texels) that are both powers of 2,
and GL_UNSIGNED_SHORT must be specified for type. Because gas textures always use point
sampling, minification and magnification filters have no effect. Use GL_NEAREST as the filter
setting. Note that mipmaps cannot be applied to gas textures.

Code 12-15. Preparing a Color Buffer and Gas Texture for Gas Rendering

// Generating Object
glGenFramebuffers(1, &gasAccFboID);
glGenTextures(1, &gasTexID);
// Renderbuffer & Framebuffer
glGenRenderbuffers(1, &gasAccColorID);
glBindRenderbuffer(GL_RENDERBUFFER, gasAccColorID);
glRenderbufferStorage(GL_RENDERBUFFER, GL_GAS_DMP, GAS_ACC_WIDTH,
GAS_ACC_HEIGHT);
glBindFramebuffer(GL_FRAMEBUFFER, gasAccFboID);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_RENDERBUFFER,
gasAccColorID);
// Gaseous Texture
glBindTexture(GL_TEXTURE_2D, gasTexID);
glTexImage2D(GL_TEXTURE_2D, 0, GL_GAS_DMP, GAS_TEX_WIDTH, GAS_TEX_HEIGHT, 0,
GL_GAS_DMP, GL_UNSIGNED_SHORT, 0);
glTexParameterf(GL_TEXTURE_2D,GL_TEXTURE_MIN_FILTER,GL_NEAREST);
glTexParameterf(GL_TEXTURE_2D,GL_TEXTURE_MAG_FILTER,GL_NEAREST);

Rendering Gas Particles

To render density information to the color buffer, you must switch the fragment operation mode
(dmp_FragOperation.mode) to gas mode (GL_FRAGOP_MODE_GAS_ACC_DMP). Two types of
density information are accumulated in the color buffer: information that is simply accumulated
(D1), and information that accounts for intersections with polygon objects (D2).

Disable the depth test, the depth mask, blending, and fog to prevent the content of the depth buffer
from being updated. However, do not change the comparison function that was used for the depth
test, when polygon objects were rendered.

Clear only one buffer, the color buffer to which density information is rendered. The depth buffer
must not be cleared.

The R component of the gas particles to be rendered (Df) is accumulated in the color buffer as
density information. D1 and D2 are updated to D1' and D2' by the following equations. The equation
for D2' depends on the comparison function for the depth test.

Zb is the depth value stored in the depth buffer, and Zf is the depth value of the fragment.

D1 accumulates fragment (gas particle) density information unchanged. D2 accumulates fragment
density information that has been multiplied by the attenuation coefficient EZ in the depth direction
and also by the difference between the depth values in the depth buffer and the fragment. The EZ
coefficient is a floating-point number set by the glUniform1f() function in the reserved uniform
dmp_Gas.deltaZ.

Code 12-16. Rendering Gas Particles

// Change to gas accumulation mode.
glBindFramebuffer(GL_FRAMEBUFFER, gasAccFboID);
glUniform1i(glGetUniformLocation(progID, "dmp_FragOperation.mode"),
GL_FRAGOP_MODE_GAS_ACC_DMP);
glDisable(GL_DEPTH_TEST);
glDepthMask(GL_FALSE);
glDisable(GL_BLEND);
glUniform1i(glGetUniformLocation(progID, "dmp_Fog.mode"), GL_FALSE);
// Colorbuffer Clear
glClearColor(0.0f, 0.0f, 0.0f, 0.0f);
glClear(GL_COLOR_BUFFER_BIT);
// Set dmp_Gas.deltaZ.
glDepthFunc(GL_LESS);
glUniform1f(glGetUniformLocation(progID, "dmp_Gas.deltaZ"), 50.0f);

Copying Information to a Gas Texture


After all gas particles have been rendered, the density information accumulated in the color buffer
is copied to a gas texture.

The gas texture is not usually allocated with the same size as the color buffer (the width and height
must be powers of 2), so the glCopyTexSubImage2D() function partially copies color buffer data.
The gas texture is required for the next pass.

Code 12-17. Copying Information to a Gas Texture

// Bind and copy to gaseous texture.
glBindTexture(GL_TEXTURE_2D, gasTexID);
glCopyTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 0, 0, GAS_ACC_WIDTH,
GAS_ACC_HEIGHT);

12.6.3. Shading Pass

This pass shades gaseous bodies by referencing the density information accumulated in a gas
texture, and then blends the shading results with the color buffer contents from the polygon object
rendering pass.

Gaseous bodies are shaded using the fog feature configured in gas mode. Special settings are
required for fog input from the texture combiners.

Fog Input (Texture Combiner Settings)

Fog takes two inputs: the RGBA components from the second-to-last texture combiner (though only
the R component is actually used), and the density information from the input source, specified as
input source 2 (the third component of srcRgb and srcAlpha) to the last texture combiner (the
specified texture unit must sample the gas texture). Because output from the last texture combiner
is ignored, the number of usable texture combiner levels is reduced by one.

Ultimately, fog output is blended with the contents of the color buffer. Because fog outputs alpha
values that account for intersections with polygon objects, blending both outputs based on the fog's
alpha values allows gaseous bodies to be rendered in the correct order front-to-back.

Figure 12-5. Fog Input and Blending With the Color Buffer

Fog Operations in Gas Mode

In gas mode, fog shades gaseous bodies based on the input information. To enable fog in gas
mode, specify a fog mode (dmp_Fog.mode) of GL_GAS_DMP.

The RGB components of the shading results are determined by shading lookup tables. To determine
the alpha component, the gas attenuation value (dmp_Gas.attenuation) is multiplied by density
information that accounts for intersections (D2), and the result is looked up in the lookup table
specified by the fog coefficient (dmp_Fog.sampler).
Figure 12-6. Fog in Gas Mode

Shading Lookup Tables

Shading lookup tables accept either density information or shading intensity as input, and provide
the RGB components of the shading results as output.

Shading lookup tables are specified separately for each component using tables generated with
width set to 16. Both the input and output values are between 0.0 and 1.0. As with other lookup
tables, the output values are stored in the first eight elements, the differences between output
values are stored in the last eight elements, and the output values are interpolated using the delta
values.

To specify which shading lookup tables to use, call the glUniform1i() function to set the lookup
table number in the following reserved uniforms. Note that the value set here is the number i (from
0 to 31) in GL_LUT_TEXTUREi_DMP; it is neither the texture name (ID) nor the
GL_LUT_TEXTUREi_DMP enumerant itself.

Table 12-15. Reserved Uniforms Used to Specify Shading Lookup Tables

Reserved Uniform     Type   Setting Value

dmp_Gas.samplerTR    int    Specify the lookup table numbers to use as the shading lookup
dmp_Gas.samplerTG           tables (for the R, G, and B components, respectively).
dmp_Gas.samplerTB           0 through 31

The following are examples of shading lookup tables and their implementations.

Figure 12-7. A Shading Lookup Table

Code 12-18. Creating Shading Lookup Tables

// Define
GLfloat shading_color[3 * 9] = {
0.00f, 0.00f, 0.00f,
0.20f, 0.15f, 0.05f,
0.60f, 0.25f, 0.15f,
0.90f, 0.35f, 0.20f,
0.92f, 0.60f, 0.15f,
0.95f, 0.85f, 0.05f,
1.00f, 0.95f, 0.00f,
1.00f, 1.00f, 1.00f,
1.00f, 1.00f, 1.00f
};
GLfloat samplerTR[16], samplerTG[16], samplerTB[16];
// Table
for(int i = 0; i < 8; i++) {
// shading color value
samplerTR[i] = shading_color[3*i + 0];
samplerTG[i] = shading_color[3*i + 1];
samplerTB[i] = shading_color[3*i + 2];
// difference of shading color value
samplerTR[8 + i] = shading_color[3*(i + 1) + 0] - shading_color[3*i + 0];
samplerTG[8 + i] = shading_color[3*(i + 1) + 1] - shading_color[3*i + 1];
samplerTB[8 + i] = shading_color[3*(i + 1) + 2] - shading_color[3*i + 2];
}
// Texture
glBindTexture(GL_LUT_TEXTURE1_DMP, samplerTR_ID);
glTexImage1D(GL_LUT_TEXTURE1_DMP, 0, GL_LUMINANCEF_DMP, 16, 0,
GL_LUMINANCEF_DMP, GL_FLOAT, samplerTR);
glBindTexture(GL_LUT_TEXTURE2_DMP, samplerTG_ID);
glTexImage1D(GL_LUT_TEXTURE2_DMP, 0, GL_LUMINANCEF_DMP, 16, 0,
GL_LUMINANCEF_DMP, GL_FLOAT, samplerTG);
glBindTexture(GL_LUT_TEXTURE3_DMP, samplerTB_ID);
glTexImage1D(GL_LUT_TEXTURE3_DMP, 0, GL_LUMINANCEF_DMP, 16, 0,
GL_LUMINANCEF_DMP, GL_FLOAT, samplerTB);
// Set Uniform
glUniform1i(glGetUniformLocation(progID, "dmp_Gas.samplerTR"), 1);
glUniform1i(glGetUniformLocation(progID, "dmp_Gas.samplerTG"), 2);
glUniform1i(glGetUniformLocation(progID, "dmp_Gas.samplerTB"), 3);

Shading Based on Density Information

For shading that is based on density information alone (without accounting for the effect of
shadows cast by light), set the reserved uniform dmp_Gas.colorLutInput to
GL_GAS_DENSITY_DMP, using the glUniform1i() function to select density information, as the
shading lookup table input.

As described for the density rendering pass, two types of density information are stored in a gas
texture. One type of density information does not account for intersections with polygon objects
(D1), and the other type does (D2). You can set a value in the reserved uniform
dmp_Gas.shadingDensitySrc to select the density information to use.

Specify GL_GAS_PLAIN_DENSITY_DMP for D1 and GL_GAS_DEPTH_DENSITY_DMP for D2, using
the glUniform1i() function. Even if you choose D1 as input to the shading lookup tables, fog in
gas mode outputs alpha components calculated from density information that accounts for
intersections (D2). By using alpha values while blending, polygon objects and gaseous bodies can
be rendered with the correct front-to-back ordering.

Lookup table input uses values between 0.0 and 1.0. You can multiply the density information by
the reciprocal of the maximum density value to keep values in this range. The reciprocal of the
maximum value can be calculated automatically. By setting the reserved uniform
dmp_Gas.autoAcc to GL_TRUE, you can use the glUniform1i() function to calculate the
reciprocal from the maximum D1 value in the density rendering pass.

When GL_FALSE is specified, the value set in the reserved uniform dmp_Gas.accMax by the
glUniform1f() function is used as the reciprocal of the maximum value.
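The following is a minimal sketch of these settings, assuming progID is the linked program object used
for the shading pass and maxDensity is a value known to your application (both names are illustrative).

// Select density information as the shading lookup table input.
glUniform1i(glGetUniformLocation(progID, "dmp_Gas.colorLutInput"),
            GL_GAS_DENSITY_DMP);
glUniform1i(glGetUniformLocation(progID, "dmp_Gas.shadingDensitySrc"),
            GL_GAS_PLAIN_DENSITY_DMP);
// Either let the reciprocal of the maximum density be calculated automatically...
glUniform1i(glGetUniformLocation(progID, "dmp_Gas.autoAcc"), GL_TRUE);
// ...or disable the automatic calculation and set the reciprocal explicitly.
// glUniform1i(glGetUniformLocation(progID, "dmp_Gas.autoAcc"), GL_FALSE);
// glUniform1f(glGetUniformLocation(progID, "dmp_Gas.accMax"), 1.0f / maxDensity);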

Shading Based on the Shading Intensity

To use shading that is calculated from the shading intensity (accounting for the effect of shadows
cast by light), set the reserved uniform dmp_Gas.colorLutInput to
GL_GAS_LIGHT_FACTOR_DMP, using the glUniform1i() function. This selects the shading
intensity as input to the shading lookup table.

The shading intensity is the total of two calculated values: the planar shading intensity (IG) and the
view shading intensity (IS). IG and IS are defined as follows.

ig = r × (1.0 - lightAtt × d1)
IG = (1.0 - ig) × lightMin + ig × lightMax

is = LZ × (1.0 - scattAtt × d1)
IS = (1.0 - is) × scattMin + is × scattMax

d1 is the result of multiplying the density information that does not account for intersections with
polygon objects (D1), by the reciprocal of the maximum density. This mechanism is similar to the
one used for shading based on density information. r is the R component output from the texture
combiner that is used as fog input. lightAtt, lightMin, and lightMax are coefficients of the planar
shading intensity. LZ is the light direction along the z-axis in eye coordinates. scattAtt, scattMin,
and scattMax are coefficients of the view shading intensity. These all take values between 0.0 and
1.0.

The planar shading intensity is proportional to r and (1.0 – lightAtt × d1), as its equation shows. It
approaches lightMax when r increases and lightMin when d1 increases. Likewise, the view shading
intensity is proportional to LZ and (1.0 – scattAtt × d1). It approaches scattMax when LZ increases
and scattMin when d1 increases.

Note that the shading intensity input to a shading lookup table is larger for gaseous bodies that are
less dense. This emulates the real behavior of light, which penetrates through the thin (low-density)
areas of a gaseous body, and is absorbed in the thick (high-density) areas. Accordingly, lightMin
and scattMin represent the shading intensity when the effect of light is small, and lightMax and
scattMax represent the shading intensity when the effect of light is large. lightAtt and scattAtt set
the ratio of light attenuation caused by density. Note that, depending on the values set in the
shading lookup tables, lightMin is not necessarily less than lightMax (and scattMin is not
necessarily less than scattMax). Also note that alpha values are determined by fog coefficients
whose input is proportional to density.

Coefficients for the planar shading intensity (lightMin, lightMax, and lightAtt) are set, as a group, in
the reserved uniform dmp_Gas.lightXY. Coefficients for the view shading intensity (scattMin,
scattMax, and scattAtt), and the light direction along the z-axis in eye coordinates (LZ), are set as
a group in the reserved uniform dmp_Gas.lightZ. The minimum value, maximum value, and
attenuation are set in that order. They are followed by LZ for the view shading intensity.

The following code shows how to set the shading intensity coefficients.

Code 12-19. Setting Shading Intensity Coefficients

GLfloat lightXY[3], lightZ[4];


GLfloat lightMin, lightMax, lightAtt;
GLfloat scattMin, scattMax, scattAtt, LZ;

//...

// lightXY
lightXY[0] = lightMin;
lightXY[1] = lightMax;
lightXY[2] = lightAtt;
// lightZ
lightZ[0] = scattMin;
lightZ[1] = scattMax;
lightZ[2] = scattAtt;
lightZ[3] = LZ;
// Set Uniform
glUniform3fv(glGetUniformLocation(progID, "dmp_Gas.lightXY"), 1, lightXY);
glUniform4fv(glGetUniformLocation(progID, "dmp_Gas.lightZ"), 1, lightZ);

Alpha Shading

The results of shading the alpha component are determined by output from the lookup table for fog
coefficients (dmp_Fog.sampler), given the product of the gas attenuation
(dmp_Gas.attenuation) and the density information that accounts for intersections (D2) as input.

Fog coefficients are normally specified using a function that approaches 0.0 when the gas density
is low, and 1.0 when the gas density is high.
Code 12-20. Setting Fog Coefficients

// Fog factor (256-entry lookup table: 128 values followed by 128 deltas)
GLfloat fogTable[256];
GLuint fogLUT_ID;
for(int i = 0; i < 128; i++) {
fogTable[i] = 1.0f - expf(-8.0f * i / 128.0f);
}
for(int i = 0; i < 128; i++) {
fogTable[128 + i] = fogTable[i + 1] - fogTable[i];
}
fogTable[255] = 0;
// Set LUT
glGenTextures(1, &fogLUT_ID);
glBindTexture(GL_LUT_TEXTURE0_DMP, fogLUT_ID);
glTexImage1D(GL_LUT_TEXTURE0_DMP, 0, GL_LUMINANCEF_DMP, 256, 0,
GL_LUMINANCEF_DMP, GL_FLOAT, fogTable);

The following sample code sets uniforms required by the shading pass and renders a quad
(polygon), to which a gas texture has been applied.

Code 12-21. Setting Uniforms for the Shading Pass

// Bind Framebuffer(Colorbuffer)
glBindFramebuffer(GL_FRAMEBUFFER, renderFboID);
// Bind Gas Texture
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, gasTexID);
glUniform1i(glGetUniformLocation(progID, "dmp_Texture[0].samplerType"),
GL_TEXTURE_2D);
// Set TextureCombiner #5
glUniform3i(glGetUniformLocation(progID, "dmp_TexEnv[5].srcRgb"),
GL_PREVIOUS, GL_PREVIOUS, GL_TEXTURE0);
glUniform3i(glGetUniformLocation(progID, "dmp_TexEnv[5].srcAlpha"),
GL_PREVIOUS, GL_PREVIOUS, GL_TEXTURE0);
// Set uniform for gas shading mode.
glUniform1i(glGetUniformLocation(progID, "dmp_Fog.sampler"), 0);
glUniform1i(glGetUniformLocation(progID, "dmp_Gas.autoAcc"), GL_FALSE);
glUniform1i(glGetUniformLocation(progID, "dmp_Gas.samplerTR"), 1);
glUniform1i(glGetUniformLocation(progID, "dmp_Gas.samplerTG"), 2);
glUniform1i(glGetUniformLocation(progID, "dmp_Gas.samplerTB"), 3);
glUniform1i(glGetUniformLocation(progID, "dmp_Gas.shadingDensitySrc"),
GL_GAS_DEPTH_DENSITY_DMP);
glUniform1i(glGetUniformLocation(progID, "dmp_Gas.colorLutInput"),
GL_GAS_DENSITY_DMP);
glUniform1f(glGetUniformLocation(progID, "dmp_Gas.accMax"), 1.0f/6.0f);
glUniform4fv(glGetUniformLocation(progID, "dmp_Gas.lightZ"), 1, gasLightZ);
glUniform3fv(glGetUniformLocation(progID, "dmp_Gas.lightXY"), 1, gasLightXY);
// Change to gas shading mode.
glUniform1i(glGetUniformLocation(progID, "dmp_FragOperation.mode"),
GL_FRAGOP_MODE_GL_DMP);
glUniform1i(glGetUniformLocation(progID, "dmp_Fog.mode"), GL_GAS_DMP);
glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
glDisable(GL_DEPTH_TEST);
glDepthMask(GL_FALSE);

Code 12-22. Rendering a Quad (Polygon) With a Gas Texture Applied

// Gaseous Shading
// Texture Coord
float u0 = 0.0f;
float v0 = 0.0f;
float u1 = (GAS_ACC_WIDTH * 1.0f) / (GAS_TEX_WIDTH * 1.0f);
float v1 = (GAS_ACC_HEIGHT * 1.0f) / (GAS_TEX_HEIGHT * 1.0f);
GLfloat texCoord[8]= {u0, v0, u0, v1, u1, v1, u1, v0};
// Vertex
GLushort quadIndex[6] = {0, 1, 2, 0, 2, 3};
GLfloat vertex[16] = {
-1.0f, -1.0f, 0.0f, 1.0f,
-1.0f, 1.0f, 0.0f, 1.0f,
1.0f, 1.0f, 0.0f, 1.0f,
1.0f, -1.0f, 0.0f, 1.0f
};
// Set Array
GLuint quadIndexID, quadVertexID, quadTexCoordID;
glGenBuffers(1, &quadIndexID);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, quadIndexID);
glBufferData(GL_ELEMENT_ARRAY_BUFFER, 6 * sizeof(GLushort), &quadIndex,
GL_STATIC_DRAW);
glGenBuffers(1, &quadVertexID);
glBindBuffer(GL_ARRAY_BUFFER, quadVertexID);
glBufferData(GL_ARRAY_BUFFER, 16 * sizeof(GLfloat), &vertex, GL_STATIC_DRAW);
glGenBuffers(1, &quadTexCoordID);
glBindBuffer(GL_ARRAY_BUFFER, quadTexCoordID);
glBufferData(GL_ARRAY_BUFFER, 8 * sizeof(GLfloat), &texCoord, GL_STATIC_DRAW);
// Draw Quad
glEnableVertexAttribArray(0);
glEnableVertexAttribArray(1);
glBindBuffer(GL_ARRAY_BUFFER, quadVertexID);
glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, 0, 0);
glBindBuffer(GL_ARRAY_BUFFER, quadTexCoordID);
glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, 0, 0);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, quadIndexID);
glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, 0);

12.6.4. Reserved Uniform

The following table lists the reserved uniforms used for gas.

Table 12-16. Reserved Uniforms Used for Gas

dmp_FragOperation.mode (int)
    Specifies the fragment operation mode. Specify gas mode in the density rendering pass and
    standard mode in any other pass.
    GL_FRAGOP_MODE_GL_DMP (standard mode)
    GL_FRAGOP_MODE_SHADOW_DMP (shadow mode)

dmp_Fog.mode (int)
    Specifies the mode for processing the fog pipeline. Specify gas mode in the shading pass and
    disable fog in the density rendering pass.
    Specify GL_GAS_DMP for gas mode in the shading pass, and GL_FALSE to disable fog in the
    density rendering pass.

dmp_Fog.sampler (int)
    Specifies the lookup table to use for fog coefficients. Fog coefficients are used to calculate
    alpha values for gaseous bodies.
    0 through 31

dmp_Gas.deltaZ (float)
    Specifies the attenuation coefficient EZ in the depth direction.
    10.0 (default)

dmp_Gas.autoAcc (bool)
    Specifies whether to automatically calculate the reciprocal of the maximum density value.
    GL_TRUE: Automatically calculate (default).
    GL_FALSE: Do not automatically calculate.

dmp_Gas.accMax (float)
    Specifies the reciprocal of the maximum density value.
    0.0 or more
    1.0 (default)

dmp_Gas.samplerTR, dmp_Gas.samplerTG, dmp_Gas.samplerTB (int)
    Specify the shading lookup table for each RGB component.
    0 through 31

dmp_Gas.shadingDensitySrc (int)
    Specifies the density information to use for shading.
    Specify GL_GAS_PLAIN_DENSITY_DMP (default) or GL_GAS_DEPTH_DENSITY_DMP.

dmp_Gas.colorLutInput (int)
    Specifies whether the density or the shading intensity is given as input to the shading
    lookup tables.
    Specify GL_GAS_LIGHT_FACTOR_DMP (default) or GL_GAS_DENSITY_DMP.

dmp_Gas.lightXY (vec3)
    Specifies factors used to control planar shading: the minimum intensity, maximum intensity,
    and density attenuation (lightMin, lightMax, lightAtt).
    Each control value is between 0.0 and 1.0. The default value is (0.0, 0.0, 0.0).

dmp_Gas.lightZ (vec4)
    Specifies factors used to control view shading: the minimum intensity, maximum intensity,
    density attenuation, and effect in the view direction (scattMin, scattMax, scattAtt, LZ).
    Each control value is between 0.0 and 1.0. The default value is (0.0, 0.0, 0.0, 0.0).

dmp_Gas.attenuation (float)
    Specifies the density attenuation coefficient to use when calculating alpha values for shading.
    0.0 or more
    1.0 (default)


13. Per-Fragment Operations


Per-fragment operations update the pixel data stored at a fragment's window coordinates in the
framebuffer, based on the lighting results and other related data.

Not all fragments are processed, however. Unnecessary fragments are rejected by tests, such as the
alpha test, and the fragments that remain are then combined with the existing pixel data and colors
through blending and logical operations.

When the fragment operation mode (dmp_FragOperation.mode) is in standard mode
(GL_FRAGOP_MODE_GL_DMP), per-fragment operations occur as in the process in Figure 13-1.

Figure 13-1. Per-Fragment Operations Process in Standard Mode

The 3DS specifications differ from the OpenGL ES 2.0 specifications in the following ways.

The scissor test is run during rasterization on a CTR system.
The glSampleCoverage() function has not been implemented because multisampling is not supported.
The alpha test provides features equivalent to those in OpenGL ES 1.1, but it is controlled entirely
through reserved uniforms.
The glStencil*Separate() functions have not been implemented because the stencil test does not
distinguish between front-facing polygons and back-facing polygons.
Logical operators (GL_LOGIC_OP) cannot be selected for blending.
GL_SRC_ALPHA_SATURATE can be selected for the destination in the glBlendFunc* blending
functions, even though this is not allowed by the OpenGL ES specifications.
Dithering is not applied.
Logical operations are implemented to correspond with the OpenGL ES 1.1 specifications.

13.1. Alpha Test

The alpha test compares a fragment's alpha value with a reference value, to determine whether to
pass the fragment to the next process or reject it.

The 3DS alpha test has features that correspond to the alpha test in OpenGL ES 1.1, but they are all
controlled by reserved uniforms.

13.1.1. Reserved Uniform

The following reserved uniforms are used for the alpha test.

Enabling or Disabling the Alpha Test

To enable the alpha test, set the reserved uniform dmp_FragOperation.enableAlphaTest to
GL_TRUE, using the glUniform1i() function. The alpha test is disabled (GL_FALSE) by default.

Comparison Function

Use the glUniform1i() function to set the comparison function in the reserved uniform
dmp_FragOperation.alphaTestFunc. You can specify one of the following eight values.

Table 13-1. Comparison Functions for the Alpha Test

Setting Value Comparison


GL_NEVER Always rejected (never passes).
GL_ALWAYS (default) Always passes.

GL_LESS Passes if the alpha value is less than the reference value.
GL_LEQUAL Passes if the alpha value is less than, or equal to, the reference value.
GL_EQUAL Passes if the alpha value is equal to the reference value.
GL_GEQUAL Passes if the alpha value is greater than, or equal to, the reference value.

GL_GREATER Passes if the alpha value is greater than the reference value.
GL_NOTEQUAL Passes if the alpha value is not equal to the reference value.

Reference Value

Use the glUniform1f() function to set the reference value for the alpha test in the reserved
uniform dmp_FragOperation.alphaRefValue. The reference value is clamped between 0.0 and
1.0 when it is used for comparisons.
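For reference, a minimal sketch of these three settings is shown below, assuming progID is the linked
program object; the threshold of 0.5 is only an example.

// Enable the alpha test and reject fragments whose alpha is 0.5 or less.
glUniform1i(glGetUniformLocation(progID, "dmp_FragOperation.enableAlphaTest"), GL_TRUE);
glUniform1i(glGetUniformLocation(progID, "dmp_FragOperation.alphaTestFunc"), GL_GREATER);
glUniform1f(glGetUniformLocation(progID, "dmp_FragOperation.alphaRefValue"), 0.5f);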

The following table lists the reserved uniforms used for the alpha test.
Table 13-2. Reserved Uniforms Used for the Alpha Test

dmp_FragOperation.enableAlphaTest (bool)
    Specifies whether to enable or disable the alpha test.
    GL_TRUE
    GL_FALSE (default)

dmp_FragOperation.alphaRefValue (float)
    Specifies the reference value to use with the alpha test.
    0.0 through 1.0
    0.0 (default)

dmp_FragOperation.alphaTestFunc (int)
    Specifies the comparison function for the alpha test.
    GL_NEVER, GL_ALWAYS (default), GL_LESS, GL_LEQUAL, GL_EQUAL, GL_GEQUAL, GL_GREATER,
    GL_NOTEQUAL

13.2. Stencil Test

The stencil test compares a reference value with the value stored at a fragment's window coordinates
in the stencil buffer to determine whether to pass the fragment to the next process or reject it.

The glStencilFuncSeparate() and glStencilOpSeparate() functions have not been
implemented because the 3DS system does not distinguish between front-facing polygons and
back-facing polygons.

13.2.1. How to Use

The stencil test is used in the same way it is used in OpenGL.

Enabling or Disabling the Stencil Test

To enable or disable the stencil test, call the glEnable() or glDisable() function, passing
GL_STENCIL_TEST for cap. To get the current setting, call the glIsEnabled() function,
specifying GL_STENCIL_TEST for cap. The stencil test is disabled by default. When it is disabled,
the stencil buffer is not changed and fragments are not rejected. If a stencil buffer has not been
bound to the framebuffer, the stencil test is treated as if it were disabled. There is no performance
difference between when both the stencil test and the depth test are enabled and when only one of
the two is enabled.

Comparison Function, Reference Value, and Mask

The stencil test has three elements: the comparison function, reference value, and mask, which are
specified by the glStencilFunc() function.

Code 13-1. Definition of the glStencilFunc Function


void glStencilFunc(GLenum func, GLint ref, GLuint mask);

For the ref parameter, specify an integer to use as the reference value during comparisons. The
reference value is handled as an unsigned number when it is compared by the stencil test, and it is
clamped between 0 and 255 because the 3DS stencil buffer is always 8-bit. A value of 0 is used by
default.

For the mask parameter, specify the mask. Bitwise AND operations apply the mask to both the
reference value and the value in the stencil buffer to get the values to use during comparisons. A
value of 0xFFFFFFFF (all 1s) is used by default.

For the func parameter, specify the comparison function. You can specify one of the following eight
values.

Table 13-3. Comparison Functions for the Stencil Test

Setting Value Comparison

GL_NEVER Always rejected (never passes).
GL_ALWAYS (default) Always passes.
GL_LESS Passes if the reference value is less than the value in the stencil buffer.
GL_LEQUAL Passes if the reference value is less than, or equal to, the value in the stencil buffer.
GL_EQUAL Passes if the reference value is equal to the value in the stencil buffer.
GL_GEQUAL Passes if the reference value is greater than, or equal to, the value in the stencil buffer.
GL_GREATER Passes if the reference value is greater than the value in the stencil buffer.
GL_NOTEQUAL Passes if the reference value is not equal to the value in the stencil buffer.

Handling Test Results

Using the glStencilOp() function, you can specify how the contents of the stencil buffer are
modified as a result of both the stencil test (which compares the reference value with the stencil
buffer) and the depth test, which will be described later.

Code 13-2. Definition of the glStencilOp Function

void glStencilOp(GLenum fail, GLenum zfail, GLenum zpass);

The fail parameter specifies how the contents of the stencil buffer are modified when a fragment
is rejected by the stencil test. You can specify one of the following eight values.

Table 13-4. Modifying Stencil Buffer Contents

Setting Value Stencil Buffer Content


GL_KEEP (default) Keep the current value.
GL_ZERO Set the value equal to 0.
GL_REPLACE Replace the value with the reference value.

GL_INCR Increment the value by 1. This is never greater than 255.

GL_DECR Decrement the value by 1. This is never less than 0.

GL_INVERT Bitwise invert the value.


GL_INCR_WRAP Increment the value by 1. This wraps to 0 after 255.
GL_DECR_WRAP Decrement the value by 1. This wraps to 255 after 0.

The zfail parameter configures how the contents of the stencil buffer are modified when a
fragment is rejected by the depth test, and the zpass parameter configures how the contents of the
stencil buffer are modified when a fragment passes the depth test. For these parameters, specify
one of the eight values that is valid for the fail parameter, earlier.
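The following sketch shows one common way to combine these functions: a first pass writes a mask
into the stencil buffer, and a second pass is restricted to the masked pixels. The reference value and
mask used here are only examples.

glEnable(GL_STENCIL_TEST);
// First pass: always pass the stencil test and write 1 where the depth test also passes.
glStencilFunc(GL_ALWAYS, 1, 0xFF);
glStencilOp(GL_KEEP, GL_KEEP, GL_REPLACE);
// ... render the geometry that defines the mask ...
// Second pass: render only where the stencil buffer holds 1, leaving the buffer unchanged.
glStencilFunc(GL_EQUAL, 1, 0xFF);
glStencilOp(GL_KEEP, GL_KEEP, GL_KEEP);
// ... render the geometry to be masked ...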

13.3. Early Depth Test

The depth test compares a fragment's depth value with the value stored at the fragment's window
coordinates in the depth buffer to determine whether to pass the fragment to the next process or
reject it. As explained for the stencil buffer, the results of the depth test can also change the content
of the stencil buffer.

13.3.1. How to Use

The depth test is used in the same way it is used in OpenGL.

Enabling or Disabling the Depth Test

To enable or disable the depth test, call the glEnable or glDisable() function, specifying
GL_DEPTH_TEST for the cap parameter. To get the current setting, call the glIsEnabled()
function, specifying GL_DEPTH_TEST for the cap parameter. The depth test is disabled by default.
When it is disabled, the depth buffer is not modified, and fragments are not rejected. If a depth
buffer has not been bound to the framebuffer, the depth test is treated as if it were disabled. There
is no performance difference between when both the stencil test and the depth test are enabled and
when only one of the two is enabled.

Comparison Function

Use the glDepthFunc() function to set the comparison function.

Code 13-3. Definition of the glDepthFunc Function

void glDepthFunc(GLenum func);

For func, specify one of the following eight values.

Table 13-5. Comparison Functions for the Depth Test

Setting Value Comparison

GL_NEVER Always rejected (never passes).


GL_ALWAYS Always passes.

GL_LESS (default) Passes if the depth value is less than the value in the depth buffer.
GL_LEQUAL Passes if the depth value is less than, or equal to, the value in the depth buffer.

GL_EQUAL Passes if the depth value is equal to the value in the depth buffer.
GL_GEQUAL Passes if the depth value is greater than, or equal to, the value in the depth buffer.
GL_GREATER Passes if the depth value is greater than the value in the depth buffer.

GL_NOTEQUAL Passes if the depth value is not equal to the value in the depth buffer.

If a fragment is rejected as a result of the comparison, the depth buffer is not modified. If a
fragment passes, its depth value overwrites the value in the depth buffer.
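A typical configuration is simply the following; GL_LEQUAL is chosen here only as an example of
overriding the GL_LESS default.

glEnable(GL_DEPTH_TEST);
glDepthFunc(GL_LEQUAL);   // the default comparison function is GL_LESS
glDepthMask(GL_TRUE);     // allow depth writes (the default)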

13.4. Blending

Blending combines a fragment's color (the source color) with the color stored at the fragment's
window coordinates in the framebuffer (the destination color). The blended result is passed to the
next process as the fragment's color.

13.4.1. How to Use

Blending is used in the same way it is used in OpenGL.

Enabling or Disabling Blending

To enable or disable blending, call the glEnable or glDisable() function, specifying GL_BLEND
for cap. To get the current setting, call the glIsEnabled() function, specifying GL_BLEND for
cap. Blending is disabled by default. When it is disabled, blending is skipped.

Blending is considered to be disabled, either when a color buffer has not been bound to the
framebuffer or when logical operations (described later) are enabled.

Blend Equations

You can use the glBlendEquation or glBlendEquationSeparate() function to specify how a
fragment's color (the source color) is combined with a framebuffer color (the destination color).

Code 13-4. Definitions of the glBlendEquation* Functions

void glBlendEquation(GLenum mode);


void glBlendEquationSeparate(GLenum modeRGB, GLenum modeAlpha);

The glBlendEquation() function allows you to specify equations for both the RGB and alpha
components by using the mode parameter.

The glBlendEquationSeparate() function allows you to specify equations for the RGB and
alpha components, separately, using the modeRGB and modeAlpha parameters, respectively. All of
these parameters accept the following five values.

Table 13-6. Blending Equations

GL_FUNC_ADD (default)
    R = Rs * Sr + Rd * Dr
    G = Gs * Sg + Gd * Dg
    B = Bs * Sb + Bd * Db
    A = As * Sa + Ad * Da

GL_FUNC_SUBTRACT
    R = Rs * Sr - Rd * Dr
    G = Gs * Sg - Gd * Dg
    B = Bs * Sb - Bd * Db
    A = As * Sa - Ad * Da

GL_FUNC_REVERSE_SUBTRACT
    R = Rd * Dr - Rs * Sr
    G = Gd * Dg - Gs * Sg
    B = Bd * Db - Bs * Sb
    A = Ad * Da - As * Sa

GL_MIN
    R = min(Rs, Rd)
    G = min(Gs, Gd)
    B = min(Bs, Bd)
    A = min(As, Ad)

GL_MAX
    R = max(Rs, Rd)
    G = max(Gs, Gd)
    B = max(Bs, Bd)
    A = max(As, Ad)

Rs, Gs, Bs, and As represent the source color components.
Rd, Gd, Bd, and Ad represent the destination color components.
Sr, Sg, Sb, and Sa represent the source blend factors.
Dr, Dg, Db, and Da represent the destination blend factors.

GL_LOGIC_OP, which can be specified in OpenGL ES 2.0, is not supported.

Source and Destination Blend Factors

You can use the glBlendFunc or glBlendFuncSeparate() function to specify the blend factors
to apply to the source and destination.

Code 13-5. Definitions of the glBlendFunc* Functions

void glBlendFunc(GLenum sfactor, GLenum dfactor);


void glBlendFuncSeparate(GLenum srcRGB, GLenum dstRGB, GLenum srcAlpha,
GLenum dstAlpha);

The glBlendFunc() function allows you to specify both source and destination blend factors
using sfactor and dfactor. The glBlendFuncSeparate() function allows you to specify the
blend factors for the source RGB and alpha components using srcRGB and srcAlpha, and the
blend factors for the destination RGB and alpha components using dstRGB and dstAlpha. All of
these parameters accept the following 15 values.

Table 13-7. Source and Destination Blend Factors

GL_ZERO (default for dstRGB and dstAlpha)
    Color: (0, 0, 0)    Alpha: 0
GL_ONE (default for srcRGB and srcAlpha)
    Color: (1, 1, 1)    Alpha: 1
GL_SRC_COLOR
    Color: (Rs, Gs, Bs)    Alpha: As
GL_ONE_MINUS_SRC_COLOR
    Color: (1, 1, 1) - (Rs, Gs, Bs)    Alpha: 1 - As
GL_DST_COLOR
    Color: (Rd, Gd, Bd)    Alpha: Ad
GL_ONE_MINUS_DST_COLOR
    Color: (1, 1, 1) - (Rd, Gd, Bd)    Alpha: 1 - Ad
GL_SRC_ALPHA
    Color: (As, As, As)    Alpha: As
GL_ONE_MINUS_SRC_ALPHA
    Color: (1, 1, 1) - (As, As, As)    Alpha: 1 - As
GL_DST_ALPHA
    Color: (Ad, Ad, Ad)    Alpha: Ad
GL_ONE_MINUS_DST_ALPHA
    Color: (1, 1, 1) - (Ad, Ad, Ad)    Alpha: 1 - Ad
GL_CONSTANT_COLOR
    Color: (Rc, Gc, Bc)    Alpha: Ac
GL_ONE_MINUS_CONSTANT_COLOR
    Color: (1, 1, 1) - (Rc, Gc, Bc)    Alpha: 1 - Ac
GL_CONSTANT_ALPHA
    Color: (Ac, Ac, Ac)    Alpha: Ac
GL_ONE_MINUS_CONSTANT_ALPHA
    Color: (1, 1, 1) - (Ac, Ac, Ac)    Alpha: 1 - Ac
GL_SRC_ALPHA_SATURATE
    Color: (f, f, f)    Alpha: 1

Rs, Gs, Bs, and As represent the source color.
Rd, Gd, Bd, and Ad represent the destination color.
Rc, Gc, Bc, and Ac represent the constant color.
f = min(As, 1 - Ad).

The 3DS system also allows you to specify GL_SRC_ALPHA_SATURATE for the destination.

Constant Color

You can specify the constant color using the glBlendColor() function.

Code 13-6. Definition of glBlendColor Function

void glBlendColor(GLclampf red, GLclampf green, GLclampf blue, GLclampf alpha);

Specify each of the RGB and alpha components using the red, green, blue, and alpha
parameters. Specify a floating-point number between 0.0 and 1.0 for each component. By default,
they are all 0.0 (0.0, 0.0, 0.0, 0.0).
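As a usage sketch, the following configures a typical alpha blend and sets a constant color; the
specific factors and constant values are only examples.

glEnable(GL_BLEND);
glBlendEquationSeparate(GL_FUNC_ADD, GL_FUNC_ADD);
// Weight RGB by the source alpha; weight the destination alpha the same way.
glBlendFuncSeparate(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA,
                    GL_ONE, GL_ONE_MINUS_SRC_ALPHA);
// The constant color is only used when a GL_CONSTANT_* blend factor is selected.
glBlendColor(1.0f, 1.0f, 1.0f, 0.5f);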

13.5. Logical Operations

The final per-fragment operation is the logical operation, which is applied to fragment colors and
framebuffer colors for an image. The result of the logical operation is written at the fragment's
window coordinates in the framebuffer.

13.5.1. How to Use

Logical operations are used in the same way they are used in OpenGL ES 1.1.

Enabling or Disabling Logical Operations

To enable or disable logical operations, call the glEnable or glDisable() function, specifying
GL_COLOR_LOGIC_OP for the cap parameter. To get the current setting, call the glIsEnabled()
function, specifying GL_COLOR_LOGIC_OP for the cap parameter. Logical operations are disabled
by default. When disabled, logical operations are skipped, but fragment colors are written to the
framebuffer. Blending is disabled when logical operations are enabled.

Operation

Use the glLogicOp() function to specify the logical operation to be performed on the fragment
color (the source) and the framebuffer color (the destination). To get the current setting, call the
glGetIntegerv() function, specifying GL_LOGIC_OP_MODE as the pname parameter. The default
setting is GL_COPY.

Code 13-7. Definition of the glLogicOp Function

void glLogicOp(GLenum opcode);

Specify the logical operation for both the RGB and alpha components, by using the opcode
parameter.

Table 13-8. Logical Operations

Setting Value Operation C Notation


GL_CLEAR 0 0
GL_AND s ∧ d s & d

GL_AND_REVERSE s ∧ ¬ d s & ~d
GL_COPY (default) s s
GL_AND_INVERTED ¬ s ∧ d ~s & d
GL_NOOP d d

GL_XOR s xor d s ^ d
GL_OR s ∨ d s | d
GL_NOR ¬ (s ∨ d) ~(s | d)

GL_EQUIV ¬ (s xor d) ~(s ^ d)


GL_INVERT ¬ d ~d
GL_OR_REVERSE s ∨ ¬ d s | ~d
GL_COPY_INVERTED ¬ s ~s

GL_OR_INVERTED ¬ s ∨ d ~s | d
GL_NAND ¬ (s ∧ d) ~(s & d)
GL_SET all 1 pow(2, n) - 1

n is the number of bits in each component.
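For example, the following sketch enables an XOR write and then restores the default state afterward.

glEnable(GL_COLOR_LOGIC_OP);   // blending is disabled while logical operations are enabled
glLogicOp(GL_XOR);             // write (source XOR destination) to the framebuffer
// ... render ...
glLogicOp(GL_COPY);            // restore the default operation
glDisable(GL_COLOR_LOGIC_OP);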

13.6. Masking the Framebuffer

You can apply a masking operation to the color (RGBA), stencil, and depth values that are written to
the framebuffer by per-fragment operations. These values can be configured by the glColorMask,
glStencilMask, and glDepthMask() functions, respectively.

Code 13-8. Framebuffer Masking Functions

void glColorMask(GLboolean red, GLboolean green, GLboolean blue, GLboolean alpha);
void glStencilMask(GLuint mask);
void glDepthMask(GLboolean flag);

You can specify whether to allow (GL_TRUE) or deny (GL_FALSE) the writing of each RGBA color
value and depth value. These values are all set to GL_TRUE by default.
You can modify the stencil masking value. This masking value is independent of the masking value
for the stencil test. It is 0xFFFFFFFF (all 1s) by default.
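The following sketch shows one possible configuration: color writes without alpha, a read-only depth
buffer, and full stencil writes.

glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_FALSE);  // do not write the alpha component
glDepthMask(GL_FALSE);                             // leave the depth buffer unchanged
glStencilMask(0xFF);                               // allow writes to all 8 stencil bits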


14. Framebuffer Operations


Framebuffer operations read, copy, and otherwise manipulate framebuffer data.

The features supported by the 3DS system are restricted as follows.

The glDrawBuffer() function has not been implemented.
The glReadBuffer() function has not been implemented.
The glDrawPixels() function has not been implemented.
The glReadPixels() function can get the contents of the color, depth, and stencil buffers.
The glCopyPixels() function has not been implemented.
The glPixelStore*() functions have not been implemented.

14.1. Reading Pixels

You can use the glReadPixels() function to take the contents of a render buffer (the color, depth,
or stencil buffer) bound to the framebuffer and place it into memory. You can also read the contents
of the depth and stencil buffers into memory, using a combined format.

Code 14-1. glReadPixels Function

void glReadPixels(GLint x, GLint y, GLsizei width, GLsizei height,
                  GLenum format, GLenum type, void* pixels);

For x and y, specify the starting coordinates (the lower-left corner) of the rectangular region to
capture. For width and height, specify the width and height of the rectangular region. A
GL_INVALID_VALUE error is generated if any of these values are negative. A
GL_INVALID_OPERATION error is generated when the sum of x and width exceeds the render
buffer width, or when the sum of y and height exceeds the render buffer height.

Using a combination of format and type, you can specify the format of the image to capture.

Table 14-1. Formats That Can Be Specified for Reading

GL_RGBA
    GL_UNSIGNED_BYTE: RGBA8 (32 bits)
    GL_UNSIGNED_SHORT_4_4_4_4: RGBA4 (16 bits)
    GL_UNSIGNED_SHORT_5_5_5_1: RGBA5551 (16 bits)
GL_RGB
    GL_UNSIGNED_BYTE: RGB8 (24 bits)
    GL_UNSIGNED_SHORT_5_6_5: RGB565 (16 bits)
GL_BGRA
    GL_UNSIGNED_BYTE: BGRA8 (32 bits)
    GL_UNSIGNED_SHORT_4_4_4_4: BGRA4 (16 bits)
    GL_UNSIGNED_SHORT_5_5_5_1: BGRA5551 (16 bits)
GL_LUMINANCE_ALPHA
    GL_UNSIGNED_BYTE: LA8 (16 bits)
GL_LUMINANCE
    GL_UNSIGNED_BYTE: L8 (8 bits)
GL_ALPHA
    GL_UNSIGNED_BYTE: A8 (8 bits)
GL_DEPTH_COMPONENT
    GL_UNSIGNED_INT: Depth (32 bits)
    GL_UNSIGNED_INT24_DMP: Depth (24 bits)
    GL_UNSIGNED_SHORT: Depth (16 bits)
    GL_UNSIGNED_BYTE: Depth (8 bits)
GL_STENCIL_INDEX
    GL_UNSIGNED_BYTE: Stencil (8 bits)
GL_DEPTH24_STENCIL8_EXT
    GL_UNSIGNED_INT: Depth + Stencil (32 bits)

A GL_INVALID_ENUM error is generated if you specify a combination that is not in the table.

The captured image is stored in pixels, using the specified format.

Images taken from the color buffer are stored in linear format, using a pixel byte order that is
OpenGL-compliant.

Images read from the depth buffer are also stored in linear format, but the pixel data is ordered
starting with the least-significant bits. For example, an image captured using a 24-bit format is stored
in memory one byte at a time, starting with the least-significant byte and ending with the most-
significant byte.

Images read from the stencil buffer are stored in linear format, with one byte per pixel.

Images read from the depth and stencil buffers in a combined format are also stored in linear format,
but the pixel data is ordered starting with the least-significant bits. 24 bits of depth data are stored
first, one byte at a time, from the least-significant to the most-significant byte, followed by 8 bits of
stencil data.

A GL_INVALID_FRAMEBUFFER_OPERATION error is generated if an error related to the framebuffer
occurs (for example, if a color buffer has not been bound to the framebuffer). A GL_OUT_OF_MEMORY
error is generated when temporary memory fails to be allocated within the library.
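The following sketch reads back a 240×400 RGBA8 region into application memory; the region size
matches one LCD and is only an example.

// Read the color buffer as RGBA8 (4 bytes per pixel), starting at the lower-left corner.
static GLubyte pixels[240 * 400 * 4];
glReadPixels(0, 0, 240, 400, GL_RGBA, GL_UNSIGNED_BYTE, pixels);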

14.2. Copying Pixels

The glCopyPixels() function does not support copying data to other framebuffers.

You can call the glCopyTexImage2D() and glCopyTexSubImage2D() functions to copy the
content of the color buffer to a texture. For more information, see 7.5.1. Copying From the Color
Buffer.
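As a minimal sketch, the copy looks like the following; texID is assumed to be a texture object
created elsewhere, and the 256×256 region is only an example.

glBindTexture(GL_TEXTURE_2D, texID);
// Copy a 256x256 region of the color buffer, starting at (0, 0), into mipmap level 0.
glCopyTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 0, 0, 256, 256, 0);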

14.3. Texture Rendering

You can bind a texture to the framebuffer as a render buffer. For more information, see 7.6.
Specifying a Texture as the Render Target.
14.4. Clearing the Framebuffer

In OpenGL, the scissor test and masking affect how the glClear() function clears (every buffer
bound to) the framebuffer. On a 3DS system, however, the scissor test and masking are not included
in the graphics pipeline and have no effect.

You can bind three render buffers to the framebuffer, as explained in 3.3.1. Render Buffer Allocation:
a color buffer (which could be a texture or a buffer used to render gas), a depth buffer, and a stencil
buffer. Index colors are not supported.

To clear a buffer, call the glClear() function with the buffer's corresponding bit specified as an
argument.

Code 14-2. Definition of the glClear Function

void glClear(GLbitfield mask);

You can specify the following values as a bit field in mask. To clear multiple buffers simultaneously,
specify the bit field using bitwise OR operations.

Table 14-2. Specifying the Buffers to Clear

mask Buffer to Clear

GL_COLOR_BUFFER_BIT Color buffer.
GL_DEPTH_BUFFER_BIT Depth buffer.
GL_STENCIL_BUFFER_BIT Stencil buffer. This must be specified together with GL_DEPTH_BUFFER_BIT.

You cannot clear the 3DS stencil buffer alone because it is combined with the depth buffer
(GL_DEPTH24_STENCIL8_EXT).

14.4.1. Clear Color

You can use the glClearColor() function to specify the color to use when clearing the color
buffer.

Code 14-3. Specifying the Clear Color

void glClearColor(GLclampf red, GLclampf green, GLclampf blue, GLclampf alpha);

Each of the specified color components is clamped between 0.0 and 1.0. By default, all components
are set to 0.0.

14.4.2. Clear Depth

You can use the glClearDepthf() function to specify the depth value to use when clearing the
depth buffer.
Code 14-4. Specifying the Clear Depth

void glClearDepthf(GLclampf depth);

The specified values are clamped between 0.0 and 1.0. The default value is 1.0.

14.4.3. Clear Stencil

You can use the glClearStencil() function to specify the value to use when clearing the stencil
buffer.

Code 14-5. Specifying the Clear Stencil

void glClearStencil(GLint s);

Specify a value between 0 and 255 to use for clear operations. The default value is 0.
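Putting the clear settings together, a full clear of the framebuffer can look like the following sketch
(remember that the stencil buffer must be cleared together with the depth buffer).

glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
glClearDepthf(1.0f);
glClearStencil(0);
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT);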


15. Other

15.1. glFinish and glFlush Functions

The glFinish() function is the same as the glFlush() function on a 3DS system.

Code 15-1. Definitions of the glFinish and glFlush Functions

void glFinish(void);
void glFlush(void);

15.2. Differences From OpenGL ES

Most features and functions follow the OpenGL ES specifications, but there are some differences. For
example, some OpenGL ES functions and features have not been implemented, and others have
restrictions.

Table 15-1. Differences by Feature

GL_ALPHA_TEST: Reserved uniforms are used to control the alpha test.
GL_CLIP_PLANEi: Reserved uniforms are used to control clipping.
GL_COLOR_LOGIC_OP: Logical operations are not present in OpenGL ES 2.0, but the equivalent
    features from OpenGL ES 1.1 have been implemented for 3DS.
GL_DITHER: Dithering is not supported.
GL_FOG: Reserved uniforms are used to control fog. Fog coefficients are set by a lookup table that
    takes depth values in window coordinates as input, rather than by the distance from the viewpoint.
GL_INDEX_LOGIC_OP: Index colors are not supported.
GL_LIGHTING: Reserved uniforms are used to control lighting. 3DS lighting is a per-fragment
    operation.
GL_POLYGON_OFFSET_FILL, GL_POLYGON_OFFSET_LINE, GL_POLYGON_OFFSET_POINT: Only
    GL_POLYGON_OFFSET_FILL is supported for controlling polygon offsets.
GL_SAMPLE_COVERAGE, GL_SAMPLE_ALPHA_TO_COVERAGE: Multisampling is not supported.
GL_SCISSOR_TEST: The scissor test is run as a subprocess of rasterization.
GL_TEXTURE_2D, GL_TEXTURE_CUBE_MAP: Reserved uniforms are used to control texture units.

Table 15-2. Differences by Function

Shader Functions

glBufferData: Only GL_STATIC_DRAW can be specified for usage.
glCompileShader: This function is not implemented.
glCreateProgram: This function uses a (13-bit) namespace that is independent of shader objects.
glCreateShader: This function uses a namespace that is independent of program objects.
glDrawArrays(), glDrawElements(): GL_POINTS, GL_LINES, GL_LINE_STRIP, and GL_LINE_LOOP are
    not supported. You must use the geometry shader to render points and lines.
glGetProgramInfoLog: This function is not implemented.
glGetShaderInfoLog: This function is not implemented.
glGetShaderSource(): This function is not implemented.
glGetShaderPrecisionFormat: This function is not implemented.
glLineWidth: This function is not implemented.
glReleaseShaderCompiler: This function is not implemented.
glShaderBinary: Only GL_PLATFORM_BINARY_DMP can be specified for binaryformat.
glShaderSource: This function is not implemented.
glValidateProgram: This function does nothing when it is called.
glVertexAttribPointer: Neither GL_FIXED nor GL_UNSIGNED_SHORT can be specified for type. The
    data pointer must be 4-byte aligned when specifying GL_FLOAT for type, or 2-byte aligned when
    specifying GL_SHORT. This function does not support normalization specified via arguments.

Viewport Functions

glViewport: You cannot specify a negative value for x or y.

Texture Functions

glActiveTexture: An error is generated when GL_TEXTURE3 is specified.
glCompressedTexImage2D: The width and height must be specified as powers of 2 (from 16
    through 1024).
glCompressedTexSubImage2D: This function is not implemented.
glGenerateMipmap: This function is not implemented.
glTexImage1D, glTexSubImage1D: These functions are used to load lookup tables. One-dimensional
    textures are not supported.
glTexImage2D: The width and height must be specified as powers of 2 (from 8 through 1024).
glTexSubImage2D: This function is not implemented.

Render Buffer Functions

glClear: Index colors are not supported. The stencil buffer must be cleared at the same time as the
    depth buffer. This function is unaffected by the scissor test and masking.
glCopyPixels: This function is not implemented.
glDrawBuffer: This function is not implemented.
glDrawPixels: This function is not implemented.
glFramebufferRenderbuffer: Because the stencil buffer is combined with the depth buffer,
    GL_DEPTH_STENCIL_ATTACHMENT must be specified as the attachment point for the stencil buffer.
glFramebufferTexture2D: The color buffer is the only supported attachment point.
glPixelStorei: This function is not implemented.
glReadBuffer: This function is not implemented.
glRenderbufferStorage: There are restrictions on which formats can be specified. The only format
    that can be used for the stencil buffer is GL_DEPTH24_STENCIL8_EXT.
glSampleCoverage: This function is not implemented.
glStencilFuncSeparate, glStencilOpSeparate, glStencilMaskSeparate: These functions are not
    implemented.

Blending Functions

glBlendEquation, glBlendEquationSeparate: GL_LOGIC_OP, which can be specified in
    OpenGL ES 2.0, is not supported.
glBlendFunc, glBlendFuncSeparate: GL_SRC_ALPHA_SATURATE can also be specified for the
    destination.

Other

glFinish: This function does the same thing as glFlush.
glFlush: There is no difference until glFinish is called.
glHint: This function is not implemented.
glPolygonOffset: The value set for factor is ignored.

15.3. Improving Buffer Access Performance

When you are not using either the color, depth, or stencil buffer, you can explicitly disable that
buffer's features to reduce unnecessary processing. Because the same buffer is used for the 3DS
depth and stencil buffers, however, settings to access either one have the same performance as
settings to access both.

The following sections describe the conditions for accessing each buffer. Avoid these conditions to
prevent unnecessary accesses from occurring.
15.3.1. Write Access to the Color Buffer

These accesses occur when the glColorMask() function specifies a value of GL_TRUE for any
component.

15.3.2. Read Access to the Color Buffer

These accesses occur when write accesses to the color buffer occur and any of the following
conditions are met.

GL_BLEND has been enabled by the glEnable() function.
A call of the glColorMask() function has not specified the same values for all components.
GL_COLOR_LOGIC_OP has been enabled by the glEnable() function.

15.3.3. Write Access to the Depth Buffer

These accesses occur when GL_DEPTH_TEST has been enabled by the glEnable() function and
GL_TRUE has been specified by the glDepthMask() function.

15.3.4. Read Access to the Depth Buffer

These accesses occur when GL_DEPTH_TEST has been enabled by the glEnable() function.

15.3.5. Write Access to the Stencil Buffer

These accesses occur when GL_STENCIL_TEST has been enabled by the glEnable() function,
and a nonzero masking setting is used in the glStencilMask() function.

15.3.6. Read Access to the Stencil Buffer

These accesses occur when GL_STENCIL_TEST has been enabled by the glEnable() function.

15.4. Improving CPU Performance

You may be able to improve processing speed by paying attention to the following points when you
implement your application.
Link as many vertex shader objects together as possible. You can link multiple shader objects
together to generate a single shader binary. It is less resource intensive to switch between
shader objects when they are linked to the same shader binary than when they are linked to
different shader binaries.
Ensure that applications keep the uniform location values obtained by the
glGetUniformLocation() function, and use them repeatedly (see the sketch after this list).
Location values are static after the glLinkProgram() function is called. They do not change until
glLinkProgram is called again.
Do not call the nngxSplitDrawCmdlist() function unnecessarily because it generates a split
command each time it is called. For example, when the nngxTransferRenderImage() function
is called, it generates a split command internally. Calling nngxSplitDrawCmdlist immediately
afterwards would generate an unnecessary command.
Use a vertex buffer, whenever possible, to send vertex attribute data to a shader. If you do not
use a vertex buffer, the CPU accumulates vertex data in the 3D command buffer, and CPU
processing increases significantly.
Use a texture collection or a vertex state collection when the same texture or vertex buffer is
used repeatedly in a particular rendering pass. By binding all textures and setting all vertex
arrays at the same time, you can reduce the cost of function calls.
When executing the same shader object with different uniform settings, it is less resource
intensive to attach that shader object to multiple program objects (each with its own uniform
settings) and then switch between the program objects than it is to switch between many uniform
settings on a single program object. The reason is that uniform values are saved for each
program object.
Do not delete and regenerate lookup table objects frequently. After the lookup table data is
loaded by the glTexImage1D() function, it is converted into an internal hardware format each
time the lookup table object is used.
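As an example of caching uniform locations (mentioned earlier in this list), the following sketch looks
the locations up once and reuses them every frame; progID, the cached variables, and the gasLightXY
and gasLightZ arrays are illustrative names.

static GLint s_locLightXY = -1;
static GLint s_locLightZ  = -1;
if (s_locLightXY < 0) {
    // Look the locations up once; they remain valid until glLinkProgram is called again.
    s_locLightXY = glGetUniformLocation(progID, "dmp_Gas.lightXY");
    s_locLightZ  = glGetUniformLocation(progID, "dmp_Gas.lightZ");
}
glUniform3fv(s_locLightXY, 1, gasLightXY);
glUniform4fv(s_locLightZ, 1, gasLightZ);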

15.5. Using Vertex Buffers

When a vertex buffer is used, the GPU geometry pipeline loads vertex data. When a vertex buffer is
not used, however, the CPU sorts the vertex arrays according to the vertex index arrays, converts all
vertex data into 24-bit floating-point values, and then fills the command buffer. This process imposes
a considerably high processing load on the CPU, and also decreases the efficiency with which the
GPU geometry pipeline loads data. This also requires a larger command buffer. All vertex data is
converted into 24-bit floating-point numbers for the x, y, z, and w components when it is loaded into
the command buffer, which must be able to hold at least 12 bytes multiplied by both the number of
vertices and the number of vertex attributes.

Vertex buffers can be processed more quickly when they are placed in VRAM than when they are
placed in main (device) memory. Splitting them between VRAM and main memory results in the same
speed as placing them in main memory.

15.5.1. Data Structure for Vertex Arrays

A vertex array can either be structured as an interleaved array, which is an array of structures that
contain multiple vertex attributes, or as an independent array, which is an array of single vertex
attributes.

When you use a vertex buffer, interleaved arrays are more efficient at loading vertex data than
independent arrays. The time spent loading vertex data is often hidden by the time spent for
processing later on, such as in the vertex shader and during rasterization. When the vertex buffer is
placed in main memory, however, by making the loading of data more efficient, you can sometimes
reduce the cost of data access and speed up processing.

15.6. Getting the Starting Address for Each Buffer Type

You can get the starting addresses of the data regions allocated for each texture object, vertex buffer
object, and render buffer object.

The GPU can directly access all obtained addresses. You cannot get the address of the copy region
that is generated by the driver.

Code 15-2. Getting the Starting Address for Each Buffer Type

void glGetTexParameteriv(GLenum target, GLenum pname, GLint* params);


void glGetBufferParameteriv(GLenum target, GLenum pname, GLint* params);
void glGetRenderbufferParameteriv(GLenum target, GLenum pname, GLint* params);

To get the texture address, call the glGetTexParameteriv() function and specify
GL_TEXTURE_DATA_ADDR_DMP for pname.

To get the vertex buffer address, call the glGetBufferParameteriv() function and specify
GL_BUFFER_DATA_ADDR_DMP for pname.

To get the render buffer address, call the glGetRenderbufferParameteriv() function and specify
GL_RENDERBUFFER_DATA_ADDR_DMP for pname.

These functions allow you to get various types of information, depending on the value passed in for
pname. For more information about the information that can be obtained, see the CTR-SDK API
Reference.
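A minimal sketch of these queries follows; texID, vboID, and rboID are assumed to be valid object
names created elsewhere.

GLint texAddr, vboAddr, rboAddr;
// Bind each object, then query the starting address of its data region.
glBindTexture(GL_TEXTURE_2D, texID);
glGetTexParameteriv(GL_TEXTURE_2D, GL_TEXTURE_DATA_ADDR_DMP, &texAddr);
glBindBuffer(GL_ARRAY_BUFFER, vboID);
glGetBufferParameteriv(GL_ARRAY_BUFFER, GL_BUFFER_DATA_ADDR_DMP, &vboAddr);
glBindRenderbuffer(GL_RENDERBUFFER, rboID);
glGetRenderbufferParameteriv(GL_RENDERBUFFER, GL_RENDERBUFFER_DATA_ADDR_DMP, &rboAddr);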

15.7. Number of Bytes Loaded for Various Data Types

This section describes the number of bytes that the GPU loads from vertex buffers, textures, and
command buffers in a single operation.

15.7.1. Vertex Buffers

The number of bytes loaded concurrently from a vertex buffer depends on the order of the vertex
indices.

The glDrawElements() function loads 16 vertex indices concurrently from an index array, sorts
them, and then loads data from the vertex array in the same order as the sorted indices.
Consecutive data is loaded from the vertex array for any consecutive vertex indices.

Multiple vertex attributes are loaded as an interleaved array from a vertex array if: (1) the array
contains vertex attributes and has been enabled by the glEnableVertexAttribArray()
function, and (2) the driver interprets the array as an interleaved array (a vertex array that
combines multiple vertex attributes) based on the information specified by the
glVertexAttribPointer() function.

Burst reads of up to 256 bytes are used to load consecutive data from a vertex array. If there are
more than 256 bytes to be loaded, they are read 256 bytes at a time. Data is read at least 16 bytes
at a time, even for non-consecutive data.

The glDrawArrays() function processes data just like an index array, using consecutive numbers
starting at 0.

15.7.2. Textures

The number of bytes that are loaded at one time depends on the texture format. The following table
shows the number of bytes that are transferred from VRAM, avoiding the texture cache.

Table 15-3. Texture Format and Loading Units

GL_RGBA, GL_RGBA_NATIVE_DMP
    GL_UNSIGNED_BYTE: 128 bytes
    GL_UNSIGNED_SHORT_5_5_5_1: 64 bytes
    GL_UNSIGNED_SHORT_4_4_4_4: 64 bytes
GL_RGB, GL_RGB_NATIVE_DMP
    GL_UNSIGNED_BYTE: 96 bytes
    GL_UNSIGNED_SHORT_5_6_5: 64 bytes
GL_LUMINANCE_ALPHA, GL_LUMINANCE_ALPHA_NATIVE_DMP
    GL_UNSIGNED_BYTE: 64 bytes
    GL_UNSIGNED_BYTE_4_4_DMP: 32 bytes
GL_LUMINANCE, GL_LUMINANCE_NATIVE_DMP
    GL_UNSIGNED_BYTE: 32 bytes
    GL_UNSIGNED_4BITS_DMP: 16 bytes
GL_ALPHA, GL_ALPHA_NATIVE_DMP
    GL_UNSIGNED_BYTE: 32 bytes
    GL_UNSIGNED_4BITS_DMP: 16 bytes
GL_HILO8_DMP, GL_HILO8_DMP_NATIVE_DMP
    GL_UNSIGNED_BYTE: 64 bytes
GL_ETC1_RGB8_NATIVE_DMP: 128 bytes
GL_ETC1_ALPHA_RGB8_A4_NATIVE_DMP: 32 bytes

15.7.3. Command Buffer

A command buffer loads 128 bytes at a time.

15.8. Block-Shaped Noise Is Rendered on Some Pixels

Data in the 3DS framebuffer is processed 4×4 pixels at a time. These blocks of pixels are called
block addresses and are also used to manage the framebuffer cache. Tag information in the cache is
cleared at several times, including when the glFinish, glFlush, or glClear() function is called;
when the framebuffer-related GPU state (NN_GX_STATE_FRAMEBUFFER, NN_GX_STATE_FBACCESS)
is validated; and when the command list is split by the nngxSplitDrawCmdlist() function. Cache
tags are initialized with their default value of 0x3FFF after tag information in the cache is cleared.
Consequently, any pixels that you attempt to render immediately afterward at the same block address
as the default cache tag value (0x3FFF) mistakenly hit the cache. As a result, an incorrect color is
applied to the pixels.

Block addresses are assigned consecutively, beginning at 0, in 16-pixel blocks from the starting
address of the framebuffer (the color buffer, depth buffer, and stencil buffer). Because addresses are
assigned to data that has been laid out in the GPU render format, pixel locations in a rendered image
correspond to different block addresses in block 8 mode and in block 32 mode.

The problem described in this section is triggered by pixels that are assigned a block address of
0x3FFF. The problem does not occur when the total number of framebuffer blocks is less than or
equal to 0x3FFF (in other words, when the total number of framebuffer pixels is less than or equal to
0x3FFF × 16, or 262,128 pixels, which is just under a 512×512 rectangle). This problem also does
not occur when there are no read accesses on the color buffer, depth buffer, or stencil buffer.

Note: Cache tag information is also cleared when a value of 1 is written to GPU register
0x0110.

For more information about accessing GPU registers directly and controlling the GPU
state, block mode, and framebuffer read access, see the 3DS Programming Manual:
Advanced Graphics.

15.8.1. Relationship Between Pixels and Block Addresses

As mentioned previously, block addresses begin at 0 and are assigned in ascending order, 16
pixels at a time, from the starting addresses of the color buffer and depth/stencil buffer, which are
laid out in the GPU render format. Unlike the origin for the glViewport() function, the buffer
addresses start with the pixels at the upper-left corner of the image to render. Note, too, that the
image width (the horizontal direction) corresponds to the shorter edge of the LCD.

Because there are different ways to assign addresses, the block mode changes which block
address in the cache corresponds to the pixels on an image.

Block 8 Mode

Block address 0 corresponds to the 4×4 block of pixels at the upper-left corner of the rendered
image; block address 1 corresponds to the 4×4 block of pixels immediately to the right of block
address 0; block address 2 corresponds to the 4×4 block of pixels immediately below block
address 0; and block address 3 corresponds to the 4×4 block of pixels immediately below block
address 1. Block addresses increase to the right 8×8 pixels at a time. After they reach the edge
of the image, they continue from the left edge of the image on the next row.

Figure 15-1. Block Address Assignments in Block 8 Mode


The value of N in the figure is calculated by taking one-quarter the width of the framebuffer (in
pixels) and multiplying it by two.

Block 32 Mode

In block 32 mode, addresses are assigned in metablocks of 32×32 pixels. Metablock address 0
corresponds to the 32×32-pixel region at the upper-left corner of the rendered image, and
metablock address 1 corresponds to the 32×32-pixel region immediately to its right. Metablock
addresses increase to the right 32×32 pixels at a time. After they reach the edge of the image,
they continue from the left edge of the image on the next row.

As the following figure shows, the block addresses of the pixels are arranged in a zigzag pattern
within each metablock, starting with the 4×4-pixel block at the upper-left corner. To find the block
address of a single pixel in the image, multiply its metablock address by 0x40, and then add its
block address within the metablock.

Figure 15-2. Block Address Assignments in Block 32 Mode

The left side of the figure shows the block addresses (in hexadecimal) for pixels within a
metablock. The right side of the figure shows the metablock addresses for the entire image.

15.8.2. Workaround #1

This problem does not occur when the framebuffer has no more than 262,128 pixels (the product of
its width and height). In other words, you can work around this problem by using a framebuffer that
is no larger than necessary.

Note that the problem does not occur with a framebuffer that has the same size as one of the LCDs
—240×400 (96,000 pixels) or 240×320 (76,800 pixels)—because the total number of pixels does not
exceed 262,128.

The framebuffer size here is the size allocated with the glRenderbufferStorage() function. Even
if you allocate a large framebuffer and use only a part of it (such as a 240×400 region), it is the
allocated size that matters, so you can avoid this problem by allocating no more than the minimum
framebuffer size that your application actually needs.

15.8.3. Workaround #2

You can work around this problem by adjusting the size of the framebuffer so that the problematic
pixels at block address 0x3FFF are located outside of the rendering region.

For example, when you allocate a 480×800 framebuffer (as you would to apply 2×2 antialiasing in
block 8 mode), block address 0x3FFF is assigned to the 4×4-pixel region whose upper-left corner is
located at pixel coordinates (124, 548). If you were to extend the size of the framebuffer by 32
pixels to 512×800, however, block address 0x3FFF would be assigned to the 4×4-pixel region
whose upper-left corner is located at pixel coordinates (508, 508). By configuring the viewport to
display only the 480×800 region on the left side of the framebuffer, you can avoid these problematic
pixels.

One disadvantage of this method is that it requires you to allocate a framebuffer that is larger than
necessary, which wastes VRAM. However, it is a simple workaround that only involves adjusting the
framebuffer size.

For more information about how differences in the block mode and framebuffer size affect the pixel
coordinates to which block address 0x3FFF is assigned, see 15.8.1. Relationship Between Pixels
and Block Addresses.

15.8.4. Workaround #3

You can work around this problem by rendering several pixels that are not at block address 0x3FFF
to change the content of cache tags immediately after they have been cleared.

To change the content of the cache tags, you must render four pixels at specific block addresses.
When both the color buffer and depth/stencil buffer have been configured to be read, these four
pixels must each have a different block address, for which the lower three bits are all 1 (0x7).
When only one buffer (either the color buffer or the depth/stencil buffer) has been configured to be
read, these four pixels must each have a different block address, for which the lower four bits are
all 1 (0xF).

For example, assume that pixels are rendered at the following block addresses immediately after
cache tags are cleared: 0x00, 0x01, 0x0F, 0x02, 0x1F, 0x03, 0x0F, 0x2F, and 0x3F.

Block addresses 0x00, 0x01, 0x02, and 0x03 do not count because their lower four bits are not 0xF.

Block address 0x0F is only counted once, even though pixels are rendered there twice. In this
example, the workaround is only effective after pixels have been rendered at block addresses
0x0F, 0x1F, 0x2F, and 0x3F. If pixels at block address 0x3FFF are rendered before the pixels at
block address 0x3F, the problem would occur.

You can work around this problem by rendering dummy polygons, with pixels that meet these
conditions, immediately after cache tags are cleared, given the following caveats. The following are
valid dummy pixels.

Pixels that fail the depth test, stencil test, or alpha test. If you use settings that cause dummy
pixels to always fail these tests (for example, by specifying GL_NEVER for the depth test
function), make sure that you restore the original depth test function when you resume normal
rendering. Note that the cache flush command (a command that writes to register 0x111) would
be required at this time.
Pixels that do not affect the color buffer when they are rendered because of alpha blend
settings.
The following are not valid dummy pixels.
Pixels that are clipped by the view volume or user-defined clipping planes.
Pixels that are dropped by the scissor test.
Pixels that are dropped by the early depth test.

Block 8 Mode

When you render a dummy polygon to work around this problem, you must choose pixels at block
addresses whose lower four bits (or lower three bits) are all 1. If you look at how block addresses
are arranged in block 8 mode, the lower four bits of the block address follow the same 32×8-pixel
pattern repeated horizontally, and the lower three bits of the block address follow the same 16×8-
pixel pattern repeated horizontally. However, depending on the framebuffer width, these patterns
may be shifted horizontally by eight pixels, for every eight pixels vertically.

Figure 15-3. Block Address Ordering in Block 8 Mode

The following table shows rectangle sizes that meet the conditions for a dummy polygon. Note
that the rectangle with the smallest possible area must be placed so that the pixels at its four
corners have block addresses that meet the necessary conditions.

Table 15-4. Rectangles to Render as Dummy Polygons in Block 8 Mode

Rectangle Shape/Conditions                                    Lower 4 Bits Are All 1    Lower 3 Bits Are All 1

Rectangle with the smallest possible area                     94×1                      46×1
(cannot be placed anywhere).

Rectangle that can be placed anywhere.                        125×5, 29×29              61×5, 13×29

15.8.4.2. Block 32 Mode


When you render a dummy polygon to work around this problem, you must choose pixels at block
addresses whose lower four bits (or lower three bits) are all 1. If you look at how block addresses
are arranged in block 32 mode, the lower four bits of the block address follow the same 32×32-
pixel pattern repeated horizontally, and the lower three bits of the block address follow the same
32×16-pixel pattern repeated horizontally.

Figure 15-4. Block Address Ordering in Block 32 Mode

The following table shows rectangle sizes that meet the conditions for a dummy polygon. Note
that the rectangle with the smallest possible area must be placed so that the pixels at its four
corners have block addresses that meet the necessary conditions.

Table 15-5. Rectangles to Render as Dummy Polygons in Block 32 Mode

Rectangle Shape/Conditions                                    Lower 4 Bits Are All 1    Lower 3 Bits Are All 1

Rectangle with the smallest possible area                     46×1, 14×14               46×1, 14×6
(cannot be placed anywhere).

Rectangle that can be placed anywhere.                        61×13, 29×29              61×5, 29×13

15.9. Lines Are Rendered in Error and the Region Following the Framebuffer Is Corrupted

When rendering an extremely small polygon whose right edge is close to the window’s x-coordinate
of 0, the system sometimes renders lines in error. This phenomenon is caused by coordinate
wraparound when a pixel's x-coordinate becomes negative due to calculation errors during pixel
generation for the polygon. Because the wrapped x-coordinate is an extremely large value, the
system renders a polygon with extremely elongated dimensions in the positive x direction.

This phenomenon can also cause corruption in the memory region outside of the rendering memory
region. The wrapped x-coordinate becomes 1023, and the system generates pixels from (0, y)
through (1023, y), regardless of the size of the rendering region. In other words, the system
generates pixels with x-coordinates from 0 through 1023, even when the rendering region size is set
to a width smaller than 1024, such as 256×256. The framebuffer is accessed at the addresses
calculated from the pixel's (x, y) coordinates and the width of the rendering region, but the raw x-
coordinate is used for address calculation, even when it is outside of the rendering region's width.
Consequently, depending on the y-coordinate value, the system might write pixel color data to
memory addresses following the last address of the rendering region.

Note: Memory corruption does not happen when the rendering region is cropped by using a
scissor test. We recommend configuring a scissor test to avoid possible memory
corruption. Conducting a scissor test does not entail any penalty in terms of GPU
performance.
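
For reference, a scissor rectangle matching the rendering region might be configured as in the
following sketch (standard GL calls; the 256×256 size is only an example).

/* Sketch: restrict pixel writes to the rendering region so that wrapped
   x-coordinates cannot write past the end of the framebuffer.            */
glEnable(GL_SCISSOR_TEST);
glScissor(0, 0, 256, 256);   /* x, y, width, height of the rendering region (example) */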

The conditions under which this phenomenon occurs depend solely on the window coordinates. For
any set of window coordinates, the problem either always occurs or never occurs. This problem
occurs in relation to the view volume or polygons generated as a result of clipping, so it occurs even
when the original polygon itself is large, provided that the polygon protrudes beyond the edge of the
screen when the window’s x-coordinate is 0, producing an extremely small area contained within the
view volume.

One workaround for this issue is to adjust the x-coordinate in the vertex shader. The clip-space
x-coordinate calculated by the vertex shader is clipped to the range from -w to w, so an x value
close to -w indicates a vertex close to the screen edge at the window’s x-coordinate of 0. You can
avoid the erroneous lines by moving any such vertices (vertices located close to the edge of the
screen where the x-coordinate is 0) away from the edge of the screen by adjusting the x value in
the -w direction. This workaround changes the vertex coordinate by no more than one pixel, so it
has almost no effect on rendering results.

When processing this in the vertex shader, handle the x value after applying a projection transform
(the value written to the output register as the vertex coordinate x value) as follows.

Code 15-3. Processing to Avoid Rendering Lines in Error

if ( -w < x && x < -w * (1-epsilon) ) x = -w;

These x and w values are the x and w values of the vertex coordinate after the projection
transformation. The epsilon value is a variable for adjustment that is to be specified as appropriate
for the scene you are rendering.

The following code section is a sample implementation of a vertex shader. Instructions starting from
mul are included to avoid displaying lines in error.

Code 15-4. Workaround Implementation Example

// v0 : position attribute
// o0 : output for position
// c0-c3 : modelview matrix
// c4-c7 : projection matrix
// c8 : (1 - epsilon, 1, any, any)
m4x4 r0, v0, c0 // modelview transformation
m4x4 r1, r0, c4 // projection transformation
mul r2.xy, -r1.w, c8.xy // r2.x = -w * (1-epsilon), r2.y = -w
cmp 2, 4, r1.xx, r2.xy // compare x with r2.x and with r2.y
ifc 1, 1, 1 // if ((x < -w * (1-epsilon)) && (x > -w))
mov r1.x, -r1.w // x = -w;
endif
mov o0, r1
15.10. Changing the Priority of Operations Within a Driver

The CPU in the 3DS has two cores, one dedicated to running applications, and the other dedicated to
controlling the processes of the system’s devices.

The core for the system controls multiple devices, and GPU processing has a high priority among
these devices, so heavy graphics processing can affect the processing of other devices. In such cases,
you can use the nngxSetInternalDriverPrioMode() function to set the GPU to a lower priority
and minimize the effect on other devices.

Code 15-5. Changing the Priority of GPU Driver Operations

void nngxSetInternalDriverPrioMode(nngxInternalDriverPrioMode mode);

Select from the following values for the mode parameter.

Table 15-6. Internal Operation Priorities

Definition Description
NN_GX_INTERNAL_DRIVER_PRIO_MODE_HIGH High priority (default)

NN_GX_INTERNAL_DRIVER_PRIO_MODE_LOW Low priority

Note that lowering the priority for the GPU reduces the performance effect on other devices, but also
reduces graphics performance.
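
For example, an application with heavy graphics processing might lower the priority as in the
following sketch (the placement of the calls is an assumption; choose it to match your application's
load).

/* Lower the GPU driver's internal priority to reduce the effect of heavy
   graphics processing on the other devices controlled by the system core. */
nngxSetInternalDriverPrioMode(NN_GX_INTERNAL_DRIVER_PRIO_MODE_LOW);

/* ... heavy rendering ... */

/* Restore the default priority when the graphics load returns to normal. */
nngxSetInternalDriverPrioMode(NN_GX_INTERNAL_DRIVER_PRIO_MODE_HIGH);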

15.11. Functions That Allocate Internal Buffers Within the Library

The library implicitly allocates internal buffers for some of the gl*() and nngx*() functions.

15.11.1. nngxValidateState, glDraw*

If a reference table (LUT) is being used, the library allocates an internal buffer.

When glTexture1D is called and the library is notified that a reference table will be used, the next
time nngxValidateState or glDraw* is executed, an intermediate buffer is allocated for loading
the reference table. However, when a 3D command is generated directly, and a buffer for the
reference table is specified, no intermediate buffer is allocated.

Note that the allocated region is freed by glDeleteTextures or nngxFinalize.

15.11.2. Command Lists, Display Lists, Textures, and Similar Objects

Functions for allocating command lists, display lists, textures, and other objects allocate internal
buffers for each object. These buffers are maintained until the corresponding object has been
destroyed using nngxDelete* or glDelete*.

This applies to the following functions.

nngxGenCmdlists, nngxGenDisplaybuffers, glCreateProgram, glCreateShader, glGenBuffers,
glGenRenderbuffers, glGenTextures, and others.
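
As a sketch of the lifetime involved, the internal buffers tied to an object persist from generation
(or first use) until the corresponding delete call; the texture example below assumes standard gl*
behavior.

/* Internal buffers associated with an object are held until the object is deleted. */
GLuint tex;
glGenTextures(1, &tex);       /* object created; internal buffers may be allocated
                                 here or on first use                                */

/* ... use the texture ... */

glDeleteTextures(1, &tex);    /* internal buffers for this object are released       */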

15.12. Analyzing Causes of GPU Hangs

The data returned from a call to the nngxGetCmdlistParameteri() function when passing
NN_GX_CMDLIST_HW_STATE for pname includes a number of bits that indicate whether the hardware
is busy. This data can be helpful in analyzing the cause of problems in the operation of the hardware,
such as when the GPU hangs.
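
For example, the hardware state might be read as in the following sketch (this assumes the
nngxGetCmdlistParameteri() signature of a pname and an output pointer; see the API Reference for the
exact prototype).

/* Read the hardware busy bits to help identify which module is stuck. */
GLint hwState = 0;
nngxGetCmdlistParameteri(NN_GX_CMDLIST_HW_STATE, &hwState);

if (hwState & (1 << 2))
{
    /* Bit 2: the texture unit is busy. See the discussion of the
       multitexture hardware bug later in this section.              */
}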

When the hardware is malfunctioning, the likely cause is modules that are stuck in the busy state. For
modules that work in sequence, such as the triangle setup > rasterization > texture unit modules, the
busy signal propagates from the last module through to the first module in the chain. When a
sequence of modules is busy, the last module in the chain is the most likely cause. However, the per-
fragment operation module indicated by bit 6 in the returned data can be stuck in the busy state due
to invalid data from a previous module, so the most likely cause could also be an earlier module.

Roughly speaking, busy signals propagate either within rasterizing and pixel processing, indicated
by bits 0 through 7, or within geometry processing, indicated by bits 8 through 16.

Rasterizing and pixel processing covers the sequence of modules of triangle setup > rasterization >
texture unit > fragment lighting > texture combiner > per-fragment operation, with busy signals from
modules later in the chain propagating to modules earlier in the chain. In other words, busy signals
propagate in order from bit 5 to bit 0.

Bit 6 is also a per-fragment operation module busy signal. However, although this does propagate to
bits 0 and 1, it does not propagate to bits 2, 3, or 4.

Bit 7 indicates a busy signal from the early depth test module, which occurs when the system is
waiting for the early depth buffer to clear (GPU built-in memory). This busy signal does not propagate
to other modules.

A busy signal in the triangle setup module does not propagate to earlier modules (the vertex cache or
geometry generator). In other words, no busy signals propagate between rasterizing and pixel
processing and geometry processing.

Geometry processing covers the following sequence of modules: vertex input process module (which
loads command buffers and vertex arrays) > vertex processor > post-vertex cache. Busy signals from
modules later in the chain propagate to modules earlier in the chain. In other words, busy signals
propagate in this order: bit 16 > (bit 11, bit 12, bit 13, bit 14) > bit 8 > bit 9. Bits 11, 12, 13,
and 14 correspond to the busy states of vertex processors 0, 1, 2, and 3. Because the four vertex
processors are arranged in parallel between the vertex loading module and the post-vertex cache, a
busy signal from the post-vertex cache propagates to one or more of the four vertex processors; it
might not propagate to all four of the vertex processors.

This description applies to the situation when the geometry shader is disabled. When it is enabled,
vertex processor 0, which is the geometry shader processor, comes after the post-vertex cache. In
this case, a busy signal from the geometry shader processor propagates to the post-vertex cache, but
it does not propagate to any earlier modules. In other words, a busy signal would propagate in this
order: bit 11 > bit 16. A busy signal arising from the post-vertex cache does propagate to earlier
modules (vertex processors 1, 2, and 3). In other words, a busy signal would propagate in this order:
bit 11 > bit 16, and bit 16 > (bit 12, bit 13, bit 14) > bit 8 > bit 9.

The post-vertex cache, indicated by bit 16, outputs a busy signal when it is filled to capacity with
vertex data. If the cache cannot output this data to the next module for some reason, such as when
the next module is not responding, vertex data fills the post-vertex cache to capacity. If the geometry
shader is disabled, the next module is the triangle setup module. If the geometry shader is enabled,
the next module is the geometry processor (vertex processor 0).

When the texture unit indicated by bit 2 is busy, the GPU hang can be attributed to a hardware bug
that occurs when textures that are in a 4-bit format and textures that are not in a 4-bit format are
used simultaneously as a multitexture.

If the GPU hangs due to an incorrect load array setting (the load array is the unit in which vertex
attribute data is loaded to the GPU), bit 8 enters a busy state.

If the GPU hangs due to the vertex shader's output of NaN (or the geometry shader's output of NaN
when the geometry shader is being used), the rasterization and triangle setup modules (bit 0 and
bit 1) enter a busy state.

Note: For more information about the nngxGetCmdlistParameteri() function and the bits
obtained with NN_GX_CMDLIST_HW_STATE, see 4.1.10. Getting Command List
Parameters.

15.12.1. Hardware State When the GPU Hangs

The following table provides examples of the hardware state obtained for
NN_GX_CMDLIST_HW_STATE when the GPU hangs, and the related cause.

Table 15-7. Hardware States and Causes of GPU Hangs

Hardware State                          Cause of GPU Hang

0x00011303, 0x00011F03, 0x00012F03      CPU destroyed content of vertex buffer while GPU was operating.

0x00010103, 0x0001011B, 0x00014303      GPU bug caused content of vertex buffer in VRAM to be discarded in the middle of
                                        rendering. For information, see 15.9. Lines Are Rendered in Error and the Region
                                        Following the Framebuffer Is Corrupted.

0x00010107, 0x00011307, 0x00012307      Hang due to multitexture hardware bug. Conflict with texture addressing of
                                        128-byte alignment. For information, see 7.3.1. Formats With 4-Bit Components.

0x00000100                              Bit 8 in both PICA registers 0x0229 and 0x0253 was not set correctly. These
                                        registers must be set again when rendering using a GL function after rendering
                                        using a non-GL function.
                                        Unused elements of the load array (bits 31 to 28 of register 0x0205 + n * 3) are
                                        not set to 0.

0x00007300                              PICA register 0x0289 was set to use the geometry shader, but the system instead
                                        executed a vertex shader that does not use the geometry shader.

0x00000000                              Related register for the command buffer address jump function was not set
                                        properly. While in standby for command request execution,
                                        nngxFlush3DCommandPartially was called prior to completing configuration of the
                                        related register of the address jump function.

0x00000001, 0x00000002, 0x00000003      NaN was output from the vertex shader.

Note: For more information about how to set a PICA register, see 3DS Programming Manual:
Advanced Graphics.
15.13. Effect of Vertex Attribute Combinations on Vertex Data Transfer Speed

When a vertex buffer is used, the vertex attribute data type and data size combination (the type and
size parameters in the glVertexAttribPointer() function) affects the speed of vertex data
transfer.

Vertex attribute data stored in the vertex buffer is grouped together as one or multiple vertex
attributes and loaded to the GPU. The load array is the unit used in loading the vertex data.

Note: For more information about load arrays, see 3DS Programming Manual: Advanced
Graphics.

When the GPU transfers load arrays, it determines whether to perform a read-ahead transfer based
on the combination of vertex attribute data types and data sizes comprising the load array. If a read-
ahead transfer can be performed, the vertex data transfer speed is faster.

Read-ahead transfer is performed when data meets the requirements of the conditional equation
shown below.

(Number of attributes of a type other than GL_FLOAT + Number of attributes whose data size is 1) <=
(Number of GL_FLOAT attributes whose data size is 4 + (Number of GL_FLOAT attributes whose data size is 3) / 2)

The data size of the attributes of a type other than GL_FLOAT, and the data type of the attributes
whose data size is 1, are arbitrary. A vertex attribute that meets multiple conditions is counted
once for each condition it meets. For example, a GL_BYTE vertex attribute whose data size is 1 is
counted both as an attribute of a type other than GL_FLOAT and as an attribute whose data size is 1.
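
As an illustration of this counting rule, consider the hypothetical interleaved load array below
(the attribute indices and layout are assumptions for the example, not requirements).

/* Hypothetical load array: position (GL_FLOAT, size 4), normal (GL_FLOAT, size 3),
   and a texture coordinate (GL_FLOAT, size 2), interleaved in one vertex buffer
   that is assumed to be bound to GL_ARRAY_BUFFER.

   Left side  = (attributes of a type other than GL_FLOAT) + (attributes of size 1)
              = 0 + 0 = 0
   Right side = (GL_FLOAT attributes of size 4) + (GL_FLOAT attributes of size 3) / 2
              = 1 + 1 / 2 = 1.5
   0 <= 1.5, so read-ahead transfer is performed for this load array. Adding a
   one-component GL_BYTE attribute would add 2 to the left side (it is counted
   under both left-side conditions) and break the condition.                       */
const GLsizei stride = 9 * sizeof(GLfloat);   /* 4 + 3 + 2 floats per vertex */
glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, stride, (const GLvoid*)0);
glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, stride, (const GLvoid*)(4 * sizeof(GLfloat)));
glVertexAttribPointer(2, 2, GL_FLOAT, GL_FALSE, stride, (const GLvoid*)(7 * sizeof(GLfloat)));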

If the conditions for read-ahead transfer are matched, transfer speed depends on the data volume of
load arrays. The smaller the data volume, the faster the transfer speed. If the volume of vertex data
is the same, transfer speed depends on the number of attributes included in the load array. The fewer
the load array attributes, the faster the transfer speed.

15.14. Vertex Array Address Alignment

Efficiency of vertex array transfer processing can be improved by keeping the vertex array address
aligned to 32 bytes when rendering with a vertex buffer. The vertex array address is the sum of the
vertex buffer address and the offset specified in the glVertexAttribPointer() function (the value
specified in ptr).

The extent to which speed improves, compared with a vertex array address that is not aligned to 32
bytes, depends on the vertex attribute type and size, the location of vertex array storage, and the
content of the vertex index. There is no guarantee that this method will be effective.
In addition, even if transfer processing performance improves, it will not necessarily improve
performance of the overall system unless vertex array transfer processing is causing a noticeable
bottleneck.
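
A sketch of keeping the vertex array address on a 32-byte boundary follows; the attribute layout,
the offsets, and the assumption that the vertex buffer object's storage itself starts at a
32-byte-aligned address are all illustrative.

/* Keep each vertex array address (vertex buffer address + ptr offset) on a
   32-byte boundary. The offsets below are multiples of 32; this only helps
   if the buffer object's storage is itself 32-byte aligned (an assumption). */
enum
{
    POSITION_OFFSET = 0,      /* multiple of 32 */
    NORMAL_OFFSET   = 4096    /* multiple of 32 */
};

glBindBuffer(GL_ARRAY_BUFFER, vertexBufferId);   /* vertexBufferId: an existing buffer object */
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, (const GLvoid*)POSITION_OFFSET);
glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, 0, (const GLvoid*)NORMAL_OFFSET);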
15.15. GPU Hangs When Multitexture Is Used

Note: On SNAKE hardware, the GPU does not hang when using multitextures. However, note
that hangs can still occur when the application is running on CTR.

The GPU may experience a hang when both of the following conditions are met.

Multiple textures are used.
There is a considerable difference in performance between texture units.

Procedural texture is not included in these conditions. This phenomenon does not occur when one
normal texture is used simultaneously with a procedural texture.

This phenomenon can be avoided by taking either of the following steps (a sketch of the filter
setting follows this list).

Use only one texture.

For all textures used, apply the GL_XXX_MIPMAP_LINEAR setting (the setting for trilinear filter
use) to the GL_TEXTURE_MIN_FILTER texture parameter. This parameter must be set even for textures
for which there actually is no mipmap.
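
A sketch of the filter setting using the standard texture parameter call (GL_LINEAR_MIPMAP_LINEAR
is one of the GL_XXX_MIPMAP_LINEAR settings; textureId is assumed to be an existing texture object).

/* Apply a trilinear (GL_XXX_MIPMAP_LINEAR) minification filter to every texture
   used in the multitexture, even textures that have no actual mipmap.          */
glBindTexture(GL_TEXTURE_2D, textureId);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR);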

In addition, the following methods are recommended as a means of mitigating this phenomenon.

Set the GL_TEXTURE_MIN_FILTER texture parameter to the GL_XXX_MIPMAP_LINEAR setting (the setting
for trilinear filter use) for a portion of the textures to be used. The problem can be avoided
completely by setting all textures to the trilinear filter setting, but the phenomenon’s occurrence
can be mitigated by using the setting on just a portion of the textures.
Place all textures to be used at the same time in the same VRAM.
Reduce the number of textures used.

Because the occurrence of this phenomenon is dependent on timing, changing texture settings as
listed below can also avoid the problem much of the time. However, in some cases, the frequency of
occurrence could actually worsen.

Change the size of textures.
Change texture formats.
Change the filter mode of the textures.
Change the storage locations of textures, switching between VRAM-A, VRAM-B, and FCRAM
(device memory).

For a description of determining whether this phenomenon is the cause of a hang, see 15.12.
Analyzing Causes of GPU Hangs.

However, note that the same type of hang can also be caused by a conflict with the restrictions
governing the storage location of the 4-bit texture format, which makes it difficult to determine
the cause with certainty.

15.16. GL_INTERPOLATE Calculations of the Texture Combiner

The texture combiner’s GL_INTERPOLATE expression is src0 * src2 + src1 * (1 – src2). If src2 is 1
or 0, you would expect the result of the calculation to be the same value as either src0 or src1.
However, because of how the specifications have been implemented, if src2 is 0 and src0 < src1, the
output result is not the value of src1 as you would expect, but rather something that is one unit
less bright than src1.

To avoid this problem, you must combine GL_MODULATE and GL_MULT_ADD_DMP and perform the
calculation in a two-stage combiner. Alternatively, if you change the src2 operand to
GL_ONE_MINUS_* and switch src0 and src1, you may be able to reduce the fragments for which
this problem arises.
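
The arithmetic behind the two-stage replacement can be sketched as follows. The assumption here
(not confirmed by this section) is that GL_MULT_ADD_DMP computes arg0 * arg1 + arg2; verify the
operand assignment against the combiner reference before relying on it.

/* Stage 1 (GL_MODULATE):      p      = src0 * src2
   Stage 2 (GL_MULT_ADD_DMP):  result = src1 * (1 - src2) + p
   which equals src0 * src2 + src1 * (1 - src2), the intended GL_INTERPOLATE
   result, for all values of src2 including 0 and 1.                          */
static float TwoStageInterpolate(float src0, float src1, float src2)
{
    float p = src0 * src2;                    /* first combiner stage          */
    return src1 * (1.0f - src2) + p;          /* second stage, assumed formula */
}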

15.17. When Render Results Are a Complete Mismatch for Polygons With the Same Vertex Coordinates

Even when you render polygons that have exactly the same vertex coordinates, the various attribute
values for the fragments might not match at all.

This phenomenon is a result of calculation errors in the fragment interpolation calculations that
arise when the order in which the vertices are entered into the rasterization module differs. It
does not occur when the input order of the vertices is exactly the same. However, if you render
using the glDrawElements() function with GL_TRIANGLES as a parameter, that input order can change
internally as a result of the relationship with the vertex indices for the immediately prior
polygon. To ensure your ability to render multiple polygons with completely matching fragments, you
must use the same vertex indices when rendering, including those for the polygons that precede and
follow your desired polygons.
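
As a sketch (indexBufferId and indexCount are hypothetical), both draws below submit exactly the
same index data in the same order, which keeps the internal vertex input order to the rasterization
module identical between the two passes.

/* Both passes use the same element array buffer and the same index order,
   including any preceding polygons, so fragment interpolation matches.     */
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, indexBufferId);
glDrawElements(GL_TRIANGLES, indexCount, GL_UNSIGNED_SHORT, (const GLvoid*)0);   /* pass 1 */
glDrawElements(GL_TRIANGLES, indexCount, GL_UNSIGNED_SHORT, (const GLvoid*)0);   /* pass 2 */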


Revision History

Version 1.6 (2016-06-24)

Changes

4.1.4. Running Command List Objects

Added that the accumulation of commands in an executing command list must occur in the
application core.
Version 1.5 (2016-05-10)

Changes

4. Command Lists

Added that the function issuing command requests can only be called in Core 0.

Version 1.4 (2015-11-05)

Additions

3.8. Specifying Display Portions

7.5. Copying From the Framebuffer

[Link]. Silhouette Shadow Artifacts

15.17. When Render Results Are a Complete Mismatch for Polygons With the Same Vertex
Coordinates

Changes

1. Overview

Added notes to refer to the API Reference for alignment definition names.

3. LCD

Consolidated Path of the Rendering Results Prior to Being Displayed on the LCDs as one
section.

3.3. Allocating Buffers

Added a figure (Configuration of the Framebuffer and the Display Buffer).

[Link]. Flushing the Accumulated 3D Command Buffer

Fixed the function that reflects the cache content from nngxUpdateBuffer() to<