OpenGLES Programming Guide For IOS
OpenGLES Programming Guide For IOS
About OpenGL ES
The Open Graphics Library (OpenGL) is used for visualizing 2D and 3D data. It is a multipurpose open-standard
graphics library that supports applications for 2D and 3D digital content creation, mechanical and architectural
design, virtual prototyping, flight simulation, video games, and more. You use OpenGL to configure a 3D
graphics pipeline and submit data to it. Vertices are transformed and lit, assembled into primitives, and rasterized
to create a 2D image. OpenGL is designed to translate function calls into graphics commands that can be sent
to underlying graphics hardware. Because this underlying hardware is dedicated to processing graphics
commands, OpenGL drawing is typically very fast.
OpenGL for Embedded Systems (OpenGL ES) is a simplified version of OpenGL that eliminates redundant
functionality to provide a library that is both easier to learn and easier to implement in mobile graphics
hardware.
At a Glance
OpenGL ES allows an app to harness the power of the underlying graphics processor. The GPU on iOS devices
can perform sophisticated 2D and 3D drawing, as well as complex shading calculations on every pixel in the
final image. You should use OpenGL ES if the design requirements of your app call for the most direct and
comprehensive access possible to GPU hardware. Typical clients for OpenGL ES include video games and
simulations that present 3D graphics.
OpenGL ES is a low-level, hardware-focused API. Though it provides the most powerful and flexible graphics
processing tools, it also has a steep learning curve and a significant effect on the overall design of your app.
For apps that require high-performance graphics for more specialized uses, iOS provides several higher-level
frameworks:
● The Sprite Kit framework provides a hardware-accelerated animation system optimized for creating 2D games. (See Sprite Kit Programming Guide.)
● The Core Image framework provides real-time filtering and analysis for still and video images. (See Core Image Programming Guide.)
● Core Animation provides the hardware-accelerated graphics rendering and animation infrastructure for all iOS apps, as well as a simple declarative programming model that makes it easy to implement sophisticated user interface animations. (See Core Animation Programming Guide.)
● You can add animation, physics-based dynamics, and other special effects to Cocoa Touch user interfaces
using features in the UIKit framework.
Relevant Chapters: “Checklist for Building OpenGL ES Apps for iOS” (page 15), “Configuring
OpenGL ES Contexts” (page 19)
Apps Require Additional Performance Tuning
You should design your app to use the OpenGL ES API efficiently. Once you have finished building your app, use Instruments to fine-tune its performance. If your app is bottlenecked inside OpenGL ES, use the information provided in this guide to optimize your app’s performance. Xcode also provides tools to help you improve the performance of your OpenGL ES apps.
Relevant Chapters: “OpenGL ES Design Guidelines” (page 48), “Best Practices for Working with
Vertex Data” (page 74), “Best Practices for Working with Texture Data” (page 89), “Best Practices for
Shaders” (page 94), “Tuning Your OpenGL ES App” (page 62)
Relevant Chapters: “Multitasking, High Resolution, and Other iOS Features” (page 43)
How to Use This Document
If you’re familiar with the basics of using OpenGL ES in iOS, read “Drawing to Other Rendering
Destinations” (page 32) and “Multitasking, High Resolution, and Other iOS Features” (page 43) for important
platform-specific guidelines. Developers familiar with using OpenGL ES in iOS versions before 5.0 should study
“Drawing with OpenGL ES and GLKit” (page 23) for details on new features for streamlining OpenGL ES
development.
Finally, read “OpenGL ES Design Guidelines” (page 48), “Tuning Your OpenGL ES App” (page 62), and the
following chapters to dig deeper into how to design efficient OpenGL ES apps.
Unless otherwise noted, OpenGL ES code examples in this book target OpenGL ES 3.0. You may need to make
changes to use these code examples with other OpenGL ES versions.
Prerequisites
Before attempting to use OpenGL ES, you should already be familiar with general iOS app architecture. See Start Developing iOS Apps Today.
This document is not a complete tutorial or a reference for the cross-platform OpenGL ES API. To learn more
about OpenGL ES, consult the references below.
See Also
OpenGL ES is an open standard defined by the Khronos Group. For more information about the OpenGL ES
standard, please consult their web page at [Link]
● OpenGL® ES 3.0 Programming Guide, published by Addison-Wesley, provides a comprehensive introduction
to OpenGL ES concepts.
● OpenGL® Shading Language, Third Edition, also published by Addison-Wesley, provides many shading algorithms usable in your OpenGL ES app. You may need to modify some of these algorithms to run
efficiently on mobile graphics processors.
● OpenGL ES API Registry is the official repository for the OpenGL ES specifications, the OpenGL ES shading
language specifications, and documentation for OpenGL ES extensions.
● OpenGL ES Framework Reference describes the platform-specific functions and classes provided by Apple
to integrate OpenGL ES into iOS.
● iOS Device Compatibility Reference provides more detailed information on the hardware and software
features available to your app.
● GLKit Framework Reference describes a framework provided by Apple to make it easier to develop
OpenGL ES 2.0 and 3.0 apps.
Checklist for Building OpenGL ES Apps for iOS
The OpenGL ES specification defines a platform-neutral API for using GPU hardware to render graphics. Platforms
implementing OpenGL ES provide a rendering context for executing OpenGL ES commands, framebuffers to
hold rendering results, and one or more rendering destinations that present the contents of a framebuffer for
display. In iOS, the EAGLContext class implements a rendering context. iOS provides only one type of
framebuffer, the OpenGL ES framebuffer object, and the GLKView and CAEAGLLayer classes implement
rendering destinations.
Building an OpenGL ES app in iOS requires several considerations, some of which are generic to OpenGL ES
programming and some of which are specific to iOS. Follow this checklist and the detailed sections below to
get started:
1. Determine which version(s) of OpenGL ES have the right feature set for your app, and create an OpenGL ES
context.
2. Verify at runtime that the device supports the OpenGL ES capabilities you want to use.
3. Choose where to render your OpenGL ES content.
4. Make sure your app runs correctly in iOS.
5. Implement your rendering engine.
6. Use Xcode and Instruments to debug your OpenGL ES app and tune it for optimal performance.
Verifying OpenGL ES Capabilities
You should target the version or versions of OpenGL ES that support the features and devices most relevant
to your app. To learn more about the OpenGL ES capabilities of iOS devices, read iOS Device Compatibility
Reference.
To create contexts for the versions of OpenGL ES you plan to support, read “Configuring OpenGL ES
Contexts” (page 19). To learn how your choice of OpenGL ES version relates to the rendering algorithms you
might use in your app, read “OpenGL ES Versions and Renderer Architecture” (page 50).
To determine implementation-specific limits such as the maximum texture size or maximum number of vertex
attributes, look up the value for the corresponding token (such as MAX_TEXTURE_SIZE or
MAX_VERTEX_ATTRIBS, as found in the gl.h header) using the appropriate glGet function for its data type.
To check for OpenGL ES 3.0 extensions, use the glGetIntegerv and glGetStringi functions as in the
following code example:
// For better performance, create the set only once and cache it for future use.
GLint max = 0;
glGetIntegerv(GL_NUM_EXTENSIONS, &max);
NSMutableSet *extensions = [NSMutableSet set];
for (GLint i = 0; i < max; i++) {
    [extensions addObject:@((const char *)glGetStringi(GL_EXTENSIONS, i))];
}
To check for OpenGL ES 1.1 and 2.0 extensions, call glGetString(GL_EXTENSIONS) to get a space-delimited
list of all extension names.
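For example, a minimal sketch of testing for a single extension this way (the extension name shown is illustrative):

const char *extensions = (const char *)glGetString(GL_EXTENSIONS);
BOOL supportsDiscard = (strstr(extensions, "EXT_discard_framebuffer") != NULL);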
Choosing a Rendering Destination
To learn about rendering to an offscreen buffer, a texture, or a Core Animation layer, read “Drawing to Other
Rendering Destinations” (page 32).
Many iOS devices include high-resolution displays, so your app should support multiple display sizes and
resolutions.
To learn about supporting these and other iOS features, read “Multitasking, High Resolution, and Other iOS
Features” (page 43).
To learn about design considerations important for iOS devices, read “OpenGL ES Design Guidelines” (page
48) and “Concurrency and OpenGL ES” (page 106).
Debugging and Profiling
To learn more about solving problems and improving performance in your OpenGL ES app, read “Tuning Your
OpenGL ES App” (page 62).
Configuring OpenGL ES Contexts
Every implementation of OpenGL ES provides a way to create rendering contexts to manage the state required
by the OpenGL ES specification. By placing this state in a context, multiple apps can easily share the graphics
hardware without interfering with one another’s state.
To set a thread’s current context, call the EAGLContext class method setCurrentContext: when executing
on that thread.
Call the EAGLContext class method currentContext to retrieve a thread’s current context.
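For example (the context variable is illustrative):

[EAGLContext setCurrentContext:myContext];
EAGLContext *current = [EAGLContext currentContext];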
Every Context Targets a Specific Version of OpenGL ES
Note: If your app actively switches between two or more contexts on the same thread, call the
glFlush function before setting a new context as the current context. This ensures that previously
submitted commands are delivered to the graphics hardware in a timely fashion.
Your app decides which version of OpenGL ES to support when it creates and initializes the EAGLContext
object. If the device does not support the requested version of OpenGL ES, the initWithAPI: method returns
nil. Your app must test to ensure that a context was initialized successfully before using it.
To support multiple versions of OpenGL ES as rendering options in your app, you should first attempt to
initialize a rendering context of the newest version you want to target. If the returned object is nil, initialize
a context of an older version instead. Listing 2-1 demonstrates how to do this.
Listing 2-1  Supporting multiple versions of OpenGL ES in the same app

EAGLContext* CreateBestEAGLContext()
{
    // Try to create an OpenGL ES 3.0 context; fall back to 2.0 if unavailable.
    EAGLContext *context = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES3];
    if (context == nil) {
        context = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES2];
    }
    return context;
}
A context’s API property states which version of OpenGL ES the context supports. Your app would test the
context’s API property and use it to choose the correct rendering path. A common pattern for implementing
this is to create a class for each rendering path; your app tests the context and creates a renderer once, on
initialization.
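A minimal sketch of this pattern, assuming hypothetical MyES3Renderer and MyES2Renderer classes:

if ([context API] == kEAGLRenderingAPIOpenGLES3) {
    self.renderer = [[MyES3Renderer alloc] initWithContext:context]; // hypothetical class
} else {
    self.renderer = [[MyES2Renderer alloc] initWithContext:context]; // hypothetical class
}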
An EAGL Sharegroup Manages OpenGL ES Objects for the Context
The advantage of a sharegroup becomes obvious when two or more contexts refer to the same sharegroup,
as shown in Figure 2-1. When multiple contexts are connected to a common sharegroup, OpenGL ES objects
created by any context are available on all contexts; if you bind to the same object identifier on another context
than the one that created it, you reference the same OpenGL ES object. Resources are often scarce on mobile
devices; creating multiple copies of the same content on multiple contexts is wasteful. Sharing common
resources makes better use of the available graphics resources on the device.
A sharegroup is an opaque object; it has no methods or properties that your app can call. Contexts that use
the sharegroup object keep a strong reference to it.
To create multiple contexts that reference the same sharegroup, the first context is initialized by calling
initWithAPI:; a sharegroup is automatically created for the context. The second and later contexts are
initialized to use the first context’s sharegroup by calling the initWithAPI:sharegroup: method instead.
Listing 2-2 shows how this would work. The first context is created using the convenience function defined in
Listing 2-1 (page 20). The second context is created by extracting the API version and sharegroup from the
first context.
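A minimal sketch of Listing 2-2, following that description and the convenience function from Listing 2-1:

EAGLContext *firstContext = CreateBestEAGLContext();
EAGLContext *secondContext = [[EAGLContext alloc] initWithAPI:[firstContext API]
                                                   sharegroup:[firstContext sharegroup]];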
Important: All contexts associated with the same sharegroup must use the same version of the OpenGL ES
API as the initial context.
It is your app’s responsibility to manage state changes to OpenGL ES objects when the sharegroup is shared
by multiple contexts. Here are the rules:
● Your app may access the object across multiple contexts simultaneously provided the object is not being
modified.
● While the object is being modified by commands sent to a context, the object must not be read or modified
on any other context.
● After an object has been modified, all contexts must rebind the object to see the changes. The contents
of the object are undefined if a context references it before binding it.
Here are the steps your app should follow to update an OpenGL ES object:
1. Call glFlush on every context that may be using the object.
2. On the context that wants to modify the object, call one or more OpenGL ES functions to change the
object.
3. Call glFlush on the context that received the state-modifying commands.
4. On every other context, rebind the object identifier.
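A minimal sketch of these steps for a texture shared by two contexts (variable names are illustrative, and the initial flush on other contexts is shown for only one):

// Steps 1-3: modify the texture on one context, then flush
[EAGLContext setCurrentContext:contextA];
glBindTexture(GL_TEXTURE_2D, sharedTexture);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 16, 16, GL_RGBA, GL_UNSIGNED_BYTE, newTexels);
glFlush();

// Step 4: rebind the texture on every other context to see the changes
[EAGLContext setCurrentContext:contextB];
glBindTexture(GL_TEXTURE_2D, sharedTexture);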
Note: Another way to share objects is to use a single rendering context, but multiple destination
framebuffers. At rendering time, your app binds the appropriate framebuffer and renders its frames
as needed. Because all of the OpenGL ES objects are referenced from a single context, they see the
same OpenGL ES data. This pattern uses fewer resources, but is only useful for single-threaded apps
where you can carefully control the state of the context.
Drawing with OpenGL ES and GLKit
The GLKit framework provides view and view controller classes that eliminate the setup and maintenance code
that would otherwise be required for drawing and animating OpenGL ES content. The GLKView class manages
OpenGL ES infrastructure to provide a place for your drawing code, and the GLKViewController class
provides a rendering loop for smooth animation of OpenGL ES content in a GLKit view. These classes extend
the standard UIKit design patterns for drawing view content and managing view presentation. As a result, you
can focus your efforts primarily on your OpenGL ES rendering code and get your app up and running quickly.
The GLKit framework also provides other features to ease OpenGL ES 2.0 and 3.0 development.
A GLKit View Draws OpenGL ES Content on Demand
Like a standard UIKit view, a GLKit view renders its content on demand. When your view is first displayed, it
calls your drawing method—Core Animation caches the rendered output and displays it whenever your view
is shown. When you want to change the contents of your view, call its setNeedsDisplay method and the
view again calls your drawing method, caches the resulting image, and presents it on screen. This approach
is useful when the data used to render an image changes infrequently or only in response to user action. By
rendering new view contents only when you need to, you conserve battery power on the device and leave
more time for the device to perform other actions.
A GLKit view automatically creates and configures its own OpenGL ES framebuffer object and renderbuffers.
You control the attributes of these objects using the view’s drawable properties, as illustrated in Listing 3-1. If
you change the size, scale factor, or drawable properties of a GLKit view, it automatically deletes and re-creates
the appropriate framebuffer objects and renderbuffers the next time its contents are drawn.
Listing 3-1  Configuring a GLKit view

- (void)viewDidLoad
{
    [super viewDidLoad];

    // Create an OpenGL ES context and assign it to the view loaded from storyboard
    GLKView *view = (GLKView *)self.view;
    view.context = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES2];

    // Configure renderbuffers created by the view
    view.drawableColorFormat = GLKViewDrawableColorFormatRGBA8888;
    view.drawableDepthFormat = GLKViewDrawableDepthFormat24;
    view.drawableStencilFormat = GLKViewDrawableStencilFormat8;

    // Enable multisampling
    view.drawableMultisample = GLKViewDrawableMultisample4X;
}
You can enable multisampling for a GLKView instance using its drawableMultisample property. Multisampling
is a form of antialiasing that smooths jagged edges, improving image quality in most 3D apps at the cost of
using more memory and fragment processing time—if you enable multisampling, always test your app’s
performance to ensure that it remains acceptable.
Listing 3-2  Example drawing method for a GLKit view

- (void)drawRect:(CGRect)rect
{
    // Clear the framebuffer
    glClearColor(0.0f, 0.0f, 0.1f, 1.0f);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    // Draw using previously configured texture, shader, uniforms, and vertex array
    glBindTexture(GL_TEXTURE_2D, _planetTexture);
    glUseProgram(_diffuseShading);
    glUniformMatrix4fv(_uniformModelViewProjectionMatrix, 1, 0, _modelViewProjectionMatrix.m);
    glBindVertexArrayOES(_planetMesh);
    glDrawElements(GL_TRIANGLE_STRIP, 256, GL_UNSIGNED_SHORT, 0);
}
Note: The glClear function hints to OpenGL ES that any existing framebuffer contents can be
discarded, avoiding costly memory operations to load the previous contents into memory. To ensure
optimal performance, you should always call this function before drawing.
The GLKView class is able to provide a simple interface for OpenGL ES drawing because it manages the standard parts of the OpenGL ES rendering process:
● Before invoking your drawing method, the view:
  ● Makes its EAGLContext object the current context
  ● Creates a framebuffer object and renderbuffers based on its current size, scale factor, and drawable properties (if needed)
  ● Binds the framebuffer object as the current destination for drawing commands
  ● Sets the OpenGL ES viewport to match the framebuffer size
● After your drawing method returns, the view:
  ● Resolves multisampling buffers (if multisampling is enabled)
  ● Discards renderbuffers whose contents are no longer needed
  ● Presents renderbuffer contents to Core Animation for caching and display
Rendering Using a Delegate Object
Many OpenGL ES apps implement their drawing code in a custom renderer class, making it easy to support multiple rendering algorithms. For example, you might use different renderer classes to support both OpenGL ES 2.0 and 3.0 (see “Configuring OpenGL ES Contexts” (page 19)). Or you might use them to customize rendering for better image quality on devices with more powerful hardware.
GLKit is well suited to this approach—you can make your renderer object the delegate of a standard GLKView
instance. Instead of subclassing GLKView and implementing the drawRect: method, your renderer class
adopts the GLKViewDelegate protocol and implements the glkView:drawInRect: method. Listing 3-3
demonstrates choosing a renderer class based on hardware features at app launch time.
Listing 3-3  Choosing a renderer class based on hardware features

- (BOOL)application:(UIApplication *)application
didFinishLaunchingWithOptions:(NSDictionary *)launchOptions
{
    // Create a context so the app can test for features
    EAGLContext *context = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES2];
    [EAGLContext setCurrentContext:context];

    // Choose a renderer class based on device features
    GLint maxTextureSize;
    glGetIntegerv(GL_MAX_TEXTURE_SIZE, &maxTextureSize);
    if (maxTextureSize > 2048)
        self.renderer = [[MyBigTextureRenderer alloc] initWithContext:context];
    else
        self.renderer = [[MyRenderer alloc] initWithContext:context];

    // Make the renderer the delegate for the view loaded from the main storyboard
    GLKView *view = (GLKView *)self.window.rootViewController.view;
    view.delegate = self.renderer;

    // Give the OpenGL ES context to the view so it can draw
    view.context = context;

    return YES;
}
A GLKit View Controller Animates OpenGL ES Content
For the display phase, the view controller calls its view’s display method, which in turn calls your drawing
method. In your drawing method, you submit OpenGL ES drawing commands to the GPU to render your
content. For optimal performance, your app should modify OpenGL ES objects at the start of rendering a new
frame, and submit drawing commands afterward. In Figure 3-2, the display phase sets a uniform variable in a
shader program to the matrix calculated in the update phase, and then submits a drawing command to render
new content.
The animation loop alternates between these two phases at the rate indicated by the view controller’s
framesPerSecond property. You can use the preferredFramesPerSecond property to set a desired frame
rate—to optimize performance for the current display hardware, the view controller automatically chooses an
optimal frame rate close to your preferred value.
Important: For best results, choose a frame rate your app can consistently achieve. A smooth, consistent
frame rate produces a more pleasant user experience than a frame rate that varies erratically.
Listing 3-4 Using a GLKit view and view controller to draw and animate OpenGL ES content
- (void)viewDidLoad
{
    [super viewDidLoad];

    // Create an OpenGL ES context and assign it to the view loaded from storyboard
    GLKView *view = (GLKView *)self.view;
    view.context = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES2];
    view.drawableDepthFormat = GLKViewDrawableDepthFormat24;

    // Set an animation frame rate
    self.preferredFramesPerSecond = 60;

    // Not shown: load shaders, textures and vertex arrays, set up projection matrix
    [self setupGL];
}
- (void)update
{
    _rotation += self.timeSinceLastUpdate * M_PI_2; // one quarter rotation per second

    // Set up transform matrices for the rotating planet
    GLKMatrix4 modelViewMatrix = GLKMatrix4MakeRotation(_rotation, 0.0f, 1.0f, 0.0f);
    _normalMatrix = GLKMatrix3InvertAndTranspose(GLKMatrix4GetMatrix3(modelViewMatrix), NULL);
    _modelViewProjectionMatrix = GLKMatrix4Multiply(_projectionMatrix, modelViewMatrix);
}
- (void)glkView:(GLKView *)view drawInRect:(CGRect)rect
{
    // Clear the framebuffer
    glClearColor(0.0f, 0.0f, 0.1f, 1.0f);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    // Set shader uniforms to values calculated in the update phase
    glUseProgram(_diffuseShading);
    glUniformMatrix4fv(_uniformModelViewProjectionMatrix, 1, 0, _modelViewProjectionMatrix.m);
    glUniformMatrix3fv(_uniformNormalMatrix, 1, 0, _normalMatrix.m);

    // Draw the previously configured texture and vertex array
    glBindTexture(GL_TEXTURE_2D, _planetTexture);
    glBindVertexArrayOES(_planetMesh);
    glDrawElements(GL_TRIANGLE_STRIP, 256, GL_UNSIGNED_SHORT, 0);
}

@end
The view controller is automatically the delegate of its view, so it implements both the update and display
phases of the animation loop. In the update method, it calculates the transformation matrices needed to
display a rotating planet. In the glkView:drawInRect: method, it provides those matrices to a shader
program and submits drawing commands to render the planet geometry.
Drawing to Other Rendering Destinations
Framebuffer objects are the destination for rendering commands. When you create a framebuffer object, you
have precise control over its storage for color, depth, and stencil data. You provide this storage by attaching
images to the framebuffer, as shown in Figure 4-1. The most common image attachment is a renderbuffer
object. You can also attach an OpenGL ES texture to the color attachment point of a framebuffer, which means
that any drawing commands are rendered into the texture. Later, the texture can act as an input to future
rendering commands. You can also create multiple framebuffer objects in a single rendering context. You
might do this so that you can share the same rendering pipeline and OpenGL ES resources between multiple
framebuffers.
All of these approaches require manually creating framebuffer and renderbuffer objects to store the rendering
results from your OpenGL ES context, as well as writing additional code to present their contents to the screen
and (if needed) run an animation loop.
Creating a Framebuffer Object
The procedure for creating a framebuffer object depends on its intended rendering destination:
● To render offscreen and retrieve the image with OpenGL ES functions, attach a renderbuffer. See “Creating Offscreen Framebuffer Objects” (page 33).
● To use the framebuffer image as an input to a later rendering step, attach a texture. See “Using Framebuffer Objects to Render to a Texture” (page 34).
● To use the framebuffer in a Core Animation layer composition, use a special Core Animation–aware renderbuffer. See “Rendering to a Core Animation Layer” (page 35).
Creating Offscreen Framebuffer Objects
A framebuffer intended for offscreen rendering allocates all of its attachments as OpenGL ES renderbuffers.
1. Create the framebuffer object and bind it.

GLuint framebuffer;
glGenFramebuffers(1, &framebuffer);
glBindFramebuffer(GL_FRAMEBUFFER, framebuffer);
2. Create a color renderbuffer, allocate storage for it, and attach it to the framebuffer’s color attachment
point.
GLuint colorRenderbuffer;
glGenRenderbuffers(1, &colorRenderbuffer);
glBindRenderbuffer(GL_RENDERBUFFER, colorRenderbuffer);
glRenderbufferStorage(GL_RENDERBUFFER, GL_RGBA8_OES, width, height);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
GL_RENDERBUFFER, colorRenderbuffer);
3. Create a depth or depth/stencil renderbuffer, allocate storage for it, and attach it to the framebuffer’s
depth attachment point.
GLuint depthRenderbuffer;
glGenRenderbuffers(1, &depthRenderbuffer);
glBindRenderbuffer(GL_RENDERBUFFER, depthRenderbuffer);
glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT16, width, height);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,
GL_RENDERBUFFER, depthRenderbuffer);
4. Test the framebuffer for completeness. This test only needs to be performed when the framebuffer’s
configuration changes.
GLenum status = glCheckFramebufferStatus(GL_FRAMEBUFFER);
if (status != GL_FRAMEBUFFER_COMPLETE) {
    NSLog(@"failed to make complete framebuffer object %x", status);
}
After drawing to an offscreen renderbuffer, you can return its contents to the CPU for further processing using
the glReadPixels function.
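A minimal sketch of reading back the framebuffer contents (using the width and height of the renderbuffer's storage):

GLubyte *pixels = (GLubyte *)malloc(width * height * 4);
glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, pixels);
// ... process the pixel data, then free(pixels)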
Using Framebuffer Objects to Render to a Texture
The procedure is similar to the offscreen example, except that instead of a color renderbuffer, you create and attach a texture to the color attachment point:

GLuint texture;
glGenTextures(1, &texture);
glBindTexture(GL_TEXTURE_2D, texture);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, NULL);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, texture, 0);
Although this example assumes you are rendering to a color texture, other options are possible. For example,
using the OES_depth_texture extension, you can attach a texture to the depth attachment point to store
depth information from the scene into a texture. You might use this depth information to calculate shadows
in the final rendered scene.
Rendering to a Core Animation Layer
The CAEAGLLayer class provides this support to OpenGL ES through two key pieces of functionality. First, it allocates shared storage for a renderbuffer. Second, it presents the renderbuffer to Core Animation, replacing
the layer’s previous contents with data from the renderbuffer. An advantage of this model is that the contents
of the Core Animation layer do not need to be drawn in every frame, only when the rendered image changes.
Note: The GLKView class automates the steps below, so you should use it when you want to draw
with OpenGL ES in the content layer of a view.
Creating a framebuffer that renders to a Core Animation layer takes the following steps:
1. Create a CAEAGLLayer object and configure its properties.
2. Allocate an OpenGL ES context and make it the current context. See “Configuring OpenGL ES
Contexts” (page 19).
3. Create the framebuffer object (as in “Creating Offscreen Framebuffer Objects” (page 33) above).
4. Create a color renderbuffer, allocating its storage by calling the context’s
renderbufferStorage:fromDrawable: method and passing the layer object as the parameter. The
width, height and pixel format are taken from the layer and used to allocate storage for the renderbuffer.
GLuint colorRenderbuffer;
glGenRenderbuffers(1, &colorRenderbuffer);
glBindRenderbuffer(GL_RENDERBUFFER, colorRenderbuffer);
[myContext renderbufferStorage:GL_RENDERBUFFER fromDrawable:myEAGLLayer];
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
GL_RENDERBUFFER, colorRenderbuffer);
Note: When the Core Animation layer’s bounds or properties change, your app should reallocate
the renderbuffer’s storage. If you do not reallocate the renderbuffers, the renderbuffer size won’t
match the size of the layer; in this case, Core Animation may scale the image’s contents to fit in
the layer.
5. Retrieve the height and width of the color renderbuffer.

GLint width;
GLint height;
glGetRenderbufferParameteriv(GL_RENDERBUFFER, GL_RENDERBUFFER_WIDTH,
&width);
glGetRenderbufferParameteriv(GL_RENDERBUFFER, GL_RENDERBUFFER_HEIGHT,
&height);
In earlier examples, the width and height of the renderbuffers were explicitly provided to allocate storage
for the buffer. Here, the code retrieves the width and height from the color renderbuffer after its storage
is allocated. Your app does this because the actual dimensions of the color renderbuffer are calculated
based on the layer’s bounds and scale factor. Other renderbuffers attached to the framebuffer must have
the same dimensions. In addition to using the height and width to allocate the depth buffer, use them to
assign the OpenGL ES viewport and to help determine the level of detail required in your app’s textures
and models. See “Supporting High-Resolution Displays” (page 45).
6. Allocate and attach a depth buffer (as before).
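A minimal sketch of step 6, also using the retrieved dimensions to set the viewport:

GLuint depthRenderbuffer;
glGenRenderbuffers(1, &depthRenderbuffer);
glBindRenderbuffer(GL_RENDERBUFFER, depthRenderbuffer);
glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT16, width, height);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, depthRenderbuffer);
glViewport(0, 0, width, height);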
Drawing to a Framebuffer Object
For on-demand drawing, implement your own method to draw into and present your renderbuffer, and call
it whenever you want to display new content.
To draw with an animation loop, use a CADisplayLink object. A display link is a kind of timer provided by
Core Animation that lets you synchronize drawing to the refresh rate of a screen. Listing 4-1 (page 37) shows
how you can retrieve the screen showing a view, use that screen to create a new display link object and add
the display link object to the run loop.
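A minimal sketch of that approach (the view variable and drawFrame selector are illustrative):

CADisplayLink *displayLink = [myView.window.screen displayLinkWithTarget:self selector:@selector(drawFrame)];
[displayLink addToRunLoop:[NSRunLoop currentRunLoop] forMode:NSDefaultRunLoopMode];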
Note: The GLKViewController class automates the usage of CADisplayLink objects for animating
GLKView content. Use the CADisplayLink class directly only if you need behavior beyond what
the GLKit framework provides.
Inside your implementation of the drawFrame method, read the display link’s timestamp property to get the timestamp for the next frame to be rendered. Your app can use that value to calculate the positions of objects in the next frame.
Normally, the display link object fires every time the screen refreshes; that rate is usually 60 Hz, but it may vary on different devices. Most apps do not need to update the screen 60 times per second. You can set the display link’s frameInterval property to the number of actual frames that go by before your method is called. For example, if the frame interval is set to 3, your app is called every third frame, or roughly 20 frames per second.
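Continuing the sketch above:

displayLink.frameInterval = 3; // called every third refresh: roughly 20 fps on a 60 Hz display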
Important: For best results, choose a frame rate your app can consistently achieve. A smooth, consistent
frame rate produces a more pleasant user experience than a frame rate that varies erratically.
Rendering a Frame
Figure 4-3 (page 38) shows the steps an OpenGL ES app should take on iOS to render and present a frame.
These steps include many hints to improve performance in your app.
Clear Buffers
At the start of every frame, erase the contents of all framebuffer attachments whose contents from previous frames are not needed to draw the next frame. Call the glClear function, passing in a bit mask with all of the
buffers to clear, as shown in Listing 4-2.
glBindFramebuffer(GL_FRAMEBUFFER, framebuffer);
glClear(GL_DEPTH_BUFFER_BIT | GL_COLOR_BUFFER_BIT);
Using glClear hints to OpenGL ES that the existing contents of a renderbuffer or texture can be discarded,
avoiding costly operations to load the previous contents into memory.
Resolve Multisampling
If your app uses multisampling to improve image quality, your app must resolve the pixels before they are
presented to the user. Multisampling is covered in detail in “Using Multisampling to Improve Image
Quality” (page 40).
Discard Unneeded Renderbuffers
At this stage in the rendering loop, your app has submitted all of its drawing commands for the frame. While
your app needs the color renderbuffer to display to the screen, it probably does not need the depth buffer’s
contents. Listing 4-3 discards the contents of the depth buffer.
const GLenum discards[] = {GL_DEPTH_ATTACHMENT};
glBindFramebuffer(GL_FRAMEBUFFER, framebuffer);
glDiscardFramebufferEXT(GL_FRAMEBUFFER, 1, discards);
Present Results to Core Animation
At the end of every frame, your app presents the color renderbuffer’s contents to Core Animation by binding the renderbuffer and calling the context’s presentRenderbuffer: method.
glBindRenderbuffer(GL_RENDERBUFFER, colorRenderbuffer);
[context presentRenderbuffer:GL_RENDERBUFFER];
By default, you must assume that the contents of the renderbuffer are discarded after your app presents the
renderbuffer. This means that every time your app presents a frame, it must completely re-create the frame’s
contents when it renders a new frame. The code above always erases the color buffer for this reason.
If your app wants to preserve the contents of the color renderbuffer between frames, add the
kEAGLDrawablePropertyRetainedBacking key to the dictionary stored in the drawableProperties
property of the CAEAGLLayer object, and remove the GL_COLOR_BUFFER_BIT constant from the earlier
glClear function call. Retained backing may require iOS to allocate additional memory to preserve the buffer’s
contents, which may reduce your app’s performance.
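A minimal sketch of enabling retained backing (the layer variable is illustrative):

eaglLayer.drawableProperties = @{
    kEAGLDrawablePropertyRetainedBacking : @YES,
    kEAGLDrawablePropertyColorFormat : kEAGLColorFormatRGBA8
};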
Using Multisampling to Improve Image Quality
Figure 4-4 shows how multisampling works. Instead of creating one framebuffer object, your app creates two.
The multisampling buffer contains all attachments necessary to render your content (typically color and depth
buffers). The resolve buffer contains only the attachments necessary to display a rendered image to the user
(typically a color renderbuffer, but possibly a texture), created using the appropriate procedure from “Creating
a Framebuffer Object” (page 32). The multisample renderbuffers are allocated using the same dimensions as
the resolve framebuffer, but each includes an additional parameter that specifies the number of samples to
store for each pixel. Your app performs all of its rendering to the multisampling buffer and then generates the
final antialiased image by resolving those samples into the resolve buffer.
Listing 4-5 shows the code to create the multisampling buffer. This code uses the width and height of the
previously created buffer. It calls the glRenderbufferStorageMultisampleAPPLE function to create
multisampled storage for the renderbuffer.
Listing 4-5  Creating the multisample buffer

glGenFramebuffers(1, &sampleFramebuffer);
glBindFramebuffer(GL_FRAMEBUFFER, sampleFramebuffer);

glGenRenderbuffers(1, &sampleColorRenderbuffer);
glBindRenderbuffer(GL_RENDERBUFFER, sampleColorRenderbuffer);
glRenderbufferStorageMultisampleAPPLE(GL_RENDERBUFFER, 4, GL_RGBA8_OES, width, height);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_RENDERBUFFER, sampleColorRenderbuffer);

glGenRenderbuffers(1, &sampleDepthRenderbuffer);
glBindRenderbuffer(GL_RENDERBUFFER, sampleDepthRenderbuffer);
glRenderbufferStorageMultisampleAPPLE(GL_RENDERBUFFER, 4, GL_DEPTH_COMPONENT16, width, height);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, sampleDepthRenderbuffer);

if (glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE)
    NSLog(@"Failed to make complete framebuffer object %x", glCheckFramebufferStatus(GL_FRAMEBUFFER));
Using Multisampling to Improve Image Quality
Here are the steps to modify your rendering code to support multisampling:
1. During the Clear Buffers step, you clear the multisampling framebuffer’s contents.
glBindFramebuffer(GL_FRAMEBUFFER, sampleFramebuffer);
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
2. After submitting your drawing commands, you resolve the contents from the multisampling buffer into
the resolve buffer. The samples stored for each pixel are combined into a single sample in the resolve
buffer.
glBindFramebuffer(GL_DRAW_FRAMEBUFFER_APPLE, resolveFrameBuffer);
glBindFramebuffer(GL_READ_FRAMEBUFFER_APPLE, sampleFramebuffer);
glResolveMultisampleFramebufferAPPLE();
3. In the Discard step, you can discard both renderbuffers attached to the multisample framebuffer. This is
because the contents you plan to present are stored in the resolve framebuffer.
const GLenum discards[] = {GL_COLOR_ATTACHMENT0, GL_DEPTH_ATTACHMENT};
glDiscardFramebufferEXT(GL_READ_FRAMEBUFFER_APPLE, 2, discards);
4. In the Present Results step, you present the color renderbuffer attached to the resolve framebuffer.
glBindRenderbuffer(GL_RENDERBUFFER, colorRenderbuffer);
[context presentRenderbuffer:GL_RENDERBUFFER];
Multisampling is not free; additional memory is required to store the additional samples, and resolving the
samples into the resolve framebuffer takes time. If you add multisampling to your app, always test your app’s
performance to ensure that it remains acceptable.
Note: The above code assumes an OpenGL ES 1.1 or 2.0 context. Multisampling is part of the core
OpenGL ES 3.0 API, but the functions are different. See the specification for details.
Multitasking, High Resolution, and Other iOS Features
Many aspects of working with OpenGL ES are platform neutral, but some details of working with OpenGL ES
on iOS bear special consideration. In particular, an iOS app using OpenGL ES must handle multitasking correctly
or risk being terminated when it moves to the background. You should also consider display resolution and
other device features when developing OpenGL ES content for iOS devices.
Implementing a Multitasking-Aware OpenGL ES App
An OpenGL ES app must perform additional work when it is moved into the background. If an app handles these tasks improperly, it may be terminated by iOS. Also, an app may want to free OpenGL ES resources
so that those resources are made available to the foreground app.
If you use a GLKit view and view controller, and only submit OpenGL ES commands during your drawing
method, your app automatically behaves correctly when it moves to the background. The GLKViewController
class, by default, pauses its animation timer when your app becomes inactive, ensuring that your drawing
method is not called.
If you do not use GLKit views or view controllers or if you submit OpenGL ES commands outside a GLKView
drawing method, you must take the following steps to ensure that your app is not terminated in the background:
1. In your app delegate’s applicationWillResignActive: method, your app should stop its animation
timer (if any), place itself into a known good state, and then call the glFinish function.
2. In your app delegate’s applicationDidEnterBackground: method, your app may want to delete
some of its OpenGL ES objects to make memory and resources available to the foreground app. Call the
glFinish function to ensure that the resources are removed immediately.
3. After your app exits its applicationDidEnterBackground: method, it must not make any new
OpenGL ES calls. If it makes an OpenGL ES call, it is terminated by iOS.
4. In your app’s applicationWillEnterForeground: method, re-create any objects and restart your
animation timer.
To summarize, your app needs to call the glFinish function to ensure that all previously submitted commands
are drained from the command buffer and are executed by OpenGL ES. After it moves into the background,
you must avoid all use of OpenGL ES until it moves back into the foreground.
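A minimal sketch of steps 1 and 2 (the displayLink property and teardown method are illustrative):

- (void)applicationWillResignActive:(UIApplication *)application
{
    self.displayLink.paused = YES; // stop the animation timer
    glFinish();                    // drain submitted commands
}

- (void)applicationDidEnterBackground:(UIApplication *)application
{
    [self deleteTransientGLObjects]; // hypothetical method that frees framebuffers and other transient objects
    glFinish();
}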
Your goal should be to design your app to be a good citizen: This means keeping the time it takes to move to
the foreground as short as possible while also reducing its memory footprint while it is in the background.
Easy targets are the framebuffers your app allocates to hold rendering results. When your app is in the
background, it is not visible to the user and may not render any new content using OpenGL ES. That means
the memory consumed by your app’s framebuffers is allocated, but is not useful. Also, the contents of the
framebuffers are transitory; most apps re-create the contents of the framebuffer every time they render a new
frame. This makes renderbuffers a memory-intensive resource that can be easily re-created, becoming a good
candidate for an object that can be disposed of when moving into the background.
If you use a GLKit view and view controller, the GLKViewController class automatically disposes of its
associated view’s framebuffers when your app moves into the background. If you manually create framebuffers
for other uses, you should dispose of them when your app moves to the background. In either case, you should
also consider what other transitory resources your app can dispose of at that time.
Supporting High-Resolution Displays
If you present OpenGL ES content using a Core Animation layer, its scale factor is set to 1.0 by default. To
draw at the full resolution of a Retina display, you should change the scale factor of the CAEAGLLayer object
to match the screen’s scale factor.
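A minimal sketch, assuming the view is backed by a CAEAGLLayer:

CAEAGLLayer *eaglLayer = (CAEAGLLayer *)self.view.layer;
eaglLayer.contentsScale = [UIScreen mainScreen].scale;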
When supporting devices with high resolution displays, you should adjust the model and texture assets of
your app accordingly. When running on a high-resolution device, you might want to choose more detailed
models and textures to render a better image. Conversely, on a standard-resolution device, you can use smaller
models and textures.
Important: Many OpenGL ES API calls express dimensions in screen pixels. If you use a scale factor greater
than 1.0, you should adjust dimensions accordingly when using the glScissor, glBlitFramebuffer,
glLineWidth, or glPointSize functions or the gl_PointSize shader variable.
An important factor when determining how to support high-resolution displays is performance. The doubling
of scale factor on a Retina display quadruples the number of pixels, causing the GPU to process four times as
many fragments. If your app performs many per-fragment calculations, the increase in pixels may reduce the
frame rate. If you find that your app runs significantly slower at a higher scale factor, consider one of the
following options:
● Optimize your fragment shader’s performance using the performance-tuning guidelines found in this
document.
● Implement a simpler algorithm in your fragment shader. By doing so, you are reducing the quality of
individual pixels to render the overall image at a higher resolution.
● Use a fractional scale factor between 1.0 and the screen’s scale factor. A scale factor of 1.5 provides
better quality than a scale factor of 1.0 but needs to fill fewer pixels than an image scaled to 2.0.
Supporting Multiple Interface Orientations
By default, the GLKViewController and GLKView classes handle orientation changes automatically: When
the user rotates the device to a supported orientation, the system animates the orientation change and changes
the size of the view controller’s view. When its size changes, a GLKView object adjusts the size of its framebuffer
and viewport accordingly. If you need to respond to this change, implement the viewWillLayoutSubviews
or viewDidLayoutSubviews method in your GLKViewController subclass, or implement the
layoutSubviews method if you’re using a custom GLKView subclass.
If you draw OpenGL ES content using a Core Animation layer, your app should still include a view controller
to manage user interface orientation.
Presenting OpenGL ES Content on External Displays
The procedure for drawing on an external display is almost identical to that for drawing on the main screen:
1. Create a window on the external display by following the steps in Multiple Display Programming Guide
for iOS.
2. Add to the window the appropriate view or view controller objects for your rendering strategy.
● If rendering with GLKit, set up instances of GLKViewController and GLKView (or your custom
subclasses) and add them to the window using its rootViewController property.
● If rendering to a Core Animation layer, add the view containing your layer as a subview of the window.
To use an animation loop for rendering, create a display link object optimized for the external display
by retrieving the screen property of the window and calling its
displayLinkWithTarget:selector: method.
OpenGL ES Design Guidelines
Now that you’ve mastered the basics of using OpenGL ES in an iOS app, use the information in this chapter to
help you design your app’s rendering engine for better performance. This chapter introduces key concepts of
renderer design; later chapters expand on this information with specific best practices and performance
techniques.
Achieving great performance requires carefully managing the overhead of communicating with the graphics hardware. A well-designed app reduces the
frequency of calls it makes to OpenGL ES, uses hardware-appropriate data formats to minimize translation
costs, and carefully manages the flow of data between itself and OpenGL ES.
How to Visualize OpenGL ES
Use the pipeline as a mental model to identify what work your app performs to generate a new frame. Your
renderer design consists of writing shader programs to handle the vertex and fragment stages of the pipeline,
organizing the vertex and texture data that you feed into these programs, and configuring the OpenGL ES
state machine that drives fixed-function stages of the pipeline.
Individual stages in the graphics pipeline can calculate their results simultaneously—for example, your app
might prepare new primitives while separate portions of the graphics hardware perform vertex and fragment
calculations on previously submitted geometry. However, later stages depend on the output of earlier stages.
If any pipeline stage performs too much work or performs too slowly, other pipeline stages sit idle until the
slowest stage completes its work. A well-designed app balances the work performed by each pipeline stage
according to graphics hardware capabilities.
OpenGL ES Versions and Renderer Architecture
Important: When you tune your app’s performance, the first step is usually to determine which stage it is
bottlenecked in, and why.
OpenGL ES 3.0
OpenGL ES 3.0 is new in iOS 7. Your app can use features introduced in OpenGL ES 3.0 to implement advanced
graphics programming techniques—previously available only on desktop-class hardware and game
consoles—for faster graphics performance and compelling visual effects.
Some key features of OpenGL ES 3.0 are highlighted below. For a complete overview, see the OpenGL ES 3.0
Specification in the OpenGL ES API Registry.
For more details, see “Adopting OpenGL ES Shading Language version 3.0” (page 114) and the OpenGL ES
Shading Language 3.0 Specification in the OpenGL ES API Registry.
Multiple Render Targets
With multiple render targets, a fragment shader writes to several framebuffer attachments at once. This feature enables the use of advanced rendering algorithms such as deferred shading, in which your app
first renders to a set of textures to store geometry data, then performs one or more shading passes that read
from those textures and perform lighting calculations to output a final image. Because this approach
precomputes the inputs to lighting calculations, the incremental performance cost for adding larger numbers
of lights to a scene is much smaller. Deferred shading algorithms require multiple render target support, as
shown in Figure 6-3, to achieve reasonable performance. Otherwise, rendering to multiple textures requires a
separate drawing pass for each texture.
You set up multiple render targets with an addition to the process described in “Creating a Framebuffer
Object” (page 32). Instead of creating a single color attachment for a framebuffer, you create several. Then,
call the glDrawBuffers function to specify which framebuffer attachments to use in rendering, as shown in
Listing 6-1.
Listing 6-1  Setting up multiple render targets

glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, _colorTexture, 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT1, GL_TEXTURE_2D, _positionTexture, 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT2, GL_TEXTURE_2D, _normalTexture, 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D, _depthTexture, 0);

const GLenum targets[] = {GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1, GL_COLOR_ATTACHMENT2};
glDrawBuffers(3, targets);
When your app issues drawing commands, your fragment shader determines what color (or non-color data)
is output for each pixel in each render target. Listing 6-2 shows a basic fragment shader that renders to multiple
targets by assigning to fragment output variables whose locations match those set in Listing 6-1.
Listing 6-2  Fragment shader with outputs to multiple render targets

#version 300 es

in mediump vec4 position;
in mediump vec3 normal;

layout(location = 0) out lowp vec4 colorData;
layout(location = 1) out mediump vec4 positionData;
layout(location = 2) out mediump vec3 normalData;

void main()
{
    colorData = vec4(1.0); // shading calculation not shown
    positionData = position;
    normalData = normalize(normal);
}
Multiple render targets can also be useful for other advanced graphics techniques, such as real-time reflections,
screen-space ambient occlusion, and volumetric lighting.
Transform Feedback
Graphics hardware uses a highly parallelized architecture optimized for vector processing. You can make better
use of this hardware with the new transform feedback feature, which lets you capture output from a vertex
shader into a buffer object in GPU memory. You can capture data from one rendering pass to use in another,
or disable parts of the graphics pipeline and use transform feedback for general-purpose computation.
One technique that benefits from transform feedback is animated particle effects. A general architecture for
rendering a particle system is illustrated in Figure 6-4. First, the app sets up the initial state of the particle
simulation. Then, for each frame rendered, the app runs a step of its simulation, updating the position,
orientation, and velocity of each simulated particle, and then draws visual assets representing the current state
of the particles.
Traditionally, apps implementing particle systems run their simulations on the CPU, storing the results of the simulation in a vertex buffer to be used in rendering particle art. However, transferring the contents of the vertex buffer to GPU memory is time-consuming.
With transform feedback, you can design your rendering engine to solve this problem more efficiently. Figure
6-5 shows an overview of how your app might configure the OpenGL ES graphics pipeline to implement a
particle system animation. Because OpenGL ES represents each particle and its state as a vertex, the GPU’s
vertex shader stage can run the simulation for several particles at once. Because the vertex buffer containing
particle state data is reused between frames, the expensive process of transferring that data to GPU memory
only happens once, at initialization time.
1. At initialization time, create a vertex buffer and fill it with data containing the initial state of all particles
in the simulation.
2. Implement your particle simulation in a GLSL vertex shader program, and run it each frame by drawing
the contents of the vertex buffer containing particle position data.
● To render with transform feedback enabled, call the glBeginTransformFeedback function. (Call
glEndTransformFeedback() before resuming normal drawing.)
● Use the glTransformFeedbackVaryings function to specify which shader outputs should be
captured by transform feedback, and use the glBindBufferBase or glBindBufferRange function
and GL_TRANSFORM_FEEDBACK_BUFFER buffer type to specify the buffer they will be captured into.
● Disable rasterization (and subsequent stages of the pipeline) by calling
glEnable(GL_RASTERIZER_DISCARD).
3. To render the simulation results for display, use the vertex buffer containing particle positions as an input to a second drawing pass, with rasterization (and the rest of the pipeline) once again enabled and using
vertex and fragment shaders appropriate for rendering your app’s visual content.
4. On the next frame, use the vertex buffer output by the last frame’s simulation step as input to the next
simulation step.
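A minimal sketch of the transform feedback setup described in step 2 (program, buffer, and varying names are illustrative):

// Specify which vertex shader outputs to capture; takes effect at link time
const GLchar *varyings[] = {"outPosition", "outVelocity"};
glTransformFeedbackVaryings(program, 2, varyings, GL_SEPARATE_ATTRIBS);
glLinkProgram(program);

// Bind destination buffers, then run the simulation pass without rasterization
glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, positionBuffer);
glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 1, velocityBuffer);
glEnable(GL_RASTERIZER_DISCARD);
glBeginTransformFeedback(GL_POINTS);
glDrawArrays(GL_POINTS, 0, particleCount);
glEndTransformFeedback();
glDisable(GL_RASTERIZER_DISCARD);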
Other graphics programming techniques that can benefit from transform feedback include skeletal animation
(also known as skinning) and ray marching.
OpenGL ES 2.0
OpenGL ES 2.0 provides a flexible graphics pipeline with programmable shaders, and is available on all current
iOS devices. Many features formally introduced in the OpenGL ES 3.0 specification are available to iOS devices
through OpenGL ES 2.0 extensions, so you can implement many advanced graphics programming techniques
while remaining compatible with most devices.
OpenGL ES 1.1
OpenGL ES 1.1 provides only a basic fixed-function graphics pipeline. iOS supports OpenGL ES 1.1 primarily
for backward compatibility. If you are maintaining an OpenGL ES 1.1 app, consider updating your code for
newer OpenGL ES versions.
The GLKit framework can assist you in transitioning from the OpenGL ES 1.1 fixed-function pipeline to later
versions. For details, read “Using GLKit to Develop Your Renderer” (page 31).
Designing a High-Performance OpenGL ES App
Figure 6-6 suggests a process flow for an app that uses OpenGL ES to perform animation to the display.
When the app launches, the first thing it does is initialize resources that it does not intend to change over the
lifetime of the app. Ideally, the app encapsulates those resources into OpenGL ES objects. The goal is to create
any object that can remain unchanged for the runtime of the app (or even a portion of the app’s lifetime, such
as the duration of a level in a game), trading increased initialization time for better rendering performance.
Complex commands or state changes should be replaced with OpenGL ES objects that can be used with a
single function call. For example, configuring the fixed-function pipeline can take dozens of function calls.
Instead, compile a graphics shader at initialization time, and switch to it at runtime with a single function call.
OpenGL ES objects that are expensive to create or modify should almost always be created as static objects.
The rendering loop processes all of the items you intend to render to the OpenGL ES context, then presents
the results to the display. In an animated scene, some data is updated for every frame. In the inner rendering
loop shown in Figure 6-6, the app alternates between updating rendering resources (creating or modifying
OpenGL ES objects in the process) and submitting drawing commands that use those resources. The goal of
this inner loop is to balance the workload so that the CPU and GPU are working in parallel, preventing the app
and OpenGL ES from accessing the same resources simultaneously. On iOS, modifying an OpenGL ES object
can be expensive when the modification is not performed at the start or the end of a frame.
An important goal for this inner loop is to avoid copying data back from OpenGL ES to the app. Copying results
from the GPU to the CPU can be very slow. If the copied data is also used later as part of the process of rendering
the current frame, as shown in the middle rendering loop, your app blocks until all previously submitted
drawing commands are completed.
After the app submits all drawing commands needed in the frame, it presents the results to the screen. A
non-interactive app would copy the final image to app memory for further processing.
Finally, when your app is ready to quit, or when it finishes with a major task, it frees OpenGL ES objects to
make additional resources available, either for itself or for other apps.
The rest of this chapter provides useful OpenGL ES programming techniques to implement the features of this
rendering loop. Later chapters demonstrate how to apply these general techniques to specific areas of
OpenGL ES programming.
● “Avoid Synchronizing and Flushing Operations” (page 57)
● “Avoid Querying OpenGL ES State” (page 58)
● “Use OpenGL ES to Manage Your Resources” (page 59)
● “Use Double Buffering to Avoid Resource Conflicts” (page 59)
● “Be Mindful of OpenGL ES State” (page 61)
● “Encapsulate State with OpenGL ES Objects” (page 61)
Avoid Synchronizing and Flushing Operations
The following situations require OpenGL ES to submit the command buffer to the hardware for execution:
● The function glFlush sends the command buffer to the graphics hardware. It blocks until commands
are submitted to the hardware but does not wait for the commands to finish executing.
● The function glFinish flushes the command buffer and then waits for all previously submitted commands
to finish executing on the graphics hardware.
● Functions that retrieve framebuffer content (such as glReadPixels) also wait for submitted commands
to complete.
● The command buffer is full.
Avoid Querying OpenGL ES State
When errors occur, OpenGL ES sets an error flag. These and other errors appear in the OpenGL ES Frame Debugger in Xcode or the OpenGL ES Analyzer in Instruments. You should use those tools instead of the glGetError function, which degrades performance if called frequently. Other queries, such as glCheckFramebufferStatus, glGetProgramInfoLog, and glValidateProgram, are also primarily useful during development and debugging, so you should avoid calling them in a release build of your app.
Use OpenGL ES to Manage Your Resources
Use Double Buffering to Avoid Resource Conflicts
If your app and OpenGL ES are both working with the same object, one of them must wait for the other, leaving processing power idle. To solve this problem, your app could perform additional work between changing the object and drawing
with it. But, if your app does not have additional work it can perform, it should explicitly create two identically
sized objects; while one participant reads an object, the other participant modifies the other. Figure 6-8
illustrates the double-buffered approach. While the GPU operates on one texture, the CPU modifies the other.
After the initial startup, neither the CPU nor the GPU sits idle. Although shown for textures, this solution works for
almost any type of OpenGL ES object.
Double buffering is sufficient for most apps, but it requires that both participants finish processing commands
in roughly the same time. To avoid blocking, you can add more buffers; this implements a traditional
producer-consumer model. If the producer finishes before the consumer finishes processing commands, it
takes an idle buffer and continues to process commands. In this situation, the producer idles only if the consumer
falls badly behind.
Double and triple buffering trade additional memory consumption for keeping the pipeline from stalling. The additional use of memory may cause pressure on other parts of your app. On an iOS device, memory can be scarce; your design may need to balance using more memory against other app optimizations.
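As a rough sketch of the double-buffered approach (the texture names and helper functions here are hypothetical):

GLuint _textures[2];   // two identically sized textures created at startup
int _currentIndex = 0;

void RenderFrame(void)
{
    // The CPU modifies one texture while the GPU reads from the other.
    UpdateTexturePixels(_textures[_currentIndex]);           // hypothetical CPU-side update
    glBindTexture(GL_TEXTURE_2D, _textures[1 - _currentIndex]);
    DrawSceneWithBoundTexture();                             // hypothetical draw routine
    _currentIndex = 1 - _currentIndex;                       // swap roles for the next frame
}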
Be Mindful of OpenGL ES State
Don't set a state that's already set. Once a feature is enabled, it does not need to be enabled again. For instance,
if you call a glUniform function with the same arguments more than once, OpenGL ES may not check to see
if the same uniform state is already set. It simply updates the state value even if that value is identical to the
current value.
Avoid setting a state more than necessary by using dedicated setup or shutdown routines rather than putting
such calls in a drawing loop. Setup and shutdown routines are also useful for turning on and off features that
achieve a specific visual effect—for example, when drawing a wire-frame outline around a textured polygon.
The iOS implementation of OpenGL ES can cache some of the configuration data it needs for efficient switching
between states, but the initial configuration for each unique state set takes longer. For consistent performance,
you can “prewarm” each state set you plan to use during a setup routine:
1. Enable a state configuration or shader you plan to use.
2. Draw a trivial number of vertices using that state configuration.
3. Flush the OpenGL ES context so that drawing during this prewarm phase is not displayed.
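A minimal sketch of such a prewarming pass (the program and vertex array names are hypothetical):

// Prewarm one state set: bind it, draw a trivial number of vertices, and flush.
glUseProgram(_myShaderProgram);               // hypothetical shader program
glBindVertexArrayOES(_myTrivialVertexArray);  // hypothetical VAO containing a few vertices
glDrawArrays(GL_TRIANGLES, 0, 3);
glFlush();  // submit the prewarm commands; the renderbuffer is never presented
// Repeat for each state configuration or shader the app plans to use.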
Tuning Your OpenGL ES App
The performance of OpenGL ES apps in iOS differs from that of OpenGL in OS X or other desktop operating systems. Although iOS-based devices are powerful computing devices, they do not have the memory or CPU power that desktop or laptop computers possess. Embedded GPUs are optimized for lower memory and power usage, using algorithms different from those a typical desktop or laptop GPU might use. Rendering your graphics data inefficiently can result in a poor frame rate or dramatically reduce the battery life of an iOS-based device.
Later chapters describe many techniques to improve your app’s performance; this chapter covers overall
strategies. Unless otherwise labeled, the advice in this chapter pertains to all versions of OpenGL ES.
Debug and Profile Your App with Xcode and Instruments
You can also configure Xcode to stop program execution when an OpenGL ES error is encountered. (See Adding
an OpenGL ES Error Breakpoint.)
Figure 7-1 Xcode Frame Debugger before and after adding debug marker groups
When you have a sequence of drawing commands that represent a single meaningful operation—for example,
drawing a game character—you can use a marker to group them for debugging. Listing 7-1 shows how to
group the texture, program, vertex array, and draw calls for a single element of a scene. First, it calls the
glPushGroupMarkerEXT function to provide a meaningful name, then it issues a group of OpenGL ES
commands. Finally, it closes the group with a call to the glPopGroupMarkerEXT function.
Listing 7-1 Using a debug marker to annotate drawing commands
glPushGroupMarkerEXT(0, "Draw Spaceship");
glBindTexture(GL_TEXTURE_2D, _spaceshipTexture);
glUseProgram(_diffuseShading);
glBindVertexArrayOES(_spaceshipMesh);
glDrawElements(GL_TRIANGLE_STRIP, 256, GL_UNSIGNED_SHORT, 0); // illustrative draw parameters
glPopGroupMarkerEXT();
You can use multiple nested markers to create a hierarchy of meaningful groups in a complex scene. When
you use the GLKView class to draw OpenGL ES content, it automatically creates a “Rendering” group containing
all commands in your drawing method. Any markers you create are nested within this group.
Labels provide meaningful names for OpenGL ES objects, such as textures, shader programs, and vertex array
objects. Call the glLabelObjectEXT function to give an object a name to be shown when debugging and
profiling. Listing 7-2 illustrates using this function to label a vertex array object. If you use the
GLKTextureLoader class to load texture data, it automatically labels the OpenGL ES texture objects it creates
with their filenames.
Listing 7-2 Using a debug label to annotate an OpenGL ES object
glGenVertexArraysOES(1, &_spaceshipMesh);
glBindVertexArrayOES(_spaceshipMesh);
glLabelObjectEXT(GL_VERTEX_ARRAY_OBJECT_EXT, _spaceshipMesh, 0, "Spaceship");
General Performance Recommendations
Even when your data changes, it is not necessary to render frames at the speed the hardware processes commands. A slower but fixed frame rate often appears smoother to the user than a fast but variable frame rate. A fixed frame rate of 30 frames per second is sufficient for most animation and helps reduce power consumption.
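If you use GLKit, for example, you can request a fixed frame rate through the GLKViewController class (a one-line sketch; viewController is assumed to be your app's GLKViewController instance):

viewController.preferredFramesPerSecond = 30; // render at a fixed 30 FPS to reduce power consumption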
If your app is written for OpenGL ES 2.0 or later, do not create a single shader with lots of switches and conditionals that performs every task your app needs to render the scene. Instead, compile multiple shader programs that each perform a specific, focused task.
If your app uses OpenGL ES 1.1, disable any fixed-function operations that are not necessary to render the scene. For example, if your app does not require lighting or blending, disable those functions. Similarly, if your app draws only 2D models, it should disable fog and depth testing.
Use Tile-Based Deferred Rendering Efficiently
Because tile memory is part of the GPU hardware, parts of the rendering process such as depth testing and
blending are much more efficient—in both time and energy usage—than on a traditional stream-based GPU
architecture. Because this architecture processes all vertices for an entire scene at once, the GPU can perform
hidden surface removal before fragments are processed. Pixels that are not visible are discarded without
sampling textures or performing fragment processing, significantly reducing the calculations that the GPU
must perform to render the tile.
Some rendering strategies that are useful on a traditional stream-based renderer have high performance costs on iOS graphics hardware. Following the guidelines below can help your app perform well on TBDR hardware.
When the GPU begins rendering a tile, it must first transfer that tile's portion of the framebuffer from shared memory into tile memory. This transfer, called a logical buffer load, has a performance cost. Similarly, when the GPU finishes rendering a tile, it must write the tile's pixel data back to shared memory. This transfer, called a logical buffer store, also has a performance cost. At least one such transfer is necessary for
each frame drawn—the color renderbuffer displayed on the screen must be transferred to shared memory so
it can be presented by Core Animation. Other framebuffer attachments used in your rendering algorithm (for
example, depth, stencil, and multisampling buffers) need not be preserved, because their contents will be
recreated on the next frame drawn. OpenGL ES automatically stores these buffers to shared memory—incurring
a performance cost—unless you explicitly invalidate them. To invalidate a buffer, use the
glInvalidateFramebuffer command in OpenGL ES 3.0 or the glDiscardFramebufferEXT command
in OpenGL ES 1.1 or 2.0. (For more details, see “Discard Unneeded Renderbuffers” (page 39).) When you use
the basic drawing cycle provided by GLKView class, it automatically invalidates any drawable depth, stencil,
or multisampling buffers it creates.
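If you manage framebuffers yourself, invalidating attachments is a short sequence of calls. A sketch for OpenGL ES 3.0 (framebuffer is assumed to be your app's framebuffer object name):

// After drawing, discard the depth and stencil attachments so they are not stored.
const GLenum discards[] = { GL_DEPTH_ATTACHMENT, GL_STENCIL_ATTACHMENT };
glBindFramebuffer(GL_FRAMEBUFFER, framebuffer);
glInvalidateFramebuffer(GL_FRAMEBUFFER, 2, discards);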
Logical buffer store and load operations also occur if you switch rendering destinations. If you render to a
texture, then to a view’s framebuffer, then to the same texture again, the texture’s contents must be repeatedly
transferred between shared memory and the GPU. Batch your drawing operations so that all drawing to a
rendering destination is done together. When you do switch framebuffers (using the glBindFramebuffer
or glFramebufferTexture2D function or bindDrawable method), invalidate unneeded framebuffer
attachments to avoid causing a logical buffer store.
The GPU cannot perform hidden surface removal when blending or alpha testing is enabled, or if a fragment
shader uses the discard instruction or writes to the gl_FragDepth output variable. In these cases, the GPU
cannot decide the visibility of a fragment using the depth buffer, so it must run the fragment shaders for all
primitives covering each pixel, greatly increasing the time and energy required to render a frame. To avoid
this performance cost, minimize your use of blending, discard instructions, and depth writes.
If you cannot avoid blending, alpha testing, or discard instructions, consider the following strategies for reducing
their performance impact:
● Sort objects by opacity. Draw opaque objects first. Next draw objects requiring a shader using the discard
operation (or alpha testing in OpenGL ES 1.1). Finally, draw alpha-blended objects.
● Trim objects requiring blending or discard instructions to reduce the number of fragments processed.
For example, instead of drawing a square to render a 2D sprite texture containing mostly empty space,
draw a polygon that more closely approximates the shape of the image, as shown in Figure 7-2. The
performance cost of additional vertex processing is much less than that of running fragment shaders
whose results will be unused.
● Use the discard instruction as early as possible in your fragment shader to avoid performing calculations
whose results are unused.
● Instead of using alpha testing or discard instructions to kill pixels, use alpha blending with alpha set to
zero. The color framebuffer is not modified, but the graphics hardware can still use any Z-buffer
optimizations it performs. This does change the value stored in the depth buffer and so may require
back-to-front sorting of the transparent primitives.
● If your performance is limited by unavoidable discard operations, consider a “Z-Prepass” rendering strategy.
Render your scene once with a simple fragment shader containing only your discard logic (avoiding
expensive lighting calculations) to fill the depth buffer. Then, render your scene again using the GL_EQUAL
depth test function and your lighting shaders. Though multipass rendering normally incurs a performance
penalty, this approach can yield better performance than a single-pass render that involves a large number
of discard operations.
To avoid these performance penalties, organize your sequence of OpenGL ES calls so that all drawing commands for each rendering target are performed together.
Minimize the Number of Drawing Commands
Each drawing command your app submits incurs CPU overhead as OpenGL ES prepares the command for the graphics hardware. To reduce this overhead, look for ways to consolidate your rendering into fewer draw calls. Useful strategies include:
● Merging multiple primitives into a single triangle strip, as described in “Use Triangle Strips to Batch Vertex
Data” (page 78). For best results, consolidate primitives that are drawn in close spatial proximity. Large,
sprawling models are more difficult for your app to efficiently cull when they are not visible in the frame.
● Creating texture atlases to draw multiple primitives using different portions of the same texture image,
as described in “Combine Textures into Texture Atlases” (page 92).
● Using instanced drawing to render many similar objects, as described below.
Vertex data that is reused multiple times is a prime candidate for instanced drawing. For example, the code in Listing 7-3 draws an object at multiple positions within a scene. However, the many glUniform and glDrawArrays calls add CPU overhead, reducing performance.
Listing 7-3 Drawing a mesh at several positions without instancing
for (x = 0; x < 10; x++) {
    for (y = 0; y < 10; y++) {
        glUniform4fv(uniformPositionOffset, 1, positionOffsets[x][y]);
        glDrawArrays(GL_TRIANGLES, 0, numVertices);
    }
}
Adopting instanced drawing requires two steps: first, replace loops like the above with a single call to
glDrawArraysInstanced or glDrawElementsInstanced. These calls are otherwise identical to
glDrawArrays or glDrawElements, but with an additional parameter indicating the number of instances
to draw (100 for the example in Listing 7-3). Second, choose and implement one of the two strategies OpenGL ES
provides for using per-instance information in your vertex shader.
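For the loop in Listing 7-3, the first step reduces to a single call; the instance count of 100 matches the 10 x 10 grid:

glDrawArraysInstanced(GL_TRIANGLES, 0, numVertices, 100); // draws all 100 instances in one call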
With the shader instance ID strategy, your vertex shader derives or looks up per-instance information. Each
time the vertex shader runs, its gl_InstanceID built-in variable contains a number identifying the instance
currently being drawn. Use this number to calculate a position offset, color, or other per-instance variation in
shader code, or to look up per-instance information in a uniform array or other bulk storage. For example,
Listing 7-4 uses this technique to draw 100 instances of a mesh positioned in a 10 x 10 grid.
Listing 7-4 OpenGL ES 3.0 vertex shader using gl_InstanceID to compute per-instance information
#version 300 es

in vec4 position;
uniform mat4 modelViewProjectionMatrix;

void main()
{
    // Compute an offset for this instance from its place in a 10 x 10 grid.
    // (The grid spacing used here is illustrative.)
    float xOffset = float(gl_InstanceID % 10) * 0.5 - 2.5;
    float yOffset = float(gl_InstanceID / 10) * 0.5 - 2.5;
    vec4 offset = vec4(xOffset, yOffset, 0.0, 0.0);
    gl_Position = modelViewProjectionMatrix * (position + offset);
}
With the instanced arrays strategy, you store per-instance information in a vertex array attribute. Your vertex
shader can then access that attribute to make use of per-instance information. Call the
glVertexAttribDivisor function to specify how that attribute advances as OpenGL ES draws each instance.
Listing 7-5 demonstrates setting up a vertex array for instanced drawing, and Listing 7-6 shows the corresponding
shader.
Listing 7-5 Setting up an instanced vertex attribute
#define kMyInstanceDataAttrib 5

glGenBuffers(1, &_instBuffer);
glBindBuffer(GL_ARRAY_BUFFER, _instBuffer);
glBufferData(GL_ARRAY_BUFFER, sizeof(instData), instData, GL_STATIC_DRAW); // per-instance data (not shown)
glEnableVertexAttribArray(kMyInstanceDataAttrib);
glVertexAttribPointer(kMyInstanceDataAttrib, 2, GL_FLOAT, GL_FALSE, 0, 0); // two floats per instance (illustrative)
glVertexAttribDivisor(kMyInstanceDataAttrib, 1); // advance this attribute once per instance
Listing 7-6 Vertex shader reading the instanced attribute
#version 300 es

layout(location = 5) in vec2 inOffset; // matches kMyInstanceDataAttrib
in vec4 position;
uniform mat4 modelViewProjectionMatrix;

void main()
{
    vec4 offset = vec4(inOffset, 0.0, 0.0);
    gl_Position = modelViewProjectionMatrix * (position + offset);
}

Instanced drawing is available in the core OpenGL ES 3.0 API and in OpenGL ES 2.0 through the EXT_draw_instanced and EXT_instanced_arrays extensions.
Minimize OpenGL ES Memory Usage
The virtual memory system in iOS does not use a swap file. When a low-memory condition is detected, instead of writing volatile pages to disk, the virtual memory system frees up volatile memory to give your running app the memory it needs. Your app should strive to use as little memory as possible and be prepared to dispose
of objects that are not essential to your app. Responding to low-memory conditions is covered in detail in the
iOS App Programming Guide .
Be Aware of Core Animation Compositing Performance
For the absolute best performance, your app should rely solely on OpenGL ES to render your content. Size the
view that holds your OpenGL ES content to match the screen, make sure its opaque property is set to YES
(the default for GLKView objects) and that no other views or Core Animation layers are visible.
If you render into a Core Animation layer that is composited on top of other layers, making your CAEAGLLayer
object opaque reduces—but doesn’t eliminate—the performance cost. If your CAEAGLLayer object is blended
on top of layers underneath it in the layer hierarchy, the renderbuffer’s color data must be in a premultiplied
alpha format to be composited correctly by Core Animation. Blending OpenGL ES content on top of other
content has a severe performance penalty.
Best Practices for Working with Vertex Data
To render a frame using OpenGL ES, your app configures the graphics pipeline and submits graphics primitives to be drawn. In some apps, all primitives are drawn using the same pipeline configuration; other apps may
render different elements of the frame using different techniques. But no matter which primitives you use in
your app or how the pipeline is configured, your app provides vertices to OpenGL ES. This chapter provides a
refresher on vertex data and follows it with targeted advice for how to efficiently process vertex data.
A vertex consists of one or more attributes, such as the position, the color, the normal, or texture coordinates.
An OpenGL ES 2.0 or 3.0 app is free to define its own attributes; each attribute in the vertex data corresponds
to an attribute variable that acts as an input to the vertex shader. An OpenGL 1.1 app uses attributes defined
by the fixed-function pipeline.
You define an attribute as a vector consisting of one to four components. All components in the attribute
share a common data type. For example, a color might be defined as four GLubyte components (red, green,
blue, alpha). When an attribute is loaded into a shader variable, any components that are not provided in the
app data are filled in with default values by OpenGL ES. The last component is filled with 1, and other unspecified
components are filled with 0, as illustrated in Figure 8-1.
Your app may configure an attribute to be a constant, which means the same value is used for all vertices submitted as part of a draw command, or an array, which means that each vertex includes a value for that attribute.
When your app calls a function in OpenGL ES to draw a set of vertices, the vertex data is copied from your app
to the graphics hardware. The graphics hardware then acts on the vertex data, processing each vertex in the shader, assembling primitives, and rasterizing them into the framebuffer. One advantage of OpenGL ES is
that it standardizes on a single set of functions to submit vertex data to OpenGL ES, removing older and less
efficient mechanisms that were provided by OpenGL.
Apps that must submit a large number of primitives to render a frame need to carefully manage their vertex
data and how they provide it to OpenGL ES. The practices described in this chapter can be summarized in a
few basic principles:
● Reduce the size of your vertex data.
● Reduce the pre-processing that must occur before OpenGL ES can transfer the vertex data to the graphics
hardware.
● Reduce the time spent copying vertex data to the graphics hardware.
● Reduce computations performed for each vertex.
Simplify Your Models
You can reduce the complexity of a model by using some of the following techniques:
● Provide multiple versions of your model at different levels of detail, and choose an appropriate model at
runtime based on the distance of the object from the camera and the dimensions of the display.
● Use textures to eliminate the need for some vertex information. For example, a bump map can be used
to add detail to a model without adding more vertex data.
● Some models add vertices to improve lighting details or rendering quality. This is usually done when
values are calculated for each vertex and interpolated across the triangle during the rasterization stage.
For example, if you directed a spotlight at the center of a triangle, its effect might go unnoticed because
the brightest part of the spotlight is not directed at a vertex. By adding vertices, you provide additional
interpolant points, at the cost of increasing the size of your vertex data and the calculations performed
on the model. Instead of adding vertices, consider moving those calculations into the fragment stage of the pipeline:
● If your app uses OpenGL ES 2.0 or later, your app typically performs the calculation in the vertex shader and assigns the result to a varying variable; the varying value is interpolated by the graphics hardware and passed to the fragment shader as an input. Instead, assign the calculation's inputs to varying variables and perform the calculation in the fragment shader. Doing this changes the cost of the calculation from a per-vertex cost to a per-fragment cost, reducing pressure on the vertex stage and adding pressure on the fragment stage of the pipeline. Do this when your app is bottlenecked on vertex processing, the calculation is inexpensive, and the vertex count can be significantly reduced by the change.
● If your app uses OpenGL ES 1.1, you can perform per-fragment lighting using DOT3 lighting. You do
this by adding a bump map texture to hold normal information and applying the bump map using
a texture combine operation with the GL_DOT3_RGB mode.
If you specify smaller components, be sure you reorder your vertex format to avoid misaligning your vertex
data. See “Avoid Misaligned Vertex Data” (page 77).
Use Interleaved Vertex Data
Figure 8-2 Interleaved memory structures place all data for a vertex together in memory
An exception to this rule is when your app needs to update some vertex data at a rate different from the rest
of the vertex data, or if some data can be shared between two or more models. In either case, you may want
to separate the attribute data into two or more structures.
Figure 8-3 Use multiple vertex structures when some data is used differently
Avoid Misaligned Vertex Data
In Figure 8-4 (page 78), the position and normal data are each defined as three short integers, for a total of
six bytes. The normal data begins at offset 6, which is a multiple of the native size (2 bytes), but is not a multiple
of 4 bytes. If this vertex data were submitted to iOS, iOS would have to take additional time to copy and align
the data before passing it to the hardware. To fix this, explicitly add two bytes of padding after each attribute.
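For example, a vertex structure along these lines keeps each attribute aligned on a 4-byte boundary (the field names are illustrative):

typedef struct _paddedVertex {
    GLshort position[3]; // 6 bytes
    GLshort padding1;    // 2 bytes of padding; the normal starts at offset 8
    GLshort normal[3];   // 6 bytes
    GLshort padding2;    // 2 bytes of padding; the structure size is a multiple of 4
} paddedVertex;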
Use Triangle Strips to Batch Vertex Data
Sometimes, your app can combine more than one triangle strip into a single larger triangle strip. All of the strips must share the same rendering requirements. This means:
● You must use the same shader to draw all of the triangle strips.
● You must be able to render all of the triangle strips without changing any OpenGL state.
● The triangle strips must share the same vertex attributes.
To merge two triangle strips, duplicate the last vertex of the first strip and the first vertex of the second strip,
as shown in Figure 8-6. When this strip is submitted to OpenGL ES, triangles DEE, EEF, EFF, and FFG are
considered degenerate and not processed or rasterized.
For best performance, your models should be submitted as a single indexed triangle strip. To avoid specifying
data for the same vertex multiple times in the vertex buffer, use a separate index buffer and draw the triangle
strip using the glDrawElements function (or the glDrawElementsInstanced or glDrawRangeElements
functions, if appropriate).
In OpenGL ES 3.0, you can use the primitive restart feature to merge triangle strips without using degenerate
triangles. When this feature is enabled, OpenGL ES treats the largest possible value in an index buffer as a
command to finish one triangle strip and start another. Listing 8-1 demonstrates this approach.
Listing 8-1 Using primitive restart in OpenGL ES 3.0
// Prepare index buffer data (not shown: vertex buffer data, loading vertex and index buffers)
GLushort indexData[11] = {
    0, 1, 2, 3, 4,   // first triangle strip
    0xFFFF,          // primitive restart index (the largest possible GLushort value)
    5, 6, 7, 8, 9,   // second triangle strip
};

// Draw both strips with one call; drawing restarts at the restart index.
glEnable(GL_PRIMITIVE_RESTART_FIXED_INDEX);
glDrawElements(GL_TRIANGLE_STRIP, 11, GL_UNSIGNED_SHORT, 0);
Where possible, sort vertex and index data so triangles that share common vertices are drawn reasonably close
to each other in the triangle strip. Graphics hardware often caches recent vertex calculations to avoid
recalculating a vertex.
Use Vertex Buffer Objects to Manage Copying Vertex Data
Listing 8-2 Submitting vertex data to a shader program
typedef struct _vertexStruct
{
    GLfloat position[2];
    GLubyte color[4];
} vertexStruct;

void DrawModel()
{
    const vertexStruct vertices[] = {...}; // vertex data (not shown)
    const GLubyte indices[] = {...};       // index data (not shown)

    glVertexAttribPointer(GLKVertexAttribPosition, 2, GL_FLOAT, GL_FALSE,
        sizeof(vertexStruct), &vertices[0].position);
    glEnableVertexAttribArray(GLKVertexAttribPosition);
    glVertexAttribPointer(GLKVertexAttribColor, 4, GL_UNSIGNED_BYTE, GL_TRUE,
        sizeof(vertexStruct), &vertices[0].color);
    glEnableVertexAttribArray(GLKVertexAttribColor);

    glDrawElements(GL_TRIANGLE_STRIP, sizeof(indices)/sizeof(GLubyte),
        GL_UNSIGNED_BYTE, indices);
}
This code works, but is inefficient. Each time DrawModel is called, the index and vertex data are copied to
OpenGL ES, and transferred to the graphics hardware. If the vertex data does not change between invocations,
these unnecessary copies can impact performance. To avoid unnecessary copies, your app should store its
vertex data in a vertex buffer object (VBO). Because OpenGL ES owns the vertex buffer object’s memory, it
can store the buffer in memory that is more accessible to the graphics hardware, or pre-process the data into
the preferred format for the graphics hardware.
Note: When using vertex array objects in OpenGL ES 3.0, you must also use vertex buffer objects.
Listing 8-3 creates a pair of vertex buffer objects, one to hold the vertex data and the second for the strip’s
indices. In each case, the code generates a new object, binds it to be the current buffer, and fills the buffer.
CreateVertexBuffers would be called when the app is initialized.
Listing 8-3 Creating vertex buffer objects
GLuint vertexBuffer;
GLuint indexBuffer;

void CreateVertexBuffers()
{
    glGenBuffers(1, &vertexBuffer);
    glBindBuffer(GL_ARRAY_BUFFER, vertexBuffer);
    glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), vertices, GL_STATIC_DRAW);

    glGenBuffers(1, &indexBuffer);
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, indexBuffer);
    glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(indices), indices, GL_STATIC_DRAW);
}
Listing 8-4 modifies Listing 8-2 (page 80) to use the vertex buffer objects. The key difference in Listing 8-4 is
that the parameters to the glVertexAttribPointer functions no longer point to the vertex arrays. Instead,
each is an offset into the vertex buffer object.
Listing 8-4 Drawing with vertex buffer objects
void DrawModelUsingVertexBuffers()
{
    glBindBuffer(GL_ARRAY_BUFFER, vertexBuffer);
    glVertexAttribPointer(GLKVertexAttribPosition, 2, GL_FLOAT, GL_FALSE,
        sizeof(vertexStruct), (void *)offsetof(vertexStruct, position));
    glEnableVertexAttribArray(GLKVertexAttribPosition);
    glVertexAttribPointer(GLKVertexAttribColor, 4, GL_UNSIGNED_BYTE, GL_TRUE,
        sizeof(vertexStruct), (void *)offsetof(vertexStruct, color));
    glEnableVertexAttribArray(GLKVertexAttribColor);

    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, indexBuffer);
    glDrawElements(GL_TRIANGLE_STRIP, sizeof(indices)/sizeof(GLubyte),
        GL_UNSIGNED_BYTE, (void*)0);
}
In iOS, GL_DYNAMIC_DRAW and GL_STREAM_DRAW are equivalent. You can use the glBufferSubData function
to update buffer contents, but doing so incurs a performance penalty because it flushes the command buffer
and waits for all commands to complete. Double or triple buffering can reduce this performance cost somewhat.
(See “Use Double Buffering to Avoid Resource Conflicts” (page 59).) For better performance, use the
glMapBufferRange function in OpenGL ES 3.0 or the corresponding function provided by the
EXT_map_buffer_range extension in OpenGL ES 2.0 or 1.1.
If different attributes inside your vertex format require different usage patterns, split the vertex data into
multiple structures and allocate a separate vertex buffer object for each collection of attributes that share
common usage characteristics. Listing 8-5 modifies the previous example to use a separate buffer to hold the
color data. By allocating the color buffer using the GL_DYNAMIC_DRAW hint, OpenGL ES can allocate that buffer
so that your app maintains reasonable performance.
Listing 8-5 Using separate buffers for static and dynamic vertex data
typedef struct _vertexStatic
{
    GLfloat position[2];
} vertexStatic;

typedef struct _vertexDynamic
{
    GLubyte color[4];
} vertexDynamic;

// Separate buffers for static and dynamic data.
GLuint staticBuffer;
GLuint dynamicBuffer;
GLuint indexBuffer;

const vertexStatic staticVertexData[] = {...};
vertexDynamic dynamicVertexData[] = {...};
const GLubyte indices[] = {...};

void CreateBuffers()
{
    // Static position data
    glGenBuffers(1, &staticBuffer);
    glBindBuffer(GL_ARRAY_BUFFER, staticBuffer);
    glBufferData(GL_ARRAY_BUFFER, sizeof(staticVertexData), staticVertexData, GL_STATIC_DRAW);

    // Dynamic color data
    // While not shown here, the expectation is that the data in this buffer changes between frames.
    glGenBuffers(1, &dynamicBuffer);
    glBindBuffer(GL_ARRAY_BUFFER, dynamicBuffer);
    glBufferData(GL_ARRAY_BUFFER, sizeof(dynamicVertexData), dynamicVertexData, GL_DYNAMIC_DRAW);

    // Static index data
    glGenBuffers(1, &indexBuffer);
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, indexBuffer);
    glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(indices), indices, GL_STATIC_DRAW);
}

void DrawModelUsingMultipleVertexBuffers()
{
    glBindBuffer(GL_ARRAY_BUFFER, staticBuffer);
    glVertexAttribPointer(GLKVertexAttribPosition, 2, GL_FLOAT, GL_FALSE,
        sizeof(vertexStatic), (void *)offsetof(vertexStatic, position));
    glEnableVertexAttribArray(GLKVertexAttribPosition);

    glBindBuffer(GL_ARRAY_BUFFER, dynamicBuffer);
    glVertexAttribPointer(GLKVertexAttribColor, 4, GL_UNSIGNED_BYTE, GL_TRUE,
        sizeof(vertexDynamic), (void *)offsetof(vertexDynamic, color));
    glEnableVertexAttribArray(GLKVertexAttribColor);

    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, indexBuffer);
    glDrawElements(GL_TRIANGLE_STRIP, sizeof(indices)/sizeof(GLubyte),
        GL_UNSIGNED_BYTE, (void*)0);
}
Consolidate Vertex Array State Changes Using Vertex Array Objects
Figure 8-7 shows an example configuration with two vertex array objects. Each configuration is independent
of the other; each vertex array object can reference a different set of vertex attributes, which can be stored in
the same vertex buffer object or split across several vertex buffer objects.
Listing 8-6 provides the code used to configure the first vertex array object shown above. It generates an identifier
for the new vertex array object and then binds the vertex array object to the context. After this, it makes the
same calls to configure vertex attributes as it would if the code were not using vertex array objects. The
configuration is stored to the bound vertex array object instead of to the context.
Listing 8-6 Configuring a vertex array object
void ConfigureVertexArrayObject()
{
    // Create and bind the vertex array object.
    glGenVertexArrays(1, &vao1);
    glBindVertexArray(vao1);

    // Configure the attributes in the VAO (component counts and types are illustrative).
    glBindBuffer(GL_ARRAY_BUFFER, vbo1);
    glVertexAttribPointer(GLKVertexAttribPosition, 3, GL_FLOAT, GL_FALSE,
        sizeof(staticFmt), (void *)offsetof(staticFmt, position));
    glEnableVertexAttribArray(GLKVertexAttribPosition);
    glVertexAttribPointer(GLKVertexAttribTexCoord0, 2, GL_FLOAT, GL_FALSE,
        sizeof(staticFmt), (void *)offsetof(staticFmt, texcoord));
    glEnableVertexAttribArray(GLKVertexAttribTexCoord0);
    glVertexAttribPointer(GLKVertexAttribNormal, 3, GL_FLOAT, GL_FALSE,
        sizeof(staticFmt), (void *)offsetof(staticFmt, normal));
    glEnableVertexAttribArray(GLKVertexAttribNormal);

    glBindBuffer(GL_ARRAY_BUFFER, vbo2);
    glVertexAttribPointer(GLKVertexAttribColor, 4, GL_UNSIGNED_BYTE, GL_TRUE,
        sizeof(dynamicFmt), (void *)offsetof(dynamicFmt, color));
    glEnableVertexAttribArray(GLKVertexAttribColor);

    // Bind back to the default state.
    glBindBuffer(GL_ARRAY_BUFFER, 0);
    glBindVertexArray(0);
}
To draw, the code binds the vertex array object and then submits drawing commands as before.
Note: In OpenGL ES 3.0, client storage of vertex array data is not allowed—vertex array objects
must use vertex buffer objects.
For best performance, your app should configure each vertex array object once, and never change it at runtime.
If you need to change a vertex array object in every frame, create multiple vertex array objects instead. For
example, an app that uses double buffering might configure one set of vertex array objects for odd-numbered frames, and a second set for even-numbered frames. Each set of vertex array objects would point at the vertex
buffer objects used to render that frame. When a vertex array object’s configuration does not change, OpenGL ES
can cache information about the vertex format and improve how it processes those vertex attributes.
Map Buffers into Client Memory for Fast Updates
For example, you may want to both modify a vertex buffer and draw its contents on each pass through a high
frame rate rendering loop. A draw command from the last frame rendered may still be utilizing the GPU while
the CPU is attempting to access buffer memory to prepare for drawing the next frame—causing the buffer
update call to block further CPU work until the GPU is done. You can improve performance in such scenarios
by manually synchronizing CPU and GPU access to a buffer.
The glMapBufferRange function provides a more efficient way to dynamically update vertex buffers. (This
function is available as core API in OpenGL ES 3.0 and through the EXT_map_buffer_range extension in
OpenGL ES 1.1 and 2.0.) Use this function to retrieve a pointer to a region of OpenGL ES memory, which you
can then use to write new data. The glMapBufferRange function allows mapping of any subrange of the
buffer’s data storage into client memory. It also supports hints that allow for asynchronous buffer modification
when you use the function together with an OpenGL sync object, as shown in Listing 8-7.
Listing 8-7 Dynamically updating a vertex buffer with manual synchronization
GLsync fence;
GLboolean UpdateAndDraw(GLuint vbo, GLuint offset, GLuint length, void *data)
{
    GLboolean success;

    // Bind and map the buffer range to be updated.
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    void *old_data = glMapBufferRange(GL_ARRAY_BUFFER, offset, length,
        GL_MAP_WRITE_BIT | GL_MAP_FLUSH_EXPLICIT_BIT |
        GL_MAP_UNSYNCHRONIZED_BIT );

    // Wait for the fence (set on the previous pass) before modifying the buffer.
    glClientWaitSync(fence, GL_SYNC_FLUSH_COMMANDS_BIT,
        GL_TIMEOUT_IGNORED);
    glDeleteSync(fence);

    // Modify the buffer, flush, and unmap.
    memcpy(old_data, data, length);
    glFlushMappedBufferRange(GL_ARRAY_BUFFER, offset, length);
    success = glUnmapBuffer(GL_ARRAY_BUFFER);

    // Issue other OpenGL ES commands that use other ranges of the VBO's data.

    // Issue draw commands that use this range of the VBO's data.
    DrawMyVBO(vbo);

    // Create a fence for the next pass to check before modifying the buffer again.
    fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);

    return success;
}
The UpdateAndDraw function in this example uses the glFenceSync function to establish a synchronization
point, or fence, immediately after submitting drawing commands that use a particular buffer object. It then
uses the glClientWaitSync function (on the next pass through the rendering loop) to check that synchronization
point before modifying the buffer object. If those drawing commands finish executing on the GPU before the
rendering loop comes back around, CPU execution does not block and the UpdateAndDraw function continues
to modify the buffer and draw the next frame. If the GPU has not finished executing those commands, the
glClientWaitSync function blocks further CPU execution until the GPU reaches the fence. By manually
placing synchronization points only around the sections of your code with potential resource conflicts, you
can minimize how long the CPU waits for the GPU.
Best Practices for Working with Texture Data
Texture data is often the largest portion of the data your app uses to render a frame; textures provide the detail
required to present great images to the user. To get the best possible performance out of your app, manage
your app’s textures carefully. To summarize the guidelines:
● Create your textures when your app is initialized, and never change them in the rendering loop.
● Reduce the amount of memory your textures use.
● Combine smaller textures into a larger texture atlas.
● Use mipmaps to reduce the bandwidth required to fetch texture data.
● Use multitexturing to perform texturing operations in a single pass.
After you create a texture, avoid changing it except at the beginning or end of a frame. Currently, all iOS devices
use a tile-based deferred renderer, making calls to the glTexSubImage and glCopyTexSubImage functions
particularly expensive. See “Tile-Based Deferred Rendering” in OpenGL ES Hardware Platform Guide for iOS for
more information.
Load Textures During Initialization
Note: A GLKTextureInfo object does not own the OpenGL ES texture object it describes. You
must call the glDeleteTextures function to dispose of texture objects when you are done using
them.
Listing 9-1 presents a typical strategy to load a new texture from a file and to bind and enable the texture for later use.
Listing 9-1 Loading a two-dimensional texture from a file
GLKTextureInfo *spriteTexture;
NSError *theError;

NSString *filePath = [[NSBundle mainBundle] pathForResource:@"Sprite" ofType:@"png"]; // 1
spriteTexture = [GLKTextureLoader textureWithContentsOfFile:filePath options:nil error:&theError]; // 2
glBindTexture(spriteTexture.target, spriteTexture.name); // 3
glEnable(spriteTexture.target); // 4
Here is what the code does, corresponding to the numbered steps in the listing:
1. Create a path to the image that contains the texture data. This path is passed as a parameter to the
GLKTextureLoader class method textureWithContentsOfFile:options:error:.
2. Load a new texture from the image file and store the texture information in a GLKTextureInfo object.
There are a variety of texture loading options available. For more information, see GLKTextureLoader Class
Reference .
3. Bind the texture to a context, using the appropriate properties of the GLKTextureInfo object as
parameters.
4. Enable use of the texture for drawing using the appropriate property of the GLKTextureInfo object as
a parameter.
The GLKTextureLoader class can also load cubemap textures in most common image formats. And, if your
app needs to load and create new textures while running, the GLKTextureLoader class also provides methods
for asynchronous texture loading. See GLKTextureLoader Class Reference for more information.
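A sketch of asynchronous loading (the context variable and file path are assumed to come from your app's setup code):

GLKTextureLoader *loader = [[GLKTextureLoader alloc]
    initWithSharegroup:myContext.sharegroup];
[loader textureWithContentsOfFile:filePath
                          options:nil
                            queue:NULL // NULL runs the completion handler on the main queue
                completionHandler:^(GLKTextureInfo *textureInfo, NSError *outError) {
                    // Use textureInfo.name and textureInfo.target once loading finishes.
                }];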
Reduce Texture Memory Usage
Compress Textures
Texture compression usually provides the best balance of memory savings and quality. OpenGL ES for iOS
supports multiple compressed texture formats.
All iOS devices support the PowerVR Texture Compression (PVRTC) format by implementing the GL_IMG_texture_compression_pvrtc extension. There are two levels of PVRTC compression, 4 bits per pixel and 2 bits per pixel, which offer an 8:1 and a 16:1 compression ratio over the uncompressed 32-bit texture format, respectively. A compressed PVRTC texture still provides a decent level of quality, particularly at the 4-bit level.
For more information on compressing textures into PVRTC format, see “Using texturetool to Compress
Textures” (page 131).
OpenGL ES 3.0 also supports the ETC2 and EAC compressed texture formats; however, PVRTC textures are
recommended on iOS devices.
Before shrinking your textures, first attempt to compress the texture or use a lower-precision color format. A texture compressed with the PVRTC format usually provides higher image quality than a shrunken texture—and it uses less memory too!
Combine Textures into Texture Atlases
Xcode 5 can automatically build texture atlases for you from a collection of images. For details on creating a
texture atlas, see Texture Atlas Help. This feature is provided primarily for developers using the Sprite Kit
framework, but any app can make use of the texture atlas files it produces. For each .atlas folder in your
project, Xcode creates a .atlasc folder in your app bundle, containing one or more compiled atlas images
and a property list (.plist) file. The property list file describes the individual images that make up the atlas and
their locations within the atlas image—you can use this information to calculate appropriate texture coordinates
for use in OpenGL ES drawing.
Use Mipmapping to Reduce Memory Bandwidth Usage
The GL_LINEAR_MIPMAP_LINEAR filter mode provides the best quality when texturing but requires additional
texels to be fetched from memory. Your app can trade some image quality for better performance by specifying
the GL_LINEAR_MIPMAP_NEAREST filter mode instead.
When combining mipmaps with texture atlases, use the TEXTURE_MAX_LEVEL parameter in OpenGL ES 3.0 to control how your textures are filtered. (This functionality is also available in OpenGL ES 1.1 and 2.0 through the APPLE_texture_max_level extension.)
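For example (a sketch for a currently bound 2D texture; the level count is illustrative):

// Trade a little image quality for performance with a cheaper minification filter.
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_NEAREST);
// Clamp the mipmap range (OpenGL ES 3.0 or the APPLE_texture_max_level extension).
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAX_LEVEL, 4);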
Use Multitexturing Instead of Multiple Passes
All OpenGL ES implementations on iOS support at least two texture units, and most devices support at least eight. Your app should use these texture units to perform as many steps as possible in your algorithm in each pass. You can retrieve the number of texture units available to your app by calling the glGetIntegerv function, passing in GL_MAX_TEXTURE_UNITS as the parameter.
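A minimal query (this is the OpenGL ES 1.1 constant named above; under OpenGL ES 2.0 and 3.0, GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS is the comparable limit):

GLint maxTextureUnits = 0;
glGetIntegerv(GL_MAX_TEXTURE_UNITS, &maxTextureUnits);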
Best Practices for Shaders
Shaders provide great flexibility, but they can also be a significant bottleneck if you perform too many calculations or perform them inefficiently.
Compile and Link Shaders During Initialization
Creating a shader program is an expensive operation compared to other OpenGL ES state changes; compile, link, and validate your programs when your app is initialized rather than in the rendering loop. Reading a shader's compile log is also expensive, so wrap such checks in conditional compilation so they run only in development builds, as in this sketch (the variable names are illustrative):
#ifdef DEBUG
GLint logLen = 0;
glGetShaderiv(shader, GL_INFO_LOG_LENGTH, &logLen);
if (logLen > 0) {
    GLchar *log = (GLchar *)malloc(logLen);
    glGetShaderInfoLog(shader, logLen, &logLen, log);
    NSLog(@"Shader compile log:\n%s", log);
    free(log);
}
#endif
Similarly, you should call the glValidateProgram function only in development builds. You can use this
function to find development errors such as failing to bind all texture units required by a shader program. But
because validating a program checks it against the entire OpenGL ES context state, it is an expensive operation.
Since the results of program validation are only meaningful during development, you should not call this
function in Release builds of your app.
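A sketch of development-only validation (program is assumed to be a linked program object):

#ifdef DEBUG
glValidateProgram(program);
GLint validated = GL_FALSE;
glGetProgramiv(program, GL_VALIDATE_STATUS, &validated);
if (validated == GL_FALSE) {
    // Read the program info log here to diagnose the failure.
}
#endif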
OpenGL ES 2.0 and 3.0 contexts on iOS support the EXT_separate_shader_objects extension. You can
use the functions provided by this extension to compile vertex and fragment shaders separately, and to mix
and match precompiled shader stages at render time using program pipeline objects. Additionally, this extension
provides a simplified interface for compiling and using shaders, shown in Listing 10-2.
Listing 10-2 Compiling and using separate shader programs
- (void)loadShaders
{
    const GLchar *vertexSourceText = " ... vertex shader GLSL source code ... ";
    const GLchar *fragmentSourceText = " ... fragment shader GLSL source code ... ";

    // Compile and link the separate vertex shader program, then read its uniform variable locations
    _vertexProgram = glCreateShaderProgramvEXT(GL_VERTEX_SHADER, 1, &vertexSourceText);
    _uniformModelViewProjectionMatrix = glGetUniformLocation(_vertexProgram, "modelViewProjectionMatrix");
    _uniformNormalMatrix = glGetUniformLocation(_vertexProgram, "normalMatrix");

    // Compile and link the separate fragment shader program (which uses no uniform variables)
    _fragmentProgram = glCreateShaderProgramvEXT(GL_FRAGMENT_SHADER, 1, &fragmentSourceText);

    // Construct a program pipeline object and configure it to use the shaders
    glGenProgramPipelinesEXT(1, &_ppo);
    glBindProgramPipelineEXT(_ppo);
    glUseProgramStagesEXT(_ppo, GL_VERTEX_SHADER_BIT_EXT, _vertexProgram);
    glUseProgramStagesEXT(_ppo, GL_FRAGMENT_SHADER_BIT_EXT, _fragmentProgram);
}

- (void)render
{
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    // Use the previously constructed program pipeline and set uniform contents in shader programs
    glBindProgramPipelineEXT(_ppo);
    glProgramUniformMatrix4fvEXT(_vertexProgram, _uniformModelViewProjectionMatrix, 1, 0, _modelViewProjectionMatrix.m);
    glProgramUniformMatrix3fvEXT(_vertexProgram, _uniformNormalMatrix, 1, 0, _normalMatrix.m);

    // Draw the scene (the draw parameters are illustrative).
    glBindVertexArrayOES(_vertexArray);
    glDrawArrays(GL_TRIANGLES, 0, _vertexCount);
}
Use Precision Hints
Important: The range limits defined by the precision hints are not enforced. You cannot assume your data
is clamped to this range.
Listing 10-3 defaults to high precision variables, but calculates the color output using low precision variables
because higher precision is not necessary.
Listing 10-3 Using low precision where higher precision is unnecessary
precision highp float; // Default precision for float and derived vector/matrix types.
uniform lowp sampler2D sampler; // Texture results are low precision.
varying lowp vec4 color;
varying vec2 texCoord; // Uses the default highp precision.

void main()
{
    gl_FragColor = color * texture2D(sampler, texCoord);
}
The actual precision of shader variables can vary between different iOS devices, as can the performance of
operations at each level of precision. Refer to the iOS Device Compatibility Reference for device-specific
considerations.
Perform Vector Calculations Lazily
Listing 10-4 multiplies a vector by one scalar and then multiplies the result by a second scalar (the declarations shown here are illustrative):
highp float f0, f1;
highp vec4 v0, v1;
v0 = (v1 * f0) * f1;

If the code in Listing 10-4 were executed on a vector processor, each multiplication would be executed in parallel across all four of the vector's components. However, because of the location of the parentheses, the same operation on a scalar processor would take eight multiplications, even though two of the three parameters are scalar values.
The same calculation can be performed more efficiently by shifting the parentheses as shown in Listing 10-5. In this example, the scalar values are multiplied together first, and the result is multiplied against the vector parameter; the entire operation can be calculated with five multiplications.
v0 = v1 * (f0 * f1);
Similarly, your app should always specify a write mask for a vector operation if it does not use all of the
components of the result. On a scalar processor, calculations for components not specified in the mask can be
skipped. Listing 10-6 runs twice as fast on a scalar processor because it specifies that only two components
are needed.
v2.xz = v0 * v1; // the write mask selects two components (reconstructed example)
Avoid Branching
Your app may perform best if you avoid branching entirely. For example, instead of creating a large shader
with many conditional options, create smaller shaders specialized for specific rendering tasks. There is a tradeoff
between reducing the number of branches in your shaders and increasing the number of shaders you create.
Test different options and choose the fastest solution.
Eliminate Loops
You can eliminate many loops by either unrolling the loop or using vectors to perform operations. For example, this code is very inefficient:
int i;
float f;
vec4 v;

for (i = 0; i < 4; i++)
    v[i] += f;

The same operation can be performed directly using a component-wise vector operation:
float f;
vec4 v;
v += f;
When you cannot eliminate a loop, it is preferred that the loop have a constant limit to avoid dynamic branches.
Be Aware of Dynamic Texture Lookups
A dependent texture read occurs when a fragment shader computes its texture coordinates rather than using coordinates passed unmodified into the shader; on some iOS graphics hardware, dependent texture reads delay the fetching of texel data and reduce performance. Listing 10-7 shows a fragment shader that calculates new texture coordinates. The calculation in this example can easily be performed in the vertex shader, instead. By moving the calculation to the vertex shader and directly using the vertex shader's computed texture coordinates, you avoid the dependent texture read.
Note: It may not seem obvious, but any calculation on the texture coordinates counts as a dependent
texture read. For example, packing multiple sets of texture coordinates into a single varying parameter
and using a swizzle command to extract the coordinates still causes a dependent texture read.
Listing 10-7 Fragment shader with a dependent texture read
varying vec2 vTexCoord;
uniform sampler2D textureSampler;

void main()
{
    // Calculating new texture coordinates here causes a dependent texture read.
    vec2 modifiedTexCoord = vec2(1.0 - vTexCoord.x, 1.0 - vTexCoord.y);
    gl_FragColor = texture2D(textureSampler, modifiedTexCoord);
}
Fetch Framebuffer Data for Programmable Blending
In iOS 6.0 and later, you can use the EXT_shader_framebuffer_fetch extension to implement programmable
blending and other effects. Instead of supplying a source color to be blended by OpenGL ES, your fragment
shader reads the contents of the destination framebuffer corresponding to the fragment being processed.
Your fragment shader can then use whatever algorithm you choose to produce an output color, as shown in
Figure 10-2.
This approach enables many advanced rendering techniques:
● Additional blending modes. By defining your own GLSL ES functions for combining source and destination
colors, you can implement blending modes not possible with the OpenGL ES fixed-function blending
stage. For example, Listing 10-8 (page 102) implements the Overlay and Difference blending modes found
in popular graphics software.
● Post-processing effects. After rendering a scene, you can draw a full-screen quad using a fragment shader
that reads the current fragment color and transforms it to produce an output color. The shader in Listing
10-9 (page 103) can be used with this technique to convert a scene to grayscale.
● Non-color fragment operations. Framebuffers may contain non-color data. For example, deferred shading
algorithms use multiple render targets to store depth and normal information. Your fragment shader can
read such data from one (or more) render targets and use them to produce an output color in another
render target.
These effects are possible without the framebuffer fetch extension—for example, grayscale conversion can be
done by rendering a scene into a texture, then drawing a full-screen quad using that texture and a fragment
shader that converts texel colors to grayscale. However, using this extension generally results in better
performance.
To enable this feature, your fragment shader must declare that it requires the
EXT_shader_framebuffer_fetch extension, as shown in Listing 10-8 and Listing 10-9. The shader code to
implement this feature differs between versions of the OpenGL ES Shading Language (GLSL ES).
Listing 10-8 Programmable blending in GLSL ES 1.0
#extension GL_EXT_shader_framebuffer_fetch : require

#define kBlendModeDifference 1
#define kBlendModeOverlay 2

uniform int blendMode;
varying lowp vec4 sourceColor;

void main()
{
    // Read the destination color previously written to the framebuffer.
    lowp vec4 destColor = gl_LastFragData[0];
    if (blendMode == kBlendModeDifference) {
        gl_FragColor.rgb = abs(sourceColor.rgb - destColor.rgb);
    } else { // kBlendModeOverlay, using the standard Overlay formula (illustrative)
        gl_FragColor.rgb = mix(2.0 * sourceColor.rgb * destColor.rgb,
            1.0 - 2.0 * (1.0 - sourceColor.rgb) * (1.0 - destColor.rgb),
            step(0.5, destColor.rgb));
    }
    gl_FragColor.a = sourceColor.a;
}
Listing 10-9 Grayscale post-processing in GLSL ES 3.0
#version 300 es
#extension GL_EXT_shader_framebuffer_fetch : require

// Declaring the output with the inout qualifier makes the previous framebuffer color readable.
layout(location = 0) inout lowp vec4 destColor;

void main()
{
    lowp float luminance = dot(vec3(0.3, 0.59, 0.11), destColor.rgb);
    destColor.rgb = vec3(luminance);
}
Use Textures for Larger Memory Buffers in Vertex Shaders
OpenGL ES implementations on iOS can also allow vertex shaders to sample textures (subject to the device limit described below). For example, a vertex shader can determine the height of a terrain vertex by sampling a height map texture, as in the following sketch (the attribute and uniform names are illustrative):
attribute vec2 xzPos;
uniform sampler2D heightMap;
uniform mat4 modelViewProjectionMatrix;

void main()
{
    // Use the vertex X and Z values to look up a Y value in the texture.
    vec4 position = texture2D(heightMap, xzPos);
    // Put the X and Z values into their places in the position vector.
    position.xz = xzPos;
    // Transform the position into clip space.
    gl_Position = modelViewProjectionMatrix * position;
}
You can also use uniform arrays and uniform buffer objects (in OpenGL ES 3.0) to provide bulk data to a vertex
shader, but vertex texture access offers several potential advantages. You can store much more data in a texture
than in either a uniform array or uniform buffer object, and you can use texture wrapping and filtering options
to interpolate the data stored in a texture. Additionally, you can render to a texture, taking advantage of the
GPU to produce data for use in a later vertex processing stage.
To determine whether vertex texture sampling is available on a device (and the number of texture units
available to vertex shaders), check the value of the MAX_VERTEX_TEXTURE_IMAGE_UNITS limit at run time.
(See “Verifying OpenGL ES Capabilities” (page 16).)
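A sketch of that capability check:

GLint maxVertexTextureUnits = 0;
glGetIntegerv(GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS, &maxVertexTextureUnits);
if (maxVertexTextureUnits == 0) {
    // Vertex texture sampling is unavailable on this device; use a fallback path.
}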
Concurrency and OpenGL ES
In computing, concurrency usually refers to executing tasks on more than one processor at the same time. By
performing work in parallel, tasks complete sooner, and apps become more responsive to the user. A
well-designed OpenGL ES app already exhibits a specific form of concurrency—concurrency between app
processing on the CPU and OpenGL ES processing on the GPU. Many techniques introduced in “OpenGL ES
Design Guidelines” (page 48) are aimed specifically at creating OpenGL ES apps that exhibit great CPU-GPU
parallelism. Designing a concurrent app means decomposing the work into subtasks and identifying which
tasks can safely operate in parallel and which tasks must be executed sequentially—that is, which tasks are
dependent on either resources used by other tasks or results returned from those tasks.
Each process in iOS consists of one or more threads. A thread is a stream of execution that runs code for the
process. Apple offers both traditional threads and a feature called Grand Central Dispatch (GCD). Using Grand
Central Dispatch, you can decompose a task into subtasks without manually managing threads. GCD allocates
threads based on the number of cores available on the device and automatically schedules tasks to those
threads.
At a higher level, Cocoa Touch offers NSOperation and NSOperationQueue to provide an Objective-C
abstraction for creating and scheduling units of work.
This chapter does not describe these technologies in detail. Before you consider how to add concurrency to
your OpenGL ES app, consult Concurrency Programming Guide . If you plan to manage threads manually, also
see Threading Programming Guide . Regardless of which technique you use, there are additional restrictions
when calling OpenGL ES on multithreaded systems. This chapter helps you understand when multithreading
improves your OpenGL ES app’s performance, the restrictions OpenGL ES places on multithreaded apps, and
common design strategies you might use to implement concurrency in an OpenGL ES app.
If your app is blocked waiting for the GPU, and has no work it can perform in parallel with its OpenGL ES
drawing, then it is not a good candidate for concurrency. If the CPU and GPU are both idle, then your OpenGL ES
needs are probably simple enough that no further tuning is needed.
OpenGL ES Restricts Each Context to a Single Thread
OpenGL ES is not reentrant. If you modify the same context from multiple threads simultaneously, the results are unpredictable. Your app might crash or it might render improperly. If for some reason you decide to set
more than one thread to target the same context, then you must synchronize threads by placing a mutex
around all OpenGL ES calls to the context. OpenGL ES commands that block—such as glFinish—do not
synchronize threads.
GCD and NSOperationQueue objects can execute your tasks on a thread of their choosing. They may create
a thread specifically for that task, or they may reuse an existing thread. But in either case, you cannot guarantee
which thread executes the task. For an OpenGL ES app, that means:
● Each task must set the context before executing any OpenGL ES commands.
● Two tasks that access the same context may never execute simultaneously.
● Each task should clear the thread’s context before exiting.
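A sketch of this discipline using GCD (myContext and queue are assumed to be created elsewhere, and no two tasks targeting the same context run simultaneously):

dispatch_async(queue, ^{
    // Set the context before executing any OpenGL ES commands...
    [EAGLContext setCurrentContext:myContext];
    // ... issue OpenGL ES commands here ...
    glFlush();
    // ...and clear the thread's context before exiting.
    [EAGLContext setCurrentContext:nil];
});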
Strategies for Implementing Concurrency in OpenGL ES Apps
Multithreaded OpenGL ES
Whenever your application calls an OpenGL ES function, OpenGL ES processes the parameters to put them in
a format that the hardware understands. The time required to process these commands varies depending on
whether the inputs are already in a hardware-friendly format, but there is always overhead in preparing
commands for the hardware.
If your application spends a lot of time performing calculations inside OpenGL ES, and you’ve already taken
steps to pick ideal data formats, your application might gain an additional benefit by enabling multithreading
for the OpenGL ES context. A multithreaded OpenGL ES context automatically creates a worker thread and
transfers some of its calculations to that thread. On a multicore device, enabling multithreading allows internal
OpenGL ES calculations performed on the CPU to act in parallel with your application, improving performance.
Synchronizing functions continue to block the calling thread.
To enable OpenGL ES multithreading, set the value of the multiThreaded property of your EAGLContext
object to YES.
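For example (context is assumed to be your app's EAGLContext; do this during initialization, not in the rendering loop):

context.multiThreaded = YES; // moves some OpenGL ES CPU-side work to a worker thread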
Note: Enabling or disabling multithreaded execution causes OpenGL ES to flush previous commands
and incurs the overhead of setting up the additional thread. Enable or disable multithreading in an
initialization function rather than in the rendering loop.
Enabling multithreading means OpenGL ES must copy parameters to transmit them to the worker thread. Because of this overhead, always test your application with and without multithreading enabled to determine whether it provides a substantial performance improvement. You can minimize this overhead by implementing your own strategy for OpenGL ES use in a multithreaded app, as described in the remainder of this chapter.
Perform OpenGL ES Computations in a Worker Task
The approach described in Figure 6-6 (page 56) alternates between updating OpenGL ES objects and executing rendering commands that use those objects. OpenGL ES renders on the GPU in parallel with your app’s updates running on the CPU. If the calculations performed on the CPU take more processing time than those on the GPU, then the GPU spends more time idle. In this situation, you may be able to take advantage of parallelism on systems with multiple CPUs. Split your OpenGL ES rendering code into separate calculation and processing tasks, and run them in parallel. One task produces data that is consumed by the second and submitted to OpenGL ES.
For best performance, avoid copying data between tasks. Rather than calculating the data in one task and
copying it into a vertex buffer object in the other, map the vertex buffer object in the setup code and hand
the pointer directly to the worker task.
If you can further decompose the modifications task into subtasks, you may see better benefits. For example,
assume two or more vertex buffer objects, each of which needs to be updated before submitting drawing
commands. Each can be recalculated independently of the others. In this scenario, the modifications to each
buffer becomes an operation, using an NSOperationQueue object to manage the work:
1. Set the current context.
2. Map the first buffer.
3. Create an NSOperation object whose task is to fill that buffer.
4. Queue that operation on the operation queue.
5. Perform steps 2 through 4 for the other buffers.
6. Call the operation queue’s waitUntilAllOperationsAreFinished method.
7. Unmap the buffers.
8. Execute your rendering commands.
Use Multiple OpenGL ES Contexts
The GLKTextureLoader class implements this strategy to provide asynchronous loading of texture data. (See
“Use the GLKit Framework to Load Texture Data” (page 89).)
Adopting OpenGL ES 3.0
OpenGL ES 3.0 is a superset of the OpenGL ES 2.0 specification, so adopting it in your app is easy. You can
continue to use your OpenGL ES 2.0 code while taking advantage of the higher resource limits available to
OpenGL ES 3.0 contexts on compatible devices, and add support for OpenGL ES 3.0–specific features where
it makes sense for your app’s design.
If you plan to make your app available for devices that do not support OpenGL ES 3.0, follow the procedure
in Listing 2-1 (page 20) to fall back to OpenGL ES 2.0 when necessary.
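That fallback amounts to attempting an OpenGL ES 3.0 context first, a sketch in the spirit of the Listing 2-1 procedure:

EAGLContext *context = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES3];
if (context == nil) {
    // The device does not support OpenGL ES 3.0; fall back to 2.0.
    context = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES2];
}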
1. Create your EAGLContext object using the kEAGLRenderingAPIOpenGLES3 API constant.
2. Include or import the OpenGL ES 3.0 API headers in source files that use OpenGL ES 3.0 API:
#import <OpenGLES/ES3/gl.h>
#import <OpenGLES/ES3/glext.h>
3. Update code that uses OpenGL ES 2.0 extensions incorporated into or changed by the OpenGL ES 3.0
specifications, as described in “Updating Extension Code” below.
4. (Optional.) You can use the same shader programs in both OpenGL ES 2.0 and 3.0. However, if you choose
to port shaders to GLSL ES 3.0 to use new features, see the caveats in “Adopting OpenGL ES Shading
Language version 3.0” (page 114).
5. Test your app on an OpenGL ES 3.0–compatible device to verify that it behaves correctly.
Updating Extension Code
● OpenGL ES 3.0 does not define float or half-float formats for LUMINANCE or LUMINANCE_ALPHA data. Use
the corresponding RED or RG formats instead.
● The vector returned by depth and depth/stencil texture samplers no longer repeats the depth value in its
first three components in OpenGL ES 3.0. Use only the first (.r) component in shader code that samples
such textures.
● The sRGB format is only valid when used for the internalformat parameter in OpenGL ES 3.0. Use
GL_RGB or GL_RGBA for the format parameter for sRGB textures.
Alternatively, replace calls to glTexImage functions with calls to the corresponding glTexStorage functions. Texture storage functions are available as core API in OpenGL ES 3.0, and through the EXT_texture_storage extension in OpenGL ES 1.1 and 2.0. These functions offer an additional benefit: using a glTexStorage function
completely specifies an immutable texture object in one call; it performs all consistency checks and memory
allocations immediately, guaranteeing that the texture object can never be incomplete due to missing mipmap
levels or inconsistent cube map faces.
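For example (a sketch; the format, size, and level count are illustrative):

// Allocate immutable storage for a 256 x 256 texture with five mipmap levels.
glTexStorage2D(GL_TEXTURE_2D, 5, GL_RGBA8, 256, 256);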
Discarding Framebuffers
The glInvalidateFramebuffer function in OpenGL ES 3.0 replaces the glDiscardFramebufferEXT
function provided by the EXT_discard_framebuffer extension. The parameters and behavior of both
functions are identical.
Using Multisampling
OpenGL ES 3.0 incorporates all features of the APPLE_framebuffer_multisample extension, except for
the glResolveMultisampleFramebufferAPPLE function. Instead, the glBlitFramebuffer function
provides this and other framebuffer copying options. To resolve a multisampling buffer, set the read and
draw framebuffers (as in “Using Multisampling to Improve Image Quality” (page 40)) and then use
glBlitFramebuffer to copy the entire read framebuffer into the entire draw framebuffer:
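A minimal sketch, assuming multisampleFramebuffer and resolveFramebuffer are framebuffer objects set up
as in that chapter's example and width and height are the renderbuffer dimensions:

glBindFramebuffer(GL_READ_FRAMEBUFFER, multisampleFramebuffer);
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, resolveFramebuffer);
// Resolve: copy and downsample the multisample color buffer into
// the single-sample draw framebuffer.
glBlitFramebuffer(0, 0, width, height, 0, 0, width, height,
                  GL_COLOR_BUFFER_BIT, GL_NEAREST);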
Most code written for OpenGL ES 2.0 extensions that are also present as OpenGL ES 3.0 extensions will work
in an OpenGL ES 3.0 context without changes. However, additional caveats apply to extensions which modify
the vertex and fragment shader language—for details, see the next section.
Adopting OpenGL ES Shading Language version 3.0
Some language conventions have changed between GLSL ES version 1.0 and 3.0. These changes make shader
source code more portable between OpenGL ES 3.0 and desktop OpenGL 3.3 or later, but they also require
minor changes to existing shader source code when porting to GLSL ES 3.0 (see the sketch after this list):
● The attribute and varying qualifiers are replaced in GLSL ES 3.0 by the keywords in and out. In a
vertex shader, use the in qualifier for vertex attributes and the out qualifier for varying outputs. In a
fragment shader, use the in qualifier for varying inputs.
● GLSL ES 3.0 removes the gl_FragData and gl_FragColor builtin fragment output variables. Instead,
you declare your own fragment output variables with the out qualifier.
● Texture sampling functions have been renamed in GLSL ES 3.0—all sampler types use the same texture
function name. For example, you can use the new texture function with either a sampler2D or
samplerCube parameter (replacing the texture2D and textureCube functions from GLSL ES 1.0).
● The features added to GLSL ES 1.0 by the EXT_shader_texture_lod, EXT_shadow_samplers, and
OES_standard_derivatives extensions are part of the core GLSL ES specification. When porting shaders
that use these features to GLSL ES 3.0, use the corresponding GLSL ES 3.0 functions.
● The EXT_shader_framebuffer_fetch extension works differently. GLSL ES 3.0 removes the
gl_FragData and gl_FragColor builtin fragment output variables in favor of requiring fragment outputs
to be declared in the shader. Correspondingly, the gl_LastFragData builtin variable is not present in
GLSL ES 3.0 fragment shaders. Instead, any fragment output variables you declare with the inout qualifier
contain previous fragment data when the shader runs. For more details, see “Fetch Framebuffer Data for
Programmable Blending” (page 101).
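A minimal sketch of a simple textured-quad shader pair ported to GLSL ES 3.0, illustrating the qualifier,
output-variable, and texture-function changes above (the variable names are illustrative):

#version 300 es
// Vertex shader
in vec4 position;     // GLSL ES 1.0: attribute vec4 position;
in vec2 texCoordIn;   // GLSL ES 1.0: attribute vec2 texCoordIn;
out vec2 texCoord;    // GLSL ES 1.0: varying vec2 texCoord;
void main() {
    texCoord = texCoordIn;
    gl_Position = position;
}

#version 300 es
// Fragment shader
precision mediump float;
uniform sampler2D diffuseTexture;
in vec2 texCoord;     // GLSL ES 1.0: varying vec2 texCoord;
out vec4 fragColor;   // replaces the gl_FragColor builtin
void main() {
    fragColor = texture(diffuseTexture, texCoord);  // GLSL ES 1.0: texture2D(...)
}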
For a complete overview of GLSL ES 3.0, see the OpenGL ES Shading Language 3.0 Specification, available from
the OpenGL ES API Registry.
Xcode OpenGL ES Tools Overview
Xcode tools for debugging, analyzing, and tuning OpenGL ES applications are useful during all stages of
development. The FPS Debug Gauge and GPU report summarize your app’s GPU performance every time you
run it from Xcode, so you can quickly spot performance issues while designing and building your renderer.
Once you’ve found a trouble spot, capture a frame and use Xcode’s OpenGL ES Frame Debugger interface to
pinpoint rendering problems and solve performance issues.
Effectively using the Xcode OpenGL ES features requires some familiarity with Xcode’s debugging interface.
For background information, read Xcode Overview.
Using the FPS Debug Gauge and GPU Report
Note: Some features of the FPS gauge and GPU report rely on a display link timer. If you do not use
the CADisplayLink or GLKViewController classes to animate your OpenGL ES displays, the
gauge and report cannot show performance relative to a target frame rate or provide accurate CPU
frame time information.
● Program Performance. The graph for a program shows the draw calls made using that program and the
rendering time contribution from each. Select a program in the list to view its shader source code in the
assistant editor, or click the arrow icon next to a draw call to select that call in the frame navigator (see
“Navigator Area” (page 121) below).
Note: The Program Performance view only appears when debugging on devices that support
OpenGL ES 3.0 (regardless of whether your app uses an OpenGL ES 3.0 or 2.0 context).
When tuning your app, you can use this graph to find opportunities for optimization. For example, if one
program takes 50% of the frame rendering time, you gain more performance by optimizing it than by
improving the speed of a program that accounts for only 10% of frame time. Though this view organizes
frame time by shader program, remember that improving your shader algorithms isn’t the only way to
optimize your app’s performance—for example, you can also reduce the number of draw calls that use a
costly shader program, or reduce the number of fragments processed by a slow fragment shader.
● Problems & Solutions. Only appears after Xcode analyzes a frame capture (see “Capturing and Analyzing
an OpenGL ES Frame” (page 118)), this area lists possible issues found during analysis and recommendations
for improving performance.
When you make changes to a GLSL shader program in a captured frame (see “Editing Shader Programs” (page
125) below), the Frame Time and Program Performance graphs expand to show both the baseline rendering
time of the frame as originally captured and the current rendering time using your edited shaders.
Note: The Capture OpenGL ES Frame button automatically appears only if your project links
against the OpenGL ES or Sprite Kit framework. You can choose whether it appears for other
projects by editing the active scheme. (See About the Scheme Editing Dialog.)
Capturing and Analyzing an OpenGL ES Frame
In addition to clicking the Capture OpenGL ES Frame button, you can trigger a frame capture in two other
ways:
● Breakpoint action. Choose Capture OpenGL ES Frame as an action for any breakpoint. When the debugger
reaches a breakpoint with this action, Xcode automatically captures a frame. (See Setting Breakpoint
Actions and Options.) If you use this action with an OpenGL ES Error breakpoint while developing your
app (see Adding an OpenGL ES Error Breakpoint), you can use the OpenGL ES Frame Debugger to investigate
the causes of OpenGL ES errors whenever they occur.
● OpenGL ES event marker. Programmatically trigger a frame capture by inserting an event marker in the
OpenGL ES command stream. The following command inserts such a marker:
glInsertEventMarkerEXT(0, "[Link]-frame")
When the OpenGL ES client reaches this marker, it finishes rendering the frame, then Xcode automatically
captures the entire sequence of commands used to render that frame.
After Xcode has captured the frame, it shows the OpenGL ES Frame Debugger interface. Use this interface to
inspect the sequence of OpenGL ES commands that render the frame and examine OpenGL ES resources, as
discussed in “Touring the OpenGL ES Frame Debugger” (page 120).
In addition, Xcode can perform an automated analysis of your app’s OpenGL ES usage to determine which
parts of your renderer and shader architecture can benefit most from performance optimizations. To use this
option, click the Analyze button at the top of the GPU report (shown at the top right in Figure B-1 (page 116)).
When you click the Analyze button, Xcode captures a frame (if one hasn’t been captured already), then runs
your rendering code through a series of experiments using the attached iOS device. For example, to see if your
rendering speed is limited by texture sizes, Xcode runs the captured sequence of OpenGL ES commands both
with the texture data your app submitted to the GPU and with a size-reduced texture set. After Xcode finishes
its analysis, the Problems & Solutions area of the GPU report lists any issues it found and suggestions for possible
performance improvements.
Figure B-4 Frame debugger examining shader program performance and analysis results
Navigator Area
In the OpenGL ES frame debugger interface, the debug navigator is replaced by the OpenGL ES frame navigator.
This navigator shows the OpenGL ES commands that render the captured frame, organized sequentially or
according to their associated shader program. Use the Frame View Options popup menu at the top of the
frame navigator to switch between view styles.
You can add structure to this list by using the glPushGroupMarkerEXT and glPopGroupMarkerEXT functions
to annotate groups of OpenGL ES commands—these groups appear as folders you can expand or collapse to
show more or less detail. (For details, see “Annotate Your OpenGL ES Code for Informative Debugging and
Profiling” (page 64).) You can also expand an OpenGL ES command to show a stack trace indicating where in
your application code the command was issued.
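For example (the label text is illustrative; passing 0 as the length treats the marker string as null-terminated):

glPushGroupMarkerEXT(0, "Draw the skybox");
// ... the OpenGL ES commands issued here appear as one folder ...
glPopGroupMarkerEXT();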
Use the context menu to choose whether to abbreviate command names and which commands, groups, and
warnings to show. Use the flag icon at the bottom of the navigator to switch between showing all OpenGL ES
commands and showing only those which draw into the framebuffer.
Clicking an OpenGL ES command in the list navigates to that point in the OpenGL ES command sequence,
affecting the contents of other areas of the frame debugger interface, as discussed below, and showing the
effects of the OpenGL ES calls up to that point on the attached device’s display.
Expand the listing for a program to see the time contribution from each shader in the program and each draw
call. Expand the listing for a draw call to show a stack trace indicating where in your application code that
command was issued.
Use the context menu to refine the display—you can choose whether programs are sorted by their time
contributions and whether timing information is displayed as a percentage of the total rendering time.
Clicking a program or shader shows the corresponding GLSL source code in the primary editor. Clicking an
OpenGL ES command navigates to that point in the frame capture sequence.
Note: The View Frame By Program option is only available when debugging on devices that support
OpenGL ES 3.0 (regardless of whether your app uses an OpenGL ES 3.0 or 2.0 context). On other
devices, the Frame View Options popup menu is disabled.
Editor Area
When working with a frame capture, you use the primary editor to preview the framebuffer being rendered
to, and the assistant editor to examine OpenGL ES resources and edit GLSL shader programs. By default, the
assistant editor shows a graphical overview of all resources currently owned by the OpenGL ES context, as
shown in Figure B-3 (page 120). Use the assistant editor’s jump bar to show only those resources bound for
use as of the call selected in the frame navigator, or to select an individual resource for further inspection. You
can also double-click a resource in the overview to inspect it. When you select a resource, the assistant editor
changes to a format suited for tasks appropriate to that resource’s type.
The editor shows a preview for each framebuffer attachment currently bound for drawing. For example, most
approaches to 3D rendering use a framebuffer with attachments for both color and depth.
Use the controls in the lower left of the editor to choose which framebuffer attachments are currently shown.
Clicking the info button, left of each framebuffer attachment’s name, shows a popover detailing the attachment’s
properties, as shown in Figure B-6. Click the settings button, right of the framebuffer attachment’s name, to
show a popover with controls that adjust the preview image. For example, you can use these controls to make
a certain range of Z values in a depth buffer more visible in its grayscale preview, as shown in Figure B-7.
Each framebuffer attachment preview also shows a green wireframe highlighting the effect of the current draw
call (as illustrated in Figure B-3 (page 120)). Use the context menu in a preview image to choose whether the
highlight appears in the preview or on the display of the attached device.
Editing Shader Programs
Each line of the shader source code is highlighted in the right margin with a bar representing its relative
contribution to rendering time. Use these bars to focus your shader optimization efforts—if a few lines account
for a disproportionate share of rendering time, look into faster alternatives for those lines. (For shader
performance tips, see “Best Practices for Shaders” (page 94).)
You can make changes to the shader source code in the editor. Then, click the Update button below the editor
(shown in Figure B-8 (page 125)) to recompile the shader program and see its effects on the captured frame.
If compiling the shader results in error or warning messages from the GLSL compiler, Xcode annotates the
shader source code for each issue. The recompiled shader program remains in use on the device, so you can
resume running your app. Click the Continue button in the debug bar to see your shader changes in action.
A vertex array object (VAO) encapsulates one or more data buffers in OpenGL ES memory and the attribute
bindings used for supplying vertex data from the buffers to a shader program. (For details on using VAOs, see
“Consolidate Vertex Array State Changes Using Vertex Array Objects” (page 84).) Because the VAO bindings
include information about the format of the buffers’ contents, inspecting a VAO shows its contents as interpreted
by OpenGL ES (see Figure B-10).
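A minimal sketch of VAO setup, assuming an OpenGL ES 2.0 context with the OES_vertex_array_object
extension (on an OpenGL ES 3.0 context, the equivalent glGenVertexArrays and glBindVertexArray functions
are core API). The vertexBuffer name and the attribute layout are illustrative:

GLuint vao;
glGenVertexArraysOES(1, &vao);
glBindVertexArrayOES(vao);
// The buffer binding and attribute format below are recorded in the VAO,
// including the format information the frame debugger uses to display it.
glBindBuffer(GL_ARRAY_BUFFER, vertexBuffer);
glEnableVertexAttribArray(0);
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 5 * sizeof(GLfloat), (const GLvoid *)0);
glBindVertexArrayOES(0);
// Later, a single call restores all of this state before drawing:
glBindVertexArrayOES(vao);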
Debug Area
The debug bar provides multiple controls for navigating the captured sequence of OpenGL ES commands
(shown in Figure B-12). You can use its menus to follow the hierarchy shown in the frame navigator and choose
a command, or you can use the arrows and slider to move back and forth in the sequence. Press the Continue
button to end frame debugging and return to running your application.
The frame debugger has no debug console. Instead, Xcode offers multiple variables views, each of which
provides a different summary of the current state of the OpenGL ES rendering process. Use the popup menu
to choose between the available variables views, discussed in the following sections.
Figure B-13 Debug area with GL Context and Bound GL Objects views
Figure B-14 Debug area with Auto and Context Info views
In addition, this view lists aggregate statistics about frame rendering performance, including the number of
draw calls and frame rate.
Using texturetool to Compress Textures
The iOS SDK includes a tool to compress your textures into the PVR texture compression format, aptly named
texturetool. If you have Xcode installed with the iOS 7.0 SDK, then texturetool is located at:
/[Link]/Contents/Developer/Platforms/[Link]/Developer/usr/bin/texturetool.
texturetool provides various compression options, with tradeoffs between image quality and size. You need
to experiment with each texture to determine which setting provides the best compromise.
Note: The encoders, formats, and options available with texturetool are subject to change. This
document describes those options available as of iOS 7.
texturetool Parameters
The parameters that may be passed to texturetool are described in the rest of this section.
Running texturetool -h prints a usage summary for each of the tool’s three invocation forms.
Note: The -p option requires both the -e and -o options.
user$ texturetool -l
Encoders:
PVRTC
--channel-weighting-linear
--channel-weighting-perceptual
--bits-per-pixel-2
--bits-per-pixel-4
--alpha-is-independent
--alpha-is-opacity
--punchthrough-unused
--punchthrough-allowed
--punchthrough-forced
Formats:
Raw
PVR
The --bits-per-pixel-2 and --bits-per-pixel-4 options create PVRTC data that encodes source pixels
into 2 or 4 bits per pixel. These options represent a fixed 16:1 and 8:1 compression ratio over the uncompressed
32-bit RGBA image data. There is a minimum data size of 32 bytes; the compressor never produces files smaller
than this, and at least that many bytes are expected when uploading compressed texture data.
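For example, a 512 x 512 texture occupies 1 MB as uncompressed 32-bit RGBA data, but only 128 KB when
encoded at 4 bits per pixel and 64 KB at 2 bits per pixel.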
The -m option automatically generates mipmap levels for the source image. These levels are provided as
additional image data in the archive created. If you use the Raw image format, then each level of image data
is appended one after another to the archive. If you use the PVR archive format, then each mipmap image is
provided as a separate image in the archive.
The -f option controls the format of the output file. The default format is Raw. This format is raw compressed
texture data, either for a single texture level (without the -m option) or for each texture level concatenated
together (with the -m option). Each texture level stored in the file is at least 32 bytes in size and must be
uploaded to the GPU in its entirety. The PVR format matches the format used by the PVRTexTool found in
Imagination Technologies’s PowerVR SDK. To load a PVR-compressed texture, use the GLKTextureLoader
class.
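For example, a hypothetical invocation combining these options encodes an image with generated mipmaps
into a PVR archive:

user$ texturetool -m -f PVR -e PVRTC --bits-per-pixel-4 -o [Link] [Link]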
The -s and -c options print error metrics during encoding. The -s option compares the input (uncompressed)
image to the output (encoded) image, and the -c option compares any two images. Results of the comparison
include root-mean-square error (rms), perceptually weighted pRms, worst-case texel error (max), and
compositing-based versions of each statistic (rmsC, pRmsC, and maxC). Compositing-based errors assume that
the image’s alpha channel is used for opacity and that the color in each texel is blended with the worst-case
destination color.
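For example, the following hypothetical invocation encodes an image at 4 bpp and prints the error metrics
comparing the encoded output to the source (the exact report format may vary by texturetool version):

user$ texturetool -e PVRTC --bits-per-pixel-4 -s -o [Link] [Link]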
The error metrics used with the -s and -c options and by the encoder when optimizing a compressed image
treat the image’s alpha channel as an independent channel by default (or when using the
--alpha-is-independent option). The --alpha-is-opacity option changes the error metric to one
based on a standard blending operation, as implemented by calling glBlendFunc( GL_SRC_ALPHA,
GL_ONE_MINUS_SRC_ALPHA ).
PVR Texture compression supports a special punchthrough mode which can be enabled on a per 4x4 block
basis. This mode limits the color gradations that can be used within the block, but introduces the option of
forcing the pixel’s alpha value to 0. It can defeat PVRTC smooth color interpolation, introducing block boundary
artifacts, so it should be used with care. The three punchthrough options are:
● --punchthrough-unused — No punchthrough (the default option).
● --punchthrough-allowed — The encoder may enable punchthrough on a block by block basis when
optimizing for image quality. This option generally improves the objective (per-pixel) error metrics used
by the compression algorithm, but may introduce subjective artifacts.
● --punchthrough-forced — Punchthrough is enabled on every block, limiting color gradation but
making it possible for any pixel in the block to have an alpha of 0. This option is provided principally for
completeness, but may be useful when the results can be compared visually to the other options.
Important: Source images for the encoder must satisfy these requirements:
● Height and width must be at least 8.
● Height and width must be a power of 2.
● Must be square (height==width).
● Source images must be in a format that Image IO accepts in OS X. For best results, your original textures
should begin in an uncompressed data format.
Important: If you are using PVRTexTool to compress your textures, then you must create textures that are
square and a power of 2 in length. If your app attempts to load a non-square or non-power-of-two texture
in iOS, an error is returned.
Encode [Link] into PVRTC using linear weights and 4 bpp, saving the output as
[Link]
user$ texturetool -e PVRTC --channel-weighting-linear --bits-per-pixel-4 -o
[Link] [Link]
Encode [Link] into PVRTC using perceptual weights and 4 bpp, saving the output
as [Link]
user$ texturetool -e PVRTC --channel-weighting-perceptual --bits-per-pixel-4 -o
[Link] [Link]
Encode [Link] into PVRTC using linear weights and 2 bpp, saving the output as
[Link]
user$ texturetool -e PVRTC --channel-weighting-linear --bits-per-pixel-2 -o
[Link] [Link]
Encode [Link] into PVRTC using perceptual weights and 2 bpp, saving the output
as [Link]
user$ texturetool -e PVRTC --channel-weighting-perceptual --bits-per-pixel-2 -o
[Link] [Link]
Listing C-3 Encoding images into the PVRTC compression format while creating a preview
Encode [Link] into PVRTC using linear weights and 4 bpp, saving the output
as [Link] and a PNG preview as [Link]
user$ texturetool -e PVRTC --channel-weighting-linear --bits-per-pixel-4 -o
[Link] -p [Link] [Link]
Encode [Link] into PVRTC using perceptual weights and 4 bpp, saving the
output as [Link] and a PNG preview as [Link]
user$ texturetool -e PVRTC --channel-weighting-perceptual --bits-per-pixel-4 -o
[Link] -p [Link] [Link]
Encode [Link] into PVRTC using linear weights and 2 bpp, saving the output
as [Link] and a PNG preview as [Link]
user$ texturetool -e PVRTC --channel-weighting-linear --bits-per-pixel-2 -o
[Link] -p [Link] [Link]
Encode [Link] into PVRTC using perceptual weights and 2 bpp, saving the
output as [Link] and a PNG preview as [Link]
user$ texturetool -e PVRTC --channel-weighting-perceptual --bits-per-pixel-2 -o
[Link] -p [Link] [Link]
Note: It is not possible to create a preview without also specifying the -o parameter and a valid
output file. Preview images are always in PNG format.
For an example of working with PVR-compressed data directly, see the PVRTextureLoader sample.
Document Revision History
This table describes the changes to OpenGL ES Programming Guide for iOS.
Date Notes
2013-09-18 Updated to include more information about OpenGL ES 3.0, GLKit, and
the Xcode debugger.
2013-04-23 Moved the platform notes to OpenGL ES Hardware Platform Guide for
iOS.
Removed the “Platform Notes” chapter and moved the information into
its own book, OpenGL ES Hardware Platform Guide for iOS.
2010-11-15 Significantly revised and expanded all the material in the document.
2010-07-09 Changed the title from "OpenGL ES Programming Guide for iPhone OS."
2010-01-20 Corrected code for creating a framebuffer object that draws to the screen.
2009-09-02 Edited for clarity. Updated extensions list to reflect what's currently
available. Clarified usage of triangle strips for best vertex performance.
Added a note to the platforms chapter about texture performance on the
PowerVR SGX.
2009-06-11 First version of a document that describes how to use the OpenGL ES 1.1
and 2.0 programming interfaces to create high performance graphics
within an iPhone Application.
Glossary
This glossary contains terms that are used specifically for the Apple implementation of OpenGL ES as well
as terms that are common in OpenGL ES graphics programming.

aliased Said of graphics whose edges appear jagged; can be remedied by performing antialiasing operations.

antialiasing In graphics, a technique used to smooth and soften the jagged (or aliased) edges that are
sometimes apparent when graphical objects such as text, line art, and images are drawn.

attach To establish a connection between two existing objects. Compare bind.

bind To create a new object and then establish a connection between that object and a rendering context.
Compare attach.

bitmap A rectangular array of bits.

buffer A block of memory managed by OpenGL ES dedicated to storing a specific kind of data, such as vertex
attributes, color data, or indices.

clip coordinates The coordinate system used for view-volume clipping. Clip coordinates are applied after
applying the projection matrix and prior to perspective division.

clipping An operation that identifies the area of drawing. Anything not in the clipping region is not drawn.

completeness A state that indicates whether a framebuffer object meets all the requirements for drawing.

context A set of OpenGL ES state variables that affect how drawing is performed to a drawable object attached
to that context. Also called a rendering context.

culling Eliminating parts of a scene that can’t be seen by the observer.

current context The rendering context to which OpenGL ES routes commands issued by your app.

current matrix A matrix used by OpenGL ES 1.1 to transform coordinates in one system to those of another
system, such as the modelview matrix, the perspective matrix, and the texture matrix. GLSL ES uses
user-defined matrices instead.

depth In OpenGL, the z coordinate that specifies how far a pixel lies from the observer.

depth buffer A block of memory used to store a depth value for each pixel. The depth buffer is used to
determine whether or not a pixel can be seen by the observer. All fragments rasterized by OpenGL ES must
pass a depth test that compares the incoming depth value to the value stored in the depth buffer; only
fragments that pass the depth test are stored in the framebuffer.

double buffering The practice of using two buffers to avoid resource conflicts between two different parts
of the graphic subsystem. The front buffer is used by one participant and the back buffer is modified by the
other. When a swap occurs, the front and back buffers change places.

drawable object An object allocated outside of OpenGL ES that can be used as part of an OpenGL ES
framebuffer object. On iOS, the only type of drawable object is the CAEAGLLayer class that integrates
OpenGL ES rendering into Core Animation.

extension A feature of OpenGL ES that’s not part of the OpenGL ES core API and therefore not guaranteed
to be supported by every implementation of OpenGL ES. The naming conventions used for extensions indicate
how widely accepted the extension is. The name of an extension supported only by a specific company
includes an abbreviation of the company name. If more than one company adopts the extension, the extension
name is changed to include EXT instead of a company abbreviation. If the Khronos OpenGL Working Group
approves an extension, the extension name changes to include OES instead of EXT or a company abbreviation.

eye coordinates The coordinate system with the observer at the origin. Eye coordinates are produced by
the modelview matrix and passed to the projection matrix.

filtering A process that modifies an image by combining pixels or texels.

fog An effect achieved by fading colors to a background color based on the distance from the observer. Fog
provides depth cues to the observer.

fragment The color and depth values calculated when rasterizing a primitive. Each fragment must pass a
series of tests before being blended with the pixel stored in the framebuffer.

framebuffer attachable image The rendering destination for a framebuffer object.

framebuffer object A framebuffer that is managed entirely by OpenGL ES. A framebuffer object contains
state information for an OpenGL ES framebuffer and its set of images, called renderbuffers. Framebuffers are
built into OpenGL ES 2.0 and later, and all iOS implementations of OpenGL ES 1.1 are guaranteed to support
framebuffer objects (through the OES_framebuffer_object extension).

frustum The region of space that is seen by the observer and that is warped by perspective division.

image A rectangular array of pixels.

interleaved data Arrays of dissimilar data that are grouped together, such as vertex data and texture
coordinates. Interleaving can speed data retrieval.

mipmaps A set of texture maps, provided at various resolutions, whose purpose is to minimize artifacts that
can occur when a texture is applied to a geometric primitive whose onscreen resolution doesn’t match the
source texture map. Mipmapping derives from the Latin phrase multum in parvo, which means “many things
in a small place.”

modelview matrix A 4 x 4 matrix used by OpenGL to transform points, lines, polygons, and positions from
object coordinates to eye coordinates.

multisampling A technique that takes multiple samples at a pixel and combines them with coverage values
to arrive at a final fragment.

mutex A mutual exclusion object in a multithreaded app.

packing Converting pixel color components from a buffer into the format needed by an app.

pixel A picture element—the smallest element that the graphics hardware can display on the screen. A pixel
is made up of all the bits at the location x, y, in all the bitplanes in the framebuffer.

pixel depth In a pixel image, the number of bits per pixel.

pixel format A format used to store pixel data in memory. The format describes the pixel components (red,
green, blue, alpha), the number and order of components, and other relevant information, such as whether
a pixel contains stencil and depth values.

premultiplied alpha A pixel whose other components have been multiplied by the alpha value. For example,
a pixel whose RGBA values start as (1.0, 0.5, 0.0, 0.5) would, when premultiplied, be (0.5, 0.25, 0.0, 0.5).

primitives The simplest elements in OpenGL—points, lines, polygons, bitmaps, and images.

projection matrix A matrix that OpenGL uses to transform points, lines, polygons, and positions from eye
coordinates to clip coordinates.

rasterization The process of converting vertex and pixel data to fragments, each of which corresponds to
a pixel in the framebuffer.

renderbuffer A rendering destination for a 2D pixel image, used for generalized offscreen rendering, as
defined in the OpenGL specification for the OES_framebuffer_object extension.

renderer A combination of hardware and software that OpenGL ES uses to create an image from a view and
a model.

rendering context A container for state information.

rendering pipeline The order of operations used by OpenGL ES to transform pixel and vertex data to an
image in the framebuffer.

render-to-texture An operation that draws content directly to a texture target.

RGBA Red, green, blue, and alpha color components.

shader A program that computes surface properties.

shading language A high-level language, accessible in C, used to produce advanced imaging effects.

stencil buffer Memory used specifically for stencil testing. A stencil test is typically used to identify masking
regions, to identify solid geometry that needs to be capped, and to overlap translucent polygons.

system framebuffer A framebuffer provided by an operating system. This type of framebuffer supports
integrating OpenGL ES into an operating system’s windowing system. iOS does not use system framebuffers.
Instead, it provides framebuffer objects that are associated with a Core Animation layer.

tearing A visual anomaly caused when part of the current frame overwrites previous frame data in the
framebuffer before the current frame is fully rendered on the screen. iOS avoids tearing by processing all
visible OpenGL ES content through Core Animation.

tessellation An operation that reduces a surface to a mesh of polygons, or a curve to a sequence of lines.

texel A texture element used to specify the color to apply to a fragment.

texture Image data used to modify the color of rasterized fragments. The data can be one-, two-, or
three-dimensional or it can be a cube map.
Apple Inc.
Copyright © 2014 Apple Inc.
All rights reserved.