Screen Resolution


The Xenos GPU's EDRAM is a great addition. It is a large, fast memory inside the GPU that holds the render targets currently in use, which makes writing to render targets essentially free. (Some people incorrectly believe that it is a chip for free antialiasing.)

The only problem is that the EDRAM was designed before the deferred rendering revolution, so its size (10 MB) may be too small to hold an entire G-Buffer, even a deferred lighting G-Buffer. To make things worse, XNA does not allow access to the depth buffer, so we need to spend one extra render target to hold the depth.

In conclusion, we need room in the EDRAM for three 32-bit render targets at the same time (the GPU depth buffer, our own depth buffer, and normals plus specular power). One viable resolution is 1024 x 600, which is the resolution used, for example, in games like Call of Duty. You can use higher resolutions, but at some point XNA will apply predicated tiling automatically. We shouldn't use MSAA for the G-Buffer, both because it increases the render target size and because deferred lighting does not work well with MSAA (except on DX10.1/11).
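The budget above can be checked with a little arithmetic. The following Python sketch is only an illustration, not part of the engine: the names `EDRAM_BYTES`, `surface_bytes` and `gbuffer_fits` are hypothetical, and it assumes that "10 MB" means 10 × 1024 × 1024 bytes and that every target uses 4 bytes per pixel with no MSAA.

```python
# Assumption: the 10 MB of EDRAM is 10 * 1024 * 1024 bytes.
EDRAM_BYTES = 10 * 1024 * 1024


def surface_bytes(width, height, bytes_per_pixel=4):
    """Size in bytes of one render target without MSAA."""
    return width * height * bytes_per_pixel


def gbuffer_fits(width, height, target_count=3):
    """True if `target_count` 32-bit targets fit in EDRAM at the same time."""
    return target_count * surface_bytes(width, height) <= EDRAM_BYTES


if __name__ == "__main__":
    for width, height in [(1024, 600), (1280, 720)]:
        total = 3 * surface_bytes(width, height)
        verdict = "fits" if gbuffer_fits(width, height) else "does not fit"
        print(f"{width} x {height}: {total} bytes, {verdict}")
```

Under these assumptions, three targets at 1024 x 600 take 7,372,800 bytes and fit, while three targets at 1280 x 720 take 11,059,200 bytes and do not.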

Shawn Hargreaves said: "The main expense of predicated tiling is repeating draw calls, which increases the vertex shading workload (there is obviously no difference to pixel shading cost whether the work is split into many small or one big output surface). This is mitigated by various predication tricks using cached extents data, and predicated tiling may not cost anything at all if your app is not vertex shading limited, but it can be a significant overhead for workloads with high vertex shading costs. Generally, predicated tiling costs far less than most people tend to expect."

Predicated Tiling


The Xbox 360 has 10 MB of fast embedded dynamic RAM (EDRAM) that is dedicated for use as the back buffer, depth stencil buffer, and other render targets. Depending on the size and format of the render targets and the antialiasing level, it may not be possible to fit all targets in EDRAM at once. For example, 10 MB of EDRAM is enough to hold two 1280×720 32-bit surfaces with no multisample antialiasing (MSAA) or two 640×480 4× MSAA 32-bit surfaces. However, a 1280×720 2× MSAA 32-bits-per-pixel render target is 7,372,800 bytes. Combined with a 32-bit Z/stencil buffer of the same dimensions, it becomes apparent that 10 MB might not be sufficient.

Predicated tiling allows rendering to larger surfaces than can fit into EDRAM at any one time. In predicated tiling, the screen space is broken up into tiles (rectangles). The following figure shows the screen space broken into two tiles.

[Figure: PredicateTiling.jpg, the screen space broken into two tiles]

In predicated tiling, the commands issued in the Draw method are recorded before execution. The recorded commands, such as DrawPrimitives calls, are then executed for each tile, predicated on whether the rendered primitives intersect the tile. In the preceding figure, both triangle primitives would be rendered in Tile 0. Once the primitives for a tile are fully rendered, the tile is resolved into the texture that is used for the front buffer. Each successive tile is handled the same way and is resolved into the same texture.

Predicated tiling occurs automatically for Xbox 360 games created with XNA Game Studio when the size and format of the render targets exceed the console's available EDRAM. In predicated tiling, the size of the render targets and the depth-stencil are no longer limited by EDRAM memory, although each individual tile has to fit into EDRAM.
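As a rough illustration of when tiling kicks in, the sketch below estimates a lower bound on the number of tiles by dividing the combined size of the surfaces by the EDRAM size. It is not how XNA or the hardware actually chooses tiles (real tiles must also respect alignment rules that are not modeled here); `tiles_needed` and the other names are hypothetical, and "10 MB" is assumed to mean 10 × 1024 × 1024 bytes.

```python
import math

# Assumption: the 10 MB of EDRAM is 10 * 1024 * 1024 bytes.
EDRAM_BYTES = 10 * 1024 * 1024


def surface_bytes(width, height, msaa_samples=1, bytes_per_pixel=4):
    """Size in bytes of one surface; MSAA multiplies the stored samples."""
    return width * height * msaa_samples * bytes_per_pixel


def tiles_needed(width, height, surface_count, msaa_samples=1):
    """Lower bound on the tile count so that each tile fits in EDRAM."""
    total = surface_count * surface_bytes(width, height, msaa_samples)
    return max(1, math.ceil(total / EDRAM_BYTES))


if __name__ == "__main__":
    # Color + depth-stencil at 1280x720 with no MSAA, 2x MSAA and 4x MSAA.
    for samples in (1, 2, 4):
        print(samples, "x MSAA:", tiles_needed(1280, 720, 2, samples), "tile(s)")
```

With a 1280×720 2× MSAA color target plus a matching depth-stencil buffer, the combined 14,745,600 bytes exceed the EDRAM, so this estimate gives two tiles.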

Once predicated tiling has been triggered, all rendering commands in Draw are accumulated and played back for each tile in separate passes. An adjusted window offset and clip rectangle are used to render only those portions that intersect with the specified screen-space tile rectangle. Drawing is done only when the primitive drawing call is known to be visible on the given tile. This is determined by using hardware screen-space extent queries, which compute if the geometry lies within the tile. All rendering is then completed using the render target and depth stencil surface that are currently set.
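The replay logic can be pictured with a small CPU-side simulation. This is only a conceptual sketch: on the console the extents come from hardware queries and the playback happens on the GPU, whereas here the screen-space extents are supplied by hand. `Rect`, `replay_per_tile` and the sample commands are all hypothetical names made up for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Screen-space rectangle; right and bottom edges are exclusive."""
    left: int
    top: int
    right: int
    bottom: int

    def intersects(self, other):
        return (self.left < other.right and other.left < self.right
                and self.top < other.bottom and other.top < self.bottom)


def replay_per_tile(commands, tiles):
    """Play the recorded commands back once per tile.

    `commands` is a list of (name, extent) pairs. A command is executed
    for a tile only if its extent intersects that tile (the predicate).
    Returns a dict mapping tile index to the names executed for it.
    """
    executed = {}
    for index, tile in enumerate(tiles):
        executed[index] = [name for name, extent in commands
                           if extent.intersects(tile)]
    return executed


# A 1280x720 screen split into a top tile and a bottom tile.
tiles = [Rect(0, 0, 1280, 360), Rect(0, 360, 1280, 720)]
commands = [
    ("sky", Rect(0, 0, 1280, 200)),          # top tile only
    ("ground", Rect(0, 400, 1280, 720)),     # bottom tile only
    ("character", Rect(500, 300, 700, 500)), # straddles both tiles
]
result = replay_per_tile(commands, tiles)
print(result)
```

Only the draw call that straddles the tile boundary is executed twice, which is why the extra cost of tiling is mostly repeated vertex work for geometry that spans several tiles.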

A few restrictions apply when predicated tiling has been triggered. Resources that are referenced in the command buffer cannot be modified by the CPU, because the command buffer has to be played back more than once to accomplish the tiling. It is okay for the GPU to modify resources and then reuse them in the same pass, because the GPU can appropriately recreate the memory on every pass. Moreover, data from an occlusion query (specifically, a call to IsComplete) on Xbox 360 is not available from within the tiling bracket. To access this data, switch to a different render target or end the frame. This condition does not apply to games targeting the Windows platform.

Last edited Jan 22, 2013 at 5:37 PM by jischneider, version 20
