Rendering to the Oculus Rift

The Oculus Rift requires split-screen stereo with distortion correction for each eye to cancel lens-related distortion.

Setting this up can be tricky, but proper distortion correction is critical to achieving an immersive experience.

Figure 11. OculusWorldDemo Stereo Rendering

The Oculus C API provides two types of distortion correction: SDK distortion rendering and Client (application-side) distortion rendering. For each type, the application renders stereo views into individual render textures or into a single combined one. The differences appear in the way the APIs handle distortion, timing, and buffer swap:

  • With the SDK distortion rendering approach, the library takes care of timing, distortion rendering, and buffer swap (the Present call). To make this possible, developers provide low-level device and texture pointers to the API, and instrument the frame loop with ovrHmd_BeginFrame and ovrHmd_EndFrame calls that do all of the work. No knowledge of distortion shaders (vertex or pixel-based) is required.
  • With Client distortion rendering, distortion must be rendered by the application code. This is similar to the approach used in SDK Version 0.2. However, distortion rendering is now mesh-based. In other words, the distortion is encoded in mesh vertex data rather than using an explicit function in the pixel shader. To support distortion correction, the Oculus SDK generates a mesh that includes vertices and UV coordinates used to warp the source render target image to the final buffer. The SDK also provides explicit frame timing functions used to support timewarp and prediction.

Rendering to the Oculus Rift

The Oculus Rift requires the scene to be rendered in split-screen stereo with half the screen used for each eye.

When using the Rift, the left eye sees the left half of the screen, and the right eye sees the right half. Although it varies from person to person, the distance between human eye pupils is approximately 65 mm. This is known as the interpupillary distance (IPD). The in-application cameras should be configured with the same separation.
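As a concrete (non-SDK) sketch, the camera separation can be derived by offsetting each eye camera half the IPD along the head's right vector. Vec3 and EyePosition below are illustrative names, not Oculus SDK types:

```cpp
#include <cassert>

// Minimal 3D vector for illustration (not an Oculus SDK type).
struct Vec3 { float x, y, z; };

static Vec3 add(Vec3 a, Vec3 b)    { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
static Vec3 scale(Vec3 v, float s) { return { v.x * s, v.y * s, v.z * s }; }

// Offset each eye's camera by half the IPD along the head's "right" vector,
// so the two cameras end up one full IPD apart.
static Vec3 EyePosition(Vec3 headPos, Vec3 rightVec, float ipdMeters, bool isLeftEye)
{
    float halfIpd = ipdMeters * 0.5f;
    return add(headPos, scale(rightVec, isLeftEye ? -halfIpd : halfIpd));
}
```

With an IPD of 0.065 m, the two cameras end up 65 mm apart, matching the user's eyes.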

The lenses in the Rift magnify the image to provide a very wide field of view (FOV) that enhances immersion. However, this process distorts the image significantly. If the engine were to display the original images on the Rift, then the user would observe them with pincushion distortion.

Figure 12. Pincushion Distortion
Figure 13. Barrel Distortion

To counteract this distortion, the software must apply post-processing to the rendered views with an equal and opposite barrel distortion so that the two cancel each other out, resulting in an undistorted view for each eye. Furthermore, the software must also correct chromatic aberration, which is a color separation effect at the edges caused by the lens. Although the exact distortion parameters depend on the lens characteristics and eye position relative to the lens, the Oculus SDK takes care of all necessary calculations when generating the distortion mesh.
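The idea of cancelling pincushion distortion with an opposite barrel distortion can be illustrated with a simple radial polynomial warp. The coefficients below are invented for illustration; the Rift's actual parameters are computed by the SDK when it builds the distortion mesh:

```cpp
#include <cassert>
#include <cmath>

// Illustrative radial "barrel" warp (the coefficients are made up; the real
// Rift parameters live inside the SDK-generated distortion mesh). Each point
// is pushed away from the lens center by an amount that grows with its
// distance from the center, counteracting the lens's pincushion effect.
static void BarrelDistort(float x, float y, float& outX, float& outY)
{
    const float k0 = 1.0f, k1 = 0.22f, k2 = 0.24f, k3 = 0.0f; // illustrative
    float rSq   = x * x + y * y;
    float scale = k0 + rSq * (k1 + rSq * (k2 + rSq * k3));
    outX = x * scale;
    outY = y * scale;
}
```

A point at the lens center is unchanged (scale = k0), while points near the edge are pushed progressively farther outward.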

When rendering for the Rift, projection axes should be parallel to each other as illustrated in the following figure, and the left and right views are completely independent of one another. This means that camera setup is very similar to that used for normal non-stereo rendering, except that the cameras are shifted sideways to adjust for each eye location.

Figure 14. HMD Eye View Cones

In practice, the projections in the Rift are often slightly off-center because our noses get in the way! But the point remains, the left and right eye views in the Rift are entirely separate from each other, unlike stereo views generated by a television or a cinema screen. This means you should be very careful if trying to use methods developed for those media because they do not usually apply to the Rift.

The two virtual cameras in the scene should be positioned so that they are pointing in the same direction (determined by the orientation of the HMD in the real world), and such that the distance between them is the same as the distance between the eyes, or interpupillary distance (IPD). This is typically done by adding the ovrEyeRenderDesc::ViewAdjust translation vector to the translation component of the view matrix.

Although the Rift’s lenses are approximately the right distance apart for most users, they may not exactly match the user’s IPD. However, because of the way the optics are designed, each eye will still see the correct view. It is important that the software makes the distance between the virtual cameras match the user’s IPD as found in their profile (set in the configuration utility), and not the distance between the Rift’s lenses.
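A minimal sketch of the per-eye view-matrix adjustment described above, using a stand-in Mat4 type rather than the SDK's Matrix4f (EyeView and its parameters are hypothetical names):

```cpp
#include <cassert>
#include <cmath>

// Minimal 4x4 matrix sketch (not the SDK's Matrix4f): identity plus a
// translation in the last column, enough to show the composition.
struct Mat4
{
    float m[4][4];
    static Mat4 Identity()
    {
        Mat4 r = {};
        for (int i = 0; i < 4; i++) r.m[i][i] = 1.0f;
        return r;
    }
    static Mat4 Translation(float x, float y, float z)
    {
        Mat4 r = Identity();
        r.m[0][3] = x; r.m[1][3] = y; r.m[2][3] = z;
        return r;
    }
    Mat4 operator*(const Mat4& b) const
    {
        Mat4 r = {};
        for (int i = 0; i < 4; i++)
            for (int j = 0; j < 4; j++)
                for (int k = 0; k < 4; k++)
                    r.m[i][j] += m[i][k] * b.m[k][j];
        return r;
    }
};

// Mirrors the pattern in the text: the per-eye view matrix is the shared
// center view with a sideways translation (half the IPD) composed on top.
static Mat4 EyeView(float viewAdjustX, const Mat4& centerView)
{
    return Mat4::Translation(viewAdjustX, 0.0f, 0.0f) * centerView;
}
```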

SDK Distortion Rendering

The Oculus SDK provides SDK Distortion Rendering as the recommended path for presenting frames and handling distortion.

With SDK rendering, developers render the scene into one or two render textures, passing these textures into the API. Beyond that point, the Oculus SDK handles the rendering of distortion, calling Present, GPU synchronization, and frame timing.

The following are the steps for SDK rendering:

  1. Initialize:

    1. Modify your application window and swap chain initialization code to use the data provided in the ovrHMDDesc struct (for example, the Rift resolution).

    2. Compute the desired FOV and texture sizes based on ovrHMDDesc data.

    3. Allocate textures in an API-specific way.

    4. Use ovrHmd_ConfigureRendering to initialize distortion rendering, passing in the necessary API-specific device handles, configuration flags, and FOV data.

    5. Under Windows, call ovrHmd_AttachToWindow to direct back buffer output from the window to the HMD.

  2. Set up frame handling:

    1. Call ovrHmd_BeginFrame to start frame processing and obtain timing information.

    2. Perform rendering for each eye in an engine-specific way, rendering into render textures.

    3. Call ovrHmd_EndFrame (passing in the render textures from the previous step) to swap buffers and present the frame. This function will also handle timewarp, GPU sync, and frame timing.

  3. Shutdown:

    1. You can use ovrHmd_ConfigureRendering with a null value for the apiConfig parameter to shut down SDK rendering or change its rendering parameters. Alternatively, you can just destroy the ovrHmd object by calling ovrHmd_Destroy.

Render Texture Initialization

This section describes the steps involved in initialization.

As a first step, you determine the rendering FOV and allocate the required render target textures. The following code sample shows how the OculusRoomTiny demo does this:

    // Configure Stereo settings.
    Sizei recommendedTex0Size = ovrHmd_GetFovTextureSize(hmd, ovrEye_Left, 
                                                         hmd->DefaultEyeFov[0], 1.0f);
    Sizei recommendedTex1Size = ovrHmd_GetFovTextureSize(hmd, ovrEye_Right,
                                                         hmd->DefaultEyeFov[1], 1.0f);
    Sizei renderTargetSize;
    renderTargetSize.w = recommendedTex0Size.w + recommendedTex1Size.w;
    renderTargetSize.h = max ( recommendedTex0Size.h, recommendedTex1Size.h );

    const int eyeRenderMultisample = 1;
    pRendertargetTexture = pRender->CreateTexture(
                                      Texture_RGBA | Texture_RenderTarget | eyeRenderMultisample,
                                      renderTargetSize.w, renderTargetSize.h, NULL);
    // The actual RT size may be different due to HW limits.
    renderTargetSize.w = pRendertargetTexture->GetWidth();
    renderTargetSize.h = pRendertargetTexture->GetHeight();

The code first determines the render texture size based on the FOV and the desired pixel density at the center of the eye. Although both the FOV and pixel density values can be modified to improve performance, this example uses the recommended FOV (obtained from hmd->DefaultEyeFov). The function ovrHmd_GetFovTextureSize computes the desired texture size for each eye based on these parameters.

The Oculus API allows the application to use either one shared texture or two separate textures for eye rendering. This example uses a single shared texture for simplicity, making it large enough to fit both eye renderings. The sample then calls CreateTexture to allocate the texture in an API-specific way. Under the hood, the returned texture object will wrap either a D3D texture handle or OpenGL texture id. Because video hardware may have texture size limitations, we update renderTargetSize based on the actually allocated texture size. Although use of a different texture size may affect rendering quality and performance, it should function properly if the viewports are set up correctly. The Frame Rendering section of this guide describes viewport setup.
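As a preview of that viewport setup, a shared render target is typically split left/right, one half per eye. The following sketch uses stand-in Sizei/Recti types and a hypothetical helper, not SDK code:

```cpp
#include <cassert>

// Minimal stand-ins for the SDK's Sizei and Recti types.
struct Sizei { int w, h; };
struct Recti { int x, y, w, h; };

// The left eye renders into the left half of the shared render target and
// the right eye into the remainder, so the full texture width is used even
// when it is odd.
static void SetupEyeViewports(Sizei renderTargetSize, Recti viewports[2])
{
    int halfW    = renderTargetSize.w / 2;
    viewports[0] = { 0,     0, halfW,                      renderTargetSize.h };
    viewports[1] = { halfW, 0, renderTargetSize.w - halfW, renderTargetSize.h };
}
```

Because the viewports are computed from the actually allocated texture size, the layout keeps working even when the hardware clamps the requested texture dimensions.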

Configure Rendering

After determining FOV, you can initialize SDK rendering.

To initialize SDK rendering, call ovrHmd_ConfigureRendering. This also generates the ovrEyeRenderDesc structure that describes all of the details needed to perform stereo rendering.

In addition to the input eyeFovIn[] structures, this requires a render-API dependent version of ovrRenderAPIConfig that provides API and platform specific interface pointers. The following code shows an example of what this looks like for Direct3D 11:

    // Configure D3D11.
    RenderDevice* render = (RenderDevice*)pRender;
    ovrD3D11Config d3d11cfg;    
    d3d11cfg.D3D11.Header.API         = ovrRenderAPI_D3D11;
    d3d11cfg.D3D11.Header.RTSize      = Sizei(backBufferWidth, backBufferHeight);
    d3d11cfg.D3D11.Header.Multisample = backBufferMultisample;
    d3d11cfg.D3D11.pDevice            = pRender->Device;
    d3d11cfg.D3D11.pDeviceContext     = pRender->Context;
    d3d11cfg.D3D11.pBackBufferRT      = pRender->BackBufferRT;
    d3d11cfg.D3D11.pSwapChain         = pRender->SwapChain;

    if (!ovrHmd_ConfigureRendering(hmd, &d3d11cfg.Config,
                                   ovrDistortionCap_Chromatic | ovrDistortionCap_TimeWarp,
                                   eyeFov, EyeRenderDesc))
        return(1);

With D3D11, ovrHmd_ConfigureRendering requires the device, context, back buffer and swap chain pointers. Internally, it uses these to allocate the distortion mesh, shaders, and any other resources necessary to correctly output the scene to the Rift display.

Similar code is used to configure rendering with OpenGL. The following code shows a Windows example:

    // Configure OpenGL.
    ovrGLConfig cfg;
    cfg.OGL.Header.API         = ovrRenderAPI_OpenGL;
    cfg.OGL.Header.RTSize      = Sizei(hmd->Resolution.w, hmd->Resolution.h);
    cfg.OGL.Header.Multisample = backBufferMultisample;
    cfg.OGL.Window             = window;
    cfg.OGL.DC                 = dc;

    ovrBool result = ovrHmd_ConfigureRendering(hmd, &cfg.Config, distortionCaps,
                                                    eyeFov, EyeRenderDesc);

In addition to setting up rendering, starting with Oculus SDK 0.4.0, Windows applications must call ovrHmd_AttachToWindow to direct their swap-chain output to the HMD through the Oculus display driver. This requires a single call:

    // Direct rendering from a window handle to the Hmd.
    // Not required if ovrHmdCap_ExtendDesktop flag is set.
    ovrHmd_AttachToWindow(hmd, window, NULL, NULL);

With the window attached, we are ready to render to the HMD.

Frame Rendering

When used in the SDK distortion rendering mode, the Oculus SDK handles frame timing, motion prediction, distortion rendering, the end-of-frame buffer swap (known as Present in Direct3D), and GPU synchronization.

To do this, it makes use of three functions that must be called on the render thread:

  • ovrHmd_BeginFrame
  • ovrHmd_EndFrame
  • ovrHmd_GetEyePose

As suggested by their names, calls to ovrHmd_BeginFrame and ovrHmd_EndFrame enclose the body of the frame rendering loop. ovrHmd_BeginFrame is called at the beginning of the frame, returning frame timing information in the ovrFrameTiming struct. Values within this structure are useful for animation and correct sensor pose prediction. ovrHmd_EndFrame should be called at the end of the frame, in the same place that you would typically call Present. This function takes care of the distortion rendering, buffer swap, and GPU synchronization. The function also ensures that frame timing is matched with the video card VSync.

In between ovrHmd_BeginFrame and ovrHmd_EndFrame, you will render both of the eye views to a render texture. Before rendering each eye, you should get the latest predicted head pose by calling ovrHmd_GetEyePose. This will ensure that each predicted pose is based on the latest sensor data. We also recommend that you use the ovrHMDDesc::EyeRenderOrder variable to determine which eye to render first for that HMD, since that can also produce better pose prediction on HMDs with eye-independent scanout.

The ovrHmd_EndFrame function submits the eye images for distortion processing. Because the texture data is passed in an API-specific format, the ovrTexture structure needs some platform-specific initialization.

The following code shows how ovrTexture initialization is done for D3D11 in OculusRoomTiny:

    ovrD3D11Texture EyeTexture[2];

    // Pass D3D texture data, including ID3D11Texture2D and ID3D11ShaderResourceView pointers.
    Texture* rtt = (Texture*)pRendertargetTexture;
    EyeTexture[0].D3D11.Header.API            = ovrRenderAPI_D3D11;
    EyeTexture[0].D3D11.Header.TextureSize    = renderTargetSize;
    EyeTexture[0].D3D11.Header.RenderViewport = EyeRenderViewport[0];
    EyeTexture[0].D3D11.pTexture              = rtt->Tex.GetPtr();
    EyeTexture[0].D3D11.pSRView               = rtt->TexSv.GetPtr();

    // Right eye uses the same texture, but different rendering viewport.
    EyeTexture[1]                             = EyeTexture[0];
    EyeTexture[1].D3D11.Header.RenderViewport = EyeRenderViewport[1];    

Alternatively, here is OpenGL code:

    ovrGLTexture EyeTexture[2];
    EyeTexture[0].OGL.Header.API            = ovrRenderAPI_OpenGL;
    EyeTexture[0].OGL.Header.TextureSize    = renderTargetSize;
    EyeTexture[0].OGL.Header.RenderViewport = EyeRenderViewport[0];
    EyeTexture[0].OGL.TexId                 = textureId;

With texture setup complete, you can set up a frame rendering loop as follows:

    ovrFrameTiming hmdFrameTiming = ovrHmd_BeginFrame(hmd, 0); 

    pRender->SetRenderTarget ( pRendertargetTexture );

    ovrPosef headPose[2];

    for (int eyeIndex = 0; eyeIndex < ovrEye_Count; eyeIndex++)
    {
        ovrEyeType eye         = hmd->EyeRenderOrder[eyeIndex];
        headPose[eye]          = ovrHmd_GetEyePose(hmd, eye);
        Quatf      orientation = Quatf(headPose[eye].Orientation);
        Matrix4f   proj        = ovrMatrix4f_Projection(EyeRenderDesc[eye].Fov,
                                                        0.01f, 10000.0f, true);
        // * Test code *
        // Assign quaternion result directly to view (translation is ignored).
        Matrix4f   view = Matrix4f(orientation.Inverted()) * Matrix4f::Translation(-WorldEyePos);

        pRoomScene->Render(pRender, Matrix4f::Translation(EyeRenderDesc[eye].ViewAdjust) * view);
    }

    // Let OVR do distortion rendering, Present and flush/sync.
    ovrHmd_EndFrame(hmd, headPose, &EyeTexture[0].Texture);

As described earlier, frame logic is enclosed by the begin frame and end frame calls. In this example, both eyes share the render target. Rendering is straightforward, although there are a few points worth noting:

  • We use hmd->EyeRenderOrder[eyeIndex] to select the order of eye rendering. Although not required, this can improve the quality of pose prediction.
  • The projection matrix is computed based on EyeRenderDesc[eye].Fov, which holds the same FOV values used for the rendering configuration.
  • The view matrix is adjusted by the EyeRenderDesc[eye].ViewAdjust vector, which accounts for IPD in meters.
  • This sample uses only the Rift orientation component, whereas real applications should make use of position as well. Please refer to the OculusRoomTiny or OculusWorldDemo source code for a more comprehensive example.

Frame Timing

When used in the SDK distortion rendering mode, the Oculus SDK handles frame timing, motion prediction, distortion rendering, the end-of-frame buffer swap (known as Present in Direct3D), and GPU synchronization.

Accurate frame and sensor timing are required for accurate head motion prediction, which is essential for a good VR experience. Prediction requires knowing exactly when in the future the current frame will appear on the screen. If we know both sensor and display scanout times, we can predict the future head pose and improve image stability. Miscomputing these values can lead to under or over-prediction, degrading perceived latency, and potentially causing overshoot “wobbles”.

To ensure accurate timing, the Oculus SDK uses absolute system time, stored as a double, to represent sensor and frame timing values. The current absolute time is returned by ovr_GetTimeInSeconds. However, it should rarely be necessary because simulation and motion prediction should rely completely on the frame timing values.
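To make the role of these timing values concrete, here is a deliberately simplified prediction sketch. The SDK's sensor fusion does this far more robustly; PredictYaw and its parameters are illustrative names, not SDK API:

```cpp
#include <cassert>
#include <cmath>

// Illustrative only (the SDK's predictor is more sophisticated): extrapolate
// the measured yaw forward by the angular velocity over the interval between
// the sensor reading and the expected scanout time. Times are absolute
// seconds, like the fields of ovrFrameTiming.
static double PredictYaw(double yawAtSensorTime, double yawVelocity,
                         double sensorTime, double displayTime)
{
    double lookahead = displayTime - sensorTime; // seconds into the future
    return yawAtSensorTime + yawVelocity * lookahead;
}
```

If the display time is mis-estimated, the lookahead is wrong and the pose is under- or over-predicted, which is exactly the "wobble" failure mode described above.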

Render frame timing is managed at a low level by two functions:

  • ovrHmd_BeginFrameTiming—called at the beginning of the frame; returns a set of timing values for the frame.
  • ovrHmd_EndFrameTiming—implements most of the actual frame vsync tracking logic. It must be called at the end of the frame after swap buffers and GPU Sync.

With SDK Distortion Rendering, ovrHmd_BeginFrame and ovrHmd_EndFrame call the timing functions internally and do not need to be called explicitly. Regardless, you will still use the ovrFrameTiming values returned by ovrHmd_BeginFrame to perform motion prediction and sometimes waits.

ovrFrameTiming provides the following set of absolute time values associated with the current frame:

  • DeltaSeconds (float): The amount of time that has passed since the previous frame (useful for animation).
  • ThisFrameSeconds (double): Absolute time at which this frame’s rendering started.
  • TimewarpPointSeconds (double): Absolute time point, during this frame, at which timewarp should start.
  • NextFrameSeconds (double): Absolute time at which the next frame’s rendering is expected to start.
  • ScanoutMidpointSeconds (double): Midpoint time at which this frame will show up on the screen. This can be used to obtain head pose prediction for simulation and rendering.
  • EyeScanoutSeconds[2] (double): Times at which each eye of this frame is expected to appear on screen. This is the best pose prediction time to use for rendering each eye.

Some of the timing values are used internally by the SDK and might not need to be used directly by your application. For example, the EyeScanoutSeconds[] values are used internally by ovrHmd_GetEyePose to report the predicted head pose when rendering each eye. However, there are some cases in which timing values are useful:

  • When using timewarp, to ensure the lowest possible latency, the ovrHmd_EndFrame implementation will pause internally to wait for the timewarp point. If the application frame rendering finishes early, you might decide to execute other processing to manage the wait time before the TimewarpPointSeconds time is reached.
  • If both simulation and rendering are performed on the same thread, then simulation might need an earlier head pose value that is not specific to either eye. This can be obtained by calling ovrHmd_GetSensorState with ScanoutMidpointSeconds for the absolute time.
  • EyeScanoutSeconds[] values are useful when accessing pose from a non-rendering thread. This is discussed later in this guide.
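For the first case above, the following sketch shows how an application might budget extra work against the timewarp point. CanRunBeforeTimewarp and the 1 ms margin are illustrative choices, not part of the SDK:

```cpp
#include <cassert>

// Sketch: given the current absolute time (e.g. from ovr_GetTimeInSeconds)
// and this frame's TimewarpPointSeconds, decide whether an extra task of
// known cost can finish before ovrHmd_EndFrame would start waiting anyway.
// The 1 ms safety margin is an illustrative value, not an SDK recommendation.
static bool CanRunBeforeTimewarp(double now, double timewarpPointSeconds,
                                 double taskCostSeconds)
{
    const double safetyMargin = 0.001; // leave 1 ms of slack
    return now + taskCostSeconds + safetyMargin <= timewarpPointSeconds;
}
```

An application that finishes scene rendering early could loop over pending background tasks, running each only while this check passes, and then let ovrHmd_EndFrame perform its wait for the timewarp point.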