Friday, January 17, 2014

GL Game Devs: Please Send Us Your Game Traces!

We're going through all games in the Steam Linux catalog to ensure they are compatible with our new GL tracer/debugger toolset (VOGL). But this is backward-looking: what we really want is to ensure our full-stream tracing and state snapshot/restore code works correctly with your new (most likely unreleased) GL code.

So if you would like to help right now, please use apitrace (on any "big" GL platform: Linux/OSX/Windows) to make a 30-90 second trace of your app's gameplay:
https://github.com/apitrace/apitrace

The more representative the trace is of your actual (end-user) GL usage, the better. We're especially interested in advanced GL 3.x/4.x usage patterns, or any new extensions you use. We'll add your trace to our private correctness and regression testing repo. We'll replay your trace, capture it with our tools, and take it from there.
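
For reference, here is a minimal sketch of how a capture command could be assembled. It assumes the `apitrace` binary is on your PATH and uses its `trace` subcommand with the `--api` and `--output` options; `./mygame`, `-windowed`, and `mygame.trace` are placeholders, so check `apitrace trace --help` on your install before relying on it.

```python
import shlex


def build_apitrace_command(app, app_args=(), output="game.trace"):
    """Return the argv list that traces `app`'s GL calls into `output`."""
    return ["apitrace", "trace", "--api", "gl", "--output", output, app, *app_args]


if __name__ == "__main__":
    # Placeholder game binary and argument; substitute your own.
    cmd = build_apitrace_command("./mygame", ["-windowed"], output="mygame.trace")
    # Print the command rather than running it, so this sketch has no side effects.
    print(shlex.join(cmd))
```

To actually capture, pass the list to `subprocess.run(cmd, check=True)`, play for 30-90 seconds, then quit the game normally so the trace file is closed cleanly.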

E-mail me (richg at valvesoftware dot com) with the trace's location. For very large traces that are too big for Dropbox and similar services, we'll set you up with a private FTP site.

Also, if your game is already on Steam Linux and you're actively debugging GL issues, I can move your app to the top of the testing list.

Saturday, January 11, 2014

VOGL OpenGL Tracer/Debugger - Bonus Content

There's a bunch of content I wanted to get into our Steam Dev Days presentation on our new OpenGL tracer/debugger (VOGL), but you can only cram so much into a 20-minute presentation with 5-10 minutes set aside for demos. Here's the missing content:

Dev Environment

  • All code written and tested on Linux - Not a port
  • Distros we're developing on: Kubuntu 13.10, Ubuntu 12.04, Linux Mint
  • IDE: QtCreator v3.0.0
  • Building: cmake+ninja
  • Compiler: clang v3.3
    • gcc v4.6 works too, but is very slow
  • chroots used to standardize our build environment across dev machines
  • Source control: Mercurial, TortoiseHG


QtCreator v3.0.0: IDE, gdb debugger front-end, integrated source control:




First Non-Divergent Replay: Portal


  • 5.4 megacalls, 1.9 GB trace file




VOGL's Current GL Compatibility

  • GL v1 - v3.3, core or compatibility contexts, partial support for GL v4.x (full 4.x later this year)
  • Tracer: 2,652 functions, almost all auto-genned. Replayable: 1,498 functions, ~45% auto-genned
  • Fully supported extensions:
    • AMD_draw_buffers_blend, ARB_blend_func_extended, ARB_color_buffer_float, ARB_copy_buffer, ARB_create_context, ARB_debug_output, ARB_draw_buffers, ARB_draw_buffers_blend, ARB_draw_elements_base_vertex, ARB_framebuffer_object, ARB_get_proc_address, ARB_get_program_binary, ARB_gpu_shader_fp64, ARB_instanced_arrays, ARB_internalformat_query, ARB_internalformat_query2, ARB_map_buffer_range, ARB_multisample, ARB_multitexture, ARB_occlusion_query, ARB_point_parameters, ARB_program_interface_query, ARB_provoking_vertex, ARB_sample_shading, ARB_shader_atomic_counters, ARB_shader_objects, ARB_sync, ARB_texture_buffer_object, ARB_texture_compression, ARB_texture_multisample, ARB_texture_storage, ARB_texture_storage_multisample, ARB_timer_query, ARB_transpose_matrix, ARB_uniform_buffer_object, ARB_vertex_array_object, ARB_vertex_buffer_object, ARB_vertex_program, ARB_vertex_shader, ARB_vertex_type_2_10_10_10_rev, ARB_viewport_array, ARB_window_pos, EXT_bindable_uniform, EXT_blend_color, EXT_blend_equation_separate, EXT_blend_func_separate, EXT_blend_minmax, EXT_compiled_vertex_array, EXT_cull_vertex, EXT_depth_bounds_test, EXT_draw_buffers2, EXT_draw_instanced, EXT_draw_range_elements, EXT_fog_coord, EXT_framebuffer_blit, EXT_framebuffer_multisample, EXT_framebuffer_object, EXT_geometry_shader4, EXT_gpu_program_parameters, EXT_gpu_shader4, EXT_multi_draw_arrays, EXT_multisample, EXT_paletted_texture, EXT_point_parameters, EXT_polygon_offset, EXT_provoking_vertex, EXT_secondary_color, EXT_stencil_two_side, EXT_subtexture, EXT_swap_control, EXT_texture3D, EXT_texture_buffer_object, EXT_texture_integer, EXT_texture_object, EXT_timer_query, GREMEDY_frame_terminator, GREMEDY_string_marker, NV_vertex_program4, SGI_swap_control

  • Partial support for many more extensions, prioritized by usage and importance. ARB/EXT higher priority vs. vendor specific.
  • sharelist support
    • Replayer class is currently single threaded and automatically issues MakeCurrent()’s as needed.
    • Trace packets have timestamps and a global call counter; the replayer issues them in “wall clock” time order.
  • Lots of support for old-school GL API’s:
    • ARB assembly language shaders (ARB_vertex_program and ARB_fragment_program)
    • Client side arrays (CSA’s):
      • Set via glVertexAttribPointer, glNormalPointer, glTexCoordPointer, glInterleavedArrays, etc.
    • glBegin/glEnd
    • Display lists:
      • Currently only support the most popular usages: whitelist of ~500 funcs, non-recursive, only texture binding
    • Fixed function pipeline:
      • Lights, texgen, texenv, materials, matrix stacks, etc.
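
To make the sharelist bullets above concrete, here is a minimal Python sketch of a single-threaded replay loop that orders packets from several contexts by their global call counter and switches contexts only when needed. The `Packet` fields, `RecordingBackend`, and `replay` are hypothetical names for illustration, not VOGL's actual classes, and the backend just records calls instead of touching GL.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    call_counter: int            # global, monotonically increasing across all contexts
    context: str                 # handle of the context the call was recorded on
    func: str
    params: dict = field(default_factory=dict)


class RecordingBackend:
    """Stand-in for a real GL binding: records what would be issued."""

    def __init__(self):
        self.log = []

    def make_current(self, context):
        self.log.append(("MakeCurrent", context))

    def call(self, func, params):
        self.log.append((func, params))


def replay(packets, backend):
    """Issue packets in recorded order, making a context current only on a switch."""
    current = None
    for pkt in sorted(packets, key=lambda p: p.call_counter):
        if pkt.context != current:
            backend.make_current(pkt.context)
            current = pkt.context
        backend.call(pkt.func, pkt.params)
```

Sorting by the global counter is what lets one thread reproduce the original interleaving of calls made on multiple contexts.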

Completed and Short-Term Goals


  • Completed goals:
    • Survey all existing solutions, determine strengths/weaknesses of each.
    • Build database of all known GL API’s. No single definitive source found - so we combine:
      • Old Khronos .spec, apitrace's glapi.py, and Fournier's "gl-spec-parser" web scraper.
      • New Khronos XML spec not used yet - wasn’t available at the time.
    • Create 32/64-bit tracer SO and replayer tool based off a common set of reusable C++ classes
      • Replayer class accepts arbitrary packets from any source (even generated on the fly)
      • State snapshot/restore classes for generating GL state snapshots and restoring them
        • Snapshot classes just make standard GL calls, no knowledge of tracing or replaying
        • Any GL state that needs shadowing is handled by the tracer or replayer itself
      • All state objects serializable/deserializable to JSON/UBJ+binary files
    • Test tools on a variety of real and synthetic call streams, get as many apps to work as possible, iterate
  • Current goals:
    • Finish GL 3.x support before moving to 4.x - only a handful of GL 3.x state to snapshot/restore remaining at this point
    • Trace editor UI - needs a lot of love
    • Build library of app traces, implement continuous automated regression testing
    • Profile and optimize GL replayer class, build driver benchmarking tool
      • We’re already ~90% faster than apitrace’s replayer in -benchmark mode on Metro Last Light
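
The snapshot design in the completed goals (snapshot classes that only make standard GL calls, with state serializable to JSON) can be sketched roughly as follows. This is an illustration under stated assumptions, not VOGL's code: `FakeGL` is a hypothetical two-value stand-in for a GL context, and `GeneralStateSnapshot` is an invented class name.

```python
import json


class FakeGL:
    """Hypothetical stand-in for a GL context holding two pieces of state."""

    def __init__(self):
        self.state = {"GL_COLOR_CLEAR_VALUE": [0.0, 0.0, 0.0, 0.0],
                      "GL_VIEWPORT": [0, 0, 0, 0]}

    def glClearColor(self, r, g, b, a):
        self.state["GL_COLOR_CLEAR_VALUE"] = [r, g, b, a]

    def glViewport(self, x, y, w, h):
        self.state["GL_VIEWPORT"] = [x, y, w, h]

    def glGet(self, pname):
        return list(self.state[pname])


class GeneralStateSnapshot:
    """Captures and restores state using only ordinary GL-style calls."""

    PNAMES = ("GL_COLOR_CLEAR_VALUE", "GL_VIEWPORT")

    def __init__(self, values=None):
        self.values = values or {}

    def capture(self, gl):
        # Query each piece of state; no knowledge of tracing or replaying is needed.
        self.values = {pname: gl.glGet(pname) for pname in self.PNAMES}
        return self

    def restore(self, gl):
        gl.glClearColor(*self.values["GL_COLOR_CLEAR_VALUE"])
        gl.glViewport(*self.values["GL_VIEWPORT"])

    def to_json(self):
        return json.dumps(self.values, sort_keys=True)

    @classmethod
    def from_json(cls, text):
        return cls(json.loads(text))
```

Because the snapshot object talks to the context only through query and set calls, the same class can be driven by a tracer, a replayer, or a standalone tool.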


Longer Term Goals


  • Trace editor/debugger UI
    • Full control over server: Click button to launch app on Steambox, another to capture frame, etc.
    • Full display of GL state vector, state vector diff’ing between two calls
    • Obvious things: Live editing, vertex/pixel history, CPU and GPU profiling, etc.
  • On the fly tracing
    • Tracer records state snapshots every X seconds, also continuously records ring buffer of GL trace packets
    • User clicks to save trace containing previous X seconds of gameplay to new trace file
  • Faster looping and seeking through huge traces (“DVR” replayer mode)
    • Lazy shader compilation and program linking during state restore/playback
    • Automatic keyframe generation during tracing or playback - use worker thread to write snapshots
    • Generate delta state snapshot objects, for fast seeking between keyframes/faster frame looping
  • Replayer perf: Fully multithreaded playback pipeline
    • For each context: thread A decodes packets and composes x86 opcodes to call GL, thread B execs this
  • Really long term: Vendor neutral shader debugging using standard GL calls
    • Compile shader to AST, insert atomic append ops that dump shader IP+variable state after each op to a huge buffer (only on selected vertex/pixel), output AST as GLSL, run draw with this shader
    • UI reads and parses buffer, now we have all the info we need to simulate shader stepping in a UI
    • Easier said than done, but we believe it’s inevitable that someone is going to do this right.
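
The on-the-fly tracing idea above (periodic state snapshots plus a ring buffer of packets) could look something like this minimal Python sketch. `RingTracer`, its parameters, and the segment layout are all assumptions made for illustration; the real tracer would store binary packets and full GL state snapshots.

```python
from collections import deque


class RingTracer:
    """Keeps roughly the last `window_seconds` of packets, each run led by a snapshot."""

    def __init__(self, window_seconds, snapshot_interval):
        self.window = window_seconds
        self.interval = snapshot_interval
        self.segments = deque()  # each: {"time": t, "snapshot": ..., "packets": [...]}

    def record(self, timestamp, packet, take_snapshot):
        newest = self.segments[-1] if self.segments else None
        if newest is None or timestamp - newest["time"] >= self.interval:
            # Start a new segment with a fresh state snapshot (a "keyframe").
            self.segments.append(
                {"time": timestamp, "snapshot": take_snapshot(), "packets": []})
            # An old segment is droppable once the next one starts before the window.
            while (len(self.segments) > 1
                   and self.segments[1]["time"] <= timestamp - self.window):
                self.segments.popleft()
        self.segments[-1]["packets"].append(packet)

    def save(self):
        """Return a replayable trace: the oldest kept snapshot plus all later packets."""
        packets = [p for seg in self.segments for p in seg["packets"]]
        return {"snapshot": self.segments[0]["snapshot"], "packets": packets}
```

The saved trace always begins with a snapshot, so it can be replayed without any of the packets that were already discarded.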


JSON Support Details


  • Binary traces convertible to JSON traces and vice versa
    • Binary traces are dumped to: one JSON file per frame, loose files for large data blobs contained in trace packets, and the trace’s .zip archive.
    • JSON traces guaranteed lossless vs. binary: float/double’s coded as hex strings when needed.
    • .zip archive can be unzipped and deleted, trace is still replayable from loose files, and loose files take precedence over archive files.
  • Direct playback and debugging of JSON traces
  • JSON traces and blob files designed to be manually editable
    • Replayer does its best to fix up/adjust GL call parameters as needed.
    • Many fields optional, and we’ve tried to avoid “magic” keys or field interdependencies.
  • Binary traces also make extensive use of the Universal Binary JSON (UBJ) format: http://ubjson.org/
  • The voglcommon lib contains helper classes to read/write traces and packets, or you can read the JSON data yourself.
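
As an illustration of the "hex strings when needed" idea, here is a small Python sketch for 32-bit floats: it writes a short decimal when that decimal round-trips exactly and falls back to the raw bit pattern otherwise. The `0x%08X` string layout and the `%.6g` decimal format are assumptions for this example; the actual encoding VOGL uses may differ.

```python
import struct


def _to_float32(value):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def encode_float32(value):
    """Return a JSON-ready value that decodes to exactly the same float32."""
    stored = _to_float32(value)
    text = "%.6g" % stored
    # NaN and infinities have no JSON number form, so they always go to hex.
    if text not in ("nan", "inf", "-inf") and _to_float32(float(text)) == stored:
        return float(text)
    bits = struct.unpack("<I", struct.pack("<f", stored))[0]
    return "0x%08X" % bits


def decode_float32(encoded):
    """Inverse of encode_float32: accepts a JSON number or a hex bit-pattern string."""
    if isinstance(encoded, str):
        return struct.unpack("<f", struct.pack("<I", int(encoded, 16)))[0]
    return _to_float32(encoded)
```

Values such as 0.25 stay human-readable and editable, while values that a short decimal cannot represent exactly keep their original bits.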




Minimal OpenGL JSON Trace

Sample JSON full-stream OpenGL trace, visualized as a graph:



Source:
// draw_triangle.json - Draws 1 white triangle on a gray background
{
   "meta" : { "cur_frame" : 0, "eof" : true }, "sof" : { "pointer_sizes" : 4 },

   "packets" : [
    { "func" : "glXCreateContext", "context" : "0x0", "params" : { "dpy" : "0x1", "vis" : "0x1", "shareList" : "0x0", "direct" : true }, "return" : "0x1" },
    { "func" : "glXMakeCurrent", "context" : "0x0", "params" : { "dpy" : "0x1", "drawable" : "0x1", "context" : "0x1" }, "return" : true },

    { "func" : "glViewport", "params" : { "x" : 0, "y" : 0, "width" : 400, "height" : 200 } },

    { "func" : "glClearColor", "params" : { "red" : 0.25, "green" : 0.25, "blue" : 0.25, "alpha" : 1.0 } },
    { "func" : "glClear", "params" : { "mask" : "0x4000" } },

    { "func" : "glMatrixMode", "params" : { "mode" : "GL_PROJECTION" } },
    { "func" : "glLoadIdentity" },

    { "func" : "glMatrixMode", "params" : { "mode" : "GL_MODELVIEW" } },
    { "func" : "glLoadIdentity" },

    { "func" : "glColor3f", "params" : { "red" : 1.0, "green" : 1.0, "blue" : 1.0 } },
    { "func" : "glScalef", "params" : { "x" : 0.2, "y" : 0.2, "z" : 1.0 } },
    { "func" : "glTranslatef", "params" : { "x" : -1.5, "y" : 0.0, "z" : 0.0 } },

    { "func" : "glBegin", "params" : { "mode" : "GL_TRIANGLES" } },
    { "func" : "glVertex2f", "params" : { "x" : 0.0, "y" : 4.0 } },
    { "func" : "glVertex2f", "params" : { "x" : 4.0, "y" : 0.0 } },
    { "func" : "glVertex2f", "params" : { "x" : 0.0, "y" : 0.0 } },
    { "func" : "glEnd" },

    { "func" : "glXSwapBuffers", "params" : { "dpy" : "0x1", "drawable" : "0x1" } }
   ]
}
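
Since the replayer accepts JSON directly, a trace like this can also be generated by a script. The following Python sketch builds a similar one-triangle trace using only the keys that appear in the listing above (`meta`, `sof`, `packets`, `func`, `params`, `context`, `return`). The helper names are invented for this example, and whether voglreplay accepts this exact reduced packet set is untested.

```python
import json


def packet(func, **params):
    """Build one trace packet; 'params' is omitted for calls with no arguments."""
    pkt = {"func": func}
    if params:
        pkt["params"] = params
    return pkt


def make_triangle_trace(vertices):
    """Return a one-frame trace dict that draws the given 2D vertices as triangles."""
    packets = [
        {"func": "glXCreateContext", "context": "0x0",
         "params": {"dpy": "0x1", "vis": "0x1", "shareList": "0x0", "direct": True},
         "return": "0x1"},
        {"func": "glXMakeCurrent", "context": "0x0",
         "params": {"dpy": "0x1", "drawable": "0x1", "context": "0x1"},
         "return": True},
        packet("glClearColor", red=0.25, green=0.25, blue=0.25, alpha=1.0),
        packet("glClear", mask="0x4000"),  # 0x4000 is GL_COLOR_BUFFER_BIT
        packet("glBegin", mode="GL_TRIANGLES"),
    ]
    packets += [packet("glVertex2f", x=x, y=y) for x, y in vertices]
    packets += [packet("glEnd"),
                packet("glXSwapBuffers", dpy="0x1", drawable="0x1")]
    return {"meta": {"cur_frame": 0, "eof": True},
            "sof": {"pointer_sizes": 4},
            "packets": packets}


if __name__ == "__main__":
    trace = make_triangle_trace([(0.0, 0.8), (0.8, 0.0), (0.0, 0.0)])
    print(json.dumps(trace, indent=3))
```

With no viewport or matrix calls, the vertices here are in default normalized device coordinates, so they are kept within the -1 to 1 range.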

Output from voglreplay draw_triangle.json -dump_screenshots:


OpenGL State Snapshot JSON File - Visualized as a Graph

I've been using graphviz's dot tool to visualize JSON files containing OpenGL state snapshots. This tree only contains high-level GL state. The large texture, buffer, shader, etc. data is written to loose binary files and referenced by unique filenames in the JSON data, which isn't represented here.

I had to limit each JSON array/object to at most 20 values; otherwise dot falls over and takes hours to complete. I wish I knew of a faster graph visualization tool.
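
For anyone who wants to try the same thing, here is a minimal Python sketch of a JSON-to-DOT converter with a per-node child limit. It is not the script used for the image above; `json_to_dot` and its `max_children` parameter are invented for this example.

```python
import json
from itertools import count


def json_to_dot(value, max_children=20):
    """Render parsed JSON as Graphviz DOT text, capping children per array/object."""
    ids = count()
    lines = ["digraph snapshot {", "  node [shape=box];"]

    def quote(text):
        return '"' + str(text).replace("\\", "\\\\").replace('"', '\\"') + '"'

    def visit(label, node):
        name = "n%d" % next(ids)
        if isinstance(node, dict):
            children = list(node.items())
        elif isinstance(node, list):
            children = [("[%d]" % i, item) for i, item in enumerate(node)]
        else:
            # Leaf: show the key together with its scalar value.
            leaf = "%s = %s" % (label, json.dumps(node))
            lines.append("  %s [label=%s];" % (name, quote(leaf)))
            return name
        lines.append("  %s [label=%s];" % (name, quote(label)))
        for key, child in children[:max_children]:
            child_name = visit(key, child)
            lines.append("  %s -> %s;" % (name, child_name))
        if len(children) > max_children:
            # Summarize whatever was cut so the truncation is visible in the graph.
            extra = "n%d" % next(ids)
            more = "... %d more" % (len(children) - max_children)
            lines.append("  %s [label=%s];" % (extra, quote(more)))
            lines.append("  %s -> %s;" % (name, extra))
        return name

    visit("root", value)
    lines.append("}")
    return "\n".join(lines)
```

Write the returned text to a file such as snapshot.dot and render it with `dot -Tpng snapshot.dot -o snapshot.png`.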