Note: All this is on the vogl wiki now: https://github.com/ValveSoftware/vogl/wiki
- We don't support LD_PRELOAD-style tracing on Optimus setups.
We do support manually loading our tracer (libvogltrace32/64.so) on Optimus, but it's not something I've had the time to test much. To do this, manually load libvogltrace and dlsym() the gliGetProcAddressRAD()function (to be renamed to voglGetProcAddress()).
- Can't take state snapshots during tracing or replaying while any buffers are currently mapped.
I'm currently working on removing this restriction during replaying (which is easy because we fully control all GL contexts during replaying), but reliably removing this limitation during tracing in all scenarios seems challenging.
- PBO (pixel pack and unpack buffers) not supported in the current github drop
This is already implemented and is being tested with Steam 10ft traces. I'll hopefully push it up by the end of the week.
- GL 4.x is not supported for full-stream or snapshotting
- Cubemap arrays not supported for snapshotting yet (but are OK for full-stream)
Here's the list of texture types we can snapshot: 1D, 2D, RECTANGLE, CUBE_MAP, 1D_ARRAY, 2D_ARRAY, 3D, 2D_MULTISAMPLE, and 2D_MULTISAMPLE_ARRAY. Incomplete textures are OK, but you'll get a warning if you haven't properly set GL_TEXTURE_MAX_LEVEL (which you most definitely should always do because not doing so is unreliable in practice).
- Abuse of GL handles+multiple contexts
Let's say you create a second context that shares with your first context. It gens a texture (handle=1), binds it on both contexts, calls glTexStorage() to initialize it, then deletes the texture on the 1st context. Everything appears as expected on the 1st context: the texture becomes auto-unbound, glIsTexture() reports false, and I can't retrieve the texture's width anymore (using glGetTexLevelParameteriv()). All nice and neat.
But on the 2nd context, the texture remains bound, glIsTexture() returns false, but I can still retrieve the texture's width. If I call glGenTextures() handle 1 gets immediately reused, even though it's still bound (as reported by glGet() on GL_TEXTURE_BINDING_2D) and even though I can retrieve texture 1's width. At this point handle 1 means two different things (!) on this specific context, which is most wonderful. If I then rebind texture handle 1 (which was just re-genned) I can no longer retrieve the width.
- Can't snapshot textures after they are deleted (but still bound elsewhere)
However, there are other scenarios (such as binding a texture to a FBO, then deleting the texture but keeping it bound to the FBO) that we don't fully support for snapshotting. This scenario may never be fully supported: the last time I tried I couldn't query state of deleted (but still bound) textures on at least one driver, and we're not going to deeply shadow all texture state to work around this. Luckily, I've only ever seen this done purposely in one app so far, and the attached texture was not actually used for rendering purposes after the deletion. (They kept it attached to keep their hands on the GPU memory so the driver wouldn't reclaim it.)
vogl will spit out an error and typically try to continue snapshotting when it encounters a handle attached to an object that has been deleted (and we've lost track of). You'll get a handle remap error, because we won't know how to remap the handle from the GL replay domain back into the trace domain. The snapshot may cause the replayer to diverge, though.
- During replaying the default (GLX) framebuffer is always 32-bit RGBA, no MSAA, with a 24/8 depth stencil buffer.
- Replay window auto-resizing can be a problem in some apps
Unlike apitrace, we only use a single replay window and resize it as needed. The auto-resize logic can get stuck resizing too much. This problem pops up most often in GLUT/FreeGLUT apps. We can capture/replay them, but the replayer's window code tends to get confused by the GLUT UI window activity. It'll still replay properly, but slowly as the replayer auto-resizes the replay window.
If the window auto-resizes too much use "-lock_window_dimensions -width X -height Y" on the voglreplay command line to lock the replay window to a fixed size.
We may switch to apitrace-style multiple windows, or maybe pbuffers, to work around this (needs investigation).
- We can't snapshot inside of glBegin/glEnd regions.
- Display list limitations
No recursion and no resources can be bound in the display list but textures. We do support around 400 API's inside of display lists. GL display lists are ancient API's at this point, so I don't think we'll do much more in this area unless a big title from the past uses them. (We do already support Doom3's usage of GL display lists, though.)
- Be careful deleting contexts that share lists with other contexts
We support tracing/replaying/snapshotting/restoring the state of multiple contexts. vogl has the concept of "root" contexts and "sharelist groups". A sharelist group is 2 or more contexts that share objects, and the first context created in this group (that doesn't, and can't, share with anything else) is marked as the "root" context for that group.
vogl can't snapshot state if the "root" context of a sharelist group is destroyed while other leaf contexts are still present. Either snapshot immediately after all the leaf contexts are destroyed, or reorder your context deletions so the root gets killed last. In 99% of cases none of this matters; most apps just delete all their contexts at once or just leak them at exit.
We've gone back and forth with always disabling program binaries by default in the tracer, but at the end of the day we take the policy of changing the app's behavior during tracing as little as possible unless you have purposely chosen to override something.
Note program binaries are usually *extremely* fragile, so traces containing program binaries may only be replayable on the exact driver version you captured them on.
To snapshot during tracing, write a file named "__trigger_capture__" to the app's current directory and the tracer will immediately take a snapshot. You can take as many snapshots as you want while tracing. (Of course, you can't have specified "--vogl_tracefile X" on your command line, which would have put the tracer into full-stream mode.) I'll better document this within a day or so, for now just search the code in vogl_intercept.cpp.
https://github.com/ValveSoftware/vogl/blob/master/glspec/gl_glx_whitelisted_funcs.txt
https://github.com/ValveSoftware/vogl/blob/master/glspec/gl_glx_simple_replay_funcs.txt
You can still try to replay this trace, but it may diverge or horribly fail. To see a more detailed whitelist, run the "voglgen" tool with the -debug option in the glspec directory.
Some of the newer GL debug related funcs aren't in the whitelist yet, I'll be adding them in very soon.
You'll get warnings if you call GetProcAddress() on GL/GLX functions that are not in the whitelist. This is typically harmless, most apps use GL extension libraries that retrieve the addresses of hundreds to thousands of GL funcs they never actually call.
- Forking while tracing
I've encountered problems with this on some apps (mostly Mono ones I think). Needs investigation, we haven't tested it.
- Try to delete your contexts when exiting
We've got several hooks in there to make sure the trace is properly flushed and closed when apps exit and leak their contexts. These hooks work most of the time, but it's best if you properly tear down your contexts when you exit.
The replayer does support unflushed traces (with no trace archive at the end), but there are no guarantees.
Also, not properly tearing down your contexts before exiting actually makes it very difficult for us to fully flush any in-progress asynchronous PBO readbacks (used for real-time JPEG capturing).
Also, not properly tearing down your contexts before exiting actually makes it very difficult for us to fully flush any in-progress asynchronous PBO readbacks (used for real-time JPEG capturing).
- UI limitations
The entire UI is still very, very new. The texture, renderbuffer, and default framebuffer viewer in particular is very basic. It has little support for viewing traces that have multiple contexts.
Peter Lohrmann is working on improving the UI. We're currently using it to help us debug the debugger itself, which is progress, but there's a bunch of work left before I would try using it to debug a title with it.
- Driver compat
- Program binary gotchas
We've gone back and forth with always disabling program binaries by default in the tracer, but at the end of the day we take the policy of changing the app's behavior during tracing as little as possible unless you have purposely chosen to override something.
Note program binaries are usually *extremely* fragile, so traces containing program binaries may only be replayable on the exact driver version you captured them on.
- Can't take a snapshotting while tracing if other threads have contexts current
- Replayer whitelist
https://github.com/ValveSoftware/vogl/blob/master/glspec/gl_glx_whitelisted_funcs.txt
https://github.com/ValveSoftware/vogl/blob/master/glspec/gl_glx_simple_replay_funcs.txt
You can still try to replay this trace, but it may diverge or horribly fail. To see a more detailed whitelist, run the "voglgen" tool with the -debug option in the glspec directory.
Some of the newer GL debug related funcs aren't in the whitelist yet, I'll be adding them in very soon.
You'll get warnings if you call GetProcAddress() on GL/GLX functions that are not in the whitelist. This is typically harmless, most apps use GL extension libraries that retrieve the addresses of hundreds to thousands of GL funcs they never actually call.
This comment has been removed by the author.
ReplyDeleteDo you plan on adding any GPU memory bandwidth profiling capabilities? This probably requires some driver support, ideally through some kind of a vendor neutral API.
ReplyDeleteI'm asking this because Dota 2 (and probably a lot of other games) seems to be memory bandwidth constrained. So for Intel/Linux combo (and probably for some other combinations) this makes the perf significantly lower than Windows/Intel/D3D. And that quite sucks, given it's one of your flagship products :)