[Public WebGL] WebGL2 and no mapBuffer/mapBufferRange

Jeff Gilbert [email protected]
Wed Mar 18 11:35:31 PDT 2015

On Tue, Mar 17, 2015 at 3:12 PM, Zhenyao Mo <[email protected]> wrote:

> On Tue, Mar 17, 2015 at 2:39 PM, Jeff Gilbert <[email protected]>
> wrote:
> > Just warn on WRITE without INVALIDATE.
> >
> > Here's what the costs look like to me:
> >
> > MapBufferRange(READ) ~= GetBufferSubData:
> > Both require a synchronous GL command followed by a copy.
> >
> > MapBufferRange(WRITE|INVALIDATE) ~= BufferSubData
> > MapBufferRange can create a scratch shmem for writing via ArrayBuffer, send
> > it across IPC on flush, and do Map+memcpy on the GL process. (1 copy)
> > BufferSubData is at best from an ArrayBuffer which is already shmem, and is
> > then a copy-on-write (ideally no-copy), but still needs to call
> > BufferSubData or Map+memcpy on GL process. (2 copies, but only 1 copy if you
> > have a heuristic which allocates shmem to ArrayBuffers)
> This scenario is my main concern. Out-of-process-GL will have at least
> one extra copy compared with in-process-GL. Since it's likely on
> the critical rendering path, this difference will create a huge perf gap
> among implementations. IMHO, this is really bad for WebGL as a
> standard.
Yet MapBufferRange(WRITE|INVALIDATE) is generally one fewer copy than
BufferSubData, even on out-of-process-GL.

We are in the business of exposing a 'sharp tool' API. If something in
particular is slow on some platforms, tell people about it and have them
use alternate codepaths. Artificially limiting performance for
implementations because of a quirk in one browser does not seem healthy for
a performance-oriented API.

> >
> > With UNSYNCHRONIZED, MapBufferRange can be 'sharper', but potentially more
> > performant:
> > READ|UNSYNC: Still synchronous, but lets the GL process use
> > to prevent stalls.
> > WRITE|INVAL|UNSYNC: Still async, but lets the GL process use
> > to prevent stalls.
> I don't think we can allow the UNSYNCHRONIZED bit to reach the underlying
> GL. That leads to undefined behavior.
Let's leave this for a later discussion then.

> >
> > FLUSH_EXPLICIT lets out-of-process GL reduce the amount of data it needs to
> > memcpy while not having to allocate many smaller chunks with BufferSubData.
> > (Multiple discard+write ranges with the same single shmem scratch buffer)
> >
> > WRITE without INVAL is probably much slower than WRITE|INVAL even on
> > in-process-GL implementations.
> How? Assume you map, write to some, flush, write to some other,
> flush... unmap. Unless you change the ArrayBuffer semantics to
> keep track of dirty/clean states for each element, each
> flush has to write back the entire range.
There is a command for flushing subranges. With WRITE|INVAL,
out-of-process-GL would likely create a scratch buffer shmem and thus
control its contents. With buffer reuse (by the same context), clearing to
zero isn't even required. Writes get made onto this buffer, the flushed
ranges of which are copied into the eventual mapped buffer on the GL
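
A rough sketch of the scratch-buffer idea above, modelled in plain TypeScript with no real GL calls. The names ScratchMapping, flushRange and unmap are hypothetical, and a Uint8Array stands in for both the shmem scratch buffer and the eventual GL buffer. It only illustrates that the GL side needs to copy the flushed subranges and nothing else; it is not how any browser actually implements this.

```typescript
// Hypothetical model of MapBufferRange(WRITE | INVALIDATE | FLUSH_EXPLICIT)
// for an out-of-process GL implementation. Nothing here touches WebGL.

interface FlushedRange {
  offset: number; // relative to the start of the mapping
  length: number;
}

class ScratchMapping {
  // Stand-in for the shmem scratch buffer the content process writes into.
  readonly scratch: Uint8Array;
  private readonly dest: Uint8Array;
  private readonly mapOffset: number;
  private flushed: FlushedRange[] = [];

  constructor(dest: Uint8Array, mapOffset: number, length: number) {
    if (mapOffset < 0 || length < 0 || mapOffset + length > dest.length) {
      throw new RangeError("mapped range is out of bounds");
    }
    this.dest = dest;
    this.mapOffset = mapOffset;
    this.scratch = new Uint8Array(length);
  }

  // Analogue of flushing a subrange: record which bytes are valid.
  flushRange(offset: number, length: number): void {
    if (offset < 0 || length < 0 || offset + length > this.scratch.length) {
      throw new RangeError("flushed range is outside the mapping");
    }
    this.flushed.push({ offset, length });
  }

  // Analogue of unmap on the GL side: copy only the flushed ranges into
  // the destination buffer and report how many bytes were copied.
  unmap(): number {
    let copied = 0;
    for (const r of this.flushed) {
      const src = this.scratch.subarray(r.offset, r.offset + r.length);
      this.dest.set(src, this.mapOffset + r.offset);
      copied += r.length;
    }
    this.flushed = [];
    return copied;
  }
}

// Usage: map 8 bytes at offset 4 of a 16-byte buffer, write two pieces.
const glBuffer = new Uint8Array(16).fill(0xff);
const mapping = new ScratchMapping(glBuffer, 4, 8);
mapping.scratch.set([1, 2], 0);
mapping.flushRange(0, 2);
mapping.scratch.set([3, 4, 5], 5);
mapping.flushRange(5, 3);
const copiedBytes = mapping.unmap(); // only the two flushed ranges are copied
```

In this model the unflushed bytes of the mapped range simply keep their old values; with INVALIDATE a real implementation would be free to leave them undefined.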