[Public WebGL] WEBGL_texture_from_depth_video extension proposal
Mon Nov 10 22:59:33 PST 2014
Unsigned short 5-6-5 refers to a format where the size of a texel is a
short (16 bits). It encodes RGB values, with 5 bits for red, 6 bits for
green and 5 bits for blue.
Taking an unsigned short 16-bit value and uploading it to an unsigned
short 5-6-5 texture is what's commonly referred to as "packing". You pack
some larger piece of data into multiple reduced-precision channels (other
forms of packing include RGBE, packing a rendered depth into 2 bytes,
packing a normal into 2 bytes and so forth).
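To make the bit layout concrete, here is a minimal JavaScript sketch (the helper names pack565 and unpack565 are made up for illustration):

```javascript
// Pack a 16-bit depth value into 5-6-5 channels, and unpack it again.
// Hypothetical helpers, not part of any extension API.
function pack565(depth16) {
  return {
    r: (depth16 >> 11) & 0x1f, // top 5 bits
    g: (depth16 >> 5) & 0x3f,  // middle 6 bits
    b: depth16 & 0x1f          // bottom 5 bits
  };
}

function unpack565(c) {
  return (c.r << 11) | (c.g << 5) | c.b;
}

// The round-trip is lossless: 57016 -> {r:27, g:53, b:24} -> 57016
console.log(unpack565(pack565(57016))); // 57016
```

So long as nothing touches the channels in between, the packed value survives intact; the trouble starts when the GPU treats the channels as independent colors.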
Packed values are not well behaved for interpolation operations.
Interpolation in OpenGL happens at these stages:
- magnified texture lookups due to interpolation
- minified texture lookups due to mipmapping and anisotropy
- blending (when outputting packed values)
- anti-aliasing (when outputting packed values)
- alpha to coverage
In each of these cases, what the GPU tries to do is take a value it
assumes to be a single atomic piece of numerical data and mix it with
another such piece of data. To demonstrate, let's suppose you have some
arbitrary depth value of 1101111010111000 (57016). Let's average this with
another one like 0110001001001001 (25161). The average is 1010000010000000
(41088). In 5-6-5 the values are chopped into channels and each channel is
averaged separately (all arithmetic here in binary, so dividing by 10 means
dividing by two): (11011+01100)/10 = 10011, (110101+010010)/10 = 100011,
(11000+01001)/10 = 10000. If you reassemble that into a short you get
1001110001110000 (40048). You will notice that 40048 is not the average of
57016 and 25161.
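The arithmetic above can be verified with a short JavaScript sketch (avgPerChannel565 is a made-up name; the shift-based truncating division mirrors the binary division in the text):

```javascript
// Average two packed 5-6-5 values per channel, the way blending or
// linear filtering would, and compare to the true 16-bit average.
function avgPerChannel565(a, b) {
  const r = (((a >> 11) & 0x1f) + ((b >> 11) & 0x1f)) >> 1;
  const g = (((a >> 5) & 0x3f) + ((b >> 5) & 0x3f)) >> 1;
  const bl = ((a & 0x1f) + (b & 0x1f)) >> 1;
  return (r << 11) | (g << 5) | bl;
}

const a = 0b1101111010111000; // 57016
const b = 0b0110001001001001; // 25161

console.log((a + b) >> 1);           // 41088 -- the true average
console.log(avgPerChannel565(a, b)); // 40048 -- what per-channel mixing yields
```

The green channel's carry bits get lost at the channel boundaries, which is exactly why the reassembled result drifts from the true average.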
And that is why you cannot use the aforementioned operations with packed
values. Some of these operations do not matter much for the uploaded depth
data. You will not use mipmapping, because you cannot render to mipmaps in
WebGL 1.0, and gl.generateMipmap() may go through the CPU, which makes it
infeasible for video data. You will not blend these values unmodified,
because you'd likely read them out before blending. You wouldn't anti-alias
the raw values, and the same applies to alpha to coverage.
However, there is one operation that will be frequently used, which is
linear magnification interpolation. For instance, a common use case for
depth data is some kind of artsy experiment where you'll offset a mesh by
the depth value as well as color it in some way by the depth value. Both
of these would have to use nearest filtering, which can be an acceptable
choice for the mesh in case the mesh exactly matches the video resolution.
However, as a full-HD video contains 1920x1080 pixels, the resulting mesh
would be over 4 million triangles, which might be a tad on the expensive
side. If you use a far smaller mesh, you'll run into problems of aliasing,
and so it'd be desirable to average, say, 4 pixels in the depth texture to
get one depth per vertex. A cheap way to do that is to create a 960x540
mesh (a million triangles) and sample at the center between pixels. Of
course that doesn't work with 5-6-5, and so you'd have to sample the 4
surrounding pixels yourself to get an average. Likewise the fragment
shader would probably require linear interpolation for magnification for
most use cases.
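Decoding first and then averaging the 4 samples -- the order gl.LINEAR cannot give you for packed data -- can be sketched like this (decode565 and averageQuad are hypothetical helpers; in a real app this logic would live in a vertex shader):

```javascript
// Reassemble a 16-bit depth from 5-6-5 channels sampled with NEAREST.
function decode565(r5, g6, b5) {
  return (r5 << 11) | (g6 << 5) | b5;
}

// Average four neighboring texels by decoding each to its full 16-bit
// depth first, then averaging the decoded values.
function averageQuad(texels) {
  // texels: four [r, g, b] channel triples
  const sum = texels.reduce((acc, [r, g, b]) => acc + decode565(r, g, b), 0);
  return sum / texels.length;
}

// Four identical texels average to themselves:
const t = [27, 53, 24]; // the 57016 example from above
console.log(averageQuad([t, t, t, t])); // 57016
```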
As a side note, even if you sample exactly at a texel center of a
gl.LINEAR texture, for data that cannot be interpolated you will get
garbage, because interpolation might still be applied, and due to
floating-point rounding error and other precision artifacts you are rarely
sampling exactly the spot where you get no interference from nearby
values.
For these reasons, what's likely going to happen with these depth values in
practical use, is this:
- upload the depth to 5-6-5
- decode the depth to some interpolatable format
- use the depth data
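Step two of that pipeline, emulated on the CPU for illustration (in practice it would be a full-screen decode pass into a renderable texture; normalizing by 65535 is an assumption about the depth encoding):

```javascript
// CPU emulation of the decode step. Takes raw packed texels as a
// Uint16Array (each 16-bit value is the depth itself) and produces
// floats that interpolate safely.
function decodeDepthTexture(packed) {
  const out = new Float32Array(packed.length);
  for (let i = 0; i < packed.length; i++) {
    out[i] = packed[i] / 65535; // normalize to [0, 1] -- an assumed convention
  }
  return out;
}

const depths = decodeDepthTexture(new Uint16Array([0, 65535]));
console.log(depths[0], depths[1]); // 0 1
```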
It'd be a rare use case indeed in which somebody would want to work
directly with the data as-is.
On Tue, Nov 11, 2014 at 6:49 AM, Ben Adams <[email protected]> wrote:
> i.e. would this mean nearest sampling would be required where the mapping
> between screen pixels and depth camera is not 1:1
> (though I don't know if there are sampling types on a depth buffer :)
> On 11 November 2014 05:34, Ben Adams <[email protected]> wrote:
>> Would 5-6-5 cause interpolation issues? Is it an RGB or float texture?
>> On 10 November 2014 20:33, Florian Bösch <[email protected]> wrote:
>>> On Mon, Nov 10, 2014 at 8:43 PM, Kenneth Russell <[email protected]> wrote:
>>>> It'll only be efficient to upload depth videos to WebGL textures using
>>>> the internal format which avoids converting the depth values during
>>>> the upload process. That's why UNSIGNED_SHORT_5_6_5 was chosen as the
>>>> single supported format for uploading this content to WebGL 1.0. It's
>>>> not desirable for either the browser implementer or the web developer
>>>> to support uploading depth videos to lots of random texture formats if
>>>> they won't be efficient. The Media Capture group should comment on
>>>> what formats depth cameras tend to output, and are likely to output in
>>>> the future.
>>> I think it's demonstrable that conversion between formats is
>>> reasonably efficient if it can be done on-GPU, which is something that's
>>> just about getting done for <video> now.
>>> The reason I'm not in favor of fixing this to ushort 5-6-5 is that it
>>> is quite often the case that an app developer would want to use something
>>> else. So for instance, because you cannot interpolate 5-6-5 that's been
>>> bastardized to hold a single depth value, you'd then proceed to write your
>>> own framebuffer pass to decode it to, say, byte, int, float or what have
>>> you. Likewise, 5-6-5 smacks of an internal format that's liable to change
>>> with whoever's putting out the next depth capture device, and so, by that
>>> point at the latest, you'll be converting something like, say, a floating
>>> point depth TO 5-6-5, which would be more than a little ironic.