[Public WebGL] determining depth of z-buffer

[email protected] [email protected]
Thu Feb 3 05:58:36 PST 2011

I've interspersed my responses below:

> I've used the two-pass trick and I like it.  In practice I found that
> I had to leave like 10% overlap in z to avoid the possibility of a
> visible crack, but this was around 10 years ago; maybe GPUs don't need
> so much margin anymore.
> Note that the ideal zmid value is NOT halfway between znear and zfar;
> instead you want znear/zmid == zmid/zfar, so zmid = sqrt(znear*zfar)
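The geometric-mean split quoted above can be sketched in a few lines of Python. This is only an illustration: the function name and the way the 10% overlap is spread around zmid (half on each side) are my assumptions, not something the original poster specified.

```python
import math

def split_depth_range(z_near, z_far, overlap=0.10):
    """Split [z_near, z_far] into two passes at the geometric mean.

    The overlap handling (half of it on each side of z_mid) is an
    assumption made for this sketch.
    """
    # znear/zmid == zmid/zfar  =>  zmid = sqrt(znear * zfar)
    z_mid = math.sqrt(z_near * z_far)
    near_pass = (z_near, z_mid * (1.0 + overlap / 2))
    far_pass = (z_mid * (1.0 - overlap / 2), z_far)
    return z_mid, near_pass, far_pass
```

With this split each pass covers the same zfar/znear ratio, so both passes see the same relative depth precision.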

My "Learn to love your Z buffer" page:


...has the equation for how to calculate the value that's actually written
into the Z buffer:

  z_buffer_value = (1<<N) * ( a + b / z )

  where:

     N = number of bits of Z precision
     a = zFar / ( zFar - zNear )
     b = zFar * zNear / ( zNear - zFar )
     z = distance from the eye to the object

  ...and z_buffer_value is an integer.
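
As a rough sketch, the equation above translates directly into Python. The function name and the clamp at the top of the range are my additions (the clamp just keeps z == zFar from overflowing N bits); real hardware may round differently.

```python
def z_buffer_value(z, z_near, z_far, bits=24):
    """Integer depth value for eye-space distance z, per the formula above."""
    a = z_far / (z_far - z_near)
    b = z_far * z_near / (z_near - z_far)
    max_value = (1 << bits) - 1
    # Clamp (an assumption of this sketch) so z == z_far fits in N bits.
    return min(max_value, int((1 << bits) * (a + b / z)))
```

Evaluating it at a few distances shows how quickly the integer range is used up close to zNear.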

But if zFar is MUCH larger than zNear (as is almost always the case) -
then 'a' is more or less 1.0 and b is more or less -zNear so the equation
simplifies to:

  z_buffer_value = (1<<N) * ( 1 - zNear / z )

...and (crucially) that doesn't depend on zFar!   It follows from this
that for most 'normal' applications, you might as well stick zFar out
somewhere near infinity...which is what I recommended as the solution to
the problem that kicked off this thread.
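
To see how little zFar matters, here is a small Python comparison of the full expression against the simplified one. The function names are made up for this sketch, and both return the fraction before scaling by (1<<N).

```python
def depth_fraction(z, z_near, z_far):
    """Full form: a + b / z, before scaling by (1 << N)."""
    a = z_far / (z_far - z_near)
    b = z_far * z_near / (z_near - z_far)
    return a + b / z

def depth_fraction_far_at_infinity(z, z_near):
    """Simplified form with zFar pushed out to infinity: 1 - zNear / z."""
    return 1.0 - z_near / z
```

Algebraically the full form is the simplified form multiplied by 'a', so with zFar a million times zNear the two differ by about one part in a million.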

A handy way to think about Z precision is that (with a 24 bit Z buffer),
the range at which the Z buffer is 1% accurate is Z=170,000*zNear and the
range at which it's 5% accurate is Z=1 million * zNear.
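
Those figures can be checked with a short Python sketch. It assumes the simplified formula above and takes "N% accurate" to mean that one Z-buffer step equals N% of the distance; that reading, and the function names, are my assumptions.

```python
def relative_depth_step(z, z_near, bits=24):
    """Approximate size of one Z-buffer step at distance z, as a fraction of z.

    From value = 2^N * (1 - zNear / z), one integer step covers about
    z^2 / (2^N * zNear) in distance, i.e. z / (2^N * zNear) relative to z.
    """
    return z / (z_near * (1 << bits))

def range_for_relative_step(fraction, z_near, bits=24):
    """Distance at which one Z-buffer step equals `fraction` of the distance."""
    return fraction * z_near * (1 << bits)
```

For a 24 bit buffer this gives about 168,000 * zNear for 1% and about 839,000 * zNear for 5%, in line with the rounded numbers above.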

> For the case where you have distinct bands, Steve's scaling trick is
> clever.  I suspect you can also get the same effect by playing with
> the projection matrix.  And, there is gl.depthRange to kind of do the
> same thing.  I've never used it but I think it's equivalent to playing
> with the projection matrix.  The big drawback of course is it doesn't
> help if your data is continuous throughout the z range.
> I revisited this problem in Jan 2010, and the crazy thing is, there is
> a simple hardware solution!  If, instead of writing a value based on z
> or 1/z into the depth buffer, the hardware wrote:
>   log(view_z / z_near) / log(z_far / z_near)
> you would get constant relative precision throughout the z range, and
> 16-bit z-buffers would be more or less adequate for planetary scale
> rendering!  24 bits would be enough for pretty much any imaginable
> purpose!
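
For reference, the quoted logarithmic formula and the relative precision it implies can be sketched in Python like this. The function names are hypothetical, and this only models an idealized N-bit quantization of the formula, not any real hardware.

```python
import math

def log_depth(view_z, z_near, z_far):
    """Depth in [0, 1] from the quoted formula."""
    return math.log(view_z / z_near) / math.log(z_far / z_near)

def log_depth_relative_step(z_near, z_far, bits):
    """Relative change in view_z per depth step; the same at every view_z.

    One step of 1 / 2^N in depth multiplies view_z by
    exp(log(z_far / z_near) / 2^N).
    """
    return math.expm1(math.log(z_far / z_near) / (1 << bits))
```

With z_near = 1 m and z_far = 10,000 km, 16 bits works out to a step of roughly 0.025% of the distance at every depth.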

I strongly disagree!

That solution is generally called a 'W-buffer' - and I think Direct3D
supports it, although OpenGL does not without extensions (or shader hacks
that entail writing to gl_FragDepth).

However, it is not the panacea you imagine.  Suppose you want to draw the
Earth and the space shuttle in orbit around it.  24 bits of W gives you a
precision of one part in ~16 million.  The Earth is 12,700 km in diameter -
which means that if you scale it to fit within a 24 bit W-buffer you have
a precision of roughly 1 meter.  Now try to draw a space shuttle with a
depth precision of only 1 meter!  It'll look like complete crap!!

In fact, just consider a "normal" outdoor scene.  At ground level, the
horizon is maybe 8km away - so 24 bits of W would give you (on paper)
about a half-millimeter of precision everywhere.  That might be OK...but
actually, accumulated round-off error throughout the graphics chain would
probably erode that to more like 3mm - and that's pretty nasty.
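
The arithmetic behind those two numbers is just the depth range divided by the number of representable values. A minimal Python sketch, assuming a depth buffer that is uniformly quantized over the whole range (the function name is made up):

```python
def linear_depth_step(depth_range_m, bits=24):
    """Size in meters of one step of a uniformly quantized depth buffer."""
    return depth_range_m / (1 << bits)
```

For example, linear_depth_step(12_700_000) is about 0.76 m for the Earth case, and linear_depth_step(8_000) is about 0.48 mm for the 8km horizon.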

The reason we want this funny screwed up Z-buffer format is in order to
have good precision near the camera where you can see it - and relatively
poor precision out at the horizon where you don't.

> I did some experiments at the time using WebGL and the gl_FragDepth
> feature, and it appears to work great in practice.  However,
> gl_FragDepth didn't work at all on one of my machines, and apparently
> it is no longer valid in WebGL.  The other problem is that if you used
> it, the driver would need to disable any hardware hierarchical z,
> which would be bad for performance.

Yes - that's true.  Even without hierarchical Z, most hardware will do a Z
buffer test BEFORE running the fragment shader - with huge savings when
things are hidden behind nearer objects - but if you write to gl_FragDepth
then that optimization is turned off.

> I went looking for corroboration of my results and discovered that
> Brano Kemen had recently written a couple of great blog posts on
> Gamasutra exploring the same phenomenon:
> http://www.gamasutra.com/blogs/BranoKemen/20090812/2725/Logarithmic_Depth_Buffer.php
> http://www.gamasutra.com/blogs/BranoKemen/20091231/3972/Floating_Point_Depth_Buffers.php
> His conclusion is that you can get the positive effects of a log depth
> buffer by using a floating-point depth buffer, and running the values
> backwards (1 == near, 0 == far).

Yes - but at the price of an extra divide per pixel.

> I wonder why log depth buffer wasn't written into the original OpenGL
> spec?  Did nobody discover it before Brano Kemen in 2009?  Or was it
> just too expensive to have to do a log() on every pixel?

WAY too expensive!

What we have is (essentially) a reciprocal-Z buffer instead of a
logarithmic Z.  The shapes of those curves are fairly similar - so the
reciprocal-Z approach is almost as good...and it saves one divide per
vertex and one divide per pixel - back then, a 24 bit divide circuit would
fill an entire chip and horribly limit your clock rate.

Remember, this stuff pre-dates OpenGL by quite a long way.  The original
Silicon Graphics "Geometry Engine" in the Personal-IRIS (probably the
first hardware accelerated 3D engine with a Z-buffer that you could
actually buy) did Z just like modern WebGL does.

Before that, hardware 3D didn't do Z buffering at all - the (~$1,000,000)
flight simulator graphics hardware of that era generally used
depth-sorting tricks (specifically "separating planes" - which are akin to
BSP trees) to kinda-sorta solve the ordering problem without using a depth
buffer of any kind.

Those early SGI machines used "IrisGL" - which was the progenitor of
OpenGL.  The first OpenGL implementation was running on SGI "Onyx"
hardware that was dual-purpose IrisGL/OpenGL, so the Z buffer arrangements
were the same.  It's really only in the last 10 years that doing a divide
per-pixel has been considered a "do-able thing" without massive loss of
performance.

Later SGI machines used some kind of a lookup table to somewhat linearize
Z without making it completely linear (which, as I said, is undesirable). 
It was a nice compromise between a pure Z-buffer and a pure W-buffer - but
sadly that trick hasn't made it into PC hardware.

  -- Steve
