From pub...@ Thu Jul 9 10:00:48 2020
From: pub...@ (Kevin Rogovin (...@...))
Date: Thu, 9 Jul 2020 20:00:48 +0300
Subject: [Public WebGL] WebGL2 element array buffer and copy buffer (5.1 Buffer Object Binding)
Message-ID:

Hi,

Reading the text, it appears that:

glGenBuffers(1, &index_buffer);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, index_buffer);

glBindBuffer(GL_COPY_READ_BUFFER, some_other_buffer);
glBindBuffer(GL_COPY_WRITE_BUFFER, index_buffer);
glCopyBufferSubData(GL_COPY_READ_BUFFER, GL_COPY_WRITE_BUFFER,
                    someReadOffset, someWriteOffset, someLength);

is legal: index_buffer gets the type "element array" from the
first bind, and binding it to GL_COPY_WRITE_BUFFER is legal because of
what the table states.

However, the rationale states that the restriction is there to make sure
index values are in range. In the above scenario, making sure that the
indices are in range would require a browser implementation to peek at
the buffers on the CPU, which would defeat the purpose of using
glCopyBufferSubData() on any index buffer.

What is expected to happen with the above?

Best Regards,
-Kevin

-----------------------------------------------------------
You are currently subscribed to public_webgl...@
To unsubscribe, send an email to majordomo...@ with
the following command in the body of your email:
unsubscribe public_webgl
-----------------------------------------------------------

From pub...@ Thu Jul 9 10:49:06 2020
From: pub...@ (Ken Russell (...@...))
Date: Thu, 9 Jul 2020 10:49:06 -0700
Subject: [Public WebGL] WebGL2 element array buffer and copy buffer (5.1 Buffer Object Binding)
In-Reply-To:
References:
Message-ID:

WebGL implementations that can't rely on robust buffer access behavior
shadow the contents of ELEMENT_ARRAY_BUFFERs on the CPU. The
CopyBufferSubData call updates the shadow copies as well, allowing the
maximum index per type and range within the buffer to be computed without
readback.

The restrictions in the WebGL specification are there to avoid the need
to shadow the contents of all buffers.

-Ken

--
I support flexible work schedules, and I'm sending this email now because
it is within the hours I'm working today. Please do not feel obliged to
reply straight away - I understand that you will reply during the hours
you work, which may not match mine.
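
The shadow-copy bookkeeping described in the reply above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not any browser's actual code: the `ShadowedBuffer` struct and the two helper functions are hypothetical names, and the sketch assumes both the source and the destination are index buffers that already have CPU-side shadows. The point is that mirroring the byte copy on the CPU keeps the maximum index computable without a GPU->CPU readback.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical CPU-side record for a shadowed index buffer
   (an assumption for illustration, not a real browser structure). */
typedef struct {
    uint8_t *shadow; /* CPU copy of the buffer contents */
    size_t   size;   /* size of the buffer in bytes */
} ShadowedBuffer;

/* Apply the same byte copy that the GPU-side copy performs to the
   CPU shadows. Returns 0 on success, -1 if a range is out of bounds. */
static int shadow_copy_sub_data(const ShadowedBuffer *src, ShadowedBuffer *dst,
                                size_t read_offset, size_t write_offset,
                                size_t length)
{
    if (read_offset > src->size || length > src->size - read_offset)
        return -1;
    if (write_offset > dst->size || length > dst->size - write_offset)
        return -1;
    memmove(dst->shadow + write_offset, src->shadow + read_offset, length);
    return 0;
}

/* Largest 16-bit index among `count` indices starting at index `first`,
   read from the shadow only. The caller guarantees the range is in bounds. */
static uint16_t shadow_max_index_u16(const ShadowedBuffer *buf,
                                     size_t first, size_t count)
{
    uint16_t max = 0;
    for (size_t i = 0; i < count; ++i) {
        uint16_t v;
        memcpy(&v, buf->shadow + (first + i) * sizeof v, sizeof v);
        if (v > max)
            max = v;
    }
    return max;
}
```

A draw-time check would then compare the result of `shadow_max_index_u16` for the drawn range against the number of vertices available in the bound vertex buffers.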

From pub...@ Thu Jul 9 11:00:25 2020
From: pub...@ (Kevin Rogovin (...@...))
Date: Thu, 9 Jul 2020 21:00:25 +0300
Subject: [Public WebGL] WebGL2 element array buffer and copy buffer (5.1 Buffer Object Binding)
In-Reply-To:
References:
Message-ID:

Hi,

My use case is that I am thinking of using transform feedback to
generate an index buffer. When one says "The CopyBufferSubData call
updates the shadow copies as well, allowing the maximum index per type
and range within the buffer to be computed without readback.", what I
do not follow is how the maximum index value per type is computed on
GLES3.0 without reading the buffer contents back to the CPU.

However, I guess the real question for me is this: can one expect that
robust access is implemented by the GPU (instead of by a WebGL
implementation by tracking) for Dx10 backends and those OpenGL and GLES
backends that are operating on HW with GL_ARB/KHR/EXT_robust_access?

-Kevin

From pub...@ Thu Jul 9 11:09:21 2020
From: pub...@ (Kevin Rogovin (...@...))
Date: Thu, 9 Jul 2020 21:09:21 +0300
Subject: [Public WebGL] WebGL2 element array buffer and copy buffer (5.1 Buffer Object Binding)
In-Reply-To:
References:
Message-ID:

Hmm... judging by docs/ExtensionSupport.md of ANGLE:

"
GL\_EXT\_robustness
  * reset notifications and sized queries only, no robust buffer access
"

It looks like if a browser is using ANGLE to do WebGL2, then robust
buffer access is done by software on top, rather than hw.

Any developers for various browsers able to confirm this or say the
above is wrong for their browser?

-Kevin

From pub...@ Thu Jul 9 11:10:29 2020
From: pub...@ (Ken Russell (...@...))
Date: Thu, 9 Jul 2020 11:10:29 -0700
Subject: [Public WebGL] WebGL2 element array buffer and copy buffer (5.1 Buffer Object Binding)
In-Reply-To:
References:
Message-ID:

GPU-side creation of index buffers is the one thing that's structurally
forbidden by WebGL's rules. It has to be possible for implementations to
always know the contents of index buffers without doing GPU->CPU readbacks.

Your use case sounds interesting. Can you share any more details?
Perhaps you could generate independent triangles into a vertex buffer
instead, so that DrawArrays could be used on the results?

-Ken

From pub...@ Thu Jul 9 11:23:31 2020
From: pub...@ (Kevin Rogovin (...@...))
Date: Thu, 9 Jul 2020 21:23:31 +0300
Subject: [Public WebGL] WebGL2 element array buffer and copy buffer (5.1 Buffer Object Binding)
In-Reply-To:
References:
Message-ID:

I can do what I plan to do without needing to generate an index
buffer, but it is not optimal. Basically, if I cannot generate an
index buffer, then I call glDrawElements(GL_TRIANGLES, ..) under
transform feedback and do glDrawArrays(GL_TRIANGLES, ) with the
transform feedback output. If I can get transform feedback to generate
an index buffer, then I can do glDrawElements(GL_POINTS,..) under
transform feedback and call glDrawElements(GL_TRIANGLES, ) to render
the content. The upshot is that the post-vertex cache is only in use
when I can get transform feedback to generate the index buffer.

-Kevin
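
The difference between the two draw paths in the message above can be illustrated with a toy model of a post-vertex cache. The sketch below is an assumption-laden illustration, not a description of any real GPU: it models the cache as a 16-entry FIFO keyed by index value (`CACHE_SLOTS` and both function names are hypothetical), and counts how many vertices would need shading for an indexed draw versus a non-indexed one.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 16 /* illustrative size only; real hardware varies */

/* Count vertex-shader invocations for an indexed draw, modelling the
   post-vertex cache as a simple FIFO of recently shaded indices. */
static size_t shaded_vertices_indexed(const uint32_t *indices, size_t count)
{
    uint32_t cache[CACHE_SLOTS];
    size_t used = 0, next = 0, misses = 0;

    for (size_t i = 0; i < count; ++i) {
        int hit = 0;
        for (size_t s = 0; s < used; ++s) {
            if (cache[s] == indices[i]) {
                hit = 1;
                break;
            }
        }
        if (!hit) {
            /* Miss: shade the vertex and overwrite the oldest slot. */
            cache[next] = indices[i];
            next = (next + 1) % CACHE_SLOTS;
            if (used < CACHE_SLOTS)
                ++used;
            ++misses;
        }
    }
    return misses;
}

/* A non-indexed draw has no index to key a cache on, so in this
   model every vertex is shaded. */
static size_t shaded_vertices_arrays(size_t vertex_count)
{
    return vertex_count;
}
```

For two triangles sharing an edge, with indices 0, 1, 2, 2, 1, 3, this model shades four vertices in the indexed case and six in the non-indexed case.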

From pub...@ Thu Jul 9 12:14:10 2020
From: pub...@ (Ken Russell (...@...))
Date: Thu, 9 Jul 2020 12:14:10 -0700
Subject: [Public WebGL] WebGL2 element array buffer and copy buffer (5.1 Buffer Object Binding)
In-Reply-To:
References:
Message-ID:

Thanks for the details. Please keep us posted on your progress, and share
any demos when you have them.

BTW, I'm 99% sure that ANGLE delegates to D3D's or EXT/ARB/KHR_robustness'
out-of-range behavior when it's available, and that
docs/ExtensionSupport.md is just out of date.

-Ken

From pub...@ Fri Jul 10 12:28:31 2020
From: pub...@ (Geoff Lang (...@...))
Date: Fri, 10 Jul 2020 15:28:31 -0400
Subject: [Public WebGL] WebGL2 element array buffer and copy buffer (5.1 Buffer Object Binding)
In-Reply-To:
References:
Message-ID:

ANGLE does use the GPU for robust buffer access when using the D3D11
backend (most Windows users). D3D9 always validates index ranges on the
CPU. We also rely on the Vulkan or OpenGL extensions to do it when
available on other platforms.

docs/ExtensionSupport.md is not maintained; I will make an action item to
remove it.

From pub...@ Mon Jul 20 05:50:14 2020
From: pub...@ (Kevin Rogovin (...@...))
Date: Mon, 20 Jul 2020 15:50:14 +0300
Subject: [Public WebGL] WebGL2 element array buffer and copy buffer (5.1 Buffer Object Binding)
Message-ID:

Hi,

Hopefully this thread necromancy still lands on the correct thread.

> ANGLE does use the GPU for robust buffer access when using the D3D11
> backend (most windows users). D3D9 always validates index ranges on the
> CPU. We also rely on the Vulkan or OpenGL extensions to do it when
> available on other platforms.

What do people think of having an extension for WebGL2 that removes the
ELEMENT_ARRAY_BUFFER binding restrictions and gives an application the
assurance that robust access is used for fetching vertices via an index
buffer, i.e. the implementation won't induce a CPU inspection of an index
buffer?

I can draft the extension, but I would like to test the waters on this
first.

My use case, which a follow-up question on a different thread will address
in more detail, is GPU-generated index buffers. Right now, I have scenes
where instancing is not sufficient but I manage to get the GPU to generate
my vertex buffers entirely. I have a scene of 1.5 million vertices; for
these loads the actual vertex count is 1 million and the index count is
1.5 million. Getting the index buffer generated by the GPU is a big
performance gain for these kinds of scenes (something like 3N ms/frame
without an index buffer and 2N ms/frame with an index buffer) for some
value of N.

Best Regards,
-Kevin Rogovin

From pub...@ Mon Jul 20 06:39:07 2020
From: pub...@ (Kevin Rogovin (...@...))
Date: Mon, 20 Jul 2020 16:39:07 +0300
Subject: [Public WebGL] glReadPixels (to buffer object) and endianness
Message-ID:

Hi all,

This is a question about what behaviour is guaranteed with respect to
endianness. The situation: I plan to generate a LARGE index buffer, and
using transform feedback is not suitable. The plan is to essentially
rasterize the indices. For desktop GL, this is easy: the render target
would be GL_R32UI and glReadPixels would be passed GL_RED_INTEGER with
GL_UNSIGNED_INT. However, GLES3 and thus WebGL2 do not allow that combo in
glReadPixels(); reading from such a buffer would require GL_RGBA_INTEGER,
which would make one want to read the index buffer with a stride of 4.
What I'd like to do is to rasterize to a GL_RGBA8 fixed point buffer, do
the right thing to convert each 8-bit chunk of the index into a component
of a vec4, and then call glReadPixels with GL_RGBA, GL_UNSIGNED_BYTE.
(Note that because of the rules associated with GL_ELEMENT_ARRAY_BUFFER,
the glReadPixels will write to a staging buffer which is then copied to
the index buffer.) The question is: will the "bit-casting" of the
RGBA8-tuple data to GL_UNSIGNED_INT be platform independent? I would
really like to avoid reading the 32-bit values as (GL_RGBA_INTEGER,
GL_UNSIGNED_INT) and issuing transform feedback, as that adds a lot more
data copying and bandwidth.
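
To make the byte-order concern above concrete, here is a minimal C sketch (not from the thread) of what happens when four RGBA8 bytes, laid out in memory as R, G, B, A, are later consumed as a single 32-bit index. It assumes the index data is interpreted in the host's byte order and that the host is either little- or big-endian. The helper names pack_index_rgba8, read_as_host_uint, host_is_little_endian, roundtrips and check_byte_order_assumption are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: lay out one index as the four bytes R, G, B, A.
 * msb_in_r != 0 puts the most significant byte in R (as in the proposed
 * shader); msb_in_r == 0 puts the least significant byte in R. */
static void pack_index_rgba8(uint32_t index, int msb_in_r, uint8_t out[4])
{
    for (int i = 0; i < 4; ++i) {
        int shift = msb_in_r ? (24 - 8 * i) : (8 * i);
        out[i] = (uint8_t)((index >> shift) & 0xFFu);
    }
}

/* Reinterpret four consecutive bytes as one 32-bit value in host byte
 * order, which is the assumption made here about how the bytes would be
 * read back as GL_UNSIGNED_INT indices. */
static uint32_t read_as_host_uint(const uint8_t bytes[4])
{
    uint32_t value;
    memcpy(&value, bytes, sizeof value);
    return value;
}

/* Returns 1 on a little-endian host, 0 otherwise. */
static int host_is_little_endian(void)
{
    const uint32_t probe = 1u;
    uint8_t first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1u;
}

/* Returns 1 if packing and then reinterpreting yields the same index. */
static int roundtrips(uint32_t index, int msb_in_r)
{
    uint8_t pixel[4];
    pack_index_rgba8(index, msb_in_r, pixel);
    return read_as_host_uint(pixel) == index;
}

/* The packing order only round-trips when it matches the host byte order:
 * LSB in R on little-endian hosts, MSB in R on big-endian hosts. */
static void check_byte_order_assumption(void)
{
    int msb_first_needed = !host_is_little_endian();
    assert(roundtrips(0x01020304u, msb_first_needed));
    assert(!roundtrips(0x01020304u, !msb_first_needed));
}
```

Under these assumptions, putting the MSB in R only survives the reinterpretation on a big-endian host, so a packing order fixed in the shader is tied to one byte order unless it is chosen per host.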

Best Regards,
-Kevin Rogovin

From pub...@ Mon Jul 20 11:56:23 2020
From: pub...@ (Jeff Gilbert (...@...))
Date: Mon, 20 Jul 2020 11:56:23 -0700
Subject: [Public WebGL] glReadPixels (to buffer object) and endianness
In-Reply-To:
References:
Message-ID:

FWIW, ES3 and WebGL2 potentially do support ReadPixels with
GL_UNSIGNED_INT via GL_IMPLEMENTATION_COLOR_READ_FORMAT/TYPE, which you
can query for your current READ_FRAMEBUFFER/ReadBuffer (though this
support is driver-dependent).

If I understand your last question, it would work. E.g. you can pass your
MSB in R (LSB in A) for RGBA/UNSIGNED_BYTE, then reassemble it as
`(R * 0xff) << 24 | ...` in the shader.

From pub...@ Mon Jul 20 12:00:39 2020
From: pub...@ (Jeff Gilbert (...@...))
Date: Mon, 20 Jul 2020 12:00:39 -0700
Subject: [Public WebGL] WebGL2 element array buffer and copy buffer (5.1 Buffer Object Binding)
In-Reply-To:
References:
Message-ID:

We would need to check whether we do indeed have RBAB (robust buffer
access behavior) everywhere we have WebGL2. Otherwise, we'd need more info
on how this enables compelling workloads, to offset the downside of apps
accidentally not working on a number of older desktop and mobile drivers.

Worth considering for your use case (complicated vertex fetch) is
vertex-pulling, which should be possible with vanilla WebGL 2.

From pub...@ Mon Jul 20 13:04:01 2020
From: pub...@ (Kevin Rogovin (...@...))
Date: Mon, 20 Jul 2020 23:04:01 +0300
Subject: [Public WebGL] glReadPixels (to buffer object) and endianness
In-Reply-To:
References:
Message-ID:

Hi,

My use case is the reverse.
A fragment shader does:

    #version 300 es
    precision highp float;
    precision highp int;

    flat in uint varying_idx;  // integer varyings must be flat
    out vec4 out_value;

    void main(void)
    {
        uvec4 raw;

        // most significant byte in R, least significant byte in A
        raw.r = varying_idx >> 24u;
        raw.g = (varying_idx >> 16u) & 0xFFu;
        raw.b = (varying_idx >> 8u) & 0xFFu;
        raw.a = varying_idx & 0xFFu;

        // add a little fudge to make sure...
        out_value = (vec4(raw) + vec4(0.1)) / 255.0;
    }

and then the GL calls are something like:

    glBindBuffer(GL_PIXEL_PACK_BUFFER, m_index_bo);
    glReadPixels(0, 0, w, h, GL_RGBA, GL_UNSIGNED_BYTE, NULL);

which gives me an index buffer generated by the GPU. The catch is that
this assumes that byte 0 corresponds to the highest 8 bits and so on,
which appears to me to be a very endian-specific thing. This is the heart
of the question: is portability actually guaranteed for this use (or
really abuse)?

The only way I see to make it perfectly portable would be to make the
fragment shader render to a GL_R32UI buffer, and then do:

    // this will make what we want strided by 4, essentially
    glBindBuffer(GL_PIXEL_PACK_BUFFER, m_staging_buffer);
    glReadPixels(0, 0, w, h, GL_RGBA_INTEGER, GL_UNSIGNED_INT, NULL);

    // undo the stride: the shader is a simple vertex shader that just
    // echoes a single scalar uint attribute, and the VAO is set with
    // glVertexAttribIPointer(0, 1, GL_UNSIGNED_INT, 4 * sizeof(GLuint), NULL);
    // (assuming m_xf is a transform feedback object)
    glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, m_xf);
    glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, m_index_buffer);
    glBeginTransformFeedback(GL_POINTS);
    glDrawArrays(GL_POINTS, 0, number_of_indices_to_capture);
    glEndTransformFeedback();

This is awful because it uses 8 or 12 times as much bandwidth as a direct
read would (that is, if GLES3/WebGL2 guaranteed support for
glReadPixels(0, 0, w, h, GL_RED_INTEGER, GL_UNSIGNED_INT, NULL)).

-Kevin

From pub...@ Mon Jul 20 13:17:14 2020
From: pub...@ (Kevin Rogovin (...@...))
Date: Mon, 20 Jul 2020 23:17:14 +0300
Subject: [Public WebGL] WebGL2 element array buffer and copy buffer (5.1 Buffer Object Binding)
In-Reply-To:
References:
Message-ID:

Just to be clear: I am advocating this extension for WebGL2 ONLY (not
WebGL 1), which on MS-Windows means mapping to D3D10/D3D11/D3D12, which
have robustness already anyway. For Vulkan and OpenGL, robust access is
there already on desktop; mobile is messier (and one can argue that my
use case below is pointless because mobile is always shared memory, so
CPU-GPU traffic is not much of a thing... but it still matters because of
the wonkiness of drivers, mostly kernel side, in allocating the memory for
the buffer objects to which to stream). The extension would have to be
requested at context creation (there's the iffy "every extension please"
issue, but there are already extensions that have that nature too, for
example rendering to a floating point buffer).

The goal is to have the GPU generate an index buffer instead of the CPU
generating it and sending it to the GPU every frame. Just by using/abusing
render to texture and sampling from a texture in the vertex shader
(together with attributeless rendering), I have use cases that doubled and
even tripled performance compared to playing buffer object ouija board
with glBufferData, glBufferSubData, pools and mucking with sizes. The
benefit (for my loads) of a GPU-generated index buffer is an additional
33% performance advantage (i.e. something that is 24 ms/frame would then
be like 16 ms/frame). However, that is pointless to do if the WebGL2
implementation needs to snoop the index buffer anyway; in that case I am
much better off doing non-indexed draw calls. So, I'd really like to know
that there is no snoop, or have an extension that guarantees it. The
extension would also save me the bandwidth of copying the generated
buffer into an index buffer; the two changes make sense together. After
all, if the CPU must snoop, there is (nearly) zero point in doing
GPU-generated index buffers.

So the extension question: is this an extension worth the time to draft?

Best Regards,
-Kevin

From pub...@ Mon Jul 20 13:39:09 2020
From: pub...@ (Jeff Gilbert (...@...))
Date: Mon, 20 Jul 2020 13:39:09 -0700
Subject: [Public WebGL] glReadPixels (to buffer object) and endianness
In-Reply-To:
References:
Message-ID:

Well, you can use ReadPixels(GL_RED_INTEGER, GL_UNSIGNED_INT) where
available, and otherwise change your uint->RGBA packing shader based on
host endianness.

I think you should strongly consider vertex-pulling in your case,
bypassing the fixed-function index fetch.

From pub...@ Mon Jul 20 13:43:58 2020
From: pub...@ (Jeff Gilbert (...@...))
Date: Mon, 20 Jul 2020 13:43:58 -0700
Subject: [Public WebGL] WebGL2 element array buffer and copy buffer (5.1 Buffer Object Binding)
In-Reply-To:
References:
Message-ID:

I would try vertex pulling or other index-buffer-data caching mechanisms
before making the case for this extension, given the portability concerns
I mentioned.

From pub...@ Mon Jul 20 14:55:41 2020
From: pub...@ (Kevin Rogovin (...@...))
Date: Tue, 21 Jul 2020 00:55:41 +0300
Subject: [Public WebGL] WebGL2 element array buffer and copy buffer (5.1 Buffer Object Binding)
In-Reply-To:
References:
Message-ID:

Hi,

Like I said, I can make it work without the extension, but with the
extension I am looking at a 33% performance gain, which is a big deal.

I confess I do not really follow the portability objection: I'd check for
the extension and, if it existed, would use buffers bound to
GL_ELEMENT_ARRAY_BUFFER more freely, and if not, then not (and just lose
the performance gain), just as applications do with
EXT_color_buffer_float.

-Kevin
The extension would > > also allow me to save an additional load of bandwidth copying the > > buffer made to an index buffer all in one swoop that makes sense > > together. After all, if the CPU must snoop, there is zero point > > (nearly) for doing GPU generated index buffers. > > > > So the extension question: is this an extension worth the time to draft? > > > > Best Regards, > > -Kevin > > > > > > > > On Mon, Jul 20, 2020 at 10:01 PM Jeff Gilbert (jgilbert...@) > > wrote: > > > > > > > > > We would need to check whether we do indeed have RBAB everywhere we > > > have WebGL2. Otherwise, we'd need more info on how this enables > > > compelling workloads, to offset the downside of Apps accidentally not > > > working on a number of older desktop and mobile drivers. > > > > > > Worth considering for your usecase (complicated vertex fetch) is > > > vertex-pulling, which should be possible with vanilla WebGL 2. > > > > > > On Mon, Jul 20, 2020 at 5:51 AM Kevin Rogovin > > > (kevinrogovin...@) wrote: > > > > > > > > Hi, > > > > > > > > Hopefully this thread necromancy will live on the correct thread still. > > > > > > > > > ANGLE does use the GPU for robust buffer access when using the D3D11 backend (most windows users). D3D9 always validates index ranges on the CPU. We also rely on the Vulkan or OpenGL extensions to do it when available on other platforms. > > > > > > > > What do people think of having an extension for WebGL2 that: removes the restriction jazz of ELEMENT_ARRAY_BUFFER and gives an assurance to an application that robust access is used for fetching vertices via an index buffer, i.e. the implementation won't induce a CPU inspect of an index buffer. > > > > > > > > I can draft the extension, but I would like to feel the water on this. > > > > > > > > My use case, which a follow up question on a different thread will address in more detail, is for GPU generated index buffers. 
Right now, I have scenes where instancing is not sufficient but I manage to get the GPU to generate my vertex buffers entirely. I have a scene of 1.5 million vertices; for these loads the actual vertex count is 1 million and the index count is 1.5 million.; getting the index buffer generated by GPU is a big performance gain for these kinds of scenes (something like 3N ms/frame without index buffer and 2N ms/frame with index buffer) for some value of N. > > > > > > > > Best Regards, > > > > -Kevin Rogovin > > > > > > ----------------------------------------------------------- > > > You are currently subscribed to public_webgl...@ > > > To unsubscribe, send an email to majordomo...@ with > > > the following command in the body of your email: > > > unsubscribe public_webgl > > > ----------------------------------------------------------- > > > > > > > ----------------------------------------------------------- > > You are currently subscribed to public_webgl...@ > > To unsubscribe, send an email to majordomo...@ with > > the following command in the body of your email: > > unsubscribe public_webgl > > ----------------------------------------------------------- > > > > ----------------------------------------------------------- > You are currently subscribed to public_webgl...@ > To unsubscribe, send an email to majordomo...@ with > the following command in the body of your email: > unsubscribe public_webgl > ----------------------------------------------------------- > ----------------------------------------------------------- You are currently subscribed to public_webgl...@ To unsubscribe, send an email to majordomo...@ with the following command in the body of your email: unsubscribe public_webgl ----------------------------------------------------------- From pub...@ Mon Jul 20 15:57:26 2020 From: pub...@ (Kevin Rogovin (...@...)) Date: Tue, 21 Jul 2020 01:57:26 +0300 Subject: [Public WebGL] glReadPixels (to buffer object) and endianness In-Reply-To: References: 
Message-ID:

Hi,

I guess I did not explain the use case adequately: I get a 33%
performance improvement from using an index buffer because of the
post-vertex-shader cache. 33% is nothing to sneeze at.

It sounds like the answer is what I had assumed but did not want to be
true: endianness checking is required if
GL_IMPLEMENTATION_COLOR_READ_FORMAT/TYPE for a GL_R32UI buffer is not
GL_RED_INTEGER/GL_UNSIGNED_INT.

-Kevin

On Mon, Jul 20, 2020 at 11:40 PM Jeff Gilbert (jgilbert...@) wrote:
>
> Well you can use ReadPixels(GL_RED_INTEGER, GL_UNSIGNED_INT) where
> available, otherwise change your uint->RGBA packing shader based on
> host endianness.
>
> I think you should strongly consider vertex-pulling in your case,
> bypassing the fixed-function index fetch.
>
> On Mon, Jul 20, 2020 at 1:05 PM Kevin Rogovin
> (kevinrogovin...@) wrote:
> >
> > Hi,
> >
> > My use case is the reverse. A fragment shader does:
> >
> >    // integer fragment inputs must be flat in GLSL ES 3.00
> >    flat in uint varying_idx;
> >    out vec4 out_value;
> >    void main(void)
> >    {
> >      uvec4 raw;
> >
> >      // most significant byte goes to R, least significant to A
> >      raw.r = varying_idx >> 24u;
> >      raw.g = (varying_idx >> 16u) & 0xFFu;
> >      raw.b = (varying_idx >> 8u) & 0xFFu;
> >      raw.a = varying_idx & 0xFFu;
> >
> >      // add a little fudge to make sure...
> >      out_value = (vec4(raw) + vec4(0.1)) / 255.0;
> >    }
> >
> > and then the GL calls are something like:
> >
> >    glBindBuffer(GL_PIXEL_PACK_BUFFER, m_index_bo);
> >    glReadPixels(0, 0, w, h, GL_RGBA, GL_UNSIGNED_BYTE, NULL);
> >
> > which gives me an index buffer generated by the GPU. The catch is that
> > this assumes that byte 0 corresponds to the highest 8 bits and so
> > on, which appears to me to be a very endian-specific thing. This is the
> > heart of the question: is portability actually guaranteed for this
> > use (or really abuse)?
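[Editorial note] The "fudge" in the quoted shader can be checked on the CPU. The sketch below emulates writing the shader's output to an 8-bit normalized channel, once with round-to-nearest and once with truncation; which of the two a given GPU uses is an assumption here, not something stated in the thread. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Emulates storing a float colour channel into an RGBA8 (unorm8) target.
 * round_to_nearest != 0 models round-to-nearest conversion,
 * round_to_nearest == 0 models truncation. */
static uint8_t float_to_unorm8(float f, int round_to_nearest)
{
    if (f < 0.0f) f = 0.0f;
    if (f > 1.0f) f = 1.0f;
    float scaled = f * 255.0f;
    if (round_to_nearest) scaled += 0.5f;
    return (uint8_t)(uint32_t)scaled; /* the cast truncates toward zero */
}

/* The per-channel encoding used by the quoted shader: (byte + 0.1) / 255.0 */
static float encode_channel(uint32_t byte_value)
{
    return ((float)byte_value + 0.1f) / 255.0f;
}

/* Returns 1 if every byte value 0..255 survives the encode/store round trip. */
static int fudge_round_trips(int round_to_nearest)
{
    for (uint32_t v = 0; v < 256; ++v) {
        if (float_to_unorm8(encode_channel(v), round_to_nearest) != v)
            return 0;
    }
    return 1;
}
```

Calling fudge_round_trips(0) and fudge_round_trips(1) covers both conversion models; the 0.1 offset keeps each value safely inside its own bucket in either case.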
> >
> > The only way I see to make it perfectly portable would be to make the
> > fragment shader render to a GL_R32UI buffer, and then do:
> >
> >    // this gives us what we want, but with a stride of 4 uints
> >    glBindBuffer(GL_PIXEL_PACK_BUFFER, m_staging_buffer);
> >    glReadPixels(0, 0, w, h, GL_RGBA_INTEGER, GL_UNSIGNED_INT, NULL);
> >
> >    // undo the stride
> >    // shader is a simple vertex shader that just echoes a single
> >    // scalar uint attribute, and the VAO is set up with:
> >    glBindBuffer(GL_ARRAY_BUFFER, m_staging_buffer);
> >    glVertexAttribIPointer(0, 1, GL_UNSIGNED_INT, 4 * sizeof(GLuint), NULL);
> >    glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, m_xf);
> >    glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, m_index_buffer);
> >    glBeginTransformFeedback(GL_POINTS);
> >    glDrawArrays(GL_POINTS, 0, number_of_indices_to_capture);
> >    glEndTransformFeedback();
> >
> > This is awful because it uses 8 or 12 times as much bandwidth as a
> > direct read would (which is what I would get if GLES3/WebGL2
> > guaranteed support for
> > glReadPixels(0, 0, w, h, GL_RED_INTEGER, GL_UNSIGNED_INT, NULL)).
> >
> > -Kevin
> >
> > On Mon, Jul 20, 2020 at 9:57 PM Jeff Gilbert (jgilbert...@)
> > wrote:
> > >
> > > FWIW, ES3 and WebGL2 potentially do support ReadPixels with
> > > GL_UNSIGNED_INT via GL_IMPLEMENTATION_COLOR_READ_FORMAT/TYPE, which
> > > you can query for your current READ_FRAMEBUFFER/ReadBuffer (though
> > > this support is driver-dependent).
> > >
> > > If I understand your last question, it would work. E.g. you can pass
> > > your MSB in R (LSB in A) for RGBA/UNSIGNED_BYTE, then reassemble it as
> > > `(R * 0xff) << 24 | ...` in the shader.
> > >
> > > On Mon, Jul 20, 2020 at 6:40 AM Kevin Rogovin
> > > (kevinrogovin...@) wrote:
> > > >
> > > > Hi all,
> > > >
> > > > This is a question on what the guaranteed behaviour is with
> > > > respect to endianness. The situation: I plan to generate a LARGE
> > > > index buffer and using transform feedback is not suitable. The
> > > > plan is to essentially rasterize the indices. For desktop GL, this
> > > > is easy: the render target would be GL_R32UI and glReadPixels
> > > > would be passed GL_RED_INTEGER with GL_UNSIGNED_INT. However,
> > > > GLES3 and thus WebGL2 do not allow that combo in glReadPixels();
> > > > indeed, reading from such a buffer would require GL_RGBA_INTEGER,
> > > > which would mean reading the index buffer with a stride of 4.
> > > > What I'd like to do is to rasterize to a GL_RGBA8 fixed point
> > > > buffer, do the right thing to convert each 8-bit chunk into a
> > > > vec4 tuple and then call glReadPixels with GL_RGBA,
> > > > GL_UNSIGNED_BYTE. (Note that because of the rules associated with
> > > > GL_ELEMENT_ARRAY_BUFFER, the glReadPixels will write to a staging
> > > > buffer which is then copied to the index buffer.) The question is:
> > > > will the "bit-casting" of the RGBA8-tuple data to GL_UNSIGNED_INT
> > > > be platform independent? I would really like to avoid the idea of
> > > > reading the 32-bit values as (GL_RGBA_INTEGER, GL_UNSIGNED_BYTE)
> > > > and issuing transform feedback, as that adds a lot more data copy
> > > > and bandwidth.
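[Editorial note] The byte-order concern raised in this question can be reproduced without a GPU. The sketch below is a CPU-side illustration, assuming that four RGBA8 bytes written by glReadPixels are later interpreted as one GL_UNSIGNED_INT index in the host's native byte order. The type and function names (Rgba8, pack_msb_in_r, and so on) are hypothetical and do not come from the thread or from any GL header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bytes in the order GL_RGBA / GL_UNSIGNED_BYTE stores them per pixel. */
typedef struct { uint8_t r, g, b, a; } Rgba8;

/* Packing used by the shader in the thread: most significant byte in R. */
static Rgba8 pack_msb_in_r(uint32_t index)
{
    Rgba8 p = { (uint8_t)(index >> 24), (uint8_t)(index >> 16),
                (uint8_t)(index >> 8),  (uint8_t)index };
    return p;
}

/* Mirrored packing: least significant byte in R. */
static Rgba8 pack_lsb_in_r(uint32_t index)
{
    Rgba8 p = { (uint8_t)index,         (uint8_t)(index >> 8),
                (uint8_t)(index >> 16), (uint8_t)(index >> 24) };
    return p;
}

/* Reinterprets the four bytes as a native 32-bit unsigned integer,
 * which is the assumed behaviour of the index fetch. */
static uint32_t reinterpret_as_uint(Rgba8 p)
{
    uint8_t bytes[4] = { p.r, p.g, p.b, p.a };
    uint32_t value;
    memcpy(&value, bytes, sizeof value);
    return value;
}

/* Returns 1 on a little-endian host, 0 on a big-endian host. */
static int host_is_little_endian(void)
{
    const uint32_t one = 1;
    uint8_t first_byte;
    memcpy(&first_byte, &one, 1);
    return first_byte == 1;
}

/* Chooses the packing that round-trips on this host, in the spirit of
 * selecting a different packing shader per host endianness. */
static Rgba8 pack_for_host(uint32_t index)
{
    return host_is_little_endian() ? pack_lsb_in_r(index)
                                   : pack_msb_in_r(index);
}
```

On a little-endian host, reinterpret_as_uint(pack_msb_in_r(0x01020304)) comes back byte-swapped, while reinterpret_as_uint(pack_for_host(i)) returns i on either kind of host. That is the reason a fixed MSB-in-R shader is not portable by itself.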
> > > >
> > > > Best Regards,
> > > > -Kevin Rogovin

From pub...@ Tue Jul 21 17:53:45 2020
From: pub...@ (Ken Russell (...@...))
Date: Tue, 21 Jul 2020 17:53:45 -0700
Subject: [Public WebGL] WebGL2 element array buffer and copy buffer (5.1 Buffer Object Binding)
In-Reply-To:
References:
Message-ID:

WebGL's API structure and restrictions have been designed from
the beginning to enable maximum portability of content. This is one
area that has the potential to prevent content from running on a
significant fraction of devices.

As a concrete example, Apple and Google are collaborating to upgrade
WebKit's WebGL 2.0 implementation, and iOS' OpenGL ES driver does not
advertise support for robust buffer access behavior. It may in fact
have that behavior under the hood - we'd need to test with some
content - but if not, and if Safari on iOS had WebGL 2.0 support, I'm
pretty sure you wouldn't want your content to run everywhere except
there.

Could you please share the source code of some example which can
demonstrate the performance difference? Feel free to write the
"before" and "after" cases, with the expectation that the "after" case
won't work. I can help you measure its performance on a browser that
lifts the restrictions on element array buffers' usage. If you'd
propose it as a pull request under sdk/demos/ on
https://github.com/KhronosGroup/WebGL, even better.

(I fully recognize that there's likely a large performance gain to be
had here, but it's important to motivate this to the community and not
just develop an extension for one particular customer's use case.)

Thanks much,

-Ken
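[Editorial note] For context on what "not having robust buffer access" costs: the thread states that ANGLE's D3D9 backend always validates index ranges on the CPU. The sketch below shows the general shape of such a check against a CPU-side copy of the index data. It is an illustration only, not code from any browser or from ANGLE, and all names in it are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Largest index in indices[first .. first + count), or 0 if count is 0. */
static uint32_t max_index_u32(const uint32_t *indices, size_t first, size_t count)
{
    uint32_t max = 0;
    for (size_t i = 0; i < count; ++i) {
        if (indices[first + i] > max)
            max = indices[first + i];
    }
    return max;
}

/* Returns 1 if every index used by the draw refers to an existing vertex.
 * shadow_copy stands for a CPU-visible copy of the element array buffer. */
static int draw_is_in_range(const uint32_t *shadow_copy, size_t first,
                            size_t count, uint32_t vertex_count)
{
    if (count == 0)
        return 1;
    return max_index_u32(shadow_copy, first, count) < vertex_count;
}
```

The scan is linear in the number of indices and needs the data on the CPU, which is why an index buffer produced purely on the GPU would force a readback under this scheme. A real implementation would presumably cache the maximum per range rather than rescan on every draw.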
From pub...@ Wed Jul 22 00:32:08 2020
From: pub...@ (Kevin Rogovin (...@...))
Date: Wed, 22 Jul 2020 10:32:08 +0300
Subject: [Public WebGL] WebGL2 element array buffer and copy buffer (5.1 Buffer Object Binding)
In-Reply-To:
References:
Message-ID:

Hi,

I *really* cannot share the source code for the project; I can
generate a toy example, but the toy will be terribly artificial. Due to
the complexity this is going to take a few days; I will need to beg
my bosses for that time.

My aim is to check for the extension that would allow copies directly
to an index buffer and, if it is not present, just assume that robust
access is also not present on the device and skip the index buffer
completely, to achieve compatibility with such devices (and avoid the
danger that the CPU might snoop the GPU buffer).

Best Regards,
-Kevin
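[Editorial note] The fallback described in this message (use the extension when present, otherwise skip the index buffer) can be sketched as below. No such extension exists; the name WEBGL_hypothetical_unrestricted_element_array is invented purely for illustration, as are the helper and enum names. In WebGL itself the check would go through getExtension(); the C version here works on a space-separated extension list to stay consistent with the GL calls quoted elsewhere in the thread.

```c
#include <assert.h>
#include <string.h>

/* HYPOTHETICAL extension name, invented for this sketch. */
#define HYPOTHETICAL_EXT "WEBGL_hypothetical_unrestricted_element_array"

typedef enum {
    DRAW_INDEXED_GPU_GENERATED, /* GPU writes the index buffer directly */
    DRAW_NON_INDEXED            /* portable fallback, no index buffer   */
} DrawPath;

/* Returns 1 if name appears as a whole space-separated token in list. */
static int has_extension(const char *list, const char *name)
{
    size_t len = strlen(name);
    if (len == 0)
        return 0;
    const char *p = list;
    while ((p = strstr(p, name)) != NULL) {
        int starts_token = (p == list) || (p[-1] == ' ');
        int ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token)
            return 1;
        p += 1; /* keep searching after a partial match */
    }
    return 0;
}

/* Picks the draw path once, at context creation time. */
static DrawPath choose_draw_path(const char *extension_list)
{
    return has_extension(extension_list, HYPOTHETICAL_EXT)
               ? DRAW_INDEXED_GPU_GENERATED
               : DRAW_NON_INDEXED;
}
```

The whole-token match matters because a plain substring search would also accept longer names that merely start with the one being looked for.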
From pub...@ Wed Jul 22 10:55:43 2020
From: pub...@ (Ken Russell (...@...))
Date: Wed, 22 Jul 2020 10:55:43 -0700
Subject: [Public WebGL] WebGL2 element array buffer and copy buffer (5.1 Buffer Object Binding)
In-Reply-To:
References:
Message-ID:

Understood that you can't share the source code for your project.
Please do create an example, even an artificial one. There's no urgent
rush. An example is very important for motivating such a significant
change to WebGL's API.

Thanks,

-Ken
> > > > As a concrete example, Apple and Google are collaborating to upgrade > WebKit's WebGL 2.0 implementation, and iOS' OpenGL ES driver does not > advertise support for robust buffer access behavior. It may in fact have > that behavior under the hood - we'd need to test with some content - but if > not, and if Safari on iOS had WebGL 2.0 support, I'm pretty sure you > wouldn't want your content to run everywhere except there. > > > > Could you please share the source code of some example which can > demonstrate the performance difference? Feel free to write the "before" and > "after" cases, with the expectation that the "after" case won't work. I can > help you measure its performance on a browser that lifts the restrictions > on element array buffers' usage. If you'd propose it as a pull request > under sdk/demos/ on https://github.com/KhronosGroup/WebGL, even better. > > > > (I fully recognize that there's likely a large performance gain to be > had here, but it's important to motivate this to the community and not just > develop an extension for one particular customer's use case.) > > > > Thanks much, > > > > -Ken > > > > > > > > On Mon, Jul 20, 2020 at 2:56 PM Kevin Rogovin ( > kevinrogovin...@) wrote: > >> > >> > >> Hi, > >> > >> Like I said I can make it work without the extension, but with the > >> extension I am looking at a 33% performance gain, which is a big deal. > >> > >> I confess I do not really follow the portability objection: I'd check > >> for the extension and if it existed would use GL_ELEMENT_ARRAY_BUFFER > >> binded buffers more freely and if not, then not (and just lose the > >> performance gain). Just as the case for applications use > >> EXT_color_buffer_float. 
> >> > >> -Kevin > >> > >> On Mon, Jul 20, 2020 at 11:44 PM Jeff Gilbert (jgilbert...@) > >> wrote: > >> > > >> > > >> > I would try vertex pulling or other index-buffer-data caching > >> > mechanisms before making the case for this extension, given the > >> > portability concerns I mentioned. > >> > > >> > On Mon, Jul 20, 2020 at 1:18 PM Kevin Rogovin > >> > (kevinrogovin...@) wrote: > >> > > > >> > > > >> > > Just to be clear: I am only advocating this extension for WebGL2 > ONLY > >> > > (not WebGL 1) which on MS-Windows means mapping to D3D10/D3D11/D3D12 > >> > > which have robustness already anyways. For Vulkan and OpenGL > support, > >> > > on desktop robust access is there already; mobile is messier (and > one > >> > > can argue that my below use-case is pointless because mobile is > always > >> > > shared memory, so CPU-GPU traffic is not much of a thing... but it > is > >> > > still because of the wonkiness of drivers, mostly kernel side, of > >> > > allocating the memory for the buffer objects to which to stream). > The > >> > > extension would have to be requested at context creation (there's > the > >> > > iffy "every extension please" issue, but there are already > extensions > >> > > that have that nature too, for example render to floating point > >> > > buffer). > >> > > > >> > > The goal is that I am aiming to have the GPU generate an index > buffer > >> > > instead of the CPU generating it each frame and sending it to the > GPU > >> > > every frame. Just on using/abusing render to texture and samping > from > >> > > texture from vertex shader (together with attributeless rendering) I > >> > > have use cases that doubled and even tripled performance compared to > >> > > playing buffer object ouija board with glBufferData, > glBufferSubData, > >> > > pools and mucking with sizes. 
> >> > > The benefit (for my loads) of a GPU generated index buffer is an
> >> > > additional 33% performance advantage (i.e. something that is
> >> > > 24 ms/frame would then be like 16 ms/frame). However, that is
> >> > > pointless to do if the WebGL2 implementation needs to snoop the index
> >> > > buffer anyways. I am much better off then doing non-indexed draw calls
> >> > > for this case. So, I'd really like to know or have an extension that
> >> > > guarantees the no snoop. The extension would also allow me to save an
> >> > > additional load of bandwidth copying the buffer made to an index
> >> > > buffer all in one swoop that makes sense together. After all, if the
> >> > > CPU must snoop, there is zero point (nearly) for doing GPU generated
> >> > > index buffers.
> >> > >
> >> > > So the extension question: is this an extension worth the time to
> >> > > draft?
> >> > >
> >> > > Best Regards,
> >> > > -Kevin
> >> > >
> >> > > On Mon, Jul 20, 2020 at 10:01 PM Jeff Gilbert (jgilbert...@) wrote:
> >> > > >
> >> > > > We would need to check whether we do indeed have RBAB everywhere we
> >> > > > have WebGL2. Otherwise, we'd need more info on how this enables
> >> > > > compelling workloads, to offset the downside of Apps accidentally
> >> > > > not working on a number of older desktop and mobile drivers.
> >> > > >
> >> > > > Worth considering for your usecase (complicated vertex fetch) is
> >> > > > vertex-pulling, which should be possible with vanilla WebGL 2.
> >> > > >
> >> > > > On Mon, Jul 20, 2020 at 5:51 AM Kevin Rogovin (kevinrogovin...@) wrote:
> >> > > > >
> >> > > > > Hi,
> >> > > > >
> >> > > > > Hopefully this thread necromancy will live on the correct thread
> >> > > > > still.
> >> > > > >
> >> > > > > > ANGLE does use the GPU for robust buffer access when using the
> >> > > > > > D3D11 backend (most windows users). D3D9 always validates index
> >> > > > > > ranges on the CPU.
> >> > > > > > We also rely on the Vulkan or OpenGL extensions to do it when
> >> > > > > > available on other platforms.
> >> > > > >
> >> > > > > What do people think of having an extension for WebGL2 that
> >> > > > > removes the restriction jazz of ELEMENT_ARRAY_BUFFER and gives an
> >> > > > > assurance to an application that robust access is used for
> >> > > > > fetching vertices via an index buffer, i.e. the implementation
> >> > > > > won't induce a CPU inspect of an index buffer?
> >> > > > >
> >> > > > > I can draft the extension, but I would like to feel the water on
> >> > > > > this.
> >> > > > >
> >> > > > > My use case, which a follow up question on a different thread will
> >> > > > > address in more detail, is for GPU generated index buffers. Right
> >> > > > > now, I have scenes where instancing is not sufficient but I manage
> >> > > > > to get the GPU to generate my vertex buffers entirely. I have a
> >> > > > > scene of 1.5 million vertices; for these loads the actual vertex
> >> > > > > count is 1 million and the index count is 1.5 million; getting the
> >> > > > > index buffer generated by GPU is a big performance gain for these
> >> > > > > kinds of scenes (something like 3N ms/frame without index buffer
> >> > > > > and 2N ms/frame with index buffer) for some value of N.
> >> > > > >
> >> > > > > Best Regards,
> >> > > > > -Kevin Rogovin
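For readers following the thread: the CPU-side "snoop" under discussion, where an implementation without robust buffer access keeps a shadow copy of the index data and checks that every index in a draw range refers to an existing vertex, can be sketched roughly as below. This is only an illustration under stated assumptions, not how any particular browser implements it; the names `maxIndexInRange` and `drawRangeIsValid` are hypothetical and do not come from the thread or from the WebGL API.

```typescript
// Hypothetical sketch of CPU-side index validation over a shadow copy
// of an ELEMENT_ARRAY_BUFFER. Names are illustrative only.

type IndexArray = Uint8Array | Uint16Array | Uint32Array;

// Largest index in shadow[first .. first + count), or -1 for an empty range.
function maxIndexInRange(shadow: IndexArray, first: number, count: number): number {
  let max = -1;
  const end = Math.min(first + count, shadow.length);
  for (let i = first; i < end; i++) {
    if (shadow[i] > max) {
      max = shadow[i];
    }
  }
  return max;
}

// A draw range is accepted only if it lies inside the index buffer and
// every index addresses one of the vertexCount available vertices.
function drawRangeIsValid(
  shadow: IndexArray,
  first: number,
  count: number,
  vertexCount: number
): boolean {
  if (first < 0 || count < 0 || first + count > shadow.length) {
    return false;
  }
  return maxIndexInRange(shadow, first, count) < vertexCount;
}
```

The scan is linear in the number of indices for any range that has not been checked before, and it needs the index data in CPU memory. An index buffer written by the GPU would therefore have to be read back before such a check could run, which is the cost the proposed extension is meant to rule out.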