Sodium

additional primitive type suggestion

Closed this issue · 4 comments

commented

Request Description

Looking at Sodium, I noticed that GL_POINTS, GL_LINES, GL_TRIANGLES and GL_PATCHES are supported. I was wondering whether GL_TRIANGLE_STRIP could be supported in the near future?
Reasons why I think it would be beneficial:

  • decrease in the data sent to the GPU to render quads
  • slight leverage of GPU for triangle construction

I'm assuming it's not being used currently because:

  • increased complexity in rendering code
  • no strip alternative for patches or points
commented

Doing so would not be useful, specifically because we can't share vertices between quads, so a triangle strip wouldn't work. If our architecture changes at some point and this becomes useful, we'll take advantage of it as necessary.
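
To put rough numbers on the point about disjoint quads, here is a minimal sketch. It assumes every quad has four unique vertices and that quads in a single strip are stitched together with two repeated (degenerate) vertices per join; the class and method names are hypothetical and are not taken from Sodium.

```java
// Hypothetical helper: counts the data needed to draw n quads that share no vertices.
final class QuadTopologyCost {
    // Unique vertices stored for an indexed triangle list: four per quad.
    static int listVertices(int quads) {
        return 4 * quads;
    }

    // Indices consumed by an indexed triangle list: two triangles per quad.
    static int listIndices(int quads) {
        return 6 * quads;
    }

    // Vertices emitted for one non-indexed triangle strip. Each join between
    // two disjoint quads repeats the last vertex of one quad and the first
    // vertex of the next, which adds two degenerate vertices per join.
    static int stripVerticesWithDegenerates(int quads) {
        return quads == 0 ? 0 : 4 * quads + 2 * (quads - 1);
    }
}
```

Under these assumptions the strip never stores fewer than four vertices per quad, so the saving a strip normally gets from shared vertices does not appear.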

commented

Thanks, this was more of a question than a serious suggestion

commented

Sorry to be a bother, but the reason for my question was that I had watched this video some time back: https://www.youtube.com/watch?v=40JzyaOYJeY&t=247s
and was wondering if some of the performance enhancements could be used for Minecraft.

  • vertex shader based normals would probably require too many changes, as the AO system would have to be replaced

  • glMultiDrawArraysIndirect would require more modern OpenGL support than most people are running

  • Instanced rendering for GPU-based mesh construction: difficult, as Minecraft requires fractional scaling for entities, slabs, stairs and fences/fence gates. It could be done with a single base quad, with each instance's length and width passed to the vertex shader as floats for fractional scaling

  • custom back-face culling: requires a custom implementation of back-face culling. Not advised, but it could lead to less CPU-GPU communication than standard OpenGL back-face culling

commented

Sodium uses alternative methods to the ones presented in the video that are better supported and don't require modern hardware features.

Indirect rendering has performance issues on Intel GPUs due to the limited dispatch throughput of the command processor. At higher render distances (>24 chunks) the number of draw calls that need to be submitted is too much for most integrated Intel GPUs and would severely limit the frame rate.

Note that the problem with command throughput could be alleviated through methods of Geometry Virtualization (where effectively you implement virtual memory in software within a shader with a translation table), since you could then just dispatch one very large draw command, but unfortunately the necessary functionality (sparse descriptor sets) is not supported in OpenGL.

Removing per-vertex attributes (light, color, etc) is theoretically possible and a good optimization, but would be incompatible with many mods, which is why we don't do it. We have discussed using texture maps for each attribute instead, which solves many problems with shading anisotropy and vertex front-end bottlenecks, but it also requires modern OpenGL functionality that is not well supported.

As for advanced mesh optimization (greedy meshing, monotone polygon triangulation, etc), that would all require per-vertex attributes to be removed (or moved into texture maps), which again is not possible at the moment.

Back-face culling is done in Sodium by assigning geometry to vertex groups based on their face normal and dispatching draw commands only for vertex groups that are determined to be front-facing. This means that back-facing geometry is never submitted to the device and that it contributes zero overhead in rendering.
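
The idea can be sketched roughly as follows. This is an illustration under stated assumptions, not Sodium's actual code: it assumes six axis-aligned face groups plus one "unassigned" group per chunk section, and it treats a group as potentially front-facing when the camera lies on the matching side of the section's bounding box. All names are hypothetical.

```java
// Hypothetical sketch of per-facing visibility for one chunk section.
final class FaceGroupCulling {
    // Assumed group order; bit i of the returned mask marks group i as visible.
    static final int POS_X = 0, NEG_X = 1, POS_Y = 2, NEG_Y = 3,
                     POS_Z = 4, NEG_Z = 5, UNASSIGNED = 6;

    static int visibleGroups(double camX, double camY, double camZ,
                             double minX, double minY, double minZ,
                             double maxX, double maxY, double maxZ) {
        // Geometry without an axis-aligned normal cannot be culled this way,
        // so its group is always drawn.
        int mask = 1 << UNASSIGNED;

        // A +X-facing quad at x = p is front-facing only if camX > p. The
        // smallest possible p in the section is minX, so the group can be
        // skipped entirely when camX <= minX. The other axes are symmetric.
        if (camX > minX) mask |= 1 << POS_X;
        if (camX < maxX) mask |= 1 << NEG_X;
        if (camY > minY) mask |= 1 << POS_Y;
        if (camY < maxY) mask |= 1 << NEG_Y;
        if (camZ > minZ) mask |= 1 << POS_Z;
        if (camZ < maxZ) mask |= 1 << NEG_Z;
        return mask;
    }
}
```

A renderer would then issue draw commands only for groups whose bit is set. For example, a camera far to the +X side of a section never needs that section's -X group.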

Using different primitive topologies would not improve performance in our case either, since we use triangle list rendering with a small, repeating, monotonic index buffer. Because of this, the index buffer is always fully resident within the device's cache and does not need to be fetched from video memory, meaning that we are not paying any cost for it. Furthermore, since the sequence is short enough, we have optimal re-use of the vertex cache anyway.
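
As a rough illustration of such an index buffer, the sketch below builds the repeating pattern for a run of quads. The exact index order (0, 1, 2, 2, 3, 0 per quad) and the names are assumptions for the example; the comment above does not state which pattern Sodium uses.

```java
// Hypothetical sketch: one shared index pattern that turns every group of
// four consecutive vertices into two triangles.
final class SharedQuadIndices {
    static int[] build(int quadCount) {
        int[] indices = new int[quadCount * 6];
        for (int q = 0; q < quadCount; q++) {
            int base = q * 4; // first vertex of this quad
            int o = q * 6;    // first index slot of this quad
            // First triangle: corners 0, 1, 2
            indices[o]     = base;
            indices[o + 1] = base + 1;
            indices[o + 2] = base + 2;
            // Second triangle: corners 2, 3, 0
            indices[o + 3] = base + 2;
            indices[o + 4] = base + 3;
            indices[o + 5] = base;
        }
        return indices;
    }
}
```

Because the pattern depends only on the quad number, the same buffer can be reused for every mesh, and each quad touches only its own four vertices.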