For SpaceEngine’s upcoming debut on Steam, program stability is crucial. As most of you know, SE has never shone in this area. Therefore, I am attacking the main culprit: the terrain engine.
Historically, SE has dynamically redistributed video memory between different subsystems, depending on the situation: terrain, stars, models of galaxies and nebulae, and ships. This was necessary when most video cards had just 1-2 gigabytes of video memory. Now, however, video cards with less than 3 gigabytes of memory can barely run SE, and most video cards today have 4 gigabytes or more. Therefore, we can abandon dynamic distribution and switch to a more efficient, fast, and stable static one. Instead of allocating new memory for a texture each time one is required, the engine allocates a large array of empty textures once, when the program starts, and never changes it afterwards. When the engine needs a new texture, the memory manager simply returns the index of the first free texture in the array. Instead of deleting the texture and returning its memory to the system, the engine simply adds its index to the list of free textures, so it can be reused next time.
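The static scheme described above can be illustrated with a small free-list pool. This is only a sketch in Python, not SE's actual code: the class name `TexturePool` and its methods are hypothetical, and the "textures" are just integer slot indices standing in for layers of a preallocated array.

```python
class TexturePool:
    """Fixed-size pool of texture slots handed out by index (hypothetical sketch)."""

    def __init__(self, capacity):
        self.capacity = capacity
        # All slots start free. Stored as a stack so that slot 0 is handed out first.
        self._free = list(range(capacity - 1, -1, -1))
        self._in_use = set()

    def acquire(self):
        """Return the index of a free slot, or None when the pool is exhausted."""
        if not self._free:
            return None
        index = self._free.pop()
        self._in_use.add(index)
        return index

    def release(self, index):
        """Mark a slot as reusable; no memory is returned to the system."""
        if index not in self._in_use:
            raise ValueError(f"slot {index} is not in use")
        self._in_use.remove(index)
        self._free.append(index)


pool = TexturePool(capacity=4)
a = pool.acquire()
b = pool.acquire()
pool.release(a)
c = pool.acquire()  # reuses the slot that was just released
```

Both `acquire` and `release` are constant-time list operations, and nothing is allocated after start-up, which is the property the stability and speed gains rely on.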
Changing the way textures are stored in video memory (OpenGL texture arrays), usage of meshes (almost no meshes anymore, the terrain is constructed in the vertex shader on the fly, by reading the heightmap texture), and additional data storage in RAM (semi-static pool), led to the following results:
1) Stability is 100%. SE no longer crashes on terrain loading/generation, primarily due to static memory usage.
2) Terrain needs no more than 100-150 MB of RAM, and no more than 2 GB of video memory (if texture compression is enabled). The allocated texture arrays are fixed and never change. SE determines how much it can allocate at launch, according to the amount of available video memory.
3) The speed of loading/generation of terrain has increased tenfold! It is best to watch the following video (sorry for the buggy autofocus, I shot this video on my phone so that video capture software would not affect SE’s performance):
(The video shows glitches with terrain geometry at the moment of loading, but this is already fixed).
This increase in performance is due to the elimination of the dynamic texture allocation overhead. Allocation was very slow! Now, full loading takes just 5 seconds on LOD 0, and 30 seconds on LOD 1! Note, however, that I have an RTX 2080, although my 4k display puts a significant load on it. The “Loading speed” slider in the settings menu allows you to trade frame rate for loading speed. With a loading speed value of up to 8, the frame rate does not drop below 60 fps; at a value of 20, it drops to 18 fps, but the terrain is fully loaded in 12 seconds on LOD 1!
4) Theoretically, rendering performance has increased, because another overhead was removed from rendering: switching texture units for each terrain node (thanks to texture arrays). At the same time, it has decreased slightly (by 5-10 fps) due to reading textures in the vertex shader to build the terrain geometry (this is called “displacement mapping”). It is difficult to compare precisely, but the gain significantly outweighs the loss. And this is without implementing instancing (this is the next step, and almost everything is ready for it). Note that all modern engines use displacement mapping to build terrain; it also enables animation of geometry, if the texture is updated every frame (as with water waves, for example).
5) An unexpected but useful side effect: an annoying glitch is gone. In the past, asteroids briefly appeared spherical if you approached them too fast or opened the Wiki. Now, because terrain is built directly from the heightmap texture in the shader, even the lowest level of detail for asteroids has the correct 3D shape.
6) Bonus: almost free displacement mapping of detailed textures (e.g., rocks on the surface). Since the main terrain is already rendered by displacement mapping, adding another texture to sample in the vertex shader is not a problem. As a result, we get this:
I must say, 3D rocks look so cool in VR that I find myself stopping to look at them for half an hour every time I start a test 🙂 Now I need to get some high-quality art for them (a set of textures).
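To make the displacement mapping idea from the list above more concrete, here is a rough CPU-side sketch in Python of what a vertex shader does per vertex: sample the heightmap and push the vertex outward along its direction from the planet's center. The function name, the nearest-texel lookup, and the `radius` and `height_scale` parameters are assumptions for illustration only; the real shader would use filtered texture sampling on the GPU.

```python
import math


def displaced_vertex(direction, heightmap, u, v, radius, height_scale):
    """Move a point along `direction` to the surface height sampled at (u, v)."""
    rows, cols = len(heightmap), len(heightmap[0])
    # Nearest-texel lookup; u and v are texture coordinates in [0, 1].
    x = min(int(u * cols), cols - 1)
    y = min(int(v * rows), rows - 1)
    h = heightmap[y][x]
    # Normalize the direction, then scale it to the displaced radius.
    length = math.sqrt(sum(c * c for c in direction))
    scale = (radius + h * height_scale) / length
    return tuple(c * scale for c in direction)


# Tiny 2x2 heightmap with values in [0, 1] (made-up data).
heightmap = [[0.0, 0.5],
             [1.0, 0.25]]
vertex = displaced_vertex((0, 0, 2), heightmap, u=0.75, v=0.25,
                          radius=10.0, height_scale=2.0)
```

Because the shape comes straight from the heightmap, even a coarse grid of vertices already follows the body's real outline, which is consistent with asteroids no longer showing up as spheres. Adding a second, detail heightmap would amount to one more lookup added to `h`.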
Among the disadvantages are the following:
1) Since the texture cache in memory is now a fixed size, it is impossible to load more textures than it holds. LOD 1 is the limit on 1920×1080 displays, and at 4k the limit is LOD 0. When you turn the camera to the side and back, the terrain reloads, because the texture cache is almost full. However, the reloading is so fast that it’s not bothersome. As the LOD increases beyond this limit, the landscape will constantly reload itself (some distant nodes will flash, but the effect of cyclic wiping out and reloading is eliminated). The cache capacity limit is 2048 textures (of each type); this is an OpenGL limit on the size of texture arrays. However, it turns out that this is enough both for the most popular display resolutions and for VR. In the future, it will be possible to increase the resolution of the textures themselves without changing the cache capacity.
2) SE no longer supports planetary textures with an arbitrary tile resolution. Now they must be 256×256. This means that I will have to remake all 45 gigabytes of the Solar system textures. Mods such as the Grand Canyon will also require an update. Non-cubic textures (in a simple cylindrical projection) are also no longer supported, except at 256×256 resolution, if you are okay with it being that low. I haven’t dealt with Solar system textures yet; I plan to significantly boost their loading speed as well.
3) Now there is no dynamic redistribution of video memory between terrain and other systems (stars, models of galaxies, and ships). On weaker machines, this can be a serious problem. Now, 1 GB of video memory is catastrophically small for SE. I added a parameter to the config to disable detail textures – this allows shrinking down the terrain texture cache to 300-400 MB. I may have to force automatic configuration for this on weak hardware.
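A back-of-the-envelope calculation shows how the fixed cache relates to video memory. The sketch below is in Python and uses the 2048-layer capacity and 256×256 tile size mentioned above; the bytes-per-texel values and the number of texture types are purely hypothetical, since the post does not list the actual formats, and mipmaps are ignored.

```python
MAX_ARRAY_LAYERS = 2048  # cache capacity per texture type, from the post
TILE_SIZE = 256          # tile resolution in texels, from the post


def array_bytes(layers, bytes_per_texel, tile_size=TILE_SIZE):
    """Memory used by one texture array, ignoring mipmaps."""
    return layers * tile_size * tile_size * bytes_per_texel


def layers_for_budget(budget_bytes, bytes_per_texel_by_type, tile_size=TILE_SIZE):
    """Largest layer count (the same for every array) that fits in the budget."""
    per_layer = sum(tile_size * tile_size * bpt for bpt in bytes_per_texel_by_type)
    return min(MAX_ARRAY_LAYERS, budget_bytes // per_layer)


MIB = 1024 * 1024

# One full array at 1 byte per texel (an assumed compressed format).
full_array = array_bytes(MAX_ARRAY_LAYERS, 1)

# Hypothetical set of three texture types: two at 1 byte/texel, one at 4 bytes/texel.
assumed_formats = [1, 1, 4]
small_gpu_layers = layers_for_budget(400 * MIB, assumed_formats)
large_gpu_layers = layers_for_budget(2048 * MIB, assumed_formats)
```

Under these assumptions, a full 2048-layer array at 1 byte per texel takes 128 MiB, a 400 MiB budget fits 1066 layers per array, and a 2 GiB budget is capped by the 2048-layer limit rather than by memory.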
But the advantages significantly outweigh the disadvantages. Stability and performance are the main goals! As for system requirements… well, SE has always been demanding in that regard.
Completion of this work will require no less than a month. As always, the “80/20” Pareto principle applies: I did 80% of the work within a week, which means that the remaining 20% will require a month…
But it’s worth it.