As I mentioned in my previous blog post, I want to use texture compression in the terrain engine to reduce its memory requirements. All graphics cards support S3TC, a lossy texture compression algorithm that reduces the memory footprint of a texture 2-4 times and decompresses it on the fly during rendering. Decompression is implemented in hardware, so it has no impact on performance (it may even increase performance in bandwidth-demanding scenarios, because the amount of data moving between VRAM and the GPU is reduced 2-4 times).

In the screenshots below, move the slider to compare. You can also right-click on a screenshot and choose "Open in a new tab" to see it in full resolution (do this twice, once on the left and once on the right side of the image).

[Image comparison: original vs. compressed]

Planet from space - no visual difference

So far I have implemented the DXT1, DXT5, LATC1 and LATC2 formats (their DirectX equivalents are BC1, BC3, BC4 and BC5). Compression is performed on the fly by a special shader, after the planet textures are generated or loaded. The table below summarizes the formats and their usage in SE. For those who are interested in the details, this is a very good article about block compression algorithms.

| Format | Channels | Compression ratio | Quality | Usage in SE |
|---|---|---|---|---|
| DXT1 (BC1) | 3 (RGB) | 1:6 (1:8 vs. 32-bit RGBA) | medium | not currently used |
| DXT5 (BC3) | 4 (RGBA) | 1:4 | medium | not currently used |
| DXT5 (BC3) YCoCg | 3 (RGB) | 1:4 | high | color (albedo) map, emission (city lights) map |
| LATC1 (BC4) | 1 (gray) | 1:2 | high | clouds opacity, water/ice mask, roughness, detail height |
| LATC2 (BC5) | 2 (2x gray) | 1:2 | high | normal map |
| Uncompressed 16 bit | 1 (gray) | 1:1 | lossless | elevation map, temperature map |
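To make the 1:2 ratio of LATC1 (BC4) concrete, here is a minimal CPU-side sketch of how one 4x4 block of 8-bit values fits into 8 bytes: two endpoint bytes plus sixteen 3-bit palette indices. This is not the SE shader. The function names are hypothetical, the encoder simply uses the block's min and max as endpoints, and interpolation is done with integer division, while real hardware rounding may differ slightly.

```python
def bc4_palette(r0: int, r1: int) -> list[int]:
    """Return the 8 values a BC4/LATC1 block can represent for endpoints r0, r1."""
    if r0 > r1:
        # 8-value mode: the two endpoints plus 6 interpolated values.
        return [r0, r1] + [((7 - i) * r0 + i * r1) // 7 for i in range(1, 7)]
    # 6-value mode: endpoints, 4 interpolated values, then explicit 0 and 255.
    return [r0, r1] + [((5 - i) * r0 + i * r1) // 5 for i in range(1, 5)] + [0, 255]


def bc4_encode(pixels: list[int]) -> bytes:
    """Compress 16 8-bit values (one 4x4 block) into 8 bytes."""
    assert len(pixels) == 16
    r0, r1 = max(pixels), min(pixels)  # naive endpoint choice (an assumption)
    palette = bc4_palette(r0, r1)
    bits = 0
    for n, p in enumerate(pixels):
        # Pick the palette entry closest to the pixel value.
        index = min(range(8), key=lambda k: abs(palette[k] - p))
        bits |= index << (3 * n)  # 3 bits per pixel, pixel 0 in the lowest bits
    return bytes([r0, r1]) + bits.to_bytes(6, "little")


def bc4_decode(block: bytes) -> list[int]:
    """Expand an 8-byte block back into 16 8-bit values."""
    assert len(block) == 8
    palette = bc4_palette(block[0], block[1])
    bits = int.from_bytes(block[2:], "little")
    return [palette[(bits >> (3 * n)) & 7] for n in range(16)]
```

A smooth gradient survives with a small error, because every pixel snaps to one of only 8 levels between the block's minimum and maximum. LATC2 (BC5) is simply two such blocks, one per channel.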

[Image comparison: original vs. compressed]

Here you can notice some difference

Before implementing compression, SE had this texture layout:

| Texture | Format | Notes |
|---|---|---|
| Elevation | 16 bit grayscale | |
| Color + water/ice specular | 8 bit RGBA | |
| Normal | 8 bit RGBA | only 2 channels used; in 0.980 the other 2 channels are used to store the 16 bit elevation map |
| Emission | 8 bit RGBA | alpha specifies the mode: city lights, permanent lights or thermal lights; the RGB part is either the lights color or the encoded temperature |
| Detail color + roughness | 8 bit RGBA | |
| Detail normal + height | 8 bit RGBA | |

In 0.990 I switched to a dedicated 16-bit elevation map to improve precision and quality. This map is the basis of the terrain engine: it is used to displace mesh vertices and to generate all other maps (normals, color, emission etc). Unfortunately, there are no S3TC compression formats for 16-bit maps. But the elevation map must be stored in a lossless format anyway, at the highest possible quality.

While implementing texture compression, I had to move the alpha channel of some maps into dedicated textures, because the chosen compression formats (YCoCg DXT5 and LATC2) have no spare channel for alpha. I chose YCoCg-encoded DXT5 because it gives better quality than plain DXT5 or DXT1. The LATC2 format is crucial for normal maps, because DXT-compressed normal maps produce overly large artifacts in lighting. Normal maps store only two components of the normal vector; the third one is computed on the fly by the planet shader.
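The YCoCg trick can be sketched as follows. This is an illustration of the standard YCoCg transform, not the SE shader: the function names are hypothetical, and the channel placement (luma in alpha, biased chroma in red and green) follows the commonly described YCoCg-DXT5 layout, which I am assuming here. It also shows why there is no room left for a separate alpha value: the alpha channel, which DXT5 compresses with the highest precision, is occupied by luma.

```python
def rgb_to_ycocg(r: float, g: float, b: float) -> tuple[float, float, float]:
    """Convert RGB in [0, 1] to luma Y in [0, 1] and chroma Co, Cg in [-0.5, 0.5]."""
    y = 0.25 * r + 0.5 * g + 0.25 * b
    co = 0.5 * r - 0.5 * b
    cg = -0.25 * r + 0.5 * g - 0.25 * b
    return y, co, cg


def ycocg_to_rgb(y: float, co: float, cg: float) -> tuple[float, float, float]:
    """Inverse transform, cheap enough to run per pixel in a shader."""
    return y + co - cg, y + cg, y - co - cg


def pack_texel(r: float, g: float, b: float) -> tuple[float, float, float, float]:
    """Assumed RGBA layout before DXT5 encoding: (Co + 0.5, Cg + 0.5, unused, Y)."""
    y, co, cg = rgb_to_ycocg(r, g, b)
    return co + 0.5, cg + 0.5, 0.0, y


def unpack_texel(texel: tuple[float, float, float, float]) -> tuple[float, float, float]:
    """Recover RGB from the packed layout above."""
    co, cg, _, y = texel
    return ycocg_to_rgb(y, co - 0.5, cg - 0.5)
```

Without the lossy DXT5 step in between, `unpack_texel(pack_texel(r, g, b))` returns the original color up to floating-point rounding.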

[Image comparison: original vs. compressed]

The thermal emission map previously stored the temperature of the surface, encoded into the 24 bits of the RGB channels. Such high precision was needed to avoid banding (thermal emission is highly sensitive to variations in temperature, so temperature must be stored as accurately as possible). But lossy compression of the encoded data destroys it completely: large and very bright artifacts appear on volcanoes and star surfaces. So I separated the thermal map into a dedicated texture, stored in a 16-bit uncompressed format. Planets now have two maps: the old GlowMap, used for city lights or permanent lights (visible even during daytime, like lava on Io), and a new TempMap ("temperature map"), which simply defines the surface temperature in kelvins. 16 bits can store values from 0 to 65535, which is enough to describe the temperature of a planet or a star (65 thousand kelvins is a blindingly bright blue, and hotter temperatures have almost the same blue tint). The new TempMap can be specified in scripts just like GlowMap; planets can use both simultaneously.
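A small sketch shows why a value packed across RGB bytes cannot survive lossy compression. The exact encoding SE used is not given here, so the packing below (a 24-bit fixed-point value split into three bytes, with a hypothetical full-scale constant `T_MAX`) is only an assumption for illustration.

```python
T_MAX = 65536.0  # hypothetical full-scale temperature of the packed encoding, in K


def pack24(temp_k: float) -> tuple[int, int, int]:
    """Split a temperature into three bytes: R is the most significant, B the least."""
    v = min(int(temp_k / T_MAX * (1 << 24)), (1 << 24) - 1)
    return (v >> 16) & 0xFF, (v >> 8) & 0xFF, v & 0xFF


def unpack24(r: int, g: int, b: int) -> float:
    """Rebuild the temperature from the three bytes."""
    return ((r << 16) | (g << 8) | b) * T_MAX / (1 << 24)


def store16(temp_k: float) -> int:
    """Direct storage in a 16-bit texel: kelvins, clamped to 0..65535."""
    return max(0, min(65535, round(temp_k)))


r, g, b = pack24(1500.0)
exact = unpack24(r, g, b)
off_by_one_in_r = unpack24(r + 1, g, b)  # a single-level error in the high byte
print(exact, off_by_one_in_r - exact)
```

With this packing, an error of a single level in the most significant byte shifts the decoded temperature by hundreds of kelvins, and block compression routinely changes channel values by more than one level. A plain 16-bit value has no such cliff.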

Another bonus of the dedicated 16-bit thermal maps: no more patches are visible on a star's surface!

So, the new compressed texture layout is the following:

| Texture | Format | Notes |
|---|---|---|
| Elevation | 16 bit uncompressed | |
| Color | YCoCg DXT5 or LATC1 | for grayscale textures (like Earth's clouds), LATC1 is used |
| Water/ice specular or clouds alpha | LATC1 | |
| Normal | LATC2 | only 2 channels are used |
| Emission | YCoCg DXT5 or LATC1 | for grayscale textures (Earth city lights), LATC1 is used |
| Thermal | 16 bit uncompressed | temperature in kelvins |
| Detail color | YCoCg DXT5 | |
| Detail roughness | LATC1 | |
| Detail normal | LATC2 | |
| Detail ambient occlusion | LATC1 | not used currently, but easy to add |
| Detail height | LATC1 | not used currently, but will be needed for Parallax Occlusion Mapping |
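As a rough sanity check of the two layouts, the sketch below adds up the bytes for one terrain node. It assumes a hypothetical node that has every currently used texture at 256x256 with no mip levels, so the result is only an upper-level estimate; real planets generate different subsets of these textures.

```python
BITS_PER_PIXEL = {"RGBA8": 32, "R16": 16, "DXT5": 8, "LATC2": 8, "LATC1": 4}
BLOCK_COMPRESSED = {"DXT5", "LATC1", "LATC2"}


def texture_bytes(fmt: str, size: int = 256) -> int:
    """Size in bytes of one square texture level in the given format."""
    # Block formats work on 4x4 blocks, so their side is rounded up to 4.
    side = (size + 3) // 4 * 4 if fmt in BLOCK_COMPRESSED else size
    return side * side * BITS_PER_PIXEL[fmt] // 8


# Old layout: elevation plus five 8-bit RGBA textures.
OLD_LAYOUT = ["R16"] + ["RGBA8"] * 5

# New layout, without the two detail maps that are not used yet.
NEW_LAYOUT = [
    "R16",    # elevation
    "DXT5",   # color (YCoCg)
    "LATC1",  # water/ice specular or clouds alpha
    "LATC2",  # normal
    "DXT5",   # emission (YCoCg)
    "R16",    # thermal
    "DXT5",   # detail color (YCoCg)
    "LATC1",  # detail roughness
    "LATC2",  # detail normal
]

old_total = sum(texture_bytes(f) for f in OLD_LAYOUT)
new_total = sum(texture_bytes(f) for f in NEW_LAYOUT)
print(old_total, new_total, old_total / new_total)
```

Under these assumptions the two uncompressed 16-bit maps account for a large share of the new total, which is why the overall saving is smaller than the 1:4 ratio of the individual color textures.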

So many textures, but they use just 4 different formats: 16 bit uncompressed, DXT5, LATC1 and LATC2. There are more advanced hardware compression formats (BC6, BC7, ASTC), but they either encode too slowly (about 1 second on a GTX 780 to compress a 512x512 texture using CUDA) or lack wide support on desktop GPUs (ASTC is more common on mobile devices).

[Image comparison: original vs. compressed]

Close-up view - subtle difference

The engine compresses planet textures right after they are generated or loaded from disk. This is not free: it takes 2-5 milliseconds per texture, so loading time increases by 20-50%. But compression saves a lot of memory! It's hard to tell exactly how much is saved, because in real usage planets generate different sets of textures, depending on altitude and planet type. Without compression, the cache for 4000 terrain nodes consumed 1200-1500 MB of VRAM (at LOD 0); with compression the number drops to 500-600 MB. So I estimate that the overall improvement is roughly 2.5 times. Not bad, taking into account that SE still stores elevation and thermal maps in an uncompressed format.

I also compressed the library of materials (rock/pebbles/grass/sand images, used to generate detail textures). This was easy, because SE already supported S3TC-compressed textures as *.dds files (used for some ship and galaxy textures). I used the nvcompress and ImageMagick tools for this (I wrote a few batch scripts to automate the process). Before this, 49 materials took up 522 MB of video memory; now they take up just 130 MB (4 times less). This also speeds up loading of the library textures at SE startup by an order of magnitude (1 second instead of 10)!

[Image comparison: original vs. compressed]

Another close-up view - noticeable difference. Patterns of the detail textures depend on the global texture. If the global texture is compressed, especially its normal map, the result may differ significantly. A possible workaround is to leave uncompressed those levels which are used to generate detail textures.

The graphics settings now have a "Compress textures" option. It switches the mode only for newly generated textures, i.e. old textures that are already in VRAM do not change their format. The terrain engine will remove them after a while anyway, when space is needed for new textures.

[Image comparison: original vs. compressed]

At dawn, a subtle difference in lighting is easily visible. This is because the small difference between compressed and non-compressed normals leads to a significant change in the light/shadow direction.
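The sensitivity at dawn can be illustrated with simple Lambert shading. The sketch below reconstructs the third normal component from the two stored ones, then compares the brightness of flat ground with and without a small error in the stored x component. The function names, the error value of 0.01 and the sun geometry are assumptions chosen for illustration; this is not the planet shader.

```python
import math


def reconstruct_normal(x: float, y: float) -> tuple[float, float, float]:
    """Rebuild a unit normal from the two stored components."""
    z = math.sqrt(max(0.0, 1.0 - x * x - y * y))
    return x, y, z


def lambert(normal: tuple[float, float, float], sun_elevation_deg: float) -> float:
    """Diffuse term N.L for a sun in the XZ plane at the given elevation."""
    e = math.radians(sun_elevation_deg)
    light = (math.cos(e), 0.0, math.sin(e))
    return max(0.0, sum(n * l for n, l in zip(normal, light)))


def relative_change(x_error: float, sun_elevation_deg: float) -> float:
    """Relative brightness change of flat ground when x is off by x_error."""
    exact = lambert(reconstruct_normal(0.0, 0.0), sun_elevation_deg)
    noisy = lambert(reconstruct_normal(x_error, 0.0), sun_elevation_deg)
    return abs(noisy - exact) / exact


# The same small error matters far more when the sun is near the horizon.
print(relative_change(0.01, 60.0), relative_change(0.01, 1.0))
```

With the sun high in the sky the error changes brightness by a fraction of a percent, while with the sun one degree above the horizon the same error changes it by tens of percent.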

There is a problem with Earth and other real planets. For years they have used an unusual texture size, 258x258. GPUs do not like non-power-of-two sizes, which is why the textures of procedural planets are 256x256. This was not a major problem before, but now it is. S3TC compression is based on blocks of 4x4 pixels, so the texture size must be a multiple of 4 (which is not the case for 258). The compression code adds a padding border to the texture and its mip levels to keep their size a multiple of 4. This shifts the textures slightly over the terrain. It could be fixed by applying a reverse shift in the shader, but I think it would be better to re-process all textures of real planets to make them 256x256. This will give two benefits. First, it will allow the use of a fixed VRAM texture cache: just allocate 1000 textures for each of the 4 formats and re-use them, rather than allocating a new texture each time. I have used dynamic allocation since the beginning of SE, because the engine cannot know which tile resolution the next planet will use. Allocation is a slow operation, so re-using a fixed set of textures will speed up generation. The second benefit is that the textures of real planets will be re-processed to have 2-pixel-wide borders, which will eliminate the seams that their normal maps currently have.
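The padding arithmetic is easy to sketch. The helper names are hypothetical, and the mip chain uses the usual halving with rounding down, which I am assuming matches the engine.

```python
def pad_to_block(size: int, block: int = 4) -> int:
    """Round a texture dimension up to the next multiple of the block size."""
    return (size + block - 1) // block * block


def mip_chain(size: int) -> list[int]:
    """Sizes of all mip levels, halving (rounded down) until 1 pixel."""
    sizes = [size]
    while size > 1:
        size = max(1, size // 2)
        sizes.append(size)
    return sizes


for level, size in enumerate(mip_chain(258)):
    padded = pad_to_block(size)
    print(level, size, padded, padded - size)  # last column: pixels of padding
```

For a 256x256 tile the first levels need no padding at all, while a 258x258 tile needs it on the first two levels, which is where the texture shift comes from.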

I experimented with DDS compression for Earth's textures, but it produces a much larger pak file than the current jpg/png images (or lossless png in the HD and Ultra addons). So SE will still use jpg/png compression to store textures on disk, and re-compress them to S3TC on the fly while loading. Some users may prefer to disable compression in the settings to keep the quality at maximum.
