Volumetric Rendering in 20 Lines of JS: The Magic and Nostalgia of Voxel Space

If you played video games in the 1990s, you probably remember the mind-blowing terrains of games like Comanche: Maximum Overkill (1992) or Delta Force (1998). In an era when consumer 3D accelerators like the 3dfx Voodoo had either not arrived yet or were only just emerging, these games rendered vast, organic, undulating landscapes full of hills and valleys, all running smoothly in software on the desktop CPUs of the day.

How did they do it? They didn’t use polygons. Instead, they used an incredibly elegant algorithm known as Voxel Space.

Recently, the Voxel Space algorithm has been trending on Hacker News, sparking a wave of nostalgia and technical curiosity among modern web developers. In a world where we routinely ship megabytes of JavaScript and rely on massive, complex WebGL engines like Three.js just to render a spinning cube, there is something deeply satisfying about revisiting a rendering technique so efficient that you can write the entire core engine in about 20 lines of vanilla JavaScript.

Today, we are going to dive deep into how the Voxel Space engine works, look at why its CPU-bound constraints are actually a masterclass in software engineering, and build our own interactive, high-performance Voxel Space renderer in HTML5 canvas. Grab your favorite caffeinated beverage, and let's write some beautiful, retro-inspired rendering code.

The Core Concept: Heightmaps and Color Maps

Unlike modern ray-casters or polygonal rasterizers, the Voxel Space algorithm relies on a highly simplified projection trick. To render a 3D landscape, the algorithm requires only two 2D images (textures) of the exact same dimensions (typically power-of-two, like 1024x1024):

The Color Map: A standard 2D texture containing the colors of the terrain (the grass, dirt, roads, or snow) as seen from directly above.
The Heightmap: A grayscale image where the brightness of each pixel represents the elevation of the terrain at that specific coordinate. A pure black pixel (0) is sea level, while a pure white pixel (255) is the peak of a mountain.

By pairing these two images, we have a complete 3D representation of a terrain. The magic of Voxel Space is how it projects this 2D data onto a 2D screen to create the illusion of 3D depth, perspective, and scale.
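
Here is a minimal sketch of that pairing, assuming both maps are stored as flat, row-major typed arrays. The 4x4 size, the sample values, and the sampleTerrain helper are made up for illustration.

```javascript
// Hypothetical 4x4 world: one height byte and one packed color per cell.
const size = 4; // power of two, so wrapping can use a bitmask
const heights = new Uint8Array(size * size);
const colors = new Uint32Array(size * size);

// Put a single "hill" at map cell (x=2, y=1).
heights[1 * size + 2] = 200;
colors[1 * size + 2] = 0xFF00FF00; // arbitrary packed color value

// Look up the terrain at any (x, y), wrapping around the map edges.
function sampleTerrain(x, y) {
    const mapX = Math.floor(x) & (size - 1);
    const mapY = Math.floor(y) & (size - 1);
    const offset = mapY * size + mapX;
    return { height: heights[offset], color: colors[offset] };
}

console.log(sampleTerrain(2, 1).height);    // 200
console.log(sampleTerrain(6.7, -3).height); // 200 as well: wraps around to cell (2, 1)
```

Every (x, y) maps to exactly one height and one color, which is all the renderer ever asks of the world.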

How the Algorithm Works (Without Math Ph.Ds)

The brilliance of Voxel Space lies in its simplification of the 3D rendering equation. Instead of projecting arbitrary 3D coordinates using complex matrix math, the algorithm makes a few brilliant assumptions:

The camera only rotates horizontally (yaw), meaning there is no pitch (tilt up/down) or roll (banking left/right). While modern adaptations can simulate pitch, the classic engine assumes the horizon is always a flat, horizontal line on the screen.
The terrain is a heightfield with exactly one height per map cell (no overhangs or caves), so each screen column can be drawn independently, from the foreground (close to the camera) to the background (far away).

For every vertical column of pixels on your screen (from left to right), the engine casts a ray outward from the camera’s position into the world map. As the ray travels forward, we sample the height and color of the map at discrete steps.

Here is the visual step-by-step logic for a single vertical column on the screen:

1. Start at the camera position.
2. Step forward along this column's ray by a distance 'depth' (z).
3. Compute the 2D map coordinate (x, y) at this step.
4. Retrieve the terrain height (h) and color (c) from our heightmap and colormap at (x, y).
5. Project this 3D height onto the 2D screen using perspective division: 
   screen_y = (camera_height - terrain_height) / depth * scale + horizon
6. Draw a vertical line of color 'c' from this projected screen_y down to the previously drawn highest point.
7. Repeat for the next step forward, moving further into the distance.
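
To make step 5 concrete, here is the projection as a small function with a few worked values. The function name and the numbers are assumptions chosen so the arithmetic is easy to follow by hand.

```javascript
// Step 5 as a function. Smaller results are higher up on the screen.
function projectToScreenY(cameraHeight, terrainHeight, depth, scale, horizon) {
    return Math.floor((cameraHeight - terrainHeight) / depth * scale + horizon);
}

// Assumed setup: camera at height 120, scale 120, horizon at screen row 60.
console.log(projectToScreenY(120, 56, 128, 120, 60));  // 120: ground below the horizon line
console.log(projectToScreenY(120, 56, 256, 120, 60));  // 90: same ground, twice as far, closer to the horizon
console.log(projectToScreenY(120, 248, 128, 120, 60)); // -60: a peak above the camera, off the top of the screen
```

Dividing by depth is what produces perspective: the same height difference shrinks toward the horizon row as the sample moves away.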

Because we render from front to back, we keep track, for each column, of the highest point drawn so far (the smallest screen Y, since Y grows downward). If a sampled point in the distance projects at or below that mark, it is hidden behind terrain we've already drawn, and we can skip it! This gives us occlusion culling almost for free: a single value per column (a "Y-buffer") does the job that a full Z-buffer does in a polygon renderer.
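
A small sketch of that bookkeeping for a single column. The visibleSpans helper and the three samples are hypothetical; a real column would take hundreds of samples from the heightmap.

```javascript
// Walk one column front to back and record which vertical spans get drawn.
// `samples` is a hypothetical list of { depth, terrainHeight } along one ray.
function visibleSpans(samples, camera, screenHeight) {
    const spans = [];
    let maxScreenY = screenHeight; // nothing drawn yet: the mask starts at the bottom edge
    for (const { depth, terrainHeight } of samples) {
        let y = Math.floor((camera.height - terrainHeight) / depth * camera.scale + camera.horizon);
        if (y < 0) y = 0;              // clamp to the top of the screen
        if (y >= maxScreenY) continue; // hidden behind something nearer
        spans.push({ depth, from: y, to: maxScreenY });
        maxScreenY = y;                // raise the mask
    }
    return spans;
}

const cam = { height: 120, scale: 120, horizon: 60 };
const ray = [
    { depth: 64,  terrainHeight: 56 },  // near ground, projects to row 180
    { depth: 128, terrainHeight: 120 }, // a ridge at camera height, projects to row 60
    { depth: 256, terrainHeight: 56 },  // a valley behind the ridge, projects to row 90
];
console.log(visibleSpans(ray, cam, 240));
// Two spans: rows 180..240 and rows 60..180. The valley is skipped.
```

The valley projects to row 90, which is below the ridge's row 60, so it never touches the screen buffer.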

Building Our Voxel Space Renderer in JavaScript

Let’s write a compact, reasonably fast implementation using the HTML5 Canvas API. To keep the inner loop cheap, we will bypass canvas drawing commands like lineTo or fillRect, whose per-call overhead adds up when a frame needs thousands of short vertical spans. Instead, we'll write pixels directly into an ImageData buffer, viewed as an array of 32-bit integers so that a single write sets a whole RGBA pixel.
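
The trick is that two typed-array views can share the same memory. Here is a sketch that runs without a canvas, using a plain Uint8ClampedArray as a stand-in for ImageData.data; the packColor helper is an assumption for illustration.

```javascript
// A stand-in for ImageData.data: 2 pixels x 4 bytes (R, G, B, A).
const bytes = new Uint8ClampedArray(2 * 4);
const pixels = new Uint32Array(bytes.buffer); // same memory, one integer per pixel

// Detect byte order instead of assuming it.
const littleEndian = new Uint8Array(new Uint32Array([1]).buffer)[0] === 1;

// Pack a color so that the bytes land in R, G, B, A order in memory.
function packColor(r, g, b, a = 255) {
    return littleEndian
        ? ((a << 24) | (b << 16) | (g << 8) | r) >>> 0
        : ((r << 24) | (g << 16) | (b << 8) | a) >>> 0;
}

pixels[1] = packColor(48, 160, 224);        // one 32-bit write sets all four channels
console.log(Array.from(bytes.slice(4, 8))); // [48, 160, 224, 255]
```

On the little-endian machines that nearly all browsers run on, the packed value reads as 0xAABBGGRR, which is why hard-coded color constants in this style look "backwards".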

Step 1: Setting up the HTML Canvas

First, we need a standard HTML structure to host our renderer and a canvas element. We will use a relatively low resolution (e.g., 320x240 or 640x480). Part of the aesthetic of Voxel Space is its pixelated, retro charm, and keeping the resolution low makes 60 frames per second a realistic target even on modest hardware such as phones.

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Voxel Space Demo</title>
    <style>
        body { margin: 0; background: #000; overflow: hidden; display: flex; justify-content: center; align-items: center; height: 100vh; }
        canvas { width: 100vw; height: 100vh; image-rendering: pixelated; object-fit: contain; }
    </style>
</head>
<body>
    <canvas id="screen" width="320" height="240"></canvas>
    <script src="voxel.js"></script>
</body>
</html>

Step 2: Loading Map Data into Memory

To render the world, we must load our color and heightmaps and extract their raw pixel data. In our JavaScript file, we'll write a helper that loads each image, draws it to an off-screen canvas, and copies the pixels (which getImageData returns as a Uint8ClampedArray) into our own typed arrays.

const canvas = document.getElementById('screen');
const ctx = canvas.getContext('2d');
const width = canvas.width;
const height = canvas.height;

// Allocate screen buffer for direct pixel manipulation
const screenImg = ctx.createImageData(width, height);
const screenBuffer = new Uint32Array(screenImg.data.buffer);

const mapSize = 1024; // Assuming 1024x1024 textures
let colorMap = new Uint32Array(mapSize * mapSize);
let heightMap = new Uint8Array(mapSize * mapSize);

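// Helper to load images. Resolves once the map data has been copied into the
// typed arrays above; rejects if the image cannot be loaded.
function loadMap(url, isHeightMap) {
    return new Promise((resolve, reject) => {
        const img = new Image();
        img.onerror = () => reject(new Error('Could not load ' + url));
        img.onload = () => {
            const tempCanvas = document.createElement('canvas');
            tempCanvas.width = mapSize;
            tempCanvas.height = mapSize;
            const tempCtx = tempCanvas.getContext('2d');
            // Stretch to mapSize x mapSize in case the source image has another size
            tempCtx.drawImage(img, 0, 0, mapSize, mapSize);
            // Note: getImageData throws if the image is cross-origin without CORS headers
            const imgData = tempCtx.getImageData(0, 0, mapSize, mapSize);

            if (isHeightMap) {
                // For heightmap, we only need one channel (e.g., Red) to get grayscale value (0-255)
                for (let i = 0; i < mapSize * mapSize; i++) {
                    heightMap[i] = imgData.data[i * 4];
                }
            } else {
                // For colormap, view the RGBA bytes as 32-bit unsigned integers so each
                // pixel can later be copied to the screen buffer with a single write
                const view = new Uint32Array(imgData.data.buffer);
                colorMap.set(view);
            }
            resolve();
        };
        // Assign src last so both handlers are in place before loading starts
        img.src = url;
    });
}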
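
The channel extraction is easy to check in isolation. This sketch pulls the red channel out of a tiny hand-written RGBA buffer; the extractHeights name and the two-pixel image are hypothetical.

```javascript
// Copy the red channel out of an RGBA byte buffer into a compact height array.
function extractHeights(rgba, pixelCount) {
    const heights = new Uint8Array(pixelCount);
    for (let i = 0; i < pixelCount; i++) {
        heights[i] = rgba[i * 4]; // byte 0 of each pixel is red
    }
    return heights;
}

// Hypothetical 2x1 grayscale image: one dark pixel, one bright pixel.
const rgba = new Uint8ClampedArray([
    10, 10, 10, 255,
    240, 240, 240, 255,
]);
console.log(Array.from(extractHeights(rgba, 2))); // [10, 240]
```

In a grayscale image the red, green, and blue bytes are equal, so reading any one of them gives the elevation while storing a quarter of the data.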

Step 3: The Core Render Loop

Now, let’s look at the engine's core. We will define a camera object that tracks the player's 2D position (x, y), the camera height above sea level (height), the angle they are looking (angle), the screen row where the horizon sits (horizon), the rendering distance (distance), and a vertical scaling factor for the projection (scale_height).

To keep the inner loop cheap, we compute the sine and cosine of the camera angle once per frame rather than once per column or per sample. Each column's ray direction is then a simple linear interpolation across the field of view, rotated by the camera angle.

const camera = {
    x: 512.0,
    y: 512.0,
    height: 120.0,
    angle: 0.0,
    horizon: 60,
    distance: 400,
    scale_height: 120.0
};

function render() {
    // Clear the screen buffer with a sky color (light blue)
    screenBuffer.fill(0xFFE0A030); // Packed as 0xAABBGGRR, which is RGBA byte order on little-endian CPUs

    const sinAngle = Math.sin(camera.angle);
    const cosAngle = Math.cos(camera.angle);

    // Loop through every vertical column on the screen (from left to right)
    for (let i = 0; i < width; i++) {
        // Calculate the direction vector for this specific column's ray based on Field of View (FOV)
        const dx = (i / width - 0.5) * 2.0; 
        
        // Transform screen ray space to world space using camera rotation
        const rx = dx * cosAngle - sinAngle;
        const ry = dx * sinAngle + cosAngle;

        // Keep track of the highest pixel drawn so far for this column.
        // We initialize this to the bottom of the screen.
        let maxScreenY = height;

        // Step from front to back along the ray
        for (let depth = 1.0; depth < camera.distance; depth += 1.5) {
            // Find map coordinates corresponding to this depth step
            let mapX = Math.floor(camera.x + rx * depth) & (mapSize - 1);
            let mapY = Math.floor(camera.y + ry * depth) & (mapSize - 1);

            const mapOffset = mapY * mapSize + mapX;
            const terrainHeight = heightMap[mapOffset];
            const color = colorMap[mapOffset];

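            // Project 3D point to 2D screen coordinate
            let projectedHeight = Math.floor((camera.height - terrainHeight) / depth * camera.scale_height + camera.horizon);

            // Clamp terrain that projects above the top of the screen, so that it is
            // still drawn and still occludes whatever lies behind it
            if (projectedHeight < 0) projectedHeight = 0;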

            // If the projected point is higher up on the screen (smaller Y value) than our previous maximum,
            // we have found a visible vertical segment of terrain!
            if (projectedHeight < maxScreenY) {
                // Draw a vertical line from the projected height down to the previously drawn highest point
                for (let y = projectedHeight; y < maxScreenY; y++) {
                    screenBuffer[y * width + i] = color;
                }
                // Update our occlusion mask
                maxScreenY = projectedHeight;
            }
        }
    }

    // Put our modified pixel buffer onto the visible canvas
    ctx.putImageData(screenImg, 0, 0);
}
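
To see what the ray setup in that loop produces, here is a sketch that evaluates it for a couple of columns. The columnRay helper is an assumption that mirrors the dx, rx, and ry lines above.

```javascript
// Ray direction for screen column i, using the same rotation as the render loop.
function columnRay(i, width, angle) {
    const dx = (i / width - 0.5) * 2.0; // -1 at the left edge, approaching +1 at the right
    return {
        rx: dx * Math.cos(angle) - Math.sin(angle),
        ry: dx * Math.sin(angle) + Math.cos(angle),
    };
}

const width = 320;
const left = columnRay(0, width, 0);           // rx: -1, ry: 1
const centre = columnRay(width / 2, width, 0); // rx: 0, ry: 1
console.log(left, centre);

// Angle between the left edge ray and the view direction, in degrees.
const halfFov = Math.atan2(Math.abs(left.rx), left.ry) * 180 / Math.PI;
console.log(halfFov); // about 45, so the full field of view is roughly 90 degrees
```

Two things fall out of this. At angle 0 the centre column looks along (0, 1), and in general along (-sin(angle), cos(angle)). And because the rays are not normalized, depth measures distance along the view axis rather than along each ray, which avoids the fisheye distortion a naive ray-caster would show.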

Step 4: Making It Interactive

What good is a beautiful retro world if we can't explore it? We can hook up keyboard event listeners for the WASD and arrow keys, update the camera once per frame, and redraw.

const keys = {};
window.onkeydown = (e) => keys[e.key] = true;
window.onkeyup = (e) => keys[e.key] = false;

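function update() {
    // With the rotation used in render(), a larger angle swings the view to the left
    if (keys['ArrowLeft'] || keys['a'])  camera.angle += 0.03;
    if (keys['ArrowRight'] || keys['d']) camera.angle -= 0.03;

    // The centre column in render() looks along (-sin(angle), cos(angle)),
    // so forward and backward movement must follow that same vector
    const speed = 2.0;
    const forwardX = -Math.sin(camera.angle);
    const forwardY = Math.cos(camera.angle);
    if (keys['ArrowUp'] || keys['w']) {
        camera.x += forwardX * speed;
        camera.y += forwardY * speed;
    }
    if (keys['ArrowDown'] || keys['s']) {
        camera.x -= forwardX * speed;
        camera.y -= forwardY * speed;
    }
    if (keys['r']) camera.height += 1.5;
    if (keys['f']) camera.height -= 1.5;

    // Wrap the camera around the map edges. Modulo keeps the fractional part;
    // a bitwise AND would truncate the position to an integer every frame.
    camera.x = ((camera.x % mapSize) + mapSize) % mapSize;
    camera.y = ((camera.y % mapSize) + mapSize) % mapSize;

    render();
    requestAnimationFrame(update);
}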

// Kickstart engine after loading textures
async function init() {
    await Promise.all([
        loadMap('color.png', false),
        loadMap('height.png', true)
    ]);
    update();
}
init();
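
One optional refinement, sketched here under assumptions: the loop above moves a fixed distance per frame, so the camera travels faster on a 120 Hz display than on a 60 Hz one. Scaling the step by elapsed time evens that out. The wrap and moveForward helpers and their parameters are hypothetical; the forward vector (-sin, cos) is the direction the centre screen column looks along in render().

```javascript
const mapSize = 1024;

// Wrap a coordinate into [0, mapSize) without discarding its fractional part.
function wrap(value) {
    return ((value % mapSize) + mapSize) % mapSize;
}

// Move along the view direction of the centre screen column: (-sin, cos).
// `unitsPerSecond` and `dtSeconds` are hypothetical parameters for this sketch.
function moveForward(camera, unitsPerSecond, dtSeconds) {
    const step = unitsPerSecond * dtSeconds;
    camera.x = wrap(camera.x - Math.sin(camera.angle) * step);
    camera.y = wrap(camera.y + Math.cos(camera.angle) * step);
}

const cam = { x: 512, y: 1023.5, angle: 0 };
moveForward(cam, 96, 1 / 64); // 1.5 map units straight ahead
console.log(cam.x, cam.y);    // 512 1
```

In the browser, dtSeconds would come from the timestamp that requestAnimationFrame passes to its callback, divided by 1000.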

Why Does This Algorithm Matter Today?

You might look at Voxel Space and think, "Alex, this is a neat toy, but we have modern GPUs and WebGL. Why should I care about this in 2024?"

There are three critical lessons modern developers can take away from this retro technology:

1. CPU Cache Locality & Memory Efficiency

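Modern CPUs are insanely fast, but RAM latency is still a massive bottleneck. The Voxel Space engine keeps its entire world in two flat typed arrays (about 1 MB for the heightmap and 4 MB for the colormap at 1024x1024), with no pointers to chase and no per-object allocations. Each ray walks a straight line through that data, so consecutive samples tend to land close together in memory. The access pattern is not perfectly sequential: in our column-by-column loop, the pixels of one screen column sit a full row (width entries) apart in the canvas buffer. Still, the broader lesson holds. By storing your application's data as compact, contiguous arrays (Data-Oriented Design), you can get large performance gains without relying on heavy parallelization.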

2. The Power of Software Fallbacks

Sometimes, spinning up a heavy WebGL or WebGPU context is overkill. If you are building low-power IoT dashboards, digital signage, or ultra-lightweight retro games, being able to render interactive, beautiful 3D landscapes purely on the CPU using 2D Canvas is a real advantage. It runs anywhere the 2D canvas does, and it sidesteps GPU driver quirks and lost WebGL contexts.

3. Algorithmic Thinking over Brute Force

Today, we tend to solve performance issues by throwing hardware at them: faster servers, bigger GPUs, or a multi-megabyte library for a simple UI transition. Voxel Space is a stark reminder that smart, creative compromises in design (like assuming a fixed camera tilt and scanning the screen column by column) can deliver performance that would otherwise require orders of magnitude more computing power.

Conclusion: The Beauty of Doing More with Less

Voxel Space shows us that elegance in software development isn't about using the newest, most complex framework. It's about knowing your constraints, understanding your data, and writing clean, intentional code to solve a problem beautifully.

To try this out yourself, you just need a 1024x1024 heightmap (even a simple grayscale Perlin noise map will work) and a matching color texture, saved as height.png and color.png. Put the HTML above in a file, save the JavaScript next to it as voxel.js, run a local server (like python3 -m http.server) so the browser allows the pixel data to be read, and fly through your own custom voxel world!
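
If you have no textures at hand, you could also fill the two arrays procedurally. The sketch below is an assumption-laden stand-in: it uses plain sine waves rather than Perlin noise, an arbitrary elevation-based color ramp, and little-endian color packing.

```javascript
// Fill a heightmap and a color map procedurally (no image files needed).
const mapSize = 1024;
const heightMap = new Uint8Array(mapSize * mapSize);
const colorMap = new Uint32Array(mapSize * mapSize);

// Pack opaque RGB as 0xAABBGGRR, the layout a little-endian machine expects.
const rgb = (r, g, b) => ((255 << 24) | (b << 16) | (g << 8) | r) >>> 0;

for (let y = 0; y < mapSize; y++) {
    for (let x = 0; x < mapSize; x++) {
        // Whole numbers of waves fit across the map, so the terrain tiles seamlessly.
        const u = (x / mapSize) * 2 * Math.PI;
        const v = (y / mapSize) * 2 * Math.PI;
        const h = 127.5 + 63 * Math.sin(4 * u) + 63 * Math.cos(6 * v); // within 1.5..253.5
        const i = y * mapSize + x;
        heightMap[i] = h; // fractional part is dropped on assignment
        // Shade by elevation: darker green valleys, paler peaks.
        colorMap[i] = rgb(h * 0.6, 80 + h * 0.6, h * 0.4);
    }
}
console.log(heightMap[0], colorMap[0].toString(16));
```

With these arrays in place of the loaded images, init() could skip the loadMap calls and go straight to update().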

Over to you: Have you ever worked with retro rendering techniques or ray-casting engines? What is your favorite piece of elegant, old-school game design math? Let me know in the comments below, and don't forget to subscribe for more deep dives into retro-engineering and modern web performance!