How many times have you been on a flight, in a remote cabin, or stuck in a basement server room with zero internet access, wishing you had the complete, interactive documentation for that complex framework you’re wrangling? We’ve all been there. We try saving pages as PDFs, running wget --mirror, or using pocket apps, only to find the CSS is broken, the local fonts are missing, and the interactive JavaScript elements are completely dead.
Enter Kage (pronounced kah-geh, meaning "shadow" in Japanese), a newly open-sourced tool that has been making waves on Hacker News. Kage isn't just another command-line web scraper. It compiles an entire website—assets, scripts, styles, and markup—into a single, self-contained, executable Go binary. You run the binary, and it spins up a local web server hosting a perfect, offline "shadow" of the site.
For developers, DevOps engineers, and system administrators, this is incredibly powerful. Let’s dive deep into how Kage works, its architectural design, why it beats traditional archiving tools, and how you can use it in your daily workflow.
The Problem with Traditional Web Archiving
To appreciate why Kage is such a breath of fresh air, we need to look at the shortcomings of existing tools. For decades, developers have relied on tools like wget, curl, or HTTrack to scrape websites for offline use. While these tools are classic, they fail spectacularly on modern web applications for three major reasons:
- Asset Fragmentation: A scraped site leaves you with a messy directory containing thousands of HTML, CSS, JS, and image files. Moving, sharing, or managing this directory is a headache.
- Absolute vs. Relative Paths: Classic scrapers often struggle to rewrite URLs dynamically. If a script loads a resource via an absolute path or a CDN, the offline version breaks the moment your network connection drops.
- The Local File Origin Policy: Opening a raw index.html directly from your file system (using the file:// protocol) triggers CORS and security blocks in modern browsers. Active JS components and API simulations simply won't run.
Kage addresses all three problems with a different approach. Instead of saving a folder of assets to your disk, it packages those assets directly into a compiled Go binary using Go's embed package. When you execute the binary, it launches a lightweight local HTTP server that serves the assets over localhost. Pages therefore load from an ordinary http:// origin rather than file://, which avoids the browser restrictions described above and lets the shadow behave much like the original site.
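To make the idea concrete, here is a minimal sketch of serving an in-memory file tree over HTTP with Go's standard library. It is not Kage's actual code: the names newShadowHandler, fetch, and demoAssets are made up for illustration, and fstest.MapFS stands in for the embed.FS a real build would use, so the example runs without any files on disk.

```go
package main

import (
	"fmt"
	"io"
	"io/fs"
	"net/http"
	"net/http/httptest"
	"testing/fstest"
)

// newShadowHandler returns a handler that serves every file in assets.
// In a real build, assets would be an embed.FS filled in by //go:embed.
func newShadowHandler(assets fs.FS) http.Handler {
	return http.FileServer(http.FS(assets))
}

// demoAssets is a hypothetical two-file site held entirely in memory.
func demoAssets() fs.FS {
	return fstest.MapFS{
		"index.html":   {Data: []byte("<h1>docs</h1>")},
		"css/site.css": {Data: []byte("body{margin:0}")},
	}
}

// fetch sends an in-process GET request to h and returns status and body.
func fetch(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	body, _ := io.ReadAll(rec.Result().Body)
	return rec.Code, string(body)
}

func main() {
	h := newShadowHandler(demoAssets())
	code, body := fetch(h, "/css/site.css")
	fmt.Println(code, body)
}
```

Because the handler only depends on the fs.FS interface, the same code works whether the files come from memory, from disk, or from data baked into the binary.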
How Kage Works: Under the Hood
Kage's architecture is elegant and leverages the power of Go’s compiler ecosystem. The process can be broken down into three distinct phases: Crawling, Embedding, and Compilation.
1. Deep Crawling and Rewriting
Kage starts by recursively crawling the target URL. It doesn't just download HTML; it parses the DOM to discover stylesheet links, script tags, images, web fonts, and manifest files. Critically, it rewrites these asset references to point to local, relative paths, ensuring that no external requests leak out to the WAN when the shadow site is run.
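The rewriting step can be sketched roughly as follows. This is an illustration under my own assumptions, not Kage's real logic: the function names and the _external/<host>/ layout for third-party assets are hypothetical, and query strings are ignored for simplicity.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"path/filepath"
	"strings"
)

// treePath returns where a resolved URL would live inside the local asset
// tree. Assets from other hosts go under /_external/<host>/ (an assumed layout).
func treePath(u *url.URL, siteHost string) string {
	p := path.Clean("/" + u.Path)
	if p == "/" || strings.HasSuffix(u.Path, "/") {
		p = path.Join(p, "index.html") // directory URLs map to an index file
	}
	if u.Host != siteHost {
		p = path.Join("/_external", u.Host, p)
	}
	return p
}

// rewriteRef converts ref, as found on the page at pageURL, into a path
// relative to that page's own location in the asset tree.
func rewriteRef(pageURL, ref string) (string, error) {
	page, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	target, err := page.Parse(ref) // resolve ref against the page URL
	if err != nil {
		return "", err
	}
	pageDir := path.Dir(treePath(page, page.Host))
	rel, err := filepath.Rel(pageDir, treePath(target, page.Host))
	if err != nil {
		return "", err
	}
	return filepath.ToSlash(rel), nil
}

func main() {
	page := "https://docs.example.com/guide/intro.html"
	for _, ref := range []string{
		"/css/site.css",
		"img/logo.png",
		"https://cdn.example.net/lib/app.js",
	} {
		rel, err := rewriteRef(page, ref)
		fmt.Println(ref, "->", rel, err)
	}
}
```

The key point is that every reference, including the CDN one, ends up as a relative path inside the tree, so nothing needs the network at view time.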
2. Go embed Generation
Once the assets are downloaded and structured, Kage generates a temporary Go workspace. It uses Go's embed package (introduced in Go 1.16), which allows programs to include arbitrary files and directories in the compiled binary at build time. Kage writes a small, customized Go web server wrapper around these embedded files.
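As a rough idea of what this code generation could look like, the sketch below renders a small server harness from a template and runs it through gofmt's library equivalent. The template contents, the harnessConfig type, and the generateHarness function are all assumptions of mine; the real generated source may look quite different. The all: prefix on the embed pattern (available since Go 1.18) also includes files whose names start with a dot or underscore.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
	"text/template"
)

// harnessTmpl is a hypothetical template for the generated web server.
var harnessTmpl = template.Must(template.New("harness").Parse(`package main

import (
	"embed"
	"io/fs"
	"log"
	"net/http"
)

//go:embed all:{{.AssetDir}}
var assets embed.FS

func main() {
	site, err := fs.Sub(assets, "{{.AssetDir}}")
	if err != nil {
		log.Fatal(err)
	}
	log.Fatal(http.ListenAndServe("{{.Addr}}", http.FileServer(http.FS(site))))
}
`))

// harnessConfig holds the values substituted into the template.
type harnessConfig struct {
	AssetDir string // directory of crawled assets, relative to the harness
	Addr     string // listen address baked into the generated server
}

// generateHarness renders the template and formats the result.
// format.Source also rejects output that is not valid Go syntax.
func generateHarness(cfg harnessConfig) (string, error) {
	var buf bytes.Buffer
	if err := harnessTmpl.Execute(&buf, cfg); err != nil {
		return "", err
	}
	src, err := format.Source(buf.Bytes())
	if err != nil {
		return "", err
	}
	return string(src), nil
}

func main() {
	src, err := generateHarness(harnessConfig{AssetDir: "site", Addr: "127.0.0.1:8080"})
	if err != nil {
		fmt.Println("generate:", err)
		return
	}
	fmt.Print(src)
}
```

Writing the printed source next to a site/ directory and building it would produce a binary that serves that directory's contents.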
3. Static Compilation
Finally, Kage calls the Go compiler to build a statically linked binary for your target platform (e.g., Linux, macOS, or Windows, on amd64 or arm64). Because pure-Go programs built with cgo disabled don't rely on dynamic system libraries, the resulting file is self-contained and can be copied to any machine with a matching OS and CPU architecture.
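Here is a small sketch of how a tool might drive that build step from Go. The buildCommand helper and the paths are hypothetical; the environment variables (CGO_ENABLED, GOOS, GOARCH) and the go build flags are standard Go toolchain features. The command is only prepared and printed, not executed.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// buildCommand prepares (but does not run) a cross-compiling `go build`
// for the harness in workDir.
func buildCommand(workDir, output, goos, goarch string) *exec.Cmd {
	cmd := exec.Command("go", "build",
		"-trimpath",       // drop local file system paths from the binary
		"-ldflags=-s -w",  // strip symbol table and debug info to cut size
		"-o", output, ".")
	cmd.Dir = workDir
	cmd.Env = append(os.Environ(),
		"CGO_ENABLED=0", // pure-Go build, so no dynamic libc dependency
		"GOOS="+goos,
		"GOARCH="+goarch,
	)
	return cmd
}

func main() {
	cmd := buildCommand("/tmp/kage-build", "my-docs-app", "linux", "amd64")
	fmt.Println(cmd.Args)
	// Calling cmd.Run() here would perform the actual build.
}
```

Setting GOOS and GOARCH is all it takes to cross-compile, which is why one laptop can produce binaries for every platform a team uses.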
Here is a conceptual diagram of Kage's workflow:
[ Target Website ]
│
▼ (1. Crawl & Rewrite)
[ Local Asset Tree (HTML, JS, CSS, Web Fonts) ]
│
▼ (2. Code Generation)
[ Temp Go Source with //go:embed ]
│
▼ (3. Go Compiler)
[ Statically Linked Binary (e.g., kage-site-amd64) ]
│
▼ (Execution)
[ Spins up Local HTTP Server on Port 8080 ] ---> Perfect Offline UX
Setting Up Kage: A Practical Walkthrough
Let’s get our hands dirty. Since Kage is written in Go, you’ll need the Go toolchain installed on your machine (version 1.20 or later is recommended). Let's install Kage and use it to shadow a documentation site.
Step 1: Installation
You can install Kage directly via go install:
go install github.com/username/kage@latest
(Note: github.com/username/kage is a placeholder. Replace it with the official Kage repository path listed on the project's GitHub page.)
Step 2: Shadowing a Site
Let's say we want to shadow a static documentation site, such as a local Hugo blog or a framework's reference guide. We run the kage command, passing the target URL and the desired output name for our binary:
kage clone --url https://docs.example.com --output my-docs-app
Kage will output its progress to the terminal as it crawls the site, resolves assets, generates the Go embedding code, and compiles the binary:
[+] Crawling https://docs.example.com...
[+] Found 142 assets (HTML, JS, CSS, PNG)
[+] Rewriting asset paths for offline compatibility...
[+] Generating Go source harness...
[+] Compiling binary 'my-docs-app' for darwin/arm64...
[+] Success! Binary created: ./my-docs-app (Size: 14.2 MB)
Step 3: Running Your Offline Shadow
You now have a single, executable file named my-docs-app. You can copy this file to a USB drive, send it to a colleague, or drop it into a secure, air-gapped environment. To view the site, simply run the binary:
./my-docs-app --port 9090
Open your browser and navigate to http://localhost:9090. You are now browsing the target website entirely offline, served by a highly optimized, concurrent Go web server.
Real-World Developer Use Cases
While having offline documentation is the most obvious use case, Kage opens up several fascinating possibilities for DevOps and development teams:
1. Air-Gapped Environments and Secure Enclaves
If you work in banking, defense, or critical infrastructure, your production environments are likely air-gapped (completely disconnected from the public internet). Installing dependencies or reading documentation in these environments is notoriously difficult. With Kage, you can shadow essential API wikis, internal library documentation, or compliance guides into single binaries, verify their hashes, and safely run them inside the secure perimeter.
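The hash verification step doesn't need anything special; a few lines of standard-library Go will do. The sketch below is my own illustration (the function names are made up) and assumes the expected SHA-256 digest was recorded on the trusted side before the binary crossed the air gap.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"strings"
)

// hashReader streams r through SHA-256 and returns the hex digest.
func hashReader(r io.Reader) (string, error) {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// hashBytes is a convenience wrapper for in-memory data.
func hashBytes(b []byte) string {
	sum, _ := hashReader(bytes.NewReader(b)) // reading from memory cannot fail
	return sum
}

// verifyFile reports whether the file at path matches wantHex.
func verifyFile(path, wantHex string) (bool, error) {
	f, err := os.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()
	got, err := hashReader(f)
	if err != nil {
		return false, err
	}
	return strings.EqualFold(got, strings.TrimSpace(wantHex)), nil
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: verify <binary> <expected-sha256-hex>")
		return
	}
	ok, err := verifyFile(os.Args[1], os.Args[2])
	if err != nil {
		fmt.Println("error:", err)
		os.Exit(1)
	}
	if !ok {
		fmt.Println("hash mismatch")
		os.Exit(1)
	}
	fmt.Println("hash OK")
}
```

Standard tools such as sha256sum or shasum -a 256 produce the same digest, so use whichever is already approved inside your perimeter.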
2. Zero-Dependency Demos for Clients
If you are showcasing a frontend web design or static prototype to a client, relying on conference Wi-Fi is a recipe for disaster. Instead of setting up a local Node.js environment or Docker containers on your presenter laptop, you can compile the entire prototype into a single binary. You can run it on any machine with a single click, completely independent of local runtimes or environment variables.
3. Archiving Legacy Internal Tools
Every engineering team has that legacy internal tool—the one running on an old server that everyone is afraid to touch. If the tool is largely informational or static, you can use Kage to take a snapshot "shadow" of it, compile it, and store it in your team’s shared drive or S3 bucket as a permanent, immutable archive before turning off the expensive legacy server.
Comparing Kage with the Alternatives
To help you decide when to use Kage versus traditional tools, here is a quick comparison matrix:
| Feature | Kage | Wget / HTTrack | Docker Container |
|---|---|---|---|
| Output Format | Single Compiled Binary | Folder of loose files | Docker Image |
| Host Dependencies | None (Self-contained) | Browser or Local Server | Docker Engine installed |
| Network Footprint | Zero (completely local) | Local, but pathing can leak | Zero (after download) |
| Ease of Sharing | Excellent (single file) | Poor (requires zipping) | Moderate (requires registry) |
| Memory Footprint | Very Low (~10-15MB RAM) | N/A (Browser dependent) | Medium to High |
Security Considerations
When using Kage, security should remain top of mind. Because Kage embeds JavaScript files directly from the target site into a binary that runs on your local machine, you must only shadow trusted websites.
If you shadow a malicious site, or a site that has been compromised by injected scripts (for example through a stored Cross-Site Scripting attack), that malicious JS will execute within your local browser context when you visit localhost. Additionally, because the local server opens a listening port on your machine, always ensure Kage binds to 127.0.0.1 (which it does by default) rather than 0.0.0.0, to prevent exposing your offline shadow to others on your local network.
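If you ever write your own wrapper around a shadow binary, the loopback rule is easy to enforce in code. The following is a sketch with hypothetical helper names, not something taken from Kage itself. It builds a 127.0.0.1 address from a --port flag and checks whether a given listen address is loopback-only.

```go
package main

import (
	"flag"
	"fmt"
	"net"
	"strconv"
)

// listenAddr builds a loopback-only listen address for the given port.
func listenAddr(port int) string {
	return net.JoinHostPort("127.0.0.1", strconv.Itoa(port))
}

// isLoopbackOnly reports whether addr (host:port) names a loopback IP.
// Empty hosts (":9090" binds every interface) and hostnames are rejected,
// which is deliberately conservative.
func isLoopbackOnly(addr string) bool {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return false
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback()
}

func main() {
	port := flag.Int("port", 8080, "port to listen on")
	flag.Parse()

	addr := listenAddr(*port)
	fmt.Println(addr, "loopback only:", isLoopbackOnly(addr))
	for _, other := range []string{"0.0.0.0:9090", ":9090", "[::1]:9090"} {
		fmt.Println(other, "loopback only:", isLoopbackOnly(other))
	}
}
```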
Wrapping Up: Shadow All the Things
Kage is an excellent showcase of the modern Go philosophy: taking complex operations, packaging them into a single, predictable, compile-once-run-anywhere artifact, and improving developer quality of life. It bridges the gap between raw web scraping and Docker-style application packaging, giving us an incredibly lightweight way to keep our favorite web resources close at hand, no matter where our offline adventures take us.
Have you tried Kage yet? What documentation or legacy sites are you planning to shadow? Let me know in the comments below, or share your experiences with offline dev setups on our community forum!
Until next time, happy coding! — Alex