If you've ever written software for constrained devices—whether it's an STM32 microcontroller running bare-metal C or an ESP32 managing a smart home sensor—you know the absolute terror of the heap. In the world of embedded systems, dynamic memory allocation (malloc and free) is the enemy. It introduces non-deterministic execution times, risks memory fragmentation, and, in worst-case scenarios, leads to catastrophic out-of-memory crashes in devices that are supposed to run untouched for a decade.
That is why the embedded security world just got a major upgrade. wolfSSL recently announced the release of wolfCOSE, a zero-allocation, lightweight implementation of the CBOR Object Signing and Encryption (COSE) standard written in embedded C.
Today, we are going to dive deep into what COSE is, why zero-allocation library design is a massive win for firmware engineers, and how you can leverage wolfCOSE to secure your resource-constrained Internet of Things (IoT) deployments without breaking your memory budget.
What is COSE (and Why Should Web Devs and IoT Engineers Care)?
To understand why wolfCOSE is a big deal, we first need to look at the standard it implements. If you come from a web development background, you are likely intimately familiar with JOSE (JSON Object Signing and Encryption). JOSE gives us JSON Web Tokens (JWTs), JSON Web Signatures (JWS), and JSON Web Encryption (JWE). It's how we securely pass identity claims between microservices and APIs.
But JSON is verbose. It is text-based, requires string parsing, and consumes a lot of bandwidth and RAM. In the resource-constrained world of IoT, where microcontrollers might have only 16 KB of RAM and communicate over low-power wide-area networks (LPWANs) such as NB-IoT or LoRaWAN, JOSE is often a non-starter.
Enter CBOR (Concise Binary Object Representation), defined in RFC 8949. Think of CBOR as a binary equivalent of JSON. It is incredibly compact and trivial to parse in code.
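To make the size difference concrete, here is a minimal sketch that holds the same reading, {"temp": 22.5}, as JSON text (13 characters) and as hand-encoded CBOR (9 bytes). The names json_reading, cbor_reading, and decode_half_normal are made up for this illustration and are not part of any library. The decoder only handles normal half-precision values, which is enough to confirm the float bytes:

```c
#include <stddef.h>
#include <stdint.h>

/* The same reading as JSON text and as CBOR bytes (RFC 8949 encoding). */
static const char json_reading[] = "{\"temp\":22.5}";

static const uint8_t cbor_reading[] = {
    0xA1,                     /* map with 1 key/value pair        */
    0x64, 't', 'e', 'm', 'p', /* text string of length 4: "temp"  */
    0xF9, 0x4D, 0xA0          /* half-precision float: 22.5       */
};

/* Decode a big-endian IEEE 754 half-precision value.
 * Sketch only: returns -1 for zero/subnormal (exp == 0) and
 * infinity/NaN (exp == 31), 0 on success. */
static int decode_half_normal(uint8_t hi, uint8_t lo, double *out)
{
    unsigned half = ((unsigned)hi << 8) | lo;
    int exp       = (int)((half >> 10) & 0x1F);
    unsigned mant = half & 0x3FF;
    double val;
    int shift;

    if (exp == 0 || exp == 31) return -1;

    /* value = (1024 + mant) * 2^(exp - 25) */
    val   = (double)(mant + 1024);
    shift = exp - 25;
    while (shift > 0) { val *= 2.0; shift--; }
    while (shift < 0) { val /= 2.0; shift++; }

    *out = (half & 0x8000) ? -val : val;
    return 0;
}
```

The saving looks small for one reading, but the CBOR form also needs no string-to-number conversion on the device: the float bits are already there.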
COSE (RFC 9052) is to CBOR what JOSE is to JSON. It provides a standardized binary format for:
- Signing data (digitally signing firmware updates or sensor telemetry)
- Encrypting data (protecting payloads over the wire)
- Message authentication codes (MACs) for integrity verification
COSE is the backbone of modern lightweight security standards, including EDHOC (Ephemeral Diffie-Hellman Over COSE) and LwM2M (for which Eclipse Wakaama is one open-source implementation), and is increasingly used for securing firmware-over-the-air (FOTA) updates.
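For a sense of what a signed message looks like on the wire, here is a rough sketch of the COSE_Sign1 envelope from RFC 9052: CBOR tag 18 around a four-element array of protected header, unprotected header, payload, and signature. The helper name cose_sign1_assemble is hypothetical and this is not wolfCOSE code. It assumes a fixed ES256 protected header, a payload shorter than 24 bytes, and a signature computed elsewhere (a real signer signs the separate Sig_structure, not these envelope bytes):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ES256_SIG_LEN 64  /* raw r || s for ECDSA over P-256 */

/* Assemble COSE_Sign1 = 18([protected, unprotected, payload, signature]).
 * Returns the number of bytes written, or 0 if the payload is too long
 * for this sketch or the caller's buffer is too small. */
static size_t cose_sign1_assemble(const uint8_t *payload, size_t payload_len,
                                  const uint8_t sig[ES256_SIG_LEN],
                                  uint8_t *out, size_t out_cap)
{
    static const uint8_t prefix[] = {
        0xD2,                   /* tag 18: COSE_Sign1                  */
        0x84,                   /* array of 4 items                    */
        0x43, 0xA1, 0x01, 0x26, /* protected: bstr(3) holding {1: -7}, */
                                /* i.e. alg = ES256                    */
        0xA0                    /* unprotected: empty map              */
    };
    size_t need = sizeof(prefix) + 1 + payload_len + 2 + ES256_SIG_LEN;
    size_t pos = 0;

    if (payload_len >= 24 || need > out_cap) return 0;

    memcpy(out, prefix, sizeof(prefix));
    pos += sizeof(prefix);

    out[pos++] = (uint8_t)(0x40 | payload_len); /* bstr head, length < 24 */
    memcpy(out + pos, payload, payload_len);
    pos += payload_len;

    out[pos++] = 0x58;                          /* bstr head, 1-byte length */
    out[pos++] = ES256_SIG_LEN;
    memcpy(out + pos, sig, ES256_SIG_LEN);
    pos += ES256_SIG_LEN;

    return pos;
}
```

With a 9-byte payload, the whole envelope comes to 83 bytes, of which 64 are the signature itself.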
The Power of "Zero-Allocation" C
So, why is wolfSSL’s focus on a "zero-allocation" (zero-alloc) stack such a headline feature?
In standard application programming (think Node.js, Go, or desktop C++), we don't think twice about creating objects on the fly. The runtime or OS manages the heap for us. In embedded systems, however, relying on dynamic memory allocation is highly discouraged for several critical reasons:
- Fragmentation: Over time, repeatedly allocating and freeing chunks of memory of varying sizes leaves holes in your heap. Eventually, a request for a contiguous block of memory fails, even if your total free memory is technically sufficient. Your device halts or panics.
- Determinism: malloc() does not run in constant time ($O(1)$). Depending on how fragmented the heap is, searching for a free block can take variable amounts of time, which ruins real-time guarantees (RTOS).
- Security: Heap-based buffer overflows and use-after-free bugs are primary vectors for remote code execution vulnerabilities. Eliminating the heap drastically reduces your attack surface.
A "zero-alloc" library like wolfCOSE works entirely on the call stack or within statically pre-allocated memory buffers provided by the caller. The library never calls malloc() or realloc(). This design gives you a deterministic memory footprint that can be bounded at build time, for example from the linker map and a stack-usage analysis.
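The caller-owns-the-memory pattern can be sketched in a few lines of plain C. Everything here (out_buf, out_buf_append, telemetry_storage, the BUF_* codes) is a hypothetical illustration, not wolfCOSE's API. The storage is a static array whose size the compiler knows, and a write that does not fit returns an error instead of growing the buffer. The _Static_assert assumes a C11 compiler:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BUF_OK = 0, BUF_TOO_SMALL = -1 };

/* Static storage: its size appears in the linker map, not on the heap. */
#define TELEMETRY_CAP 32
static uint8_t telemetry_storage[TELEMETRY_CAP];
_Static_assert(TELEMETRY_CAP >= 9, "storage must fit the smallest reading");

/* Fixed-size writer state; the library side never owns the memory. */
typedef struct {
    uint8_t *data; /* caller-provided storage   */
    size_t   cap;  /* total capacity in bytes   */
    size_t   len;  /* bytes written so far      */
} out_buf;

static void out_buf_init(out_buf *b, uint8_t *storage, size_t cap)
{
    b->data = storage;
    b->cap  = cap;
    b->len  = 0;
}

/* Append n bytes, or fail without touching the buffer if they do not fit. */
static int out_buf_append(out_buf *b, const void *src, size_t n)
{
    if (n > b->cap - b->len) return BUF_TOO_SMALL;
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return BUF_OK;
}
```

Because the capacity check happens before the copy, a failed append leaves the buffer exactly as it was, so the caller can report the error and carry on.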
Under the Hood: How wolfCOSE Achieves Zero-Allocation
How do you parse complex cryptographic structures without allocating memory? The magic lies in how wolfCOSE handles CBOR parsing and cryptographic operations.
Instead of building an abstract syntax tree (AST) of the CBOR object in memory (which would require dynamic nodes), wolfCOSE uses a single-pass, streaming-style parser. As it reads the binary stream, it tracks state using a small, fixed-size context struct on the stack.
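As an illustration of the idea (not wolfCOSE's actual parser), here is a minimal sketch of a single-pass CBOR reader whose entire state is one small cursor struct. The names cbor_cursor, cbor_head, cbor_read_head, and cbor_skip are invented for this example. It decodes one item head at a time and deliberately rejects indefinite-length items to stay short:

```c
#include <stddef.h>
#include <stdint.h>

/* All parser state: a view of the input plus a read position. */
typedef struct {
    const uint8_t *buf;
    size_t         len;
    size_t         pos;
} cbor_cursor;

typedef struct {
    uint8_t  major; /* CBOR major type, 0..7                        */
    uint64_t arg;   /* value, length, or raw float bits of the item */
} cbor_head;

/* Read one item head. Returns 0 on success, -1 on truncated input or
 * on additional-info values 28..31 (reserved / indefinite length). */
static int cbor_read_head(cbor_cursor *c, cbor_head *h)
{
    uint8_t ib, info;
    size_t extra, i;

    if (c->pos >= c->len) return -1;
    ib       = c->buf[c->pos++];
    h->major = (uint8_t)(ib >> 5);
    info     = (uint8_t)(ib & 0x1F);

    if (info < 24) { h->arg = info; return 0; }
    if (info > 27) return -1;

    extra = (size_t)1 << (info - 24);        /* 1, 2, 4, or 8 bytes follow */
    if (extra > c->len - c->pos) return -1;

    h->arg = 0;
    for (i = 0; i < extra; i++)
        h->arg = (h->arg << 8) | c->buf[c->pos++];
    return 0;
}

/* Step over n content bytes (e.g. the body of a text or byte string). */
static int cbor_skip(cbor_cursor *c, size_t n)
{
    if (n > c->len - c->pos) return -1;
    c->pos += n;
    return 0;
}
```

Walking the bytes A1 64 't' 'e' 'm' 'p' F9 4D A0 with this reader yields a map head (major 5, one pair), a text head (major 3, length 4), and a float head (major 7, raw bits 0x4DA0), all without a single allocation.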
Let's look at a conceptual layout of how a zero-alloc COSE signing pipeline works:
+-------------------------------------------------------------+
|                        Stack Memory                         |
|                                                             |
| [ wolfCOSE_Sign Context ] <-- Holds state, algorithm IDs    |
| [ wolfCrypt Context ]     <-- Ephemeral crypto state        |
| [ Static Buffers ]        <-- Buffer keys & parsed hashes   |
+-------------------------------------------------------------+
                              ^
                              |  No Heap Allocations!
                              v
+-------------------------------------------------------------+
|                     Flash Memory (ROM)                      |
|                                                             |
| [ Payload to Sign ] ---> [ CBOR Encoder ] ---> [ Output ]   |
+-------------------------------------------------------------+
Writing Code with wolfCOSE: A Practical Example
To give you a feel for how this works in practice, let's look at how you would initialize and sign a payload using wolfCOSE. Because it is built on wolfCrypt (wolfSSL's underlying cryptography engine), it can use hardware acceleration when your microcontroller provides it (for example, the crypto peripherals on some STM32 parts or the ESP32's crypto hardware).
Here is an example of how you might sign a simple temperature sensor reading using a COSE Sign1 structure (a single signer signature):
#include <stdio.h>
#include "wolfssl/wolfcrypt/cose.h"
#include "wolfssl/wolfcrypt/ecc.h"

/* Define static buffers instead of using malloc */
#define OUT_BUFFER_SIZE 256

int sign_sensor_data(void) {
    int ret;
    byte outBuf[OUT_BUFFER_SIZE];
    word32 outLen = OUT_BUFFER_SIZE;

    /* Example sensor payload: CBOR-encoded {"temp": 22.5} */
    /* For brevity, we are using raw bytes representing CBOR: */
    /* 0xA1 = map(1), 0x64 = text(4), 0xF9 0x4D 0xA0 = half-float 22.5 */
    byte cborPayload[] = { 0xA1, 0x64, 't', 'e', 'm', 'p', 0xF9, 0x4D, 0xA0 };
    word32 payloadLen = sizeof(cborPayload);

    /* Allocate our cryptographic keys and contexts on the stack */
    ecc_key privateKey;
    WC_RNG rng;
    wc_CoseSign signCtx;

    /* Initialize the wolfCrypt Random Number Generator */
    ret = wc_InitRng(&rng);
    if (ret != 0) return ret;

    /* Initialize and generate a temporary ECC key (32 bytes = NIST P-256) */
    ret = wc_ecc_init(&privateKey);
    if (ret != 0) goto free_rng;
    ret = wc_ecc_make_key(&rng, 32, &privateKey);
    if (ret != 0) goto free_key;

    /* Initialize the wolfCOSE context - ZERO dynamic allocation here */
    ret = wc_CoseSign_Init(&signCtx, COSE_INIT_DEFAULTS);
    if (ret != 0) goto free_key;

    /* Set our signing parameters, checking each return code */
    ret = wc_CoseSign_SetAlgorithm(&signCtx, COSE_ALG_ES256);
    if (ret == 0) {
        ret = wc_CoseSign_SetKey(&signCtx, &privateKey);
    }

    /* Perform the signing operation */
    /* This builds the COSE_Sign1 structure directly into our static outBuf */
    if (ret == 0) {
        ret = wc_CoseSign_Write(&signCtx, cborPayload, payloadLen,
                                outBuf, &outLen, &rng);
    }

    if (ret == 0) {
        printf("Success! Generated COSE Sign1 envelope of %u bytes.\n",
               (unsigned)outLen);
        /* 'outBuf' now contains the secured, binary-encoded telemetry packet */
    } else {
        printf("Signing failed with error code: %d\n", ret);
    }

    /* Clean up stack structures in reverse order of initialization */
    wc_CoseSign_Free(&signCtx);
free_key:
    wc_ecc_free(&privateKey);
free_rng:
    wc_FreeRng(&rng);
    return ret;
}
Why this approach is incredibly safe:
Notice that we initialized everything—the random number generator (RNG), the ECC key (privateKey), and the COSE context (signCtx)—on the stack. The wc_CoseSign_Write function writes the final binary output directly into outBuf. If the buffer is too small, the library simply returns an error code instead of overflowing or attempting to expand the buffer dynamically. This is defensive, deterministic C programming at its best.
The Broader Impact: Securing the Edge
As microcontrollers become more connected, they are increasingly targeted by botnets (like Mirai) and sophisticated advanced persistent threats (APTs). Securing edge-to-cloud communication is no longer optional.
By bringing a zero-allocation COSE implementation to market, wolfSSL is lowering the barrier to entry for robust security. Firmware developers no longer have to choose between standards-compliant security and the stability of their application runtime. You can now implement secure boot, secure firmware updates (FOTA), and authenticated sensor telemetry on systems with only a few kilobytes of RAM.
Wrapping Up
The release of wolfCOSE highlights a growing and welcome trend in the developer ecosystem: shifting away from complex, bloated security runtimes toward deterministic, resource-conscious, and highly specialized micro-libraries.
If you are building IoT firmware, working on embedded systems, or designing telemetry protocols for resource-constrained environments, you should absolutely take a look at wolfCOSE. It offers the same kinds of signing, MAC, and encryption protections that the JOSE/JWT ecosystem provides, in a form optimized for the brutal, resource-starved realities of bare-metal hardware.
Over to you: Have you had to implement cryptography on constrained devices? What are your strategies for managing memory fragmentation in C? Let me know in the comments below!
If you enjoyed this deep dive, don't forget to subscribe to the "Coding with Alex" newsletter to get the latest systems programming, DevOps, and security guides delivered straight to your inbox.