KEEL 1.0.0

Minimal C11 HTTP client/server library built on raw epoll/kqueue/io_uring/poll. Both the server and client support sync and async operation — sync handlers return immediately, async handlers suspend and resume via the event loop; the client offers both a blocking API and an event-driven API. Pluggable allocator, pluggable HTTP parser, pluggable TLS, pluggable body readers, per-route middleware, streaming responses, multipart uploads, connection timeouts, thread pool, zero forced buffering.
101K req/s on a single thread. 671 tests (40 suites) with ASan/UBSan. One vendored dependency (llhttp).
- Pluggable request/response parsing via KlConfig.parser
- Router with :param capture, no allocation, pointers into read buffer
- Async handlers: suspend with KlAsyncOp, resume later from watchers or thread pool workers — no stalling the event loop
- HTTP client: blocking kl_client_request() for simple use cases, event-driven kl_client_start() for non-blocking I/O, both with TLS support
- KlEventCtx decouples the event loop from the server, enabling standalone clients and thread pools without a KlServer
- Generic FD callbacks via KlWatcher
- Server-Sent Events (kl_sse_event, kl_sse_comment) over chunked streaming
- Content-Encoding: gzip response decompression
- KlDrain buffers unsent data on would-block, flushes on write-readiness
- Diagnostic error codes via kl_strerror()
- kl_server_stats() for connection counts, enabling user-space load-shedding middleware

31 orthogonal modules, each independently testable:
| Module | Header | Description |
|---|---|---|
| allocator | allocator.h | Bring-your-own allocator interface |
| event | event.h | epoll / kqueue / io_uring / poll abstraction |
| event_ctx | event_ctx.h | Composable event loop context (watchers + allocator) |
| request | request.h | Parsed HTTP request struct (header-only, zero alloc) |
| parser | parser.h | Pluggable request/response parser vtables |
| response | response.h | Response builder: buffered, sendfile, or streaming chunked |
| router | router.h | Route matching with :param capture + middleware chain |
| connection | connection.h | Pre-allocated connection pool + state machine |
| server | server.h | Top-level glue: init, bind, async event loop, stop |
| body_reader | body_reader.h | Pluggable body reader vtable + buffer reader |
| body_reader_multipart | body_reader_multipart.h | RFC 2046 multipart/form-data parser |
| chunked | chunked.h | Parser-agnostic chunked transfer-encoding decoder |
| cors | cors.h | Built-in CORS middleware with configurable origins |
| tls | tls.h | Pluggable TLS transport vtable (bring-your-own backend) |
| async | async.h | Connection suspension for async operations |
| thread_pool | thread_pool.h | Worker thread pool with pipe-based event loop wakeup |
| url | url.h | URL parser (http/https/ws/wss, IPv6, CRLF injection guard) |
| client | client.h | HTTP/1.1 client (sync blocking + async event-driven) |
| websocket | websocket.h + websocket_server.h | RFC 6455 WebSocket server (shared frame parser + server API) |
| websocket_client | websocket_client.h | RFC 6455 WebSocket client (masked frames, async handshake) |
| h2 | h2.h + h2_server.h | HTTP/2 server (pluggable session vtable) |
| h2_client | h2_client.h | HTTP/2 client (multiplexed streams, pluggable session) |
| resolver | resolver.h | Pluggable async DNS resolver vtable |
| sse | sse.h | Server-Sent Events: line framing over chunked streaming (zero alloc) |
| error | error.h | Diagnostic error codes (KlError enum) + kl_strerror() |
| timer | timer.h | One-shot timer scheduling on KlEventCtx (min-heap) |
| client_pool | client_pool.h | HTTP client connection pool with keep-alive reuse |
| redirect | redirect.h | Automatic 3xx redirect following (RFC 7231/7538) |
| compress | compress.h | Pluggable response compression vtable (buffer + streaming) |
| decompress | decompress.h | Pluggable response decompression vtable (client-side) |
| drain | drain.h | Backpressure write buffer with on_drain callback |
| file_io | file_io.h | Pluggable async file I/O vtable (io_uring backend) |
| resolver_cache | resolver_cache.h | Caching DNS resolver decorator with configurable TTL/capacity |
Deliberate design choices:
- Single-threaded core: KlThreadPool offloads blocking work to workers; multi-core scaling is horizontal via SO_REUSEPORT with multiple processes.
- Linear route matching: a memcmp scan over even hundreds of routes costs nanoseconds, invisible next to network I/O syscalls. A trie or radix tree would add complexity to param extraction and middleware pattern matching for no measurable gain.
- Linear connection sweeps: at max_connections = 256 (default), this is a tight loop over a contiguous array well within L1 cache.
- No built-in load shedding: kl_server_stats() exposes connection counts for user-space middleware to make load-shedding decisions. Thresholds and Retry-After policy belong in application code, not the framework.
- Bounded memory: configurable limits (max_body_size, max_header_size, KlDrain.max_size) bound the main vectors; OS-level OOM handling covers the rest.

KEEL uses a vtable-based body reader interface. Register a body reader factory per-route — the connection layer creates the reader after headers are parsed, feeds it data as it arrives, and makes the finished reader available in the handler via req->body_reader.
Built-in buffer reader — accumulates the body into a growable buffer:
Pass NULL as the body reader factory for routes that don't accept a body. If a request with a body arrives on a route with no reader, KEEL discards the body. If the reader factory returns NULL, KEEL sends 415 Unsupported Media Type.
Custom readers — implement the KlBodyReader vtable (on_data, on_complete, on_error, destroy) and provide a factory function.
Register middleware that runs before handlers. Middleware can inspect/modify the request and response, or short-circuit the chain by returning a non-zero value (e.g., to reject unauthenticated requests).
Handles Access-Control-Allow-Origin, Access-Control-Allow-Credentials, and automatically responds to OPTIONS preflight requests with 204 + all required CORS headers.
Middleware uses the same (KlRequest *, KlResponse *, void *) signature. Return 0 to continue, non-zero to short-circuit:
Common patterns: logging middleware, auth middleware, and request context passing (middleware → handler).
- Paths ending in /* are prefix matches: /api/* matches /api, /api/users, /api/users/123
- Paths without * are exact matches: /health matches only /health
- Method "*" matches any HTTP method; "GET" also matches HEAD requests

Register a WebSocket endpoint and get bidirectional communication:
The WebSocket server module handles frame parsing, masking, and protocol details. The handler receives callbacks for each message — use kl_ws_server_send_text() or kl_ws_server_send_binary() to reply.
Server handlers can be sync (return immediately with a response set) or async (suspend the connection for later resumption). KEEL provides two primitives for async handlers: KlWatcher (generic FD callbacks) and KlAsyncOp (connection suspension). Together they allow handlers to park a connection, perform work asynchronously, and resume when done — without stalling the event loop.
The watcher callback runs on the event loop thread, making it safe to call kl_async_complete() which re-registers the connection FD and drives the state machine forward.
KlThreadPool bridges blocking work (SQLite queries, file I/O, DNS, crypto) and the single-threaded event loop. Submit work items from the event loop, execute on worker threads, resume connections via pipe wakeup.
Each KlWorkItem has three callbacks:
| Callback | Thread | Purpose |
|---|---|---|
| work_fn | Worker | Execute blocking work |
| done_fn | Event loop | Resume connection (called via pipe watcher) |
| cancel_fn | Event loop | Cleanup for items still queued at shutdown (may be NULL) |
Thread safety is guaranteed by construction — workers never touch the event loop directly. They push completed items to a done queue and write a byte to a pipe; the pipe watcher fires on the event loop thread and calls done_fn. Backpressure: submit() returns -1 when the queue is full.
KEEL includes both sync (blocking) and async (event-driven) HTTP/1.1 clients with TLS support. The sync client is a single function call for simple use cases. The async client uses KlEventCtx (not KlServer), so it works standalone — no server required.
Sync (blocking):
Async (event-driven):
The URL parser (kl_url_parse) handles http://, https://, ws://, wss://, IPv6 [::1]:port, default ports, and rejects CRLF injection.
Zero-copy file responses via sendfile(2):
Uses sendfile(2) on Linux and macOS, with TCP_CORK coalescing on Linux for optimal throughput.
Write directly to the socket via chunked transfer encoding — zero intermediate buffering:
The KlWriteFn signature (int (*)(void *ctx, const char *data, size_t len)) is designed to be compatible with streaming JSON writers.
The router returns 200 (match), 405 (path matched, wrong method), or 404 (not found).
KEEL stamps each connection with a monotonic clock on every I/O event. A periodic sweep (every ~400ms) closes connections that have been idle longer than read_timeout_ms and sends a 408 Request Timeout response. This protects against slow-loris attacks and abandoned connections without affecting active transfers.
Set a callback in KlConfig and KEEL calls it after each response is fully sent. The callback receives the full request (method, path, headers), response status, body size, and wall-clock duration in milliseconds. Users implement their own formatting (JSON, CLF, custom). NULL = no logging, zero overhead.
The allocator interface passes size to free and old_size to realloc — enabling arena and pool allocators that don't store per-allocation metadata.
Ships with llhttp (default). Swap by setting KlConfig.parser:
Implement the 3-function KlRequestParser vtable (parse, reset, destroy) for any backend. The response parser (KlResponseParser) uses the same pattern for the HTTP client.
KEEL doesn't vendor any TLS library. Bring your own backend (BearSSL, LibreSSL, OpenSSL, rustls-ffi) by implementing the 7-function KlTls vtable:
The vtable interface (handshake, read, write, shutdown, pending, reset, destroy) wraps the transport layer. Everything above it — parser, router, middleware, body readers, handlers — works identically on plaintext and TLS connections.
When TLS is active, sendfile(2) falls back to pread + TLS write (encryption requires userspace access to plaintext). All other response modes (buffered, streaming) work transparently.
KEEL deliberately does not own your sandbox policy — that's an application concern. The server separates initialization (bind/listen) from the event loop (accept/read/write), so you can lock down syscalls and filesystem access between the two:
On Linux, use the pledge polyfill (seccomp-bpf + Landlock) for the same API. The key insight: KEEL's init/run split makes this natural — no library changes needed.
The benchmark suite runs 4 endpoints against a dedicated bench server:
| Endpoint | What it measures |
|---|---|
| GET /hello | Baseline — minimal JSON, no routing params, no middleware |
| GET /users/:id | Router — param extraction + snprintf response |
| GET /mw/hello | Middleware — same response through 2 pass-through middleware |
| POST /echo | Body reading — KlBufReader + echo body back |
Sample results (Apple M1 Max, single thread, 100 connections, kqueue):
| Endpoint | Req/sec | Avg Latency | p99 |
|---|---|---|---|
| GET /hello (baseline) | 111,650 | 0.89ms | 1.13ms |
| GET /users/42 (route params) | 109,112 | 0.91ms | 1.15ms |
| GET /mw/hello (middleware chain) | 111,247 | 0.89ms | 1.14ms |
| POST /echo (body reading) | 109,370 | 0.90ms | 1.15ms |
Route params, middleware, and body reading add no measurable overhead — all within ~2% of the baseline. No GC pauses. No goroutine scheduling. No async runtime overhead. Just kqueue → read → write.
| Platform | Backend | Build |
|---|---|---|
| macOS / BSD | kqueue (edge-triggered) | make |
| Linux | epoll (edge-triggered) | make |
| Linux 5.6+ | io_uring (POLL_ADD) | make BACKEND=iouring |
| Any POSIX | poll (level-triggered) | make BACKEND=poll |
| Linux (musl/Alpine) | epoll (edge-triggered) | make |
| Cosmopolitan (APE) | poll (auto-selected) | make CC=cosmocc |
| Bare-metal + lwIP | poll (via lwIP sockets) | make BACKEND=poll + -DKL_NO_SIGNAL |
The io_uring backend uses IORING_OP_POLL_ADD for readiness notification — a drop-in replacement for epoll with io_uring's batched submission advantage. Requires liburing-dev.
The poll backend is a universal POSIX fallback that works on any platform with poll(2). It enables Cosmopolitan C support (Actually Portable Executables that run on Linux, macOS, Windows, FreeBSD, OpenBSD, NetBSD from a single binary). When CC=cosmocc is detected, the Makefile automatically selects the poll backend.
For bare-metal targets (STM32, ESP32, etc.), link against lwIP or picoTCP — their BSD socket compatibility layers provide all the POSIX functions Keel uses (accept, read, write, close, poll, getaddrinfo). Compile with -DKL_NO_SIGNAL to disable POSIX signal handling, and exclude thread_pool.c from the build if no RTOS is available. See docs/comparison.md for details.
671 tests across 40 test suites, covering every module (678 on io_uring builds):
| Suite | Tests | Covers |
|---|---|---|
| test_allocator | 4 | Default + custom tracking allocators |
| test_async | 14 | Watchers (KlEventCtx), suspend/resume, deadlines, cancel, e2e async handler |
| test_body_reader | 30 | Buffer + multipart: limits, spanning, binary, edge cases |
| test_chunked | 17 | Chunked decoder: single/multi chunk, hex, extensions, trailers, errors |
| test_client | 18 | Sync/async client, response free, TLS config, error handling |
| test_client_pool | 24 | Connection pool: acquire/release, per-host limits, idle expiry, stale detection |
| test_client_stream | 27 | Response streaming (push), request streaming (pull), chunked body production |
| test_compress | 16 | Compression vtable, buffer + streaming, miniz gzip backend |
| test_connection | 12 | Pool init, acquire/release, exhaustion, active count, state machine, monotonic clock |
| test_cors | 17 | Config, origin whitelist, wildcard, preflight, credentials, middleware |
| test_cross_module | 7 | Cross-module integration: compress+drain, TLS+async, middleware+body+async, resolver cache, TLS+middleware+compress, stats during load |
| test_decompress | 14 | Decompression vtable, gzip one-shot + streaming, CRC/ISIZE verification |
| test_drain | 28 | Backpressure buffer: passthrough, partial, EAGAIN, flush, on_drain, max_size, overreport |
| test_error | 11 | Error codes, kl_strerror, per-struct error storage |
| test_event | 8 | Event loop init/close, add/wait, del, multiple FDs, timeout, mod mask |
| test_event_ctx | 7 | Standalone event context init/free, watcher lifecycle, dispatch helpers |
| test_file_io | 14 | Async file I/O vtable: mock submit/cancel/tick, state machine, EAGAIN, TLS fallback |
| test_file_io_iouring | 7 | io_uring integration: real IORING_OP_READ submissions, CQE routing, offset/EOF (io_uring builds only) |
| test_h2 | 29 | HTTP/2 sessions, streams, routing, ALPN, goaway, body limits |
| test_h2_client | 18 | Mock session vtable, stream tracking, response free, API validation |
| test_integration | 27 | Full server: hello, POST, keepalive, multipart, chunked, middleware |
| test_overflow | 20 | Integer overflow guards across all modules |
| test_parser | 9 | GET, POST, query strings, incomplete, reset, chunked TE |
| test_proxy | 11 | HTTP proxy: forwarding, CONNECT tunnel, auth, async proxy states, pool keying |
| test_redirect | 33 | 3xx redirect following, method transform, cross-origin auth strip, pooled |
| test_request | 14 | Header case-insensitive lookup, params, query strings, empty/missing values |
| test_response | 24 | Status, headers, body, JSON, error, streaming, sendfile, compression |
| test_response_parser | 10 | HTTP response parsing, chunked, headers, body limits, malformed |
| test_router | 27 | Exact match, params, 404, 405, wildcard, middleware chain |
| test_server_integration | 6 | Pool exhaustion, backpressure recovery, concurrent requests, drain |
| test_server_stats | 4 | Server stats: initial, active count, max connections, null safety |
| test_resolver_cache | 13 | DNS cache: hit/miss, TTL expiry, eviction, cancel, error non-caching |
| test_sse | 7 | SSE framing: event, data, id, comment, multiline, begin/end |
| test_thread_pool | 12 | Create/free, submit, backpressure, FIFO ordering, multi-worker, shutdown, stress |
| test_timeout | 8 | Idle, partial headers, partial body, active connections, body timeout, keepalive idle, concurrent |
| test_timer | 10 | Min-heap scheduling, cancellation, callback safety, next-timeout |
| test_tls | 20 | TLS vtable, handshake FSM, response send/stream/file via mock, shutdown retry, pool teardown |
| test_tls_integration | 3 | Passthrough TLS mock: full handshake→read→write path |
| test_url | 20 | URL parsing, IPv6, CRLF rejection, default ports, ws/wss schemes |
| test_websocket | 48 | Frame parsing, masking, opcode, fragments, close, echo, unmasked rejection |
| test_websocket_client | 30 | Client frame encoding, mask XOR, handshake, parser, API, config, auto-ping |
An HTTP server at this level is mostly syscalls, pointer arithmetic, and state machines. C is a natural fit: direct writev/sendfile/epoll_wait access, zero-copy pointers into read buffers, explicit memory layout, no runtime. One vendored dependency (llhttp), 2-second clean builds, runs on everything from io_uring to bare-metal MCUs.
The tradeoff is real — C has no borrow checker, no bounds-checked slices, no RAII. We compensate with defense-in-depth:
- Pre-allocated connection pool (no per-request malloc, no fragmentation)
- SIZE_MAX/2 overflow guards on all arithmetic, bounds checks at system boundaries
- pledge()/unveil() sandboxing, -D_FORTIFY_SOURCE=2 -fstack-protector-strong

This is adequate for a focused ~14K LOC library with thorough testing, but it's not a language-level guarantee. If you're evaluating Keel and memory safety is your primary concern, that's a legitimate reason to look elsewhere.
KEEL is a transport library — it handles sockets, parsing, routing, and response serialization. Everything above the HTTP layer is an application concern:
- Access logging: the access_log callback provides method, path, status, body size, and duration — you bring the formatter. Hull provides a structured JSON logger middleware.
- Conditional requests: kl_request_header() / kl_response_header() for the headers; your application handles 304 logic. Hull handles conditional responses for static assets.
- Static file serving: see examples/static_files.c for the pattern. Hull auto-serves embedded or filesystem static assets with MIME detection.
- CSRF protection: Hull provides hull.middleware.csrf with automatic token generation and validation.
- Idempotency: keyed on the Idempotency-Key header. Hull provides hull.middleware.idempotency with configurable TTL and response caching.

The general principle: if it requires policy decisions that vary between applications, it belongs in application code, not in the transport library. KEEL provides the hooks (middleware, body readers, access log callback) — you provide the policy.
Three embedded C HTTP libraries compared. See docs/comparison.md for full details with API examples.
| | Keel | Mongoose | GNU libmicrohttpd |
|---|---|---|---|
| License | MIT | GPLv2 / Commercial | LGPLv2.1+ |
| LOC | ~14K | ~33K | ~19K |
| Architecture | 31 independent modules | Monolithic amalgam | Monolithic |
| Maturity | New (2025–2026) | 20+ years (NASA, Siemens, Samsung) | GNU project, 18+ years (NASA, Sony, systemd) |
| HTTP/2 | Server + client | No | No |
| Event backends | epoll, kqueue, io_uring, poll | select/poll only | select, poll, epoll |
| Router + middleware | Built-in with :param capture | None (DIY if/else) | None (single callback) |
| HTTP client | Sync + async + streaming + H2 | Basic client | Server only |
| Allocator | Runtime vtable (bring-your-own) | Compile-time macros | None (raw malloc) |
| TLS | Pluggable vtable — any backend | Built-in TLS 1.3 + pluggable | GnuTLS only |
| Compression | Pluggable vtable (gzip + extensible) | No | No |
| Threading | Single-threaded + thread pool | Single-threaded | 4 modes incl. thread-per-connection |
| Bare-metal MCU | Via lwIP/picoTCP (BSD sockets) | Built-in TCP/IP stack | Requires OS networking |
| Cosmopolitan C | Supported (APE binaries) | No | No |
| Tests | 671 (40 suites) | ~4K LOC tests | Fewer relative to size |
Choose Keel when you want MIT licensing, HTTP/2, a built-in router/middleware/client, and pluggable everything. Choose Mongoose when you're targeting bare-metal MCUs with no OS, need a built-in TCP/IP stack, or need battle-tested maturity. Choose libmicrohttpd when you need multi-threaded request handling, independently audited security, or wide distro packaging.
GitHub Actions runs on every push and PR against main:
A separate benchmark workflow runs on push to main (informational, not gating).
MIT