Module 1

Why the Sandbox Works

Capsicum is easiest to understand if you stop thinking about permissions as a mood and start thinking about them as tickets. Once a process enters capability mode, it must show an explicit ticket for almost everything interesting.

1. Setup phase
2. Capability shaping
3. Sandboxed loop
🌍

Ambient access

Classic UNIX code can often reach out to the whole system namespace whenever it wants.

🎟️

Delegated handles

Capsicum turns file descriptors into authority tokens you can pass around and shrink.

🔒

Irreversible drop

Once a process enters capability mode it cannot leave, and every process it creates afterwards starts in capability mode too.

CODE

if ((fd = open(basedir, O_DIRECTORY | O_RDONLY | O_CLOEXEC)) < 0)
	err(EXIT_FAILURE, "open(%s)", basedir);

cap_rights_init(&rights, CAP_READ, CAP_FSTATAT, CAP_LOOKUP,
    CAP_FCNTL);
if (cap_rights_limit(fd, &rights) < 0 && errno != ENOSYS)
	err(EXIT_FAILURE, "cap_rights_limit");

if (cap_enter() < 0 && errno != ENOSYS)
	err(EXIT_FAILURE, "cap_enter");
        
PLAIN ENGLISH

Open one directory while the process still has full naming power.

Prepare a rights set that says what this directory handle is allowed to do later.

Shrink the handle permanently. It can never grow broader again.

Only after the handle is shaped do we enter capability mode.

💡
A subtle but crucial idea

Capsicum is not mainly about denying things at random. It is about forcing naming to happen early and forcing the steady-state part of the program to happen through explicit handles.

You are sandboxing a file-serving worker. Which phase should still be doing broad pathname discovery?

Module 2

Two Locks, Not One

A lot of Capsicum confusion comes from mixing up the process-wide wall with the per-descriptor wall. The kernel keeps those separate, and your design should too.

🧩
One new kernel term

`SYF_CAPENABLED` is just the kernel's way of labeling a syscall as safe to use in capability mode.

CODE

if ((se->sy_flags & SYF_CAPENABLED) == 0) {
	if (CAP_TRACING(td))
		ktrcapfail(CAPFAIL_SYSCALL, NULL);
	if (IN_CAPABILITY_MODE(td)) {
		td->td_errno = error = ECAPMODE;
		goto retval;
	}
}
        
PLAIN ENGLISH

Look at the syscall entry in the kernel table.

If that syscall is not marked capability-enabled, record the violation if tracing is on.

If the process is sandboxed, stop right there and return ECAPMODE.

CODE

if (!cap_rights_contains(havep, needp)) {
	if (CAP_TRACING(curthread))
		ktrcapfail(type, rights);
	return (ENOTCAPABLE);
}
return (0);
        
PLAIN ENGLISH

Compare the rights this fd has with the rights the operation needs.

If the rights the operation needs are not all contained in the rights the fd holds, record the violation for tracing.

Return ENOTCAPABLE because the handle is too weak for the requested action.

`ECAPMODE`: Your code still depends on ambient namespace power.
`ENOTCAPABLE`: Your architecture is close, but the delegated handle is missing rights.
`kern.trap_enotcap`: Debugging sysctl that makes the kernel deliver `SIGTRAP` on a capability failure, so a debugger stops at the offending call.
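The two walls can be modelled in a few lines of portable C. This is a toy model, not kernel code: the `GATE_*` values stand in for the real `ECAPMODE` and `ENOTCAPABLE` errno values, the `R_*` bits are invented for illustration, and real Capsicum rights are a richer structure than a single 64-bit mask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real errno values. */
enum gate_result { GATE_OK = 0, GATE_ECAPMODE, GATE_ENOTCAPABLE };

/* Invented rights bits, for illustration only. */
#define R_READ   (UINT64_C(1) << 0)
#define R_WRITE  (UINT64_C(1) << 1)
#define R_LOOKUP (UINT64_C(1) << 2)

/* True when every bit set in need is also set in have. */
static bool
rights_contains(uint64_t have, uint64_t need)
{
	return ((have & need) == need);
}

enum gate_result
check_call(bool in_capability_mode, bool syscall_capenabled,
    uint64_t have, uint64_t need)
{
	/* Wall 1, process-wide: only bites once the process is sandboxed. */
	if (in_capability_mode && !syscall_capenabled)
		return (GATE_ECAPMODE);
	/* Wall 2, per-descriptor: a limited fd stays limited everywhere. */
	if (!rights_contains(have, need))
		return (GATE_ENOTCAPABLE);
	return (GATE_OK);
}
```

For example, `check_call(true, false, R_READ, R_READ)` yields `GATE_ECAPMODE` even though the descriptor's rights are sufficient, while `check_call(true, true, R_READ, R_READ | R_WRITE)` yields `GATE_ENOTCAPABLE`. The first failure says the call belongs in the setup phase or behind a broker; the second says the handle was shaped too narrowly.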

A sandboxed process calls a syscall that the kernel has not marked as capability-safe. What kind of fix should you be thinking about first?

Module 3

Naming Without Namespaces

The hardest mental shift is that pathnames stop being a free-floating string problem. In capability mode, a path is only meaningful relative to a delegated directory handle.

delegated-root/: The directory fd the worker was explicitly given
packagesite.pkg: Can be opened with `openat()` if rights allow it
latest/: Still reachable because it is beneath the delegated root
../etc/passwd: Not a real option. Escape attempts become capability violations.

CODE

if (IN_CAPABILITY_MODE(td)) {
	ndp->ni_lcf |= NI_LCF_STRICTREL;
	ndp->ni_resflags |= NIRES_STRICTREL;
	if (ndp->ni_dirfd == AT_FDCWD)
		return (ECAPMODE);
}
        
PLAIN ENGLISH

When the caller is sandboxed, switch pathname lookup into strict relative mode.

Mark the result as a strict-relative lookup too.

If the caller tried to use AT_FDCWD, reject the request immediately.
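A userspace model can make the strict-relative rule easier to picture. The function `stays_beneath` below is hypothetical and works on strings only, whereas the kernel enforces the rule on real directory entries during lookup. The sketch does not model symlinks, and it assumes `..` is tolerated as long as the path never climbs above the starting directory, which may depend on kernel version and configuration.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Simplified model: would this path stay at or below the directory it
 * is resolved against? Rejects empty and absolute paths, and any ".."
 * that would climb above the starting point.
 */
bool
stays_beneath(const char *path)
{
	const char *p = path;
	int depth = 0;

	if (*p == '\0' || *p == '/')
		return (false);
	while (*p != '\0') {
		size_t len = strcspn(p, "/");	/* length of this component */

		if (len == 2 && p[0] == '.' && p[1] == '.') {
			if (--depth < 0)
				return (false);	/* escaped the root */
		} else if (!(len == 1 && p[0] == '.')) {
			depth++;		/* ordinary name: one level down */
		}
		p += len;
		while (*p == '/')		/* skip separators */
			p++;
	}
	return (true);
}
```

Under this model `packagesite.pkg` and `latest/index` pass, `a/../b` passes because it never rises above the root, and `../etc/passwd` and `/etc/passwd` are rejected.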

CODE

if (chdir(_PATH_RWHODIR) < 0)
	err(1, "chdir(%s)", _PATH_RWHODIR);
if ((dirp = opendir(".")) == NULL)
	err(1, "opendir(%s)", _PATH_RWHODIR);
dfd = dirfd(dirp);
cap_rights_init(&rights, CAP_READ, CAP_LOOKUP);
if (caph_rights_limit(dfd, &rights) < 0)
	err(1, "cap_rights_limit failed: %s", _PATH_RWHODIR);
        
PLAIN ENGLISH

Move into the directory while broad lookup still works.

Open the directory and turn it into a stable handle.

Shrink that handle so it can only read and look up entries beneath it later.

1. Open a directory before the sandbox starts.
2. Give it `CAP_LOOKUP` plus the exact rights needed for relative `*at()` operations such as `openat()` and `fstatat()`.
3. Use only relative names rooted at that descriptor.

⚠️
Common porting trap

Changing `open()` to `openat()` is not enough by itself. The real change is that the directory fd becomes part of the security boundary, so you must choose it carefully and limit it carefully.

You are sandboxing a worker that may read only one subtree. What is the most capability-native interface to give it?

Module 4

Patterns from the Tree

The FreeBSD tree does not use one giant Capsicum template. It uses a handful of repeatable patterns chosen to match the shape of each program.

📁

Directory-rooted worker

`pkg-serve` and `rwho` operate inside one delegated subtree using `openat()` and similar calls.

🌐

Pre-connected sockets

`traceroute` does naming and connection setup before entering capability mode, then keeps only send and receive rights.

🧰

Helper-assisted stdio

`capsicum_helpers.h` cuts boilerplate for common stream setups and cache preloads.

🛰️

Brokered global services

`syslogd` uses Casper services instead of reopening broad namespaces from inside the sandbox.

CODE

caph_cache_catpages();

if (cansandbox && cap_enter() < 0) {
	if (errno != ENOSYS) {
		Fprintf(stderr, "%s: cap_enter: %s\n", prog,
		    strerror(errno));
		exit(1);
	} else {
		cansandbox = false;
	}
}
        
PLAIN ENGLISH

Warm up localized message data first so later error handling does not touch the filesystem.

Try to enter the sandbox once setup is done.

If the kernel lacks Capsicum support (`ENOSYS`), carry on unsandboxed; any other `cap_enter()` failure is fatal.

CODE

cap_casper = cap_init();
if (cap_casper == NULL)
	err(1, "Failed to communicate with libcasper");
cap_syslogd = cap_service_open(cap_casper, "syslogd.casper");
if (cap_syslogd == NULL)
	err(1, "Failed to open the syslogd.casper libcasper service");
        
PLAIN ENGLISH

Open a channel to the Casper broker during setup, before the process enters capability mode.

Open the specific service channel the daemon will rely on later.

From this point on, the daemon can ask the broker for narrow help instead of reopening global access directly.

🧭
Pattern selection is architecture selection

If your code still needs broad naming during the main loop, do not fight the sandbox. Split the program into a broker and a worker so the narrow phase becomes obvious.
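A broker boundary does not have to start with Casper. The sketch below is a hand-rolled illustration using `fork()` and a `socketpair()`. It is not the libcasper API, and the names `broker_start` and `broker_path_exists` are hypothetical. The broker keeps ambient authority and answers one narrow question (does this path exist?), and the worker holds only the channel.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <string.h>
#include <unistd.h>

/* Broker side: keeps ambient authority, answers one narrow question. */
static void
broker_loop(int sock)
{
	char path[256];
	struct stat sb;
	ssize_t n;

	/* Each request is one message; read() returns 0 when the worker closes. */
	while ((n = read(sock, path, sizeof(path) - 1)) > 0) {
		path[n] = '\0';
		char answer = (stat(path, &sb) == 0) ? 'y' : 'n';
		if (write(sock, &answer, 1) != 1)
			break;
	}
}

/*
 * Fork a broker and return the worker's end of the channel, or -1.
 * A real port would enter capability mode in the worker after this.
 */
int
broker_start(pid_t *pidp)
{
	int sv[2];
	pid_t pid;

	/* SOCK_SEQPACKET keeps message boundaries and still signals EOF. */
	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) < 0)
		return (-1);
	pid = fork();
	if (pid < 0) {
		close(sv[0]);
		close(sv[1]);
		return (-1);
	}
	if (pid == 0) {
		close(sv[0]);
		broker_loop(sv[1]);
		_exit(0);
	}
	close(sv[1]);
	*pidp = pid;
	return (sv[0]);
}

/* Worker side: returns 1 if the path exists, 0 if not, -1 on error. */
int
broker_path_exists(int chan, const char *path)
{
	size_t len = strlen(path);
	char answer;

	if (len == 0 || len >= 256)
		return (-1);
	if (write(chan, path, len) != (ssize_t)len)
		return (-1);
	if (read(chan, &answer, 1) != 1)
		return (-1);
	return (answer == 'y');
}
```

The worker calls `broker_start()` once during setup and then uses `broker_path_exists(chan, ...)` from inside its main loop. Closing the channel makes the broker exit, after which the parent can collect it with `waitpid()`. The design choice that matters is that the request vocabulary stays tiny, so the broker never becomes a general-purpose escape hatch.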

A long-lived sandboxed daemon sometimes needs hostname resolution well after startup. Which pattern from the tree is the best fit?

Module 5

A Porting Playbook

By this point the individual tools should feel less mysterious. The remaining question is how to sequence the work so an existing application becomes capability-friendly without turning into chaos.

1. List every ambient dependency: paths, DNS, user databases, ioctls, and PIDs.
2. Split setup from steady-state work.
3. Pre-open and limit descriptors.
4. Enter capability mode and debug the remaining violations.

CODE

caph_cache_catpages();
caph_cache_tzdata();

/*
 * Cache UTX database fds.
 */
setutxent();

if (caph_enter() < 0)
	err(1, "cap_enter");
        
PLAIN ENGLISH

Warm up locale and timezone data first.

Call `setutxent()` so the utmpx (UTX) database descriptors are already open before the sandbox starts.

Only then drop the process into capability mode.

Scenario

You inherited a daemon that mixes config loading, file opens, DNS lookups, and packet handling inside one giant loop. You want Capsicum without a month-long rewrite.

🪓

First cut

Separate startup discovery from long-lived packet handling, even before you add the actual Capsicum calls.

🔬

Second cut

Instrument failures as `ECAPMODE` vs `ENOTCAPABLE`. That tells you whether to move code or just widen one rights set a little.

🧱

Third cut

If late global operations remain, build a broker boundary instead of weakening the worker.

💡
The real payoff

A Capsicum-friendly structure is also easier for humans and AI coding tools to reason about, because resource acquisition and steady-state work stop being tangled together.

You are starting a Capsicum port of a messy existing daemon. What is usually the highest-leverage first refactor?