Linux Fu: Don’t Share Well With Others

In kindergarten, you learn that you should share. But for computer security, sharing is often a bad thing. Linux has had namespaces for a long time: mount namespaces arrived back in kernel 2.4.19, and PID namespaces followed in 2.6.24. That was years ago, but few people use namespaces even though the tools to manipulate them exist. Granted, you don’t always need namespaces, but when you do, the capability is priceless. In a nutshell, namespaces let you give a process its own private resources and, more importantly, prevent a process from seeing resources in other namespaces.

Turns out, you use namespaces all the time because every process you run lives in some set of namespaces. I say set, because there are a number of namespaces for different resources. For example, you can set a different network namespace to give a process its own set of networking items including routing tables, firewall rules, and everything else network-related.

So let’s have a look at how Linux doesn’t share names.

The possible namespaces are:

  • Mount – File system mounts. It is possible to share mounts with other namespaces, but you have to do so explicitly.
  • UTS – This namespace controls things like hostname and domain name.
  • IPC – A program with a separate IPC namespace will have its own message queues, semaphores, shared memory, and other interprocess communications items.
  • Network – Processes in the namespace will have their own networking stacks and related configurations.
  • PID – Processes in a PID namespace can’t see other processes outside the namespace.
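  • Cgroup – A namespace that provides a virtualized view of the cgroup hierarchy used for resource control.
  • User – User and group IDs. A process can hold different IDs, even root, inside the namespace than it has outside.
  • Time – Offsets for the monotonic and boot-time clocks, available in newer kernels (it shows up in the /proc listing later in this post).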

Obviously, some of these are more useful than others. It is easy to see, however, that if you had a system of cooperating programs, you might find it attractive to create a private space for IPC or networking between them.

Go to Shell

If you want to experiment with namespaces from the shell, you can use unshare. The name might seem odd, but the command takes its name from the fact that a new process typically shares the namespaces of its parent. The unshare command lets you create new namespaces.

One key feature or quirk of unshare shows up with PID namespaces: by default, it creates the new namespace and runs your program, but the program itself stays in the original PID namespace. Instead, the new namespace goes to any children that program creates. You can add the --fork option to make it work more as you’d expect.

For example, let’s start a new shell in its own private Idaho:


sudo unshare --pid --fork --mount-proc /bin/bash
ps alx

If you try that command without a separate namespace, you’ll get a long list of processes. But the output inside our new namespace is much less bulky:


F   UID     PID    PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND 
4     0       1       0  20   0  10820  4376 -      S    pts/6      0:00 /bin/bash 
0     0       9       1  20   0  12048  1168 -      R+   pts/6      0:00 ps alx

You do have to think a bit about how the different utilities work. For example, ps reads from /proc, so if we didn’t provide --mount-proc, it would still display all the processes from the parent namespace. (Try it.) You wouldn’t be able to interact with them, but since you can read /proc, you’d still see them. The --mount-proc flag is really just a shorthand for --mount (to get a new mount namespace) and then doing a mount of the proc filesystem.
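
If you want to see the shorthand spelled out, here is a rough sketch of the long form. It assumes your system allows unprivileged user namespaces (so it uses --user --map-root-user instead of sudo) and that /proc can be mounted again in your environment. The function name count_visible_pids is made up for this example.

```shell
#!/bin/sh
# Long form of --mount-proc: new PID and mount namespaces, then mount a
# fresh proc by hand. Assumes unprivileged user namespaces are allowed;
# otherwise drop --user --map-root-user and run under sudo.
count_visible_pids() {
    unshare --user --map-root-user --pid --fork --mount \
        sh -c 'mount -t proc proc /proc && ls /proc | grep -c "^[0-9]"'
}

# Prints how many processes the new namespace can see (only a handful),
# or a note if the kernel or container policy refuses.
count_visible_pids 2>/dev/null || echo "could not create the namespaces here"
```

Leave out the mount step and the same count will cover every process on the machine, which is exactly the effect described above.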

Omitting the --fork option will cause strange shell behavior because the shell usually spins off new processes, which will now be in a different namespace than your main process.
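
A quick way to watch the difference is to ask a shell for its own PID with and without --fork. This is only a sketch: it assumes unprivileged user namespaces are enabled so that no sudo is needed, and pid_inside is a hypothetical helper name.

```shell
#!/bin/sh
# Print the PID a shell sees for itself when started by unshare --pid.
# Extra arguments (such as --fork) are passed through to unshare.
# Assumes unprivileged user namespaces are allowed; otherwise drop
# --user --map-root-user and run under sudo.
pid_inside() {
    unshare --user --map-root-user --pid "$@" sh -c 'echo $$'
}

if unshare --user --map-root-user --pid --fork true 2>/dev/null; then
    pid_inside --fork   # the forked child is PID 1 in the new namespace
    pid_inside          # no fork: the shell keeps its PID from the old namespace
else
    echo "unprivileged user namespaces are not available here"
fi
```

The first call should print 1. The second prints an ordinary PID because that shell never entered the new namespace; only its children would.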

If you add a filename to most of the namespace options (for example, --pid=file or --mount=file), unshare bind mounts the new namespace onto that file, creating a persistent namespace that other processes can join later with nsenter. You can also use virtual ethernet adapters (type veth) or a network bridge to expose a network in one namespace to another.
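
Here is a minimal sketch of a persistent namespace, modeled on the pattern in the unshare man page. It needs real root privileges, and the function name persistent_uts_demo and the hostname demo-ns are placeholders invented for the example.

```shell
#!/bin/sh
# Persistent UTS namespace, following the pattern in the unshare man page.
# Needs root (CAP_SYS_ADMIN); the hostname "demo-ns" is just a placeholder.
persistent_uts_demo() {
    nsfile=$(mktemp) || return 1
    # Bind the new namespace to nsfile and change the hostname inside it,
    # then let a second, unrelated process join through the same file.
    unshare --uts="$nsfile" hostname demo-ns &&
        nsenter --uts="$nsfile" hostname
    status=$?
    umount "$nsfile" 2>/dev/null   # dropping the bind mount frees the namespace
    rm -f "$nsfile"
    return "$status"
}

if [ "$(id -u)" -eq 0 ]; then
    persistent_uts_demo || echo "namespace creation was refused"
else
    echo "run as root to try persistent_uts_demo"
fi
```

If it works, nsenter prints demo-ns while uname -n in your normal shell still reports the real hostname.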

Mounts and More Options

Another useful isolation is in the mount table. Linux lets you control how mount and unmount events propagate between namespaces. If you want total privacy, you can have it (private), but you can also share events within a peer group (shared), or receive changes from another group without propagating your own (slave). The unshare --propagation option selects the mode, and the mount_namespaces man page has the details.
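
To make the private case concrete, the sketch below mounts a tmpfs that exists only inside a new mount namespace. It assumes unprivileged user namespaces are allowed on your system, and private_mount_demo is a made-up name.

```shell
#!/bin/sh
# Mount a tmpfs that only the new mount namespace can see.
# Assumes unprivileged user namespaces are allowed; otherwise drop
# --user --map-root-user and run under sudo.
private_mount_demo() {
    dir=$(mktemp -d) || return 1
    # --propagation private (unshare's default) keeps mount events from
    # leaking in or out; shared and slave are the other documented choices.
    unshare --user --map-root-user --mount --propagation private \
        sh -c 'mount -t tmpfs tmpfs "$1" && touch "$1/secret" && ls "$1"' sh "$dir"
    status=$?
    # Back outside, the directory is still empty: the tmpfs never appeared here.
    echo "entries visible outside: $(ls -A "$dir" | wc -l)"
    rmdir "$dir"
    return "$status"
}

private_mount_demo 2>/dev/null || echo "could not create the namespace here"
```

Inside the namespace, ls shows the file named secret. Outside, the directory stays empty because the mount never propagated.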

One interesting thing is that since the namespaces are isolated, it is possible for a normal user to have quasi-root privileges in the new namespaces. The --map-root-user option allows for this and also denies the setgroups call in the new namespace, since that call could otherwise let a user gain elevated permissions.
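
You can check the quasi-root effect without sudo. This is a small sketch that assumes your kernel and distro policy permit unprivileged user namespaces; uid_inside is a hypothetical helper name.

```shell
#!/bin/sh
# Compare the user ID outside and inside a user namespace created with
# --map-root-user. No sudo involved; assumes the kernel and distro policy
# allow unprivileged user namespaces.
uid_inside() {
    unshare --user --map-root-user id -u
}

echo "outside: $(id -u)"
if inside=$(uid_inside 2>/dev/null); then
    echo "inside:  $inside"     # root inside the namespace only
else
    echo "inside:  user namespaces unavailable"
fi
```

The root you get this way only counts for resources the namespace owns, so it cannot touch files or processes that your ordinary user could not.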

There’s more, of course. If you have util-linux installed, just ask for the unshare man page to read more. If you want to use these things in a program, which is probably the easier case to imagine, there is an unshare system call. Use man 2 unshare to see the details. Note that you can exercise even more control with the system call. For example, you can unshare file system attributes such as the root directory, current directory, and umask (the CLONE_FS flag). It is closely tied to the clone system call, which is sort of a super version of fork.

You might find it interesting that all the namespace data for a process show up in /proc. For example, try:

ls -l /proc/$$/ns/*

You’ll see specialized symlinks with information about the different namespaces for the current process. For example:


lrwxrwxrwx 1 alw alw 0 Dec 8 07:29 /proc/2182630/ns/cgroup -> 'cgroup:[4026531835]'
lrwxrwxrwx 1 alw alw 0 Dec 8 07:29 /proc/2182630/ns/ipc -> 'ipc:[4026531839]'
lrwxrwxrwx 1 alw alw 0 Dec 8 07:29 /proc/2182630/ns/mnt -> 'mnt:[4026531840]'
lrwxrwxrwx 1 alw alw 0 Dec 8 07:29 /proc/2182630/ns/net -> 'net:[4026531992]'
lrwxrwxrwx 1 alw alw 0 Dec 8 07:29 /proc/2182630/ns/pid -> 'pid:[4026531836]'
lrwxrwxrwx 1 alw alw 0 Dec 8 07:29 /proc/2182630/ns/pid_for_children -> 'pid:[4026531836]'
lrwxrwxrwx 1 alw alw 0 Dec 8 07:29 /proc/2182630/ns/time -> 'time:[4026531834]'
lrwxrwxrwx 1 alw alw 0 Dec 8 07:29 /proc/2182630/ns/time_for_children -> 'time:[4026531834]'
lrwxrwxrwx 1 alw alw 0 Dec 8 07:29 /proc/2182630/ns/user -> 'user:[4026531837]'
lrwxrwxrwx 1 alw alw 0 Dec 8 07:29 /proc/2182630/ns/uts -> 'uts:[4026531838]'
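
Since two processes are in the same namespace exactly when these links point to the same target, a script can compare them directly. The helper below is a sketch; same_ns is a name invented here, not a standard tool.

```shell
#!/bin/sh
# Report whether two processes share a namespace of the given type by
# comparing the symlink targets under /proc/PID/ns.
# Usage: same_ns PID1 PID2 TYPE   (TYPE is net, pid, mnt, uts, ...)
# Returns 0 if shared, 1 if different, 2 if a link could not be read.
same_ns() {
    a=$(readlink "/proc/$1/ns/$3") || return 2
    b=$(readlink "/proc/$2/ns/$3") || return 2
    [ "$a" = "$b" ]
}

# A child started the ordinary way inherits the shell's namespaces.
sleep 30 &
child=$!
if same_ns "$$" "$child" net; then
    echo "shell and child share a network namespace"
fi
kill "$child"
```

Reading the links of a process that belongs to another user generally needs root, so expect return code 2 in that case.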

This is one of those Linux-isms that is somewhat obscure but can be very useful when you need it. Even if you don’t need it right now, it is worth understanding because it just might solve your next development challenge. Sure, you could run your program in its own virtual machine, but that’s a pretty heavy option compared to simply isolating what you want in a clean and simple way. Even from a shell script.
