A namespace wraps a global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of the global resource. Containers are one common application of namespaces.
The namespace types in Linux include:
- Cgroup namespace for isolating a process's view of the control group (cgroup) hierarchy; the control groups themselves provide resource limits
- IPC namespace for separating inter-process communication
- Network namespace for managing/separating network interfaces
- Mount namespace for managing/separating filesystem mount points
- PID namespace for process isolation
- User ID namespace for privilege isolation
- UTS namespace for isolating system identifiers (the hostname and domain name visible to the process)
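To see which namespaces a process belongs to, the kernel exposes one symbolic link per namespace type under /proc/<pid>/ns. The following is a minimal sketch, assuming a Linux host with /proc mounted; the function name list_namespaces and its optional directory argument are illustrative and not part of any tool.

```shell
#!/bin/sh
# list_namespaces [DIR]: print "type -> identifier" for each namespace link
# in DIR (default: /proc/self/ns, the namespaces of the current process).
# The function name and the DIR argument are illustrative assumptions.
list_namespaces() {
    ns_dir="${1:-/proc/self/ns}"
    for link in "$ns_dir"/*; do
        # Skip anything that is not a symbolic link (e.g. an unmatched glob).
        [ -L "$link" ] || continue
        printf '%s -> %s\n' "$(basename "$link")" "$(readlink "$link")"
    done
}

list_namespaces
```

Two processes that show the same identifier for a given type share that namespace. The lsns command from util-linux prints similar information for all visible processes.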
The isolation features included in Apptainer/Singularity are:
- by default: mount namespace
- optional: IPC, network, PID, UTS and user namespace
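The optional namespaces are requested per invocation with command-line flags. The sketch below wraps apptainer exec with the --ipc, --pid and --uts flags; the function name run_isolated and the image name used in the note are hypothetical, and the sketch assumes an Apptainer installation in the PATH.

```shell
#!/bin/sh
# run_isolated IMAGE COMMAND...: run COMMAND in IMAGE with the optional IPC,
# PID and UTS namespaces requested in addition to the default mount namespace.
# The function name is an illustrative assumption.
run_isolated() {
    image="$1"
    shift
    apptainer exec --ipc --pid --uts "$image" "$@"
}
```

For example, run_isolated example.sif ps -ef (with example.sif standing in for a real image) should list only the processes inside the container. The user and network namespaces are selected in the same way with --userns and --net.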
Singularity/Apptainer works either by making use of unprivileged user namespaces or with a setuid-root assist program. Apptainer uses unprivileged user namespaces by default beginning with Apptainer 1.1.
WLCG VOs and some other VOs require that Apptainer/Singularity can run unprivileged, which means that unprivileged user namespaces must be enabled. Network namespaces are not needed and can be disabled (see below for further explanation).
In el8 user namespaces are enabled by default. To enable user namespaces in el7:
root@host # echo "user.max_user_namespaces = 15000" > /etc/sysctl.d/91-max_user_namespaces.conf
root@host # sysctl -p /etc/sysctl.d/91-max_user_namespaces.conf
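Whether the limit is in effect can be read back from /proc/sys/user/max_user_namespaces, the file behind the user.max_user_namespaces sysctl key. A minimal sketch follows; the function name userns_status and its optional file argument are illustrative assumptions.

```shell
#!/bin/sh
# userns_status [FILE]: report whether the user namespace limit in FILE
# (default: /proc/sys/user/max_user_namespaces) allows user namespaces.
# Prints "enabled (limit N)", "disabled (limit N)" or "unknown".
userns_status() {
    limit_file="${1:-/proc/sys/user/max_user_namespaces}"
    if [ ! -r "$limit_file" ]; then
        echo "unknown"
        return 0
    fi
    limit=$(cat "$limit_file")
    # A limit greater than zero allows user namespaces to be created.
    if [ "$limit" -gt 0 ] 2>/dev/null; then
        echo "enabled (limit $limit)"
    else
        echo "disabled (limit $limit)"
    fi
}

userns_status
```

As a functional check, running unshare --user --map-root-user id -u as an ordinary user should print 0 when unprivileged user namespaces work (unshare is part of util-linux).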
In 2022 alone, several kernel vulnerabilities were related to unprivileged user namespaces in combination with network namespaces:
https://advisories.egi.eu/Advisory-SVG-CVE-2022-32250
https://advisories.egi.eu/Advisory-SVG-CVE-2022-1015
https://advisories.egi.eu/Advisory-SVG-CVE-2022-25636
Since network namespaces are not required for running jobs in Apptainer containers, EGI CSIRT recommends that sites disable network namespaces:
root@host # echo "user.max_net_namespaces = 0" > /etc/sysctl.d/92-max_net_namespaces.conf
root@host # sysctl -p /etc/sysctl.d/92-max_net_namespaces.conf
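One way to confirm the effect is to try to create a network namespace and see whether the kernel refuses. The sketch below does this with unshare from util-linux; the function name netns_probe is an illustrative assumption. Note that the probe also reports failure when user namespaces themselves are unavailable, so it is only meaningful on hosts where user namespaces are enabled.

```shell
#!/bin/sh
# netns_probe: try to create a network namespace and report the outcome.
# "true" is only a no-op command to run inside the new namespaces.
# --user is included so that the probe also works for unprivileged users.
netns_probe() {
    if ! command -v unshare >/dev/null 2>&1; then
        echo "unshare not found"
        return 0
    fi
    if unshare --user --net true 2>/dev/null; then
        echo "network namespaces can be created"
    else
        echo "network namespaces cannot be created"
    fi
}

netns_probe
```

With user.max_net_namespaces set to 0, the expected answer is that network namespaces cannot be created.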
This setting improves system security.
Disabling network namespaces can affect systemd services that use the PrivateNetwork feature.
Check which systemd services have this feature enabled by running:
grep PrivateNetwork /usr/lib/systemd/system/*.service
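The check can be narrowed to unit files that actually switch the feature on. The following sketch prints only those files; the function name units_with_private_network is an illustrative assumption, and the pattern covers the common true spellings (yes, true, on, 1) rather than every form systemd may accept.

```shell
#!/bin/sh
# units_with_private_network DIR...: print the *.service files in the given
# directories that set PrivateNetwork to a true value (yes, true, on or 1).
# Commented-out lines and PrivateNetwork=no are not reported.
units_with_private_network() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        grep -l -E -i \
            '^[[:space:]]*PrivateNetwork[[:space:]]*=[[:space:]]*(yes|true|on|1)[[:space:]]*$' \
            "$dir"/*.service 2>/dev/null
    done
    return 0
}

units_with_private_network /usr/lib/systemd/system /etc/systemd/system
```

This looks only at the main unit files and not at drop-ins, so the effective value for a single unit is better read with systemctl show -p PrivateNetwork <service>.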
The PrivateNetwork feature can be disabled for a service by creating a drop-in file <service>.service.d/*.conf under /etc/systemd/system.
In el8 an example of such a service is systemd-hostnamed:
root@host # cd /etc/systemd/system
root@host # mkdir -p systemd-hostnamed.service.d
root@host # (echo "[Service]"; echo "PrivateNetwork=no") > systemd-hostnamed.service.d/no-private-network.conf
After this change, reload the systemd daemon and restart the systemd-hostnamed service.
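The reload step can be sketched as a small helper. The function name apply_dropin is an illustrative assumption, and the sketch assumes that a restart is what makes the service pick up the new setting; it needs to be run as root.

```shell
#!/bin/sh
# apply_dropin SERVICE: make systemd re-read its unit files, restart SERVICE
# so that the drop-in takes effect, then print the effective PrivateNetwork
# value for that service.
apply_dropin() {
    systemctl daemon-reload &&
    systemctl restart "$1" &&
    systemctl show -p PrivateNetwork "$1"
}
```

Calling apply_dropin systemd-hostnamed after creating the drop-in should end with the line PrivateNetwork=no.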
Podman and Docker use network namespaces by default, but they can instead use the host’s network with the --network=host option. Charliecloud, Sarus and Apptainer do not require network namespaces by default.
For further reading, please see https://apptainer.org/docs/admin/main/user_namespace.html