Sunday March 17 2019
Docker + nftables
Normally, when you install Docker it takes care of the firewall rules for you,
using iptables under the hood. Unfortunately, at this time Docker does not have
any native support for nftables. This leaves us with a couple of options: stop
using the current Linux firewall and go back to the now-legacy iptables
utilities, or tell Docker to leave the firewall alone and write the nftables
rules ourselves.
As you can probably imagine from the title of the article, I do not plan on
going back to iptables, so let’s get into making this work with nftables.
Modifying the service
On Void Linux it’s pretty straightforward. Once docker is installed, but
before you symlink the service directory, simply add a conf file:
/etc/sv/docker/conf
OPTS="--iptables=false"
If you’re on a systemd-based distribution such as Ubuntu or Fedora, you can
copy the service from /lib/systemd/system/docker.service to
/etc/systemd/system/docker.service and modify the line that starts with
ExecStart like so:
ExecStart=/usr/bin/dockerd --iptables=false -H fd://
Then reload the daemons:
# systemctl daemon-reload
You can then enable and start the service from there.
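As an alternative to copying the whole unit file, systemd also supports drop-in overrides that replace a single setting. The following is a minimal sketch, not a tested recipe for every distribution: the function name write_dockerd_override is hypothetical, and the ExecStart line is simply the one shown above. Your packaged unit may carry extra flags, so compare against the output of systemctl cat docker.service first.

```shell
#!/bin/sh
# Write a systemd drop-in that overrides only ExecStart for docker.service.
# The target directory is an argument so the sketch can be tried against a
# scratch directory before touching /etc.
write_dockerd_override() {
    dropin_dir="$1"
    mkdir -p "$dropin_dir" || return 1
    # The empty "ExecStart=" clears the packaged command; the second line
    # sets the replacement (assumed to match the ExecStart shown above).
    cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --iptables=false -H fd://
EOF
}

# Example (as root):
#   write_dockerd_override /etc/systemd/system/docker.service.d
#   systemctl daemon-reload
```

The advantage of a drop-in is that package updates to the original unit file are still picked up, apart from the one line you override.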
Basic /etc/nftables.conf
nftables is sometimes a bit tricky to get started with, so an example should
help out a bit:
/etc/nftables.conf
#!/usr/sbin/nft -f
flush ruleset
table inet filter {
    chain input {
        type filter hook input priority 0;
        # Allow all input on loopback
        iif lo accept
        # Accept stateful traffic
        ct state established,related accept
        # Accept SSH
        tcp dport 22 accept
        # Accept HTTP and HTTPS
        tcp dport { 80, 443 } accept
        # Allow some icmp traffic for ipv6
        ip6 nexthdr icmpv6 icmpv6 type {
            nd-neighbor-solicit, echo-request,
            nd-router-advert, nd-neighbor-advert
        } accept
        counter drop
    }
    chain forward {
        type filter hook forward priority 0;
        # Note that by default docker has a drop default on the
        # forward chain. This is done for security reasons, and I
        # highly recommend you do the same. You will however have
        # to explicitly define what traffic is to be accepted here,
        # e.g. can the networks communicate to the world, other
        # networks, etc.
    }
    chain output {
        type filter hook output priority 0;
    }
}
table ip nat {
    chain prerouting {
        type nat hook prerouting priority 0
    }
    chain postrouting {
        type nat hook postrouting priority 100
        # You may need to change 'eth0' to your primary interface
        oif eth0 masquerade persistent
    }
}
Once you have the configuration in place, it’s as simple as enabling the nftables
service for your distribution. You can reload the rules with nft -f /etc/nftables.conf.
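Since a typo in the ruleset is easy to make, it can help to have nft validate the file before loading it. The sketch below uses nft's check mode (-c) for that; the helper name reload_nftables is hypothetical, and the default path is just the one used in this article.

```shell
#!/bin/sh
# Validate the ruleset first, and only load it if the check passes.
reload_nftables() {
    conf="${1:-/etc/nftables.conf}"
    # -c (--check) parses and validates the file without applying anything.
    nft -c -f "$conf" || { echo "syntax check failed: $conf" >&2; return 1; }
    nft -f "$conf"
}

# Example (as root):
#   reload_nftables && nft list ruleset
```

Running nft list ruleset afterwards shows what the kernel actually has loaded, which is a quick way to confirm the reload did what you expected.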
Forwarding ports
This is all fine and dandy if you’re not trying to host a service inside of your docker container. Most people use Docker to deploy their services, so it only makes sense to also show an example of forwarding a port on the host machine to a port on an internal docker container.
We don’t strictly have to, but in the long run it’s easier to create our own network for this, as docker does not let us specify a container’s IP on the default network.
It’s pretty straightforward to create this new network:
$ docker network create \
    -o com.docker.network.bridge.name=user0 \
    --subnet=172.20.0.0/16 \
    user
As you can see above, we’re passing in an additional option, -o, to set the
bridge name that we’re going to be using for this network. Although not
explicitly required, this lets us easily identify the bridge we just created,
as docker would otherwise generate a name for us.
We can also inspect and see a little bit more information about the network:
$ docker network inspect user
[
    {
        "Name": "user",
        "Id": "290a98a6f57739f29d758964924c95214219426c02c5b58e0a1049627b8da535",
        "Created": "2019-03-17T14:15:12.635615052-04:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "172.20.0.0/16",
                    "Gateway": "172.20.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {},
        "Options": {
            "com.docker.network.bridge.name": "user0"
        },
        "Labels": {}
    }
]
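If you only need a couple of those fields, docker network inspect accepts a --format Go template. The sketch below wraps that in a small helper; the function name network_summary is hypothetical, and the template assumes the JSON layout shown above (the IPAM.Config list and the bridge name under Options).

```shell
#!/bin/sh
# Print the subnet(s) and the bridge name of a docker network on one line.
# The template fields mirror the "docker network inspect" JSON layout.
network_summary() {
    docker network inspect \
        --format '{{range .IPAM.Config}}{{.Subnet}} {{end}}{{index .Options "com.docker.network.bridge.name"}}' \
        "$1"
}

# Example:
#   network_summary user
```

For the network created above, this should print the subnet followed by the bridge name, which are the two values the nftables rules need.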
We can also see that the new bridge we specified exists:
$ brctl show
bridge name     bridge id               STP enabled     interfaces
docker0         8000.0242a6395da2       no
user0           8000.0242b7d38495       no
We can then run a container for our service:
$ docker run \
    --network user \
    --rm \
    -d \
    --ip 172.20.1.80 \
    -v /var/lib/couchdb/data:/opt/couchdb/data \
    couchdb:latest
We can then test that our service, couchdb in this case, is responding as expected with:
$ curl 172.20.1.80:5984
{"couchdb":"Welcome","version":"2.3.1","git_sha":"c298091a4","uuid":"4e10b1c62f372d787c503842e54e230b","features":["pluggable-storage-engines","scheduler"],"vendor":{"name":"The Apache Software Foundation"}}
We can then adjust our nftables configuration to forward along the proper
port:
chain prerouting {
    type nat hook prerouting priority 0
    iif eth0 tcp dport 5984 dnat 172.20.1.80
}
or if you’d prefer a different port:
iif eth0 tcp dport 24000 dnat 172.20.1.80:5984
Final thoughts
The bridge user0 and the network name user can be changed to anything else
you’d like, same with the --subnet when creating the network. Keep in mind
though that you should stick to RFC 1918 networks for your --subnet value
unless you know what you’re doing.
This makes docker a bit harder to use, of course. Many would likely see more
benefit in using the legacy iptables tools and holding off on their switch to
nftables until Docker supports it.
As mentioned in the configuration example above, docker by default limits communication between forwarded networks, which is good from a security standpoint. I recommend that you do this as well by changing the default policy of the forward chain to drop and explicitly allowing the traffic you want.
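To make that recommendation concrete, here is a minimal sketch of a forward chain with a drop policy for the setup used in this article. It is a small shell helper (the name print_forward_chain is hypothetical) that prints a chain you could paste into the table inet filter block of /etc/nftables.conf. The rules are my assumption of a sensible minimum, not something Docker generates: replies to existing connections, outbound traffic from the bridge, and inbound traffic to the forwarded container port. It matches on iifname/oifname rather than iif/oif so the ruleset can still load if the bridge does not exist yet.

```shell
#!/bin/sh
# Print a forward chain with a drop policy.
# Arguments: WAN interface, docker bridge, container IP, container port.
print_forward_chain() {
    wan="$1"
    bridge="$2"
    ip="$3"
    port="$4"
    cat <<EOF
chain forward {
    type filter hook forward priority 0; policy drop;
    # Replies to connections that were already allowed
    ct state established,related accept
    # Containers on the bridge may reach the outside world
    iifname "$bridge" oifname "$wan" accept
    # Traffic DNATed to the container (destination is already rewritten here)
    iifname "$wan" oifname "$bridge" ip daddr $ip tcp dport $port accept
}
EOF
}

# Values taken from the examples in this article
print_forward_chain eth0 user0 172.20.1.80 5984
```

Note that the last rule matches the container's address and port, not the host port, because the prerouting DNAT has already happened by the time a packet reaches the forward hook. That means the same rule covers the 24000 to 5984 variant as well.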