Sunday September 29 2019

Installing Alpine Linux on AWS on an EC2 instance.

It’s not quite as straightforward as setting up a virtual machine and then uploading the image, as it was with Google Compute in the previous post. However, I’m happy to report that the process isn’t all that much different, and in some ways it’s actually less work if you know what you’re doing.

I’ve opted to omit screenshots as I hope my readers are somewhat familiar with the AWS EC2 console, plus it has a decent albeit clunky design that I’ve found reasonably easy to navigate.


Fire up a new EC2 instance. I used Amazon Linux as the OS, which we’ll use to bootstrap Alpine Linux; most other Linux distributions will work without much issue as well.

Now run through the AWS console, or use the command line, to create an EBS volume in the same Availability Zone that you put your instance in. It doesn’t need to be large; 4 GB is plenty and what I used.
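If you’d rather use the command line, something like the following should do it. The Availability Zone here is a placeholder; match it to wherever you launched your instance:

# aws ec2 create-volume --availability-zone us-east-1a --size 4 --volume-type gp2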

Now attach that volume to your EC2 instance, log in, and become root. Once you’re logged in, check fdisk -l and work out which disk is the new EBS volume you attached. In my case it was /dev/xvdf; remember it, we’re going to need it in a little bit.
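The attach step can also be done from the CLI; the volume and instance IDs below are placeholders. Note that a volume attached as /dev/sdf will typically show up inside the instance as /dev/xvdf on Xen-based instance types:

# aws ec2 attach-volume --volume-id vol-0123456789abcdef0 --instance-id i-0123456789abcdef0 --device /dev/sdf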

Chroot Setup

From the Alpine Linux Downloads page copy the latest Mini Root Filesystem URL for x86_64.

Now download and extract it on the EC2 instance, substituting the URL you copied above:

# wget -O alpine.tgz <mini-root-fs-url>
# mkdir chroot && tar -C chroot -xf alpine.tgz

From there, let’s set up the filesystem mounts and nameservers for our chroot:

# cd chroot
# mount -t proc none proc
# mount -o bind /dev dev
# mount -o bind /sys sys
# echo 'nameserver <dns-server-1>' >> etc/resolv.conf
# echo 'nameserver <dns-server-2>' >> etc/resolv.conf
# chroot . /bin/ash

Inside the Chroot

Now that you’re in the chroot, let’s get the repositories updated and the setup scripts installed

# apk update
# apk add alpine-base

Now that we have the setup scripts, let’s export the environment variables that make installation a little bit easier, much as we did with the install for Google Compute:

# export BOOTLOADER=grub
# export ROOTFS=xfs # optional, but I recommend it
# export DISKOPTS='-L -s 512 -m sys /dev/xvdf'

Note that the disk path above should match the new EBS volume you identified from the fdisk output earlier.
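For reference, here’s what those setup-disk options mean, per setup-disk’s usage text:

-L          manage the disk with LVM (this is where /dev/vg0/lv_root comes from later)
-s 512      create a 512 MB swap volume
-m sys      perform a traditional full “sys” install to disk
/dev/xvdf   the target disk, i.e. the EBS volume we attached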

When prompted, skip the NTP server by supplying none, and skip network configuration as well (you don’t want to lose your shell). With this in mind, let’s run the setup script:

# setup-alpine


Once that’s finished, drop out of the chroot and let’s mount the new filesystem to do some housekeeping, making sure we get an IP address on reboot:

# mount /dev/vg0/lv_root /mnt/
# cd /mnt/

Now edit /mnt/etc/network/interfaces and make sure the content matches:

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet dhcp

From there it’s important to make sure that we have an SSH key in place. Note the lack of a leading slash, so that we operate on our new install under the /mnt folder:

# umask 077
# mkdir root/.ssh
# vi root/.ssh/authorized_keys

Remove the SSH host keys:

# rm -f etc/ssh/ssh_host*

Adding required modules for EBS optimization and ENA

Amazon’s t3 and other instance types use “EBS Optimization” and the “Elastic Network Adapter” by default. In order to take advantage of this and be able to boot on EBS-optimized instances, we need to make sure the modules are in the initial ramdisk. We’ll also take this opportunity to switch to the trimmed-down linux-virt kernel package.

# # Just like before
# cd /mnt
# mount -t proc none proc
# mount -o bind /dev dev
# mount -o bind /sys sys
# chroot . /bin/ash
# sed -i.bak -re's/(virtio)/nvme nvme_core ena \1/' /etc/mkinitfs/mkinitfs.conf
# cat /etc/mkinitfs/mkinitfs.conf
features="ata base ide scsi usb ena nvme nvme_core virtio xfs lvm"
# apk add linux-virt
# apk del linux-vanilla
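As a quick sanity check, you can run the same sed substitution against a sample features line and confirm the three modules land ahead of virtio. The sample line below is illustrative; your stock mkinitfs.conf may list slightly different features:

```shell
#!/bin/sh
# A sample mkinitfs.conf features line, before our edit.
line='features="ata base ide scsi usb virtio xfs lvm"'
# The same substitution used above: prepend nvme, nvme_core and ena before virtio.
echo "$line" | sed -re 's/(virtio)/nvme nvme_core ena \1/'
# prints: features="ata base ide scsi usb nvme nvme_core ena virtio xfs lvm"
```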

Since we installed the new kernel package after modifying the configuration, there’s no need to manually run mkinitfs; the package manager has done this for us.
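If you ever do need to rebuild the initramfs by hand inside the chroot, pass mkinitfs the kernel version explicitly (uname -r inside a chroot reports the host kernel, not Alpine’s). Assuming only the one linux-virt kernel is installed:

# mkinitfs $(ls /lib/modules)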

Create a snapshot in the AWS console.

It’s about as easy as it sounds: go to your EBS volume, right click, and create a snapshot. Once it’s finished, right click on it and choose Create Image. All of the defaults are fine; simply fill in the Name and Description, then click create.
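If you prefer the command line here too, the rough equivalent is below; the volume and snapshot IDs and the image name are placeholders. Passing --ena-support matters, since we went to the trouble of adding the ena module:

# aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description 'Alpine root'
# aws ec2 register-image --name 'alpine-custom' --architecture x86_64 \
    --virtualization-type hvm --ena-support --root-device-name /dev/xvda \
    --block-device-mappings 'DeviceName=/dev/xvda,Ebs={SnapshotId=snap-0123456789abcdef0}'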

Once that’s done, deploy an instance from that AMI and try to log in!
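From the CLI that’s roughly the following, with the AMI ID, key name, and address as placeholders:

# aws ec2 run-instances --image-id ami-0123456789abcdef0 --instance-type t3.micro --key-name my-key
# ssh root@<public-ip>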

Why not use Packer?

Packer, by its own definition, is for “building automated machine images,” which is absolutely fine if that’s what you want to do. I suspect, however, that Packer and other “Infrastructure as Code” utilities are being used in place of documentation, or as if they are documentation, which they are not; they are a liability. They’re more things for you to maintain.

These tools have their place, don’t get me wrong. The problem is that you’re adding complexity by introducing yet another tool, and trading the time you could have engineers spend documenting what’s happening for time spent learning, debugging, and maintaining yet another piece of code or configuration.

If given the choice between some documentation that explains how something works and some jumble of code that should work, I’m going to choose the documentation every time. As any software engineer who’s been in the industry a while knows, writing code is the easy part; reading it is much, much harder.

So, if you know how to do everything above, it shouldn’t be difficult to write some scripts and toss them into Packer to make this happen. It’s learning how to build these things in the first place that’s difficult; looking at some Packer configuration without documentation of what’s going on isn’t going to help anyone learn faster.

Plus, just because you built the AMI by hand doesn’t mean that you can’t use other tools to further enforce a configuration standard, such as Ansible, or to orchestrate your infrastructure with utilities such as Terraform, CloudFormation, Pulumi, etc.

Sometimes the most effective solution is to document, take backups, and document the restore process. Then test it on a schedule. Then you know you can recover from that disaster scenario, and that’s worth a lot more than some code that gets used only a handful of times and then forgotten about.

Being effective is about understanding what is and isn’t a good use of your time.