Basic Steps to build an Arch Linux AMI

You can build your own Arch Linux AMI. The basic steps are below. They can be executed on any Arch Linux system (it does not need to be an EC2 instance at AWS), even your own local laptop. If done on an Arch Linux EC2 instance it is possible to skip the image file step and write directly to an EBS volume (instead of copying the image file to the EBS volume later). To keep these commands universally usable on any Arch Linux system, the image file step was left in.
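
If you do build on an EC2 instance and want to write directly to an EBS volume, a minimal sketch of that variant (assuming the volume is already attached and, purely as an example, visible as /dev/xvdf) is to point $image at the block device and skip the image file and loop device steps below:

#Sketch: write directly to an attached EBS volume instead of an image file
#/dev/xvdf is only an example - use whatever device name your instance assigns
image=/dev/xvdf
#Skip the dd and losetup steps; partition, format and mount $image directly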

Note, it is assumed that everything is run as root. The code snippets below are taken from the internal bash script that is used during the automated image build process. Some variables that are referred to may not be defined in the code below. Discretion is advised and some knowledge of bash etc. is required to implement this on your own system.

If you have comments about the build process and/or suggestions for improvements/changes you can try to write an email to: arch-ami 'at' drzee.net (no guarantees you get a response).

If you would like to support this project you can buy me a beer: https://buymeacoffee.com/drzee

The build process

We start by creating the image file we are going to use and setting up the loop device for it. An 8 GB file is sufficient. Note that the variables set up here are referred to in later code blocks as well.

mnt=<mountpoint>
img_file=<imgfile>
dd if=/dev/zero of=$img_file bs=1 count=0 seek=8G

#Setup the loop device
image=`losetup -fP --show $img_file`
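
As an optional sanity check you can verify that the loop device was created and spans the full 8 GB before moving on:

#Optional: verify the loop device and its size
losetup -l $image
lsblk $image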

We now set up the partitions in the image. We are going to use Legacy Boot (over UEFI) and GRUB. All AWS instance types can handle Legacy Boot, but only some instance types can handle UEFI boot. By creating the image for Legacy Boot we can use it for more (and in particular older) instance types. It makes little difference in a virtual setting whether we use Legacy or UEFI boot.

#We set up Legacy Boot via GRUB
parted -s $image -- mklabel msdos
parted -s $image -- mkpart primary 0% 100%
parted -s $image -- toggle 1 boot

partition=`lsblk -po kname -n $image | grep -v "^$image$"`

Format and mount the single partition.

mkfs.ext4 $partition

mount $partition $mnt

We start the build process by using pacstrap to create the system. As pacstrap is run in the context of the host machine, we need to make sure that we do not have any IgnorePkg settings in pacman that prevent installation of components we need in what is to become the AMI. Any IgnorePkg configurations should be (temporarily) removed.

#Start the build process
#We remove any ignore package settings in pacman.conf and restore it back later.
cp /etc/pacman.conf /etc/pacman.conf.org
sed -i '/^IgnorePkg/ s/./#&/' /etc/pacman.conf

pacstrap $mnt base grub mkinitcpio

#Restore pacman.conf
cp /etc/pacman.conf.org /etc/pacman.conf

genfstab -U -p $mnt >> $mnt/etc/fstab

Now we are almost ready to chroot into the system in the image file and continue the configuration. First, however, we configure some mirror lists and pacman settings.

#Setup the mirror list - copy in the mirror list from the build system
#We make a copy of the original mirror list in the image and will restore it back later
cp $mnt/etc/pacman.d/mirrorlist $mnt/etc/pacman.d/mirrorlist.bk01
/bin/cp /etc/pacman.d/mirrorlist $mnt/etc/pacman.d/mirrorlist
 
sed -ri '/^\[core\]/aSigLevel = PackageRequired' $mnt/etc/pacman.conf
sed -ri '/^\[extra\]/aSigLevel = PackageRequired' $mnt/etc/pacman.conf
sed -ri '/^\[community\]/aSigLevel = PackageRequired' $mnt/etc/pacman.conf
 
#Enable ParallelDownloads in pacman to make it faster (default is 5 parallel downloads)
sed -i '/ParallelDownloads/s/^#//g' $mnt/etc/pacman.conf

We activate pacman inside the image file, refresh the package lists and install the latest keyring.

arch-chroot $mnt /bin/bash -c "pacman-key --init"
arch-chroot $mnt /bin/bash -c "pacman-key --populate archlinux"
arch-chroot $mnt /bin/bash -c "pacman -Syy"
arch-chroot $mnt /bin/bash -c "pacman --needed --noconfirm -S archlinux-keyring"

Set up the locales and set the time to UTC.

#Setup with en-US and en-GB locale
sed -i 's/^#en_GB/en_GB/' $mnt/etc/locale.gen 
sed -i 's/^#en_US/en_US/' $mnt/etc/locale.gen
 
#Generate the locales
arch-chroot $mnt /bin/bash -c "locale-gen"
 
#Set more locales for consoles
cat > $mnt/etc/vconsole.conf << "EOF"
KEYMAP=us
FONT=LatArCyrHeb-14
EOF

cat > $mnt/etc/locale.conf << "EOF"
LANG=en_US.utf8
EOF
 
#Set time to UTC
ln -sf ../usr/share/zoneinfo/UTC $mnt/etc/localtime

These are the minimal packages to be part of the future AMI. This is (almost) the minimal selection to make it boot successfully. If you want to bake in additional packages it can be done at this stage.

#List of default minimal packages we want in the AMI
arch-chroot $mnt /bin/bash -c "pacman --needed --noconfirm -S systemd-sysvcompat dosfstools e2fsprogs exfatprogs ntfs-3g xfsprogs man which lsof reflector rsync vi python3 audit irqbalance openssh haveged cloud-init cloud-utils aws-cli jq"

We install the kernel in the image file. If you want to use a different kernel (like LTS) just install the corresponding package here.

#Normal Kernel
arch-chroot $mnt /bin/bash -c "pacman --needed --noconfirm -S linux linux-headers" 

We configure audit rules to log certain activities. This is a basic configuration and may need adjustment as desired.

#Audit Rules
mkdir -p $mnt/etc/audit/rules.d

# Set audit rules
cat > $mnt/etc/audit/rules.d/audit.rules <<"EOF"
# From:
# https://security.blogoverflow.com/2013/01/a-brief-introduction-to-auditd/
# This file contains the auditctl rules that are loaded
# whenever the audit daemon is started via the initscripts.
# The rules are simply the parameters that would be passed
# to auditctl.
# First rule - delete all
-D

# Increase the buffers to survive stress events.
# Make this bigger for busy systems
-b 1024
-a always,exit -S adjtimex -S settimeofday -S stime -k time-change
-a always,exit -S clock_settime -k time-change
-a always,exit -S sethostname -S setdomainname -k system-locale
-w /etc/group -p wa -k identity
-w /etc/passwd -p wa -k identity
-w /etc/shadow -p wa -k identity
-w /etc/sudoers -p wa -k identity
-w /var/run/utmp -p wa -k session
-w /var/log/wtmp -p wa -k session
-w /var/log/btmp -p wa -k session
-w /etc/selinux/ -p wa -k MAC-policy

# Disable adding any additional rules. 
# Note that adding new rules will require a reboot
-e 2
EOF

chmod -R o-rwx $mnt/etc/audit

Some changes are needed to the cloud-init configuration to make sure it runs correctly on Arch Linux.

#The cloud-init locale module creates an invalid format in /etc/locale.gen so we disable it
sed -ri '/- locale/s/^/#/' $mnt/etc/cloud/cloud.cfg

# enable syslog logger for cloud-init
sed -ri 's/# (.*log_syslog.*)$/ \1/g' $mnt/etc/cloud/cloud.cfg.d/05_logging.cfg
sed -ri 's/( - \[ \*log_base, \*log_file)/#\1/g' $mnt/etc/cloud/cloud.cfg.d/05_logging.cfg

# Disable other unnecessary or broken modules
sed -ri '/ ntp/d' $mnt/etc/cloud/cloud.cfg #<-Not supported on Arch

We configure the kernel modules that we need to build into the initramfs. Mostly these are virtualization modules and specific network adapters. We also disable floppy disks altogether.

# Some Kernel Module loading config
echo "blacklist floppy" > $mnt/etc/modprobe.d/blacklist-floppy.conf

# Include modules that may be needed in a variety
# of hypervisors, depending on where the guest is run.
MODULES=""

# Support power-off requests. - ipmi is Intelligent Platform Management Interface, used to manage a machine outside the OS.
MODULES+="button ipmi-msghandler ipmi-poweroff"

# Support nvme, Non-Volatile Memory Express, a controller spec for SSDs
MODULES+=" nvme"

# Support the KVM, kernel-based virtual machine
MODULES+=" virtio virtio-blk virtio-net virtio-pci virtio-ring"

# Support the Xen virtual machine
MODULES+=" xen-blkfront xen-netfront xen-pcifront xen-privcmd"

# Support SR-IOV, single root i/o virtualization
MODULES+=" ixgbevf"

# Support for AWS EC2 ENA, Elastic Network Adapter
MODULES+=" ena"

sed -ri "s/^MODULES=.*/MODULES=($MODULES)/g" $mnt/etc/mkinitcpio.conf
sed -ri "s/^FILES=.*/FILES=(\/etc\/modprobe.d\/blacklist-floppy.conf)/g" $mnt/etc/mkinitcpio.conf

We get ready to build the initramfs. We remove the default presets and replace them with our own minimal one. Note that this makes switching between different kernel packages (regular -> LTS) more difficult and will require some manual adjustment to make sure the kernel images are found during boot.

img=/boot/vmlinuz-linux
init_img=/boot/initramfs-linux.img
preset_name=linux.preset

rm $mnt/etc/mkinitcpio.d/*.preset

# Disable module auto-detection and setup the presets
cat > $mnt/etc/mkinitcpio.d/$preset_name <<EOF
# mkinitcpio preset file for linux
ALL_config="/etc/mkinitcpio.conf"
ALL_kver=$img
PRESETS=('default')
default_image=$init_img
# Turn off autodetect:
default_options="-S autodetect"
EOF
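
If you installed the LTS kernel instead, the corresponding file names differ. A sketch of the adjusted variables (assuming the standard Arch file names for linux-lts) to use when writing the preset above:

#Sketch for the LTS kernel - use these values instead when creating the preset
img=/boot/vmlinuz-linux-lts
init_img=/boot/initramfs-linux-lts.img
preset_name=linux-lts.preset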

We set the default boot target to be multi-user mode.

#Boot target
ln -sf ../../../../usr/lib/systemd/system/multi-user.target $mnt/etc/systemd/system/default.target

We build the initramfs image ...

# Finally, run mkinitcpio.
# Reads /etc/mkinitcpio.conf, /etc/mkinitcpio.d/*.preset.
# As specified in .conf, writes /boot/initramfs-linux.img
arch-chroot $mnt /bin/bash -c "mkinitcpio -P"

... and set up GRUB

#Setup GRUB boot load
arch-chroot $mnt /bin/bash -c "grub-install --target=i386-pc --recheck ${image}"

sed -ri 's/GRUB_TIMEOUT=5/GRUB_TIMEOUT=1/' $mnt/etc/default/grub
sed -ri 's/^#GRUB_TERMINAL_OUTPUT/GRUB_TERMINAL_OUTPUT/' $mnt/etc/default/grub
sed -ri "s/^GRUB_CMDLINE_LINUX_DEFAULT.*/GRUB_CMDLINE_LINUX_DEFAULT=\"console=ttyS0 earlyprint=serial,ttyS0,keep loglevel=7 nomodeset\"/g" $mnt/etc/default/grub
sed -ri '/^GRUB_TIMEOUT=/a GRUB_DISABLE_SUBMENU=y' $mnt/etc/default/grub

arch-chroot $mnt /bin/bash -c "grub-mkconfig -o /boot/grub/grub.cfg"

Configure the basic SSH settings. On first boot cloud-init will make changes and, among other things, install the SSH keys.

# Set SSH port
sed -ri '1i Port 22' $mnt/etc/ssh/sshd_config
# Disable password authentication. It doesn't make sense in a cloud setting.
sed -ri 's/^#PasswordAuthentication yes/PasswordAuthentication no/' $mnt/etc/ssh/sshd_config

Initial network configuration. Cloud-init will modify this on first boot and make the final network setup. This just needs to be present to get the instance started.

# Configure the initial network (systemd-networkd only reads files with a .network suffix)
cat > $mnt/etc/systemd/network/20-ethernet.network << "EOF"
[Match]
Name = en* eth*
[Network]
DHCP = yes
[DHCP]
UseMTU = yes
UseDNS = yes
UseDomains = yes
EOF

Set up resolv.conf correctly.

#Set up resolv.conf correctly to work with GPG and possibly other tools relying on resolv.conf
rm $mnt/etc/resolv.conf
ln -s /run/systemd/resolve/stub-resolv.conf $mnt/etc/resolv.conf

Enable the basic services. In particular SSH and the cloud-init related services are important.

#Enable services
arch-chroot $mnt /bin/bash -c "systemctl enable systemd-timesyncd.service"
arch-chroot $mnt /bin/bash -c "systemctl enable nscd.service"
arch-chroot $mnt /bin/bash -c "systemctl enable auditd.service"
arch-chroot $mnt /bin/bash -c "systemctl enable haveged.service"
arch-chroot $mnt /bin/bash -c "systemctl enable irqbalance.service"
arch-chroot $mnt /bin/bash -c "systemctl enable cloud-init.service"
arch-chroot $mnt /bin/bash -c "systemctl enable cloud-config.service"
arch-chroot $mnt /bin/bash -c "systemctl enable cloud-final.service"
arch-chroot $mnt /bin/bash -c "systemctl enable systemd-networkd"
arch-chroot $mnt /bin/bash -c "systemctl enable systemd-resolved"
arch-chroot $mnt /bin/bash -c "systemctl enable sshd.service"

Finally we clean out all the pacman package caches (no need to keep them around), clear the logs generated during the build process and delete/reset the keyring to allow for a fresh pacman init when the instance boots.

# Do some cleanup
# Copy in the original mirror list again
mv $mnt/etc/pacman.d/mirrorlist.bk01 $mnt/etc/pacman.d/mirrorlist
# Clear package cache, repo DB cache and logs
find $mnt/var/cache/pacman/pkg -type f -print0 | xargs -0 rm -fv
find $mnt/var/lib/pacman/sync -type f -print0 | xargs -0 rm -fv
find $mnt/var/log -type f -print0 | xargs -0 rm -fv
 
#Clear arch key ring
rm -Rf $mnt/etc/pacman.d/gnupg

We run the trim command to reclaim unused space in the image file, unmount the filesystem and then detach the loop device.

#Trim the FS to minimize space
fstrim $mnt

sleep 5

#Umount and remove loop device
umount $partition
losetup -d $image

Now we have an image file and need to convert that to an AMI. For that we need to boot up an EC2 instance (any Linux operating system is fine), copy the image file over to it (via S3, if we did not already do the above on an EC2 instance), create and attach a second 8 GB EBS volume and copy the content of the image to the volume (using something like dd, or even better ddpt). Finally we snapshot the volume and turn the snapshot into the AMI.
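
For the copy via S3, a sketch with the AWS CLI could look like the following (the bucket and object name my-build-bucket/arch-ami.img are only placeholders for a bucket you own):

#On the build machine: upload the image (bucket and key are only examples)
aws s3 cp $img_file s3://my-build-bucket/arch-ami.img

#On the EC2 instance: download it again
aws s3 cp s3://my-build-bucket/arch-ami.img $img_file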

Using the AWS CLI (on the EC2 instance and assuming the image file is already copied to the instance):

ami_name=<some name>
# Create the volume used for the image target and attach it
volume_id=`aws ec2 create-volume --region $aws_region --availability-zone $aws_az --no-encrypted --volume-type gp3 --size 8 --tag-specifications 'ResourceType=volume,Tags=[{Key=arch_ami,Value='${ami_name}'}]' | jq -r ."VolumeId"`

#Wait for the volume to be ready
aws ec2 wait volume-available --region $aws_region --volume-ids $volume_id

#Attach the volume
aws ec2 attach-volume --region $aws_region --device /dev/xvdf --instance-id $instance_id --volume-id $volume_id

#Sleep a moment to let the volume settle and devices get created
sleep 10
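
Instead of (or in addition to) the fixed sleep you can wait for the attachment to be registered on the API side; note that the block device on the instance may still need a moment to appear after that. A minimal sketch:

#Optional alternative: wait until the API reports the volume as in use
aws ec2 wait volume-in-use --region $aws_region --volume-ids $volume_id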

We use ddpt (https://aur.archlinux.org/packages/ddpt and http://sg.danny.cz/sg/ddpt.html) to copy the image to the EBS volume, but dd can also be used, it is just slower. We need to provide the target device name that gets assigned to the EBS volume when it is attached.
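
On Nitro based instances the attached volume typically shows up as an NVMe device (for example /dev/nvme1n1) rather than /dev/xvdf, so one simple way to find the right target is to list the block devices and pick the new, empty 8 GB device that appeared after the attach:

#List block devices to identify the newly attached 8 GB volume
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT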

#Copy the image to the volume using ddpt (needs to be installed from AUR)
target=<target block device>
sudo ddpt if=$img_file of=$target bs=512 conv=sparse,fsync oflag=sparse,strunc 

After the copy has completed we detach the volume again and make the snapshot.

#Detach the volume 
aws ec2 detach-volume --region $aws_region --volume-id $volume_id
 
#Detach will take a few seconds
sleep 10

#Take the snapshot
snap_id=`aws ec2 create-snapshot --region $aws_region --volume-id $volume_id --tag-specifications 'ResourceType=snapshot,Tags=[{Key=arch_ami,Value='${ami_name}'}]' | jq -r ."SnapshotId"`

#wait for snapshot completion
aws ec2 wait snapshot-completed --region $aws_region --snapshot-ids $snap_id

We now have the snapshot and can delete the volume (it is no longer needed) and convert the snapshot to the AMI.

#Delete the volume
aws ec2 delete-volume --region $aws_region --volume-id $volume_id

#Register the AMI in the current region
ami_id=`aws ec2 register-image --region $aws_region --architecture x86_64 --ena-support --block-device-mappings "DeviceName=/dev/sda1,Ebs={SnapshotId=$snap_id}" --name $ami_name --description "$ami_description" --root-device-name /dev/sda1 --virtualization-type hvm | jq -r .ImageId`

And that's it. You now have a private AMI that you can use to boot new instances. To make it public you just need to change the access permissions and also remember to allow public read access to the underlying snapshot. If you don't do the latter, others can still use the AMI, but they cannot make copies of it or move it to other regions.
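
A sketch of how that can look with the AWS CLI (making both the AMI and the underlying snapshot public):

#Make the AMI publicly launchable
aws ec2 modify-image-attribute --region $aws_region --image-id $ami_id --launch-permission "Add=[{Group=all}]"

#Allow public read access to the underlying snapshot
aws ec2 modify-snapshot-attribute --region $aws_region --snapshot-id $snap_id --attribute createVolumePermission --operation-type add --group-names all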