Do-It-Yourself Bare-Metal Provisioning, or Automatic Server Provisioning from Scratch

Hello, I'm Denis, and one of my areas of work is developing infrastructure solutions at X5. Today I would like to show how you can build an automatic server provisioning system from publicly available tools. In my opinion, it is an interesting, simple, and flexible solution.

Provisioning here means turning a new, out-of-the-box server into a fully configured server with an OS: either Linux or the ESXi hypervisor (deploying Windows servers is not covered in this article).

Terms:

  • servers - servers to be configured.
  • install server - the main server that provides the entire process of provisioning over the network.

Why is automation needed?

Let's say the task is to provision servers from scratch in bulk, up to 30 per day at peak. The servers come from different manufacturers and in different models, may get different operating systems, and may or may not run a hypervisor.

What operations does the setup process include (without automation)?

  • connect a keyboard, mouse, and monitor to the server;
  • configure BIOS, RAID, IPMI;
  • update component firmware;
  • deploy a file system image (or install a hypervisor and copy virtual machines);
  • configure OS parameters (hostname, IP, etc.).

Note. Alternatively, the OS can be deployed through an installation driven by an answer file. That is not covered in this article, although you will see below that adding this functionality is not difficult.

With this approach, the same settings are applied one server at a time, which is very inefficient.

The essence of automation is to remove human involvement from the server provisioning process as far as possible.

Automation reduces downtime between operations and allows multiple servers to be provisioned at the same time. It also greatly reduces the likelihood of human error.

How does automatic server configuration work?

Let's analyze all the stages in detail.

You have a Linux server that you use as a PXE install server, with the DHCP and TFTP services installed and configured on it.

So, we boot the server to be configured via PXE. Let's recall how that works:

  • The server is set to boot over the network.
  • The server loads the PXE ROM of the network card and contacts the install server via DHCP to obtain a network address.
  • The DHCP service on the install server issues an address along with instructions for the subsequent PXE boot.
  • The server downloads the network bootloader from the install server via TFTP; further loading follows the PXE configuration file.
  • The boot proceeds according to the received parameters (kernel, initramfs, mount points, squashfs image, etc.).

Note. This article describes PXE booting in BIOS mode. Manufacturers are now actively moving to UEFI boot mode. For PXE, the difference lies in the DHCP server configuration and in the need for an additional bootloader.

Let's look at an example PXE server configuration (pxelinux menu).

pxelinux.cfg/default file:

default menu.c32
prompt 0
timeout 100
menu title X5 PXE Boot Menu
LABEL InstallServer Menu
	MENU LABEL InstallServer
	KERNEL menu.c32
	APPEND pxelinux.cfg/installserver
LABEL VMware Menu
	MENU LABEL VMware ESXi Install
	KERNEL menu.c32
	APPEND pxelinux.cfg/vmware
# default menu
LABEL toolkit
	MENU LABEL Linux Scripting Toolkits
	MENU default
	KERNEL menu.c32
	# go to the next menu
	APPEND pxelinux.cfg/toolkit

pxelinux.cfg/toolkit file:

prompt 0
timeout 100
menu title X5 PXE Boot Menu
label mainmenu
    menu label ^Return to Main Menu
    kernel menu.c32
    append pxelinux.cfg/default
# default: automatic mode
label x5toolkit-auto
        menu label x5 toolkit autoinstall
        menu default
        kernel toolkit/tkcustom-kernel
        append initrd=toolkit/tk-initramfs.gz quiet net.ifnames=0 biosdevname=0 nfs_toolkit_ip=192.168.200.1 nfs_toolkit_path=tftpboot/toolkit nfs_toolkit_script=scripts/mount.sh script_cmd=master-install.sh CMDIS2="…"
# for debugging: console
label x5toolkit-shell
        menu label x5 toolkit shell
        kernel toolkit/tkcustom-kernel
        append initrd=toolkit/tkcustom-initramfs.gz quiet net.ifnames=0 biosdevname=0 nfs_toolkit_ip=192.168.200.1 nfs_toolkit_path=tftpboot/toolkit nfs_toolkit_script=scripts/mount.sh script_cmd=/bin/bash CMDIS2="…"

At this stage, the kernel and initramfs form an intermediate Linux image that carries out the main preparation and configuration of the server.

As you can see, the bootloader passes many parameters to the kernel. Some are used by the kernel itself, and some we can use for our own purposes. This is discussed later; for now, just remember that all the passed parameters are available inside the intermediate Linux image via /proc/cmdline.
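Reading one of these parameters back from /proc/cmdline can be sketched roughly as follows. The function name get_cmdline_param is an assumption made for illustration, not part of the X5 toolkit, and the sketch only handles simple key=value words without spaces in the value.

```shell
# Print the value of a key=value parameter from a kernel command line.
# Usage: get_cmdline_param KEY [FILE]   (FILE defaults to /proc/cmdline)
# Hypothetical helper; values containing spaces are not supported.
get_cmdline_param() {
    key=$1
    file=${2:-/proc/cmdline}
    # Split the command line on whitespace and print the first match.
    for word in $(cat "$file"); do
        case $word in
            "$key"=*)
                printf '%s\n' "${word#*=}"   # strip everything up to the first "="
                return 0
                ;;
        esac
    done
    return 1   # parameter not present
}
```

Inside the intermediate image, a call such as get_cmdline_param script_cmd would then print the value set in the PXE menu entry.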

Where do we get the kernel and initramfs?
Any Linux distribution can serve as a basis. What to look for when choosing:

  • the boot image must be universal (driver availability, the ability to install additional utilities);
  • most likely, you will need to customize the initramfs.

How is it done in our X5 solution? CentOS 7 was chosen as the basis. We use the following trick: prepare the future image's directory structure, pack it into an archive, and create an initramfs that contains this file system archive. When the image boots, the archive is unpacked into a freshly created tmpfs. The result is a minimal yet full-fledged Linux live image with all the necessary utilities, consisting of only two files: the kernel and the initramfs.

# create the directories:

mkdir -p /tftpboot/toolkit/CustomTK/rootfs /tftpboot/toolkit/CustomTK/initramfs/bin

# prepare the structure:

yum groups -y install "Minimal Install" --installroot=/tftpboot/toolkit/CustomTK/rootfs/
yum -y install nfs-utils mariadb ntpdate mtools syslinux mdadm tbb libgomp efibootmgr dosfstools net-tools pciutils openssl make ipmitool OpenIPMI-modalias rng-tools --installroot=/tftpboot/toolkit/CustomTK/rootfs/
yum -y remove biosdevname --installroot=/tftpboot/toolkit/CustomTK/rootfs/

# prepare the initramfs:

wget https://busybox.net/downloads/binaries/1.31.0-defconfig-multiarch-musl/busybox-x86_64 -O /tftpboot/toolkit/CustomTK/initramfs/bin/busybox
chmod a+x /tftpboot/toolkit/CustomTK/initramfs/bin/busybox
cp /tftpboot/toolkit/CustomTK/rootfs/boot/vmlinuz-3.10.0-957.el7.x86_64 /tftpboot/toolkit/tkcustom-kernel

# create /tftpboot/toolkit/CustomTK/initramfs/init (script contents below):

#!/bin/busybox sh
/bin/busybox --install /bin
mkdir -p /dev /proc /sys /var/run /newroot
mount -t proc proc /proc
mount -o mode=0755 -t devtmpfs devtmpfs /dev
mkdir -p /dev/pts /dev/shm /dev/mapper /dev/vc
mount -t devpts -o gid=5,mode=620 devpts /dev/pts
mount -t sysfs sysfs /sys
mount -t tmpfs -o size=4000m tmpfs /newroot
echo -n "Extracting rootfs... "
xz -d -c -f rootfs.tar.xz | tar -x -f - -C /newroot
echo "done"
mkdir -p /newroot/dev /newroot/proc /newroot/sys
mount --move /sys  /newroot/sys
mount --move /proc /newroot/proc
mount --move /dev  /newroot/dev
exec switch_root /newroot /sbin/init

# pack the rootfs and initramfs:

cd /tftpboot/toolkit/CustomTK/rootfs
tar cJf /tftpboot/toolkit/CustomTK/initramfs/rootfs.tar.xz --exclude ./proc --exclude ./sys --exclude ./dev .
cd /tftpboot/toolkit/CustomTK/initramfs
find . -print0 | cpio --null -ov --format=newc | gzip -9 > /tftpboot/toolkit/tkcustom-initramfs-new.gz

So we have specified the kernel and initramfs to boot. At this stage, booting the intermediate Linux image via PXE gives us an OS console.

Great, but now we need to hand control over to our "automation".

It can be done like this.

Suppose that after the image boots we want to hand control to the mount.sh script.
Let's add mount.sh to autorun. To do this, you need to modify the initramfs:

  • unpack the initramfs (not required if you use the initramfs variant described above);
  • add code to autorun that parses the parameters passed via /proc/cmdline and hands control on;
  • pack the initramfs.

Note. In the X5 toolkit, control at boot is handed to the script /opt/x5/toolkit/bin/hook.sh by means of an override.conf for getty on tty1 (ExecStart=…).

So, the image boots and runs the mount.sh script at startup. During execution, mount.sh parses the passed parameters (script_cmd=) and launches the required program or script.

label toolkitauto
kernel...
append ... nfs_toolkit_script=scripts/mount.sh script_cmd=master-install.sh

label toolkitshell
kernel...
append ... nfs_toolkit_script=scripts/mount.sh script_cmd=/bin/bash

Here on the left is the PXE menu, on the right is the control transfer diagram.

We have figured out the transfer of control. Depending on the PXE menu choice, either the autoconfiguration script or the debugging console is launched.
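The choice between the autoconfiguration script and the console can be sketched as a small dispatch step in mount.sh. This is only an illustration: the function name resolve_script_cmd, the SCRIPTS_DIR mount point, and the rule that bare names are looked up in the scripts directory are all assumptions, not the actual X5 implementation.

```shell
# Resolve the value of script_cmd= to the command that should be started.
# Absolute paths (such as /bin/bash) are used as-is; bare names (such as
# master-install.sh) are looked up in the mounted scripts directory.
# SCRIPTS_DIR is an assumed mount point, not a path from the article.
SCRIPTS_DIR=${SCRIPTS_DIR:-/mnt/toolkit/scripts}

resolve_script_cmd() {
    case $1 in
        "") return 1 ;;                         # nothing requested
        /*) printf '%s\n' "$1" ;;               # absolute path, run directly
        *)  printf '%s\n' "$SCRIPTS_DIR/$1" ;;  # script from the NFS share
    esac
}

# In a real mount.sh the surrounding steps might look like this
# (assumed, shown as comments because they need root and a network):
#   mount -t nfs -o nolock "$nfs_toolkit_ip:/$nfs_toolkit_path" /mnt/toolkit
#   exec "$(resolve_script_cmd "$script_cmd")"
```

With this rule, script_cmd=/bin/bash yields a shell for debugging, while script_cmd=master-install.sh starts the script from the mounted share.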

In the case of automatic configuration, the necessary directories are mounted from the install server, which contain:

  • scripts;
  • saved BIOS/UEFI templates of various servers;
  • firmware;
  • server utilities;
  • logs.

Next, the mount.sh script passes control to the master-install.sh script from the scripts directory.

The script tree (in execution order) looks roughly like this:

  • master-install
  • sharefunctions (shared functions)
  • info (information output)
  • models (sets installation parameters based on the server model)
  • prepare_utils (installs the necessary utilities)
  • fwupdate (firmware update)
  • diag (basic diagnostics)
  • biosconf (BIOS/UEFI setup)
  • clockfix (sets the motherboard clock)
  • srmconf (remote management interface setup)
  • raidconf (logical volume configuration)

then one of:

  • preinstall (transfer of control to the OS or hypervisor installer, such as ESXi)
  • merged-install (immediate start of image unpacking)
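Running the stages in a fixed order and stopping at the first failure could be sketched like this. The run_stages helper and the convention that every stage lives in its own <name>.sh file are assumptions for illustration; in practice a file like sharefunctions would more likely be sourced than executed.

```shell
# Run provisioning stages in a fixed order and stop at the first failure.
# Usage: run_stages DIR STAGE...
# Assumes each stage is a script named DIR/STAGE.sh (hypothetical layout).
run_stages() {
    dir=$1
    shift
    for stage in "$@"; do
        echo "=== stage: $stage"
        if ! sh "$dir/$stage.sh"; then
            echo "stage $stage failed, aborting" >&2
            return 1
        fi
    done
}
```

A master script could then call, for example, run_stages /mnt/toolkit/scripts info models prepare_utils fwupdate diag biosconf clockfix srmconf raidconf, where the directory is again an assumed mount point.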

Now we know:

  • how to boot the server via PXE;
  • how to transfer control to your own script.


Let's continue. The following questions now arise:

  • How to identify the server we are provisioning?
  • Which utilities to use, and how to configure the server?
  • How to get settings for a specific server?

How to identify the server we are provisioning?

It's simple - DMI:

dmidecode -s system-product-name
dmidecode -s system-manufacturer
dmidecode -s system-serial-number

Everything you need is there: vendor, model, serial number. If you are not sure that all servers carry this information, you can identify them by MAC address instead, or use both methods at once if the servers come from different vendors and some models simply have no serial number recorded.
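The fallback from serial number to MAC address might look roughly like this. The function name pick_server_id, the sn-/mac- prefixes, and the list of placeholder serial strings are assumptions; the placeholder list in particular should be adjusted to what your hardware actually reports.

```shell
# Choose an identifier for a server: the DMI serial number when it looks
# usable, otherwise the MAC address of the boot interface.
# The placeholder strings below are assumed examples, not a complete list.
pick_server_id() {
    serial=$1
    mac=$2
    case $serial in
        ""|"Not Specified"|"To Be Filled By O.E.M."|"Default string")
            # No trustworthy serial: fall back to the MAC, lower-cased.
            printf 'mac-%s\n' "$(printf '%s' "$mac" | tr 'A-F' 'a-f')"
            ;;
        *)
            printf 'sn-%s\n' "$serial"
            ;;
    esac
}

# On a live system the arguments could come from (needs root):
#   pick_server_id "$(dmidecode -s system-serial-number)" \
#                  "$(cat /sys/class/net/eth0/address)"
```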

Based on the information received, network folders are mounted from the install server and everything necessary is loaded (utilities, firmware, etc.).

Which utilities to use, and how to configure the server?

Here are Linux utilities for some manufacturers. All of them are available on the vendors' official websites.

With firmware, I think everything is clear. It usually ships as packaged executables; the executable controls the firmware update process and reports a return code.

BIOS and IPMI are usually configured through templates. If necessary, the template can be edited before it is loaded.

Some vendors' RAID utilities can also apply a configuration from a template. If yours cannot, you will have to write a configuration script.

The RAID setup procedure is most often as follows:

  • Request the current configuration.
  • If logical arrays already exist, erase them.
  • Check which physical disks are present and how many there are.
  • Create a new logical array, aborting the process on any error.
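As a vendor-neutral skeleton, the procedure can be sketched as below. The four raid_* helpers are hypothetical wrappers that you would implement around your vendor's CLI; none of them is a real utility, and the minimum disk count is passed in because it depends on the RAID level you choose.

```shell
# Skeleton of the RAID procedure. The raid_* helpers are hypothetical
# wrappers around a vendor CLI and must be defined by the caller:
#   raid_list_volumes   - print one existing logical volume ID per line
#   raid_delete_volume  - delete the logical volume given as $1
#   raid_list_disks     - print one physical disk ID per line
#   raid_create_volume  - create a volume of level $1 from disks $2...
# Usage: configure_raid LEVEL MIN_DISKS
configure_raid() {
    level=$1
    min_disks=$2

    # Steps 1-2: query the current configuration, erase existing volumes.
    volumes=$(raid_list_volumes) || return 1
    for vol in $volumes; do
        raid_delete_volume "$vol" || return 1
    done

    # Step 3: collect the physical disks and count them.
    disks=$(raid_list_disks) || return 1
    count=0
    for disk in $disks; do
        count=$((count + 1))
    done
    if [ "$count" -lt "$min_disks" ]; then
        echo "need at least $min_disks disks, found $count" >&2
        return 1
    fi

    # Step 4: create the new volume; any error aborts the procedure.
    # Word splitting of $disks is intended: one argument per disk.
    raid_create_volume "$level" $disks || return 1
}
```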

How to get settings for a specific server?

Let's assume that the settings of all servers are stored on the install server. In that case, to answer our question, we must first decide how the settings get onto the install server.

At first, it is quite possible to get by with text files. (Later on, a text file can remain as a backup method for transferring settings.)

You can "share" a text file on the install server and add its mounting to the mount.sh script.

The lines will look like this:

<serial number> <hostname> <subnet>

These lines are added by an engineer from their workstation. Later, during server setup, the parameters for a particular server are read from the file.
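Reading the parameters for one server from such a file could be sketched with awk. The function name lookup_server and the sample values in the usage note are assumptions; only the three-column line format comes from the text above.

```shell
# Look up a server in the settings file by serial number.
# Each line has the form: <serial number> <hostname> <subnet>
# Usage: lookup_server FILE SERIAL
# Prints "<hostname> <subnet>" for the first match; fails if there is none.
lookup_server() {
    awk -v sn="$2" '
        ($1 "") == (sn "") { print $2, $3; found = 1; exit }  # string compare
        END                { exit !found }                    # status 1 if no match
    ' "$1"
}
```

For a file containing the line "CZ1234ABCD srv-app-01 192.168.10.0/24" (made-up values), lookup_server settings.txt CZ1234ABCD would print the hostname and subnet, which the calling script can then split with read.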

In the long run, however, it is better to use a database to store the settings, states, and logs of server installations.

Of course, a database alone is not enough: you will also need a client part through which the settings are entered into the database. This is harder to implement than a text file, but not as hard as it seems. A minimal client that simply writes data to the database is quite feasible to write yourself, and later you can improve the client at your own pace (reports, label printing, notifications, and whatever else comes to mind).

By querying the database with the server's serial number, we get the parameters needed to configure that server.

In addition, we do not need to invent locks for simultaneous access, as we would with a text file.

We can write the configuration log to the database at every stage and control the installation process through events and flags of the preparation stages.

Now we know how to:

  • boot the server via PXE;
  • transfer control to our script;
  • identify the server to be provisioned by serial number;
  • configure the server with appropriate utilities;
  • transfer settings to the install server database using the client part.

We have also found out how:

  • the server to be installed receives the necessary settings from the database;
  • all preparation progress is recorded in the database (logs, events, stage flags).

What about different types of installed software? How to install a hypervisor, copy a VM and set it all up?

In the case of deploying a file system image (Linux) to hardware, everything is quite simple:

  • After configuring all server components, deploy the image.
  • Install the grub bootloader.
  • chroot into the deployed system and configure whatever is necessary.

How to transfer control to the OS installer (using ESXi as an example)?

Let's arrange the transfer of control from our script to the hypervisor installer using an answer file (kickstart):

  • Delete the current partitions on the disk.
  • Create a 500 MB partition.
  • Mark it as bootable.
  • Format it as FAT32.
  • Copy the ESXi installation files to its root.
  • Install syslinux.
  • Copy syslinux.cfg to /syslinux/:

default esxi
prompt 1
timeout 50
label esxi
kernel mboot.c32
append -c boot.cfg

  • Copy mboot.c32 to /syslinux.
  • boot.cfg must contain kernelopt=ks=ftp:// /ks_esxi.cfg
  • Reboot the server.

After the reboot, the server starts the ESXi installer from its hard drive. All the necessary installer files are loaded into memory, and the ESXi installation then proceeds according to the specified answer file.

Here are a few lines from the ks_esxi.cfg answer file:

%firstboot --interpreter=busybox
…
# ΠΏΠΎΠ»ΡƒΡ‡Π°Π΅ΠΌ сСрийный Π½ΠΎΠΌΠ΅Ρ€

SYSSN=$(esxcli hardware platform get | grep Serial | awk -F " " '{print $3}')

# get the IP

IPADDRT=$(esxcli network ip interface ipv4 get | grep vmk0 | awk -F " " '{print $2}')
LAST_OCTET=$(echo $IPADDRT | awk -F'.' '{print $4}')

# mount the install server's NFS share

esxcli storage nfs add -H is -s /srv/nfs_share -v nfsshare1

# copy temporary ssh settings so that the ssh client can be used

mv /etc/ssh /etc/ssh.tmp
cp -R /vmfs/volumes/nfsshare1/ssh /etc/
chmod go-r /etc/ssh/ssh_host_rsa_key

# copy ovftool to deploy the VMs now; it may also come in handy later

cp -R /vmfs/volumes/nfsshare1/ovftool /vmfs/volumes/datastore1/

# deploy the VMs

/vmfs/volumes/datastore1/ovftool/tools/ovftool --acceptAllEulas --noSSLVerify --datastore=datastore1 --name=VM1 /vmfs/volumes/nfsshare1/VM_T/VM1.ova vi://root:[email protected]
/vmfs/volumes/datastore1/ovftool/tools/ovftool --acceptAllEulas --noSSLVerify --datastore=datastore1 --name=VM2 /vmfs/volumes/nfsshare1/VM_T/VM2.ova vi://root:[email protected]

# get the row with our server's settings
# (the inner double quotes are escaped so the query reaches the remote shell as one argument)

ssh root@is "mysql -h'192.168.0.1' -D'servers' -u'user' -p'secretpassword' -e \"SELECT ... WHERE servers.serial='$SYSSN'\"" | grep -v '^$' | sed 's/NULL//g' > /tmp/servers
...
# generate the network configuration script

echo '#!/bin/sh' > /vmfs/volumes/datastore1/netconf.sh
echo "esxcli network ip interface ipv4 set -i=vmk0 -t=static --ipv4=$IPADDR --netmask=$S_SUB || exit 1" >> /vmfs/volumes/datastore1/netconf.sh
echo "esxcli network ip route ipv4 add -g=$S_GW -n=default || exit 1" >> /vmfs/volumes/datastore1/netconf.sh
chmod a+x /vmfs/volumes/datastore1/netconf.sh

# set the guestinfo.esxihost.id parameter to the serial number
# (the quotes around the value are escaped so they end up in the .vmx file)

echo "guestinfo.esxihost.id = \"$SYSSN\"" >> /vmfs/volumes/datastore1/VM1/VM1.vmx
echo "guestinfo.esxihost.id = \"$SYSSN\"" >> /vmfs/volumes/datastore1/VM2/VM2.vmx
...
# update the information in the database

SYSNAME=$(esxcli hardware platform get | grep Product | sed 's/Product Name://' | sed 's/^ *//')
UUID=$(vim-cmd hostsvc/hostsummary | grep uuid | sed 's/ //g;s/,$//' | sed 's/^uuid="//;s/"$//')
ssh root@is "mysql -D'servers' -u'user' -p'secretpassword' -e \"UPDATE servers ... SET ... WHERE servers.serial='$SYSSN'\""
ssh root@is "mysql -D'servers' -u'user' -p'secretpassword' -e \"INSERT INTO events ...\""

# restore the SSH settings

rm -rf /etc/ssh
mv /etc/ssh.tmp /etc/ssh

# configure the network and reboot

esxcli system hostname set --fqdn=esx-${G_NICK}.x5.ru
/vmfs/volumes/datastore1/netconf.sh
reboot

At this stage, the hypervisor is installed and configured, the virtual machines are copied.

How to set up virtual machines now?

We cheated a little: during the installation, we set the guestinfo.esxihost.id = "$SYSSN" parameter in the VM1.vmx file, indicating the serial number of the physical server in it.

Now, once the virtual machine has started (with the vmware-tools package installed), it can read this parameter:

ESXI_SN=$(vmtoolsd --cmd "info-get guestinfo.esxihost.id")

That is, the VM can identify itself (it knows the serial number of the physical host), query the install server database, and obtain the parameters it needs to configure. All of this is wrapped in a script that must run automatically when the guest OS starts, but only once (RunOnce).
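The "run only once" behaviour can be sketched with a marker file. The run_once helper, the marker path, and the configure-vm.sh script named in the usage note are assumptions for illustration; how the wrapper is hooked into guest startup (for example via a systemd unit or rc.local) is left open.

```shell
# Run a command only if it has not completed successfully before.
# Usage: run_once MARKER_FILE COMMAND [ARG...]
# Hypothetical helper: the marker file records a successful first run.
run_once() {
    marker=$1
    shift
    if [ -e "$marker" ]; then
        return 0          # already configured on an earlier boot
    fi
    "$@" || return 1      # on failure leave no marker, so the next boot retries
    : > "$marker"         # create the empty marker file
}
```

In the guest, this might be called as run_once /var/lib/vm-configured /usr/local/sbin/configure-vm.sh, with the script itself reading ESXI_SN through vmtoolsd as shown above.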

Now we know how to:

  • boot the server via PXE;
  • transfer control to our script;
  • identify the server to be provisioned by serial number;
  • configure the server with appropriate utilities;
  • transfer settings to the install server database using the client part;
  • configure various types of software, including deploying the ESXi hypervisor and configuring virtual machines (all of it automatically).

We have also found out how:

  • the server to be installed receives the necessary settings from the database;
  • all preparation progress is recorded in the database (logs, events, stage flags).


The bottom line:

I believe this solution stands out for its flexibility, simplicity, capabilities, and versatility.

Please write in the comments what you think.

Source: habr.com
