Creation of fault-tolerant IT infrastructure. Part 1 - Preparing to Deploy an oVirt 4.3 Cluster
In this short series of articles, readers are invited to explore the principles of building a fault-tolerant infrastructure for a small enterprise within a single data center, which will be discussed in detail.
Introduction
A data center (Data Processing Center) here can mean:
your own rack in your own "server room" on the enterprise premises, which meets the minimum requirements for powering and cooling the equipment, and has Internet access through two independent providers;
a rented rack with your own equipment in a real data center - so-called colocation in a Tier III or IV facility, which guarantees reliable power, cooling and failover Internet access;
fully leased equipment in a Tier III or IV data center.
Which option to choose is individual in each case, and usually depends on several main factors:
why does an enterprise need its own IT infrastructure at all;
what exactly does the enterprise want from the IT infrastructure (reliability, scalability, manageability, etc.);
the amount of initial investment in IT infrastructure, and the type of costs involved - capital (buying your own equipment) or operating (equipment is usually rented);
the planning horizon of the enterprise itself.
Much can be written about the factors influencing an enterprise's decision to create and use its own IT infrastructure, but our goal is to show in practice how to build this infrastructure so that it is fault-tolerant while still saving money - by reducing the cost of commercial software, or avoiding it altogether.
As long practice shows, it is not worth skimping on hardware: the miser pays twice, and often much more. Still, good hardware is just a recommendation; what exactly to buy, and for how much, depends on the capabilities of the enterprise and the "greed" of its management. The word "greed" should be understood here in the good sense, since it is better to invest in hardware at the initial stage than to face serious problems with its support and scaling later: incorrect initial planning and excessive savings can lead to higher costs than at the start of the project.
So, the initial data for the project:
there is an enterprise that has decided to create its own web portal and bring its activities to the Internet;
the company decided to rent a rack to accommodate its equipment in a good data center certified according to the Tier III standard;
the company decided not to save much on hardware, and therefore bought the following equipment with extended warranties and support:
Equipment list
two physical Dell PowerEdge R640 servers as follows:
two Intel Xeon Gold 5120 processors
512 GB RAM
two SAS disks in RAID1, for OS installation
built-in 4-port 1G network card
two 2-port 10G network cards
one 2-port FC HBA 16G.
Dell MD3820f storage with dual controllers, connected via 16G FC directly to the Dell hosts;
two Layer 2 switches - Cisco WS-C2960X-48FPS-L, combined into a stack;
two Layer 3 switches - Cisco WS-C3850-24T-E, combined into a stack;
Rack, UPS, PDU, console servers - provided by the data center.
As we can see, the existing equipment has good prospects for horizontal and vertical scaling if the enterprise manages to compete with other companies of a similar profile on the Internet and starts to make a profit that can be invested in expanding resources for further competition and profit growth.
What equipment can we add if the enterprise decides to increase the performance of our computing cluster:
we have a large reserve in terms of the number of ports on the 2960X switches, which means we can add more hardware servers;
buy two FC switches to connect storage systems and additional servers to them;
existing servers can be upgraded - add memory, replace processors with more efficient ones, connect to a 10G network with existing network adapters;
you can add additional disk shelves to the storage system with the required type of disks - SAS, SATA or SSD, depending on the planned load;
after adding FC switches, you can purchase another storage system to add even more disk capacity, and if you purchase a special Remote Replication option to it, you can configure data replication between storage systems both within the boundaries of one data center and between data centers (but this is already beyond the scope of the article);
there are also third-level switches - Cisco 3850, which can be used as a fault-tolerant network core for high-speed routing between internal networks. This will help a lot in the future, as the internal infrastructure grows. The 3850 also has 10G ports that can be used later when upgrading network equipment to 10G speed.
Since nowadays there is no getting around virtualization, we will certainly follow the trend, especially since it is a great way to reduce the cost of buying expensive servers for individual infrastructure elements (web servers, databases, etc.), which are not always used optimally under low load - and that is exactly the situation at the start of the project.
In addition, virtualization has many other advantages that can be very useful to us: VM fault tolerance from a hardware server failure, Live migration between cluster hardware nodes for their maintenance, manual or automatic load distribution between cluster nodes, etc.
For the hardware purchased by the enterprise, deploying a highly available VMware vSphere cluster suggests itself, but since any software from VMware is known for its "sky-high" price tags, we will use absolutely free virtualization management software - oVirt, on the basis of which the well-known, but already commercial, product RHEV is built.
The oVirt software is needed to combine all the elements of the infrastructure into a single whole, so that we can conveniently work with highly available virtual machines - databases, web applications, proxy servers, load balancers, log collection and analytics servers, etc. - that is, everything that the web portal of our enterprise consists of.
Summing up this introduction, the following articles await us, which will show in practice exactly how to deploy the entire hardware and software infrastructure of an enterprise:
List of articles
Part 1. Preparing to Deploy an oVirt 4.3 Cluster.
Part 2. Installing and configuring an oVirt 4.3 cluster.
Part 3. Setting up a VyOS cluster, organizing fault-tolerant external routing.
Part 4. Setting up the Cisco 3850 stack, organizing intranet routing.
Part 1. Preparing to Deploy an oVirt 4.3 Cluster
Basic host setup
Installing and configuring the OS is the easiest step. There are a lot of articles on how to properly install and configure the OS, so it makes no sense to try to give out something exclusive about this.
So, we have two Dell PowerEdge R640 hosts on which we need to install the OS and perform preliminary settings in order to use them as hypervisors to run virtual machines in an oVirt 4.3 cluster.
Since we plan to use the free non-commercial software oVirt, we chose CentOS 7.7 as the host OS, although other operating systems can also be installed on oVirt hosts:
a special build based on RHEL - the so-called oVirt Node;
Oracle Linux - in the summer of 2019, support for running oVirt on it was announced.
Before installing the OS, it is recommended:
configure the iDRAC network interface on both hosts;
update firmware for BIOS and iDRAC to the latest versions;
configure the System Profile of the server, preferably in Performance mode;
configure RAID from local disks (RAID1 is recommended) to install the OS on the server.
Then we install the OS on the disk created earlier through iDRAC - the installation process is normal, there are no special moments in it. You can also access the server console to start OS installation via iDRAC, although nothing prevents you from connecting a monitor, keyboard and mouse directly to the server and installing the OS from a flash drive.
After installing the OS, we perform its initial settings. In particular, we switch to the classic network service and disable NetworkManager, which can interfere with the bond and bridge configuration:
# enable and start the classic network scripts
systemctl enable network.service
systemctl start network.service
systemctl status network.service
# stop and disable NetworkManager
systemctl stop NetworkManager
systemctl disable NetworkManager
systemctl status NetworkManager
For the initial OS setup, you need to configure any network interface on the server so that you can access the Internet to update the OS and install the necessary software packages. This can be done both during the OS installation process and after it.
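As an illustration only - the interface name and addressing below are assumptions, not taken from the project - a temporary static configuration of one port might look like this:

```
# /etc/sysconfig/network-scripts/ifcfg-em1  (hypothetical interface and addresses)
DEVICE=em1
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
IPADDR=192.168.0.21
PREFIX=24
GATEWAY=192.168.0.1
DNS1=8.8.8.8
```

After editing the file, restart the network service (systemctl restart network) for the settings to take effect.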
All of the above settings and set of software is a matter of personal preference, and this set is only a recommendation.
Since our host will play the role of a hypervisor, we will enable the desired performance profile:
systemctl enable tuned
systemctl start tuned
systemctl status tuned
tuned-adm profile
tuned-adm profile virtual-host
You can read more about the performance profile in "Chapter 4".
After installing the OS, we move on to the next part - configuring network interfaces on hosts, and a stack of Cisco 2960X switches.
Configuring a Cisco 2960X Switch Stack
In our project, the following VLANs - broadcast domains isolated from each other - will be used to separate the different types of traffic:
VLAN 10 - Internet;
VLAN 17 - Management (iDRAC, storage and switch management);
VLAN 32 - VM production network;
VLAN 33 - interconnection network (to external contractors);
VLAN 34 - VM test network;
VLAN 35 - VM developer network;
VLAN 40 - monitoring network.
Before starting work, let's give a diagram at the L2 level, which we should eventually come to:
For network interaction of oVirt hosts and virtual machines with each other, as well as for managing our storage system, it is necessary to configure a stack of Cisco 2960X switches.
Dell hosts have built-in 4-port network cards, so it is advisable to organize a fault-tolerant connection of each host to the Cisco 2960X stack by grouping physical network ports into a logical interface using the LACP (802.3ad) protocol:
the first two ports on the host are combined into a bond and connected to the 2960X switch - on this logical interface a bridge will be configured with an address for host management, monitoring and communication with the other hosts in the oVirt cluster; it will also be used for Live migration of virtual machines;
the second two ports on the host are also combined into a bond and connected to the 2960X - on this logical interface, oVirt will later create bridges (in the corresponding VLANs) to which virtual machines will connect;
both network ports within each logical interface will be active, i.e. traffic can be transmitted over them simultaneously, in balancing mode;
network settings on cluster nodes must be exactly the same, except for IP addresses.
Basic setup of the 2960X switch stack and its ports
Beforehand, our switches should be:
rack mounted;
connected with two special cables of the required length, for example, CAB-STK-E-1M;
connected to the power supply;
connected to the administrator's workstation via the console port for their initial configuration.
The necessary guidance for this is available on the manufacturer's official website.
After completing the above steps, we configure the switches.
Deciphering what each command means is beyond the scope of this article; if necessary, all the information can be found independently.
Our goal is to quickly set up a switch stack and connect hosts and storage management interfaces to it.
1) We connect to the master switch, go to the privileged mode, then go to the configuration mode and make the basic settings.
Basic switch config:
enable
configure terminal
hostname 2960X
no service pad
service timestamps debug datetime msec
service timestamps log datetime localtime show-timezone msec
no service password-encryption
service sequence-numbers
switch 1 priority 15
switch 2 priority 14
stack-mac persistent timer 0
clock timezone MSK 3
vtp mode transparent
ip subnet-zero
vlan 17
name Management
vlan 32
name PROD
vlan 33
name Interconnect
vlan 34
name Test
vlan 35
name Dev
vlan 40
name Monitoring
spanning-tree mode rapid-pvst
spanning-tree etherchannel guard misconfig
spanning-tree portfast bpduguard default
spanning-tree extend system-id
spanning-tree vlan 1-40 root primary
spanning-tree loopguard default
vlan internal allocation policy ascending
port-channel load-balance src-dst-ip
errdisable recovery cause loopback
errdisable recovery cause bpduguard
errdisable recovery interval 60
line con 0
session-timeout 60
exec-timeout 60 0
logging synchronous
line vty 5 15
session-timeout 60
exec-timeout 60 0
logging synchronous
ip http server
ip http secure-server
no vstack
interface Vlan1
no ip address
shutdown
exit
Save the config with the "wr mem" command and restart the switch stack with the "reload" command on master switch 1.
2) We configure the network ports of the switch in access mode in VLAN 17, to connect the management interfaces of the storage systems and the iDRAC of the servers.
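The port listing itself did not survive here; as a sketch - the specific port range and description are assumptions - such an access-port configuration might look like this:

```
interface range GigabitEthernet1/0/20 - 24
 description Management-VLAN17
 switchport mode access
 switchport access vlan 17
 spanning-tree portfast
 exit
```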
3) After reloading the stack, check that it works correctly:
Checking the functioning of the stack:
2960X#show switch stack-ring speed
Stack Ring Speed : 20G
Stack Ring Configuration: Full
Stack Ring Protocol : FlexStack
2960X#show switch stack-ports
Switch # Port 1 Port 2
-------- ------ ------
1 Ok Ok
2 Ok Ok
2960X#show switch neighbors
Switch # Port 1 Port 2
-------- ------ ------
1 2 2
2 1 1
2960X#show switch detail
Switch/Stack Mac Address : 0cd0.f8e4.XXXX
Mac persistency wait time: Indefinite
H/W Current
Switch# Role Mac Address Priority Version State
----------------------------------------------------------
*1 Master 0cd0.f8e4.XXXX 15 4 Ready
2 Member 0029.c251.XXXX 14 4 Ready
Stack Port Status Neighbors
Switch# Port 1 Port 2 Port 1 Port 2
--------------------------------------------------------
1 Ok Ok 2 2
2 Ok Ok 1 1
4) Setting up SSH access to the 2960X stack
To manage the stack remotely via SSH, we will use IP 172.20.1.10 configured on SVI (switch virtual interface) VLAN17.
Although it is desirable to use a dedicated management port on the switch, this is a matter of personal preference and availability.
Setting up SSH access to the switch stack:
ip default-gateway 172.20.1.2
interface vlan 17
ip address 172.20.1.10 255.255.255.0
hostname 2960X
ip domain-name hw.home-lab.ru
no ip domain-lookup
clock set 12:47:04 06 Dec 2019
crypto key generate rsa
ip ssh version 2
ip ssh time-out 90
line vty 0 4
session-timeout 60
exec-timeout 60 0
privilege level 15
logging synchronous
transport input ssh
line vty 5 15
session-timeout 60
exec-timeout 60 0
privilege level 15
logging synchronous
transport input ssh
aaa new-model
aaa authentication login default local
username cisco privilege 15 secret my_ssh_password
Set up a password to enter privileged mode:
enable secret *myenablepassword*
service password-encryption
Set up NTP:
ntp server 85.21.78.8 prefer
ntp server 89.221.207.113
ntp server 185.22.60.71
ntp server 192.36.143.130
ntp server 185.209.85.222
show ntp status
show ntp associations
show clock detail
5) Set up the logical Etherchannel interfaces and the physical ports connected to the hosts. For ease of configuration, all available VLANs are allowed on all logical interfaces, but it is generally recommended to allow only what is actually needed:
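The original listing was lost during extraction; a minimal sketch for one host bond - one port taken from each stack member for fault tolerance, with hypothetical descriptions - might look like this:

```
interface Port-channel1
 description Guest1-mgmt-bond
 switchport mode trunk
 switchport trunk allowed vlan 17,32,33,34,35,40

interface range GigabitEthernet1/0/1, GigabitEthernet2/0/1
 description Guest1-bond1-members
 switchport mode trunk
 switchport trunk allowed vlan 17,32,33,34,35,40
 channel-group 1 mode active
```

Here "channel-group 1 mode active" enables LACP negotiation on the member ports and binds them to Port-channel1.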
After completing the settings on the stack 2960X and hosts, restart the network on the hosts, and check the operability of the logical interface.
on host:
systemctl restart network
cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2+3 (2)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
...
802.3ad info
LACP rate: fast
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
...
Slave Interface: em2
MII Status: up
Speed: 1000 Mbps
Duplex: full
...
Slave Interface: em3
MII Status: up
Speed: 1000 Mbps
Duplex: full
on the switch stack 2960X:
2960X#show lacp internal
Flags: S - Device is requesting Slow LACPDUs
F - Device is requesting Fast LACPDUs
A - Device is in Active mode P - Device is in Passive mode
Channel group 1
LACP port Admin Oper Port Port
Port Flags State Priority Key Key Number State
Gi1/0/1 SA bndl 32768 0x1 0x1 0x102 0x3D
Gi2/0/1 SA bndl 32768 0x1 0x1 0x202 0x3D
2960X#sh etherchannel summary
Flags: D - down P - bundled in port-channel
I - stand-alone s - suspended
H - Hot-standby (LACP only)
R - Layer3 S - Layer2
U - in use N - not in use, no aggregation
f - failed to allocate aggregator
M - not in use, minimum links not met
m - not in use, port not aggregated due to minimum links not met
u - unsuitable for bundling
w - waiting to be aggregated
d - default port
A - formed by Auto LAG
Number of channel-groups in use: 11
Number of aggregators: 11
Group Port-channel Protocol Ports
------+-------------+-----------+-----------------------------------------------
1 Po1(SU) LACP Gi1/0/1(P) Gi2/0/1(P)
Initial configuration of network interfaces for managing cluster resources, on hosts Guest1 and Guest2
Configuring the logical interface BOND1 for management on the hosts, and its physical interfaces:
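The configuration files themselves were not preserved here; a sketch under assumed names and addressing - bond1 over em2/em3, with the management address in VLAN 17 being an assumption for this example - might look like this:

```
# /etc/sysconfig/network-scripts/ifcfg-bond1 - the logical LACP interface
DEVICE=bond1
TYPE=Bond
BONDING_MASTER=yes
BONDING_OPTS="mode=802.3ad miimon=100 lacp_rate=fast xmit_hash_policy=layer2+3"
BOOTPROTO=none
ONBOOT=yes

# /etc/sysconfig/network-scripts/ifcfg-bond1.17 - VLAN 17 subinterface with the management address
DEVICE=bond1.17
VLAN=yes
BOOTPROTO=none
IPADDR=172.20.1.163
PREFIX=24
GATEWAY=172.20.1.2
ONBOOT=yes

# /etc/sysconfig/network-scripts/ifcfg-em2 - physical member (em3 is identical except DEVICE)
DEVICE=em2
TYPE=Ethernet
MASTER=bond1
SLAVE=yes
BOOTPROTO=none
ONBOOT=yes
```

The bonding options match the /proc/net/bonding/bond1 output shown above (802.3ad mode, fast LACP rate, layer2+3 hash policy); oVirt will later create its own bridges on top of these interfaces.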
We restart the network on the hosts and check their visibility to each other.
This completes the configuration of the Cisco 2960X switch stack, and if everything was done correctly, now we have network connectivity of all infrastructure elements to each other at the L2 level.
Dell MD3820f storage setup
Before starting the storage configuration, its management interfaces must already be connected to the Cisco 2960X switch stack, and the storage itself must be connected to hosts Guest1 and Guest2 via FC.
The general scheme of how the storage system should be connected to the switch stack was given in the previous chapter.
The scheme for connecting storage via FC to hosts should look like this:
During the connection, it is necessary to write down the WWPN addresses for the FC HBA hosts connected to the FC ports on the storage system - this will be necessary for the subsequent configuration of host binding to LUNs on the storage system.
Download and install the Dell MD3820f storage management utility on the administrator workstation - PowerVault Modular Disk Storage Manager (MDSM).
We connect to it via its default IP addresses, and then configure our own addresses from VLAN17 to manage the controllers via TCP/IP:
Storage1:
ControllerA IP - 172.20.1.13, MASK - 255.255.255.0, Gateway - 172.20.1.2
ControllerB IP - 172.20.1.14, MASK - 255.255.255.0, Gateway - 172.20.1.2
After setting up the addresses, we go to the storage management interface and set a password, set the time, update the firmware for controllers and disks, if necessary, etc.
How this is done is described in administration guide storage.
After making the above settings, we only need to do a few things:
Configure host FC port IDs - Host Port Identifiers.
Create a host group - Host Group - and add our two Dell hosts to it.
Create a disk group and virtual disks (or LUNs) in it, which will be presented to hosts.
Configure the presentation of virtual disks (or LUNs) for hosts.
Adding new hosts and binding the identifiers of their FC ports to them is done through the menu - Host Mappings -> Define -> Hosts…
The WWPN addresses of the FC HBA hosts can be found, for example, in the server's iDRAC.
As a result, we should get something like this picture:
Adding a new host group and binding hosts to it is done through the menu - Host Mappings -> Define -> Host Group…
For hosts, select the type of OS - Linux (DM-MP).
After creating a host group, through the tab Storage & Copy Services, create a disk group - disk group, with a type depending on the requirements for fault tolerance, for example, RAID10, and in it virtual disks of the required size:
And finally, the final stage is the presentation of virtual disks (or LUNs) for hosts.
To do this, through the menu - Host Mappings -> LUN Mapping -> Add… - we bind the virtual disks to the hosts, assigning numbers to them.
Everything should look like this screenshot:
This is where we finish with the storage setup, and if everything was done correctly, then the hosts should see the LUNs presented to them through their FC HBAs.
Let's force the system to update information about connected drives:
ls -la /sys/class/scsi_host/
for host in /sys/class/scsi_host/host*; do echo "- - -" > "$host/scan"; done
Let's see what devices are visible on our servers:
On the hosts, you can additionally configure multipath; although oVirt can do this itself during installation, it is better to verify beforehand that MP works correctly.
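As a sketch - the device-section values below are taken from common DM Multipath practice for the Dell MD series and should be verified against the storage documentation - a minimal /etc/multipath.conf might look like this:

```
# /etc/multipath.conf - minimal sketch; verify the device values against the Dell MD38xx documentation
defaults {
    user_friendly_names yes
    find_multipaths     yes
}

devices {
    device {
        vendor               "DELL"
        product              "MD38xx(i|f)"
        path_grouping_policy group_by_prio
        prio                 rdac
        path_checker         rdac
        hardware_handler     "1 rdac"
        failback             immediate
        no_path_retry        30
    }
}
```

After writing the config, enable the service with "systemctl enable --now multipathd" and check the assembled paths with "multipath -ll".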
As you can see, all three virtual disks on the storage system are visible via two paths each. Thus, all the preparatory work has been completed, and we can proceed to the main part - setting up the oVirt cluster, which will be discussed in the next article.