OpenNebula. Short Notes

Hi all. This article is for those who are still torn between virtualization platforms, perhaps after reading yet another article in the series “We installed Proxmox and everything is fine, 6 years of uptime, not a single outage.” After installing one boxed solution or another, questions arise: how do I tweak this so that monitoring is clearer, and that so that backups are under control? Then the time comes when you realize you want something more functional, or you want your system to be transparent inside rather than a black box, or you want more than a hypervisor and a bunch of virtual machines. This article contains some reflections and practice based on the OpenNebula platform. I chose it because it is not demanding on resources and its architecture is not that complicated.

As we can see, many cloud providers run on KVM and build an external control layer around it to manage machines. Large hosting providers, Yandex for example, clearly write their own tooling for their cloud infrastructure. Others take OpenStack and build on top of it: Selectel, Mail.ru. But if you have your own hardware and a small team, you usually pick something ready-made, such as VMware or Hyper-V; there are free and paid licenses, but that is not the point here. Let's talk about enthusiasts: people who are not afraid to propose and try something new, even though the company has made its position clear with “Who will maintain this after you?” and “Are we then going to roll it out to production? Scary.” Still, you can first try these solutions on a test bench, and if everyone likes the result, raise the question of further development and use in more serious environments.

Here is also a link to a talk by an active contributor to the platform: www.youtube.com/watch?v=47Mht_uoX3A

Some of this article may be superfluous and already obvious to an experienced specialist, and in some places I will not describe everything, because similar commands and descriptions are easy to find online. This is just my experience with the platform. I hope active participants will add in the comments what could be done better and what mistakes I made. Everything was done on a home test bench of 3 PCs with different specifications. I also deliberately do not describe how this software works or how to install it. This is only administration experience and the problems I ran into. Maybe someone will find it useful when making a choice.

So, let's get started. As a system administrator, I care about the following points; without them I am unlikely to use a solution.

1. Installation repeatability

There are plenty of instructions for installing OpenNebula, so you should not have any problems with that. New features appear from version to version, and they do not always work after an upgrade from one version to the next.

2. Monitoring

We will monitor the node itself, KVM, and OpenNebula. Fortunately, the tooling already exists. There are many options for monitoring Linux hosts, such as Zabbix or node exporter, whichever you prefer. At the moment I split it like this: system metrics (temperature where it can be measured, disk array consistency) go through Zabbix, and application metrics go through exporters to Prometheus. For KVM monitoring you can take, for example, the project github.com/zhangjianweibj/prometheus-libvirt-exporter.git and run it through systemd. It works quite well and exposes KVM metrics, and there is also a ready-made dashboard: grafana.com/grafana/dashboards/12538.

For example, here is my file:

/etc/systemd/system/libvirtd_exporter.service
[Unit]
Description=Prometheus libvirt exporter

[Service]
User=node_exporter
ExecStart=/usr/sbin/prometheus-libvirt-exporter --web.listen-address=":9101"

[Install]
WantedBy=multi-user.target

So we have the first exporter. We need a second one to monitor OpenNebula itself; I used this one: github.com/kvaps/opennebula-exporter/blob/master/opennebula_exporter

It can be hooked into the regular node_exporter that monitors the system, as follows.

In the node_exporter unit file, change the start line like this:

ExecStart=/usr/sbin/node_exporter --web.listen-address=":9102" --collector.textfile.directory=/var/lib/opennebula_exporter/textfile_collector

Create the directory: mkdir -p /var/lib/opennebula_exporter/textfile_collector

Take the bash script from the link above and first check it in the console. If it prints what we need (if it fails with an error, install xmlstarlet), copy it to /usr/local/bin/opennebula_exporter.sh

Add a cron task for every minute:

*/1 * * * * (/usr/local/bin/opennebula_exporter.sh > /var/lib/opennebula_exporter/textfile_collector/opennebula.prom)
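One caveat with redirecting straight into the .prom file: node_exporter may scrape the file while the script is still writing it. Below is a minimal sketch of a wrapper that writes to a temporary file and then renames it. The function name write_prom_atomically is made up for illustration and is not part of opennebula-exporter; the paths in the usage comment are the ones used above.

```shell
# Hypothetical helper: run a command and publish its output as a .prom file
# in one atomic step, so a scrape never sees a half-written file.
write_prom_atomically() {
    dest="$1"; shift
    # The temp file lives next to the destination, so mv is a rename on the
    # same filesystem. Its name does not end in .prom, so the textfile
    # collector ignores it while it is being written.
    tmp="$(mktemp "${dest}.XXXXXX")" || return 1
    if "$@" > "$tmp"; then
        chmod 644 "$tmp"
        mv "$tmp" "$dest"
    else
        # Keep the previous metrics file if the exporter script failed.
        rm -f "$tmp"
        return 1
    fi
}

# Intended use from a small cron wrapper script:
# write_prom_atomically \
#     /var/lib/opennebula_exporter/textfile_collector/opennebula.prom \
#     /usr/local/bin/opennebula_exporter.sh
```

If the exporter script fails, the old file stays in place, so the graphs show stale values instead of gaps; whether that is what you want depends on how your alerts are built.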

Metrics start to appear; you can collect them with Prometheus, build graphs, and set up alerts. In Grafana, for example, you can draw a simple dashboard like this.

[Image: Grafana dashboard with OpenNebula metrics]

(you can see here that I overcommitted CPU and RAM)

For those who love and use Zabbix, there is github.com/OpenNebula/addon-zabbix

That is all for monitoring; the main thing is that it exists. Beyond that, there are the built-in virtual machine monitoring tools and exporting data to billing, where everyone has their own vision; I have not yet looked into this closely.

I have not really dealt with logging yet. The simplest option is to add td-agent and parse the /var/log/one directory with regular expressions. For example, the sunstone.log file can be parsed with the nginx regexp, and so can other files that show the history of access to the platform. What is the benefit? For example, we can explicitly track the number of “Error, error” entries and quickly find where and at what level something is failing.
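Before setting up td-agent, the same idea can be tried by hand. This is a rough sketch, not a replacement for proper log shipping: the helper name count_log_errors is hypothetical, and the log directory is passed as an argument.

```shell
# Hypothetical helper: print "<count> <file>" for every *.log file in a
# directory, counting lines that mention "error" in any letter case.
# Files with the most matches come first.
count_log_errors() {
    log_dir="$1"
    for f in "$log_dir"/*.log; do
        [ -f "$f" ] || continue
        # grep -c prints the number of matching lines (0 if there are none).
        printf '%s %s\n' "$(grep -ci 'error' "$f")" "$f"
    done | sort -rn
}

# Example: count_log_errors <OpenNebula log directory> | head
```

This only shows which file is noisy; finding the level at which something breaks still means reading the matching lines.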

3. Backups

There are paid ready-made products, for example SEP: wiki.sepsoftware.com/wiki/index.php/4_4_3_Tigon:OpenNebula_Backup. Keep in mind that simply backing up the machine image is not enough here, because our virtual machines must work with full integration (the context file that describes network settings, the VM name, and custom settings for your applications). So first we decide what we back up and how. In some cases it is better to copy what is inside the VM itself. And maybe you only need to back up one disk of the machine.

For example, suppose we have decided that all machines run with persistent images. After reading docs.opennebula.io/5.12/operation/vm_management/img_guide.html, we can first save a disk image from our VM:

onevm disk-saveas 74 3 prom.qcow2
Image ID: 77

Let's see under what name it was saved:

oneimage show 77
/var/lib/one//datastores/100/f9503161fe180658125a9b32433bf6e8
   
И Π΄Π°Π»Π΅Π΅ ΠΊΠΎΠΏΠΈΡ€ΡƒΠ΅ΠΌ ΠΊΡƒΠ΄Π° Π½Π°ΠΌ Π½Π΅ΠΎΠ±Ρ…ΠΎΠ΄ΠΈΠΌΠΎ. ΠšΠΎΠ½Π΅Ρ‡Π½ΠΎ, Ρ‚Π°ΠΊ сСбС способ. ΠŸΡ€ΠΎΡΡ‚ΠΎ Ρ…ΠΎΡ‚Π΅Π» ΠΏΠΎΠΊΠ°Π·Π°Ρ‚ΡŒ, Ρ‡Ρ‚ΠΎ ΠΈΡΠΏΠΎΠ»ΡŒΠ·ΡƒΡ инструмСнты opennebula ΠΌΠΎΠΆΠ½ΠΎ ΡΡ‚Ρ€ΠΎΠΈΡ‚ΡŒ ΠΏΠΎΠ΄ΠΎΠ±Π½Ρ‹Π΅ Ρ€Π΅ΡˆΠ΅Π½ΠΈΡ.

I also found an interesting talk online, and there is an open-source project of this kind, but it only supports qcow2 storage.

But as we all know, sooner or later you want incremental backups. That is harder, and perhaps management will allocate money for a paid solution. Or you go the other way and accept that the cloud only slices up resources, and provide redundancy at the application level by adding new nodes and virtual machines. Yes, I am saying that the cloud is used purely for running application clusters, while the database runs on another platform or is taken as a managed service from a provider, if possible.

4. Ease of use

In this section I will describe the problems I ran into. Take images, for example. As we know, there are persistent images: when such an image is attached to a VM, all data is written to the image itself. If the image is non-persistent, it is copied to the storage and data is written to the copy; this is how template images work. I repeatedly caused trouble for myself by forgetting to mark an image as persistent, so a 200 GB image started copying. The problem is that this procedure cannot be reliably canceled; you have to go to the node and kill the running “cp” process.

One important drawback is that you cannot undo actions simply through the GUI. Or rather, you cancel them, see that nothing happens, start again, cancel again, and in fact there are now 2 cp processes copying the image.
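To see whether such copies are still running on a node, the process list can be filtered. This is a small sketch that assumes the copy shows up as a plain cp command with a datastore path in its arguments, as it did in my case; the helper name datastore_cp_pids is made up.

```shell
# Hypothetical helper: read `ps -eo pid,args` output on stdin and print the
# PIDs of cp commands whose arguments mention a datastore path.
datastore_cp_pids() {
    awk '($2 == "cp" || $2 ~ /\/cp$/) && $0 ~ /\/datastores\// { print $1 }'
}

# On the node:
# ps -eo pid,args | datastore_cp_pids
# Review the list first, then stop an unwanted copy with: kill <pid>
```

The filter deliberately prints PIDs instead of killing anything, because a legitimate clone of another image may be running at the same time.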

This is when you understand why OpenNebula gives each new instance a new ID. In Proxmox, for example, you create a VM with ID 101, delete it, and then create ID 101 again. That will not happen in OpenNebula: each new instance gets a new ID, and this has its own logic, for example when cleaning up old data or failed deployments.

The same goes for storage: this platform is aimed primarily at centralized storage. There are addons for local storage, but that is not the subject here. I think someone will eventually write an article about how they managed to use local storage on nodes successfully in production.

5. Maximum simplicity

Of course, the further you go, the fewer people will understand you.

On my test bench (3 nodes with NFS storage) everything works fine. But consider a power-loss experiment: if we start a snapshot and cut the power to the node, the database keeps a record that the snapshot exists, although in fact it does not (the action was written to the SQL database first, but the operation itself did not complete). The upside is that a snapshot is created as a separate file with a “parent”, so if something goes wrong and it does not work through the GUI, we can still pick up the qcow2 file and restore from it separately: docs.opennebula.io/5.8/operation/vm_management/vm_instances.html
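The “parent” of a qcow2 file can be read from `qemu-img info`, which prints a "backing file:" line for images that have one. A minimal sketch of extracting it; the helper name backing_file_from_info is hypothetical and the file path in the usage comment is a placeholder.

```shell
# Hypothetical helper: read `qemu-img info <file>` output on stdin and print
# the backing file name, or nothing if the image has no parent.
backing_file_from_info() {
    sed -n 's/^backing file: //p' | sed 's/ (actual path: .*)$//' | head -n 1
}

# On the node (placeholder path):
# qemu-img info /path/to/snapshot.qcow2 | backing_file_from_info
# The whole chain can be printed with: qemu-img info --backing-chain <file>
```

Knowing the chain before copying files around matters, because a snapshot file is useless without the parent it refers to.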

Networking, unfortunately, is not so simple, though it is at least easier than in OpenStack. I used only VLAN (802.1Q), and it works fine. However, if you change the settings in the network template, the changes are not applied to machines that are already running; you need to remove and re-add the network card, and then the new settings take effect.
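Removing and re-adding the network card can be done from the CLI with `onevm nic-detach` and `onevm nic-attach`. The sketch below only prints the commands (a dry run) so they can be reviewed first; the helper name nic_reattach_commands is made up, and the IDs in the usage comment are placeholders.

```shell
# Hypothetical dry-run helper: print the commands that would re-create a NIC
# so that a running VM picks up the changed network settings.
nic_reattach_commands() {
    vm_id="$1"; nic_id="$2"; net_id="$3"
    printf 'onevm nic-detach %s %s\n' "$vm_id" "$nic_id"
    printf 'onevm nic-attach %s --network %s\n' "$vm_id" "$net_id"
}

# Example with placeholder IDs (VM 74, NIC 0, virtual network 1):
# nic_reattach_commands 74 0 1
```

Keep in mind that the VM loses connectivity on that interface between the two commands, so this is something for a maintenance window.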

If you still want a comparison with OpenStack, it can be put this way: OpenNebula does not prescribe which technologies to use for data storage, network management, or resources; each administrator decides what is most convenient.

6. Additional plugins and installations

As we know, the cloud platform can manage not only KVM but also VMware ESXi. Unfortunately, I did not have a vCenter pool; if anyone has tried it, please write about it.

Support for other cloud providers is declared for AWS and Azure: docs.opennebula.io/5.12/advanced_components/cloud_bursting/index.html

I also tried to connect VMware Cloud from Selectel, but it did not work out. In the end I gave up, because too many factors are involved and there is no point in writing to the hosting provider's technical support about it.

Also, the new version now has Firecracker. It launches microVMs, a kind of KVM wrapper over Docker, which gives even more versatility, security, and performance, since no resources are wasted on hardware emulation. Compared with Docker, the only advantage I see is that it does not spawn extra processes and does not occupy sockets, so it could well be used as a load balancer (this probably deserves a separate article once I have completed all the tests).

7. Positive user experience and error debugging

I want to share my observations from working with the platform. I described some of them above, and here are more. I am probably not the only one who at first thinks that this is the wrong system and that everything is held together with crutches: how do people even work with it? But then understanding comes, and everything turns out to be quite logical. Of course, you cannot please everyone, and some things still need improvement.

Take a simple operation: copying a disk image from one datastore to another. In my case there are 2 nodes with NFS. When I send the image, the copy goes through the OpenNebula frontend, although we are used to data being copied directly between hosts, as in VMware or Hyper-V. Here the approach and the ideology are different, and in version 5.12 the “migrate to datastore” button was removed: only the machine itself is moved, not its storage, which implies centralized storage.

Next, a popular error with a variety of causes: “Error deploying virtual machine: Could not create domain from /var/lib/one//datastores/103/10/deployment.5”. Below are the main things to check.

  • Permissions on the image for the oneadmin user;
  • Permissions for the oneadmin user to run libvirtd;
  • Is the datastore mounted correctly? Go and check the path on the node itself, something may have dropped off;
  • A misconfigured network: for example, the network settings on the frontend say that br0 is the main interface for the VLAN, while on the node it is called bridge0. The names must match.
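The checklist above can be turned into a quick pre-flight script to run as oneadmin on the node. This is only a sketch under assumptions: all function names are made up, the datastore is assumed to be a separate mount (as with NFS), and the paths and bridge name in the usage comments must be replaced with your own.

```shell
# Hypothetical pre-flight checks; each function returns 0 when the check passes.

# The file must exist and be readable by the current user.
check_readable() { [ -r "$1" ]; }

# The current user must be able to talk to libvirtd.
check_libvirt() { virsh -c qemu:///system list >/dev/null 2>&1; }

# The datastore path must be a mount point (relevant for NFS-backed datastores).
check_mounted() { mountpoint -q "$1"; }

# The bridge named in the virtual network must exist on this node.
check_bridge() { [ -d "/sys/class/net/$1" ]; }

# Run one check and print a one-line verdict.
report() {
    name="$1"; shift
    if "$@"; then echo "OK   $name"; else echo "FAIL $name"; fi
}

# Example (adjust paths and names to your setup):
# report image     check_readable /var/lib/one//datastores/103/10/deployment.5
# report libvirt   check_libvirt
# report datastore check_mounted <datastore mount point>
# report bridge    check_bridge br0
```

A FAIL line does not prove that this is the cause of the deploy error, but it narrows down where to look first.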

The system datastore stores metadata for your VMs. If you run a VM with a persistent image, the VM needs access to the originally created configuration on the storage where the VM was created; this is very important. So when moving a VM to another datastore, double-check everything.

8. Documentation, community. Further development

And finally: good documentation, a community, and, most importantly, confidence that the project will continue to live.

Here everything is generally well documented, and with the official documentation alone, installing the platform and finding answers to questions is not a problem.

The community is active. It publishes many ready-made solutions that you can use in your installations.

Some company policies have changed since 5.12: forum.opennebula.io/t/towards-a-stronger-opennebula-community/8506/14. It will be interesting to see how the project develops. At the beginning I deliberately mentioned some of the vendors, the solutions they use, and what the industry offers. Of course, there is no single answer as to what you should use. But for smaller organizations, maintaining their own small private cloud may not be as costly as it seems. The main thing is to know exactly what you need.

In the end, whatever you choose as a cloud system, do not stop at one product. If you have time, take a look at other, more open solutions.

There is a good chat, t.me/opennebula, where people actively help and do not send you off to Google for a solution. Join.

Source: habr.com
