Splunk Universal Forwarder in docker as a system log collector

Splunk is one of the most recognizable commercial log collection and analysis products. Even now, when it is no longer sold in Russia, that is no reason not to write a how-to for it.

The task: collect system logs from Docker nodes into Splunk without changing the configuration of the host machine.

I would like to start with the official approach, which looks odd when you are actually using Docker.
Link to Docker hub
What do we have:

1. Pull the image

$ docker pull splunk/universalforwarder:latest

2. We start the container with the necessary parameters

$ docker run -d  -p 9997:9997 -e 'SPLUNK_START_ARGS=--accept-license' -e 'SPLUNK_PASSWORD=<password>' splunk/universalforwarder:latest

3. We go into the container

$ docker exec -it <container-id> /bin/bash

Next, we are pointed to a well-known address in the documentation.

And configure the container after it starts:


./splunk add forward-server <host name or ip address>:<listening port>
./splunk add monitor /var/log
./splunk restart

Wait. What?
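For context, those commands do nothing magical: they just write a couple of stanzas into the forwarder's plain-text config files, roughly like this (a sketch; exact stanza names and attributes depend on the Splunk version):

# outputs.conf, written by ./splunk add forward-server
[tcpout:default-autolb-group]
server = <host name or ip address>:<listening port>

# inputs.conf, written by ./splunk add monitor /var/log
[monitor:///var/log]
disabled = false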

But the surprises don't end there. If you run the container from the official image in interactive mode, you will see the following:

A bit of disappointment


$ docker run -it -p 9997:9997 -e 'SPLUNK_START_ARGS=--accept-license' -e 'SPLUNK_PASSWORD=password' splunk/universalforwarder:latest

PLAY [Run default Splunk provisioning] *******************************************************************************************************************************************************************************************************
Tuesday 09 April 2019  13:40:38 +0000 (0:00:00.096)       0:00:00.096 *********

TASK [Gathering Facts] ***********************************************************************************************************************************************************************************************************************
ok: [localhost]
Tuesday 09 April 2019  13:40:39 +0000 (0:00:01.520)       0:00:01.616 *********

TASK [Get actual hostname] *******************************************************************************************************************************************************************************************************************
changed: [localhost]
Tuesday 09 April 2019  13:40:40 +0000 (0:00:00.599)       0:00:02.215 *********
Tuesday 09 April 2019  13:40:40 +0000 (0:00:00.054)       0:00:02.270 *********

TASK [set_fact] ******************************************************************************************************************************************************************************************************************************
ok: [localhost]
Tuesday 09 April 2019  13:40:40 +0000 (0:00:00.075)       0:00:02.346 *********
Tuesday 09 April 2019  13:40:40 +0000 (0:00:00.067)       0:00:02.413 *********
Tuesday 09 April 2019  13:40:40 +0000 (0:00:00.060)       0:00:02.473 *********
Tuesday 09 April 2019  13:40:40 +0000 (0:00:00.051)       0:00:02.525 *********
Tuesday 09 April 2019  13:40:40 +0000 (0:00:00.056)       0:00:02.582 *********
Tuesday 09 April 2019  13:40:41 +0000 (0:00:00.216)       0:00:02.798 *********
included: /opt/ansible/roles/splunk_common/tasks/change_splunk_directory_owner.yml for localhost
Tuesday 09 April 2019  13:40:41 +0000 (0:00:00.087)       0:00:02.886 *********

TASK [splunk_common : Update Splunk directory owner] *****************************************************************************************************************************************************************************************
ok: [localhost]
Tuesday 09 April 2019  13:40:41 +0000 (0:00:00.324)       0:00:03.210 *********
included: /opt/ansible/roles/splunk_common/tasks/get_facts.yml for localhost
Tuesday 09 April 2019  13:40:41 +0000 (0:00:00.094)       0:00:03.305 *********

...and so on.

Great. There isn't even an artifact in the image. That is, every start will spend time downloading the archive with the binaries, unpacking it and running the configuration.
But what about the Docker way and all that?

No, thanks. We will go another way. What if we perform all of these operations at the build stage? Then let's go!

To keep it short, here is the final image right away:

Dockerfile

# The base image is a matter of taste
FROM centos:7

# Set the variables so we don't have to pass them on every start
ENV SPLUNK_HOME /splunkforwarder
ENV SPLUNK_ROLE splunk_heavy_forwarder
ENV SPLUNK_PASSWORD changeme
ENV SPLUNK_START_ARGS --accept-license

# Install packages
# wget - to download the artifacts
# expect - needed for the initial Splunk start during the build
# jq - used in the scripts that collect docker stats
RUN yum install -y epel-release \
    && yum install -y wget expect jq

# Download, unpack, clean up
RUN wget -O splunkforwarder-7.2.4-8a94541dcfac-Linux-x86_64.tgz 'https://www.splunk.com/bin/splunk/DownloadActivityServlet?architecture=x86_64&platform=linux&version=7.2.4&product=universalforwarder&filename=splunkforwarder-7.2.4-8a94541dcfac-Linux-x86_64.tgz&wget=true' \
    && wget -O docker-18.09.3.tgz 'https://download.docker.com/linux/static/stable/x86_64/docker-18.09.3.tgz' \
    && tar -xvf splunkforwarder-7.2.4-8a94541dcfac-Linux-x86_64.tgz \
    && tar -xvf docker-18.09.3.tgz \
    && rm -f splunkforwarder-7.2.4-8a94541dcfac-Linux-x86_64.tgz \
    && rm -f docker-18.09.3.tgz

# The shell scripts are self-explanatory; inputs.conf, splunkclouduf.spl and first_start.sh need an explanation. More on that below.
COPY [ "inputs.conf", "docker-stats/props.conf", "/splunkforwarder/etc/system/local/" ]
COPY [ "docker-stats/docker_events.sh", "docker-stats/docker_inspect.sh", "docker-stats/docker_stats.sh", "docker-stats/docker_top.sh", "/splunkforwarder/bin/scripts/" ]
COPY splunkclouduf.spl /splunkclouduf.spl
COPY first_start.sh /splunkforwarder/bin/

# Make the scripts executable, add a user and perform the initial setup
RUN chmod +x /splunkforwarder/bin/scripts/*.sh \
    && groupadd -r splunk \
    && useradd -r -m -g splunk splunk \
    && echo "%sudo ALL=NOPASSWD:ALL" >> /etc/sudoers \
    && chown -R splunk:splunk $SPLUNK_HOME \
    && /splunkforwarder/bin/first_start.sh \
    && /splunkforwarder/bin/splunk install app /splunkclouduf.spl -auth admin:changeme \
    && /splunkforwarder/bin/splunk restart

# Copy the init scripts
COPY [ "init/entrypoint.sh", "init/checkstate.sh", "/sbin/" ]

# Optional: some want configs/logs available locally, some don't.
VOLUME [ "/splunkforwarder/etc", "/splunkforwarder/var" ]

HEALTHCHECK --interval=30s --timeout=30s --start-period=3m --retries=5 CMD /sbin/checkstate.sh || exit 1

ENTRYPOINT [ "/sbin/entrypoint.sh" ]
CMD [ "start-service" ]
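Assuming the files referenced in the COPY instructions live next to the Dockerfile, the build itself is unremarkable (the tag is arbitrary; the compose file below uses docker-stats-splunk-forwarder):

$ docker build -t docker-stats-splunk-forwarder:latest .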

So, what is inside

first_start.sh

#!/usr/bin/expect -f
set timeout -1
spawn /splunkforwarder/bin/splunk start --accept-license
expect "Please enter an administrator username: "
send -- "adminr"
expect "Please enter a new password: "
send -- "changemer"
expect "Please confirm new password: "
send -- "changemer"
expect eof

At first start, Splunk asks for a username and password, BUT these credentials are used only to execute administrative commands of that particular installation, that is, inside the container. In our case we just want to start the container so that everything works and the logs flow like a river. Of course, this is hardcode, but I have not found other ways.

After that, the build runs

/splunkforwarder/bin/splunk install app /splunkclouduf.spl -auth admin:changeme

splunkclouduf.spl is a credentials package for the Splunk Universal Forwarder, which can be downloaded from the web interface.

Where to click to download it: see the screenshots in the original article.
It is a regular archive that can be unpacked. Inside are the certificates and the password for connecting to our SplunkCloud, plus an outputs.conf with the list of our input instances. This file stays valid until you reinstall your Splunk installation or add an input node, if the installation is on-premise. So there is nothing wrong with baking it into the container.
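For reference, the outputs.conf inside such a package looks roughly like this (purely an illustration; the host names, app path and values here are made up):

[tcpout]
defaultGroup = splunkcloud

[tcpout:splunkcloud]
server = inputs1.example.splunkcloud.com:9997, inputs2.example.splunkcloud.com:9997
sslCertPath = $SPLUNK_HOME/etc/apps/100_example_splunkcloud/default/client.pem
sslRootCAPath = $SPLUNK_HOME/etc/apps/100_example_splunkcloud/default/cacert.pem
sslPassword = <encrypted password>
useClientSSLCompression = true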

And the last step is the restart. Yes, to apply the changes you need to restart the forwarder.

In inputs.conf we add the logs that we want to send to Splunk. It is not necessary to bake this file into the image if you distribute configs another way, for example via puppet. The main thing is that the forwarder sees the configs when the daemon starts, otherwise it will need a ./splunk restart.
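A minimal sketch of what such an inputs.conf can look like, assuming the @index@ and @hostname@ placeholders that the entrypoint.sh shown later substitutes at startup (the monitored paths and sourcetypes here are just examples):

[default]
host = @hostname@

[monitor:///var/log/messages]
index = @index@_host
sourcetype = syslog

[monitor:///var/log/secure]
index = @index@_host
sourcetype = linux_secure

# Scripted inputs for the docker-stats scripts copied in the Dockerfile
[script:///splunkforwarder/bin/scripts/docker_stats.sh]
index = @index@_docker_stats
interval = 60
sourcetype = docker_stats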

What are the docker-stats scripts? There is an old solution from outcoldman on GitHub; the scripts were taken from there and modified to work with the current versions of Docker (ce-17.*) and Splunk (7.*).
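To illustrate the idea (a sketch of mine, not the exact script from the repo): each of them is a scripted input that polls Docker through the mounted socket, using the static docker client and jq that the Dockerfile installs:

#!/bin/sh
# docker_stats.sh (sketch): emit one JSON event per container with its resource usage.
# Assumes /var/run/docker.sock is mounted and the static docker binary is in PATH.
docker stats --no-stream --format '{{json .}}' | while read -r line; do
    # Attach a timestamp so Splunk can extract the event time.
    echo "$line" | jq -c --arg ts "$(date -u +%Y-%m-%dT%H:%M:%SZ)" '. + {time: $ts}'
done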

With the collected data you can build dashboards like these (a couple of screenshots are in the original article).
The source code of the dashboards is in the repo linked at the end of the article. Note that there are two select fields: index selection (search by mask) and host/container selection. You will most likely need to update the index mask to match the names you use.

In conclusion, I want to draw attention to the start() function in

entrypoint.sh

start() {
    trap teardown EXIT
    if [ -z "$SPLUNK_INDEX" ]; then
        echo "'SPLUNK_INDEX' env variable is empty or not defined. Should be 'dev' or 'prd'." >&2
        exit 1
    else
        sed -e "s/@index@/$SPLUNK_INDEX/" -i ${SPLUNK_HOME}/etc/system/local/inputs.conf
    fi
    sed -e "s/@hostname@/$(cat /etc/hostname)/" -i ${SPLUNK_HOME}/etc/system/local/inputs.conf
    sh -c "echo 'starting' > /tmp/splunk-container.state"
    ${SPLUNK_HOME}/bin/splunk start
    watch_for_failure
}
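The checkstate.sh from the HEALTHCHECK is not listed here; a minimal sketch of what it could look like, given only that start() writes the container state to /tmp/splunk-container.state (everything else below is my assumption):

#!/bin/sh
# checkstate.sh (sketch): report healthy only when splunkd is actually running.
state=$(cat /tmp/splunk-container.state 2>/dev/null)

# While the container is still provisioning, do not fail the healthcheck;
# the HEALTHCHECK start-period covers most of this window anyway.
[ "$state" = "starting" ] && exit 0

# Ask splunkd itself; `splunk status` exits non-zero when it is down.
exec ${SPLUNK_HOME:-/splunkforwarder}/bin/splunk status > /dev/null 2>&1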

In my case, each environment and each separate entity, be it an application in a container or the host machine, gets its own index, so that search speed does not suffer as the data piles up. The naming rule for indexes is simple: <environment>_<entity>. For example, with SPLUNK_INDEX=prd, a placeholder like @index@_host in inputs.conf becomes prd_host. So, to keep the container universal, before starting the daemon itself we replace the wildcard with sed. The environment name is passed in, fittingly, through an environment variable.

It is also worth noting that, for some reason, Splunk ignores the hostname set by Docker's hostname parameter: it stubbornly keeps sending logs with its container id in the host field. As a workaround, you can mount /etc/hostname from the host machine and substitute it at startup, just like the index names.

docker-compose.yml example

version: '2'
services:
  splunk-forwarder:
    image: "${IMAGE_REPO}/docker-stats-splunk-forwarder:${IMAGE_VERSION}"
    environment:
      SPLUNK_INDEX: ${ENVIRONMENT}
    volumes:
    - /etc/hostname:/etc/hostname:ro
    - /var/log:/var/log
    - /var/run/docker.sock:/var/run/docker.sock:ro
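Assuming the variables come from your CI or an .env file (the values below are placeholders), starting the collector is the usual:

$ export IMAGE_REPO=registry.example.com IMAGE_VERSION=1.0.0 ENVIRONMENT=prd
$ docker-compose up -d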

Conclusion

Yes, perhaps the solution is not ideal and certainly not universal for everyone, since there is quite a lot of hardcode. But based on it, anyone can build their own image and push it to their private artifactory, if you happen to need Splunk Forwarder in Docker.

Links:

Solution from the article
The solution from outcoldman that inspired reusing part of the functionality
Official documentation on configuring the Universal Forwarder

Source: habr.com
