In this article, we will use bash, ssh and docker to organize a seamless deployment of a web application. Blue-green deployment is a technique that lets you update an application instantly, without dropping a single request. It is one of the zero-downtime deployment strategies and is best suited for applications that run as a single instance, but can have a second, fully loaded instance started alongside.
Let's say you have a web application that many clients are actively working with, and it cannot afford to go down even for a couple of seconds. And you really need to roll out a library update, a bug fix or a cool new feature. In a normal situation you would need to stop the application, replace it and start it again. With docker you can replace first and then restart, but there will still be a period during which requests to the application are not processed, because an application usually takes some time to finish loading. And what if it starts, but turns out to be broken? So here is the task; let's solve it with minimal means and as elegantly as possible.
DISCLAIMER: Most of the article is presented in an experimental format, as a recording of a console session. Hopefully it won't be too hard to read, and the code is self-documenting enough. For atmosphere, imagine that these are not just code snippets, but paper from an "iron" teletype.
Interesting techniques that are hard to google just by reading the code are described at the beginning of each section. If anything else is unclear, google it and check it in explainshell (which, fortunately, works again after the unblocking of Telegram). Whatever can't be googled, ask in the comments. I will gladly extend the corresponding "Interesting techniques" section.
Let's get started.
$ mkdir blue-green-deployment && cd $_
Service
Let's make an experimental service and place it in a container.
Interesting techniques
cat << EOF > file-name (Here Document + I/O redirection) is a way to create a multi-line file with a single command. Everything bash reads from /dev/stdin after this line and before the line EOF will be written to file-name.
wget -qO- URL (explainshell) — output a document received via HTTP to /dev/stdout (an analog of curl URL).
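A minimal, docker-free illustration of the heredoc technique (the file name and its contents are made up for the demo):

```shell
# Write two lines into greeting.txt with a single command:
# everything between this line and the EOF line goes into the file
cat << EOF > greeting.txt
Hello from a heredoc
Second line
EOF

# The file now contains exactly those two lines
cat greeting.txt
```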
printout
I deliberately break the snippet here so that Python highlighting can kick in. There will be one more break at the end. Imagine that in these places the paper was cut to be passed to the highlighting department (where the code was colored by hand with highlighters), and then the pieces were glued back together.
$ cat << EOF > uptimer.py
from http.server import BaseHTTPRequestHandler, HTTPServer
from time import monotonic

app_version = 1
app_name = f'Uptimer v{app_version}.0'
loading_seconds = 15 - app_version * 5

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == '/':
            try:
                t = monotonic() - server_start
                if t < loading_seconds:
                    self.send_error(503)
                else:
                    self.send_response(200)
                    self.send_header('Content-Type', 'text/html')
                    self.end_headers()
                    response = f'<h2>{app_name} is running for {t:3.1f} seconds.</h2>\n'
                    self.wfile.write(response.encode('utf-8'))
            except Exception:
                self.send_error(500)
        else:
            self.send_error(404)

httpd = HTTPServer(('', 8080), Handler)
server_start = monotonic()
print(f'{app_name} (loads in {loading_seconds} sec.) started.')
httpd.serve_forever()
EOF
$ cat << EOF > Dockerfile
FROM python:alpine
EXPOSE 8080
COPY uptimer.py app.py
CMD [ "python", "-u", "./app.py" ]
EOF
$ docker build --tag uptimer .
Sending build context to Docker daemon 39.42kB
Step 1/4 : FROM python:alpine
---> 8ecf5a48c789
Step 2/4 : EXPOSE 8080
---> Using cache
---> cf92d174c9d3
Step 3/4 : COPY uptimer.py app.py
---> a7fbb33d6b7e
Step 4/4 : CMD [ "python", "-u", "./app.py" ]
---> Running in 1906b4bd9fdf
Removing intermediate container 1906b4bd9fdf
---> c1655b996fe8
Successfully built c1655b996fe8
Successfully tagged uptimer:latest
$ docker run --rm --detach --name uptimer --publish 8080:8080 uptimer
8f88c944b8bf78974a5727070a94c76aa0b9bb2b3ecf6324b784e782614b2fbf
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
8f88c944b8bf uptimer "python -u ./app.py" 3 seconds ago Up 5 seconds 0.0.0.0:8080->8080/tcp uptimer
$ docker logs uptimer
Uptimer v1.0 (loads in 10 sec.) started.
$ wget -qSO- http://localhost:8080
HTTP/1.0 503 Service Unavailable
Server: BaseHTTP/0.6 Python/3.8.3
Date: Sat, 22 Aug 2020 19:52:40 GMT
Connection: close
Content-Type: text/html;charset=utf-8
Content-Length: 484
$ wget -qSO- http://localhost:8080
HTTP/1.0 200 OK
Server: BaseHTTP/0.6 Python/3.8.3
Date: Sat, 22 Aug 2020 19:52:45 GMT
Content-Type: text/html
<h2>Uptimer v1.0 is running for 15.4 seconds.</h2>
$ docker rm --force uptimer
uptimer
Reverse proxy
For our application to change imperceptibly, some other entity must stand in front of it and hide the substitution. It could be a web server in reverse proxy mode. A reverse proxy sits between the client and the application: it accepts requests from clients, forwards them to the application, and returns the application's responses to the clients.
The application and the reverse proxy can be linked inside docker with a docker network. That way the container with the application does not even need to publish a port on the host system, which isolates the application from external threats as much as possible.
If the reverse proxy lives on another host, you will have to abandon the docker network and connect the application to the reverse proxy through the host network, publishing the application's port with the --publish parameter, as at the first start and as with the reverse proxy.
We will run the reverse proxy on port 80, because this is exactly the entity that should listen to the outside world. If port 80 is busy on your test host, change the parameter --publish 80:80 to --publish ANY_FREE_PORT:80.
$ docker network create web-gateway
5dba128fb3b255b02ac012ded1906b7b4970b728fb7db3dbbeccc9a77a5dd7bd
$ docker run --detach --rm --name uptimer --network web-gateway uptimer
a1105f1b583dead9415e99864718cc807cc1db1c763870f40ea38bc026e2d67f
$ docker run --rm --network web-gateway alpine wget -qO- http://uptimer:8080
<h2>Uptimer v1.0 is running for 11.5 seconds.</h2>
$ docker run --detach --publish 80:80 --network web-gateway --name reverse-proxy nginx:alpine
80695a822c19051260c66bf60605dcb4ea66802c754037704968bc42527bf120
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
80695a822c19 nginx:alpine "/docker-entrypoint.…" 27 seconds ago Up 25 seconds 0.0.0.0:80->80/tcp reverse-proxy
a1105f1b583d uptimer "python -u ./app.py" About a minute ago Up About a minute 8080/tcp uptimer
$ cat << EOF > uptimer.conf
server {
    listen 80;
    location / {
        proxy_pass http://uptimer:8080;
    }
}
EOF
$ docker cp ./uptimer.conf reverse-proxy:/etc/nginx/conf.d/default.conf
$ docker exec reverse-proxy nginx -s reload
2020/06/23 20:51:03 [notice] 31#31: signal process started
$ wget -qSO- http://localhost
HTTP/1.1 200 OK
Server: nginx/1.19.0
Date: Sat, 22 Aug 2020 19:56:24 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
<h2>Uptimer v1.0 is running for 104.1 seconds.</h2>
Seamless deployment
Let's roll out a new version of the application (with a 2x startup speedup) and try to deploy it seamlessly.
Interesting techniques
echo 'my text' | docker exec -i my-container sh -c 'cat > /my-file.txt' — write the text my text to the file /my-file.txt inside the container my-container.
cat > /my-file.txt — write the contents of standard input (/dev/stdin) to the file /my-file.txt.
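The same pattern can be tried without docker; here a plain sh stands in for the container's shell, and the file path is made up:

```shell
# Pipe text into a shell that writes its stdin to a file,
# exactly like docker exec -i ... sh -c 'cat > /my-file.txt' does
echo 'my text' | sh -c 'cat > /tmp/my-file.txt'

# Check what landed in the file
cat /tmp/my-file.txt
```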
printout
$ sed -i "s/app_version = 1/app_version = 2/" uptimer.py
$ docker build --tag uptimer .
Sending build context to Docker daemon 39.94kB
Step 1/4 : FROM python:alpine
---> 8ecf5a48c789
Step 2/4 : EXPOSE 8080
---> Using cache
---> cf92d174c9d3
Step 3/4 : COPY uptimer.py app.py
---> 3eca6a51cb2d
Step 4/4 : CMD [ "python", "-u", "./app.py" ]
---> Running in 8f13c6d3d9e7
Removing intermediate container 8f13c6d3d9e7
---> 1d56897841ec
Successfully built 1d56897841ec
Successfully tagged uptimer:latest
$ docker run --detach --rm --name uptimer_BLUE --network web-gateway uptimer
96932d4ca97a25b1b42d1b5f0ede993b43f95fac3c064262c5c527e16c119e02
$ docker logs uptimer_BLUE
Uptimer v2.0 (loads in 5 sec.) started.
$ docker run --rm --network web-gateway alpine wget -qO- http://uptimer_BLUE:8080
<h2>Uptimer v2.0 is running for 23.9 seconds.</h2>
$ sed s/uptimer/uptimer_BLUE/ uptimer.conf | docker exec --interactive reverse-proxy sh -c 'cat > /etc/nginx/conf.d/default.conf'
$ docker exec reverse-proxy cat /etc/nginx/conf.d/default.conf
server {
    listen 80;
    location / {
        proxy_pass http://uptimer_BLUE:8080;
    }
}
$ docker exec reverse-proxy nginx -s reload
2020/06/25 21:22:23 [notice] 68#68: signal process started
$ wget -qO- http://localhost
<h2>Uptimer v2.0 is running for 63.4 seconds.</h2>
$ docker rm -f uptimer
uptimer
$ wget -qO- http://localhost
<h2>Uptimer v2.0 is running for 84.8 seconds.</h2>
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
96932d4ca97a uptimer "python -u ./app.py" About a minute ago Up About a minute 8080/tcp uptimer_BLUE
80695a822c19 nginx:alpine "/docker-entrypoint.…" 8 minutes ago Up 8 minutes 0.0.0.0:80->80/tcp reverse-proxy
At this stage the image is built directly on the server, which requires the application sources to be there and also loads the server with unnecessary work. The next step is to move the image build to a separate machine (for example, a CI system) and then transfer the image to the server.
Transferring images
Unfortunately, there is no point in transferring an image from localhost to localhost, so this section can only be fully experienced if you have two docker hosts at hand. At the minimum it looks like this:
$ ssh production-server docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
$ docker image save uptimer | ssh production-server 'docker image load'
Loaded image: uptimer:latest
$ ssh production-server docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
uptimer latest 1d56897841ec 5 minutes ago 78.9MB
The docker save command writes the image data as a .tar archive, which means it weighs about 1.5 times more than it would in compressed form. So let's squeeze it in the name of saving time and traffic:
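A sketch of the compressed transfer, assuming the same production-server host with passwordless ssh as above (the deploy script's output later in the article hints at exactly this pipeline):

```shell
# gzip on the sending side, zcat on the receiving side:
# the image travels compressed, docker never sees the difference
docker image save uptimer | gzip | ssh production-server 'zcat | docker image load'
```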
Now let's collect everything that we did manually into one script. Let's start with the top-level function, and then look at the rest used in it.
Interesting techniques
${parameter?err_msg} — one of bash's magic spells (aka parameter substitution). If parameter is not set, output err_msg and exit with code 1.
docker --log-driver journald — by default, docker's logging driver is a plain text file without any rotation. With this approach the logs quickly fill up the entire disk, so for a production environment the driver needs to be changed to a smarter one.
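A quick docker-free illustration of ${parameter?err_msg} (the function and its argument are made up; the subshell keeps the failed expansion from terminating the whole session):

```shell
greet() {
    local usage_msg="Usage: ${FUNCNAME[0]} name"
    # If $1 is not set, print usage_msg to stderr and abort with code 1
    local name=${1?$usage_msg}
    echo "Hello, $name!"
}

greet World                                   # prints the greeting
( greet ) 2>/dev/null || echo "no argument"   # failed expansion exits the subshell
```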
Deployment script
deploy() {
    local usage_msg="Usage: ${FUNCNAME[0]} image_name"
    local image_name=${1?$usage_msg}

    ensure-reverse-proxy || return 2
    if get-active-slot $image_name
    then
        local OLD=${image_name}_BLUE
        local new_slot=GREEN
    else
        local OLD=${image_name}_GREEN
        local new_slot=BLUE
    fi
    local NEW=${image_name}_${new_slot}
    echo "Deploying '$NEW' in place of '$OLD'..."
    docker run \
        --detach \
        --restart always \
        --log-driver journald \
        --name $NEW \
        --network web-gateway \
        $image_name || return 3
    echo "Container started. Checking health..."
    for i in {1..20}
    do
        sleep 1
        if get-service-status $image_name $new_slot
        then
            echo "New '$NEW' service seems OK. Switching heads..."
            sleep 2  # Ensure service is ready
            set-active-slot $image_name $new_slot || return 4
            echo "'$NEW' service is live!"
            sleep 2  # Ensure all requests were processed
            echo "Killing '$OLD'..."
            docker rm -f $OLD
            docker image prune -f
            echo "Deployment successful!"
            return 0
        fi
        echo "New '$NEW' service is not ready yet. Waiting ($i)..."
    done
    echo "New '$NEW' service did not come up, killing it. Failed to deploy T_T"
    docker rm -f $NEW
    return 5
}
Functions used:
ensure-reverse-proxy - Make sure the reverse proxy is working (useful for the first deployment)
get-active-slot service_name — Determines which slot is currently active for the given service (BLUE or GREEN)
get-service-status service_name deployment_slot - Determines if the service is ready to process incoming requests
set-active-slot service_name deployment_slot - Changes the nginx config in the reverse proxy container
In order:
ensure-reverse-proxy() {
    is-container-up reverse-proxy && return 0

    echo "Deploying reverse-proxy..."
    docker network create web-gateway
    docker run \
        --detach \
        --restart always \
        --log-driver journald \
        --name reverse-proxy \
        --network web-gateway \
        --publish 80:80 \
        nginx:alpine || return 1
    docker exec --interactive reverse-proxy sh -c "> /etc/nginx/conf.d/default.conf"
    docker exec reverse-proxy nginx -s reload
}
is-container-up() {
    local container=${1?"Usage: ${FUNCNAME[0]} container_name"}
    [ -n "$(docker ps -f name=${container} -q)" ]
    return $?
}
get-active-slot() {
    local service=${1?"Usage: ${FUNCNAME[0]} service_name"}
    if is-container-up ${service}_BLUE && is-container-up ${service}_GREEN; then
        echo "Collision detected! Stopping ${service}_GREEN..."
        docker rm -f ${service}_GREEN
        return 0  # BLUE
    fi
    if is-container-up ${service}_BLUE && ! is-container-up ${service}_GREEN; then
        return 0  # BLUE
    fi
    if ! is-container-up ${service}_BLUE; then
        return 1  # GREEN
    fi
}
get-service-status() {
    local usage_msg="Usage: ${FUNCNAME[0]} service_name deployment_slot"
    local service=${1?$usage_msg}
    local slot=${2?$usage_msg}
    case $service in
        # Add specific healthcheck paths for your services here
        *) local health_check_port_path=":8080/" ;;
    esac
    local health_check_address="http://${service}_${slot}${health_check_port_path}"
    echo "Requesting '$health_check_address' within the 'web-gateway' docker network:"
    docker run --rm --network web-gateway alpine \
        wget --timeout=1 --quiet --server-response $health_check_address
    return $?
}
set-active-slot() {
    local usage_msg="Usage: ${FUNCNAME[0]} service_name deployment_slot"
    local service=${1?$usage_msg}
    local slot=${2?$usage_msg}
    [ "$slot" == BLUE ] || [ "$slot" == GREEN ] || return 1

    get-nginx-config $service $slot | docker exec --interactive reverse-proxy sh -c "cat > /etc/nginx/conf.d/$service.conf"
    docker exec reverse-proxy nginx -t || return 2
    docker exec reverse-proxy nginx -s reload
}
The get-active-slot function needs a little explanation:
Why does it return a number instead of outputting a string?
Either way, the calling function checks the result of its work, and checking an exit code in bash is much easier than checking a string. Besides, getting a string out of it is very simple: get-active-slot service && echo BLUE || echo GREEN.
Are three conditions really enough to distinguish all the states?
Even two would be enough; the last one is there just for completeness, so as not to write else.
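A docker-free sketch of the exit-code-as-boolean idiom; the stub below fakes the container check, and all the names are made up:

```shell
# Stub: pretend only my-service_BLUE is up
is-container-up() { [ "$1" = "my-service_BLUE" ]; }

# Exit code as a boolean: 0 means BLUE is active, 1 means GREEN
get-active-slot() { is-container-up ${1}_BLUE; }

get-active-slot my-service  && echo BLUE || echo GREEN   # prints BLUE
get-active-slot new-service && echo BLUE || echo GREEN   # prints GREEN
```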
Only the function that returns nginx configs remains undefined: get-nginx-config service_name deployment_slot. By analogy with the health check, you can set any config for any service here. Of the interesting bits there is only cat <<- EOF, which strips all leading tabs from the heredoc. True, the price of decent formatting is tabs mixed with spaces, which today is considered very bad form. But bash insists on tabs, and it would also be nice to have normal formatting in the nginx config. In short, mixing tabs with spaces here seems like the best of the bad options. However, you will not see it in the snippet below, since habr "does it well" by changing all tabs to 4 spaces, which makes EOF invalid. And here it is noticeable.
So as not to get up twice, I'll also tell you right away about cat << 'EOF', which will come up later. If you simply write cat << EOF, string interpolation is performed inside the heredoc (variables are expanded ($foo), commands are substituted ($(bar)), and so on), but if you enclose the end-of-document marker in single quotes, interpolation is disabled and the $ character is output as is. Which is exactly what you need in order to embed one script inside another.
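A quick illustration of the difference (the variable name is made up):

```shell
foo=expanded

# Unquoted marker: $foo is interpolated
cat << EOF
unquoted marker: foo is $foo
EOF

# Quoted marker: $foo is passed through literally
cat << 'EOF'
quoted marker: foo is $foo
EOF
```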
get-nginx-config() {
    local usage_msg="Usage: ${FUNCNAME[0]} service_name deployment_slot"
    local service=${1?$usage_msg}
    local slot=${2?$usage_msg}
    [ "$slot" == BLUE ] || [ "$slot" == GREEN ] || return 1

    local container_name=${service}_${slot}
    case $service in
        # Add specific nginx configs for your services here
        *) nginx-config-simple-service $container_name:8080 ;;
    esac
}
nginx-config-simple-service() {
    local usage_msg="Usage: ${FUNCNAME[0]} proxy_pass"
    local proxy_pass=${1?$usage_msg}
    cat << EOF
server {
    listen 80;
    location / {
        proxy_pass http://$proxy_pass;
    }
}
EOF
}
That's the whole script. And here is a gist with this script, ready to be downloaded via wget or curl.
Executing parameterized scripts on a remote server
It's time to knock on the target server. This time localhost is quite suitable:
$ ssh-copy-id localhost
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
himura@localhost's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'localhost'"
and check to make sure that only the key(s) you wanted were added.
We have written a deployment script that uploads a pre-built image to the target server and seamlessly replaces the service container, but how do we execute it on a remote machine? The script takes arguments, since it is universal and can deploy several services at once under a single reverse proxy (the nginx configs determine which URL is served by which service). The script cannot be stored on the server, because then we would not be able to update it automatically (for bug fixes and adding new services), and in general, state = evil.
Solution 1: Still store the script on the server, but copy it every time through scp. Then connect by ssh and execute the script with the necessary arguments.
Cons:
Two actions instead of one
The destination path may not exist, we may not have access to it, or the script may be running at the very moment we replace it.
It is advisable to clean up after yourself (delete the script).
That's already three steps.
Solution 2:
Keep only function definitions in the script and run nothing at all
Append a function call to the end of the script with sed
Pipe all of this straight into ssh (|)
Pros:
Truly stateless
No boilerplate entities
Feeling cool
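The sed trick from step 2 can be tried locally, with bash standing in for the remote shell (the function and its argument are made up; the `$a` one-liner syntax is GNU sed):

```shell
# A "script" that only defines a function and runs nothing
script='greet() { echo "Hello from $1!"; }'

# '$a text' appends a line after the last line (GNU sed),
# so the function defined above actually gets called on the receiving side
printf '%s\n' "$script" | sed '$a greet teletype' | bash
```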
And please, no Ansible here. Yes, it has all been invented before us. Yes, it's a reinvented wheel. But look what a simple, elegant and minimalistic wheel it is:
$ cat << 'EOF' > deploy.sh
#!/bin/bash
usage_msg="Usage: $0 ssh_address local_image_tag"
ssh_address=${1?$usage_msg}
image_name=${2?$usage_msg}

echo "Connecting to '$ssh_address' via ssh to seamlessly deploy '$image_name'..."
( sed "\$a deploy $image_name" | ssh -T $ssh_address ) << 'END_OF_SCRIPT'
deploy() {
    echo "Yay! The '${FUNCNAME[0]}' function is executing on '$(hostname)' with argument '$1'"
}
END_OF_SCRIPT
EOF
$ chmod +x deploy.sh
$ ./deploy.sh localhost magic-porridge-pot
Connecting to 'localhost' via ssh to seamlessly deploy 'magic-porridge-pot'...
Yay! The 'deploy' function is executing on 'hut' with argument 'magic-porridge-pot'
However, we can't be sure that the remote host has a proper bash, so we add a small check at the beginning (in place of a shebang):
if [ "$SHELL" != "/bin/bash" ]
then
    echo "The '$SHELL' shell is not supported by 'deploy.sh'. Set a '/bin/bash' shell for '$USER@$HOSTNAME'."
    exit 1
fi
And now it's for real:
$ docker exec reverse-proxy rm /etc/nginx/conf.d/default.conf
$ wget -qO deploy.sh https://git.io/JUURc
$ chmod +x deploy.sh
$ ./deploy.sh localhost uptimer
Sending gzipped image 'uptimer' to 'localhost' via ssh...
Loaded image: uptimer:latest
Connecting to 'localhost' via ssh to seamlessly deploy 'uptimer'...
Deploying 'uptimer_GREEN' in place of 'uptimer_BLUE'...
06f5bc70e9c4f930e7b1f826ae2ca2f536023cc01e82c2b97b2c84d68048b18a
Container started. Checking health...
Requesting 'http://uptimer_GREEN:8080/' within the 'web-gateway' docker network:
HTTP/1.0 503 Service Unavailable
wget: server returned error: HTTP/1.0 503 Service Unavailable
New 'uptimer_GREEN' service is not ready yet. Waiting (1)...
Requesting 'http://uptimer_GREEN:8080/' within the 'web-gateway' docker network:
HTTP/1.0 503 Service Unavailable
wget: server returned error: HTTP/1.0 503 Service Unavailable
New 'uptimer_GREEN' service is not ready yet. Waiting (2)...
Requesting 'http://uptimer_GREEN:8080/' within the 'web-gateway' docker network:
HTTP/1.0 200 OK
Server: BaseHTTP/0.6 Python/3.8.3
Date: Sat, 22 Aug 2020 20:15:50 GMT
Content-Type: text/html
New 'uptimer_GREEN' service seems OK. Switching heads...
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
2020/08/22 20:15:54 [notice] 97#97: signal process started
The 'uptimer_GREEN' service is live!
Killing 'uptimer_BLUE'...
uptimer_BLUE
Total reclaimed space: 0B
Deployment successful!
Now you can open http://localhost/ in the browser, run the deployment again, and make sure it is seamless by refreshing the page while the deployment is in progress.