Project configuration inside and outside Kubernetes

I recently wrote an answer about the life of a project in Docker and debugging code outside of it, where I briefly mentioned that you can build your own configuration system so that a service works well in Kubernetes, picks up secrets, and still runs conveniently on a local machine, even entirely outside Docker. Nothing complicated, but the "recipe" described here may be useful to someone 🙂 The code is in Python, but the logic is not tied to the language.

The background is as follows. Once upon a time there was a project that started as a small monolith with utilities and scripts. Over time it grew and was split into services, which in turn were split into microservices and then scaled out. At first, all of this ran on bare VPSes, where setup and code deployment were automated with Ansible. Each service got a YAML config with the necessary settings and keys, and a similar config file was used for local runs. This was very convenient, because the config is loaded into a global object that is accessible from anywhere in the project.

However, the growing number of microservices and the connections between them, together with the need for centralized logging and monitoring, pointed toward a move to Kubernetes, which is still in progress. Besides helping with these problems, Kubernetes offers its own approaches to infrastructure management, including so-called Secrets and ways of working with them. The mechanism is standard and reliable, so it would be a sin not to use it! At the same time, I wanted to keep the current way of working with the config: first, to use it uniformly across the project's microservices, and second, to be able to run the code on a local machine with one simple config file.

In this regard, the mechanism for building the configuration object was reworked so that it can work both with our classic config file and with secrets from Kubernetes. A stricter config structure was also defined; in Python 3 typing terms, it looks like this:

Dict[str, Dict[str, Union[str, int, float]]]

That is, the final config is a dictionary of named sections, each of which is a dictionary with values of simple types. Each section describes the configuration of, and access to, a resource of a certain type. An example fragment of our config:

adminka:
  django_secret: "ExtraLongAndHardCode"

db_main:
  engine: mysql
  host: 256.128.64.32
  user: cool_user
  password: "SuperHardPassword"

redis:
  host: 256.128.64.32
  pw: "SuperHardPassword"
  port: 26379

smtp:
  server: smtp.gmail.com
  port: 465
  email: [email protected]
  pw: "SuperHardPassword"
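
The structure described above (named sections holding only simple values) can be checked mechanically. The following is a minimal sketch, not part of the original project: the names `Scalar`, `Section`, `Config` and `validate_config` are hypothetical. It would apply to sections loaded from files; the service keys `_config_log` and `debug` that config.py adds at the top level fall outside this shape.

```python
from typing import Any, Dict, Union

# Hypothetical aliases spelling out Dict[str, Dict[str, Union[str, int, float]]]
Scalar = Union[str, int, float]
Section = Dict[str, Scalar]
Config = Dict[str, Section]


def validate_config(config: Any) -> Config:
    """Raise TypeError unless config is a dict of named sections of simple values."""
    if not isinstance(config, dict):
        raise TypeError('config must be a dict of named sections')
    for name, section in config.items():
        if not isinstance(name, str) or not isinstance(section, dict):
            raise TypeError(f'section {name!r} must map a string name to a dict')
        for key, value in section.items():
            if not isinstance(key, str):
                raise TypeError(f'{name}.{key!r}: key must be a string')
            # Note: bool is a subclass of int, so True/False also pass this check
            if not isinstance(value, (str, int, float)):
                raise TypeError(
                    f'{name}.{key}: unsupported type {type(value).__name__}'
                )
    return config


example = {
    'redis': {'host': '256.128.64.32', 'pw': 'SuperHardPassword', 'port': 26379},
}
validate_config(example)  # passes silently; a nested list or dict would raise
```

A check like this could run once right after loading, so that a malformed section fails at startup rather than deep inside a service.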

In this config, the engine field of a database section can be set to SQLite, and redis can be switched to mock mode, also specifying the name of a file to save to. These parameters are correctly recognized and processed, which makes it easy to run the code locally for debugging, unit testing, and any other needs. This matters to us because there are many such other needs: part of our code is designed for various analytical calculations, and it runs not only on servers with orchestration, but also from various scripts and on the computers of analysts, who need to develop and debug complex data processing pipelines without worrying about backend questions. By the way, it is worth mentioning that our main tools, including the code that builds the config, are installed via setup.py. Together, this unites our code into a single ecosystem that is independent of the platform and of the way it is used.
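
To illustrate the engine switch mentioned above, here is a hedged sketch of how consuming code might branch on the engine field of a database section. The article does not show how the project actually consumes a section, so the function name `db_url` and the SQLAlchemy-style URL format are assumptions made purely for illustration.

```python
from urllib.parse import quote_plus


def db_url(section):
    """Build a connection URL from one db_* config section (hypothetical helper)."""
    engine = section['engine']
    if engine == 'sqlite':
        # A local file is enough: no host, user or password required
        return f"sqlite:///{section['filename']}"
    # Escape credentials so special characters do not break the URL
    user = quote_plus(str(section['user']))
    password = quote_plus(str(section['password']))
    return f"{engine}://{user}:{password}@{section['host']}"


# Local debugging: the section just names a file
local = db_url({'engine': 'sqlite', 'filename': 'main.sqlite3'})

# Server run: same code path, different section contents
server = db_url({
    'engine': 'mysql',
    'host': 'db.example',  # placeholder host, not from the article
    'user': 'cool_user',
    'password': 'SuperHardPassword',
})
```

The point of the design is that the calling code stays identical in both environments; only the contents of the section change.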

The description of a pod in Kubernetes looks like this:

containers:
  - name: enter-api
    image: enter-api:latest
    ports:
      - containerPort: 80
    volumeMounts:
      - name: db-main-secret-volume
        mountPath: /etc/secrets/db-main

volumes:
  - name: db-main-secret-volume
    secret:
      secretName: db-main-secret

That is, one section is described in each secret. The secrets themselves are created like this:

apiVersion: v1
kind: Secret
metadata:
  name: db-main-secret
type: Opaque
stringData:
  db_main.yaml: |
    engine: sqlite
    filename: main.sqlite3

Together, this produces YAML files at paths of the form /etc/secrets/db-main/section_name.yaml, where the file name is the key given under stringData (here, db_main.yaml).
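
The mapping from a mounted file to a config section can be sketched as follows. This is an illustration under assumptions: `section_key` is a hypothetical helper, while `update_config_part` mirrors the merge rule used in config.py. The starting values (including the host name) are invented for the example.

```python
import posixpath


def section_key(path):
    """Return the section name for a mounted secret file, or None if not YAML."""
    name, ext = posixpath.splitext(posixpath.basename(path))
    return name if ext == '.yaml' else None


def update_config_part(config, key, data):
    # Same rule as in config.py: a new section is added as is,
    # an existing section is updated key by key
    if key not in config:
        config[key] = data
    else:
        config[key].update(data)


# Pretend a local config was loaded first (values are made up)
config = {'db_main': {'engine': 'mysql', 'host': 'db.internal'}}

# The section name comes from the file name, not from the mount directory
secret_path = '/etc/secrets/db-main/db_main.yaml'
secret_data = {'engine': 'sqlite', 'filename': 'main.sqlite3'}
update_config_part(config, section_key(secret_path), secret_data)
```

Note that the merge works per key: values from the secret override matching keys, but keys that only the earlier source defined (here, host) remain in the section.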

For local runs, the config file located in the project root, or at the path given in an environment variable, is used. The code responsible for these conveniences is shown below.

config.py

__author__ = 'AivanF'
__copyright__ = 'Copyright 2020, AivanF'

import os
import yaml

__all__ = ['config']
PROJECT_DIR = os.path.abspath(__file__ + 3 * '/..')
SECRETS_DIR = '/etc/secrets'
KEY_LOG = '_config_log'
KEY_DBG = 'debug'

def is_yes(value):
    if isinstance(value, str):
        value = value.lower()
        if value in ('1', 'on', 'yes', 'true'):
            return True
    else:
        if value in (1, True):
            return True
    return False

def update_config_part(config, key, data):
    if key not in config:
        config[key] = data
    else:
        config[key].update(data)

def parse_big_config(config, filename):
    '''
    Parse YAML config with multiple sections
    '''
    if not os.path.isfile(filename):
        return False
    with open(filename) as f:
        # An empty file loads as None; treat it as having no sections
        config_new = yaml.safe_load(f) or {}
        for key, data in config_new.items():
            update_config_part(config, key, data)
        config[KEY_LOG].append(filename)
        return True

def parse_tiny_config(config, key, filename):
    '''
    Parse YAML config with a single section
    '''
    with open(filename) as f:
        # An empty file loads as None; treat it as an empty section
        config_tiny = yaml.safe_load(f) or {}
        update_config_part(config, key, config_tiny)
        config[KEY_LOG].append(filename)

def combine_config():
    config = {
        # To debug config load code
        KEY_LOG: [],
        # To debug other code
        KEY_DBG: is_yes(os.environ.get('DEBUG')),
    }
    # For simple local runs
    CONFIG_SIMPLE = os.path.join(PROJECT_DIR, 'config.yaml')
    parse_big_config(config, CONFIG_SIMPLE)
    # For container's tests
    CONFIG_ENVVAR = os.environ.get('CONFIG')
    if CONFIG_ENVVAR is not None:
        if not parse_big_config(config, CONFIG_ENVVAR):
            raise ValueError(
                f'No config file from EnvVar:\n'
                f'{CONFIG_ENVVAR}'
            )
    # For K8s secrets
    for path, dirs, files in os.walk(SECRETS_DIR):
        depth = path[len(SECRETS_DIR):].count(os.sep)
        if depth > 1:
            continue
        for file in files:
            if file.endswith('.yaml'):
                filename = os.path.join(path, file)
                key = file.rsplit('.', 1)[0]
                parse_tiny_config(config, key, filename)
    return config

def build_config():
    config = combine_config()
    # Preprocess
    for key, data in config.items():
        if key.startswith('db_'):
            if data['engine'] == 'sqlite':
                data['filename'] = os.path.join(PROJECT_DIR, data['filename'])
    # To verify correctness
    if config[KEY_DBG]:
        print(f'** Loaded config:\n{yaml.dump(config)}')
    else:
        print(f'** Loaded config from: {config[KEY_LOG]}')
    return config

config = build_config()

The logic here is quite simple: we combine the large configs from the project directory and from the path in the environment variable with the small single-section configs from Kubernetes secrets, and then preprocess them a little. Two service keys are added as well: a log of the loaded files and a debug flag. Note that the search for secret files uses a depth limit, because K8s creates an extra hidden folder inside each mounted secret where the files themselves are stored, while only a link to them sits one level higher.
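
The effect of the depth limit can be reproduced without a cluster. The sketch below is a simplified imitation: it creates a temporary directory with a hidden `..data` subfolder and puts a plain copy of the file one level higher, where a real mount would have a symlink. The helper names `find_section_files` and `make_fake_secret` are hypothetical; the walk itself follows the same rule as combine_config.

```python
import os
import tempfile


def find_section_files(secrets_dir, max_depth=1):
    """Collect *.yaml files per section name, skipping folders deeper than max_depth."""
    found = {}
    for path, dirs, files in os.walk(secrets_dir):
        # 0 for secrets_dir itself, 1 for a secret's folder, 2 for folders inside it
        depth = path[len(secrets_dir):].count(os.sep)
        if depth > max_depth:
            continue
        for file in sorted(files):
            if file.endswith('.yaml'):
                key = file.rsplit('.', 1)[0]
                found.setdefault(key, []).append(os.path.join(path, file))
    return found


def make_fake_secret(root):
    """Imitate a mounted secret: the file is visible in the secret's folder
    and again inside a hidden subfolder (a plain copy stands in for the link)."""
    mount = os.path.join(root, 'db-main')
    hidden = os.path.join(mount, '..data')
    os.makedirs(hidden)
    for folder in (mount, hidden):
        with open(os.path.join(folder, 'db_main.yaml'), 'w') as f:
            f.write('engine: sqlite\n')


with tempfile.TemporaryDirectory() as root:
    make_fake_secret(root)
    limited = find_section_files(root, max_depth=1)     # one file per section
    unlimited = find_section_files(root, max_depth=10)  # hidden copy found too
```

Without the limit, the same section would be parsed once per copy, which is harmless for identical data but adds noise to the load log.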

I hope this will be useful to someone 🙂 Any comments and recommendations on security or other points for improvement are welcome. I am also interested in the community's opinion: perhaps it is worth adding support for ConfigMaps (our project does not use them yet) and publishing the code on GitHub / PyPI? Personally, I think such things are too project-specific to be universal, and that a quick look at other people's implementations, like the one given here, plus a discussion of nuances, tips and best practices, which I hope to see in the comments, is enough 😉

Should I publish as a project/library?

  • 0.0%: Yes, I would use it / contribute

  • 33.3%: Yes, sounds great

  • 41.7%: No, everyone needs to do this themselves, in their own format and to suit their own needs

  • 25.0%: Abstain

12 users voted. 3 users abstained.

Source: habr.com