mirror of
https://git.sr.ht/~magic_rb/website
synced 2024-11-22 08:04:21 +01:00
6019c01875
Signed-off-by: Magic_RB <magic_rb@redalder.org>
82 lines
5.6 KiB
Org Mode
#+TITLE: On Docker Databases and NixOS
#+DATE: <2020-09-28 Mon>

While learning HashiCorp Nomad, Vault, and Consul, I decided to convert all the Docker containers I currently use
into their Nix-ified forms. In other words, I'd rewrite the ones I had, but base them on NixOS, a truly
declarative environment, unlike /ehm/ all the other base images... Well, I didn't realize how *hard* it is to
"dockerize" databases. Databases are inherently programs that deal almost exclusively with state, as opposed to Nix
and Docker, which are both declarative systems (one of them is trying and failing really hard).

** Configuration
Configuration is a rather big part of what a systems administrator (DevOps engineer for the cool kids) does: one must
correctly configure a program, most of the time dynamically. And because this is such an important thing, it baffles
me why what feels like 90% of all containers primarily use and support environment variables. I get that it's a
convenient way to do it: it's simple, widely supported, uniform, all around great, but *really* cumbersome. Say your
config file looks like this:

#+BEGIN_SRC conf
# stripped down Gitea configuration, I kept the parts that nicely illustrate my point
[server]
APP_DATA_PATH = /data/gitea
ROOT_URL = https://gitea.redalder.org/
DOMAIN = gitea.redalder.org

[database]
DB_TYPE = postgres
HOST = database-postgres
NAME = gitea
#+END_SRC

The config has two sections, =server= and =database=, and each section holds a number of key-value pairs. This has
structure: it has multiple layers, and configuration files can get much, much more complex than that. Now, for the
sake of argument, let's imagine that we want to "environment-ize" this config. The natural, and frankly only, way to
do this is to flatten the config file, so we'd get something like this:

#+BEGIN_SRC conf
SERVER_APP_DATA_PATH="/data/gitea"
SERVER_ROOT_URL="https://gitea.redalder.org/"
SERVER_DOMAIN="gitea.redalder.org"

DATABASE_DB_TYPE="postgres"
DATABASE_HOST="database-postgres"
DATABASE_NAME="gitea"
#+END_SRC

Now, you might think that this is completely fine, even reasonable, but let me explain why it is horrible.
First off, there is no imposed structure, and structure is always good. Nothing prevents you from mixing the
=server= and =database= sections, and while a "good" admin should not do that, it's best if they don't even have the
ability. Next, depending on the parser that reads this "configuration", you might run into issues when you leave
out the =""= quotes. Some parsers are better at this than others, but once again it's an implicit rule, which is bad.
I hope you're starting to see the bigger picture: implicitness is *bad*, period. It introduces unnecessary mistakes
that could have been avoided if the computer had just yelled at you. So why not give it the option to do that?

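To make the "no imposed structure" point concrete, here is a minimal Python sketch of the flattening step. The helper
names =flatten= and =possible_origins= are made up for illustration and are not part of Gitea or Docker. It shows
that once the sections are flattened, a name such as =SERVER_APP_DATA_PATH= can be split back into a section and a
key in more than one way, so the structure survives only as a convention.

#+BEGIN_SRC python
# Illustration only: hypothetical helpers, not how Gitea or Docker read configuration.
config = {
    "server": {"APP_DATA_PATH": "/data/gitea", "DOMAIN": "gitea.redalder.org"},
    "database": {"DB_TYPE": "postgres", "HOST": "database-postgres"},
}


def flatten(config):
    """Turn {section: {key: value}} into flat SECTION_KEY environment names."""
    return {
        f"{section.upper()}_{key}": value
        for section, keys in config.items()
        for key, value in keys.items()
    }


def possible_origins(name):
    """List every (section, key) pair that would flatten to the given name."""
    parts = name.split("_")
    return [
        ("_".join(parts[:i]).lower(), "_".join(parts[i:]))
        for i in range(1, len(parts))
    ]


flat = flatten(config)
origins = possible_origins("SERVER_APP_DATA_PATH")
for section, key in origins:
    print(f"[{section}] {key}")
#+END_SRC

Here =origins= holds three candidate (section, key) pairs, and nothing in the variable name says which one was meant.
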
** Configuration and Databases
Now, it's finally time to combine databases with configuration. What we get is an all-out war between immutable,
declarative environments and the stateful, endlessly changing data they hold. You must make sure that your
configuration gets applied only at first start, so you must keep state yourself! In a Docker container! Madness! Then
comes the joy of updating the configuration. Say you give your user the option to specify the default authentication
method (as the [[https://hub.docker.com/_/postgres][PostgreSQL]] image does) and the user specifies that they want =scram-sha-256=. That's nice and all, so
you apply it, but *only* on the first boot. Why? Because once the value is in the config file, a later change by the
user means you'd have to figure out *if* they changed it and then update the configuration file, and that's really
hard. The user might have gone into wherever they store the state for PostgreSQL and manually changed the config
file. They might even have completely deleted your configuration and replaced it with their own. What should you do?
Most Docker containers just take the easy way out and do as the PostgreSQL image does, applying the setting on first
boot and never again, and I don't blame them; there is nothing you can really do.

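The "apply only on first start" dance can be sketched in a few lines of Python. This is an illustration under
assumptions, not the actual entrypoint of the PostgreSQL image: the =start= function, the =db.conf= file name and the
=auth_method= key are all hypothetical. The existence of the config file is the state the container has to keep.

#+BEGIN_SRC python
import tempfile
from pathlib import Path


def start(data_dir, auth_method):
    """Write the config on first start only; return the method actually in effect."""
    conf = Path(data_dir) / "db.conf"  # hypothetical file name
    if not conf.exists():
        # First boot: nothing is there yet, so the requested setting is applied.
        conf.write_text(f"auth_method = {auth_method}\n")
    # Later boots: the file already exists, so the requested value is ignored.
    return conf.read_text().split("=", 1)[1].strip()


with tempfile.TemporaryDirectory() as data_dir:
    first = start(data_dir, "scram-sha-256")
    second = start(data_dir, "md5")  # the user changed their mind
    print(first, second)
#+END_SRC

Both =first= and =second= end up as =scram-sha-256=: the second request is silently ignored because the file already
exists.
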
** Nix - The Solution?
Similar to some good literature, this rant has come full circle. We're back at the start, back on the topic of
Nix. How can Nix save us? By removing unnecessary state. The mutable configuration file? Gone, it's immutable
now. Not knowing whether a setting changed? Poof, gone too. Nix is fully declarative and identifies *everything* by a
sha256 hash in addition to its developer-configured name, so a changed setting shows up as a changed hash. Nix also
serves as a single source of truth, which means that even if the user modifies the config files, they will be
overwritten before they are used again. This makes mix-ups impossible. Messy and flat "configuration" files? Solved
too: the Nix expression language can be as flat or as deep as you need it to be, and you can create complex APIs with
functions and all that jazz. Basically, Nix is awesome!

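A rough Python sketch of those two ideas, assuming nothing about how Nix is actually implemented: the config is
rendered from a declaration, named by a hash of its content, and written out again on every start. The names
=render=, =identity= and =start= are hypothetical, and real Nix does considerably more than this.

#+BEGIN_SRC python
import hashlib
import tempfile
from pathlib import Path


def render(config):
    """Render {section: {key: value}} into INI-style text, in a fixed order."""
    lines = []
    for section in sorted(config):
        lines.append(f"[{section}]")
        for key in sorted(config[section]):
            lines.append(f"{key} = {config[section][key]}")
    return "\n".join(lines) + "\n"


def identity(name, config):
    """Name the rendered config by a hash of its content plus a readable name."""
    digest = hashlib.sha256(render(config).encode()).hexdigest()
    return f"{digest[:32]}-{name}"


def start(run_dir, config):
    """Always overwrite the live file from the declaration before using it."""
    path = Path(run_dir) / "app.ini"
    path.write_text(render(config))
    return path


old = {"database": {"DB_TYPE": "postgres", "HOST": "database-postgres"}}
new = {"database": {"DB_TYPE": "postgres", "HOST": "other-host"}}

with tempfile.TemporaryDirectory() as run_dir:
    path = start(run_dir, old)
    path.write_text("manual edit\n")  # someone tampers with the live file
    restored = start(run_dir, old).read_text()  # the next start discards the edit
#+END_SRC

Two declarations that differ in a single value get different identities, and the manual edit is gone after the next
start.
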
** Conclusion
The takeaway from this rant is that the best course of action would be to figure out how to completely replace
Docker and all the other container runtimes with something based on Nix. But since I'm a realist, I propose another
possible solution. We must get Nix to work nicely with Docker, so that instead of clumsy environment variables, you'd
write your configuration in the form of Nix expressions and build a new Docker image based on one common base. This
would ensure that all configuration is properly hashed and declarative, while allowing for much more complex config
files than environment variables or even templates.

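Nix expressions are the tool proposed here, but the underlying idea of structured, checked configuration can be
sketched in Python as a stand-in. The classes =Server=, =Database= and =Gitea= below are hypothetical and only mirror
the example config from earlier; they are not a real Gitea or Nix API.

#+BEGIN_SRC python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    domain: str
    app_data_path: str


@dataclass(frozen=True)
class Database:
    db_type: str
    host: str
    name: str


@dataclass(frozen=True)
class Gitea:
    server: Server
    database: Database


# The sections are nested values, so they cannot be interleaved by accident.
config = Gitea(
    server=Server(domain="gitea.redalder.org", app_data_path="/data/gitea"),
    database=Database(db_type="postgres", host="database-postgres", name="gitea"),
)

try:
    # A database key placed in the server section is rejected immediately.
    Server(domain="gitea.redalder.org", app_data_path="/data/gitea", host="oops")
    error = None
except TypeError as exc:
    error = str(exc)
#+END_SRC

Putting =host= into the =server= section fails at construction time, which is exactly the kind of yelling a flat list
of environment variables cannot do.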