Bundlewrap: First Impressions
I have spent quite some time with configuration management for my home infra setup, and I have recently come across a new tool that I’m excited to share with you. It’s called Bundlewrap, and it is a flexible, small-scale configuration management solution. Bundlewrap is written in Python 3, and is unique in the sense that the infrastructure configuration is also written in Python. I was first introduced to Bundlewrap by @kunsi in December 2020, who gave a brief presentation at a cozy conference.
My current setup uses Ansible and the oldest parts are about two years old, so it’s mostly written for Ansible 2.4 and later. Most comparisons in this post are therefore against Ansible.
Brief overview
Bundlewrap manages nodes via so-called bundles, which roughly correspond to Ansible roles. A bundle describes a desired state on the target nodes, comprising one or more items, which would be tasks in Ansible.
Nodes are defined in nodes.py. Each node has a separate dictionary called metadata associated with it, which holds all the node’s configuration data and can be read and written by the bundles. Machines can be grouped and group metadata can be applied to all members by specifying it in groups.py.
Let’s look at an example: the bundle ssh-server should install and configure an SSH server on our node test. To do so we have to specify the items we want to use, here pkg_apt, file and svc_systemd. An entire repository could look something like this:
.
├── bundles                 # All bundles are in the directory bundles/
│   └── ssh-server          # Our ssh-server bundle
│       ├── files           # File templates for the ssh-server bundle
│       │   └── sshd_config # SSH server configuration template (omitted)
│       ├── items.py        # Item definition file, Ansible: tasks/main.yml
│       └── metadata.py     # Metadata definition file, Ansible: defaults/main.yml
├── groups.py               # Group configuration
└── nodes.py                # Node configuration
# nodes.py
nodes = {
    'test': {
        'hostname': '198.51.100.1',
        'bundles': {'ssh-server'},
        'metadata': {'nodevar': 'string'},
    },
}

# groups.py
groups = {
    'group': {
        'members': {'test'},
        'metadata': {'groupvar': 5},
    },
}

# items.py for bundle ssh-server
pkg_apt = {
    'openssh-server': {'installed': True},
}
files = {
    '/etc/ssh/sshd_config': {
        'source': 'sshd_config',
        'triggers': {'svc_systemd:sshd:restart'},
    },
}
svc_systemd = {
    'sshd': {
        'enabled': True,
        'running': True,
        'needs': {'pkg_apt:openssh-server'},
    },
}

# metadata.py for bundle ssh-server
defaults = {
    'bundlevar': ['abc', 'def'],
}
Configuration is done via Python dictionaries, and since the files are literal Python code, you can embed arbitrary logic in these files. Bundlewrap is well-documented, and I encourage you to read the docs if you want to figure out what an item does.
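As a minimal sketch of what embedding logic can look like, the nodes dictionary from the example above could be built in a loop instead of being written out by hand. The node names web1 to web3, their addresses and the role key are made up for illustration; only the hostname, bundles and metadata keys are taken from the example.

```python
# nodes.py (sketch): build several similar nodes programmatically.
# The node names, addresses and the 'role' key are hypothetical.
nodes = {}

for index in range(1, 4):
    nodes[f'web{index}'] = {
        'hostname': f'198.51.100.{index}',
        'bundles': {'ssh-server'},
        'metadata': {'role': 'web'},
    }
```

Since the result is an ordinary dictionary, it does not matter whether it was written out literally or computed.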
Now that we have seen the general structure of bundles, let us come to the feature comparison.
The Good
Agentless, push-based, no Python required on guest
Just like Ansible, Bundlewrap is both agentless and push-based. Managed nodes are accessed via SSH, which is my preferred way. Contrary to Ansible though, Bundlewrap does not require Python to be installed on the managed hosts, and instead relies on common tools.
Privilege escalation must work noninteractively. Since I dislike entering passwords anyway, this is not a problem for me. The exact privilege escalation method used is configurable, so BSDs for example have the option to use doas instead of sudo (the default).
Bundlewrap does not do SSH multiplexing by default, but it is possible to pass arbitrary arguments to the underlying ssh invocation via an environment variable.
Automatic metadata merging, metadata generation
In the example above you might have noticed that we have defined metadata for both the node and the group it belongs to. Generally in Bundlewrap, non-collection metadata follows a strict hierarchy: node metadata overrides group metadata overrides bundle metadata. Collections are merged recursively, which is one of the best features Bundlewrap has. We can instruct Bundlewrap to display the metadata associated with test. The output is color-coded according to where the key comes from (group/node/bundle), which is very helpful.
% bw metadata test
{
    "bundlevar": [ # Colored blue = from bundle
        "abc",
        "def"
    ],
    "groupvar": 5, # Colored yellow = from group
    "nodevar": "string" # Colored red = from node
}
This is one of the features I miss most from Ansible. I have a ton of roles which would like to have their variables merged. One example is my Prometheus setup: My monitoring server has to know about every exporter that a node has installed in order to scrape all of them. Ideally I’d just have a list for each node which has (exporter, port) pairs and each exporter role appends a pair to this list, thus allowing the monitoring role to work independently of the available exporters. However, since Ansible does not allow appending to an existing variable, I am stuck hardcoding every possible exporter into the main prometheus role.
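The exporter scenario can be sketched in plain Python. This is a simplified model of layered merging, not Bundlewrap's actual implementation: the merge function, the prometheus key layout and the exporter bundles are assumptions made for illustration. Later layers override scalar values, while dicts, sets and lists are combined.

```python
from copy import deepcopy


def merge(base, update):
    """Return a new dict in which `update` is layered on top of `base`.

    Dicts are merged recursively, sets are unioned, lists are concatenated,
    and any other value in `update` overrides the one in `base`.
    """
    merged = deepcopy(base)
    for key, value in update.items():
        current = merged.get(key)
        if isinstance(current, dict) and isinstance(value, dict):
            merged[key] = merge(current, value)
        elif isinstance(current, set) and isinstance(value, set):
            merged[key] = current | value
        elif isinstance(current, list) and isinstance(value, list):
            merged[key] = current + value
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults contributed by two independent exporter bundles.
node_exporter_defaults = {'prometheus': {'exporters': {('node', 9100)}}}
postgres_exporter_defaults = {'prometheus': {'exporters': {('postgres', 9187)}}}
# Hypothetical node-level metadata that overrides a scalar default.
node_metadata = {'prometheus': {'scrape_interval': '30s'}}

# Start from a hypothetical monitoring default and apply the layers in order,
# lowest priority first.
metadata = {'prometheus': {'scrape_interval': '60s'}}
for layer in (node_exporter_defaults, postgres_exporter_defaults, node_metadata):
    metadata = merge(metadata, layer)

print(metadata)
```

Neither exporter bundle knows about the other, yet the monitoring side ends up with one combined set of (exporter, port) pairs.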
Bundlewrap also allows for generating new metadata from existing metadata, using a concept called metadata reactors. These are defined at the bundle level and are extremely powerful. You can, for example, ensure that every virtual host automatically also gets issued a letsencrypt certificate, while still separating the webhost and letsencrypt bundles.
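To make the idea concrete, here is a toy model of deriving metadata from existing metadata. It is a sketch in plain Python and does not use Bundlewrap's own API: the reactor decorator, the registry, the apply_reactors helper and the vhosts and letsencrypt keys are all hypothetical names chosen for this example.

```python
from copy import deepcopy

# Registry of derivation functions (hypothetical, for illustration only).
reactors = []


def reactor(func):
    """Register a function that maps metadata to additional metadata."""
    reactors.append(func)
    return func


@reactor
def certificates_for_vhosts(metadata):
    # Request one certificate per virtual host, without the web server
    # bundle needing to know anything about certificates.
    return {'letsencrypt': {'domains': set(metadata.get('vhosts', set()))}}


def apply_reactors(metadata):
    """Run all reactors repeatedly until the metadata stops changing.

    Each reactor is expected to return a dict of dicts; the inner dicts
    are written into the corresponding top-level key.
    """
    metadata = deepcopy(metadata)
    while True:
        previous = deepcopy(metadata)
        for func in reactors:
            for key, value in func(metadata).items():
                metadata.setdefault(key, {}).update(value)
        if metadata == previous:
            return metadata


result = apply_reactors({'vhosts': {'example.org', 'www.example.org'}})
print(result['letsencrypt']['domains'])
```

The point of the design is the separation: the web host side only declares virtual hosts, and the certificate side reacts to whatever it finds there.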
Secret derivation
Ansible has secrets, which allow you to store encrypted data and decrypt it with a static key. Bundlewrap can also do this, but additionally it allows you to generate secrets dynamically, which you can extract on demand. This is especially useful for automatic password generation for user accounts or when connecting a service to a DB user account: In both cases I don’t really care what the secret is, only that 1. it is a secret known only to the correct parties and 2. I can recover it if needed. Additionally, the secrets can easily be rotated by replacing the key used for secret derivation! Of course, now anyone in possession of the Bundlewrap master secret can derive all your passwords, so be sure to secure it well.
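The principle behind derived secrets can be sketched with a keyed hash. This is an illustration of the general technique only, not the algorithm Bundlewrap uses; the derive_password function, the identifier format and the key below are assumptions.

```python
import base64
import hashlib
import hmac


def derive_password(master_key: bytes, identifier: str, length: int = 24) -> str:
    """Derive a reproducible password for `identifier` from `master_key`."""
    digest = hmac.new(master_key, identifier.encode('utf-8'), hashlib.sha256).digest()
    # Encode the 32-byte digest as URL-safe text and truncate it.
    return base64.urlsafe_b64encode(digest).decode('ascii')[:length]


# Hypothetical key; the real one must be kept out of the repository.
master_key = b'not-a-real-key'

# The same identifier always yields the same password, so nothing has to be
# stored: the password can be recomputed whenever it is needed.
db_password = derive_password(master_key, 'postgres/user/grafana')

# Rotating the key changes every derived password at once.
rotated_password = derive_password(b'another-key', 'postgres/user/grafana')
```

This also shows why the master key is so sensitive: whoever holds it can recompute every password derived from it.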
Offline testing
This one’s huge: Bundlewrap supports sensible offline testing. Bundlewrap tests involve assembling all metadata for all nodes, checking that all items are well-formed, all templates instantiate without errors, and so on. This is a feature I sorely miss from Ansible. While Ansible has the --check parameter, it still simulates each step by connecting to the target node, which is really slow compared to local execution. Plus, you can run Bundlewrap tests as part of your CI pipeline (even works for secrets without the decryption/generation keys!).
% bw test
✓ No reactors violated their declared keys
✓ group has no subgroup loops
✓ test has no metadata conflicts
✓ test ssh-server file:/etc/ssh/sshd_config
✓ test ssh-server pkg_apt:openssh-server
✓ test ssh-server svc_systemd:sshd
✓ test ssh-server svc_systemd:sshd:restart
✓ test ssh-server svc_systemd:sshd:reload
Small core
Bundlewrap has an extremely small “standard library” of items, and prides itself on staying that way. Personally, I value scope-restriction a lot in projects, so this is a good thing. On the other hand it means that, more often than not, you have to write the code for new items yourself, e.g. support for a new package manager. Fortunately, the code is quite accessible, and the methods you need to implement are well-documented.
The Neutral
Python dicts
Python dictionaries look much more like JSON than YAML; in my opinion, however, this does not impact readability. Writing Python dicts is slightly more pleasant than raw JSON, since it allows the use of single quotes for string identifiers. Formatting is taken care of by any linter, which is nicer than YAML, where indentation cannot be automatically inferred. Of course, this is true for raw Python code as well.
Statistics and dependency graphs
This is undoubtedly a cool feature: Bundlewrap can output graphs (in graphviz format) visualizing the item/bundle dependencies on a node, or your repository’s group relationships. It also keeps track of statistics such as the number of items, nodes, groups, bundles and so on. These features don’t have a downside, but I also haven’t (yet) discovered clear upsides other than “ooh, shiny”.
The Bad
Python
I really, really, really dislike Python. Mainly because it is interpreted and dynamically typed, which means that most errors will occur at runtime when it is too late to fix stuff. Working with Bundlewrap snippets is even worse, since some variables are passed automagically, which confuses my poor language server (and mypy as well), so any possibility of static type checking is chucked right out of the window.
Bundlewrap’s excellent local testing feature alleviates this issue somewhat.
Turing-complete config language
Having all the flexibility and power of Python also means having more footguns available to shoot yourself with. Bundlewrap relies much more on the user for constraining the bundle complexity. Personally, I think that for small infrastructures (such as what I run at home) this is fine; however, I would be wary of this power for bigger deployments.
Conclusion
I’ve spent the last month thinking about and testing configuration management systems, and believe I have found a hidden gem in Bundlewrap. The only other notable tool I tried was cdist; however, it has its fair share of oddities, most notably being a 100% sh-based solution. Of course, this doesn’t mean that it’s not good for you! Go check it out if that premise excites you.
I hope you got a brief overview of Bundlewrap and its features. Personally, I think it’s a better solution for my use case than Ansible, and I’m going to slowly port my Ansible roles to Bundlewrap this year. If you’re interested and want to see more configurations/examples, check out my and (especially) Franziska’s repositories.
If you have any questions or comments, feel free to reach out to me via my public inbox or toot @bfiedler on Mastodon.