kerberos-kdc: role to manage Kerberos KDC servers

This adds a role and related testing to manage our Kerberos KDC
servers, intended to replace the puppet modules currently performing
this task.

This role automates realm creation, initial setup, key material
distribution and replica host configuration.  None of this is intended
to run on the production servers, which are already set up with an
active database, and the role should be effectively idempotent in

Note that this does not yet switch the production servers into the new
groups; this can be done in a separate step under controlled
conditions and with related upgrades of the host OS to Focal.

Change-Id: I60b40897486b29beafc76025790c501b5055313d
Ian Wienand 2 years ago
parent 6df7767200
commit c1aff2ed38

@ -15,9 +15,8 @@ At a Glance
* kdc*
* :git_file:`modules/openstack_project/manifests/kdc.pp`
* :git_file:`playbooks/service-kerberos.yaml`
@ -30,50 +29,59 @@ At a Glance
OpenStack Realm
OpenStack runs a Kerberos ``Realm`` called ``OPENSTACK.ORG``.
The realm contains a ``Key Distribution Center`` or KDC which is spread
across a master and a slave, as well as an admin server which only runs on the
master. Most of the configuration is in puppet, but initial setup and
the management of user accounts, known as ``principals``, are manual tasks.
OpenStack runs a Kerberos ``Realm`` called ``OPENSTACK.ORG``. The
realm contains a ``Key Distribution Center`` or KDC which is spread
across a primary and a replica, as well as an admin server which only
runs on the primary.
Most of the configuration is in Ansible, but management of user
accounts, known as ``principals``, is a manual task for
Realm Creation
On the first KDC host, the admin needs to run `krb5_newrealm` by hand. Then
admin principals and host principles need to be set up.
Set up host principals for slave propagation::
# execute kadmin.local then run these commands
addprinc -randkey host/
addprinc -randkey host/
ktadd host/
ktadd host/
Realm creation is exercised by the Ansible roles during testing, but
is not expected to be used in production (because we have an active
Copy the file `/etc/krb5.keytab` to the second kdc host.
The general process is:
The puppet config sets up slave propagation scripts and cron jobs to run them.
* create the new Kerberos database on the primary
* distribute the database ``stash`` file from the primary to
replicas, to allow them to decrypt the database propagated to
them. The stash file is created from a master key kept as a secret.
* create an admin user (password saved in file on primary server)
* add host principals for the primary and replica servers
* create keytabs on primary and replica servers (via the admin user),
which allows them to authenticate to each other.
* set up database propagation from primary to replicas with ``kprop``
(primary-side push) and ``kpropd`` (replica-side listen).
You will also need to create a stash file after creating a new realm. Run
`krb5_util stash` on the first kdc host. Copy the file `/etc/krb5kdc/stash`
to all other KDC servers for the krb5-kdc daemons to run.
In a disaster recovery situation, we can provision a fresh realm and
recover principals from dump files (XXX: 2020-03-11 ianw -- dump file
backup to come).
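The realm creation steps listed above can be sketched as a shell dry run. This is only an illustration under stated assumptions, not the role's implementation: the function name ``plan_realm_setup`` and the ``example.org`` hostnames are hypothetical, the commands are printed rather than executed, and ``kadmin.local -q`` is used where the role pipes commands on stdin.

```shell
#!/bin/sh
# Print (do not run) the commands for bootstrapping a fresh realm.
# Usage: plan_realm_setup REALM PRIMARY REPLICA...
# The function name and argument layout are hypothetical.
plan_realm_setup() {
    realm=$1; primary=$2; shift 2

    # 1. Create the database and its stash file on the primary.
    echo "kdb5_util create -r $realm -s"

    # 2. The stash file must reach every replica by some secure channel.
    echo "# copy /etc/krb5kdc/stash to each replica"

    # 3. Create the admin principal (kadmin.local prompts for a password).
    echo "kadmin.local -q 'addprinc admin/admin@$realm'"

    # 4. Host principals for the primary and each replica.
    for host in "$primary" "$@"; do
        echo "kadmin.local -q 'addprinc -randkey host/$host'"
    done

    # 5. Keytab for the primary; replicas create theirs via kadmin.
    echo "kadmin.local -q 'ktadd host/$primary'"

    # 6. Dump the database and push it to each replica with kprop.
    echo "kdb5_util dump /var/krb5kdc/slave_datatrans"
    for host in "$@"; do
        echo "kprop -f /var/krb5kdc/slave_datatrans $host"
    done
}

plan_realm_setup OPENDEV.CI kdc-primary.example.org kdc-replica.example.org
```

Printing the plan first makes it easy to review the ordering before anything touches a real database.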
.. _addprinc:
Adding A User Principal
First, ensure the user has an entry in puppet so they have a unix
First, ensure the user has an entry in Ansible so they have a Unix
shell account on our hosts. SSH access is not necessary, but keeping
track of usernames and uids with account entries is necessary.
Then, add the user to Kerberos using kadmin (while authenticated as a
kerberos admin) or kadmin.local on the kdc::
If you are already an admin, you should authenticate with ``kinit
<username>/admin``. Otherwise you can use the ``kadmin.local`` tool
(instead of ``kadmin``) on the primary server, which bypasses
authentication and writes to the database directly.
Use ``kadmin`` to add the principal like so:
kadmin: addprinc $USERNAME@OPENSTACK.ORG
Where `$USERNAME` is the lower-case username of their unix account in
puppet. `OPENSTACK.ORG` should be capitalized.
Ansible. `OPENSTACK.ORG` should be capitalized.
If you are adding an admin principal, use
`username/admin@OPENSTACK.ORG`. Admins should additionally have
@ -87,11 +95,11 @@ than a person. There is no difference in their implementation, only
in conventions around how they are created and used. Service
principals are created without passwords and keytab files are used
instead for authentication. The program `k5start` can use keytab
files to automatically obtain kerberos credentials (and AFS if
files to automatically obtain Kerberos credentials (and AFS if
Add the service principal to Kerberos using kadmin (while
authenticated as a kerberos admin) or kadmin.local on the kdc::
authenticated as a Kerberos admin) or kadmin.local on the kdc::
kadmin: addprinc -randkey service/$NAME@OPENSTACK.ORG
@ -105,6 +113,10 @@ Then save the principal's keytab::
.. warning:: Each time ``ktadd`` is run, the key is rotated and
previous keytabs are invalidated.
These keytabs are then usually converted to base-64 and stored as
secret variables, and deployed to hosts via Ansible.
``mirror-update`` is probably a good example.
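As a rough illustration of the base-64 step described above, the following sketch encodes a keytab to a single line of text and decodes it again. The function names ``encode_keytab`` and ``install_keytab`` are hypothetical helpers, not part of the repository, and the demonstration uses a stand-in file rather than a real keytab.

```shell
#!/bin/sh
# Encode a keytab as single-line base64 text suitable for a secret variable.
# Usage: encode_keytab KEYTAB_FILE
encode_keytab() {
    base64 < "$1" | tr -d '\n'
}

# Decode the secret back into a file readable only by its owner.
# Usage: install_keytab BASE64_TEXT DEST
install_keytab() {
    ( umask 077; printf '%s' "$1" | base64 -d > "$2" )
}

# Demonstration with a stand-in file; a real keytab would come from ktadd.
tmp=$(mktemp -d)
printf 'not-a-real-keytab' > "$tmp/service.keytab"
secret=$(encode_keytab "$tmp/service.keytab")
install_keytab "$secret" "$tmp/restored.keytab"
cmp -s "$tmp/service.keytab" "$tmp/restored.keytab" && echo "round trip ok"
rm -rf "$tmp"
```

Stripping newlines keeps the encoded value on one line, which is convenient for a YAML scalar; the subshell confines the restrictive ``umask`` to the decode step.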
Resetting A User Principal's Password
@ -117,12 +129,12 @@ twice as prompted. If you need to reset your admin principal, use
No Service Outage Server Maintenance
Should you need perform maintenance on the kerberos server that requires
taking kerberos processes offline you can do this by performing your
Should you need to perform maintenance on the Kerberos server that requires
taking Kerberos processes offline, you can do this by performing your
updates on a single server at a time.
`` is our primary server and ``
is the hot standby. Perform your maintenance on ``
is the replica. Perform your maintenance on ``
first. Then once that is done we can prepare for taking down the
primary. On `` run::
@ -132,7 +144,7 @@ You should see::
Database propagation to SUCCEEDED
Once this is done the standby server is ready and we can take kdc03
Once this is done the replica is ready and we can take kdc03
offline. When kdc03 is back online rerun `` to ensure
everything is working again.

@ -0,0 +1,9 @@
- 88
- 464
- 749
- 754
- 88
- 464
- 749

@ -0,0 +1,27 @@
Configure a Kerberos KDC server
All KDC servers (primary and replicas) should be in a common
``kerberos-kdc`` group that defines ``kerberos_kdc_realm`` and
The ``kerberos-kdc-primary`` group should have a single primary KDC
host. It will be configured to replicate its database to hosts in
the ``kerberos-kdc-replica`` group.
Hosts in the ``kerberos-kdc-replica`` group will be configured to
receive updates from the ``kerberos-kdc-primary`` host.
The role should be run twice; once limited to the primary group and
then a second time limited to the replica group.
**Role Variables**
.. zuul:rolevar:: kerberos_kdc_realm
The realm for all KDC servers.
.. zuul:rolevar:: kerberos_kdc_master_key
The master key written into the *stash* file for each KDC, which
allows them to open the database without prompting for the key.

@ -0,0 +1,6 @@
# This file is the access control list for krb5 administration.
# When this file is edited run /etc/init.d/krb5-admin-server restart to activate
# One common way to set up Kerberos administration is to give any principal
# ending in /admin full administrative rights.
# To enable this, uncomment the following line:
*/admin *

@ -0,0 +1,14 @@
Description=Kerberos 5 replica KDC update server
ExecReload=/bin/kill -HUP $MAINPID
ExecStart=/usr/sbin/kpropd -D $DAEMON_ARGS
InaccessibleDirectories=/etc/ssh /etc/ssl/private /root
ReadWriteDirectories=/var/tmp /tmp /var/lib/krb5kdc /var/run /run

@ -0,0 +1,32 @@
- name: Install packages
- krb5-kdc
state: present
- name: Ensure directories
path: '{{ item }}'
state: directory
mode: 0755
owner: root
group: root
- /etc/krb5kdc
- /var/krb5kdc
- name: Install KDC config
src: 'kdc.conf.j2'
dest: '/etc/krb5kdc/kdc.conf'
mode: 0644
owner: root
group: root
- name: Copy kadm5.acl
src: kadm5.acl
dest: '/etc/krb5kdc/kadm5.acl'
mode: 0644
owner: root
group: root

@ -0,0 +1,94 @@
- name: Install packages
- krb5-admin-server
state: present
# Note the following is not really for production, where we already
# have a database set up. It is exercised by testing, however.
- name: Look for primary database
path: /var/lib/krb5kdc/principal
register: _db_created
- name: Setup clean primary
when: not _db_created.stat.exists
- name: Setup primary db
shell: |
yes {{ kerberos_kdc_master_key }} | kdb5_util create -r {{ kerberos_kdc_realm }} -s
- name: Generate and save admin principal password
dest: '/etc/krb5kdc/admin.passwd'
content: '{{ lookup("password", "/dev/null chars=ascii_letters,digits length=12") }}'
owner: root
group: root
mode: '0600'
- name: Setup initial admin principal
shell: |
echo "addprinc -pw $(cat /etc/krb5kdc/admin.passwd) admin/admin@{{ kerberos_kdc_realm }}" | kadmin.local
# It is not strictly necessary to have the primary KDC server in
# the Kerberos database, but it can be handy if you want to be
# able to swap the primary KDC with one of the replicas.
- name: Create primary host principal and keytab
cmd: |
echo "addprinc -randkey host/{{ inventory_hostname }}" | kadmin.local
echo "ktadd host/{{ inventory_hostname }}" | kadmin.local
- name: Create replica host principals
cmd: 'echo "addprinc -randkey host/{{ item }}" | kadmin.local'
with_inventory_hostnames: kerberos-kdc-replica
# The stash file is used to decrypt the on-disk database. Without
# this you are prompted for the master password on daemon start. This
# needs to be distributed to the replicas so they can also open the
# database.
- name: Read and save stash file
src: '/etc/krb5kdc/stash'
register: kerberos_kdc_stash_file_contents
# Export this so replica servers can use this variable to authenticate
# and create keytabs for their host principals, if they need to.
- name: Read in admin/admin password
src: "/etc/krb5kdc/admin.passwd"
register: _admin_password
- name: Export admin password
kerberos_kdc_admin_password: '{{ _admin_password.content | b64decode }}'
# kprop is what pushes the db to replicas. Set it up to run via cron
# periodically.
- name: Install kprop script
src: ''
dest: '/usr/local/bin/'
mode: 0755
owner: root
group: root
- name: kprop cron to push db to replicas
name: kprop
minute: 15
job: '/usr/local/bin/ >/dev/null 2>&1'
- name: start krb5-admin-server
state: started
enabled: yes
name: krb5-admin-server
- name: start krb5-kdc
state: started
enabled: yes
name: krb5-kdc
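The "Look for primary database" check followed by ``when: not _db_created.stat.exists`` is what keeps the database creation from re-running on a host that already has one. A minimal shell sketch of the same guard, assuming a hypothetical helper named ``run_unless_exists`` and using ``touch`` as a stand-in for ``kdb5_util create``:

```shell
#!/bin/sh
# Run a command only if a marker file is absent.
# Usage: run_unless_exists MARKER COMMAND [ARGS...]
# The helper name is hypothetical; it is not part of the role.
run_unless_exists() {
    marker=$1; shift
    if [ -e "$marker" ]; then
        echo "skipped: $marker already exists"
        return 0
    fi
    "$@"
}

# Demonstration against a temporary directory rather than
# /var/lib/krb5kdc; 'touch' stands in for 'kdb5_util create'.
demo=$(mktemp -d)
run_unless_exists "$demo/principal" touch "$demo/principal"   # creates it
run_unless_exists "$demo/principal" touch "$demo/principal"   # skipped
rm -rf "$demo"
```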

@ -0,0 +1,64 @@
- name: Install packages
- krb5-kdc
- krb5-kpropd
state: present
# This is the key to decrypt the database pushed by the primary
- name: Install stash file from primary
cmd: 'echo "{{ hostvars[groups["kerberos-kdc-primary"][0]]["kerberos_kdc_stash_file_contents"].content }}" | base64 -d > /etc/krb5kdc/stash'
creates: '/etc/krb5kdc/stash'
- name: Ensure stash file permissions
path: /etc/krb5kdc/stash
owner: root
group: root
mode: '0600'
# Use the admin user to write out our host keytab
- name: Create host keytab
cmd: |
echo "ktadd host/{{ inventory_hostname }}" | kadmin -p admin/admin -w '{{ hostvars[groups["kerberos-kdc-primary"][0]]["kerberos_kdc_admin_password"] }}'
creates: '/etc/krb5.keytab'
# This specifies servers that are allowed to send us updates;
# i.e. the primary server
- name: Install kpropd ACL
src: 'kpropd.acl.j2'
dest: '/etc/krb5kdc/kpropd.acl'
mode: 0644
owner: root
group: root
- name: Install kpropd service
src: krb5-kpropd.service
dest: /etc/systemd/system/krb5-kpropd.service
mode: 0644
owner: root
group: root
register: _kpropd_service_installed
- name: Reload systemd
daemon_reload: yes
when: _kpropd_service_installed.changed
- name: Ensure kpropd running
state: started
name: krb5-kpropd
enabled: yes
# Note we can't start until replicas are distributed; the main
# service-kerberos.yaml playbook handles this.
- name: Ensure krb5-kdc is enabled
name: krb5-kdc
enabled: yes
masked: no

@ -0,0 +1,16 @@
kdc_ports = 750,88
{{ kerberos_kdc_realm }} = {
database_name = /var/lib/krb5kdc/principal
admin_keytab = FILE:/etc/krb5kdc/kadm5.keytab
acl_file = /etc/krb5kdc/kadm5.acl
key_stash_file = /etc/krb5kdc/stash
kdc_ports = 750,88
max_life = 10h 0m 0s
max_renewable_life = 7d 0h 0m 0s
master_key_type = aes256-cts
supported_enctypes = aes256-cts:normal
default_principal_flags = +preauth

@ -0,0 +1,3 @@
{% for kdc in groups["kerberos-kdc-primary"] %}
host/{{ kdc }}@{{ kerberos_kdc_realm }}
{% endfor %}
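For illustration, the rendered ``kpropd.acl`` should contain one ``host/`` principal per primary. The sketch below reproduces the template's loop in plain shell; the function name ``render_kpropd_acl`` and the ``example.org`` hostname are hypothetical.

```shell
#!/bin/sh
# Emit one kpropd.acl entry per primary host, as the template's loop does.
# Usage: render_kpropd_acl REALM HOST...
render_kpropd_acl() {
    realm=$1; shift
    for kdc in "$@"; do
        printf 'host/%s@%s\n' "$kdc" "$realm"
    done
}

render_kpropd_acl OPENDEV.CI kdc-primary.example.org
# prints: host/kdc-primary.example.org@OPENDEV.CI
```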

@ -0,0 +1,7 @@
kdclist="{% for s in groups['kerberos-kdc-replica'] %}{{ s }} {% endfor %}"
kdb5_util dump /var/krb5kdc/slave_datatrans
for kdc in $kdclist; do
    kprop -f /var/krb5kdc/slave_datatrans $kdc
done
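Because the cron job discards the script's output, a failed push to one replica would go unnoticed. One possible variant, sketched here as an assumption rather than what the commit ships, wraps the same ``kdb5_util dump`` and ``kprop -f`` calls in a hypothetical ``propagate_db`` function that keeps going after a failure and returns non-zero if any replica could not be updated.

```shell
#!/bin/sh
# Dump the database once, then push it to every replica, reporting
# failures instead of stopping at the first one.
# Usage: propagate_db DUMPFILE REPLICA...
# The function name is hypothetical; it is not part of the role.
propagate_db() {
    dump=$1; shift

    # Without a fresh dump there is nothing to propagate.
    kdb5_util dump "$dump" || return 1

    failed=0
    for kdc in "$@"; do
        if ! kprop -f "$dump" "$kdc"; then
            echo "propagation to $kdc failed" >&2
            failed=$((failed + 1))
        fi
    done

    # Succeed only if every replica accepted the database.
    [ "$failed" -eq 0 ]
}
```

It could be called as ``propagate_db /var/krb5kdc/slave_datatrans $kdclist``, with the exit status available to whatever monitors the cron job.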

@ -0,0 +1,47 @@
# Setting up a fresh realm, as done in CI, is a five step process of:
# 1. setup common packages/config
# 2. setup primary; create db, setup kprop pushes, start services.
# 3. configure replica to accept db updates via kpropd
# 4. do a db replication
# 5. start replica daemons now they have a db copy
# In production this is largely a no-op just ensuring things are
# running.
- hosts: "kerberos-kdc:!disabled"
name: "Configure common KDC components"
- kerberos-client
- kerberos-kdc
- hosts: "kerberos-kdc-primary:!disabled"
name: "Configure Kerberos Primary"
- name: Configure primary KDC
name: kerberos-kdc
tasks_from: primary
- hosts: "kerberos-kdc-replica:!disabled"
name: "Configure Kerberos Replicas"
- name: Configure replica KDC
name: kerberos-kdc
tasks_from: replica
- hosts: "kerberos-kdc-primary:!disabled"
name: "Run replication"
- name: Run a DB replication
shell: |
- hosts: "kerberos-kdc-replica:!disabled"
name: "Ensure krb5-kdc running"
- name: Start krb5-kdc
name: krb5-kdc
state: started

@ -0,0 +1,7 @@
- hosts: ""
- name: Run kinit
shell: |
cat /etc/krb5kdc/admin.passwd | kinit admin/admin

@ -58,6 +58,7 @@
- group_vars/registry.yaml
- group_vars/gitea.yaml
- group_vars/gitea-lb.yaml
- group_vars/kerberos-kdc.yaml
- group_vars/letsencrypt.yaml
- group_vars/meetpad.yaml
- group_vars/jvb.yaml

@ -27,3 +27,11 @@ groups:

@ -0,0 +1,10 @@
# global server settings
kerberos_kdc_realm: OPENDEV.CI
kerberos_kdc_master_key: masterkey123
# client settings
kerberos_realm: OPENDEV.CI

@ -593,6 +593,23 @@
- modules/
- manifests/
- job:
name: infra-prod-service-kerberos
parent: infra-prod-service-base
description: Run Kerberos playbook.
playbook_name: service-kerberos.yaml
infra_prod_ansible_forks: 1
- opendev/system-config
- inventory/
- playbooks/service-kerberos.yaml
- inventory/service/group_vars/kerberos-kdc.yaml
- playbooks/roles/kerberos-kdc/
- roles/kerberos-client/
- playbooks/roles/iptables/
- job:
name: infra-prod-remote-puppet-else
parent: infra-prod-service-base

@ -25,6 +25,7 @@
- name: opendev-buildset-registry
- name: system-config-build-image-hound
soft: true
- system-config-run-kerberos
- system-config-run-lists
- system-config-run-nodepool
- system-config-run-meetpad:
@ -131,6 +132,7 @@
- name: opendev-buildset-registry
- name: system-config-upload-image-hound
soft: true
- system-config-run-kerberos
- system-config-run-lists
- system-config-run-nodepool
- system-config-run-meetpad:
@ -253,6 +255,7 @@
soft: true
- infra-prod-service-bridge
- infra-prod-service-gitea-lb
- infra-prod-service-kerberos
- infra-prod-service-nameserver
- infra-prod-service-nodepool
- infra-prod-service-codesearch:
@ -320,6 +323,7 @@
- infra-prod-service-nameserver
- infra-prod-service-etherpad
- infra-prod-service-meetpad
- infra-prod-service-kerberos
- infra-prod-service-mirror-update
- infra-prod-service-mirror
- infra-prod-service-static

@ -919,3 +919,35 @@
- testinfra/
# If we rebuild the image, we want to run this job as well.
- docker/refstack/.*
- job:
name: system-config-run-kerberos
parent: system-config-run
ansible-version: 2.9
description: |
Run the playbook for kerberos servers
timeout: 3600
- name:
label: ubuntu-bionic
- name:
label: ubuntu-focal
- name:
label: ubuntu-focal
'/etc/krb5kdc/': logs
'/var/krb5kdc/': logs
'/etc/krb5kdc/': logs
'/var/krb5kdc/': logs
- playbooks/service-kerberos.yaml
run_test_playbook: playbooks/test-kerberos.yaml
- playbooks/bridge.yaml
- playbooks/roles/kerberos-kdc/