This makes the haproxy role more generic so we can run another (or
potentially even more) haproxy instance(s) to manage other services.
The config file is moved to a variable for the haproxy role. The
gitea specific config is then installed for the gitea-lb service by a
new gitea-lb role.
statsd reporting is made optional with an argument. This
enables/disables the service in the docker compose.
Role documenation is updated.
Cloud images bake in an ubuntu/centos/admin user then prevent root
logins. Early on in our boot process we copy authorized keys to root
then logout and back in again as root and proceed from there. This means
it should be safe to remove these "helpful" user accounts that we don't
use. Clean them up as they can only cause problems.
Having two groups here was confusing. We seem to use the review group
for most ansible stuff so we prefer that one. We move contents of the
gerrit group_vars into the review group_vars and then clean up the use
of the old group vars file.
Previously we had set up the test gerrit instance to use the same
hostname as production: review02.opendev.org. This causes some confusion
as we have to override settings specifically for testing like a reduced
heap size, but then also copy settings from the prod host vars as we
override the host vars entirely. Using a new hostname allows us to use a
different set of host vars with unique values reducing confusion.
Previously we had a test specific group vars file for the review Ansible
group. This provided junk secrets to our test installations of Gerrit
then we relied on the review02.opendev.org production host vars file to
set values that are public.
Unfortunately, this meant we were using the production heapLimit value
which is far too large for our test instances leading to the occasionaly
There is insufficient memory for the Java Runtime Environment to continue.
Native memory allocation (mmap) failed to map 9596567552 bytes for committing reserved memory.
We cannot set the heapLimit in the group var file because the hostvar
file overrides those values. To fix this we need to replace the test
specific group var contents with a test specific host var file instead.
To avoid repeating ourselves we also create a new review.yaml group_vars
file to capture common settings between testing and prod. Note we should
look at combining this new file with the gerrit.yaml group_vars.
On the testing side of things we set the heapLimit to 6GB, we change the
serverid value to prevent any unexpected notedb confusion, and we remove
While under development, the subdomain for the PTG site was
originally written as ptgbot.opendev.org and this is what was
communicated to event organizers. Mass communications subsequently
went out including this for URLs to the service. In order to make
the content from those announcements viable, add the additional name
to our configuration so we can redirect from it to the name we
eventually settled on.
While we're adjusting vhost metadata, make the ServerAdmin
directives between the HTTP and HTTPS vhosts for the service
Set the channel we want ptgbot joining in production with a group
var, like we do for statusbot's channel list. Correct the password
var name to match what's used in the template for production (and
matches the override set in our private hostvars on the bastion).
Clean up the unnecessary auth nicks list which was copied from the
statusbot config but is entirely unused. Also get rid of some
unnecessary empty lines in the defaults as they really don't make
the file any more readable.
We are seeing that replication tasks occasionally sit around forever and
have had to take manual intervention. One theory is that this is related
to networking between the gerrit server and the gitea servers. We don't
set maxRetries which means replication should be retried infinitely
which means if we hit the timeout we should try again. 15 minutes was
sort of arbitrarily chosen as ~twice the time it takes to clone a large
repo like nova.
This switch testing of lists.openstack.org to Focal and we make a CGI
env var update to accomodate newer mailman.
Specifically newer mailman's CGI scripts filter env vars that it will
pass through. We were setting MAILMAN_SITE_DIR to vhost our mailman
installs with apache2, but that doesn't pass the filter and is removed.
HOST is passed through so we update our scripts, apache vhost configs,
exim, and init scripts to use the HOST env var instead.
INAP mtl01 region is now owned by iWeb. This updates the cloud launcher
to use the new name and instructs the mirror in this cloud to provision
ssl certs for the old inap and new iweb names as well as updating
The Open Infrastructure Foundation's developers who maintain the
OpenStackID software are taking over management of the site itself,
and have deployed it on new servers. DNS records have already been
updated to the new IP address, so it's time to clean up our end in
preparation for deleting the old servers we've been running.
OpenStackID is still used by some services we run, like RefStack and
Zanata, and we're still hosting the OpenStackID Git repository and
documentation, so this does not get rid of all references to it.
We now depend on the reverse proxy not only for abuse mitigation but
also for serving .well-known files with specific CORS headers. To
reduce complexity and avoid traps in the future, make it non-optional.
This isn't necessary since it's hard-coded into the file. Let's
not add it where it isn't needed lest we confuse ourselves into
thinking it's necessary.
We merged change I9459e47ecfd19b27b7adcaee9ce91f80d51c124d which
should have opened this port but did not. Add testing for it.
Remove eavesdrop from webservers group
This was overridding the custom iptables ports that were being set
in the eavesdrop group vars file. There appears to be no other use
for the webservers group.
We are now using the mariadb jdbc connector in production and no longer
need to include the mysql legacy connector in our images. We also don't
need support for h2 or mysql as testing and prod are all using the
mariadb connector and local database.
Note this is a separate change to ensure everything is happy with the
mariadb connector before we remove the fallback mysql connector from our
Previously we were only managing root's known_hosts via ansible but even
then this wasn't happening because the gerrit_self_hostkey var wasn't
set anywhere. On top of that we need to manage multiple known_hosts
because gerrit must recognize itself and all of the gitea servers.
Update the code to take a dict of host key values and add each entry to
known_hosts for both the root and gerrit2 user.
We remove keyscans from tests to ensure that this update is actually
With our system-config-run gerrit/review jobs we have much less need
for a dedicated server to stage changes on. Remove in prepartion of
This moves review02 out of the review-staging group and into the main
review group. At this point, review01.openstack.org is inactive so we
can remove all references to openstack.org from the groups. We update
the system-config job to run against a focal production server, and
remove the unneeded rsync setup used to move data.
This additionally enables replication; this should be a no-op when
applied as part of the transition process is to manually apply this,
so that DNS setup can pull zone changes from opendev.org.
It also switches to the mysql connector, as noted inline we found some
issues with mariadb.
Note backups follow in a separate step to avoid doing too much at
once, hence dropping the backup group from the testing list.