Previously we had set up the test gerrit instance to use the same
hostname as production: review02.opendev.org. This causes some confusion
as we have to override settings specifically for testing like a reduced
heap size, but then also copy settings from the prod host vars as we
override the host vars entirely. Using a new hostname allows us to use a
different set of host vars with unique values reducing confusion.
Previously we had a test specific group vars file for the review Ansible
group. This provided junk secrets to our test installations of Gerrit
then we relied on the review02.opendev.org production host vars file to
set values that are public.
Unfortunately, this meant we were using the production heapLimit value
which is far too large for our test instances leading to the occasionaly
There is insufficient memory for the Java Runtime Environment to continue.
Native memory allocation (mmap) failed to map 9596567552 bytes for committing reserved memory.
We cannot set the heapLimit in the group var file because the hostvar
file overrides those values. To fix this we need to replace the test
specific group var contents with a test specific host var file instead.
To avoid repeating ourselves we also create a new review.yaml group_vars
file to capture common settings between testing and prod. Note we should
look at combining this new file with the gerrit.yaml group_vars.
On the testing side of things we set the heapLimit to 6GB, we change the
serverid value to prevent any unexpected notedb confusion, and we remove
While under development, the subdomain for the PTG site was
originally written as ptgbot.opendev.org and this is what was
communicated to event organizers. Mass communications subsequently
went out including this for URLs to the service. In order to make
the content from those announcements viable, add the additional name
to our configuration so we can redirect from it to the name we
eventually settled on.
While we're adjusting vhost metadata, make the ServerAdmin
directives between the HTTP and HTTPS vhosts for the service
Set the channel we want ptgbot joining in production with a group
var, like we do for statusbot's channel list. Correct the password
var name to match what's used in the template for production (and
matches the override set in our private hostvars on the bastion).
Clean up the unnecessary auth nicks list which was copied from the
statusbot config but is entirely unused. Also get rid of some
unnecessary empty lines in the defaults as they really don't make
the file any more readable.
We are seeing that replication tasks occasionally sit around forever and
have had to take manual intervention. One theory is that this is related
to networking between the gerrit server and the gitea servers. We don't
set maxRetries which means replication should be retried infinitely
which means if we hit the timeout we should try again. 15 minutes was
sort of arbitrarily chosen as ~twice the time it takes to clone a large
repo like nova.
This switch testing of lists.openstack.org to Focal and we make a CGI
env var update to accomodate newer mailman.
Specifically newer mailman's CGI scripts filter env vars that it will
pass through. We were setting MAILMAN_SITE_DIR to vhost our mailman
installs with apache2, but that doesn't pass the filter and is removed.
HOST is passed through so we update our scripts, apache vhost configs,
exim, and init scripts to use the HOST env var instead.
INAP mtl01 region is now owned by iWeb. This updates the cloud launcher
to use the new name and instructs the mirror in this cloud to provision
ssl certs for the old inap and new iweb names as well as updating
The Open Infrastructure Foundation's developers who maintain the
OpenStackID software are taking over management of the site itself,
and have deployed it on new servers. DNS records have already been
updated to the new IP address, so it's time to clean up our end in
preparation for deleting the old servers we've been running.
OpenStackID is still used by some services we run, like RefStack and
Zanata, and we're still hosting the OpenStackID Git repository and
documentation, so this does not get rid of all references to it.
We now depend on the reverse proxy not only for abuse mitigation but
also for serving .well-known files with specific CORS headers. To
reduce complexity and avoid traps in the future, make it non-optional.
This isn't necessary since it's hard-coded into the file. Let's
not add it where it isn't needed lest we confuse ourselves into
thinking it's necessary.
We merged change I9459e47ecfd19b27b7adcaee9ce91f80d51c124d which
should have opened this port but did not. Add testing for it.
Remove eavesdrop from webservers group
This was overridding the custom iptables ports that were being set
in the eavesdrop group vars file. There appears to be no other use
for the webservers group.
We are now using the mariadb jdbc connector in production and no longer
need to include the mysql legacy connector in our images. We also don't
need support for h2 or mysql as testing and prod are all using the
mariadb connector and local database.
Note this is a separate change to ensure everything is happy with the
mariadb connector before we remove the fallback mysql connector from our
Previously we were only managing root's known_hosts via ansible but even
then this wasn't happening because the gerrit_self_hostkey var wasn't
set anywhere. On top of that we need to manage multiple known_hosts
because gerrit must recognize itself and all of the gitea servers.
Update the code to take a dict of host key values and add each entry to
known_hosts for both the root and gerrit2 user.
We remove keyscans from tests to ensure that this update is actually
With our system-config-run gerrit/review jobs we have much less need
for a dedicated server to stage changes on. Remove in prepartion of
This moves review02 out of the review-staging group and into the main
review group. At this point, review01.openstack.org is inactive so we
can remove all references to openstack.org from the groups. We update
the system-config job to run against a focal production server, and
remove the unneeded rsync setup used to move data.
This additionally enables replication; this should be a no-op when
applied as part of the transition process is to manually apply this,
so that DNS setup can pull zone changes from opendev.org.
It also switches to the mysql connector, as noted inline we found some
issues with mariadb.
Note backups follow in a separate step to avoid doing too much at
once, hence dropping the backup group from the testing list.
The paste service needs an upgrade; since others have created a
lodgeit container it seems worth us keeping the service going if only
to maintain the historical corpus of pastes.
This adds the ansible to deploy lodgeit and a sibling mariadb
container. I have imported a dump of the old data as a test. The
dump is ~4gb and imported it takes up about double that; certainly
nothing we need to be too concerned over. The server will be more
than capable of running the db container alongside the lodgeit
This should have no effect on production until we decide to switch