This runs tests on Ic47d998089c320e8e4ca371b8fb4b338c5fd903a. We'll
use this tested image to deploy with.
Change-Id: I6c500b26a0340a685573c22b748d37d32cb45e27
The ansible-galaxy CLI makes multiple calls to the remote server, against
various API endpoints, and expects JSON containing fully qualified URIs
(scheme://host/path), meaning we must inspect the different files and
ensure we're rewriting the content so that it always points to the
proxy.
Also, the remote galaxy.ansible.com issues some redirects with absolute
paths which, for some reason, break ProxyPassReverse - this is why we
add yet another pair of dedicated ports for this proxy (TLS/non-TLS).
Then there's the protocol issue: since mod_substitute is apparently
unable to use httpd variables such as REQUEST_SCHEME, we have to use an
If statement in order to ensure we're passing the correct scheme, either
http or https. Note that ansible-galaxy doesn't understand scheme-relative
"//host/path" URIs.
This patch also adds some more tests in order to ensure the API answers
as expected through the proxy.
Change-Id: Icf6f5c83554b51854fabde6e4cc2d646d120c0e9
When bringing up a new server, scan the SSH host keys of the remote IP and
add them automatically to the inventory output.
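A minimal sketch of what the scan amounts to (the helper name and key
types here are illustrative, not the exact implementation):

  import subprocess

  def scan_host_keys(ip: str) -> list[str]:
      """Return known_hosts-style lines for the remote IP via ssh-keyscan."""
      out = subprocess.run(
          ["ssh-keyscan", "-t", "ed25519,ecdsa,rsa", ip],
          capture_output=True, text=True, check=True,
      ).stdout
      # Each line is "<host> <key-type> <base64-key>", ready to be included
      # in the inventory output.
      return [line for line in out.splitlines() if line.strip()]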
cf. I4863425d5b784d0cdf118e1252414ca78fd24179
Change-Id: I2120fd476aa89e207ab76a1fc0faeeb5a0fb55ce
The "non-local" mode was added to this for the old Bionic based bridge
node, whose version of ssh-keyscan didn't have "-D", so we had to
actually log into the remote host to query its keys.
Now this runs on a Jammy node, we can remove this and just use the
remote probe. We don't have to worry about comaptability of this
tool, so I've just removed these bits.
Change-Id: Ie8254a965597db5695ff1613fc4ebf8cc26f3a25
On the old bridge node we had some unmanaged venvs with a very old,
now unmaintained RAX DNS API interaction tool.
Adding the RDNS entries is fairly straightforward, and this small
tool is mostly a copy of some of the bits from our DNS API backup tool.
It really just comes down to getting a token and making a POST request
with the name/IP addresses.
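As a rough illustration of that flow (the exact Rackspace endpoints and
payload shape are assumptions in this sketch, not taken from the tool
itself):

  import requests

  IDENTITY = "https://identity.api.rackspacecloud.com/v2.0/tokens"

  def get_token(username: str, api_key: str) -> tuple[str, str]:
      """Authenticate against the RAX identity service; return (token, tenant)."""
      body = {"auth": {"RAX-KSKEY:apiKeyCredentials":
                       {"username": username, "apiKey": api_key}}}
      resp = requests.post(IDENTITY, json=body)
      resp.raise_for_status()
      access = resp.json()["access"]
      return access["token"]["id"], access["token"]["tenant"]["id"]

  def add_ptr(token: str, tenant: str, server_href: str, name: str, ip: str) -> None:
      """POST a PTR record for the given name/IP (assumed rdns endpoint)."""
      url = f"https://dns.api.rackspacecloud.com/v1.0/{tenant}/rdns"
      payload = {
          "recordsList": {"records": [{"name": name, "type": "PTR", "data": ip}]},
          "link": {"href": server_href, "rel": "cloudServersOpenStack", "content": ""},
      }
      resp = requests.post(url, json=payload, headers={"X-Auth-Token": token})
      resp.raise_for_status()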
When the cloud the node is launched in is identified as RAX, this will
automatically add the PTR records for the IPv4 & IPv6 addresses. It also
has an entrypoint to be called manually.
This is added and hooked in, along with a config file for the
appropriate account (I have added these details on bridge).
I've left the update of openstack.org DNS entries as a manual
procedure. They could be set automatically with small updates to the
tool (just a different POST), but details like CNAMEs, etc. and the
relatively few servers we start in the RAX-managed DNS domains mean I
think it's easier to just do this manually via the web UI.
The output comment is updated.
Change-Id: I8a42afdd00be2595ca73819610757ce5d4435d0a
Due to the internap cloud being renamed to iweb and back again, the
history of the internap clouds.yaml profile is one of change.
Unfortunately, we need to talk to iweb specifically, but the internap
profile in the new openstacksdk talks to internap and things break.
Fix this by removing the use of the profile and setting the values
explicitly in our clouds.yaml files.
While this cloud is going away in about a month, making this change is
still worthwhile as it will allow us to use the new openstacksdk on
bridge and nodepool to talk to iweb in the meantime.
Change-Id: I9f6c414115190ec5d25e0654b4da9cd9b9cbb957
For some reason, this was in our original lists.openstack.org Exim
configuration when we first imported it to Puppet so many years ago.
Somehow it's survived and multiplied its way into other configs as
well. Time to finally let it go.
Change-Id: I23470c10ae0324954cb2afda929c86e7ad34663e
The version of python-cinderclient needs to be constrained to before
the point at which it dropped volume API v2 support (which happened
in 8.0.0). With this pinned back, the latest openstackclient can be
installed and used for Rackspace volume operations without issue.
Make sure we install a new enough openstacksdk so it doesn't try to
pass unsupported networking options in server create calls to
Rackspace too.
The script itself had a couple of issues once rehomed. The first was
that it was looking for Ansible playbooks relative to its former
path in the repository rather than its installed location in the
venv, so make that path configurable but have it default to the
absolute path of those playbooks on the bridge. Also, the script really
wanted to clear the Ansible cache, but when that path doesn't exist
(as is currently the case on the bridge) it aborts rather than
continuing, so wrap that call in a try/except.
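A minimal sketch of that fix (the cache path here is an assumption for
illustration):

  import shutil

  ANSIBLE_CACHE = "/var/cache/ansible/facts"  # illustrative path only

  def clear_ansible_cache(path: str = ANSIBLE_CACHE) -> None:
      """Remove the cache if present; a missing directory is not fatal."""
      try:
          shutil.rmtree(path)
      except FileNotFoundError:
          # Nothing cached yet (the current state on the bridge); carry on.
          pass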
While we're here, update our default server image from focal to
jammy.
Change-Id: I103c7799ebe319d2d8b3fb626d7804387d3e8a60
We need the infra-prod-bootstrap-bridge job to add SSH host keys
from our Ansible inventory to the /etc/ssh_known_hosts on the
bridge. When adding a new server to the inventory, any added host
keys should be deployed. Make sure this happens.
Change-Id: I422f80fc033cfe8e20d6d30b0fe23f82800c4cea
We've been running against the dev branch of acme.sh since the initial
commit of the letsencrypt work -- at the time I feel like there were
things we needed that weren't in a release. Anyway, there is now an
issue causing ECC certificates to be created and then fail to renew [1],
which we can't work around.
Pin this to the current release. It would probably be good to pin
this to the "latest" release to avoid us forgetting to ever bump this
and ending up with even harder-to-debug bit-rot.
[1] https://github.com/acmesh-official/acme.sh/issues/4416
Change-Id: I0d07ba1b5ab77e07c67ad990e7bc78a9f90005a4
Currently "openstack" command on bridge doesn't work, because we need
cinder client pinned to an older version for RAX support. The
upstream container uses the latest versions of everything and it fails
to parse the "volume_api_version: 2" pin for RAX in the config file.
In general, the version of openstackclient we can probably most likely
rely on to work is the one from the launch-node virtualenv. It also
means we just have one place to manage a broadly-compatible version,
instead of trying to manage versions in separate containers, etc.
This converts the /usr/local/bin/openstack command from calling into
the container, to calling into the launch venv.
Change-Id: I604d5c17268a8219d51d432ba21feeb2e752a693
Instead of pinning to an exact release, make this just accept anything
from the current version series. I think this is a good trade-off: we
don't have to bump every single time a point release comes out, but we
also don't jump too far and break production.
Change-Id: I4789fe99651597b073e35066ec3be312e18659b8
Just a few hours after Ifcb88f57a4e6b721eb87b47148ad133713af1e42 to
update to 6.6.0, Ansible 7 was released :)
This is proposed as a separate change just to facilitate quick
reversal to 6 if required.
Change-Id: Id3d8b660a5442c3033d8177a80921979244adbae
This was pinned to v2 in I6dddf93fb2c7b1a73315629e4a983a2d5a0142cc
some time ago.
I have tested with this removed and openstacksdk appears to figure it
out correctly. Removing it is one less small thing we need to think
about.
Change-Id: I85c3df2ebf6a424724a8e6beb0611924097be468
This will serve our new Mailman v3 mailing list sites once they're
migrated.
Change-Id: I0c7229eeffcb5896edadf697044cbd026037d903
Depends-On: https://review.opendev.org/865360
For some reason, something about running under Ansible 6 trips this up
and it fails with "unable to allocate a TTY", whereas the old version
didn't. Tell it not to allocate a TTY.
Change-Id: Iceb3686d6c00380f4ffba0be8a7af7abd10f8f8b
A change in Ansible 6, discussed at [1], highlights that Ansible uses
the #! line, with its own heuristics, to determine the interpreter to
run the file with, rather than the more common idea that the file is
interpreted by the kernel.
The best solution seems to be to have no interpreter line, which is
done here.
To avoid confusion this removes the executable bit; if you want to run
it manually, pass it to the python binary as an argument.
I couldn't find any other instances of this same behaviour in
system-config.
[1] https://github.com/ansible/ansible/issues/78809
[2] 9142be2f6c
[3] 9e22cfdb0f
Change-Id: I1d37e485a877982c72fa19acad907b682858c15b
These list constructions look wrong; in hindsight I'm not really sure
how they ever worked. Ansible 6 seems to barf on them. Make this one
evaluated statement.
Change-Id: I2a5d4926221f758501f95a8689e4304f814f405f
In Ansible 6 this doesn't come out as a list. Refactor this into a
more jinja-y pipeline that should do a better job of it.
Change-Id: I5684291047a3e1000cd38ba33a951bed9fa3081f
This looks wrong; in hindsight I'm not really sure how it ever worked.
Ansible 6 seems to barf on it. Make this one evaluated statement.
Change-Id: I7f73bf723af1086fc4473e76614ce30ca14f3d74
In what looks like a typo, we are overriding the bridge node for this
test to a Bionic host. Remove this. This was detected by testing an
upgraded Ansible, which wouldn't install on the older Python on
Bionic.
Change-Id: Ie3e754598c6da1812e74afa914f50d91972012cd
These images have a number of issues we've identified and worked
around. The current iteration of this change is essentially
identical to upstream, but with a minor tweak to allow the latest
Mailman version and with the hyperkitty and postorius URL paths
adjusted to match those in the upstream mailman-web codebase; it
doesn't try to address the other items. However, we should consider
moving our fixes from Ansible into the Docker images where possible
and upstreaming those updates.
Unfortunately upstream hasn't been super responsive so far, hence this
fork. For tracking purposes, here are the issues/PRs we've already filed
upstream:
https://github.com/maxking/docker-mailman/pull/552
https://github.com/maxking/docker-mailman/issues/548
https://github.com/maxking/docker-mailman/issues/549
https://github.com/maxking/docker-mailman/issues/550
Change-Id: I3314037d46c2ef2086a06dea0321d9f8cdd35c73
This turns launch-node into an installable package. This is not meant
for distribution; we just encapsulate the installation in a virtualenv
on the bastion host. Small updates to documentation and simple
testing are added (also remove some spaces to make test_bridge.py
consistent).
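For reference, the packaging amounts to little more than an entry point;
a minimal sketch follows (the package and module names are illustrative,
and the real package may well use setup.cfg/pbr instead):

  # setup.py -- illustrative sketch only
  from setuptools import setup, find_packages

  setup(
      name="opendev-launch",
      packages=find_packages(),
      install_requires=["openstacksdk"],
      entry_points={
          "console_scripts": [
              # Installs a launch-node command into the venv's bin/.
              "launch-node = opendev_launch.launch:main",
          ],
      },
  )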
Change-Id: Ibcb4774114d73600753ca155ed277d775964bc79
This is related to the work in
I0823c09165c445e9178c75ac5083f1988e8d3055 to deploy the host keys from
inventory to the bastion host.
As noted inline, there's really no reason this host should be
connecting anywhere that isn't in the inventory. So caching values
can only hide that we might have missed something there. Disable user
known_hosts globally.
Change-Id: I6d74df90db856cf7773698e3a06180986a531322
It looks like at some point the RAX bind output changed format
slightly, which messed up our backup script. Rework it to parse the
current output.
This parsing is obviously a little fragile ... it is nice to have the
output sorted and lined up nicely (like our manually maintained
opendev.org bind files...). If the format changes again and this
becomes a problem, maybe we switch to dumping the RAX output directly
and forget about formatting it nicely.
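For illustration, the reworked parsing and pretty-printing boils down to
something like this sketch (the exact field layout of the RAX output is
an assumption here):

  def format_zone(lines: list[str]) -> str:
      """Parse BIND-style record lines and re-emit them sorted and aligned."""
      records = []
      for line in lines:
          line = line.strip()
          if not line or line.startswith(";"):
              continue
          # Assumed layout: name ttl class type rdata...
          name, ttl, rclass, rtype, *rdata = line.split()
          records.append((name, ttl, rclass, rtype, " ".join(rdata)))
      if not records:
          return ""
      records.sort()
      widths = [max(len(r[i]) for r in records) for i in range(4)]
      return "\n".join(
          " ".join(rec[i].ljust(widths[i]) for i in range(4)) + " " + rec[4]
          for rec in records
      )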
Change-Id: I742dd6ef9ffdb377274b384b847625c98dd5ff16
Grab the make logs from the dkms directory. This is helpful if the
modules are failing to build.
The /var/lib/dkms directory contains all the source and object files,
etc., which seems unnecessary to store in general. Thus we just trim
this to the log directory.
Change-Id: I9b5abc9cf4cd59305470a04dda487dfdfd1b395a