# Manual steps required to set up machines
Workers must be added to the firewall configuration before they will be able
to connect to the Jenkins master. You must be part of the infra group and
have set up the SSH keys and config file beforehand. To add an entry, do the
following:

- ssh to the CI master:

  ```
  ssh ci
  ```

- save the current config to a temporary file:

  ```
  iptables-save > foo
  ```

- edit the temporary file with your favorite editor. Use one of the existing
  lines as a template and add a new entry at the end of the list of hosts,
  just before the second `COMMIT` line near the end of the file.
- restore the config from the temporary file:

  ```
  iptables-restore foo
  ```

- remove the temporary file:

  ```
  rm foo
  ```

- run

  ```
  iptables-save > /etc/iptables/rules.v4
  ```

  to ensure the changes persist across reboots.
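For reference, the saved file is in iptables-save format; a new worker entry might look something like the following (the address here is a documentation placeholder — copy one of the real file's existing lines as your template rather than this sketch):

```
-A INPUT -s 203.0.113.42/32 -p tcp -j ACCEPT
```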
Release machines must be able to upload release artifacts to the nodejs.org web server. The release-builder Ansible role will write the necessary key and ssh config onto the release machine, automating the previously manual steps.
### Manual steps
Once set up, they must have `~iojs/.ssh` cloned from another machine, so they
have the SSH setup and keys required to upload release artifacts to the
nodejs.org web server. The result will be two files: an `id_rsa` containing
a private key, and a `config` containing:

```
Host node-www
  HostName direct.nodejs.org
  User staging
  IdentityFile ~/.ssh/id_rsa
```

Both the `config` file and `id_rsa` should be owned by, and only readable by,
the user: `chmod 700 .ssh && chmod 600 .ssh/*`.

It's necessary to accept the `known_hosts` keys interactively on first ssh or
the release builds will fail. After setting up `.ssh`, do something like this:

```
ssh node-www date
# ... accept the host keys
```
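The permission bits above can be rehearsed against a scratch directory before touching the real keys (a throwaway sketch in plain shell; `stat -c` assumes GNU coreutils):

```shell
dir=$(mktemp -d)                  # scratch stand-in for the iojs home directory
mkdir "$dir/.ssh"
touch "$dir/.ssh/id_rsa" "$dir/.ssh/config"
chmod 700 "$dir/.ssh" && chmod 600 "$dir/.ssh"/*   # the commands from the text
stat -c '%a %n' "$dir/.ssh" "$dir/.ssh"/*          # 700 for the dir, 600 for the files
rm -rf "$dir"
```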
In the case of Docker container release hosts, the SSH configuration above works
differently since the ~iojs home directories are elsewhere on the host
machine. The Docker containers are started with /home/iojs inside the
container mounted from /home/iojs/name-of-container/ on the host machine.
Therefore, the above SSH configuration should take place in
/home/iojs/name-of-container/.ssh/, with permissions set appropriately.
known_hosts can be primed and SSH tested from within the running containers:
- Find the running container ID using `docker ps`.
- Enter the container using `docker exec -ti <containerid> bash` (the options
  must come before the container ID).
- Run `ssh node-www date` (as above).
Our AIX Ansible bootstrap role attempts to resize filesystems to be large enough to install packages from the AIX Toolbox and to hold workspaces for CI builds. The server instances will need to have enough disk space available to satisfy the requested filesystem sizes. On IBM Cloud, for example, this involves having a second disk added in addition to the default one (e.g. a 20Gb standard disk and an additional 100Gb one).

If not enough space is available the Jenkins worker create playbook will fail
with an `allocp` error, e.g.
```
TASK [bootstrap : Set size of /tmp to 1G] ******************************************************************************************************************
task path: /home/rlau/sandbox/github/build/ansible/roles/bootstrap/tasks/partials/aix.yml:25
redirecting (type: modules) ansible.builtin.aix_filesystem to community.general.aix_filesystem
fatal: [test-ibm-aix73-ppc64_be-1]: FAILED! => {"changed": false, "msg": "Failed to run chfs. Error message: 0516-404 allocp: This system cannot fulfill the allocation request.\n\tThere are not enough free partitions or not enough physical volumes \n\tto keep strictness and satisfy allocation requests. The command\n\tshould be retried with different allocation characteristics.\n"}
```

On AIX, disks are organised into physical volumes (the disks) and logical
volumes. Disks can be added to a volume group (where AIX will then manage the
storage for the group). Useful commands are:
`lspv` will list the physical volumes attached to the server, e.g.

```
hdisk0          00c8d470fdbc3b5e                    rootvg          active
hdisk1          00f6db0a6c7aece5                    rootvg          active
```

shows two disks attached to one volume group named `rootvg`. If this shows one
of the disks as `None` this indicates that the disk has not been included in
the volume group:

```
hdisk0          00fa00d6b552f41b                    rootvg          active
hdisk1          none                                None
```

`lspv <disk name>` will show further information about disks.
```
# lspv hdisk0
PHYSICAL VOLUME:    hdisk0                   VOLUME GROUP:     rootvg
PV IDENTIFIER:      00fa00d6b552f41b VG IDENTIFIER     00fa00d600004c000000017d43623707
PV STATE:           active
STALE PARTITIONS:   0                        ALLOCATABLE:      yes
PP SIZE:            32 megabyte(s)           LOGICAL VOLUMES:  13
TOTAL PPs:          639 (20448 megabytes)    VG DESCRIPTORS:   2
FREE PPs:           4 (128 megabytes)        HOT SPARE:        no
USED PPs:           635 (20320 megabytes)    MAX REQUEST:      512 kilobytes
FREE DISTRIBUTION:  00..00..00..00..04
USED DISTRIBUTION:  128..128..127..128..124
MIRROR POOL:        None
# lspv hdisk1
0516-1396 : The physical volume hdisk1, was not found in the
system database.
#
```

To add a disk to a volume group:

```
extendvg rootvg hdisk1
```

This may fail:
```
0516-1254 extendvg: Changing the PVID in the ODM.
0516-1162 extendvg: The Physical Partition Size of 32 requires the creation of
        3200 partitions for hdisk1.  The limitation for volume group rootvg is
        1016 physical partitions per physical volume.  Use chvg command with -t
        option to attempt to change the maximum Physical Partitions per Physical
        volume for this volume group.
0516-792 extendvg: Unable to extend volume group.
```

Information about a volume group can be obtained via `lsvg`, e.g.
```
# lsvg rootvg
VOLUME GROUP:       rootvg                   VG IDENTIFIER:  00fa00d600004c000000017d43623707
VG STATE:           active                   PP SIZE:        32 megabyte(s)
VG PERMISSION:      read/write               TOTAL PPs:      639 (20448 megabytes)
MAX LVs:            256                      FREE PPs:       4 (128 megabytes)
LVs:                13                       USED PPs:       635 (20320 megabytes)
OPEN LVs:           12                       QUORUM:         2 (Enabled)
TOTAL PVs:          1                        VG DESCRIPTORS: 2
STALE PVs:          0                        STALE PPs:      0
ACTIVE PVs:         1                        AUTO ON:        yes
MAX PPs per VG:     32512
MAX PPs per PV:     1016                     MAX PVs:        32
LTG size (Dynamic): 512 kilobyte(s)          AUTO SYNC:      no
HOT SPARE:          no                       BB POLICY:      relocatable
PV RESTRICTION:     none                     INFINITE RETRY: no
DISK BLOCK SIZE:    512                      CRITICAL VG:    no
FS SYNC OPTION:     no                       CRITICAL PVs:   no
ENCRYPTION:         yes
#
```

where the earlier error is referring to `MAX PPs per PV`. This can be changed,
as indicated by the error message, with `chvg -t`.

Note that `-t` takes a scaling factor which is multiplied by 1016. For our
100Gb disks we use a factor of 16, which then allows the `extendvg` command
to succeed:
```
# chvg -t 16 rootvg
0516-1164 chvg: Volume group rootvg changed.  With given characteristics rootvg
        can include up to 2 physical volumes with 16256 physical partitions each.
# extendvg rootvg hdisk1
#
```

After extending the rootvg volume group you may still run into an error with
the playbook:

```
TASK [bootstrap : Set size of /home to 50G] ****************************************************************************************************************
ok: [test-ibm-aix73-ppc64_be-1] => {"changed": false, "msg": "0516-787 extendlv: Maximum allocation for logical volume hd1\n\tis 512.\n"}
```

We can use `lslv <logical volume name>` to view the volume:
```
# lslv hd1
LOGICAL VOLUME:     hd1                    VOLUME GROUP:   rootvg
LV IDENTIFIER:      00fa00d600004c000000017d43623707.8 PERMISSION:     read/write
VG STATE:           active/complete        LV STATE:       opened/syncd
TYPE:               jfs2                   WRITE VERIFY:   off
MAX LPs:            512                    PP SIZE:        32 megabyte(s)
COPIES:             1                      SCHED POLICY:   parallel
LPs:                1                      PPs:            1
STALE PPs:          0                      BB POLICY:      relocatable
INTER-POLICY:       minimum                RELOCATABLE:    yes
INTRA-POLICY:       center                 UPPER BOUND:    32
MOUNT POINT:        /home                  LABEL:          /home
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes
Serialize IO ?:     NO
INFINITE RETRY:     no                     PREFERRED READ: 0
ENCRYPTION:         no
#
```

and use `chlv -x` to increase the maximum logical partitions (`MAX LPs`). For
our 100Gb disks we use 6000 (to match what we have for our AIX 7.1 IBM Cloud
instances):

```
# chlv -x 6000 hd1
# lslv hd1
LOGICAL VOLUME:     hd1                    VOLUME GROUP:   rootvg
LV IDENTIFIER:      00fa00d600004c000000017d43623707.8 PERMISSION:     read/write
VG STATE:           active/complete        LV STATE:       opened/syncd
TYPE:               jfs2                   WRITE VERIFY:   off
MAX LPs:            6000                   PP SIZE:        32 megabyte(s)
COPIES:             1                      SCHED POLICY:   parallel
LPs:                1                      PPs:            1
STALE PPs:          0                      BB POLICY:      relocatable
INTER-POLICY:       minimum                RELOCATABLE:    yes
INTRA-POLICY:       center                 UPPER BOUND:    32
MOUNT POINT:        /home                  LABEL:          /home
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes
Serialize IO ?:     NO
INFINITE RETRY:     no                     PREFERRED READ: 0
ENCRYPTION:         no
#
```
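The numbers in these errors line up with the `chvg -t` factor chosen above; the arithmetic can be checked with plain shell (the 32MB PP size and 100Gb disk are the values from this example):

```shell
pp_size_mb=32                                 # PP SIZE reported by lspv/lsvg
disk_mb=$((100 * 1024))                       # the additional 100Gb disk
echo "PPs needed: $((disk_mb / pp_size_mb))"  # 3200, over the default 1016 per-PV limit
echo "limit with chvg -t 16: $((16 * 1016))"  # 16256 PPs per PV
```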
On AIX OpenSSL is not available as an rpm via yum/dnf and is instead an installp fileset that must be manually downloaded and installed.
The following instructions are based on https://www.ibm.com/support/pages/downloading-and-installing-or-upgrading-openssl-and-openssh.
Go to <https://www.ibm.com/resources/mrs/assets?source=aixbp&S_PKG=openssl>
and pick the most recent OpenSSL release (each package should contain
compatibility libraries for older versions). Download/copy the `.tar.Z`
package (the URL will be temporary) onto the machine into a temporary
directory, e.g. `/tmp/openssl`:

```
curl -sL -o openssl-3.0.16.1000.tar.Z https://iwm.dhe.ibm.com/.../openssl-3.0.16.1000.tar.Z
```

Then unpack the compressed archive:

```
zcat openssl-3.0.16.1000.tar.Z | tar -xvf -
```

and install:

```
installp -qaXFY -d . openssl.base openssl.license openssl.man.en_US
```

To see a list of installed packages, run:

```
lslpp -L all
```

Some libuv/Node.js tests currently fail on AIX with a network interface
containing a link local address. This is being tracked in
nodejs/node#46792. In the meantime the en1
interface containing the link local address is removed.
```
sudo ifconfig en1 down detach
```

Use `ifconfig -a` to list the available interfaces. To add back the `en1`
interface, run:

```
sudo autoconf6 -i en1
```
Java 17 requires XL C/C++ Runtime 16.1, available from
<https://www.ibm.com/support/pages/fix-list-xl-cc-runtime-aix#161X>.

Once downloaded, unpack the files with `zcat`:

```
zcat 16.1.0.9-IBM-xlCrte-AIX-FP009.tar.Z | tar -xvf -
```

and then install with `installp`:

```
installp -aXYgd . -e /tmp/install.log all
```

Use `lslpp -l xlC\*` to view the currently installed version:

```
# lslpp -l xlC\*
  Fileset                      Level  State      Description
  ----------------------------------------------------------------------------
Path: /usr/lib/objrepos
  xlC.aix61.rte             13.1.3.3  COMMITTED  IBM XL C++ Runtime for AIX 6.1
                                                 and later
  xlC.cpp                    9.0.0.0  COMMITTED  C for AIX Preprocessor
  xlC.rte                   13.1.3.3  COMMITTED  IBM XL C++ Runtime for AIX
  xlC.sup.aix50.rte          9.0.0.1  COMMITTED  XL C/C++ Runtime for AIX 5.2
#
```
Most packages should be installed via Ansible. If there are any missing they
should be installed via yum. What you do need to install manually is ccache
(and the gcc toolchain it wraps):

```
mkdir -p /opt/gcc-6.3 && cd /opt/gcc-6.3
curl -L https://ci.nodejs.org/downloads/aix/gcc-6.3-aix7.2.ppc.tar.gz | /opt/freeware/bin/tar -xzf -
mkdir -p /opt/ccache-3.7.4 && cd /opt/ccache-3.7.4
curl -L https://ci.nodejs.org/downloads/aix/ccache-3.7.4.aix7.2.ppc.tar.gz | /opt/freeware/bin/tar -xzf -
```

On AIX 7 and 6.1, AHAFS is needed for the file watcher unit tests.
Add the following to `/etc/filesystems`:

```
/aha:
        dev       = /aha
        vfs       = ahafs
        mount     = true
        vol       = /aha
```

and then:

```
mkdir /aha
mount /aha
```

- Download 16.1.0 packages from: <https://testcase.boulder.ibm.com>
  (username: `xlcomp4`, password: ask @mhdawson)
- scp them to target:/opt/ibm-xlc
- on target:
  ```
  cd /opt/ibm-xlc
  uncompress 16.1.0.3-IBM-xlCcmp-AIX-FP003.tar.Z
  tar -xvf 16.1.0.3-IBM-xlCcmp-AIX-FP003.tar
  uncompress IBM_XL_C_CPP_V16.1.0.0_AIX.tar.Z
  tar -xvf IBM_XL_C_CPP_V16.1.0.0_AIX.tar
  installp -aXYgd ./usr/sys/inst.images -e /tmp/install.log all
  inutoc
  installp -aXgd ./ -e /tmp/install.log all
  ```

- Find compilers in `/opt/IBM/xl[cC]/16.1.0/bin/`
- download gcc-c++ (with dependencies) from bullfreeware.com
- `scp 15412gcc-c++-6.3.0-1.aix7.2.ppc.rpm-with-deps.zip TARGET:/ramdisk0`
  (note: `/` is too small)
- `unzip 15412gcc-c++-6.3.0-1.aix7.2.ppc.rpm-with-deps.zip`
- the bundle contained the wrong libstdc++ (9.1), so the bundle for
  libstdc++ 6.3.0-1 was downloaded as well
- unpack the RPMs:

  ```
  $ for f in *gcc* *stdc*; do rpm2cpio $f | /opt/freeware/bin/cpio_64 -idmv; done
  ```

- Find absolute symlinks, and make them relative, for example:

  ```
  $ find . -type l | xargs file
  ./opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/ppc64/libatomic.a: symbolic link to /opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/libatomic.a.
  ./opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/ppc64/libgcc_s.a: symbolic link to /opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/libgcc_s.a.
  ./opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/ppc64/libstdc++.a: symbolic link to /opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/libstdc++.a.
  ./opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/ppc64/libsupc++.a: symbolic link to /opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/libsupc++.a.
  ./opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/pthread/ppc64/libatomic.a: symbolic link to /opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/pthread/libatomic.a.
  ./opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/pthread/ppc64/libgcc_s.a: symbolic link to /opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/pthread/libgcc_s.a.
  ./opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/pthread/ppc64/libstdc++.a: symbolic link to /opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/pthread/libstdc++.a.
  ./opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/pthread/ppc64/libsupc++.a: symbolic link to /opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/pthread/libsupc++.a.
  bash-5.0# pwd
  /ramdisk0/aixtoolbox/opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/ppc64
  bash-5.0# ln -fs ../libatomic.a ../libgcc_s.a ../libstdc++.a ../libsupc++.a ./
  bash-5.0# find . -type l | xargs file
  ./ppc64/libatomic.a: archive (big format)
  ./ppc64/libgcc_s.a: archive (big format)
  ./ppc64/libstdc++.a: archive (big format)
  ./ppc64/libsupc++.a: archive (big format)
  ./pthread/ppc64/libatomic.a: symbolic link to /opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/pthread/libatomic.a.
  ./pthread/ppc64/libgcc_s.a: symbolic link to /opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/pthread/libgcc_s.a.
  ./pthread/ppc64/libstdc++.a: symbolic link to /opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/pthread/libstdc++.a.
  ./pthread/ppc64/libsupc++.a: symbolic link to /opt/freeware/lib/gcc/powerpc-ibm-aix7.2.0.0/6.3.0/pthread/libsupc++.a.
  bash-5.0# cd pthread/ppc64/
  bash-5.0# ln -fs ../libatomic.a ../libgcc_s.a ../libstdc++.a ../libsupc++.a ./
  bash-5.0# file *.a
  libatomic.a: archive (big format)
  libgcc.a: archive (big format)
  libgcc_eh.a: archive (big format)
  libgcc_s.a: archive (big format)
  libgcov.a: archive (big format)
  libstdc++.a: archive (big format)
  libsupc++.a: archive (big format)
  ```

- Move to target location and create a tarball with no assumptions on leading
  path prefix:

  ```
  $ mkdir /opt/gcc-6.3
  $ cd /opt/gcc-6.3
  $ mv .../opt/freeware/* ./
  $ tar -cvf ../gcc-6.3-aix7.2.ppc.tar *
  ```
The example above was for 6.3.0, but the process for 4.8.5 is identical,
other than the version numbers (search bullfreeware for the 4.8.5 gcc
packages).
The clang frontend will be automatically installed via the Ansible playbook
from: <https://github.com/IBM/llvm-project/releases>

The clang backend requires manually installing the XL runtime and XL
utilities.
runtime:

- Download the current `*.tar.Z` from
  <https://www.ibm.com/support/pages/fix-list-xl-cc-runtime-aix>
- scp the tar onto the target - it helps to put both tarfiles into a single
  folder and scp the folder
- On the target:

  ```
  uncompress IBM_OPEN_XL_CPP_RUNTIME_17.1.4.1_AIX.tar.Z
  tar -xf IBM_OPEN_XL_CPP_RUNTIME_17.1.4.1_AIX.tar
  installp -aFXYd . ALL
  ```

utilities:

- Download the current `*.tar.Z` from
  <https://www.ibm.com/support/pages/ibm-open-xl-cc-utilities-aix-1713>
- scp the tar onto the target
- On the target:

  ```
  uncompress IBM_OPEN_XL_CPP_UTILITIES_17.1.3.0_AIX.tar.Z
  tar -xf IBM_OPEN_XL_CPP_UTILITIES_17.1.3.0_AIX.tar
  inutoc .
  installp -aFXYd . ALL
  ```

After installing these packages we will need to update dnf:

```
/usr/sbin/updtvpkg
```

Notes:

- AIX tar doesn't know about the `z` switch, so use GNU tar.
- Build tools create 32-bit binaries by default, so explicitly create 64-bit
  ones.
```
$ curl -L -O https://github.com/ccache/ccache/releases/download/v3.7.4/ccache-3.7.4.tar.gz
$ /opt/freeware/bin/tar -xzf ccache-3.7.4.tar.gz
$ cd ccache-3.7.4
$ ./configure CC="gcc -maix64" && gmake
$ mkdir -p /opt/ccache-3.7.4/libexec /opt/ccache-3.7.4/bin
$ cp ccache /opt/ccache-3.7.4/bin
$ cd /opt/ccache-3.7.4/libexec
$ ln -s ../bin/ccache c++
$ ln -s ../bin/ccache cpp
$ ln -s ../bin/ccache g++
$ ln -s ../bin/ccache gcc
$ ln -s ../bin/ccache gcov
$ cd /opt/ccache-3.7.4
$ tar -cf /opt/ccache-3.7.4.aix7.2.ppc.tar.gz *
```
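The `libexec` symlinks work because ccache decides which compiler to masquerade as from the name it was invoked under (argv[0]). A throwaway demonstration of that mechanism with a stand-in script (nothing here touches the real ccache):

```shell
dir=$(mktemp -d)
mkdir "$dir/bin" "$dir/libexec"
# stand-in for the ccache binary: reports the name it was invoked as
printf '#!/bin/sh\necho "invoked as $(basename "$0")"\n' > "$dir/bin/ccache"
chmod +x "$dir/bin/ccache"
ln -s ../bin/ccache "$dir/libexec/gcc"
ln -s ../bin/ccache "$dir/libexec/g++"
"$dir/libexec/gcc"   # prints: invoked as gcc
"$dir/libexec/g++"   # prints: invoked as g++
rm -rf "$dir"
```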
In order to get Windows machines to a state where Ansible can be run against them, some manual steps need to be taken so that Ansible can connect.
Machines should have:
- Remote Desktop (RDP) enabled, the port should be listed with the access credentials if it is not the default (3389).
- PowerShell access enabled, the port should be listed with the access credentials if it is not the default (5986).
Install the pywinrm pip module: pip install pywinrm
The preparation script needs to be run in PowerShell (run as Administrator):

```
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
Invoke-WebRequest "https://raw.githubusercontent.com/ansible/ansible-documentation/devel/examples/scripts/ConfigureRemotingForAnsible.ps1" -OutFile "ConfigureRemotingForAnsible.ps1"
.\ConfigureRemotingForAnsible.ps1 -ForceNewSSLCert true -CertValidityDays 3650
```

After creating new machines, the update-windows.yml playbook should be run to:

- Make sure the unencrypted WinRM endpoint is deleted on every machine. Check
  with:

  ```
  ansible -f 50 'test-*-win*' -m win_shell -a 'winrm enumerate winrm/config/listener'
  ```

  The HTTP endpoint should not appear. Only the HTTPS endpoint should be
  present.
- On Rackspace hosts, make sure to change the ports, username, and password as
  described in the playbook.
On Azure, changing the ports is done in the Load Balancer configuration using the Azure Portal. The username and password are set during the creation of the VM in the Azure Portal.
Test the connection to the target machine with ansible HOST -m win_ping -vvvv. If there is any issue, please refer to the official Ansible documentation in Setting up a Windows Host.
The hosts labelled jenkins-workspace are used to "execute" the coordination of Jenkins jobs. Jenkins uses them to do the initial Git work to figure out what needs to be done before farming off to the actual test machines. These machines are lower powered but have large disks so they can waste space with the numerous Git repositories Jenkins will create in this process. The use of these hosts takes a load off the Jenkins master and prevents the Jenkins master from filling up its disk with Git repositories.
Note that not all jobs can use jenkins-workspace servers for execution, some are tied to other hosts.
The jenkins-workspace hosts are set up as standard Node.js nodes but are only given the jenkins-workspace label.
The benchmark machines are set up so they can run preinstalled tooling against the Node.js codebase and submit the results to Coverity Scan.
The playbook should download and install the Coverity build tool needed for static analysis into /var/. The extracted build tool should end up in a directory similar to /var/cov-analysis-linux64-2023.6.2. This directory must match the PATH setting in the node-daily-coverity job. According to Synopsys the tool is usually updated twice yearly -- if it is updated the directory will change and the following steps should be done:

- Run the playbook on all benchmark machines so that they have the same
  version of the Coverity build tool installed.
- Update the node-daily-coverity job so that the `PATH` it sets contains the
  new directory name.
The hosts that run Docker images for "sharedlibs", Alpine Linux and a few other dedicated systems (hosts identified by grep _docker-x64- inventory.yml) don't have Docker image reload logic built in to Ansible. Changes to Docker images (adding, deleting, modifying) involve some manual preparation.
The general steps are:

1. Stop the concerned Jenkins systemd service(s)
   (`sudo systemctl stop jenkins-test-$INSTANCE`)
2. Disable the concerned Jenkins systemd service(s)
   (`sudo systemctl disable jenkins-test-$INSTANCE`)
3. Remove the Jenkins systemd service configuration
   (`rm /lib/systemd/system/jenkins-test-$INSTANCE.service`)
4. `systemctl daemon-reload` to reload systemd configuration from disk
5. `systemctl reset-failed` to remove the disabled and removed systemd
   service(s)
6. Clean up unnecessary Docker images (`docker system prune -fa` to clean
   everything up, or just `docker rmi` for the images that are no longer
   needed and a lighter `docker system prune` after that to clean non-tagged
   images).
Steps 3-5 may not be strictly necessary in the case of a simple modification as the existing configurations will be reused or rewritten by Ansible anyway.
To completely clean the Jenkins and Docker setup on a Docker host to start from scratch, either re-image the server or run the following commands:
```
systemctl list-units -t service --plain --all jenkins* | grep jenkins-test | awk '{print $1}' | xargs -l sudo systemctl stop
systemctl list-units -t service --plain --all jenkins* | grep jenkins-test | awk '{print $1}' | xargs -l sudo systemctl disable
sudo rm /lib/systemd/system/jenkins-test-*
sudo systemctl daemon-reload
sudo systemctl reset-failed
sudo docker system prune -fa
```

To do this across multiple hosts, it can be executed with parallel-ssh like
so:
```
parallel-ssh -i -h /tmp/docker-hosts 'systemctl list-units -t service --plain --all jenkins* | grep jenkins-test | awk '\''{print $1}'\'' | xargs -l sudo systemctl stop'
parallel-ssh -i -h /tmp/docker-hosts 'systemctl list-units -t service --plain --all jenkins* | grep jenkins-test | awk '\''{print $1}'\'' | xargs -l sudo systemctl disable'
parallel-ssh -i -h /tmp/docker-hosts 'sudo rm /lib/systemd/system/jenkins-test-*'
parallel-ssh -i -h /tmp/docker-hosts 'sudo systemctl daemon-reload'
parallel-ssh -i -h /tmp/docker-hosts 'sudo systemctl reset-failed'
parallel-ssh -i -h /tmp/docker-hosts 'sudo docker system prune -fa'
```

Note that while this is being done across all Docker hosts, you should
disable node-test-commit-linux-containered to avoid a queue and delays of
jobs. The Alpine Linux hosts under node-test-commit-linux will also be
impacted and may need to be manually cancelled if there is considerable
delay. Leaving one or more Docker hosts active while reloading others will
alleviate the need to do this.
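The `grep`/`awk` filter used in those one-liners can be exercised offline on canned output before running it against live hosts (the unit names below are fabricated for illustration):

```shell
# fabricated `systemctl list-units` style output
printf '%s\n' \
  'jenkins-test-host_docker-x64-1.service loaded active running' \
  'jenkins-test-host_docker-x64-2.service loaded active running' \
  'sshd.service loaded active running' |
  grep jenkins-test | awk '{print $1}'
# prints only the two jenkins-test-*.service unit names
```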
The SmartOS machines are hosted by MNX.io. They are individual machines with the actual Jenkins worker housed inside a VM on the host. To provision a new SmartOS Jenkins host for testing we must do the following manual steps to prepare:
- Provision the host at MNX.io. The credentials for the mnx.io account are located in the admin logins file in the secrets repo.
- Configuration of the host environment
- Create the VM environment
- Configure the VM environment
- Create the Jenkins nodes/Open the jenkins firewall
- Ansible the VM
The host environment that houses the virtual machines currently relies on some older system libraries in order for compilation to succeed on SmartOS VMs. MNX.io has provided a base platform that has these older libs that we can use to make the workers.
- Login to MNX.io.
- Select "Compute" and then "Custom Images" from the left sidebar
- Click on "Create Instance" on the `smartos-retro-20220407T001427Z` row.
- Then click on the "Compute" option and pick c1.xlarge-ojsf (16GB RAM,
  4 vCPUs, 200GB disk), and then "Next"
- Name the instance `test-mnx-smartosXX-x64-Y` where XX is the version of
  SmartOS you plan on provisioning, and Y is an incremented number of similar
  instances
- Select MNX-Triton-Public (public) as the Network
- Add a tag of "role" = "test"
- Click Launch.
Instances should launch and be ready relatively quickly (less than 5 minutes)
Once the instance is up you should be able to find its host IP address (choose Instances from the left column).
Then, ssh to get into the host hypervisor: ssh root@<IP ADDRESS> -i ~/.ssh/nodejs_build_test (the nodejs_build_test key should be on the machine)
The older images are still configured to point at a defunct Joyent image
archive. We have to point to the new MNX one:

- `mkdir /var/imgadm`
- `vi /var/imgadm/imgadm.conf`
- Put the following JSON into the contents:
```
{
  "dockerImportSkipUuids": true,
  "upgradedToVer": "3.0.0",
  "source": "https://images.mnx.io",
  "sources": [
    {
      "type": "imgapi",
      "url": "https://images.smartos.org"
    }
  ]
}
```
- Find the UUID of the image for the version of SmartOS you want to run on
  the VM:

  ```
  imgadm avail name=base-64-lts
  UUID                                  NAME         VERSION  OS       TYPE          PUB
  [... older versions omitted ...]
  1d05e788-5409-11eb-b12f-037bd7fee4ee  base-64-lts  20.4.0   smartos  zone-dataset  2021-01-11
  c8715b60-7e98-11ec-82d1-03d16599f529  base-64-lts  21.4.0   smartos  zone-dataset  2022-01-26
  85d0f826-0131-11ed-973d-2bfeef68011c  base-64-lts  21.4.1   smartos  zone-dataset  2022-07-11
  93bdf06a-01ef-11ed-81ff-bf0efad842c7  base-64-lts  20.4.1   smartos  zone-dataset  2022-07-12
  e44ed3e0-910b-11ed-a5d4-00151714048c  base-64-lts  22.4.0   smartos  zone-dataset  2023-01-10
  8adac45a-aca7-11ee-b53e-00151714048c  base-64-lts  23.4.0   smartos  zone-dataset  2024-01-06
  ```
- Import the image. For smartos22:

  ```
  imgadm import e44ed3e0-910b-11ed-a5d4-00151714048c
  ```

  For smartos23:

  ```
  imgadm import 8adac45a-aca7-11ee-b53e-00151714048c
  ```

- Create a new image_properties.json to define the VM we're creating:
```
{
  "brand": "joyent",
  "resolvers": [
    "8.8.8.8",
    "8.8.4.4"
  ],
  "ram": 15360,
  "alias": "os1",
  "customer_metadata": {
    "root_authorized_keys": "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDQ+xhL3A6vERZoUUoQspAuydcEghmjiC0m8yETHqghnPr2Y5nfFjnzmNB8EPM5+/jsSwPBF0jUPFpAgWYYXAGQZ62hsovfVMcwqXlMYJyA+L/uGDX7KxtLhn6FcftxJgyHwbggC1kXuzmrtzX/4oHSQi9mth3sOuf4KxS2pE0nNqIy4lIEyfkutuIZa3dhKTYLVCklNCH+UjYBtVgIjvqBBoEKNcBNO4fhLM5MCS6/MpbkhTkTJBN/kvJYBfZ9xAh+2/gQc0ndtK+rJsOJaQ1yFnIHoJBTRZN3qS5O8aE7Tdtxh5G1B+S2BNujdPestpBHPGONsVtFgdMkRFfmp3lVCwoBOpNHL5zzkGHNuq2ViphxRsX5A9SgL4MP/5P3xKgxWbzYYEl/0ef612d8fQQmiMitX7STM5R/9aau1d3PUV9ZJLgNgIWLCjdifGgDJeO32fISGK/mHkCvR5zjhtd7wUBPKspOxbVNCj2C5mHGKhT3PlQNt1drwbLJUYndXCQN0EJuIeM81jcKGO6J51kOHco/NM3vQBGdn24efbj/R1gPtZsGmhc6ho/xUNEoSXY7sg6Ma++iYJ+nJqgUvvVPzpJxurcwABTjwqjFj9NZrUQtVchnONsfrpcEnZhOhUZSxhQquaWwpYfWw/slMe/B7BGRiRLY5payqlZkxWCUJw== [email protected]",
    "user-script" : "/usr/sbin/mdata-get root_authorized_keys > ~root/.ssh/authorized_keys ; /usr/sbin/mdata-get root_authorized_keys > ~admin/.ssh/authorized_keys"
  },
  "nics": [
    {
      "interface": "net0",
      "nic_tag": "vswitch0",
      "gateway": "172.16.9.1",
      "gateways": [
        "172.16.9.1"
      ],
      "netmask": "255.255.255.0",
      "ip": "172.16.9.3",
      "ips": [
        "172.16.9.3/24"
      ],
      "primary": true
    }
  ],
  "image_uuid": "8adac45a-aca7-11ee-b53e-00151714048c",
  "quota": 160
}
```
- Create the VM with `vmadm create -f image_properties.json`
- ssh to the internal VM, proxying through the host:

  ```
  ssh [email protected] -oProxyCommand="ssh [email protected] -i ~/.ssh/nodejs_build_test -W %h:%p" -o StrictHostKeyChecking=no -i ~/.ssh/nodejs_build_test
  ```

  Note that 192.207.255.126 is the IP address assigned to the instance at
  MNX.
- Install `htop` and `python` (use `pkgin search` to find the latest version
  of Python to install based on the SmartOS version: `python311` on
  smartos22, `python312` on smartos23), e.g. `pkgin install python311`
- smartos22 extra steps: `pkg_alternatives manual python311`,
  `pkgin install py311-expat-3.11.1nb1`, `pkgin install openjdk17-17.0.9`,
  `pkg_alternatives manual openjdk17-17.0.9`
- Install pip: `python -m ensurepip --upgrade` and
  `python -m pip install packaging`
- Add the machine to the inventory with the ProxyCommand:

  ```
  smartos23-x64-4:
    ip: 172.16.9.3
    ansible_ssh_common_args: '-o ProxyCommand="ssh -i ~/.ssh/nodejs_build_test -W %h:%p [email protected]"'
    ansible_user: root
  ```

- Add the node to Jenkins
- Add the Jenkins secret for the new node to the secrets repo
- Provision with Ansible:
  `ansible-playbook ansible/playbooks/jenkins/worker/create.yml --limit "<HOSTNAME_TO_PROVISION>" -vv`
- Ensure the host can connect by modifying iptables on the Jenkins CI
  director.
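Both imgadm.conf and image_properties.json must be valid JSON, and a syntax slip is easy to make when hand-editing in vi. A quick way to check a file before use (a generic sketch; `python3` and the scratch path are assumptions about the machine doing the editing, not part of the SmartOS setup itself):

```shell
# scratch stand-in for image_properties.json; point the check at the real file
cat > /tmp/image_properties.json <<'EOF'
{
  "brand": "joyent",
  "ram": 15360,
  "quota": 160
}
EOF
python3 -m json.tool < /tmp/image_properties.json > /dev/null \
  && echo "valid JSON" \
  || echo "invalid JSON"
rm -f /tmp/image_properties.json
```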
SmartOS machines use libsmartsshd.so for PAM SSH authentication in order to look up SSH keys allowed to access machines. Part of our Ansible setup removes this so we can only rely on traditional SSH authentication. Therefore, it is critical to put `nodejs_test_*` public keys into `$USER/.ssh/authorized_keys` as appropriate or access will be lost and not recoverable after a reboot or sshd restart (part of the Ansible setup).
There isn't a system start service on IBM i -- the machine should not be
rebooted, and after Ansible is run, Jenkins needs to be started with
`jenkins-start.sh`. This will submit the job under the iojs user. If the
job is already running, the `jenkins-start.sh` script will not start
another job.
See http://ibm.biz/ibmi-rpms (see the "Installation" section).

```
/QOpenSys/usr/bin/system -kpib 'CRTUSRPRF USRPRF(NODEJS) PASSWORD() USRCLS(*SECOFR) SPCAUT(*USRCLS) PWDEXPITV(*NOMAX)'
mkdir -p /home/NODEJS
chown -R nodejs /home/NODEJS
```
Edit `/QOpenSys/etc/profile` to contain:

```
PATH=/QOpenSys/pkgs/bin:$PATH
export PATH
```

This can be done by running the following commands from a shell:

```
echo 'PATH=/QOpenSys/pkgs/bin:$PATH' >> /QOpenSys/etc/profile
echo 'export PATH' >> /QOpenSys/etc/profile
```
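Note that the two `echo ... >>` commands append unconditionally, so running them twice leaves duplicate lines in the profile. A guarded variant (a generic shell sketch against a scratch file; adapt the path for the real /QOpenSys/etc/profile):

```shell
profile=$(mktemp)                 # scratch stand-in for /QOpenSys/etc/profile
# append each line only if it is not already present (exact, literal match)
for line in 'PATH=/QOpenSys/pkgs/bin:$PATH' 'export PATH'; do
  grep -qxF "$line" "$profile" || echo "$line" >> "$profile"
done
# running the loop a second time adds nothing
for line in 'PATH=/QOpenSys/pkgs/bin:$PATH' 'export PATH'; do
  grep -qxF "$line" "$profile" || echo "$line" >> "$profile"
done
wc -l < "$profile"                # 2
rm -f "$profile"
```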
After that is completed, copy it to the `.bashrc` file for the nodejs user:

```
cp /QOpenSys/etc/profile /home/NODEJS/.bashrc
```

Install chsh and make bash the default shell for both users:

```
/QOpenSys/pkgs/bin/yum install chsh
/QOpenSys/pkgs/bin/chsh -s /QOpenSys/pkgs/bin/bash
/QOpenSys/pkgs/bin/chsh -s /QOpenSys/pkgs/bin/bash nodejs
```
The system Java installed is too old to be able to verify the SSL certificate
for our Jenkins servers and a more recent version has to be installed manually.
The script used to start the Jenkins agent expects to find the Java SDK in
/u/unix1/java/J8.0_64/.
To install the Java SDK, obtain the latest Java 8 service refresh for z/OS from: https://developer.ibm.com/javasdk/support/zos/
Transfer the pax.Z file to the z/OS system (via sftp, do not use scp as that
will perform an unwanted character conversion). Log into the z/OS system and
extract the SDK via the pax command:
e.g. if the pax.Z file is located in `/u/unix1/SDK8_64bit_SR6_FP10.PAX.Z`:

```
mkdir -p /u/unix1/java
cd /u/unix1/java
pax -rf /u/unix1/SDK8_64bit_SR6_FP10.PAX.Z -ppx
```