GitHub CI Maintenance
Introduction
CI stands for continuous integration. The CI repo contains code that handles a number of tasks related to ensuring new code merged into the Offline and Production repos passes the necessary tests; specifically, it handles all tasks that involve communication with GitHub, such as checking if a new comment contains a test command or posting the test results on the PR. It does not contain the tests themselves; those are in the codetools repo. Code in the CI repo is written in Python, while codetools is in bash.
CI repo code is used in our Jenkins jobs that handle build tests for Offline and Production. Each Jenkins job makes a fresh clone of the CI and codetools repos and downloads any new dependencies or dependencies with updated required versions. This means any changes made to these repos take effect immediately, and will be seen the next time someone runs the Offline or Production build tests.
Some important parts of CI repo
Standalone scripts for interacting with GitHub
These are executable scripts which may be used by code outside the CI repo, but which use code in the CI repo to interact with GitHub. For examples of how to call these scripts from a bash script, see this codetools function, or any of the functions starting with "cmsbot" on that page. These scripts allow the Jenkins jobs mu2e-offline-build-test and mu2e-production-build-test to merge PRs correctly and update the PR as tests complete.
Bash scripts in the codetools repo can call CI scripts by the following process:
- create a Python virtual environment using the venv command, and activate it
- clone the CI repository and install the dependencies in requirements.txt using pip install
- call the script
Once the first two steps have been done, you can re-activate the virtual environment and all the scripts and dependencies will be there. This is useful if you need a CI script after running a muse setup: using a subshell to activate the venv and run the script keeps dependencies separate (see the note at the end of Python version issues).
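The three steps above can be sketched as a list of shell commands built in Python. This is a minimal sketch, not the codetools implementation: the repository URL, the directory names ci_venv and CI, and the helper name build_setup_commands are assumptions made for illustration.

```python
import shlex
from pathlib import Path


def build_setup_commands(workdir, repo_url="https://github.com/Mu2e/CI.git"):
    """Return the shell commands for the three steps, in order.

    Hypothetical helper: the repo URL and directory names are assumptions.
    """
    workdir = Path(workdir)
    venv_dir = workdir / "ci_venv"
    clone_dir = workdir / "CI"
    activate = f"source {shlex.quote(str(venv_dir / 'bin' / 'activate'))}"
    requirements = shlex.quote(str(clone_dir / "requirements.txt"))
    script = shlex.quote(str(clone_dir / "comment-github-pullrequest"))
    return [
        # Step 1: create the virtual environment (activation is per shell).
        f"python3 -m venv {shlex.quote(str(venv_dir))}",
        # Step 2: clone the CI repo and install its dependencies in the venv.
        f"git clone {shlex.quote(repo_url)} {shlex.quote(str(clone_dir))}",
        f"{activate} && pip install -r {requirements}",
        # Step 3: call a script. The parentheses run it in a subshell, so the
        # activated venv does not leak into the calling environment.
        f"( {activate} && {script} -r Mu2e/Offline -p 555 -R gh-report.md )",
    ]
```

Each returned string is meant to be run with bash (for example via bash -c), since source is a shell builtin. Only the last command needs to be repeated for later script calls.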
The standalone scripts are at the top level of the CI repo. Two examples are:
comment-github-pullrequest - Posts a comment on a pull request. Takes three arguments: -p NNN, where NNN is the number of the PR; -r Mu2e/ABC, where ABC is the name of the repo; and -R filename, where filename is a file containing the comment to be posted. An example might be comment-github-pullrequest -r Mu2e/Offline -p 555 -R gh-report.md
Comments are made as the bot user FNALbuild.
get-pr-base-sha - Retrieves the last commit sha of the branch a PR is asking to merge into, and writes it to a file. Takes three required arguments and one optional argument. The required arguments are: -p NNN, where NNN is the number of the PR; -r Mu2e/ABC, where ABC is the name of the repo; and -f filename, where filename is the file where the output information will be written. The optional argument is -j anything, where the value itself is ignored: if this argument is included, the script writes the ref name (i.e. branch name) to the file rather than the commit sha. An example might be get-pr-base-sha -r Mu2e/Production -p 333 -f myShaFile.txt or get-pr-base-sha -r Mu2e/Production -p 333 -f myRefFile.txt -j true
adding a new script
If you need to create a new script to interact with GitHub, and it needs to be executable from bash scripts rather than imported as Python, use comment-github-pullrequest, get-pr-base-sha, or report-test-status as a model. All of them use argparse to define and process command-line arguments. Once you're satisfied the script will do what you want, set it to be executable with the command git update-index --chmod=+x your-file-name and then commit. If you skip this step, the script will not be executable when the repo is cloned.
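As a starting point, a new script's argument handling might look like the sketch below. It mirrors the short flags described for get-pr-base-sha; the long option names, the help strings, and the function names are assumptions for illustration and are not copied from the CI repo.

```python
#!/usr/bin/env python3
import argparse


def build_parser():
    """Define the command-line interface (hypothetical skeleton)."""
    parser = argparse.ArgumentParser(
        description="Skeleton for a standalone GitHub helper script."
    )
    parser.add_argument("-r", "--repository", required=True,
                        help="repository name, e.g. Mu2e/Offline")
    parser.add_argument("-p", "--pullrequest", required=True, type=int,
                        help="pull request number")
    parser.add_argument("-f", "--file", required=True,
                        help="file the output is written to")
    parser.add_argument("-j", "--ref", default=None,
                        help="if present (any value), write the ref name "
                             "instead of the commit sha")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    what = "ref name" if args.ref is not None else "commit sha"
    # The GitHub query and the write to args.file would go here.
    print(f"Would write the base {what} of "
          f"{args.repository}#{args.pullrequest} to {args.file}")
    return 0
```

In a real script, main() would be called under the usual if __name__ == "__main__": guard at the bottom of the file.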
Code for controlling whether build tests are run
This is Python code used by the mu2e-github-bot job in our PR tests workflow. It collects information about the pull request and determines which, if any, tests need to be run. In certain cases, it communicates with the PR creator via comments posted on the PR, informing the creator of its decisions.
process-pull-request - an executable script similar to the ones already described. Takes the repo name and the PR number, as in process-pull-request repo Mu2e/Offline pr_id 555, and launches the process_pr script in the Mu2eCI folder.
Mu2eCI/process_pr.py - The main script for mu2e-github-bot. Any time Jenkins receives an event from GitHub, it uses this script to determine what, if anything, to do. The procedure of this script is as follows:
- If the PR has been merged, collect all the open PRs on this repo and run process_pr.py on them -- this will have the effect of posting a comment on them saying the HEAD has changed.
- Establish the list of "authorized users": people who can initiate tests on this PR
- Check which files were changed by the PR and notify the people who have asked to watch these files -- this information is in config/watchers.yaml
- Determine which tests should be run by default on these changed files -- this information is in Mu2eCI/test_suites.py.
- Collect all the commit statuses attached to the last commit in the PR. Use these to determine if tests are already running and if the base branch HEAD has changed.
- Look for a test command: loop through the comments on the PR, ignoring comments that have already been seen or that predate the last commit. Look for comments by authorized users that contain valid test commands. Mu2eCI/test_suites.py contains the regexes used for this. If a valid command is found, add the appropriate test suite to the list of tests to trigger, and have FNALbuild post a thumbs-up reaction on the comment.
- If the PR is brand new and the PR creator is in the Mu2e organization, make sure the default tests are in the list to trigger.
- Update the PR with a label indicating the state of any tests, and actually trigger the tests by creating a properties file (code in Mu2eCI/common.py) containing the information about the test to be run. The existence of this file is what tells Jenkins to run the actual tests.
- Update the commit statuses to reflect the test suite to be run, if any
- Post a comment on the PR. Depending on the situation (new PR, tests requested, no tests requested but the base branch HEAD changed), different messages will be posted. These can be found in Mu2eCI/messages.py.
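The "look for a test command" step can be sketched as follows. This is an illustration under stated assumptions, not the code in process_pr.py: the regex is a hypothetical stand-in for the ones in Mu2eCI/test_suites.py, and comments are modeled as plain dicts with made-up keys (id, user, created_at, body) rather than GitHub API objects.

```python
import re

# Hypothetical command pattern; the real regexes are in Mu2eCI/test_suites.py.
TEST_COMMAND = re.compile(
    r"@FNALbuild\s+run\s+(?P<suite>\w+)\s+tests?", re.IGNORECASE
)


def find_test_requests(comments, authorized_users, last_commit_time, seen_ids=()):
    """Return the requested test suites, in order, without duplicates."""
    requested = []
    for comment in comments:
        if comment["id"] in seen_ids:
            continue  # already handled on an earlier pass
        if comment["created_at"] < last_commit_time:
            continue  # predates the last commit, so it no longer applies
        if comment["user"] not in authorized_users:
            continue  # only authorized users may trigger tests
        match = TEST_COMMAND.search(comment["body"])
        if match:
            suite = match.group("suite").lower()
            if suite not in requested:
                requested.append(suite)
    return requested
```

The values of created_at and last_commit_time only need to be mutually comparable, for example datetime objects.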
Commit status, test status, and labels
Commit statuses are collections of information attached to a specific commit on GitHub. The GitHub commit status API is documented here. We use these to keep track of actions Jenkins has taken regarding a pull request, and the results of these actions.
To see the statuses for the last commit in a PR, scroll down to the bottom of the PR's conversation tab.
One can see the statuses of any commit by querying the API:
curl \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: Bearer TOKEN_REDACTED" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  https://api.github.com/repos/Mu2e/Repo_Name/commits/commit_sha_here/statuses
The fields used by the CI system are:
- context - the action this status refers to. Options include "jenkins/ghprb" for Jenkins receiving an event from GitHub related to this PR, "mu2e/buildtest/last" for finding the latest commit on the base branch, "mu2e/buildtest" for the build tests, and the names of certain individual tests.
- state - the current state of the action described in "context". May be "success", "pending", "failure", or "error".
- description - the current result of the action. May simply say "The build is running", or may contain more detailed information about a completed test. In the case of "mu2e/buildtest/last", it will include the commit sha for the last commit in the base branch.
- target_url - URL that points to more information. May point to a log file in the case of a completed test, or a Jenkins console in the case of a running one.
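The statuses endpoint returns a JSON array with one entry per status update, so a given context can appear several times. A minimal sketch of picking the most recent entry for each context is shown below. It assumes the array has already been parsed into a list of dicts, and it relies on the created_at timestamp that GitHub includes in each entry; the function name is hypothetical.

```python
def latest_status_by_context(statuses):
    """Map each context to its most recent status entry."""
    latest = {}
    for status in statuses:
        context = status["context"]
        current = latest.get(context)
        # GitHub timestamps are ISO 8601 UTC strings in a fixed format,
        # so plain string comparison orders them chronologically.
        if current is None or status["created_at"] > current["created_at"]:
            latest[context] = status
    return latest
```

For example, latest_status_by_context(statuses)["mu2e/buildtest"]["state"] would give the current state of the build tests, provided that context is present in the response.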
Test status in process_pr.py
The process_pr.py script in CI/Mu2eCI keeps track of the status of tests run on a PR, using a dict object called test_statuses. This is populated partially by looking at the latest commit status with the context "mu2e/buildtest". Note that multiple fields in the commit status contribute to the test status. The possible values of a test status partially overlap with the possible values of the "state" field of a commit status, but can also contain information pulled from the description, such as if a test is "running" or "stalled".
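To illustrate how a test status can combine the "state" and "description" fields, here is a hypothetical sketch. The matching rules (looking for the words "running" and "stalled" in the description of a pending status) are assumptions chosen to mirror the text above; the actual logic in process_pr.py may differ.

```python
def derive_test_status(commit_status):
    """Derive a test status from one commit status entry (hypothetical rules)."""
    state = commit_status["state"]
    # The description may be missing or null in the API response.
    description = (commit_status.get("description") or "").lower()
    if state == "pending":
        # A pending state alone does not say whether the test has started,
        # so look at the description for more detail.
        if "stalled" in description:
            return "stalled"
        if "running" in description:
            return "running"
    # Otherwise the test status is simply the commit status state.
    return state
```

With these rules, a status with state "pending" and description "The build is running" yields "running", while a finished test keeps its "success", "failure", or "error" state.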
PR labels
Labels can be placed on a PR; they are attached to the PR as a whole and not to individual commits. We add labels to the PR indicating which tests are needed and what their status is. If the build tests are running, the label will say "build running". Labels such as "build pending" and "build running" are set by the mu2e-github-bot Jenkins job in process_pr.py, using information in the test_statuses object mentioned above. Labels indicating the build has finished are set by the Jenkins jobs that run the tests themselves (mu2e-offline-build-test and mu2e-production-build-test).
Managing the CI GitHub repo
automated tools
Three main automated tools work on the CI repo: dependabot, pre-commit-hooks, and tests.
dependabot
Dependabot manages updates to the dependencies listed in CI/requirements.txt. It checks weekly for updates, and makes a PR to CI to change requirements.txt if it finds them. Dependabot actions are controlled by the config file at CI/.github/dependabot.yml. Dependabot runs on GitHub infrastructure.
pre-commit-hooks
pre-commit-hooks ensures uniform style and formatting. When you make a PR to the CI repo, it checks for whitespace errors, Python warnings, etc., and then adds its own commit to your PR fixing these issues. These actions are controlled by the config file at CI/.pre-commit-config.yaml. When the packages used by pre-commit-hooks have updates, pre-commit-hooks makes a PR that updates the versions in .pre-commit-config.yaml. pre-commit-hooks runs on GitHub infrastructure.
tests
The tests are run when new code is pushed to a branch of the Mu2e/CI repo. These are NOT the build tests for Offline and Production; these are only for changes to the CI repo. Both dependabot and pre-commit-hooks make their own branches for their PRs, so these tests always run on them. The tests are controlled by the config file at CI/.github/workflows/tests.yml. Right now they only check that all the dependencies install correctly in the Python versions we want. The tests run on GitHub infrastructure -- referred to elsewhere on this page as the "GitHub test runner".
Python version issues
The OS default Python version on the Jenkins machines is Python 3.6. This is the version Jenkins uses when it creates the virtual environment where the CI repo code runs, so all code in the CI repo needs to work properly with that version. Some dependencies have stopped supporting Python 3.6 in their newest versions. Additionally, the GitHub test runner has dropped Python 3.6 from its latest version. This means we need to avoid problems caused by unwanted updates.
If a dependency in CI/requirements.txt is updated to a version that doesn't support Python 3.6, the next Jenkins job will stop downloading dependencies when it reaches the unsupported one, and will continue with whatever dependencies it happens to already have. If a dependency listed later in requirements.txt also received an update, the updated version will not be used. The way to avoid this problem is to always check that the automatic tests run on GitHub all pass, especially the Python 3.6 test, before merging any updates from Dependabot. If they don't pass, simply close the PR. To ignore updates for a dependency in CI/requirements.txt, add it to the "ignore" section of CI/.github/dependabot.yml. If, on the other hand, the Jenkins version of Python is upgraded, it may be necessary to remove items from the "ignore" section so they can be updated. As an example of an "ignore" section:
#requirements.txt
updates:
  - package-ecosystem: "pip"
    ... other things ...
    ignore:
      - dependency-name: "PyGithub"
      - dependency-name: "requests"
If the GitHub test runner is set to use the latest version, it will fail to set up Python with the error message "Version 3.6 with arch x64 not found". This will result in PRs meant to update the CI repo failing the tests, even if the PR itself would cause no problems. The way to avoid this problem is to ensure that the "runs-on" item in CI/.github/workflows/tests.yml is set to ubuntu-20.04 and not to ubuntu-latest. If Jenkins is later updated to a new version of Python, this can be changed.
Important note: The Python version required for the CI repo is not synchronized with the version required for Offline and Production. They use the version of Python required by art, which is regularly updated. This gets set up when muse setup is run. Unfortunately, the build test workflow needs to use the CI repo before the muse setup can happen, so it can't just piggyback off Offline to get the same Python version -- hence the use of the OS default Python. This also means that, in order to keep the different dependencies separated, if a Jenkins job calls a script from the CI repo after running muse setup, it should use a subshell to activate the virtual environment and run the script.