GitHubWorkflow: Difference between revisions

From Mu2eWiki

Revision as of 21:47, 13 January 2021

Introduction

This page describes two recommended git workflows for use with the Mu2e Offline code in GitHub, one workflow for regular users and one for developers.

The Mu2e computing group manages the Mu2e organization within GitHub. To have full access to Mu2e software, Mu2e collaborators should create a GitHub account, and request to join the Mu2e organization.

  1. To execute either of these workflows make sure you have defined the Mu2e environment by executing "setup mu2e" in your shell. This ensures that you have a known version of git.
  2. To execute the developer workflow, make sure you have a GitHub account, and that you are added to the Mu2e GitHub organization (https://github.com/orgs/Mu2e/people)
  3. In order to authenticate with github, you will need to set up your ssh keys on the machine from which you plan to clone/push, following the instructions here.
    • Password based authentication on GitHub is now deprecated and is scheduled to be disabled soon, as described here.
  4. Before you do any developing, go to your GitHub account and create your own fork of the official Mu2e Offline repo using the GitHub web interface (instructions here).

The main Mu2e repository is 'Offline', which contains the algorithms, data structures, art modules and configuration used in simulation production, online filtering, offline reconstruction, and other tasks. The main development branch within Offline is 'master', which should be used for all code development. Offline also contains legacy branches associated with particular data sets, which are maintained for a limited time after the master branch has been developed past those data, as described in the table below.

Mu2e/Offline branches
branch name    branch purpose             support end date
master         development                end of Mu2e
MDC2018        MDC2018 dataset support    end of 2020


Authentication

We recommend that you establish authentication to GitHub using ssh keys and all GitHub examples on the Mu2e wiki assume that you have done so. The page Authentication#Authenticating to github has instructions.


Downloading Offline as a user and NOT a developer

Option 1: you want the default primary version of the code (most people):

  1. Clone the repo:
     git clone git@github.com:Mu2e/Offline
     cd Offline
  2. Done!

Option 2: A particular collaborator has a version or branch you want to use:

  1. Find their GitHub user name.
  2. Learn the name of the branch they are working on; this may be master but it normally should not be.
  3. Clone their fork:
     git clone git@github.com:<their GitHub user name>/Offline
     cd Offline
     git checkout origin/<branch name>
  4. Done!

Option 3: You want to use pgit to avoid a long compilation time (EXPERIMENTAL)

  1. Create a new directory to put your Offline repo in and move to that directory:
     mkdir Offline
     cd Offline
  2. As in Option 2, determine the fork and branch name you wish to use.
  3. Create a partial checkout clone:
     pgit2 setup git@github.com:<user name (or mu2e)>/Offline <branch name>
  4. You can now use it as normal:
     source setup.sh
     scons -j 4
  5. You might need to source the following before 'source setup.sh':
     source /cvmfs/mu2e.opensciencegrid.org/setupmu2e-art.sh

Developer Workflow

This section assumes basic familiarity with git, including:


  1. Create your own fork of the official Mu2e Offline repo using the GitHub web interface (instructions here).
    • This fork will be your personal sandbox on GitHub; you can do anything you want to it and it will have no effect on anyone else!
    • You only need to do this once; you can reuse this fork for all your development projects.
  2. On your development machine, create a local clone of your Offline fork. This is another step that you only need to do once; you can use this local clone for multiple development projects if you want to. When you push code to another repository, the default will be to push to your fork; you will never push code directly to the GitHub Mu2e/Offline repository.
     setup mu2e
     git clone git@github.com:<your GitHub username>/Offline
     cd Offline
     git remote add -f mu2e git@github.com:Mu2e/Offline
  3. The above steps will create a local clone of your fork of the Offline repository with the following properties:
    • There are two remotes: your fork (origin), and mu2e.
    • All of the branches from your fork and from Offline will be visible as remote branches.
    • The only local branch will be the HEAD branch (usually master) of your fork.
    • The checked out local branch will be the HEAD of your fork.
    • You can inspect the status of remotes and branches using git remote -v and git branch -avv.
    • In general the head of the master branch of your fork will be out of date; you should never work from this branch. Therefore you must do the next step before you can begin development work.
  4. Make a new local branch on which to do your development work.
    • If two or more development efforts are not intrinsically coupled, each should be done on its own branch.
    • This branch is local to your clone, so it has no impact on the main Offline; when you push it, it will go to your fork. Therefore it has no effect on the GitHub Mu2e version of Offline.
    • The branch name is used for sharing your development work with other people while it is still in progress, and for making your pull request, but has no special meaning to git. Choose a name that will be meaningful.
     git checkout --no-track -b <development branch name> mu2e/master
  5. Do your work and commit it.
     git commit -m "brief comment describing the changes you are committing" file1 [ file2 file3 .... ]
  6. When you wish to back up your work, or share your work with others, push your branch to your GitHub fork. If you are working on a disk that is not backed up, such as /mu2e/app, we encourage you to push frequently in order to back up your work:
     git push -u origin <development branch name>
     The -u option tells git that your local branch should track the branch in your fork. If you push the branch again, the -u option is not needed but it won't hurt if it is present. You can use git branch -avv to see that your development branch is now tracking the version of itself in your GitHub fork.
  7. When your development is complete and tested, go to the web site of your GitHub fork and, using the GUI, request that your branch be pulled into Mu2e/Offline. Your pull request (PR) will start the code review process (see Code Review), which may take anywhere from a few hours to a few days.
    1. In a web browser, open https://github.com/<your GitHub user name>/Offline
    2. Click on the icon that shows all branches.
    3. Click on the 'New pull request' button associated with your development branch.
    4. There will be an informational message near the top of the page saying if your branch is "Able to merge" or if conflicts exist.
    5. After conflicts, if any, are resolved, fill in the requested information and click the "Create Pull Request" button.
    6. More info is available at the GitHub instructions for Pull Requests.
  8. After you submit your PR, GitHub will automatically start a Continuous Integration (CI) run, which includes the following:
    1. In a scratch area it will merge your PR into master and will build your code (prof build only).
    2. It will run several standard fcl scripts; the test passes if the art executable returns a status of 0. There are no checks on the output.
    3. It will run two tests that check that the geometry description has no illegal constructions: the Geant4 surface check and the root overlap check.
    4. It also runs code formatting and static analysis checks; at this time these are informational only and their recommendations are not enforced.
    5. It reports how many times it sees the strings "FIXME" or "TODO" in the code in the PR.

    The results of these tests are posted to the PR Conversation page. These tests must pass before your PR will be merged.

  9. If changes are requested during the code review process, make those on the same development branch as your PR. When the changes are complete, commit them, and push your changes back to your fork. GitHub will automatically update your PR to include your new commits. This is because the target of a PR is a branch, not the commit that happened to be at the head of the branch at the time of your initial PR.
     # edit code as requested by reviewer
     git commit -m "Address review comment X" file1 [file2 file3 ...]
     git push origin <development branch name>
  10. When the code reviewers are satisfied, one of the software coordinators will merge the PR into Mu2e/Offline. Once your PR is merged your changes (commits) will be part of Mu2e/Offline master, and your development branch can be deleted. If you are uncertain whether your branch has been merged, select the branch and push the 'compare' button. If this comes back stating there is 'nothing to compare to', it means all your changes were already merged. If it shows differences, those have NOT been merged, so do NOT delete your branch. To delete your branch in GitHub, just push the trash can icon. You can also delete the branch in your shell:
     git branch -d <my branch name>
     # the next command deletes the branch from your GitHub fork as well
     git push origin --delete <my branch name>
  11. Every night the head of the master branch is used as input to a series of validation tests; these are similar to the CI tests discussed above; however some of the jobs run many events and the output of these jobs is compared to reference output. On the morning following the merge of your PR, you may be asked if the nightly validation behaved as you expected.
  12. To reuse your working directory for a new development, first refresh to the current head of master, then create a new branch from it, as when you made your first development branch. Do NOT reuse branches for new development, as updating those to the head of Mu2e/Offline master will confuse the git history.
     git fetch mu2e master
     git checkout --no-track -b <new development branch name> mu2e/master
  13. We encourage you to commit your work frequently and push to your GitHub fork frequently; this is the best way to back up your work. You do NOT need to wait until you are ready for a PR to push to your fork.
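
The clone, branch, commit and push steps above can be rehearsed without touching GitHub. The following is a minimal sketch, not part of the official workflow. It assumes only that git is installed, and it uses two local bare repositories as stand-ins for Mu2e/Offline and your fork. The names upstream.git, fork.git, seed and my-feature, and the identity settings, are hypothetical placeholders.

```shell
set -eu

# Everything happens in a throwaway directory; nothing here contacts GitHub.
sandbox=$(mktemp -d)
cd "$sandbox"

# Build a one-commit repository that seeds the stand-in remotes.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed config user.name "Example Developer"
git -C seed config user.email "developer@example.invalid"
echo "initial content" > seed/README
git -C seed add README
git -C seed commit -q -m "Initial commit"

# upstream.git plays the role of Mu2e/Offline; fork.git plays the role of your fork.
git clone -q --bare seed upstream.git
git clone -q --bare upstream.git fork.git

# Clone the fork (it becomes "origin") and add the official repo as the remote "mu2e".
git clone -q fork.git Offline
cd Offline
git config user.name "Example Developer"
git config user.email "developer@example.invalid"
git remote add -f mu2e "$sandbox/upstream.git"

# Start a development branch from the head of mu2e/master, without tracking it.
git checkout -q --no-track -b my-feature mu2e/master

# Do some work and commit it.
echo "my change" >> README
git commit -q -m "Describe the change" README

# Push to the fork; -u makes the local branch track origin/my-feature.
git push -q -u origin my-feature

# Inspect remotes and tracking information.
git remote -v
git branch -avv
```

After this runs, the new commit exists in fork.git but not in upstream.git, which mirrors the rule that you never push directly to Mu2e/Offline.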

Tips for Good GitHub Hygiene

  1. Prefer many PRs, each on a self contained topic, instead of a single PR that includes many topics.
    1. Of course, extensive changes are sometimes necessary and will require a single large PR.
  2. Within a PR, prefer many commits with a small number of related changes to few commits with many changes each.
  3. Do not make spurious white space changes or formatting changes; if you want to make such changes, do so in a separate PR that includes only those changes.
  4. Do not use hard tabs in your code; instead program your editor to change tabs into the appropriate number of spaces. Fixme: Add links to .emacs and .vimrc.
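
Until editor settings are linked here, hard tabs can also be found from the shell. This is a minimal sketch using only standard tools (grep and expand); the file names and the tab width of 4 are assumptions made for illustration, not Mu2e policy.

```shell
set -eu

# Two small example files in a scratch directory: one with a hard tab, one without.
sandbox=$(mktemp -d)
printf 'int a;\n\tint b;\n' > "$sandbox/tabs.cc"
printf 'int a;\n    int b;\n' > "$sandbox/spaces.cc"

# A literal tab character to use as the search pattern.
tab=$(printf '\t')

# List the files that contain at least one hard tab (grep exits non-zero if none match).
files_with_tabs=$(grep -l "$tab" "$sandbox"/*.cc || true)
echo "Files containing hard tabs: $files_with_tabs"

# Write a copy with each tab expanded to spaces, assuming tab stops every 4 columns.
expand -t 4 "$sandbox/tabs.cc" > "$sandbox/tabs.fixed.cc"
```

Converting an existing file this way is a formatting change, so per the tip above it belongs in a separate PR that contains only such changes.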

Collaborating on a feature

Sometimes you may want to collaborate on a feature branch with other developers. Because the main Offline repository no longer holds all the development branches, this takes a couple of extra steps:

  1. First make sure you actually need to work on the same branch. Are you actually working on the same feature? Can the problem be split into smaller features that can be developed asynchronously? Just because features are related doesn't mean they need to be developed on the same branch.
  2. Determine if a large number of people will be developing on the same branch for a significant amount of time. In this case it should become an official branch in Mu2e/Offline, like MDC2018.
  3. Decide which user's fork will be the primary repo for this feature branch, and which branch on that fork you are going to use. If a new branch is needed, the owner of that fork starts the new branch as follows. First make sure that you have done steps 1 and 2 in #Developer_Workflow. Then do the following:
     git fetch mu2e master
     git checkout --no-track -b <branch name> mu2e/master
     git push -u origin <branch name>
  4. There are then a couple of options for moving forward: either add all other developers as collaborators on the primary fork, or use pull requests to the primary fork.
  5. To add developers as collaborators:
    1. The owner of the primary fork opens https://github.com/<their user name>/Offline
    2. Click Settings on the right, then Collaborators.
    3. In the collaborators box, type the GitHub user name of each other developer and hit "Add collaborator".
    4. The other collaborators can then either create a read/write access clone of the primary fork, or add it as a remote to an existing Offline repo:
       git clone git@github.com:<primary user name>/Offline
       or
       git remote add primaryfork git@github.com:<primary user name>/Offline
    5. The other collaborators can now push directly to the primary fork as if it were their own:
       git push primaryfork <branch name>
  6. To use pull requests:
    1. The owner of the primary fork can just push to it as normal, following the normal developer workflow.
    2. Other developers clone their own fork, but add the primary fork as a remote:
       git remote add primaryfork git@github.com:<primary user name>/Offline
    3. Other developers can pull in and merge changes from collaborators by fetching/pulling/merging from this remote:
       git fetch primaryfork
       git merge primaryfork/<branch name>
    4. Other developers push to their own fork:
       git push origin <branch name>
    5. As in the normal developer workflow, they open a pull request. In the compare window, before creating the request, they change the "base repository" from Mu2e/Offline to <primary user name>/Offline (see here).
    6. The owner of the primary fork will need to accept and merge it in.
    7. Everything else goes like the normal workflow.
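
The collaborator option can be tried out locally before anyone is given access on GitHub. The following is a minimal sketch under stated assumptions: local bare repositories stand in for Mu2e/Offline (upstream.git), the primary fork (primary.git) and a second developer's fork (other.git), and the branch name shared-feature and the identity settings are hypothetical placeholders.

```shell
set -eu

# Throwaway sandbox; nothing here contacts GitHub.
sandbox=$(mktemp -d)
cd "$sandbox"

# Seed repository with a single commit on master.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed config user.name "Example Developer"
git -C seed config user.email "developer@example.invalid"
echo "initial content" > seed/README
git -C seed add README
git -C seed commit -q -m "Initial commit"

# Stand-ins for Mu2e/Offline, the primary fork, and a second developer's fork.
git clone -q --bare seed upstream.git
git clone -q --bare upstream.git primary.git
git clone -q --bare upstream.git other.git

# The owner of the primary fork starts the shared branch from mu2e/master.
git clone -q primary.git owner
git -C owner remote add -f mu2e "$sandbox/upstream.git"
git -C owner checkout -q --no-track -b shared-feature mu2e/master
git -C owner push -q -u origin shared-feature

# The second developer clones their own fork and adds the primary fork as a remote.
git clone -q other.git collaborator
cd collaborator
git config user.name "Second Developer"
git config user.email "second@example.invalid"
git remote add -f primaryfork "$sandbox/primary.git"

# Check out the shared branch, make a change, and push it straight to the primary fork.
git checkout -q -b shared-feature primaryfork/shared-feature
echo "collaborator change" >> README
git commit -q -m "Add collaborator change" README
git push -q primaryfork shared-feature

# The owner picks up the collaborator's commit from the primary fork.
git -C "$sandbox/owner" pull -q --ff-only origin shared-feature
```

With the pull request option, the last push would instead go to the developer's own fork (origin), and the change would reach the primary fork only when its owner merges the PR.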

Rebasing

There will be times when you want to, or need to, bring your development branch up-to-date with the head of GitHub Mu2e/Offline master. One such time is when GitHub reports that your PR has conflicts. There are two ways to bring your branch up-to-date. This section discusses the preferred method, rebasing your development branch onto the head of Mu2e/Offline master; you should not use the other method, merging the head of Mu2e/Offline master into your development branch.

You can learn about rebasing in the GitHub documentation:

  1. git-rebase Documentation
  2. merging vs rebasing.

Until you are comfortable with rebasing we suggest that, before rebasing, you back up your work by making a gzipped tar file of your working area, excluding .so and .os files.

The instructions below presume that your GitHub fork is the remote named "origin" and that the GitHub Mu2e/Offline repo is the remote named "mu2e". The simplest workflow is:

git checkout <your development branch>
git fetch mu2e master
git rebase mu2e/master
# resolve conflicts if needed; see the git-rebase Documentation and #Tips_For_Resolving_Conflicts
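git push origin <your development branch>
# if this branch was already pushed before the rebase, git rejects the plain push because the rebase rewrote its history; in that case use:
# git push --force-with-lease origin <your development branch>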

Note that "fetch" wants whitespace between "mu2e" and "master" but "rebase" needs a slash "/". You can now put in a pull request on your development branch.
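
The effect of the rebase can be seen in a local sandbox. This is a minimal sketch, assuming only that git is installed: upstream.git and fork.git are hypothetical local stand-ins for Mu2e/Offline and your fork, and my-feature is a placeholder branch name. Because the branch is pushed before the rebase, the final push uses --force-with-lease; a plain push would be rejected as a non-fast-forward.

```shell
set -eu

# Throwaway sandbox; nothing here contacts GitHub.
sandbox=$(mktemp -d)
cd "$sandbox"

# Seed repository with a single commit on master.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed config user.name "Example Developer"
git -C seed config user.email "developer@example.invalid"
echo "base" > seed/a.txt
git -C seed add a.txt
git -C seed commit -q -m "Base commit"

# Stand-ins for Mu2e/Offline and your fork.
git clone -q --bare seed upstream.git
git clone -q --bare upstream.git fork.git

# Developer clone with a feature branch that is already pushed to the fork.
git clone -q fork.git Offline
cd Offline
git config user.name "Example Developer"
git config user.email "developer@example.invalid"
git remote add -f mu2e "$sandbox/upstream.git"
git checkout -q --no-track -b my-feature mu2e/master
echo "feature" > b.txt
git add b.txt
git commit -q -m "Add feature file"
git push -q -u origin my-feature

# Meanwhile the upstream master moves ahead (simulating someone else's merged PR).
echo "upstream change" >> "$sandbox/seed/a.txt"
git -C "$sandbox/seed" commit -q -m "Upstream change" a.txt
git -C "$sandbox/seed" push -q "$sandbox/upstream.git" master

# Bring the feature branch up to date: fetch uses a space, rebase uses a slash.
git fetch -q mu2e master
git rebase -q mu2e/master

# The rebase rewrote the branch history, so the already-pushed branch needs a forced push.
git push -q --force-with-lease origin my-feature
```

After the rebase, the feature branch consists of the new head of mu2e/master plus the one feature commit replayed on top of it.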

A second option is to keep your development branch as a backup, start a new branch and rebase that branch:

git checkout <your development branch>
git checkout -b <a new development branch> 
git fetch mu2e master
git rebase mu2e/master
# resolve conflicts if needed;  see the git-rebase Documentation and #Tips_For_Resolving_Conflicts
git push origin <a new development branch>

When this process is complete, you will have two branches in your clone: <your development branch> and <a new development branch>. If you pushed both branches, they will also be in your GitHub fork of Offline. You can now create a pull request on <a new development branch>, leaving <your development branch> unchanged.

Choose the second option if it is important to retain the original branch, perhaps because you performed detailed validation using that branch and you wish to preserve the validation work and its source code for future reference.


Tips For Resolving Conflicts

When conflicts are identified by a Pull Request it is your responsibility to resolve them before continuing. There is no formula for this step; you will have to look at the two versions of the conflicting code blocks and decide how best to merge the functionality of both. If you have questions about the intent of the previously-merged conflicting code, work together with the author of those changes to figure that out. You can figure out who last changed a line in a file using the 'git blame' command.

 git blame mu2e/master <name of file that has conflicts>

When you think you are done, it's a good idea to grep the code to look for unresolved conflict markers. If you make extensive changes during rebasing, it's a good idea to check that the code builds; normally this is not necessary because the CI tests are there to catch such problems.
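
One way to do that grep is sketched below. It is an illustration rather than an official Mu2e tool: the two example files are hypothetical, and the pattern simply looks for lines that begin with the standard conflict markers, so it can also flag a file that legitimately starts a line with a run of "=" characters.

```shell
set -eu

# Scratch directory with one cleanly resolved file and one that still has markers.
sandbox=$(mktemp -d)

cat > "$sandbox/resolved.cc" <<'EOF'
int x = 1;
EOF

cat > "$sandbox/unresolved.cc" <<'EOF'
<<<<<<< HEAD
int x = 1;
=======
int x = 2;
>>>>>>> my-feature
EOF

# List files that still contain a line starting with a conflict marker.
# grep exits non-zero when nothing matches, hence the "|| true".
leftover=$(grep -rlE '^(<<<<<<<|=======|>>>>>>>)' "$sandbox" || true)
echo "Files with unresolved conflict markers: $leftover"
```

Inside a real clone, git diff --check performs a similar test on your changes: it warns about leftover conflict markers as well as whitespace errors.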

Once all conflicts are resolved, stage the resolved files, commit, and push the branch back to your fork. After this, GitHub will allow you to request a pull. If the conflicts arose during a rebase, run git rebase --continue in place of git commit.

 git add <files that were edited as part of resolving conflicts>
 git rm <any files that need to be removed to resolve conflicts>
 git commit -m "Resolve conflicts message"
 git push origin <branch name>

Code Review

An important part of the GitHub workflow is reviewing new code before putting it back into the repository. Reviews are intended to minimize the risk that the requested changes break anything, check that the content of the changes is sensible, and enforce Mu2e coding standards and policies. Some reviews are automated, such as testing that the code builds and can run a few events of some standard apps. Automated code formatting checks will also be deployed soon.

Offline repo managers are responsible for assigning reviewers to each Pull Request (PR), as well as a manager in charge of each particular PR. The PR author may also assign or suggest reviewers. All assigned reviewers must approve the PR before the assigned manager will merge it in.

PRs can cover multiple subject areas. Reviewers should concentrate on reviewing code in areas in which they have personal expertise and/or subject knowledge. Reviewers are not expected to learn about areas outside their experience, as other reviewers will cover those. If you feel you were incorrectly assigned to a review, contact the repo manager assigned to the PR to request clarification or to be removed as reviewer. Reviewers should attempt to complete their reviews within a few days. Large PRs may take longer to review, and PR authors should plan accordingly. If an assigned reviewer is unavailable, they or the PR author should contact the assigned repo manager to request a substitution. The Offline repo managers should be alerted if a review becomes stuck for any reason.

Reviewers should look at the content of the PR commits for code correctness, good design and efficient implementation. Reviewers don't need to build or run the code; that is the job of the automated tests. The GitHub commit differences referenced in the PR are the easiest way to see and review the changes. Review feedback should be inserted as comments at the relevant lines in the GitHub diff where the reviewer has a concern. After reviewing all files and commits, reviewers should complete their review using the GitHub interface. If you feel changes are required, submit your review with that box checked. If you simply have questions, submit your review checking the 'neutral' box; this neither approves the PR nor requires changes, it just requires a response on the part of the author. PR requesters should respond to all review comments or questions in the PR thread and/or by making a new commit inside the PR. Once all the reviewer's concerns and questions have been addressed, the reviewer should re-submit their review checking the 'approved' box. The repo manager assigned to the PR should merge the PR after all reviewers have approved it.

One of the reviewers should check for the following:

  • All modules, services and tools included in the PR must be upgraded to use validated fhicl.
  • Parameters that affect physics performance must not have default values in the code; the recommended values must be specified in the appropriate .fcl files. Parameters that affect debugging and verbosity may be initialized in code and need not be present in the .fcl files.

Legacy Instructions

The following sections were important during the transition into our currently recommended git workflows. Hopefully they are not relevant for current work. They have been retained for now and will be deleted at a later date.

Adapting your Existing Clones to the New Workflow

Many of you have existing clones of your own github fork that were created using an older recommended github workflow. This section discusses how to bring your clone into compliance with the new recommended workflow. We cannot give an exact script to follow since everyone's local clone is likely to be different; so the following is general guidance:

  1. If you have unpushed commits on your local master, see the instructions in the hints section below.
  2. If you have uncommitted changes in your checked out area, commit them to a local development branch that you intend to keep.
    1. If needed: git stash; create a new development branch starting from the current head; git stash pop.
  3. Under the old workflow you will have a local branch named master and a branch named remotes/origin/master. These come from the master branch of your GitHub fork; if you still have these branches, delete them.
    1. You cannot delete the branch that is currently checked out; therefore, to delete your local master you need first to checkout a different branch.
  4. You must have a branch named remotes/mu2e/master that comes from the Offline repository in the Mu2e GitHub organization; the middle "mu2e" in the branch name is just an arbitrary identifier that you can choose, but it would be weird to choose anything except mu2e or Mu2e.
  5. If you have existing development branches that started from something other than mu2e/master, just keep them and proceed as described in the recommended workflow; if you are lucky they will merge cleanly. If not, follow the merge/rebase procedure.
  6. Start all new development branches with:
 git fetch mu2e master
 git checkout --no-track -b <new development branch name> mu2e/master

Here are a few hints and reminders of things to watch for:

  1. Use git remote -v and git branch -avv to learn what remotes you have and what branches are tracking what other branches.
  2. If your local master branch has commits on it that you need to keep, then start a new development branch from the head of master. Just develop on it normally and push it to your fork when you are ready and make a pull request when you are ready for that. If you are lucky it will merge cleanly; if not follow the merge/rebase procedure. Once you have created the new development branch you can delete the local master branch.


Migrating an existing redmine clone directory to GitHub

You can repoint an existing Offline clone from redmine to GitHub as a way of making an easy transition. To follow these instructions, you must already have a GitHub account that's registered in the Mu2e Organization. The following assumes your redmine clone is in /mu2e/app/Me/MyRedmineClone/Offline, that your GitHub username is MyGitHubName, and that you are working on a branch called MyDevelopmentBranch.

    1. Login to GitHub
    2. Fork Mu2e/Offline in GitHub using the 'Fork' button on the top right of the Mu2e Offline repo page
    3. Add GitHub as a remote to your clone
      > cd /mu2e/app/Me/MyRedmineClone/Offline
      > git remote rename origin Redmine
      > git remote add mu2e git@github.com:Mu2e/Offline
      > git remote add origin git@github.com:MyGitHubName/Offline
      These commands are very fast as no data is actually transferred. You can see your remotes with:
      > git remote -v
      Your clone is now connected to the main Mu2e github repository and your own fork in addition to the Redmine repository. Additionally, we have changed your fork to be the default remote repository (origin)
    4. Push your working branch to your GitHub fork:
      > git push -u origin MyDevelopmentBranch
    After this, you can continue developing in this directory according to the general GitHub workflow described above.
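
The remote juggling in step 3 can be rehearsed locally. This is a minimal sketch in which local bare repositories stand in for the Redmine repository (redmine.git), Mu2e/Offline (upstream.git) and your GitHub fork (fork.git); these names and the identity settings are hypothetical, and MyDevelopmentBranch follows the example above.

```shell
set -eu

# Throwaway sandbox; nothing here contacts Redmine or GitHub.
sandbox=$(mktemp -d)
cd "$sandbox"

# Seed repository with a single commit on master.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed config user.name "Example Developer"
git -C seed config user.email "developer@example.invalid"
echo "initial content" > seed/README
git -C seed add README
git -C seed commit -q -m "Initial commit"

# Stand-ins for the Redmine repo, Mu2e/Offline on GitHub, and your GitHub fork.
git clone -q --bare seed redmine.git
git clone -q --bare seed upstream.git
git clone -q --bare upstream.git fork.git

# An existing Redmine clone with work in progress on a development branch.
git clone -q redmine.git Offline
cd Offline
git config user.name "Example Developer"
git config user.email "developer@example.invalid"
git checkout -q -b MyDevelopmentBranch
echo "work in progress" >> README
git commit -q -m "Work started under Redmine" README

# Repoint the clone: keep Redmine under a new name, add mu2e, make the fork "origin".
git remote rename origin Redmine
git remote add mu2e "$sandbox/upstream.git"
git remote add origin "$sandbox/fork.git"
git remote -v

# Push the working branch to the fork and track it there.
git push -q -u origin MyDevelopmentBranch
```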


I have a branch on Redmine that's not on GitHub! How do I use it now?

The official Mu2e GitHub repo will usually have only the master branch (+ maybe some tagged release branches). Currently Redmine has >100 branches. If you have a Redmine branch that is not merged in to master, you will move this branch to your fork to work on it instead.

To start, you should read the development workflow and understand remotes and forks, and have a github account and a fork of the Mu2e Offline repo.

  1. Create a local clone of your Offline fork. This will be identical to cloning the official Mu2e repo (if your fork is up to date), except that the default remote (called "origin"), where you will push to by default, will point to your fork instead of the official Mu2e version.
     git clone git@github.com:<your user name>/Offline
  2. Add Redmine as a remote to your local clone:
     cd Offline
     git remote add redmine http://cdcvs.fnal.gov/projects/mu2eofflinesoftwaremu2eoffline/Offline.git
  3. Fetch a list of the commits and branches from redmine, and check out whatever branch you need:
     git fetch redmine
     git checkout -b <branch name> redmine/<branch name>
  4. Push this branch to your fork:
     git push origin <branch name>
  5. You should now be able to see your branch on github.com/<your user name>/Offline. You can now continue working on that branch and pushing to your fork, and when it is ready you can submit a pull request to the main Mu2e repo using the developer workflow.