GitPartialCheckout
Latest revision as of 22:23, 19 July 2024
This procedure is deprecated; please see the current links from the Code page.
Introduction
Following the standard git checkout patterns results in cloning and checking out the entire Offline repository locally. Building the entire repo can take an annoyingly long time. Partial checkout is one powerful way to reduce this build time. In a partial checkout, you rely on a base Offline build for most of the Offline libraries, and you check out only a small part of Offline locally and build that subset of code as you work. When you build, if a needed header file is found locally, it will be used; if it is not in the local working area, it will be found in the base build. After building, the executables will pick up your local libraries first, then link the rest from the base release. The same pattern of putting your local working area ahead of the base build is followed for all the paths (fcl, python, data, bin, etc.).
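The local-first lookup described above can be pictured with a small shell sketch. The function name first_match and the directory names in the usage line are hypothetical, and this is not how pgit or the build system is implemented; it only illustrates that the first directory in a search list wins.

```shell
# Print the first existing file named $1 found in the colon-separated
# directory list $2. Hypothetical helper, for illustration only.
# The body runs in a subshell so the IFS change does not leak out.
first_match() (
    IFS=:
    for dir in $2; do
        if [ -e "$dir/$1" ]; then
            printf '%s\n' "$dir/$1"
            exit 0
        fi
    done
    exit 1
)

# Example (paths are placeholders): the local area is listed first,
# so a local copy of a header shadows the one in the base build.
#   first_match CaloReco/inc/Some.hh "$PWD:/path/to/base/Offline"
```

A file present only in the base build is still found, because the search falls through to the second directory.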
The local area is a valid, complete git repo, so you can pull, push, tag, etc. while you are building a subset of the code. Commands and concepts such as pull, tag, hashes, branches, etc., which normally apply to an entire repo, will still apply to the whole repo, not just the checked-out part.
To implement the partial checkout, we have provided a script called "pgit", which is in your path after mu2einit. We have also provided a set of pre-built Offline areas to serve as the base builds. These can be provided for the head of selected branches.
Two Warnings
The downside to not building the entire repo locally is that your local partial build may get out of sync with the base build. If code in the base release is compiled with a different header file than the local partial build, then the executables will not run correctly and will probably produce odd memory errors, such as seg faults. It is also possible that the code will fail silently, so these are very serious issues.
There are probably other ways to cause trouble, but all users must be aware of the following two issues.
First is the issue of intermediate commits. For example, the base build is at a certain commit, call it N, and you create a partial checkout based on this base build. After you are set up, someone else commits N+1 to the branch you are working on. In terms of the git repo, you can handle this as you would in a full repo: you can pull or merge the new commit into your local repo, work on that, and commit N+2 to the head of the branch. The problem is that when you pull the intermediate commit N+1, the header files in your working area may become inconsistent with the header files used for the base build, leading to serious errors. Also, you may have checked out a subset of code X when the intermediate commit concerned a disjoint subset Y, so you will not see the effect of the intermediate commit at all.
If you see an intermediate commit, then you can "git diff" and decide if it is harmless with respect to dependencies. For example, if the commit was only to a cc file, then you can checkout this part locally if you want to get its effect, or ignore it in your local checkout, if you are sure it doesn't matter to your work.
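One way to triage an intermediate commit is to list the files it touches, flag any headers, and then look for code that includes them. The sketch below assumes headers end in .hh or .h and are included as #include "Package/inc/Name.hh"; both are assumptions about naming conventions, and the helper names changed_headers and direct_includers are hypothetical.

```shell
# Read file names on stdin and print only those that look like headers.
# The .hh/.h suffixes are an assumption about naming conventions.
changed_headers() {
    grep -E '\.(hh|h)$' || true
}

# List files under directory $2 that directly include header $1,
# assuming the form: #include "Package/inc/Name.hh"
# Indirect includes (through other headers) are not followed.
direct_includers() {
    { grep -rlF -- "#include \"$1\"" "$2" || true; } | sort
}

# Inside a git working area, with real commits in place of OLD and NEW:
#   git diff --name-only OLD NEW | changed_headers
```

If changed_headers prints nothing, the commit touched no files with those suffixes. Otherwise direct_includers gives a first, incomplete view of what depends on each changed header.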
The second major issue concerns your local header files. If you modify a local header file, the only way to get a fully correct build is to recompile every piece of code that includes that header file, and what depends on this header may not be at all obvious.
"pgit check" provides some checks for these problems, but there is no substitute for being personally aware of these dependency issues. When in doubt, you can always "pgit quit" and go back to a normal full checkout, disconnecting from the base build.
Commands
Make sure you're using the Mu2e version of git:
mu2einit
setup git
For instructions see
pgit help
See the base builds available:
pgit list
2018-12-13 13:50 master/6d77f6b8/SLF6/prof
2018-12-13 13:42 master/6d77f6b8/SLF6/debug
The hex is the first 8 characters of a commit hash. The list is presented with the most recent at the top.
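A base build name can be split into its parts with plain shell. In this sketch, only the hash field is described by the text above; reading the other fields as branch, platform and build type is an assumption based on the example names, and parse_base_build is a hypothetical helper.

```shell
# Split a base build name such as master/6d77f6b8/SLF6/prof into fields.
# Field meanings other than the hash are assumptions.
# The body runs in a subshell so the IFS change does not leak out.
parse_base_build() (
    IFS=/
    set -- $1
    printf 'branch=%s hash=%s platform=%s build=%s\n' "$1" "$2" "$3" "$4"
)

# To compare the hash field with a local repo, git can print an
# abbreviated hash of the current commit:
#   git rev-parse --short=8 HEAD
```

This simple split would misread a branch name that itself contains a slash.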
To start a new partial build, backed by the given base build:
pgit init master/6d77f6b8/SLF6/prof
If you know you want the latest master base release, there are reasonable defaults:
pgit init master
Setup:
cd Offline
source setup.sh
see your paths:
proff
To check out a couple of directories:
pgit add CaloReco ParticleID
You can only check out top-level directories.
to remove a directory:
pgit rm CaloReco
to check dependencies:
pgit status
build normally:
scons -j 4
At this point you should be able to use regular git commands.
To exit partial checkout and go back to a full checkout:
pgit quit
How it works
The checkout
The partial checkout uses git's built-in sparse checkout feature. The functionality is turned on by:
git config core.sparsecheckout true
Once this is set, only files and directories that you specify are copied from the index to the working directory. The checkout list is kept here:
echo "/SConstruct" >> .git/info/sparse-checkout
The leading slash says the path has to be in the top-level directory. If there is no slash, the entry is used like a search term, so "X/SConstruct" would also be checked out. The file contents can also be set to "*" for everything.
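The effect of the leading slash can be tried out in a throwaway repository. The sketch below uses only plain git commands (core.sparseCheckout, .git/info/sparse-checkout, and git read-tree -mu HEAD to apply the list); it does not show what pgit itself runs, and demo_sparse and the file names are made up for the demonstration.

```shell
# Build a scratch repo containing SConstruct, X/SConstruct and X/other.txt,
# apply the single sparse-checkout pattern given as $1, and list the files
# left in the working directory. Hypothetical demo, not part of pgit.
demo_sparse() {
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        git init -q .
        mkdir -p X .git/info
        echo top > SConstruct
        echo nested > X/SConstruct
        echo other > X/other.txt
        git add .
        git -c user.name=demo -c user.email=demo@example.invalid \
            commit -q -m "demo"
        git config core.sparseCheckout true
        echo "$1" > .git/info/sparse-checkout
        git read-tree -mu HEAD    # re-populate the working dir per the list
        find . -path ./.git -prune -o -type f -print | sort
    )
    rm -rf "$repo"
}

# demo_sparse /SConstruct   lists only ./SConstruct
# demo_sparse SConstruct    lists ./SConstruct and ./X/SConstruct
```

In both cases X/other.txt stays in the repository history and the index but is not copied to the working directory.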
The remote of the local partial checkout is set to the remote of the .git in the base build, which will typically be the main repo.
The last initialized base build is saved here:
git config mu2e.baserelease $base_repo
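The saved value can be read back with standard git config commands. In this sketch the helper name base_release is hypothetical, and the format of the stored value is whatever pgit wrote; it is not specified here.

```shell
# Print the base release recorded in the repo at directory $1
# (default: current directory). Prints nothing and returns non-zero
# if the key has not been set.
base_release() {
    git -C "${1:-.}" config --get mu2e.baserelease
}
```

Run inside a partial checkout, base_release with no argument shows which base build the area was last initialized against.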
The pgit command resides at:
/cvmfs/mu2e.opensciencegrid.org/bin/pgit
and is put in the path in
/cvmfs/mu2e.opensciencegrid.org/setupmu2e-art.sh
The base releases
The base releases are kept up to date by a jenkins project called "mu2e_ci_branch". The script is kept in
setup codetools
ls $CODETOOLS_DIR/bin/jenkinsCIBranch.sh
This project is triggered by commits to the repo. The project is configured with a set of branches to monitor. If the commit doesn't change the head of one of those branches, it doesn't build.
Since it is not easy for jenkins to push a result, there is a cron process on mu2epro@mu2egpvm01 which polls the jenkins project and pulls new tarballs when they appear. This script is
~mu2epro/cron/git/moveCIBranch.sh
Then, since only a special user can write to cvmfs, this script runs another:
cvmfsmu2edev@oasiscfs.fnal.gov:pullCIBranch.sh