GitIntro
Revision as of 20:44, 15 December 2017
Introduction
git is a Source Code Management System that is more powerful than earlier systems such as cvs or svn. As a new user of git in the Mu2e environment you won't use any of the advanced features and git will feel like a verbose version of cvs or svn. As you gain more experience, the power and utility of the advanced features will become clear. Source code management systems are also called Version Control Systems or Revision Control Systems.
The Mu2e Offline git repository is hosted on the Fermilab redmine server.
Getting Help
Mu2e maintains a page that has links to code management systems; this includes both general git information and Mu2e specific information. The most complete resource on git itself is the git website: http://git-scm.com/doc
We strongly recommend that you read chapters 1 through 3 of the book found there as soon as is practical.
The built-in git documentation can be accessed using either of the following commands (using the git clone command as an example):
git help clone
or
man git-clone
The command:
git status
gives useful reminders and suggestions. For example, it reminds you to commit uncommitted changes. It also tells you how to back out of some operations, such as git add, git rm or git mv.
Getting git into your environment
If you are logged into the interactive machines, you should begin every Mu2e login session with the command:
setup mu2e
This adds git to your PATH. For those of you who are familiar with UPS, git is found as a UPS product in:
If you are doing Mu2e work at a non-Fermilab site, consult an expert at that site to learn the command that does the equivalent of setup mu2e. If you need to install git yourself, see http://git-scm.com/downloads . To see which version of git you are using, type the command
git --version
When you are reading git documentation, pay attention to version numbers mentioned in the documentation. git has many safety features, some of which are on by default and some of which must be invoked explicitly. In general, the higher the version of git, the more safety features are available and the more that are enabled by default.
Configuring git
The first time that you use git, run the following commands:
git config --global user.name "Your Name"
git config --global user.email you@example.com
git config --global push.default current
The values will be stored in the file $HOME/.gitconfig, so configuration needs to be done only once per home directory.
The first line tells git who you are so that your changes to the Mu2e code can be properly labeled; the second line gives the email address that git records with each of your commits, which can also be used to send you email if you have signed up for notices; the last line changes one of git's defaults to a more intuitive, and safer, behaviour.
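To see what these settings look like on disk without touching a real configuration, the following minimal sketch writes the same three values to a scratch file using the --file option of git config. The scratch file and the variable name cfg are illustrative assumptions; the real commands above write to $HOME/.gitconfig instead.

```shell
# Write the settings to a throwaway file (an assumption for illustration;
# the --global form shown above writes to $HOME/.gitconfig instead).
cfg=$(mktemp)
git config --file "$cfg" user.name "Your Name"
git config --file "$cfg" user.email you@example.com
git config --file "$cfg" push.default current

# The file is plain text with one section per group of settings.
cat "$cfg"

# Read a single value back.
git config --file "$cfg" --get push.default
```

On your real configuration, git config --global --list shows the values that are currently set.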
Getting a Copy of the Mu2e Offline
You will need tens of MB to checkout the Offline code and GB to build it. You would typically work on the app disk:
/mu2e/app/users/$USER
To get a copy of the Mu2e Offline code, cd to a working directory and type the command:
git clone ssh://p-mu2eofflinesoftwaremu2eoffline@cdcvs.fnal.gov/cvs/projects/mu2eofflinesoftwaremu2eoffline/Offline.git
If git tells you that you do not have permission, you can get a read-only copy of the code using the command:
git clone http://cdcvs.fnal.gov/projects/mu2eofflinesoftwaremu2eoffline/Offline.git
The ssh url allows you to both read from the repository and write to it; the http url allows read-only access. If you do not have permission to use the ssh url, you need to check two things:
- Check that you have a valid kerberos ticket and that it is forwardable (see ComputingLogin).
- Make sure that you have been added as a member of the mu2e group in redmine. This should happen when you get your original login account.
First Look at the Layout
The git clone command created a single subdirectory, named Offline, in your current working directory. cd to that directory and look at its contents, including the files that begin with a dot:
cd Offline
ls -a
Most of the files that you see are directories that contain Mu2e code or are files that are part of the Mu2e build system. The two exceptions are
- The subdirectory .git
- The file .gitignore
The subdirectory .git contains a complete copy (a clone) of the information stored in the Mu2e redmine repository; that is, it contains a complete history of the Mu2e code from the beginning of Mu2e up to the time that you issued the git clone command. If someone later modifies the redmine repository, the clone in your .git directory will not be modified until you ask for it to be updated. The directory tree rooted at .git is large, tens of MB.
The .git subdirectory is sometimes called your local repository while the redmine repository is sometimes called the remote repository. Actually, the redmine repository is just a remote repository, not the remote repository. Git is able to deal with multiple remote repositories but the introductory Mu2e git documentation will not talk about those features; for the foreseeable future, Mu2e will use our redmine repository as the unique, authoritative central repository.
The file .gitignore tells git to ignore files that match certain patterns; see docs for more details. If you wish to add additional patterns please make a proposal to the Mu2e Offline software team. You may also define a personal .gitignore file that is not seen by other people and which applies to all git projects in which you participate; by default, this file lives in ~/.config/git/ignore; see gitignore for details.
The other files and directories that you see in your Offline directory, and recursively down through its subdirectories, are called your working tree.
About Branches
The next section of this page presumes a minimal knowledge about git branches; this, in turn, requires a minimal knowledge of git commits. This section will endeavor to give you the minimal information that you need to make sense of the remaining material. We strongly encourage you, at the first practical opportunity, to read Chapters 1 to 3 from git docs; this will give you a good understanding of git commits and git branches. The more you understand from these 3 chapters, the easier the rest of the material will be.
The fundamental unit of git management is called the commit. Suppose that you clone an existing git repository, edit two files and tell git to commit the change. The action of doing a commit stores a copy of the two modified files somewhere under the .git directory (read the git documentation if you want details). git also creates a new internal git object, called a commit, that contains:
- A "snapshot" of all files that you would have to checkout to recover this commit; not just the two files you committed with this commit, but all files that are part of the working tree and are already committed, either as part of this commit or as part of an earlier commit. This snapshot is not a copy of the files but a compact representation of where to find the files; the files actually are found somewhere under the .git directory.
- A 40 hex-digit hash code of the content of all files that participate in the snapshot described in the previous item; this hash code is the name of the commit.
- The hash code of the parent commit; the parent commit describes the state of the local repository just before you started to make changes.
- Other bookkeeping information that you can look up in the git manual.
A git repository is nothing more than a graph of commit objects, where the word graph is used in the sense of topology. The following figure shows a cartoon of a very simple git repository:
In this figure, the red outline boxes represent commits and each commit has been given a mnemonic name, c1 through c13; in reality git commits are named with 40 hex-digit hash codes. In this figure the arrows that connect the commit boxes denote parentage. The repository begins with commit c1, which has no parent. Commit c1 is the parent of the commit c2. The commit c3 is the parent of both commits c4 and c6. The commits c6, c7, c8 form a branch. This branch is merged back into the main branch at commit c5; note that commit c5 has two parents, c4 and c8. The commit c5 is known as a merge commit. Finally commits c10 and c11 form another branch; this branch is not merged back into the main line; maybe it will merge in at a later date or maybe it won't; both are legal.
The items in the figure that are shown as solid red boxes denote the names of git branches. A git branch is just a move-able, lightweight pointer to a commit; you can think of the present value of a branch as an alias for the 40 hex-digit hash code that is the name of a commit. In the figure the git branch names point to their matching commit. Suppose that we were to start from commit c11 and make a change to some files; if we committed those changes, git would make a new commit object, whose parent is c11 and it would move the branch "feature2" to point at the new commit object. Whenever you do a commit, git automatically advances the appropriate branch.
If you take git out of the box and do not modify any of its defaults, it will contain a branch named master. Many git users, including Mu2e, use git in such a way that the head of the main line of development is always at the head of the master branch.
The last idea illustrated in the figure is a git tag, which is shown as the solid blue box. As for a branch, a tag is just a pointer to a commit. The difference is that once it is seated, a tag stays put; it is not advanced by future commits.
You may have noticed that this section overloads the word branch; it has two different meanings. One meaning is the "move-able, lightweight pointer to a commit". The other use is "a set of commits connected by parent-child relationships", such as (c6,c7,c8) or (c10,c11). This is common usage and you will need to learn to distinguish the two meanings from their context.
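The ideas in this section (a commit names its parent, a branch advances with each commit, a tag stays put) can be checked in a throwaway repository. This is a minimal sketch, not part of the Mu2e workflow; the scratch directory, the file notes.txt, the tag v1 and the user identity are all assumptions made for illustration.

```shell
set -e
# Work in a scratch repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# First commit; record its name and tag it.
echo "first" > notes.txt
git add notes.txt
git commit -q -m "c1"
branch=$(git rev-parse --abbrev-ref HEAD)  # the branch created by git init
first=$(git rev-parse HEAD)
git tag v1

# Second commit: git advances the branch, but not the tag.
echo "second" >> notes.txt
git commit -q -m "c2" notes.txt

echo "branch $branch now points at: $(git rev-parse "$branch")"
echo "parent of that commit:        $(git rev-parse "$branch^")"
echo "tag v1 still points at:       $(git rev-parse v1)"
```

The last two lines print the same hash code: the parent of the new commit is the commit that the tag still names.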
Branches in the Mu2e Offline Repository
Now issue the git command that lists all branches in the .git subdirectory:
git branch -a
For a clone done in late 2017, this produced the output
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/CaloDigi0816
  remotes/origin/CaloGeom
  remotes/origin/GenVector
The branches of interest to us are master and remotes/origin/HEAD. The rest represent temporary development work. All of these branches, including their full history, are cloned in the .git subdirectory. The line
remotes/origin/HEAD -> origin/master
tells git that, when someone clones the repository, git should:
- Create a new local branch named master and initialize it to point at the same commit as remotes/origin/master
- Create a working tree that contains the files that are part of the local master branch
The line:
* master
says that the local repository contains a local branch named master; we can tell it is a local branch because its name does not begin with "remotes". The asterisk beside master tells us that our working tree is a checkout of the local branch named master. If you have git colorization enabled the branch that is checked out will be in a unique color; if you do not have colorization enabled, but would like to, see config.
You can list only the local branches by removing the -a option from the git branch command:
git branch
which produces the output
* master
It was mentioned earlier that git allows your local repository to be aware of multiple remote repositories and to copy branches from them to your local repository. Any branch that begins with "remotes" is a remote branch. The next field in the name identifies which remote repository it came from. The name "origin" is a shorthand for "the place from which I first made the clone"; you can see its definition in the file .git/config . The last field in the name of a remote branch is the branch name, proper.
One of the most important git commands is,
git status
If you issue it right after a clone it will produce the output
# On branch master
nothing to commit, working directory clean
The first line tells us the same information as did the asterisk beside master in the previous output listing, that the working tree is a checkout of master. The next line tells us that we have not made any changes to the working tree; if the working tree exactly matches its corresponding local branch then it is said to be clean.
Among other things, git status will tell you if you have uncommitted files in your working tree. It is important to watch for these since your work can be lost if you issue a git checkout command without committing your changes. If you are working on the local master branch, git status will also let you know if it is ahead of or behind the corresponding local tracking branch. Your local branch is ahead of the local tracking branch if the local branch contains commits that have not yet been added to the local tracking branch. Your local branch is behind your local tracking branch if your local tracking branch contains commits that have not yet been added to the local branch.
Use git status regularly to verify that the state of your work is indeed what you think it is.
For most (all?) git commands, the branch named remotes/origin/master can be called simply origin/master; the rest of this documentation will use the shorter name.
All of the branches whose names begin with "remotes/origin" are called local tracking branches; they track the state of a branch in a remote repository. In particular the branch origin/master is also known as the local master tracking branch.
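The meaning of "ahead" and "behind" can be made concrete with two scratch repositories. In this minimal sketch a local directory named upstream stands in for the redmine repository; that directory, the file README and the user identity are assumptions for illustration only. The notation @{u} is git shorthand for the tracking branch of the branch that is checked out.

```shell
set -e
# Placeholder identity so that commits work in a clean environment.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

top=$(mktemp -d)

# A stand-in for the remote repository, holding one commit.
git init -q "$top/upstream"
cd "$top/upstream"
echo "hello" > README
git add README
git commit -q -m "first commit"

# Clone it and make one local commit that is not sent anywhere.
cd "$top"
git clone -q upstream work
cd work
echo "more" >> README
git commit -q -m "local change" README

# Count commits on each side of the local branch and its tracking branch.
ahead=$(git rev-list --count '@{u}..HEAD')
behind=$(git rev-list --count 'HEAD..@{u}')
echo "ahead=$ahead behind=$behind"

# git status reports the same information in words.
git status -sb
```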
Changing and Committing Files
Please note that this tutorial describes the conceptual process of committing changes. The Mu2e committing workflow must be followed.
The Mu2e software team recommends that, when you wish to change the repository, you first checkout a local, temporary, working branch and commit your changes to that branch. This is described on the wiki page Git Workflow for Mu2e. The reason for this recommendation is that it will create a commit history that is much easier to navigate.
Editing a file
Edit the file using your favorite editor. If you give the git command:
git status
the output will show that the file you just edited has uncommitted changes.
Committing a file
To commit the change to your current branch, use the following git command:
git commit -m "Your commit comment goes here." filename
If you do not specify the -m option git will open an editor session in which you can type your comments. You can control the editor that git chooses for you; git inspects the following resources, in the specified order, and the first one that is defined wins:
1. GIT_EDITOR environment variable
2. core.editor variable in $HOME/.gitconfig
3. VISUAL environment variable
4. EDITOR environment variable
5. vi
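To find out which editor git would choose in your environment, you can ask git directly with git var GIT_EDITOR. The minimal sketch below sets the first resource in the list for a single command, so it is expected to win regardless of the other settings; the choice of emacs and the variable name chosen are just examples.

```shell
# Set GIT_EDITOR for this one command only and ask git which editor it would run.
chosen=$(GIT_EDITOR=emacs git var GIT_EDITOR)
echo "$chosen"
```

Running git var GIT_EDITOR without the override reports whatever your own settings select.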
Fixme: reference to a vi survival guide
If you change many files, you may choose to commit each file individually; you may choose to commit all files at once; or you may choose to commit the files in several groups. The Mu2e software team recommends that you commit files in related groups. For example, if one logical change touches 3 files, commit all three as a single commit; in this case the commit comment should focus on a big-picture statement of what the change does, not on the individual changes to each file; we can use git diff for that.
To commit several files at once:
git commit -m "Comment" file1 file2 file3
To commit all uncommitted changes in the working tree:
git commit -m "Comment" -a
All of the above comments apply to commits for any reason: edited file, new file, deleted file, renamed file. They will not be repeated in those sections.
Creating a new file
Use your editor to create the file and its initial content. If you give the command
git status
you should see your file is on git's list of untracked files. The next two steps are:
git add filename
git commit -m "Comment" filename
If you do a git status following the git add but before the git commit, you will see your file on the list of files that have been added but not committed. If you decided after the add but before the commit that you want to undo the add operation, you can find the command to do that in the output of git status! The answer is:
git reset HEAD filename
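The add and undo cycle described above can be tried safely in a scratch repository. This is a minimal sketch; the directory, the file names and the user identity are assumptions for illustration. The --porcelain option asks git status for a compact listing in which "??" marks an untracked file and "A" marks a file that has been added but not committed.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "initial commit"

echo "draft" > new.txt
git status --porcelain      # "?? new.txt": the file is untracked

git add new.txt
git status --porcelain      # "A  new.txt": added but not yet committed

git reset -q HEAD new.txt   # undo the add; the file itself is left alone
git status --porcelain      # "?? new.txt" again
```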
Deleting a file
Delete the file from your working tree using the Unix rm command. If you give the command
git status
you should see that the file is present on the list of deleted files. To tell git that the file should be deleted:
git rm filename
git commit -m "Comment" filename
If you have removed the file, but not yet issued the git rm command, you can recover the file by:
git checkout -- file
The git status command has a reminder about this last command.
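The recovery of a file that was removed with the Unix rm command, but not with git rm, can be sketched in a scratch repository as follows. The directory, the file data.txt and the user identity are assumptions for illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo "important" > data.txt
git add data.txt
git commit -q -m "add data.txt"

rm data.txt                 # plain Unix rm; git has not been told anything
git status --porcelain      # " D data.txt": deleted in the working tree only

git checkout -- data.txt    # restore the committed copy
ls data.txt
```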
If you later need access to the deleted file, you can checkout an earlier commit in which the file exists. When you do so, the file will appear in your working tree.
You do not need a separate Unix rm before git rm. If the file is still present, git rm removes it from your working tree and stages the deletion in one step:
git rm filename
Renaming files
To rename a file, git does all of the work:
git mv sourceFile destinationFile
git commit -m "Comment" sourceFile destinationFile
Note that you must commit both the source and the destination files. This action preserves the full history; that is you can start with destinationFile and follow the commit history back to the creation of sourceFile.
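The claim that history survives a rename can be checked with the --follow option of git log, which continues the history of a single file across renames. This is a minimal sketch in a scratch repository; the file names and the user identity are assumptions for illustration, and the commit is made without file arguments, which commits everything that git mv staged.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo "some content" > old.txt
git add old.txt
git commit -q -m "create old.txt"

git mv old.txt new.txt
git commit -q -m "rename old.txt to new.txt"

# Both commits are listed, newest first, even though we ask about new.txt.
git log --follow --format=%s -- new.txt
```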
Adding directories
With git you cannot commit an empty directory; you must put at least one file in it before you can commit it.
To begin, create a directory and add one file to it:
mkdir dir1
emacs dir1/file1
Then issue the git commands:
git add dir1
git commit -m "new directory" dir1
Note that you did not need to add or commit dir1/file1 - this is automatically taken care of because dir1/file1 already existed at the time that command "git add dir1" was executed. If you add dir1/file2 after the add and before the commit, you will have to explicitly "git add" that file.
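The difference between a file that existed when git add dir1 was run and one created afterwards shows up directly in git status. This is a minimal sketch in a scratch repository; echo is used in place of an editor so that it can run unattended, and the file contents are arbitrary.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

mkdir dir1
echo "one" > dir1/file1
git add dir1                # picks up dir1/file1, which already exists

echo "two" > dir1/file2     # created after the add

# "A  dir1/file1" is added; "?? dir1/file2" is still untracked.
git status --porcelain
```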
The first few times you do this, you should use git status after each git command to familiarize yourself with its output in these circumstances; it is self-explanatory.
Untracked Files
When you give the command
git status
it will sometimes tell you that you have untracked files in your working tree. Look at this list carefully; it may contain files that you intended to add but forgot to.
You may notice that editor backup files that end in the tilde character (~) are not present in the list of untracked files. Nor are object files (ending in .o or .os) or dynamic libraries (ending in .so) listed. These are not listed because they match one of the patterns listed in the Mu2e .gitignore file. The Mu2e software team has supplied a .gitignore file that should result in a short list of untracked files. If you would like to add a new pattern to the .gitignore file, please suggest it to the Mu2e software team; we ask you to run this past the Mu2e software team because the pattern you wish to exclude may be important for other users.
Some General Comments
Whenever you are editing files in a git environment you will always be working on a git branch, perhaps the local master branch but usually a temporary local working branch. You should be aware of the possibility of 5 copies of any file that you are working on:
- The copy in your working tree
- The copy in your local working branch
- The copy in your local master branch
- The copy in the local master tracking branch
- The copy in the master branch of the redmine repository
The following commands move files among these copies:
- git checkout copies files from a branch to your working tree
- git commit copies files from your working tree to your local working branch
- git merge copies files from one local branch to another
- git push copies files from your local master branch to both the local master tracking branch and to the redmine repository
- git fetch copies files from the redmine repository to the local master tracking branch
- git pull does a fetch and then merges that branch into the local master branch
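The split between git fetch and git merge can be observed with two scratch repositories. In this minimal sketch a local directory named upstream stands in for the redmine repository; that directory, the file README and the user identity are assumptions for illustration. After the fetch only the tracking branch has moved; the local branch and the working tree change only at the merge.

```shell
set -e
# Placeholder identity so that commits work in a clean environment.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

top=$(mktemp -d)
git init -q "$top/upstream"
cd "$top/upstream"
echo "v1" > README
git add README
git commit -q -m "first commit"

# Clone, then let "someone else" add a commit to the stand-in remote.
git clone -q "$top/upstream" "$top/work"
echo "v2" >> README
git commit -q -m "upstream change" README

cd "$top/work"
git fetch -q                                  # moves only the tracking branch
behind_before=$(git rev-list --count 'HEAD..@{u}')

git merge -q --ff-only '@{u}'                 # now the local branch catches up
behind_after=$(git rev-list --count 'HEAD..@{u}')

echo "behind after fetch: $behind_before, after merge: $behind_after"
```

The --ff-only option makes the merge refuse to do anything except move the local branch forward, which is all that is needed when the local branch has no commits of its own.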
There is one more critical idea, git rebase, which will be described as part of the recommended Git Workflow for Mu2e.
The recommended Git Workflow for Mu2e will give a recommended way of using these 6 commands to ensure that the commit history of the repository is simple and is easy to understand. This workflow is constructed so that all conflict resolution is done during rebase operations, never in any of the others.
One of the design features of git is that git commit can never generate a conflict. Commit regularly.
Conflicts
Any source code management system has to deal with the following situation:
- You have cloned a repository, edited some files, added some files and deleted others (and committed).
- You would like to return your modified repository to the redmine repository.
- Before you have a chance to send your work to the redmine repository, someone else has modified the redmine repository
This situation is called a conflict. There are several flavors of conflict:
1. The files that you have modified are disjoint from the files that the other person has modified.
2. Both you and the other person have modified some of the same files but in each file the changes are at widely separated places.
3. Both you and the other person have modified the same lines in the same file.
Git's default behavior is this:
- In cases 1 and 2 git merges the two sets of changes
- In case 3 git gives up and asks you to fix the file before telling git to continue.
When git gives up it writes both versions of the conflicting text to the file, delimited by conflict markers:
<<<<<<< HEAD
version of the code from your file
=======
version of the code from the other file
>>>>>>> b600c43a1af8fb632679c221a71c689c132e25fd
Your job is to ensure that the correct code is in place and to remove the conflict markers; this may involve consultation with the other author. A later section will describe how to tell git to incorporate your newly revised version of the code.
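If you would like to see conflict markers before you meet them in real work, the following minimal sketch provokes a conflict in a scratch repository. It uses git merge between two local branches because that is the shortest way to trigger one; the branch name other, the file code.txt and the user identity are assumptions for illustration, and this is not the recommended Mu2e workflow.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo "original line" > code.txt
git add code.txt
git commit -q -m "base"

# The other person's change, on a branch named "other".
git checkout -q -b other
echo "their version" > code.txt
git commit -q -m "their change" code.txt

# Your change to the same line, back on the starting branch.
git checkout -q -
echo "your version" > code.txt
git commit -q -m "your change" code.txt

# git cannot reconcile the two edits, so the merge stops with an error status.
git merge other || true

# The file now holds both versions between conflict markers.
cat code.txt
```

To back out of the experiment, git merge --abort returns the working tree to its state before the merge.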
Browsers of Repository History
There are many tools for getting a graphical view of the history of a repository. Two of them are:
- gitk
- SourceTree
If you know of other tools, please add them to this list and describe them briefly below.
If you are working at Fermilab, gitk is distributed as part of the git UPS product and is available after you "setup mu2e". To use gitk, cd to your Offline directory and:
gitk --all&
If you are at a non-Fermilab site consult with whoever supports your git installation.
SourceTree is only available for Windows and Mac OSX, not for Linux variants. You can download it from their website: http://www.sourcetreeapp.com