Last week we learned how to log in to the DCC to take advantage of the shared computing resources available at Duke. While access to large amounts of processing power and memory will be advantageous in your case studies moving forward, development on the cluster can be awkward and uncomfortable for those unfamiliar with a Linux/Unix command line. In this part of the lab, we will introduce the command line interface of “git” as a straightforward way to link development on the cluster with local work on your personal computer.
git pull: Update your local branch of a repository with any commits that have been pushed to the remote branch of your repository.
When you have made important changes to your local branch of your repository and want to update a remote “master” branch of your repository, you will need to stage only the files with important changes, save a snapshot of those staged changes, and then update the remote branch to include your latest snapshot. Those three steps are accomplished with git add, git commit, and git push, respectively.
git add <filename1> <filename2> ...
git commit -m "<description of commit>"
git push
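The stage, snapshot, and update cycle can be tried without any GitHub account. The following is a minimal sketch, assuming a bare repository in a temporary directory stands in for the remote. The directory names (remote.git, laptop, cluster), the file names, and the user identity are all hypothetical and chosen only for illustration.

```shell
#!/bin/sh
set -eu

# Assumption: a bare repository on the local disk plays the role of the remote.
work=$(mktemp -d)
git init --quiet --bare "$work/remote.git"

# "laptop" is a hypothetical local copy where the edits are made.
# Cloning an empty repository prints a harmless warning, silenced here.
git clone --quiet "$work/remote.git" "$work/laptop" 2>/dev/null
cd "$work/laptop"
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"

printf 'x <- 1\n' > analysis.R                # the important change
printf 'scratch notes\n' > notes.tmp          # a file we do not want to share

git add analysis.R                            # stage only the important file
git commit --quiet -m "Add analysis script"   # save a snapshot of staged changes
git push --quiet -u origin HEAD               # first push: also record the upstream branch

# "cluster" is a second hypothetical copy, e.g. your directory on the DCC.
git clone --quiet "$work/remote.git" "$work/cluster"

# More work on the laptop, then the same three steps again.
printf 'y <- 2\n' >> analysis.R
git add analysis.R
git commit --quiet -m "Define y"
git push --quiet                              # upstream is known, so plain push works

# On the cluster copy, git pull brings in the new commit.
cd "$work/cluster"
git pull --quiet --ff-only
cat analysis.R
```

Because notes.tmp was never staged, it stays on the laptop copy only. The -u flag on the first push is used here because a fresh clone of an empty repository may not yet know which remote branch to push to; with a repository that already has commits, a plain git push is usually enough.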
git init: Create a git repository in your current directory on your current computer.
git clone <url to a git repository>: Make a local copy of some repository, with the repository you are cloning from serving as its remote.
git status: Shows a color-coded list of files in your local directory that are tracked in your repository or untracked.
git log
git rm <filename>: Works like git add, but stages a file to be removed from your repository.
There is an important difference between git and GitHub! Git is a version control system, not a website. It is a piece of software that helps you better track changes in your code. Think of it like a scrapbooking system for a CS or data science project. Git has nothing inherently to do with the internet.
GitHub is one (of many) websites that will host a copy of your scrapbook remotely. This is useful because you and your friends, who all have internet access, can work together on the scrapbook collaboratively. The scrapbook (git repository) typically becomes especially important for group CS or DS projects.
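To make the distinction concrete, here is a small sketch in which everything happens on one machine and no website is involved. It assumes a temporary directory and hypothetical file names (report.txt, figure_old.txt) and a hypothetical user identity. The "remote" of the second copy is simply another folder on the same disk.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
mkdir "$work/scrapbook"
cd "$work/scrapbook"

git init --quiet                              # a repository with no remote at all
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"

printf 'first draft\n' > report.txt
printf 'old figure\n' > figure_old.txt
git add report.txt figure_old.txt
git commit --quiet -m "Start the scrapbook"

git rm --quiet figure_old.txt                 # stage the removal of a tracked file
git commit --quiet -m "Remove the old figure"

git status --short                            # prints nothing when the working tree is clean
git log --oneline                             # lists the two snapshots, newest first

# Cloning from a local path: the origin of the copy is just a directory.
git clone --quiet "$work/scrapbook" "$work/copy"
git -C "$work/copy" remote get-url origin
```

A hosting site such as GitHub fills the same role as the scrapbook directory here, except that it is reachable by everyone in your group over the internet.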
Let’s go through the process together of creating a git repository and seeing how it can be used to simplify development for the DCC.
git as your version control system and provide the link to your remote repository.
git init. Then make a commit, allocate space for a repository on GitHub (first and second steps above) and follow the instructions to “push an existing repository from the command line”.
module load R.
git clone <github url>