This document demonstrates some of the workflows we saw in class here. All of these workflows have some pros and cons. It’s up to you and your team as to which one you adopt. Also note that your preferences might change, depending on the composition of your team and the task at hand.
A centralized workflow uses a single repo for all changes to the project. All team members pull from and push to the same repo.
Imagine Juan sits down to work on his new testing project that Mary and Aisha have been working on as well. The first thing Juan does is pull from the remote repository.
Everything is up to date, so Juan works on the R script for doing some simple math operations. He adds 2 lines to the script to demonstrate a subtraction operation, and then saves his changes.
After basking in the glow of his wonderful new code, Juan decides he should take a break for awhile, so he commits his changes.
The last step before Juan takes a break is to push his local commit up to the shared repo on GitHub.
Oops–that did not go as expected!
What went wrong? Git is telling us that there are some changes on the remote repo that Juan does not have in his local copy. How did that happen? It turns out that Mary was also working on the project at the same time as Juan. In between the time that Juan sat down and pulled the changes from the remote repo (there were none at the time), Mary made some changes to the script, committed them, and pushed them to the remote repo. In looking at the error message from Git, Juan notices that it suggests he should pull the changes and integrate them into his own script.
Juan goes ahead and pulls Mary’s recent changes to his local repo,
which issues another warning about a so-called merge conflict
in R/00_testing.R
that Juan will have to fix.
When Juan inspects his version of 00_testing.R
he sees
that Git has added some information to the script, which brackets the
new changes that both Juan and Mary made:
<<<<<<< HEAD
## subtraction
num - den
=======
## addition
num + demo
>>>>>>> d1ab55e3cf1979efd0a61954305f742cfbb38e11
Tip: You can think of these new lines as “conflict dividers”.
The =======
line is the center of the conflict.
Everything between <<<<<<< HEAD
and
=======
is content that exists in the local repo
to which the HEAD
ref is pointing.
Everything between =======
and
>>>>>>> d1ab55e3cf1979efd0a61954305f742cfbb38e11
is content that is present in the remote repo.
Tip: At this point, it would be a good idea for Juan to contact Mary and find out what her plans are, so he can avoid this problem in the future.
After confirming that Mary is done making changes for awhile, Juan can return to resolving the current merge conflict. To do so, Juan inspects the new changes that Mary made and decides if he wants to merge them into his version of the file, or whether he wants to discard them. In this case, Juan is happy with the new addition operation that Mary made, so he accepts them by simply deleting the conflict dividers.
Now Juan can commit his resolved changes and push them to the remote repo.
Now Juan is all set on his side, but Mary should make sure that she pulls Juan’s changes to her local repo before she starts working on the script again.
Note: This scenario does not necessarily require two or more collaborators.
For example, Mark knows from personal experience that someone can do some work on a computer in their office, commit those changes, and then forget to push them before heading home. Upon arriving home, they could pull their changes to their laptop, make additional changes, commit them, and then push them to the remote repo. Upon returning back to the office, they can encounter problems with committing and pushing those changes they forgot about the day before.
The feature branch workflow is designed around multiple branches within a repo. These branches can contain copies of files on the main branch, and are typically used for the development of specific features.
Let’s look at two different ways to create a new branch.
Task: Review these instructions for creating a new branch from within GitHub, but don’t follow through with them. We’ll instead use the second approach shown below.
The first is from GitHub itself. To do so, navigate to the remote repo and click on the button marked main in the upper left. This will open a dialogue box with a line for you to find or create a branch. Type the name of the new branch (e.g., “develop”) and then hit “return”.
The second option for creating a new branch, which we’ll follow here, is to do so from within RStudio.
Task: Click on the purple button with 2 rectangles and a diamond in the Git pane, which will bring up a dialogue window where you can enter the name of your new branch (we’ll call it “develop”). Press the Create button when you’re done.
git checkout -B develop
develop
branch on the remote GitHub repo.
You should also note that Git is suggesting you create a so-called pull request to move changes from the develop branch into the main branch by navigating to your remote repo on GitHub.
Tip: If you navigate back to the testing repo on GitHub and refresh your browser, you will see that there are now 2 branches there: main and develop.
Let’s go ahead and make a small change to our R test script (note that the following assumes you have not already made these changes in the above section on Centralized workflow). After doing so, we’ll commit and push the changes on the develop branch up to our remote repo on GitHub.
Task: Add the following 2 lines of code to the
00_testing.R
script. Save the changes when you are
finished.
Task: Add the file to staging and commit it. Include a short but informative commit message and click the Commit button when ready.
Task: Click on the green uparrow next to “Push” to push your changes to GitHub.
Note: RStudio will pop open a window echoing your push command and the destination.
Now let’s take the advice we received earlier about creating a “pull request” (or “PR” for short). You can think of a pull request as an offer of support. In effect, someone who submits a pull request is kindly requesting that the repo owner pull new changes into their repo.
Task: Navigate to your testing repo on GitHub.
Note: Upon arriving there, we see a message that “develop had recent pushes less than a minute ago”. To the right of that is a green button marked Compare & pull request.
Task: Click on the green Compare & pull request button, which will take you to a page to open a pull request.
Tip: At the top you will see some information about the branches from which and to where the changes will be merged, along with a note about the branches being okay to merge. If you scroll down the page, you’ll see the changes that were committed, when they were made, and who did it.
Tip: If your pull request involves multiple files, features, etc, it’s a good idea to include some additional information in the comment box (this can include Markdown flavored text).
Task: Click on the green Create pull request button, which will take you to another page where you can review any comments (here there are none) and merge the pull request. In the middle of the page, you’ll see a note about This branch has no conflicts with the base branch and a green button labeled Merge pull request. Go ahead and push that button.
At this point, the window will switch over to asking you to confirm the merge.
Task: Go ahead and click on the green Confirm merge button.
After doing so, GitHub will now report Pull request successfully merged and closed (you’ll also notice that the green Open oval below the name of the pull request has changed to a purple Merged oval). GitHub also tells you that “You’re all set–the develop branch can be safely deleted.”
Tip: If your branch was created solely for the purpose of working on a specific feature, then you can go ahead and delete it. If, however, you want to maintain a development (or other) branch, then you can just ignore this message and keeping working on it as well.
Recall that a forking workflow is commonly used when a developer is not a member of the project/organization on GitHub. This workflow allows the developer to work with a copy of the owner’s repo and potentially submit a pull request for the owner to consider adopting the changes.
To fork a repo, navigate to its location on GitHub and look for the Fork button in the upper right. Press that button, which will bring up a dialogue box asking you where you would like to fork the repo. Click on the user or organization where you’d like to fork the repo and GitHub will do the rest.
Note: When you click on the Fork button, GitHub will open a page that looks similar to when you create a new repo.
Task: Click the green Create fork button to proceed.
You will now see a new forked version of the original repository with some information in the upper left about its location and from where it came. You’ll also see some information above the repo contents about the current branch being up to date with the “upstream” source.
Note: In the not so distant past, you had to use Git via the command line to specify where the “upstream” repo of a fork was and then “fetch” any changes. GitHub has now streamlined that process by providing a “sync fork” button to do so (see below).
Task: Clone the remote GitHub repo to a local repo by creating a new project in RStudio based upon your newly forked repo. Click here if you need a reminder as to how to do that.
At this point, your local copy of the remote repo can only pull changes from your remote at
https://github.com/USERNAME/ex-fork
where USERNAME
is your GitHub username. Unfortunately
there is no way to update your remote repo directly from the owner’s
repo at
https://github.com/FISH549/ex-fork
.
Instead, we have to set up our local repo to pull changes from the
owner’s remote repo and then we can push those up to our remote copy. By
convention, we will refer to the owner’s repo as the
upstream
repo, but you could give it any name.
At this point you have 2 options for adding the upstream
repo.
Task: Click on the Terminal tab in your RStudio console window and type (or copy/paste) the following Git command:
git remote add upstream https://github.com/FISH549/ex-fork
Task: Open up your project in RStudio and click on the “New branch” button in the Git pane (i.e., the button with the 2 purple rectangles and a diamond).
Task: Click on the button labeled “Add remote…”.
Task: In the “Remote Name:” box, type
upstream
. In the “Remote URL:” box, enter the name of the
owner’s repo you forked in Step 1:
https://github.com/FISH549/ex-fork
(you can copy/paste if
you’d like). When you are finished, click the Add
button.
Task: Click on the Cancel button (don’t worry, everything is still OK).
Task: Now you need to verify that the remote
upstream
is set up properly. To do so, click on the
Terminal tab in your Console pane and type this:
$ git remote -v
which should return the following, where USERNAME
is
your user name.
origin https://github.com/USERNAME/ex-fork (fetch)
origin https://github.com/USERNAME/ex-fork (push)
upstream https://github.com/FISH549/ex-fork (fetch)
upstream https://github.com/FISH549/ex-fork (push)
Now you are ready to make any changes you’d like to the files or folders in your local repo.
Tip: Before doing so, make sure to
pull any changes that might have been made to the
owner’s upstream
remote.
Task: To do so, again navigate to the Terminal tab in RStudio and type:
$ git pull upstream main --ff-only
If nothing has changed, you’ll see a message like this:
From https://github.com/FISH549/ex-fork
* branch main -> FETCH_HEAD
* [new branch] main -> upstream/main
Already up to date.
If you examine your forked copy of the repo on GitHub, you’ll notice that it contains 2 branches: main and develop. However, if you inspect the local branches in your current project in RStudio, you’ll see that only main exists there.
Tip: To set the local branch to develop (where we’d like to do our work), click on develop under (REMOTE: ORIGIN), which will bring up a message box from Git informing you that you’ve created and switched over to a new branch called develop.
Task: Close the window and you will see that the branch name in the Git pane of RStudio is set to develop.
https://github.com/USERNAME/ex-fork
)
You can also track the main branch on the remote
upstream (located at
https://github.com/FISH549/ex-fork
)
Let’s create a new file in our local repo so we have something to
commit and push to our own remote repo, and then ultimately submit it as
a pull request to the original repo’s owner (here
FISH549
).
Task: Go ahead and make a simple R script and commit
it (I’m called mine simple_addition.R
).
Task: When you’re happy with your script, go ahead and commit it.
Now it’s time to push your simple script to your own remote located at
https://github.com/USERNAME/ex-fork
Task: Click the green uparrow to push your new script to GitHub.
Task: Navigate to your forked repo on GitHub where you’ll see a message about this branch having a recent push and a new green button labeled Compare & pull request.
Task: Open a pull request by clicking on the green Compare & pull request button.
https://github.com/FISH549/ex-fork
)
Tip: It’s good practice to give your pull request an informative title and include some additional information about the substance of the pull request, so that the owner of the original repo has some understanding of what you’ve done.
Task: Give your pull request a title and description. When ready, click on the green Create pull request button, which will take you to a page showing the details of your pull request.
Success: No you just have to wait for the repo’s owner to review the pull request and either accept it as is, or come back to you with additional comments or requests for changes.