11 The basics of Git and GitHub
11.1 Learning objectives
The learning objectives of this session are to:
- List and describe Git’s core functionality and purpose, and how GitHub expands on that.
- Explain the difference between Git and GitHub.
- Explain how GitHub differs from services like OneDrive or Dropbox.
11.2 💬 Discussion activity: Recall what you read during the pre-workshop tasks
Time: ~5 Minutes
Before we start the more practical part of the workshop, we’ll take some time to refresh your memory on what you read about Git and GitHub in the pre-workshop tasks. So:
- For 1 minute, recall what you understood about Git and GitHub from the pre-workshop tasks. Think about how you’d explain it to someone else.
- For 4 minutes, pair up with your neighbour and take turns explaining to them what you remember, 2 minutes each.
11.3 📖 Reading task: What is version control and Git?
After they’ve read it, take some time to repeat some key points from the text, such as:
- Emphasising how people usually version files.
- Highlighting that Git can track any file type, but that Git has more features for text-based files.
- Reinforcing what “plain text” files are.
Time: ~5 minutes.
The text below is the same text you read for the pre-workshop tasks.
So, why are we asking you to discuss it and then read it again?
Because Git is hard to learn. It requires changing how you think about working with files, which often takes time to adjust to. By revisiting the material through reading, discussion, and rereading we want to help you build familiarity with these concepts before moving on to the hands-on parts of the workshop.
One of the world’s most popular version control systems is called Git. Git is used by millions of people around the world, including thousands of organisations. It is also used increasingly by researchers.
With Git you can create snapshots of file changes, known as commits. Each commit captures:
- What specific changes were made to the file or files.
- Who made the changes to the files.
- When they made the changes to the files.
Each commit also has a short message attached to it that can describe why the changes were made.
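To make the idea of a commit more concrete, here is a minimal Python sketch of the information a single commit records. It is only an illustration: the Commit class, its field names, and the example values are hypothetical, and this is not how Git stores commits internally.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Commit:
    """A hypothetical, simplified picture of what one commit records."""

    changes: str         # what was changed in the file or files
    author: str          # who made the changes
    timestamp: datetime  # when the changes were made
    message: str         # short note that can describe why


# Made-up example values, purely for illustration.
example = Commit(
    changes="Added a 'Methods' heading to notes.txt",
    author="username",
    timestamp=datetime(2024, 1, 15, 10, 30),
    message="Start outlining the methods section",
)

print(f"{example.author} on {example.timestamp:%Y-%m-%d}: {example.message}")
```

The what, who, and when are captured automatically, while the message is the part you write yourself.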
Git stores these commits in a history log. The history log allows you to quickly go back and explore each change made to the files, along with the individual commit messages. This is extremely useful when you revisit your own work after a long time and when you work in groups or with collaborators.
Git only tracks changes to files within a specific folder (and its sub-folders). In Git terminology, this folder is called a repository (or a repo for short). The best way to use a repository is to store all files related to a specific project, like a research project or administration files for your lab or group, in the repository (the “folder”). This way, you can track all changes made to all files in the project. It keeps things more organised and self-contained, since everything related to a project is in one place.
Any type of file can be stored in a repository, including both text-based files and other files like Word documents or images. However, Git can only show the specific changes made to a file if it is text-based, like a .txt file, a .csv file, or code. Since these text-based files contain only text characters, it is easier for the computer to show the exact changes to the exact lines of text. Files like images or Word documents aren’t just text, so they have no “lines” on which to track changes.
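As a rough illustration of what “showing the exact changes to the exact lines” means, the sketch below compares two versions of a small text file using Python’s built-in difflib module. This is not the tool Git itself uses, and the file contents are made up, but the output resembles what Git and GitHub display: removed lines start with a minus sign and added lines start with a plus sign.

```python
import difflib

# Two hypothetical versions of the same plain text file, one string per line.
old_version = ["Title: My project", "Status: draft", "Author: username"]
new_version = ["Title: My project", "Status: final", "Author: username"]

# Compare the versions line by line.
diff = list(
    difflib.unified_diff(
        old_version,
        new_version,
        fromfile="before",
        tofile="after",
        lineterm="",
    )
)

for line in diff:
    print(line)
```

Only the second line differs between the two versions, so it is the only line marked as removed and added. A comparison like this only works because the file is made up of lines of text.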
11.4 What is GitHub?
Verbally explain the differences between Git and GitHub, briefly go over the diagram but reinforce that we won’t cover that in this workshop. Then, highlight some simple differences between tools like OneDrive and GitHub.
There are several ways to use Git. In this workshop, we will use GitHub, which is a website that hosts Git repositories and builds on Git’s core features. What this means is that your Git repositories can be stored on GitHub, and you can manage your files and projects using Git through GitHub’s web interface.
Everything we do in this workshop (including storing and managing files and folders) will happen through the GitHub website. Behind the scenes, GitHub will use Git to track the changes we make.
In the simplest terms, Git is software, while GitHub is a company and website that makes it easier to use Git and share Git repositories. For beginners, GitHub’s web interface has some advantages: you commit changes immediately after editing a file, and it’s easier to view changes and file history compared to using Git alone on your computer.
While we will only be interacting with Git via GitHub during this workshop, once you feel more comfortable with the concepts, you can start using Git on your computer (instead of via the GitHub website). Using Git on your computer has the benefit of being faster (you work locally, so you don’t need to wait for the internet) and more flexible (you can do more things with Git on your computer than on GitHub). Then you can use GitHub as a place to keep backups of your repository, collaborate with others, track tasks, and make use of the other features GitHub has. How you would use Git locally with GitHub looks something like the figure below.
Using GitHub on its own is a great way to get started with Git; it allows you to learn the concepts of version control and Git without needing to install anything on your computer and without needing to learn some of the more technical details of Git. Since GitHub is a website, it also makes it easier to share your work with others and to collaborate with others. This is one of the main reasons why GitHub is so popular.
You may notice that GitHub sounds a bit like file syncing tools such as OneDrive or Dropbox. So how is GitHub different? Unlike OneDrive or Dropbox, GitHub (via Git) tracks line-level changes to text-based files, not just file-level changes. This means you can see the specific changes made in a file, not just that it was changed. The messages you attach to commits also help you keep track of why the changes were made.
OneDrive and Dropbox use a simple way of handling conflicts (i.e., different changes to the same file) when syncing between the cloud and your computer: they either create a new file with some details appended to its name or overwrite the older version with whichever is newer. Git and GitHub, on the other hand, use a more complex way of handling conflicts by showing you the changes and allowing you to resolve them as you want to. This means that with Git and GitHub, you have complete control over how conflicts are resolved.
File syncing tools are really good for easily sharing files within a team or group, but they aren’t as good for working together on files. That’s where GitHub shines. It’s built for working on files together, not just sharing them.
Now that you know that you use Git and GitHub to work with files, this is the perfect time to go over what file paths are! 🎉
11.5 📖 Reading task: What is a file path?
Reinforce that:
- Paths are pointers to files on your computer
- They are for us humans to effectively organise and work with files
- Every file has a parent folder, and every folder may also have a parent folder
- Files and folders are separated by / or \ and that the last item in the path is either a file or a folder.
Time: ~3 minutes.
Operating systems like Windows and macOS try really hard to hide or obscure the filesystem, and ultimately file paths, from the user. This has some benefits, but also some downsides. Computers and their programs depend on file paths, so when paths are hidden, users don’t learn what they are or how to use them effectively. Then, as soon as a user needs to do even slightly deeper computer work, they encounter file paths and need to know how they work. This is especially true for Git and GitHub.
So to make sure we’re all on the same page, we’ll briefly describe what file paths are, and why they’re important to know about.
In simple terms, a path is the location of a file or folder in a filesystem. The last item in a path is either a folder, usually indicated by a / at the end, or a file, usually indicated by an extension like .txt or .docx. All items in the path before the last item are folders. For example:
- /Users/username/Documents/ is a path to the Documents folder, within the username folder, which is then within the Users folder.
- /Users/username/Documents/notes.txt is a path to the notes.txt file, within the Documents folder, which is within the username folder, and that finally is in the Users folder.
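If you are curious how a program sees the same path, the short Python sketch below takes the notes.txt path apart using the built-in pathlib module. It only works with the text of the path, so the file does not need to exist on your computer. Python is used here purely for illustration and is not needed for this workshop.

```python
from pathlib import PurePosixPath

# The example path from the text above, treated purely as text.
file_path = PurePosixPath("/Users/username/Documents/notes.txt")

print(file_path.name)    # the last item in the path: notes.txt
print(file_path.suffix)  # the file extension: .txt
print(file_path.parent)  # the folder containing the file: /Users/username/Documents
print(file_path.parts)   # every item in the path, from the root / to the file
```

Everything before the last item is a folder, and each folder sits inside the one listed before it.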
When you make files for work, it’s best to organise files and folders based on the project you are working on, so that things are easy to find and kept together. This is especially important when using tools like Git and GitHub. That’s because tools like Git and GitHub work within a specific folder and treat that specific folder as a Git repository. Then, all files within that repository (folder) are relative to one another. This “relativeness” is also shown by two “special characters”:
- ..: Two dots mean the folder up one, also called the “parent folder”. In the file path /Users/username/, the ../ is the /Users/ folder, since it is one folder up from username/.
- .: One dot means the current folder. If you’re in the folder /Users/username/ and see ./Documents/, it means the Documents/ folder within the username/ folder, like so: /Users/username/Documents/.
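The two special characters can also be sketched in Python. The example below uses the built-in posixpath module to work out which folder .. and . refer to when starting from /Users/username/. It only rewrites the text of the path and never looks at real files, and the variable names are just for illustration.

```python
import posixpath

current_folder = "/Users/username/"

# Two dots: go up one folder, to the parent folder.
parent_folder = posixpath.normpath(posixpath.join(current_folder, ".."))

# One dot: stay in the current folder, then go into Documents/.
documents_folder = posixpath.normpath(
    posixpath.join(current_folder, "./Documents/")
)

print(parent_folder)     # /Users
print(documents_folder)  # /Users/username/Documents
```

Note that this function drops the / at the end of each result, but both results still refer to folders.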
We’ll be working with and navigating the file path on GitHub throughout this workshop, so you will get more exposure to it as we go along.
11.6 💬 Discussion activity: Explain the basics of Git and GitHub
Time: ~4.5 minutes.
Learning is about recalling and explaining something in your own words. And since Git is such a fundamentally different way of working with and thinking about files, this discussion activity aims to help solidify what we’ve covered so far about Git and GitHub. So:
- Take ~30 seconds to silently explain to yourself what you understand the basics of Git and GitHub to be.
- Pair up with your neighbour and for the next 4 minutes, take turns (2 minutes each) explaining to each other what you understand about the basics of Git and GitHub, and how file paths relate to them. Try to come to a shared understanding of what Git and GitHub are, how to work with them, and how they’re different from other ways of working with files.
11.7 Summary
- Git is version control software that tracks changes to files in a repository. It allows you to see what changes were made, who made them, when they were made, and why they were made.
- A Git repository is a folder that contains all the files and sub-folders for a project.
- GitHub is a company and website that hosts Git repositories and adds tools to help you work with files in a repository.
- File paths are the location of a file or folder in a filesystem.
- Each change to a file in a repository creates a new commit in the Git history log, each with its own commit message (when working on GitHub).
- Commits are connected to each other, creating the history of changes made within a repository.