Why should you use Git/GitHub?
1 Collaboration and Version Control
Git lets you keep track of every version of your analysis and follow changes even when your collaborators are working on different computers. The utility of Git-based version control is most apparent for massive collaborative projects, like the dplyr package, which has over 250 contributors.
But even for smaller projects (or if you work alone), there are benefits to avoiding the “file renaming” method, for example:
You can collaborate on the most up-to-date versions of the analysis asynchronously
You can see what changed between versions, who made the change, and when, by looking at the “diff,” even years later
If you make a mistake and break your code or want to try something new, you can always revert to the last working version
You can fold organizing and sharing your work into your daily workflow, and
You make revisiting your code easier for future you.
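To make the idea of a “diff” concrete, here is a minimal sketch that compares two versions of a small analysis script. It uses Python's standard difflib module rather than Git itself, and the file name and script contents are hypothetical, chosen only for illustration. Git reports changes in a similar unified format, with added lines marked by + and removed lines by -.

```python
import difflib

# Two hypothetical versions of the same analysis script (illustrative only).
old_version = [
    "data <- read.csv('survey.csv')\n",
    "model <- lm(y ~ x, data = data)\n",
]
new_version = [
    "data <- read.csv('survey.csv')\n",
    "data <- na.omit(data)\n",  # the one line added between versions
    "model <- lm(y ~ x, data = data)\n",
]

# unified_diff yields header lines, a hunk marker, then the context,
# added (+) and removed (-) lines.
diff_lines = list(
    difflib.unified_diff(
        old_version,
        new_version,
        fromfile="analysis.R (before)",
        tofile="analysis.R (after)",
    )
)

print("".join(diff_lines))
```

In the printed output, the line beginning with + is the only change between the two versions; the unchanged lines appear as context around it.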
2 Reproducibility and Sharing
2.1 Reproducibility
We make many small decisions during data analysis that can significantly affect its outcome. Often, these choices are left out of the methods sections of publications. Hosting your work on GitHub in a public repository provides a line-by-line record of your analysis, which can be reviewed by others interested in your work or looking to use similar methods.
3 Recover from mistakes
See the section on recovering from mistakes.