Git can support your project not just with version control, but also with collaboration and release management. Understanding how Git workflow patterns can help or hinder a project will give you the knowledge to evaluate and adapt your project’s Git processes effectively.
Throughout this guide I will isolate software development process patterns found in common Git workflows. Knowledge of these will help you find a direction when joining, creating or growing a development team. The pros and cons for certain types of projects or teams will be highlighted within the workflow examples we explore, so that you can pick and choose what might work well for your scenario.
This is not an introduction to using Git. There are fabulous guides and documentation for this out there already. You will benefit from this Git workflow guide if you already have experience within an application development team and have faced workflow snags, integration implosions or git-tastrophes - these patterns may shed some light on how to avoid those situations in the future.
In terms of Git process, collaboration is often about branching workflows. Thinking ahead on how you will intertwine commit trees will help you minimize integration bugs and support your release management strategy.
Use an integration branch with software development teams who work towards deploying a collection of contributions into production as a single entity. This is opposed to teams that focus on deploying features individually. Often teams may want to be doing the latter but practical limitations impose a process that groups their efforts, and the team ends up doing the former, so be sure to review your actual Git usage to see if you would benefit from using this type of collaboration pattern.
This workflow pattern is a useful staging point for when the risk of integrating multiple branches is high enough to warrant testing the combined contributions as a whole.
An integration branch usually consists of a major feature and several smaller contributions to be deployed together. Put an integration branch through your development team’s process (QA and acceptance testing, for example). Push minor commits onto it to bring it close to production ready, and then use an environment branch or release branch (discussed below) to prepare it for deployment.
Be aware that the contributions on the integration branch need to be merged into the next release stage before another major feature can be merged into the integration branch - otherwise you are mixing features at different stages of completion. This will inhibit your ability to release what is ready.
Teams will want to use topic branches if it is important to keep their commit trees in a state that can be easily read or have individual features reverted. Topic branches signify that the commits may be overwritten (using a force push) to clean up their structure and be shrunk down to a feature commit.
Topic branches are often owned by an individual contributor but can also be a designated space for a team to develop a feature upon. Other contributors know that this type of branch could have its commit tree re-written at any moment, and should not try to keep their local branches synchronized with it.
Without utilizing topic branches in your Git workflow you are restricted to sticking by the commits you push to a remote branch. Force pushing a new commit tree to a remote branch could anger other contributors who rely on the maintained integrity of the branch that they synchronize with.
Chances are that you use this workflow pattern already without realizing it, but it’s worth having a shared set of definitions amongst teams to reinforce the practices behind them. For example, you may find the convention of prefixing the branch name with the initials of the branch creator helps to signal which are topic branches. Either way, it’s up to your team to decide on internal conventions.
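As a minimal sketch of such a convention, assume topic branches are named with the creator's lowercase initials, a slash, and a short description, for example jd/login-form. The slash separator and the two-to-three-letter initials are assumptions made purely for illustration. A small helper could then tell topic branches apart from shared ones:

```python
import re

# Hypothetical convention: 2-3 lowercase initials, a slash, then a description.
TOPIC_BRANCH_PATTERN = re.compile(r"[a-z]{2,3}/[a-z0-9][a-z0-9._-]*")


def is_topic_branch(branch_name: str) -> bool:
    """Return True if the name follows the assumed topic-branch convention."""
    return TOPIC_BRANCH_PATTERN.fullmatch(branch_name) is not None


def topic_branches(branch_names: list[str]) -> list[str]:
    """Keep only the branch names that look like topic branches."""
    return [name for name in branch_names if is_topic_branch(name)]
```

A check like this could warn a contributor before they base work on a branch whose commit tree may be re-written.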
DO NOT use topic branches on public repositories. You will cause a myriad of conflicts for anyone who has synchronized their local branches with a topic branch that has had its commit tree re-written.
Open source projects thrive using the fork, a GitHub-originated feature. The fork empowers the repository maintainers with an enforced gateway over pushing directly to an origin repository branch, but more importantly it facilitates collaboration. Wahoo!
You may find yourself in the scenario where creating a fork of a private repository suits your needs too. Setting the origin repository to read-only for the contributors of the fork repository and rolling with pull requests gives you the same benefits that the open source community experience. Teams from different organizations can work effectively using a fork which can be the platform for communication and project policy adherence.
The fork workflow pattern gives teams their own space to work in whatever way they are used to, with a single integration point between the two repositories - a pull request. Over-communicating is imperative within the pull request description. The teams have had separate communication streams before a pull request has been issued, and highlighting the decisions that have already been made will speed up the review process.
Of course one benefit of the fork workflow is that you can direct comments to contributors of the origin repository, as the permissions cascade downwards. From the point of view of the origin repository, you have the control to delete forks when they are no longer needed.
Make sure you are using a tool that facilitates forking and pull requests to take advantage of this pattern. These tools are not limited to GitHub: other popular choices are Bitbucket and GitLab, and quite a few other Git hosting services have these features (or similar). Pick the service that works best for you.
DO NOT use a fork of a private repository for each member of a team. The numerous forked repositories can make it difficult for multiple members to collaborate on the same feature branch, and keeping all of these repositories in sync can become error prone due to the sheer number of the moving parts. Open source projects have core team members with push access to the origin repository that lessen this overhead.
A common outsourcing strategy is to have contribution “seats” on a project that can be filled by multiple software developers. It’s up to the outsourcing company to manage their resource pipeline to deliver contracted hours. The issues they face are how to on-board, train and maintain a pool of developers for each client’s projects.
Using a clone of the project’s repository lays out an isolated training and communication ground for the outsourced team to manage their contributions, enforce policies and take advantage of knowledge sharing - all out from under the watchful eye of the client’s development team. Once a contribution is deemed up to standard and ready for the main repository, it can be pushed to one of the origin repository’s remote branches and integrated as usual.
Some projects have high expectations for following their coding conventions and defined Git workflow standards to contribute to their repository. It can be daunting working in this environment until you have learnt the ropes, so work together as a team to optimize both parties’ time.
DO NOT create a hosted copy of the client’s repository without their permission, as you could be breaking a contractual agreement. Verify up front with the client that this practice will benefit the project.
The steps between going from collaboration to release are going to start at different points within the development process for each team. Generally, you would not want to use more than one release management Git pattern. You want to have the simplest possible workflow that will enable your team to deliver effectively.
Your software development process may be supported by several environments to help with quality assurance before being deployed into production. Environment branches mimic the stages of this process: each stage corresponds to a branch, and contributions flow through these in a pipeline.
Teams running with these processes often have application environments set up for each stage in the pipeline, for example “QA”, “Staging” and “Production”. In these cases the infrastructure is in place to support personnel who are responsible for signing off a feature or contribution for their slice of what it means to be production ready (e.g. exploratory testing, QA, acceptance testing), before moving it onto the next person’s stage. This gives them their own place to deploy, test, and evaluate against their requirements, with a Git workflow to record its journey through the sign-off tunnel.
Having a branch for each stage of the process is OK for small teams that can work towards a release as a unit. Unfortunately, a pipeline like this can too easily bottleneck or bunch up and leave gaps. It couples your Git process to your infrastructure, which can cause issues when feature demands ramp up and both processes need to scale.
DO NOT use this pattern without considering the long-term benefits of other patterns first.
A team that pushes a collection of contributions out to their production application as a unit in successive sprints can find release branches a favorable fit.
A collection of near “production ready” commits are given minor bug fixes on a release branch. Use an integration branch to combine and test the features before moving its commit tree onto a release branch. Limit the responsibility of a release branch to being a final check before deployment to the production application.
Release branches differ from environment branches in that they have a short lifespan. Release branches are created only when needed and destroyed after their commit trees have been deployed into production.
Try to avoid coupling release branches to your software development roadmap. Restricting yourself to following a pre-determined plan delays deploying a release until all of the planned features are production ready. Not assigning a version number to the roadmap before creating a release branch can alleviate these types of delays, by allowing the features that are production ready to be put onto a release branch and deployed.
Do use a version number naming convention for the release branch name to make obvious what version of the repository has been deployed into production.
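One way to keep such a convention consistent is to generate the branch name rather than type it by hand. The sketch below assumes a hypothetical release- prefix followed by a MAJOR.MINOR.PATCH version; the guide only asks that the version number be visible in the name, so treat both the prefix and the version format as illustrative choices.

```python
import re

# Hypothetical prefix; only the presence of the version number matters.
RELEASE_PREFIX = "release-"
VERSION_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def release_branch_name(version: str) -> str:
    """Build a name such as 'release-1.4.0' from a MAJOR.MINOR.PATCH version."""
    if VERSION_PATTERN.fullmatch(version) is None:
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {version!r}")
    return RELEASE_PREFIX + version


def deployed_version(branch_name: str) -> str:
    """Recover the version number from a release branch name."""
    if not branch_name.startswith(RELEASE_PREFIX):
        raise ValueError(f"{branch_name!r} is not a release branch")
    return branch_name[len(RELEASE_PREFIX):]
```

With these helpers, release_branch_name("1.4.0") yields release-1.4.0, and anyone reading the branch list can map it straight back to the deployed version.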
Deploy the master branch and not the release branch. To encourage making minor fixes on release branches prior to merging with the master branch, use a Git hook that triggers after the master branch has been updated by a merge to automatically deploy the updated commit tree into production.
Allowing only one release branch to exist at any given moment ensures you will avoid the overhead of keeping multiple release branches in sync with each other.
DO NOT use release branches with multiple teams working on the same repository. Even though release branches are short lived, if the final checking of one takes too long then it holds up the other team from releasing. A team piggybacking on another team’s release branch is likely to introduce bugs and cause delays for both teams. Look at the timestamped release pattern below, which works better for larger numbers and groups of contributors.
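Git hooks belong to a repository rather than to a single branch, so the hook itself has to check which branch was updated. The sketch below assumes the deployment is triggered from a server-side post-receive hook, which Git feeds one line per updated ref on standard input in the form "old-sha new-sha ref-name". The function name and the decision to skip branch deletions are illustrative assumptions, and the actual deployment step is left out because it depends entirely on your tooling.

```python
from typing import Optional

DEPLOY_REF = "refs/heads/master"
# Git reports a deleted ref with an all-zero new value (40 zeros for SHA-1).
ZERO_SHA = "0" * 40


def commit_to_deploy(hook_input: str) -> Optional[str]:
    """Return the new master commit if this push updated master, else None."""
    for line in hook_input.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip anything that is not "<old> <new> <ref>"
        _old_sha, new_sha, ref_name = parts
        if ref_name == DEPLOY_REF and new_sha != ZERO_SHA:
            return new_sha
    return None
```

Inside a real hook script the input would come from sys.stdin.read(), and a non-None result would be handed to your deployment command. Hosted services often expose the same trigger through webhooks or a CI pipeline instead of a hook file.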
Applications with infrastructure restrictions commonly schedule their deployments during low traffic periods. If your project is faced with regular queues of features ready to be deployed then you may benefit from using timestamped releases.
A timestamped release relies on the deployment process to automatically add a timestamp tag to the last commit on the master branch that was deployed into production. Topic branches are used to put a feature through the development process before being merged into the master branch to await deployment.
The timestamp tag should include an actual timestamp and a label to indicate that it represents a deployment, for example:
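One possible shape for such a tag is sketched here under stated assumptions: the deployed- label and the timestamp layout are hypothetical, chosen only for illustration. Colons are avoided because Git does not allow them in ref names, and using UTC with zero-padded fields keeps the tags sortable as plain text.

```python
from datetime import datetime, timezone

# Hypothetical label; any prefix that marks the tag as a deployment would do.
DEPLOY_TAG_PREFIX = "deployed-"


def deployment_tag(deployed_at: datetime) -> str:
    """Build a tag name from a timezone-aware deployment time.

    The time is converted to UTC so tags from different machines compare
    consistently.
    """
    if deployed_at.tzinfo is None:
        raise ValueError("deployed_at must be timezone-aware")
    utc_time = deployed_at.astimezone(timezone.utc)
    return DEPLOY_TAG_PREFIX + utc_time.strftime("%Y-%m-%dT%H-%M-%SZ")
```

A deployment script could call deployment_tag(datetime.now(timezone.utc)), apply the result to the deployed commit with git tag, and push the tag to the origin repository.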
Including deployment meta-data, in the form of the timestamp tag within the commit tree of the master branch, will assist you in debugging regressions released into the production application. The person charged with hunting down the cause of the issue is unlikely to know a great deal about each and every line that is deployed into the production application. Running git log between the last two tags quickly lists the commits that were last deployed and their authors, who could help resolve the issue, while git diff on the same two tags shows the changes themselves.
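To illustrate that lookup, the sketch below picks the two most recent deployment tags from a list of tag names and builds the git log command that lists the commits between them. It assumes a hypothetical deployed- prefix followed by a zero-padded UTC timestamp, so that sorting the names as text also sorts them by time; the function names are invented for this example.

```python
# Hypothetical label shared by all deployment tags.
DEPLOY_TAG_PREFIX = "deployed-"


def last_two_deployments(tag_names: list[str]) -> tuple[str, str]:
    """Return (previous, latest) deployment tags from a list of tag names."""
    deploy_tags = sorted(
        name for name in tag_names if name.startswith(DEPLOY_TAG_PREFIX)
    )
    if len(deploy_tags) < 2:
        raise ValueError("need at least two deployment tags to compare")
    return deploy_tags[-2], deploy_tags[-1]


def log_command(previous: str, latest: str) -> list[str]:
    """Build a git command listing hash, author and subject for each commit
    reachable from the latest tag but not from the previous one."""
    return ["git", "log", "--format=%h %an %s", f"{previous}..{latest}"]
```

The tag names themselves would come from git tag --list, and the resulting command can be run with subprocess.run or pasted into a terminal.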
Timestamped releases are more than they appear on the surface. A simple mechanism for recording a deployment of queued features requires a surprising amount of good process to drive it. The process is one that can scale and works well with a small team of contributors too.
For this Git workflow pattern to be truly effective it needs the master branch to always be deployable. That could mean different things for your team, but essentially all commits must have gone through your project’s development process before ending up on the master branch.
New commits landing on the master branch are going to happen multiple times a day. This is an issue for topic branches that have been through the development process and have not been synchronized with the master branch during this time. Unfortunately such a scenario can introduce regressions into the master branch when merge conflicts are incorrectly dealt with.
If merge conflicts do arise between a topic branch and the master branch, then the risk of introducing a new bug should be discussed with your team before updating the remote master branch. If there is any doubt that a regression could occur then the topic branch can be put back through the quality assurance process with the merge conflicts resolved.
To reduce integration bugs, developers who are working on related parts of the repository can collaborate on when best to merge and synchronize their topic branches with the master branch. Integration branches work well to resolve conflicts from related topic branches too - these should be put through the testing process before being merged into the queue on the master branch pending deployment.
Software development projects with many contributors have to deal with collaboration and release management processes with practical and efficient approaches. The additional meta-data on the commit tree we gain from using timestamped releases is a pointer to the foresight of the teams who are preparing to respond to production issues.
If you have a repository that you not only run in production but others use for their own hosted applications, then using version branches can give your team the platform to support users who do not, or cannot, stay on the bleeding edge of your application’s developments.
A repository using version branches will have one branch per minor version of the application. Major, minor and patch versions are explained within the Semantic Versioning documentation. Version branches typically follow a naming convention to include the word “stable” and drop the patch number from the application version, e.g. 2-3-stable, to make their purpose and reliability obvious to end users.
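The naming rule can be expressed as a small helper. This is a sketch that assumes versions are written as plain MAJOR.MINOR.PATCH strings with no pre-release or build suffix; the function name is invented for this example.

```python
import re

VERSION_PATTERN = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def version_branch_name(version: str) -> str:
    """Map a MAJOR.MINOR.PATCH version to its version branch name.

    The patch number is dropped, so every patch release of a minor
    version maps to the same branch.
    """
    match = VERSION_PATTERN.fullmatch(version)
    if match is None:
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {version!r}")
    major, minor, _patch = match.groups()
    return f"{major}-{minor}-stable"
```

Both 2.3.0 and 2.3.7 map to 2-3-stable, which is why a security patch for any 2.3.x release lands on that one branch.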
Git tags may be applied down to the patch version number of the application, but version branches are not that fine grained. A version branch will always point to the most stable commit for a supported minor version.
When security patches or the need to backport functionality come along, put together the commits necessary to work for the older application versions that you support, and push them to the respective version branches.
DO NOT use version branches unless you support more than one version of your repository.
When your team changes size, or your project develops its processes through continuous evaluation, don’t leave out evaluating your Git process too. Use the patterns in this tutorial as a starting point to help direct you down the path of Git workflow righteousness.
The patterns in this guide can help to arm you with some foresight in adapting your distributed version control system to work for you. If you would like to read up on Git workflows be sure to check out Gitflow, GitHub Flow, and most importantly the amazing git-scm documentation!