If you’ve been a developer for a while, then you hopefully know it is wise to keep secret information such as passwords and encryption keys outside of source control. If you didn’t know that, then surprise! Now you know.
Slip-ups still happen, though. A password ends up in a default config file, or a new config file never gets added to “.gitignore”, someone runs “git add .”, and nobody notices that it was committed. Nobody is infallible, so there should be protections in place no matter how diligent your programmers are, and the peace of mind is well worth it.
How can software know something is a secret?
It can’t know for sure, but it can make an educated guess. Secrets typically fit certain known patterns, or have higher entropy than other strings in your code and configuration files. A good scanner should check for strings that fit these patterns throughout your entire repository’s history, and raise anything suspicious to you.
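To make that a little more concrete, here is a minimal shell sketch of both heuristics. It is not how Gitleaks or GitHub actually implement scanning. The function names (shannon_entropy, looks_like_secret), the AWS-style access key pattern, the 20-character minimum, and the threshold of 4.0 bits per character are all assumptions I picked for illustration.

```shell
#!/bin/sh
# Shannon entropy of a string in bits per character, printed with two decimals.
shannon_entropy() {
  printf '%s\n' "$1" | awk '{
    n = length($0)
    if (n == 0) { print "0.00"; exit }
    # Count how often each character occurs.
    for (i = 1; i <= n; i++) count[substr($0, i, 1)]++
    h = 0
    for (c in count) { p = count[c] / n; h -= p * log(p) / log(2) }
    printf "%.2f\n", h
  }'
}

# Exit status 0 when the candidate string looks like a secret, 1 otherwise.
looks_like_secret() {
  candidate=$1
  # Heuristic 1: a known pattern (here, the shape of an AWS access key ID).
  if printf '%s\n' "$candidate" | grep -Eq 'AKIA[0-9A-Z]{16}'; then
    return 0
  fi
  # Heuristic 2: a long string with high entropy (thresholds are assumptions).
  if [ "${#candidate}" -ge 20 ]; then
    entropy=$(shannon_entropy "$candidate")
    awk -v e="$entropy" 'BEGIN { exit !(e > 4.0) }'
    return $?
  fi
  return 1
}
```

With these thresholds, a short phrase like “hello world” is ignored, and so is a long run of the same character, because its entropy is zero. Real scanners ship far larger rule sets and tune their thresholds to keep false positives down.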
Checking for Secrets in CI and CD
When it comes to automatically checking for secrets in your code, you have quite an array of options. To keep this article brief, I am just going to cover a few tools, and which tools you use may depend on your repository host.
If you’re using GitHub for your project and your repository is either public or you use GitHub Enterprise Cloud, then GitHub will automatically scan the code you upload for secrets. GitHub’s solution is special because they have partnered with several different companies to allow for automatic revocation of secrets pushed to the repo. See the following excerpt from GitHub’s secret scanner documentation:
When you make a repository public, or push changes to a public repository, GitHub always scans the code for secrets that match partner patterns. If secret scanning detects a potential secret, we notify the service provider who issued the secret. The service provider validates the string and then decides whether they should revoke the secret, issue a new secret, or contact you directly. Their action will depend on the associated risks to you or them.
Well, that’s nice now isn’t it? If you’re curious whether the services you use are partnered with GitHub so that their secrets can be scanned for, the full list of partners is in GitHub’s secret scanning documentation. Just keep in mind that this functionality is only available for public repositories and private repositories using GitHub Enterprise Cloud with a “GitHub Advanced Security” license.
GitLab, like GitHub, has secret detection as well, and it uses Gitleaks to do it. Gitleaks is a well-documented tool whose source code is freely available. The capabilities of secret detection in GitLab do vary based on your tier, though.
For example, you will need GitLab Ultimate to view detected secrets in the pipeline and merge request views. You can still use the scanner in the Free and Premium tiers, but it isn’t nearly as integrated as it is in Ultimate.
I mentioned that GitLab uses Gitleaks, but you aren’t limited to using it with GitLab! Since Gitleaks is open source, you can use it with other providers such as GitHub, and even run it locally on your own system. It is also easy to set up, whether as a CI job or on your own machine.
Scanning for Secrets using a CI Job
For GitHub, you can use the gitleaks-action GitHub Action, made by the author of Gitleaks. Gitleaks is especially helpful here if you’re using private repositories on GitHub without GitHub Enterprise Cloud. The action is fully configurable, since it lets you specify a custom .gitleaks.toml file. This is optional of course, and the default might work fine for you.
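A workflow using the action might look roughly like the sketch below. The file name, job name, and triggers are arbitrary choices of mine. The GITLEAKS_CONFIG variable is how I understand the action accepts a custom config path, so check the action’s README for the version you pin, which also covers the license key it asks for on organization accounts.

```yaml
# .github/workflows/gitleaks.yml (the file name is an arbitrary choice)
name: gitleaks
on: [push, pull_request]

jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # fetch the full history so older commits are scanned too
      - uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          # Optional (assumption: supported by the pinned action version).
          # Remove this line to use the default rules.
          GITLEAKS_CONFIG: .gitleaks.toml
```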
Checking for Secrets in a Pre-Commit Hook
There are a couple of ways to accomplish setting up the hook. A pre-commit script is available on the Gitleaks GitHub that will run Gitleaks on your staged files before you commit. Your commit will be stopped if any secrets are detected. This script can simply be copied into your .git/hooks/ directory. It does require that Gitleaks is installed and in your $PATH, however.
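If you would rather write the hook yourself, a minimal version could look like the sketch below. This is not the script from the Gitleaks repository. It assumes a Gitleaks 8.x release in which “gitleaks protect --staged” scans staged changes and exits non-zero on a finding (newer releases may rename this subcommand, so check “gitleaks --help”), and the function name scan_staged_changes is my own.

```shell
#!/bin/sh
# Sketch of a hand-written .git/hooks/pre-commit (remember to chmod +x it).

scan_staged_changes() {
  # Fail closed: without the scanner we cannot vouch for the commit.
  if ! command -v gitleaks >/dev/null 2>&1; then
    echo "pre-commit: gitleaks not found in PATH" >&2
    return 1
  fi
  # Assumption: Gitleaks 8.x, where "protect --staged" scans staged changes
  # and exits non-zero when it finds a leak.
  if ! gitleaks protect --staged --verbose; then
    echo "pre-commit: possible secret detected, commit aborted" >&2
    return 1
  fi
  return 0
}

# Run the scan only when Git invokes this file as the pre-commit hook.
case "$0" in
  */pre-commit) scan_staged_changes || exit 1 ;;
esac
```

I chose to block the commit when Gitleaks is missing rather than skip the scan silently, since a hook that quietly does nothing is easy to mistake for one that passed.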
The other method involves using the pre-commit utility. It installs Gitleaks automatically for any developers that clone the repository, and it can also help with installing the hooks for the first time. Using the pre-commit tool might make more sense if you want to ensure other linters and checkers run, and you don’t want to have developers juggle installing everything themselves.
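For reference, a pre-commit configuration for Gitleaks can be as small as the fragment below. The rev shown is only an example tag, so pin whichever Gitleaks release you actually want.

```yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.0  # example tag; pin to the release you actually want
    hooks:
      - id: gitleaks
```

Once the file is committed, each developer runs “pre-commit install” once in their clone to wire up the hook.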
A Good Code Review Process Goes a Long Way
Automated tooling for identifying secrets in code works well, and you should definitely use it, but it’s still good to keep an eye out for secrets when reviewing code. These tools aren’t perfect. They look for sets of patterns and strings with high entropy, and not all secrets fit these criteria.
Since secrets can sneak past even the best scanning tools, it is important to also have a good code review process to catch them. Also remember that committing secrets isn’t the only thing you should be worried about. You should have others review your work to reduce the chances that your changes introduce new security vulnerabilities in the code as well!
Oh no, there’s already a secret in my Git history!
If you already have secrets in your repository, and they’re pushed to a main branch, not all hope is lost. Before we get into methods of removing the secrets, I have a massive disclaimer I should get out of the way. The only way to truly remove secrets from your repository is to rewrite your git history. This is a destructive operation, and will require developers to re-pull branches and cherry-pick changes from their local branches if applicable.
Did you read the disclaimer? Good, we can discuss methods then. Which method to use depends on how and when the secret made it into the repository.
The Secret is in a Single Branch
If a secret made it into a feature branch by mistake, then you could simply initiate an interactive rebase to remove it and force push that branch to the remote. It should be noted that this is only effective if no other branches are based off of your branch, and if it isn’t tagged.
Let’s say, for example, you push your code to the remote and a CI job identifies a secret after your push. At this point there would be nobody else using your branch and it shouldn’t be tagged, so this is the perfect opportunity to just rewrite the commit that triggered the CI failure. If your commit hash is, let’s say, 09fac8dbfd27bd9b4d23a00eb648aa751789536d, then this is the first command you would have to execute to begin cleaning up your branch’s history:
$ git rebase --interactive 09fac8dbfd27bd9b4d23a00eb648aa751789536d^
Note the caret at the end of the SHA-1 commit hash. It is vitally important, because we need to rebase onto the commit prior to the one that introduced the secret. The gist is that we’re going to return to that point in time and prevent the secret from ever being added in the first place.
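If the caret notation is new to you, this small sketch shows what it resolves to. The helper name parent_of is hypothetical; “git rev-parse” is the real Git command doing the work.

```shell
#!/bin/sh
# Print the first parent of a commit; "<commit>^" is shorthand for "<commit>^1".
# Prints nothing and returns non-zero when the commit has no parent.
parent_of() {
  git rev-parse --verify --quiet "$1^"
}
```

If parent_of fails for your commit, the secret was introduced in the very first commit of the repository. In that case there is no earlier commit to rebase onto, and “git rebase --interactive --root” is the variant to reach for.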
Git will now open your default command-line text editor and ask you how you want to execute your rebase. Find the line referencing the problematic commit and replace pick with edit on that line. Once you save the file and quit, you should find yourself at the commit where the secret was introduced. From here you can remove the secret, stage the affected files, and then execute the following commands:
$ git commit --all --amend --no-edit
And if that’s successful:
$ git rebase --continue
After that, the history should be rewritten locally so that it no longer includes the secret. If you pushed your branch already, then you’ll need to get these changes pushed to the remote as well. You can do this with a force push using the -f flag like so:
$ git push origin <your_branch_name> -f
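Two optional refinements, sketched below under my own assumptions. The first checks that the string no longer shows up in the branch’s history before you push. The second swaps the plain -f for “--force-with-lease”, which refuses to overwrite the remote branch if someone else pushed to it since your last fetch. The function names secret_in_history and push_rewritten_branch are hypothetical.

```shell
#!/bin/sh
# Exit status 0 if any commit reachable from the ref (default: HEAD)
# added or removed the given string.
secret_in_history() {
  secret=$1
  ref=${2:-HEAD}
  [ -n "$(git log --oneline -S"$secret" "$ref" --)" ]
}

# Force-push a rewritten branch, but only after confirming the secret is gone.
# Usage: push_rewritten_branch <branch> <secret-string>
push_rewritten_branch() {
  branch=$1
  if secret_in_history "$2" "$branch"; then
    echo "secret still present in the history of $branch, not pushing" >&2
    return 1
  fi
  git push --force-with-lease origin "$branch"
}
```

Keep in mind that the check only covers commits reachable from the branch you name, so a secret that lives on in another branch or a tag will not be reported.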
Now you should be set! At this point, you can get to fixing up your code to pull the secret some other way that doesn’t include config files or strings hard-coded inside of your codebase.
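One common option is an environment variable that your deployment or CI system injects at runtime. A minimal sketch, assuming a variable named API_TOKEN (a made-up name for this example) and a hypothetical helper called load_api_token:

```shell
#!/bin/sh
# Read the secret from the environment instead of a config file or source code.
# API_TOKEN is a hypothetical variable name used only for this example.
load_api_token() {
  if [ -z "${API_TOKEN:-}" ]; then
    echo "API_TOKEN is not set; export it or inject it from your secret store" >&2
    return 1
  fi
  printf '%s' "$API_TOKEN"
}
```

Failing loudly when the variable is missing is deliberate. It is better for the program to stop than to carry on with an empty credential.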
The Secrets are in Main Already…
If your secrets are present in many branches, like tagged versions or your main branch, then things get a little more complicated. There’s more than one way to handle this situation, but I am going to cover only one. Just revoke the secret.
In this scenario, you should revoke the secret and issue a new one. If you do this, then it doesn’t matter that the old secret is still in the repository history because it will be entirely useless! How this is done of course depends on what service the secret was issued from.
An added benefit is that anyone who has already cloned your repo with the secret in it will be unable to use it. Rewriting history alone doesn’t help if an adversary downloaded the secret before you deleted it.
It’s important to note that revoking isn’t always painless if, for example, the secret is used in multiple projects. You will need to ensure the old secret is replaced with the new one everywhere before you actually revoke it, or else you may experience downtime.
With scanning tools becoming more accessible, there are fewer and fewer reasons to not use them. Secret scanning is especially important for public repositories, but it is also useful for private repositories where a compromised developer account can access secrets and wreak havoc.