
A Guide to Keeping Secrets out of Git Repositories


If you’ve been a developer for a while, then you hopefully know it is wise to keep secret information such as passwords and encryption keys outside of source control. If you didn’t know that, then surprise! Now you know.

Slip-ups do happen: a password ends up in a default config file, or a new config file never makes it into “.gitignore”, someone runs “git add .”, and the secret gets committed without anyone noticing. No matter how diligent your developers are, nobody is infallible, so there should be protections in place, and the peace of mind is well worth it.

How can software know something is a secret?

It can’t know for sure, but it can make an educated guess. Secrets typically fit certain known patterns, or have higher entropy than other strings in your code and configuration files. A good scanner should check for strings that fit these patterns throughout your entire repository’s history, and flag anything suspicious for you.
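
For example, AWS access key IDs follow a predictable format, so a scanner can flag anything matching a pattern roughly like this (a simplified illustration, not any particular tool’s exact rule):

AKIA[0-9A-Z]{16}

Tokens that don’t follow a fixed format can still stand out, because a long random-looking string has much higher entropy than ordinary identifiers and prose.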

Checking for Secrets in CI and CD

When it comes to automatically checking for secrets in your code, you have quite an array of options. To keep this article brief, I am just going to cover a few tools; which ones make sense for you may depend on your repository host.

GitHub

If you’re using GitHub for your project and your repository is either public or you use GitHub Enterprise Cloud, then GitHub will automatically scan the code you upload for secrets. GitHub’s solution is special because they have partnered with several different companies to allow for automatic revocation of secrets pushed to the repo. See the following excerpt from GitHub’s secret scanner documentation:

When you make a repository public, or push changes to a public repository, GitHub always scans the code for secrets that match partner patterns. If secret scanning detects a potential secret, we notify the service provider who issued the secret. The service provider validates the string and then decides whether they should revoke the secret, issue a new secret, or contact you directly. Their action will depend on the associated risks to you or them.

Well, that’s nice now isn’t it? If you’re curious whether the services you use are partnered with GitHub so their secrets can be scanned for, you can view the full list here. Just keep in mind that this functionality is only available for public repositories, and for private repositories using GitHub Enterprise Cloud with a “GitHub Advanced Security” license.

GitLab

GitLab, like GitHub, offers secret detection. Under the hood, GitLab uses Gitleaks, a well-documented tool whose source code is freely available. The capabilities of secret detection in GitLab do vary based on your tier, though.

You will need GitLab Ultimate to view detected secrets in the pipeline and merge request sections, for example. You can still use the scanner on the Free and Premium tiers, but it isn’t nearly as integrated as it is in Ultimate.
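
If your tier includes it, enabling the scan is typically just a matter of including GitLab’s secret detection template in your .gitlab-ci.yml. The template path below is the commonly documented one, so confirm it against your GitLab version’s documentation:

include:
  - template: Security/Secret-Detection.gitlab-ci.yml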

Gitleaks

We mentioned that GitLab uses Gitleaks, but you aren’t limited to using it with GitLab! Since Gitleaks is open source, you can use it with other providers such as GitHub, or run it locally on your own system. It is also very easy to set up, whether as a CI job or as a local check.
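
For instance, once the gitleaks binary is installed, scanning a repository and its full history locally looks something like this (the command and flags shown are from Gitleaks v8 and may differ in other versions):

$ gitleaks detect --source . --verbose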

Scanning for Secrets using a CI Job

For GitHub, you can simply use the gitleaks-action made by the author of Gitleaks. This is especially helpful if you’re using private repositories on GitHub without GitHub Enterprise Cloud. The action is fully configurable and allows you to specify a custom .gitleaks.toml file. This is optional, of course, and the default configuration might work fine for you.
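
A minimal workflow using that action might look something like the following. Treat the action version and inputs as assumptions and check the action’s README for the current ones:

name: Secret Scan
on: [push, pull_request]

jobs:
  gitleaks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # full history so Gitleaks can scan every commit
      - uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}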

Checking for Secrets in a Pre-Commit Hook

There are a couple of ways to set up the hook. A pre-commit script is available in the Gitleaks GitHub repository that will run Gitleaks on your staged files before you commit, and your commit will be stopped if any secrets are detected. The script can simply be copied into your .git/hooks/ directory. It does require that Gitleaks is installed and in your $PATH, however.
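
The script in the Gitleaks repository is more thorough, but a minimal .git/hooks/pre-commit along these lines illustrates the idea (assuming Gitleaks v8 is installed and on your $PATH; the protect command may differ in newer releases):

#!/bin/sh
# Scan only the changes that are currently staged; block the commit if anything is found.
gitleaks protect --staged --verbose
if [ $? -ne 0 ]; then
  echo "Gitleaks found a potential secret. Commit aborted."
  exit 1
fi

Remember to make the hook executable with chmod +x .git/hooks/pre-commit.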

The other method involves using the pre-commit utility. It can install Gitleaks automatically for any developers who clone the repository, and it also handles installing the hooks themselves. Using the pre-commit tool might make more sense if you want other linters and checkers to run too, and you don’t want developers to juggle installing everything themselves.
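
With the pre-commit utility, the setup is roughly a .pre-commit-config.yaml like the one below, followed by running pre-commit install. The rev is a placeholder; pin whichever Gitleaks release you actually want:

repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.0 # placeholder; pin the release you want
    hooks:
      - id: gitleaks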

A Good Code Review Process Goes a Long Way

Although automated tooling for identifying secrets works well, it’s still good to keep an eye out for them when reviewing code. The scanning tools I mentioned earlier work great and you should definitely use them, but they aren’t perfect. They look for known patterns and strings with high entropy, and not all secrets fit those criteria.

Knowing that secrets can sneak through even the best scanning tools, it’s easy to see why a good code review process is also important for catching these issues. And remember that committing secrets isn’t the only thing to worry about: having others review your work also helps reduce the chance that your changes introduce new security vulnerabilities.

Oh no, there’s already a secret in my Git history!

If you already have secrets in your repository, and they’ve been pushed to a main branch, not all hope is lost. Before we get into methods of removing the secrets, I have a massive disclaimer to get out of the way: the only way to truly remove secrets from your repository is to rewrite your Git history. This is a destructive operation, and it will require developers to re-pull branches and cherry-pick changes from their local branches where applicable.

Did you read the disclaimer? Good, then we can discuss methods. Which method to use depends on how and when the secret made it into the repository.

The Secret is in a Single Branch

If a secret made it into a feature branch by mistake, then you could simply initiate an interactive rebase to remove it and force push that branch to the remote. It should be noted that this is only effective if no other branches are based off of your branch, and if it isn’t tagged.

Let’s say, for example, you push your code to the remote and a CI job identifies a secret after your push. At this point nobody else is using your branch and it shouldn’t be tagged, so this is the perfect opportunity to just rewrite the commit that triggered the CI failure. If the offending commit’s hash is, let’s say, 09fac8dbfd27bd9b4d23a00eb648aa751789536d, then this is the first command you would execute to begin cleaning up your branch’s history:

$ git rebase --interactive 09fac8dbfd27bd9b4d23a00eb648aa751789536d^

Note the caret at the end of the SHA-1 commit hash. It is vitally important to include, because we need to rebase onto the commit just before the one that introduced the secret. The gist is that we’re going to go back to that point in time and prevent the secret from ever being added in the first place.
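
When Git opens the rebase todo list in the next step, it will look something like the following, with the oldest commit listed first. The hashes and messages here are made up for illustration:

pick 09fac8d Add database configuration # this commit introduced the secret
pick 3c91a2e Implement login endpoint
pick 7be40d1 Update docs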

Git will now open your default command-line text editor and ask how you want to execute the rebase. Find the line referencing the problematic commit and replace pick with edit on that line. Once you save the file and quit, you should find yourself at the commit where the secret was introduced. From here you can remove the secret, stage the affected files, and then execute the following commands:

$ git commit --all --amend --no-edit

And if that’s successful:

$ git rebase --continue

After that, your local history should no longer include the secret. If you already pushed your branch, you’ll need to push these changes to the remote as well. You can do this with a force push using the -f flag like so:

$ git push origin <your_branch_name> -f

Now you should be set! At this point, you can get to fixing up your code so it loads the secret some other way that doesn’t involve committed config files or strings hard-coded in your codebase.

The Secrets are in Main Already…

If your secrets are present in many branches, like tagged versions or your main branch, then things get a little more complicated. There’s more than one way to handle this situation, but I am going to cover only one: just revoke the secret.

In this scenario, you should revoke the secret and issue a new one. If you do this, then it doesn’t matter that the old secret is still in the repository history because it will be entirely useless! How this is done of course depends on what service the secret was issued from.

There is also the added benefit that anyone who has already cloned your repo with the secret in it will be unable to use it. Rewriting history doesn’t help if an adversary downloaded the secret before you deleted it.

It’s important to note that this isn’t always painless if, for example, the secret is used in multiple projects. You will need to make sure the new secret is rolled out everywhere before you actually revoke the old one, or you may experience downtime.

Conclusion

With scanning tools becoming more accessible, there are fewer and fewer reasons not to use them. Secret scanning is especially important for public repositories, but it is also useful for private repositories, where a compromised developer account can access secrets and wreak havoc.

