My name is Philipp C. Heckel and I write about nerdy things.

Altering old SVN revisions: removing confidential data from a Subversion repository


Administration, Linux, Programming, Security



Version control systems like CVS or Subversion are designed to keep track of a project’s changes and to make it possible to revert to old revisions if something goes wrong. In contrast to regular relational databases, these systems are made only for adding new content to a repository, not for removing data from it. In fact, deleting old content is not built-in functionality in SVN; it usually requires removing entire revisions from the repository or even creating a new repository.

But what happens if you accidentally commit a password or other sensitive information to a repository? This post explains how to remove this confidential data permanently from the repository by simply overwriting it in old revisions, i.e. without having to remove revisions or create a new repository.




1. Introduction

1.1. Disclaimer

The following actions might lead to data loss. I am not responsible for anything that goes wrong because of my description.

1.2. Requirements

It is absolutely necessary to have root access to the SVN repository. That means not only access through the svnadmin command, but full command line access to the files, particularly to the “repos” directory.

If you do not have root access to the repository, you cannot remove any data from the repository! In that case, contact your SVN administrator.

1.3. Example Scenario

For this example, let’s assume you accidentally committed the file config.cfg with a plain text password 123secret a while ago (in revision 12). The repository is currently at revision 25, and you have just realized that the password has been in there all along.

2. Local machine: Identify the affected revisions in the working copy

2.1. Fix and commit the affected file

The following commands are performed on your local machine within the working copy of the project, i.e. on the client machine.

Before we start tinkering with and forging the SVN history and its repository, first fix the affected file and commit a new revision to the repository. In most cases, people are not going to look in old revisions of a config file, so the faster you commit a new version, the less likely it is that someone sees it!

2.2. Identify the affected file versions locally

In most cases you will probably realize right away that you just committed something confidential to the SVN repository. In this case, you only have to fix one single version of that file, and it is pretty clear which revision is affected.

In other cases, however, the affected file might be in the repository for many revisions before you realize it. If this is the case, there might be multiple revisions of the file in the repository and each of these versions needs to be fixed. To identify the possibly affected versions of the file, you can peek into the log of the file with svn log.
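The log lookup could be scripted roughly as follows. This is only a sketch: it assumes the quiet output format of svn log -q (one "rNN | author | date" line per revision), and the helper name revisions_from_log is made up for this example.

```shell
# Hypothetical helper: extract revision numbers from `svn log -q` output
# read on standard input.
# Assumes each revision line looks like "r22 | author | date".
revisions_from_log() {
    sed -n 's/^r\([0-9][0-9]*\) |.*$/\1/p'
}

# Typical use inside the working copy (not run here):
#   svn log -q config.cfg | revisions_from_log
```

Each printed number is a revision in which the file was touched, newest first, and therefore a candidate for the checks that follow.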

In this case, the file has been altered in the two revisions 12 and 22. Both might include the password and are stored in the repository, i.e. both potentially need to be corrected.

2.3. Get MD5 checksums of the affected versions

SVN ensures the integrity of its repository by saving MD5 checksums of all the files and their versions. Since it is now clear which revisions might be affected, you need to get the current checksums of these file versions and calculate checksums for the new corrected (“forged”) versions. In short, you need to do the following for each affected version:

  • Retrieve the version and calculate its MD5 checksum
  • Make a copy of the file, replace the confidential information with “x”s and calculate the MD5 checksum of the new file.
  • Remember or copy all the checksums and versions into a file.

In this example, we’ll have to get the checksums for revisions 12 and 22 of the config.cfg file. The steps below only cover revision 22; revision 12 is analogous.

First, retrieve the file as it was in revision 22 (for example with svn cat) and find its checksum using the md5sum utility.

Next, copy the ‘wrong’ config file, correct the copy using vi, and get the new checksum.

Then repeat this for revision 12.
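For one revision, these steps might look like the sketch below. It swaps the manual vi edit for a sed replacement, so it is not the exact procedure from this post. The helper name redact_copy and the .fixed suffix are made up, and the sketch assumes the password contains no characters that are special to sed.

```shell
# Hypothetical helper: write a redacted copy of FILE to FILE.fixed, replacing
# every occurrence of SECRET with the same number of "x" characters, then
# print the MD5 checksums of the original and the redacted copy.
# Assumption: SECRET contains no characters special to sed (such as . * / [ \).
redact_copy() {
    file=$1
    secret=$2
    # Build a mask of the same length as the secret, e.g. 123secret -> xxxxxxxxx.
    mask=$(printf '%s' "$secret" | sed 's/./x/g')
    sed "s/$secret/$mask/g" "$file" > "$file.fixed" || return 1
    md5sum "$file" "$file.fixed"
}

# Typical use for revision 22 inside the working copy (not run here):
#   svn cat -r 22 config.cfg > config.cfg.r22
#   redact_copy config.cfg.r22 123secret
```

The first checksum printed is the one currently stored in the repository, the second is the one to put in its place. Keeping the mask the same length as the password matters later, when the binary revision file is overwritten in place.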

3. SVN repository: Correct the affected files

Warning: This step can damage your repository, so make sure you back it up as described in 3.1. before you change anything.

In this step, we finally start altering the repository. All the actions are performed on the server machine as root user inside the actual SVN repository directory, so be sure not to confuse it with your local machine.

3.1. Make a repository backup

Creating some sort of backup is crucial, since we are about to change the binary revision files of the Subversion repository. The easiest way to do this is to back up the whole repository folder of your project, e.g. /path/to/svn/repos/yourproject. However, if its total size is too big, you can also choose to back up only the files identified in 3.2.
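A plain directory copy is enough for this purpose. The sketch below assumes nobody commits while the copy runs; the helper name backup_repo and the timestamped .bak suffix are made up for this example.

```shell
# Hypothetical helper: copy a repository directory to a timestamped sibling
# directory and print the path of the backup.
# Assumption: no commits happen while the copy is running.
backup_repo() {
    repo=${1%/}                                  # strip a trailing slash
    dest="$repo.bak-$(date +%Y%m%d-%H%M%S)"
    cp -a "$repo" "$dest" && printf '%s\n' "$dest"
}

# Typical use on the server (not run here):
#   backup_repo /path/to/svn/repos/yourproject
```

Restoring then simply means copying the affected revision files back from the printed backup directory.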

3.2. Verify affected versions

After the backup, we need to verify that we really need to change all the versions we identified earlier. To do that, navigate to the “revs” folder inside the repository and grep for the password.

The matching files are the revisions that contain the password, and hence also the files that need to be “corrected”. Note that sometimes not all the versions identified through the “svn log” command appear in this list. That is because when the file was only moved, or when only other parts of it were altered, the password itself is not stored in that SVN revision file.
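The search might be wrapped up as in the sketch below. It assumes the password is stored uncompressed in the revision files, as in this example; the helper name find_affected_revs is made up.

```shell
# Hypothetical helper: list the revision files under REVS_DIR that contain SECRET.
#   -r  descend into subdirectories
#   -l  print only the names of matching files
#   -F  treat SECRET as a fixed string, not as a regular expression
# Assumption: the secret is stored uncompressed in the revision files.
find_affected_revs() {
    revs_dir=$1
    secret=$2
    grep -rlF -- "$secret" "$revs_dir" | sort
}

# Typical use on the server, with the revs folder of your repository (not run here):
#   find_affected_revs /path/to/revs 123secret
```

Only the files printed here need to be edited in the next step.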

3.3. Replace the password and checksums

Since the SVN revision files are binary, we need a hex editor to edit them. Hence install hexedit, open each affected revision file with it, and replace the password and checksums as identified before.

Hexedit is not the easiest editor to use. So here is a step-by-step of what you need to do:

  1. Hit TAB, then CTRL-S to search
  2. Enter the password 123secret and hit return
  3. Overwrite the password with xxxxxxxxx (same length!)
  4. Hit CTRL-S, then “Y” to save
  5. Repeat 1-4 for each occurrence of the password.
  6. Do the same for the old checksum “0e28c6c8342649c290400567130f657b”, and replace it with the new one “f85abfd8b63fa7ab68abc9364f2d339e”
  7. Hit CTRL-X to quit
  8. Repeat this for all affected revisions
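After editing, a quick sanity check against the backup can catch the most common slips (changed file length, a missed occurrence). The following is only a sketch with a made-up helper name, check_rev_edit; it does not replace the real test in the next section.

```shell
# Hypothetical helper: sanity-check an edited revision file against its backup.
# Succeeds only if both files have the same size and the edited file contains
# neither SECRET nor the old checksum OLD_MD5.
check_rev_edit() {
    backup=$1
    edited=$2
    secret=$3
    old_md5=$4
    if [ "$(wc -c < "$backup")" -ne "$(wc -c < "$edited")" ]; then
        echo "file size changed" >&2
        return 1
    fi
    if grep -qF -- "$secret" "$edited"; then
        echo "password still present" >&2
        return 1
    fi
    if grep -qF -- "$old_md5" "$edited"; then
        echo "old checksum still present" >&2
        return 1
    fi
    return 0
}

# Typical use on the server, with your own backup and revision file paths (not run here):
#   check_rev_edit /path/to/backup/22 /path/to/revs/22 123secret 0e28c6c8342649c290400567130f657b
```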

That’s the complete magic. If checked out, the revisions 12 and 22 (and of course also their succeeding versions) will show xxxxxxxxx instead of the initially committed password.

4. Test locally

Now test locally if you can switch between revisions and everything works without error messages.
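One way to do this is to update the working copy to each affected revision in turn and then back to the latest one. The helper below is a sketch; the name check_revisions and the SVN override variable are made up for this example.

```shell
# Hypothetical helper: update the working copy to each given revision in turn
# and stop at the first failure. Set SVN to use a different client binary.
check_revisions() {
    for rev in "$@"; do
        "${SVN:-svn}" update -q -r "$rev" || {
            echo "update to revision $rev failed" >&2
            return 1
        }
    done
}

# Typical use inside the working copy (not run here):
#   check_revisions 12 22 HEAD
```

On the server, running svnadmin verify on the repository is another way to check that the edited revision files are still consistent.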

If you did everything as the tutorial says, you shouldn’t get any errors. If you forgot to replace checksums or changed something that you weren’t supposed to change in the SVN revision file, SVN will likely report an error, for example a checksum mismatch. If that happens, you can always go back to your backup and try again.

5. Bash history cleanup

In step 3.2. we typed the plain text password in bash. As you might know, this leaves traces in the ~/.bash_history file. Delete them by opening the file and removing the corresponding lines. Make sure that you do not use the search function of VIM, since that has a history of its own. If you do, delete the history of VIM in ~/.viminfo.
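If you prefer not to open the file in an editor at all, the cleanup could be scripted as in the sketch below. The helper name scrub_history is made up; it reads the password from standard input so that it does not end up on the command line again.

```shell
# Hypothetical helper: remove every line containing a secret from HISTFILE.
# The secret is read from standard input instead of the command line.
scrub_history() {
    histfile=$1
    IFS= read -r secret || true
    [ -n "$secret" ] || return 1
    tmp=$(mktemp) || return 1
    # grep exits with 1 when every line was removed; only >1 is a real error.
    grep -vF -- "$secret" "$histfile" > "$tmp" || [ $? -eq 1 ] || {
        rm -f "$tmp"
        return 1
    }
    cat "$tmp" > "$histfile"   # overwrite in place to keep owner and permissions
    rm -f "$tmp"
}

# Typical use (not run here); type the password, then press Enter:
#   scrub_history ~/.bash_history
```

Keep in mind that a running bash session holds its history in memory and writes it out on exit, so the offending command also has to be removed from the current session (for example with history -d), or it may reappear in the file.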

3 Comments

  1. Kandis Kawelo

    Hey there, this is Gianluca from the Wuala Team. Thanks for your interest in Wuala. Rest assured that there is no backdoor (unless the NSA managed to put one into AES or other cryptographic building blocks, but in that case, your tip of separate encryption does not help much either). For 99.9% of the users, the things to worry about are malware with keyloggers and weak passwords.


  2. miwa

    Hint: If you don’t want a command to be stored in the .bash_history file, simply insert a leading space.


  3. Rajesh

    -rw-r--r-- 1 www-data www-data 11741494623 Mar 26 10:06 /home/repos/db/revs/121129

    It is 11 GB. None of the files or directories in this revision file are needed. How do I get rid of it?

    Can I truncate it like this? Will this affect other revisions created after this one?
    >/home/repos/db/revs/121129

    My current revision is 121406.

    Please let me know how to get rid of this revision, as it is affecting my backup and maintenance.

    thanks in advance.

