Making a False Revlog in Mercurial

One of the projects I'm working on uses Mercurial internally for, well, for version control. I want to make a realistic demonstration of the project's version control features, which requires a series of changesets that span several months (if not years) and that were made by several different contributors. Because Mercurial is written in Python, it's quite easy to "hack" the internals so they do what you want. I was pleasantly surprised when I realized that making a false changelog is almost trivial. This article walks through my whole process of creating the repository and making it accessible online.

First, I determined the goals for this repository. I wanted to highlight certain characteristics of our version control features, specifically that we expect a high level of collaboration, and that we expect it from an international contributor base. The repository content is also designed to highlight certain other features, but that's not relevant to this article.

Next, I designed the content I wanted to end up with (that is, what exists after the most recent changeset). Our project involves musical scores, so the content was a simple, nondescript piece of music with three sections.

Then I started to design the changesets required to reach our target content. I thought to use the musical score's three sections to show a collaborative situation where three people who live in different countries divide a larger project into sections. But wouldn't it be great if there were also "drive-by" contributions from various other people? And if the software were clearly being used in a professional manner, with someone else auditing the work? Thus I arrived at the following contributors to the repository:

  • 宋靓乐
    • primary contributor for the "A" section
    • only contributes in Mandarin
  • Gérard Bourassa
    • primary contributor for the "B" section
    • only contributes in French
  • Danceathon Smith
    • primary contributor for the "C" section
    • only contributes in English
    • (this is my usual "Foo Bar" style name)
  • Fortitude Johnson
    • audits work by 靓乐, Gérard, and Danceathon
    • contributes in Mandarin, French, and English
    • (this is my usual "Qux Quux" style name)
  • Shereen Lafleur
    • drive-by contributor in English
  • 黄明星
    • drive-by contributor in Cantonese and English

Obviously, these fake contributors represent the parts of the world to which I have linguistic (and therefore cultural) access, however incomplete. It would have been great to invent people from a wider range of places, but then I would run the risk of making a (bigger) fool of myself!

With the contributors in place, I needed to plan out a series of changesets that reflect their descriptions. This seemed like it would be a huge pain, so I devised a set of rules:

  • each of the sections (A, B, and C) would be made of five subsections
  • each subsection would be committed by the section's primary contributor
  • the auditor (Fortitude Johnson) would make one commit to each section, which changes existing work
  • the auditor would also be responsible for the metadata files (which link the three sections as part of "the same" score)
  • 黄明星 would make a small modification to section A because his name starts with that letter
  • Shereen Lafleur would make a small modification to Bourassa's work because why not?
  • all changesets would be made on the same branch

I derived the following revlogs:

  • Section A:
    • A1: 宋靓乐
    • A2: 宋靓乐
    • A3: 宋靓乐
    • Fortitude Johnson revises something from A3
    • A4: 宋靓乐
    • 黄明星 revises something from A1
    • A5: 宋靓乐
  • Section B:
    • B1: Gérard Bourassa
    • B2: Gérard Bourassa
    • B3: Gérard Bourassa
    • Shereen Lafleur revises something from B2
    • Fortitude Johnson revises something different from B2
    • B4: Gérard Bourassa
    • B5: Gérard Bourassa
  • Section C:
    • C1: Danceathon Smith
    • Fortitude Johnson revises something from C1
    • C2: Danceathon Smith
    • C3: Danceathon Smith
    • C4: Danceathon Smith
    • C5: Danceathon Smith
  • Metafiles:
    • after A1, add A to metafile (Fortitude Johnson)
    • after B1, add B to metafile (Fortitude Johnson)
    • after C1, add C to metafile (Fortitude Johnson)

This gives me three series of changesets and their order, but they still need to be attached to specific dates and interleaved. So I made up some more rules:

  • work starts on Friday 13 February 2015 (the day our project's leaders started organizing)
  • sections begin work in approximately weekly offsets, in alphabetical order
  • Fortitude adds a section to the metafile four days after the section's creation
  • new subsections are added approximately every three weeks
  • revision changesets should either "fill in" the days between nearby changesets, or if changesets are already close together they should happen on the same day as another changeset
  • the changeset time should always be 15:00 in the UTC-05:00 time zone (written as -0500), to make my life easier
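The first few rules are mechanical enough to sketch in Python. This is only an illustration, not the way the schedule was actually produced: the function name `section_starts` and the constants are made up for the example, and "approximately weekly" is assumed to mean six days, which is the gap the schedule in this article ends up using.

```python
from datetime import date, timedelta

START = date(2015, 2, 13)            # Friday 13 February 2015, from the rules above
SECTION_OFFSET = timedelta(days=6)   # assumption: "approximately weekly" read as six days
METAFILE_DELAY = timedelta(days=4)   # metafile entry four days after a section starts


def section_starts(sections="ABC"):
    """Return (label, date) pairs for each section's first changeset
    and its metafile entry, sorted by date."""
    events = []
    for index, name in enumerate(sections):
        first = START + index * SECTION_OFFSET
        events.append((name + "1", first))
        events.append((name + "1 in metafile", first + METAFILE_DELAY))
    return sorted(events, key=lambda event: event[1])


for label, day in section_starts():
    # %a and %b give the abbreviated weekday and month names
    print("%s: %s" % (day.strftime("%a %b %d"), label))
```

The later subsections and the revision changesets were placed by hand, so the sketch deliberately stops at the section starts.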

Applying these rules, we get the following revlog:

Fri Feb 13: A1
Tue Feb 17: A1 in metafile
Thu Feb 19: B1
Mon Feb 23: B1 in metafile
Wed Feb 25: C1
Sun Mar 1: C1 in metafile
Tue Mar 3: Fortitude revises C1
Thu Mar 5: A2
Fri Mar 13: B2
Thu Mar 19: C2
Wed Mar 25: A3
Thu Apr 2: B3
Sun Apr 5: Shereen revises B2
Tue Apr 7: Fortitude revises A3
Thu Apr 9: Fortitude revises B2
Fri Apr 10: C3
Thu Apr 16: A4
Sun Apr 19: 黄明星 revises A1
Wed Apr 22: B4
Thu Apr 30: C4
Fri May 8: A5
Thu May 14: B5
Wed May 20: C5
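Each of these days eventually has to become a full date string with the fixed time and offset, in the form 'Thu Apr 30 15:00:00 2015 -0500' used by the commit call later in this article. Here is a minimal sketch of that conversion using only the standard library; the helper name `hg_date` is hypothetical, and I'm assuming that zero-padded days (such as "May 08") are accepted just as well as unpadded ones.

```python
from datetime import datetime


def hg_date(month, day, year=2015, hour=15, offset="-0500"):
    """Build a date string such as 'Thu Apr 30 15:00:00 2015 -0500'.

    The weekday is computed from the calendar date, so a mistyped
    day of the week in the plan cannot sneak into the repository.
    """
    moment = datetime(year, month, day, hour)
    return moment.strftime("%a %b %d %H:%M:%S %Y") + " " + offset


# The last four entries of the schedule, as (month, day, label)
schedule_tail = [(4, 30, "C4"), (5, 8, "A5"), (5, 14, "B5"), (5, 20, "C5")]
for month, day, label in schedule_tail:
    print("%s  %s" % (hg_date(month, day), label))
```

Comparing the computed weekday against the one in the plan is also a cheap way to catch typos in the schedule.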

The changeset messages don't have to be particularly imaginative, so from here it was relatively easy to fill out the details, and determine exactly what goes in which commit. There's only one thing left: to actually do it! I opened two terminal windows in an empty directory, then ran hg init in the directory. In one terminal, I started up Python 2 (I'm running Mercurial 3.1.2 on Debian 8, so it's still Python 2) and ran the following code to set up everything we need to make new changesets with arbitrary dates, users, and messages.

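# Python 2 session, started in the repository directory (after hg init)
from mercurial import ui, hg, commands as cmds
myui = ui.ui()                   # Mercurial's user-interface and configuration object
repo = hg.repository(myui, '.')  # open the repository in the current directory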

Then it was a simple matter of copying and pasting the content that belongs in each changeset, then running something like this to commit it:

cmds.commit(myui, repo, message='Add Section C4', user='Danceathon Smith <danceathon@example.com>', date='Thu Apr 30 15:00:00 2015 -0500')
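I did the copying and pasting by hand, but the same step could be scripted. The following is only a sketch under several assumptions: `apply_changeset`, `dry_run`, the dictionary keys, the file name, and the file content are all hypothetical and are not taken from the demo repository. The commit itself is passed in as a function so the sketch can be tried without touching a real repository.

```python
import os


def apply_changeset(entry, commit_fn, root="."):
    """Write one changeset's file content, then pass its metadata to commit_fn."""
    path = os.path.join(root, entry["path"])
    with open(path, "w") as handle:
        handle.write(entry["content"])
    commit_fn(message=entry["message"], user=entry["user"], date=entry["date"])


# Hypothetical entry: the path and content are placeholders, not the demo's real files.
example = {
    "path": "section_c.txt",
    "content": "placeholder for subsection C4\n",
    "message": "Add Section C4",
    "user": "Danceathon Smith <danceathon@example.com>",
    "date": "Thu Apr 30 15:00:00 2015 -0500",
}


def dry_run(**opts):
    """Stand-in for the real commit call; only prints what would be committed."""
    print("%(date)s  %(user)s  %(message)s" % opts)


# In the Python session from this article, the real call would look like:
#     apply_changeset(example, lambda **opts: cmds.commit(myui, repo, **opts))
# New files still have to be tracked first, for example with `hg add`.
```

Calling `apply_changeset(example, dry_run)` writes the placeholder file and prints the metadata instead of committing, which is a handy way to check the plan before making it permanent.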

And there you go! You can take a look at the result here: https://bitbucket.org/crantila/hgdemo.