Thursday, May 29, 2008

This Week in Bazaar

This is the fourth in a series of posts about current topics in the development world of the Bazaar distributed version control system. The series is co-authored by John Arbash Meinel, one of the primary developers on Bazaar, and Elliot Murphy, imaginary boy and part-time impostor.


Stacked branches

Some projects are very big with lots of files and lots of history. Many projects want to maintain the policy that development is done on independent branches, which are then merged back when the development is complete. However, the overhead of downloading, branching, and uploading the full history is prohibitive. There are a couple of different ways to solve this problem.

Dealing with a large branch can be split into two problems: downloading and uploading.

Bazaar has had a storage optimization called shared repositories for quite a while. It dramatically reduces the amount of data downloaded for the second, third, and later branches of a project. A shared repository is a big pool of revisions which multiple branches point to. When you grab a new branch into a shared repository, bzr figures out how much of the history it already has, and only downloads the new revisions. So the first branch of a large project transfers most of the data, and grabbing additional branches is very cheap. In extreme cases, like working on a multi-gigabyte project from a 56k dial-up connection, you could even distribute the initial data on a DVD to prime the shared repository, so that the user only needs to download incremental changes.
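
To make the saving concrete, here is a toy Python model of the idea. It is only a sketch, not bzr's actual storage code: the SharedRepository class, its fetch_branch method, and the revision ids are all made up for illustration. The point is just that the second branch transfers only the revisions the pool does not already hold.

```python
class SharedRepository:
    """Toy model (hypothetical, not bzrlib): one revision pool, many branches."""

    def __init__(self):
        self.revisions = set()   # every revision id stored in the pool
        self.branches = {}       # branch name -> tip revision id

    def fetch_branch(self, name, remote_history):
        """Add a branch and return the revisions that had to be transferred.

        remote_history is the branch's list of revision ids, oldest first.
        """
        missing = [rev for rev in remote_history if rev not in self.revisions]
        self.revisions.update(missing)
        self.branches[name] = remote_history[-1]
        return missing


repo = SharedRepository()
trunk = ["rev-%d" % i for i in range(1, 1001)]    # made-up history
feature = trunk + ["feature-1", "feature-2"]      # two extra revisions

first = repo.fetch_branch("trunk", trunk)
second = repo.fetch_branch("feature", feature)
print(len(first), len(second))  # prints: 1000 2
```

The first fetch pays for the whole history; the second one only pays for the two revisions that are new.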

This technique can also be used for solving the uploading problem. If the upload location uses a shared repository, then uploading a new branch only needs to copy the new data. The problem comes once multiple users are involved: they may not want to give other people access to push data into their repository.

Another approach to minimizing the data uploaded is called server side forking, and you can see a nice implementation of this on github.com. The user places a request with the code host to do the copy for them, and when it finishes, they have their own location already primed with the current branch.

The Bazaar project is approaching it in a different way. If some data is already public, then you can just reference the other public location when you start uploading your new branch. The first steps in this direction are being termed "Stacked Branches". Basically, instead of requiring all branches to contain the full history, you are allowed to "Stack" a branch on top of another. Because the uploader does not have write access to the lower levels of the stack, this addresses the security risks of shared repositories.

Stacking also opens up possibilities for the "download" side of the equation. Many users don't need a very deep copy of history to get their work done. If there is a location that can be trusted to be available when they need it, they can copy just the tip revisions, which would allow them to do most of their work (commit, diff, etc.) without consulting the remote host. When they need more information (such as during a complex merge), the bzr client is able to fall back to the original source to get it.
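
The fallback behaviour can be sketched in a few lines of Python. This is a toy model under our own assumptions, not how bzr implements stacking: the Repository class, its get_revision method, and the revision ids are hypothetical. A stacked repository stores only its own revisions and asks the repository underneath for anything else.

```python
class Repository:
    """Toy model (hypothetical): a repository that may be stacked on another."""

    def __init__(self, revisions, fallback=None):
        self.revisions = dict(revisions)  # revision id -> revision text
        self.fallback = fallback          # the repository this one is stacked on
        self.fallback_lookups = 0         # how often we had to look further down

    def get_revision(self, revision_id):
        if revision_id in self.revisions:
            return self.revisions[revision_id]
        if self.fallback is None:
            raise KeyError(revision_id)
        self.fallback_lookups += 1
        return self.fallback.get_revision(revision_id)


public = Repository({"rev-1": "initial import", "rev-2": "add parser"})
mine = Repository({"rev-3": "my fix"}, fallback=public)

print(mine.get_revision("rev-3"))  # answered locally
print(mine.get_revision("rev-1"))  # answered by the repository underneath
```

Note that nothing is ever written to the lower repository, which is why the uploader does not need write access to it.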

The goal of all this is to make it very easy to start working with a large project, while still making all the history available in a meaningful way. The bulk of this work has been completed, and it is likely that it will land in bzr 1.6 (to be released in a couple of weeks).

Thursday, May 22, 2008

This Week in Bazaar

This is the third in an amazingly regular weekly series of posts about current topics in the development world of the Bazaar distributed version control system. The series is co-authored by John Arbash Meinel, one of the primary developers on Bazaar, and Elliot Murphy, Launchpad developer and relentless agitator. This week we have a special guest, Jelmer Vernooij, Samba developer, and author of the Bazaar Subversion plugin.

In last week's episode, our fearless explorers braved the new world of plugins. Today we will focus on a specific plugin, and talk about how you can use Bazaar with Subversion. Earlier this week there was a very nice blog post about using Git with the Subversion servers on Google Code Hosting, and plenty of interesting discussion afterwards.


Rationale

If you have Bazaar installed, why would you want to work with Subversion? Well, it's nice not to have to force the whole world to change at once. Bazaar-Subversion integration allows you to use Bazaar without any changes required from the project administrators to the central Subversion server.

There are three general cases where you would want to use bzr-svn:
  1. Upstream uses Subversion, and you don't yet have commit access. With bzr-svn, you can still make your improvements with all the benefits of a great VCS.
  2. The project has chosen to use Subversion, you want something better, but still want to play nice with your fellow developers. You can commit to your local Bazaar branch, and push those changes back into Subversion. You can even do "bzr commit" in your Subversion checkout and have it commit those changes to the Subversion server.
  3. Migration from Subversion to Bazaar. Often when migrating from one VCS to another, there is a period of time where people are adjusting to the new system. bzr-svn lets people continue to commit to Subversion; it's just another branch with changes to be merged.

Overview

Currently the bzr-svn dependencies can be a bit tricky to install on some platforms, but that should be much easier once Subversion 1.5 is released. On most Debian-based systems, it is a simple "apt-get install bzr-svn" away. Once you get things installed, it's pretty amazing what you can do.

Once you have bzr-svn installed, you can start using Subversion branches as though they were regular Bazaar branches.


General usage

Now that you have bzr-svn installed, how do you get a local copy of your Subversion project? Generally, it is just a "bzr checkout URL" away.

$ bzr checkout svn+https://your-project.googlecode.com/svn/trunk

This will create a local checkout of your project that contains a local copy of the history present remotely.

You should now be able to use this branch like any regular Bazaar branch. Since this is a bound branch, any commits you make will also show up in the Subversion repository.

It is possible to create new local branches from this branch, for example for feature branches:

$ bzr branch trunk feature-1

And to merge the branch back into Subversion once it is finished, you can use merge from inside the checkout, like you would with any ordinary Bazaar branch:

$ cd trunk
$ bzr merge ../feature-1
$ bzr ci -m "Merge feature-1"

In addition to the code changes, bzr-svn will write metadata about the history of the new commit into Subversion. This means that your merge history is available, so when someone else comes along and grabs a copy of the branch using Bazaar, they can see what happened. To a normal Subversion client this is transparent; the custom properties are simply ignored.

It is also possible to push directly from the feature branch into Subversion:

$ bzr push http://subversion/project

This will preserve all of the history from the branch you are pushing - there is no need to rebase your local branch after pushing.

Since bzr-svn allows access to Subversion protocols and file formats using the standard Bazaar API, it is possible to use most standard Bazaar commands directly on Subversion formats and URLs. Commands like "bzr missing", "bzr log", or even "bzr viz" work out of the box.

Miscellaneous

Some bits and pieces to pique your interest in bzr-svn.
  • Subversion 1.5 introduces custom revision properties - this should allow bzr-svn to hide the properties used to store merge information. (At the moment, the file properties used show up in commit emails.)
  • Bazaar will soon be introducing shallow (stacked) branches. This will allow you to have a fully functioning local branch (including offline commits, etc), without needing to download the complete history to your local machine.
  • Bzr for GNOME developers is a quick guide for people who want to use Bazaar for developing with the GNOME Subversion repository.
  • Bazaar branches of Python are available. They are currently using bzr-svn to mirror the Subversion branches, allowing their developers to see what life is like developing with Bazaar.
For more information, check out the bzr-svn home page, FAQ, bug tracker, or join us on the Bazaar mailing list.


Next week: how to print money with Bazaar.

Thursday, May 15, 2008

This Week in Bazaar

This is the second in a mostly-every-week series of posts about what's been happening in the development world of the Bazaar distributed version control system. The series is co-authored by John Arbash Meinel, one of the primary developers on Bazaar, and Elliot Murphy, Launchpad developer and compulsive conflict avoider.

Plugins

One of the nice things about Bazaar is the API, which enables new features to be added with plugins. Once a feature is polished and proves widely useful, it can move from a plugin into core Bazaar. Most of the plugins are hosted or mirrored on Launchpad, and are a simple "bzr branch lp:bzr-plugin ~/.bazaar/plugins/plugin" away. The rest are indexed at http://bazaar-vcs.org/BzrPlugins. Here's a quick summary of some of the plugins we are using on our laptops right now:

bookmarks: This allows me to store an alias for a branch location, so it is easier to branch/push to a common location. So I can type 'bzr get bm:work/foo' instead of 'bzr get bzr+ssh://server.example.com/dev/stuff/foo'

bzrtools: A collection of commands which provide extended functionality, such as 'bzr cdiff' to display colored diffs and 'bzr shelve' to temporarily revert sections of changes.

difftools and extmerge: These plugins let me view differences in meld or kdiff3 (or anything that you want to configure, really), and do merges via meld.

email: Keep people informed of what you are working on by sending an email after every commit.

fastimport: This plugin allows me to import code from my friend's Mercurial repository and push it to Launchpad.

git: This gives me read access to a local git repository.

gtk: This is the Bazaar Gtk GUI, which has some nice tools like visualize and gcommit.

htmllog: Useful for generating html formatted logs for publishing on the web.

loom: Allows me to manage several "layers" of development in a single branch, and collaborate on those layers with other people.

notification: Gives a GUI popup when a pull or push completes.

pqm: This formats a merge request to PQM. PQM then takes my branch, merges to main, runs tests, and commits the merge if all was well. This ensures that we always have passing tests in the main tree!

push_and_update: This updates the working tree when I push my branch to a remote server. Very useful for doing website updates.

removable: I try to keep all branches very small for easier review, so I have a lot of branches at one time. This tells me which branches have already been merged to the main tree (and thus can be removed). It can also let me know why something is not ready to be removed.

stats: Provides 'bzr stats' which gives a simple view of how many people have committed to your project and how many commits each has done.

update_mirrors: 'bzr update-mirrors' recursively scans for Bazaar branches and updates them to their latest upstream.

vimdiff: Adds the commands 'bzr vimdiff' and 'bzr gvimdiff', which open vim in side-by-side mode, showing you your changes.

qbzr: Another great GUI for bzr, this one is written using Qt.


1.5rc1, 1.5 this Friday

Continuing our pattern of having time-based releases, bzr 1.5rc1 was released last Friday, and 1.5 final should be released tomorrow. Ever wonder how we churn out releases so regularly? The biggest factor enabling us to make consistent releases is our use of a Patch Queue Manager. It ensures that all of our 11,724 unit tests pass before allowing any merge into mainline. Even when lots of changes are landing, the trunk can be considered release quality. Most of the developers use the tip of mainline for their day-to-day work, which means that any changes get immediate use, rather than waiting for a release candidate.

By releasing every month, we have reduced the tendency to rush patches, trying to sneak them in before the next release. We know that there will be another release just around the corner, so we can land complex patches right after a release. For each release cycle, we have 3 weeks of "open" development, where any approved (peer reviewed) patch can be merged. Then we have a feature freeze week, where only bug fixes are supposed to be merged. At the end of the freeze week, we release RC1 and reopen mainline for development. If no regressions are found in RC1, it is tagged as final and released after one week.
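
As a worked example of that cycle, the dates can be computed with a few lines of Python. This is just arithmetic on the description above; the release_schedule function and its dictionary keys are our own names, and the start date of 2008-04-11 is an assumption, back-computed so that RC1 falls on the Friday before this post.

```python
from datetime import date, timedelta


def release_schedule(open_start):
    """Dates for one cycle: 3 open weeks, 1 freeze week, RC1, then final."""
    feature_freeze = open_start + timedelta(weeks=3)   # open development ends
    rc1 = feature_freeze + timedelta(weeks=1)          # end of the freeze week
    final = rc1 + timedelta(weeks=1)                   # if RC1 has no regressions
    return {"feature_freeze": feature_freeze, "rc1": rc1, "final": final}


# Assumed start of the 1.5 cycle (not stated in the post).
schedule = release_schedule(date(2008, 4, 11))
for name, day in schedule.items():
    print(name, day.isoformat())
```

Mainline reopens on the RC1 date, so the next cycle's three open weeks overlap with the one-week wait for final.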

The bzr-1.5 release is mostly focused on small UI bug fixes, a couple of performance improvements, and some documentation updates.

(edit: 2008-05-16, the merged plugin changed and is now called bzr-removable)

Wednesday, May 14, 2008

Creating a new Launchpad Project (redux)

A while back I posted about how to set up a new Launchpad project. At the time it took quite a few steps to set up everything you wanted. I'm happy to report that a lot of those steps have been streamlined, so I am posting new step-by-step instructions for setting up your project in Launchpad.

  1. Make sure the project isn't already registered. A lot of upstream projects have already been registered in Launchpad, as it is used to track issues in Ubuntu. So it is always good to start on the main page and use the search box "Is your project registered yet?".
  2. If you don't find your project, there will be a link to Register a new project.
  3. The form for filling out your project details has been updated a bit, but you should know the answers. (I still use 'bazaar' as the "part of" super-project, and bzr-plugin-name for my plugins)
  4. This is where things start to get easier. After you have registered the project you can follow the Change Details link. This is generally https://launchpad.net/PROJECT/+edit. It was the same before, but now more information is on a single page, so you can set up more at once. Here I always set the bug tracker to be Launchpad, and I click the boxes to opt in to extra Launchpad features.
  5. Optionally you can assign the project to a shared group. Follow the "Change Maintainer" link (https://launchpad.net/PROJECT/+reassign). I generally assign them to the bzr group, because I don't want to be the only one who can update information.
  6. At this point you should be able to push up a branch to be used as the mainline using:
    bzr push lp:///~GROUP/PROJECT/BRANCH
    In my example, this is lp:///~bzr/PROJECT/trunk. (You may need to run 'bzr launchpad-login' so that bzr knows who to connect as, rather than using anonymous http:// urls.)
  7. You now want to associate your mainline branch with the project, so that people can use the nice lp:///PROJECT urls. You can follow the link on your project page for the "trunk" release series (usually this is https://launchpad.net/PROJECT/trunk). On that page is an "Edit Source" link, or https://launchpad.net/PROJECT/trunk/+source.
    Set the official release series branch to your new ~GROUP/PROJECT/BRANCH.
See, now it is only 7 steps instead of 11. (Though really only one or two steps have actually changed.)

Thursday, May 8, 2008

This Week In Bazaar First Edition

This is the first in a mostly-every-week series of posts about what's been happening in the development world of the Bazaar distributed version control system. The series is co-authored by John Arbash Meinel, one of the primary developers on Bazaar, and Elliot Murphy, Launchpad developer and wanted criminal.

We get to talk about anything we want. This week:
  • What's been happening for a better GUI on Windows
  • What's new in the 1.4 release
  • Importing from other VCS's with bzr fast-import
... details ...

GUI on Windows

We found this guy named Mark Hammond who claims to know how to make Python stuff work well on Windows. There is now a GUI tool for Bazaar on Windows called TortoiseBZR, modeled after TortoiseSVN. If you haven't used a Tortoise before, they are extensions that integrate into Windows Explorer, allowing you to see and control the versioning of your files without needing to change to a separate tool.

Mark has taken a look and proposed a series of enhancements to make the tool work even better. Bazaar already works very well from the Windows command prompt, but we want to provide excellent GUI tools as well. Take a look at the TortoiseBZR web page for screenshots of it in action.

What's new in the 1.4 release

The Bazaar team releases a new version of Bazaar just about every month, with both bugfixes and new features. The bzr-1.4 release came out last Thursday, May 1st.

The major changes for 1.4 include improvements in performance of 'log' and 'status', and a new Branch hook called post-change-branch-tip, which will trigger any time a Branch is modified (push, commit, etc). This should enable server generated emails whenever somebody publishes their changes. Write something cool with it and tell us what you did!

The full list of changes for 1.4 can be found at: https://launchpad.net/bzr/1.4/1.4
The list of all changes is at http://doc.bazaar-vcs.org/bzr.dev/en/release-notes/NEWS.html

bzr fast-import

Bazaar fast-import is a plugin for Bazaar that allows you to import from many different version control systems. The fast-import stuff is intended to support any system that can use the fast-export format. This format originated with the git developers, and was quickly adopted elsewhere. So if a source format can generate a "fast-import" stream, you should be able to import it into Bazaar.

  • CVS
    To convert from CVS, you currently use the cvs2svn converter, which has a flag to generate a "fast-import" stream.
  • Mercurial
    There is a script called hg-fast-export.py bundled with the plugin (in the exporters/ directory).
  • SVN
    The svn-fast-export script is also bundled with the bzr-fastimport plugin.
  • git
    Bundled with the standard git distribution is the git-fast-export command.
  • Your own exotic system here.
Give fast-import a try. It's mostly designed for 1-time conversions, rather than mirroring, but there are already some rudimentary mirroring capabilities.
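
If you are curious what a "fast-import" stream looks like, or want to write an exporter for your own exotic system, here is a minimal Python sketch that builds a one-commit stream by hand. The helper names (data, single_commit_stream), the file name, the committer, and the timestamp are all made up for illustration; the blob, mark, data, commit, committer, and M commands are the ones from the stream format that git defined. Real exporters handle much more (parents, deletes, renames, tags).

```python
def data(payload):
    """Format a 'data' command: the byte count, then the raw bytes."""
    return ("data %d\n" % len(payload)).encode("ascii") + payload + b"\n"


def single_commit_stream(path, content, message, committer, when):
    """Build a stream holding one file blob and one commit that adds it."""
    parts = [
        b"blob\n",
        b"mark :1\n",                      # name the blob so the commit can refer to it
        data(content),
        b"commit refs/heads/master\n",
        b"mark :2\n",
        ("committer %s %d +0000\n" % (committer, when)).encode("utf-8"),
        data(message),                     # the commit message
        ("M 100644 :1 %s\n" % path).encode("utf-8"),  # add the file from mark :1
    ]
    return b"".join(parts)


# Hypothetical example values.
stream = single_commit_stream(
    "hello.txt",
    b"Hello, world\n",
    b"Initial import",
    "Jane Doe <jane@example.com>",
    1210204800,
)
print(stream.decode("utf-8"))
```

This is the kind of stream the exporters listed above generate for the plugin to consume.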


That's all for the first installment of "This Week in Bazaar".

(edited for formatting)