monotone

Issue 166: netsync performance is very poor when new branches are synced with restricted sync patterns

Reported by Ethan Blanton, May 15, 2011

When syncing a new branch to/from a server for the first time, 
netsync performance is extremely poor if both the client and server 
share history for an ancestor of the new branch, but that ancestor 
is not captured by the sync pattern.

This is only a serious problem when revision history is very large.

You can reproduce this by cloning the Pidgin revision history and 
setting up a server to serve im.pidgin*.  Create a new branch (say, 
im.pidgin.slowsync) by committing a new revision against the head of 
im.pidgin.pidgin.  Sync (or push) *only* im.pidgin.slowsync to the 
server.

Specific reproduction steps:
1. Fetch the Pidgin mtn bootstrap database from pidgin.im:
   wget http://developer.pidgin.im/static/pidgin.mtn.bz2
   bunzip2 pidgin.mtn.bz2
2. Migrate the database if necessary
3. Copy the database:
   cp pidgin.mtn pidgin2.mtn
4. Check out a working copy from pidgin2.mtn:
   mtn -d pidgin2.mtn co -b im.pidgin.pidgin pidgin-test
5. Create a new revision:
   cd pidgin-test
   echo 'make mtn do something stupid' >> README
   mtn ci -b im.pidgin.slowsync -m 'README update'
6. Start a sync:
   mtn sync file://../pidgin.mtn/?im.pidgin.slowsync
7. Go get a cup of coffee
8. Go ahead and have lunch
9. Take a nap
10. Kill the sync because the bug has been demonstrated and there 
are still 25,000 revisions to go

Despite the fact that the server and the client share the entire 
history of the branch in question except for the most recent commit, 
monotone will exchange 30k+ revisions, because the shared ancestry 
is on a branch (im.pidgin.pidgin) which is not included in the 
current sync pattern.

This problem can be solved by including a shared ancestor branch in 
the sync pattern.  However, this approach makes sync patterns much 
more complicated when it is not desirable to sync some local 
branches.
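
For the reproduction steps below, that workaround might look like the 
following sketch.  It assumes monotone's branch globs accept {a,b} 
alternation; the variable names are purely illustrative, and it has 
not been timed against the Pidgin database.

```shell
# Workaround sketch: name the shared ancestor branch alongside the new one.
# Assumption: monotone branch patterns support {a,b} alternation.
pattern='im.pidgin.{pidgin,slowsync}'
uri="file://../pidgin.mtn/?${pattern}"

# Only run the sync from inside a monotone workspace (which has an _MTN
# directory) when mtn is installed; otherwise just print the command.
if command -v mtn >/dev/null 2>&1 && [ -d _MTN ]; then
    mtn sync "$uri"
else
    echo "mtn sync '$uri'"
fi
```

Note that this also syncs any other changes on im.pidgin.pidgin, which 
is exactly the side effect that makes the workaround awkward.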

I am aware of the (arguably good) reasons for this performance issue 
(it has been around ~forever and discussed on multiple occasions), 
but I believe it should be addressed nonetheless, even if only as a 
heuristic hack.

mtn version --full

monotone 0.99.1 (base revision: 8973482283db7c36780dce2b54721ccc0f5b7388)
Running on          : Darwin 10.7.0 Darwin Kernel Version 10.7.0: Sat Jan 29 15:17:16 PST 2011; root:xnu-1504.9.37~1/RELEASE_I386 i386
C++ compiler        : GNU C++ version 4.2.1 (Apple Inc. build 5666) (dot 3)
C++ standard library: GNU libstdc++ version 20070719
Boost version       : 1_45
SQLite version      : 3.7.5 (compiled against 3.7.5)
Lua version         : Lua 5.1
PCRE version        : 8.12 2011-01-15 (compiled against 8.12)
Botan version       : 1.8.10 (compiled against 1.8.10)
Changes since base revision:
format_version "1"

new_manifest [c1270158b7fa91abf8235ad129b0476943bde1ed]

old_revision [8973482283db7c36780dce2b54721ccc0f5b7388]

  Generated from data cached in the distribution;
  further changes may have been made.

Created: 12 years 11 months ago by Ethan Blanton

Status: New

Labels:
Type:Defect
Priority:Medium
