monotone

Issue 120: list changed much slower than status

Reported by joe 23, Dec 26, 2010

It seems like 'list changed' is the same as 'status' but with a 
different output format. However, 'list changed' takes much longer, 
when a small portion of the total workspace is specified. With a 
decent-sized project, the delay of 'list changed' is mildly 
annoying.

My guess is that 'status' only checks the portion of the workspace 
specified, but 'list changed' checks the whole workspace and then 
filters the result?

Of course if there's some good reason why it must be slower, please 
close the ticket.

Thanks!

Steps to reproduce the problem:
-------------------------------

1. time mtn list changed path/to/some/dir
2. time mtn status path/to/some/dir
3. repeat 1) and 2) now that the OS has cached things

Expected result:
----------------
Both take about the same CPU and elapsed time

Actual results:
---------------
list changed takes about 4x as long as status, and is noticeably 
slower to complete.


Output of `mtn version --full`:
-------------------------------
0.99.1

Comment 1 by Richard Hopkins, Jan 25, 2011

Labels: Milestone:1.0
Status: Accepted

Comment 2 by Richard Hopkins, Jan 25, 2011

This has passed preliminary testing, and is now ready for extensive 
testing.

See fc13d22f21a87c3a0f37d87a7fb8bbf667b65639 for more details.
Status: Fixed

Comment 3 by Thomas Keller, Feb 5, 2011

I had a look at the changes in this branch and you're heading the 
right direction. Please go the extra mile and externalize the code 
that iterates the edge_map, but instead of using booleans, use some 
local, namespaced enum.

Secondly, your branch is missing a short note in NEWS that this bug 
is fixed by the change. Its a good idea to do that in the branch 
already, so its not forgotten later on and can be reviewed as well. 

Finally, I'm not sure I haven't asked that before, but could you 
find out the reason (or reproduce at least) why the old 
implementation was so slow?
Status: Started

Comment 4 by Richard Hopkins, Feb 6, 2011

Thanks for reviewing - I'm currently on Windows and cannot build it 
at the moment but I'll look at it again soon.

"externalize the code that iterates the edge_map" - do you 
mean to put the for loop in a new function so it can be used by 
"mtn status" and "mtn ls changed" - because they 
are now practically identical ?

I'm also not sure what you mean by namespaced enum ? I guess you 
mean 

enum output_type { added, attr, dropped, patched, renamed };

Does that mean the method would then need a set<output_type> 
and only if the set contains "added" should the added 
files be output ?

Regarding the reason for the slowness, I couldn't actually reproduce 
it myself - but the original issue mentioned a 750MB and I don't 
have one that big.

However, I did compare both versions and the only difference was 
inside the edge_map for loop. The start of both functions performed 
the same things but just in a different order.

The inside of the edge_map seemed to be doing basically the same as 
well - the original called select_nodes_modified_by_cset - which 
loops around each type just like the new implementation does.

The only difference I can see is that select_nodes_modified_by_cset 
does extra loops at the end to copy what it has found into the 
output. select_nodes_modified_by_cset also has more memory overhead 
as it's keeping more track of more things in memory.

Comment 5 by Thomas Keller, Feb 18, 2011

@Richard H.: Any news on this one?

Comment 6 by Richard Hopkins, Feb 24, 2011

1ce52abe1d12b891c5c5783921287f41b5915b30 passes all the tests and 
now needs reviewing.
Status: Fixed

Comment 7 by joe 23, Mar 17, 2011

I haven't tested the fix, but I found proof that in 0.99.1 it was 
scanning the whole tree. This may also be helpful for a test case.

Assume two directories x/ and y/ in a repo, both with checked-in 
files.

1. rm x/foo (remove - NOT drop - some file in x)
2. mtn list changed y (FAILS on missing file in x/)
3. mtn status y (Works)

If you have a fix, step 2 should work.

Comment 8 by Richard Hopkins, Mar 21, 2011

I have now added a new test to the func test suite which mimics Joes 
new scenario (thanks).

There is also 2 heads now in the nvm.issue-120 branch. The first is 
the original refactoring of "mtn status", the second is a 
1 line change to apply the PATH mask.

Both heads contain the test, and both pass.

d0883e2ac70cb5ca8a3cf845fbb5fee2f59e390e is the 1 line change and 
will be pushed later.

Comment 9 by Richard Hopkins, Mar 25, 2011

This is now merged into net.venge.monotone and has revision id 
b2729ea31820bf3bca56abb985e92f7009e4d4d2.

Comment 10 by Richard Levitte, Mar 25, 2011

According to the build slaves, this went through with no trouble.

Created: 13 years 11 months ago by joe 23

Updated: 13 years 8 months ago

Status: Fixed

Followed by: 3 persons

Labels:
Type:Defect
Priority:Medium
Milestone:1.0

Quick Links:     www.monotone.ca    -     Downloads    -     Documentation    -     Wiki    -     Code Forge    -     Build Status