monotone

Issue 199: Some repositories broken (SQL logic error)

Reported by Dirk Heinrichs, Nov 20, 2011

As per ML discussion, here's the bug report for it.

Some of my repositories seem to be broken. A checkout attempt and
mtn db check both give the following error:

mtn: error: sqlite error: SQL logic error or missing database

I also tried to dump one of the databases, which gives this:

% mtn db dump --db=~/monotone/vcontrol.mtn 2>&1|head
mtn: fatal: std::terminate() - exception thrown while handling another exception
mtn: This is almost certainly a bug in monotone.
mtn: Please report this error message, the output of 'mtn version --full',
mtn: and a description of what you were doing to 'https://code.monotone.ca/p/monotone/issues/'.
mtn: wrote debugging log to /afs/altum.de/home/heini/.monotone/dump
mtn: if reporting a bug, please include this file

Here's the content of the dump file:

% cat /afs/altum.de/home/heini/.monotone/dump
Encountered an error while musing upon the following:
src/database.cc:804: detected internal error, 'I(stepresult == SQLITE_DONE || stepresult == SQLITE_ROW)' violated
Encountered an error while musing upon the following:
src/migrate_schema.cc:105: detected system error, 'E(false)' violated
Current work set: 4 items
----- begin 'system_flavour' (in virtual void sanity::initialize(int, char**, const char*), at src/sanity.cc:119)
Linux 3.0.0-13-generic #22-Ubuntu SMP Wed Nov 2 13:27:26 UTC 2011 x86_64
-----   end 'system_flavour' (in virtual void sanity::initialize(int, char**, const char*), at src/sanity.cc:119)
----- begin 'cmdline_string' (in virtual void sanity::initialize(int, char**, const char*), at src/sanity.cc:133)
'mtn', 'db', 'dump', '--db=~/monotone/vcontrol.mtn'
-----   end 'cmdline_string' (in virtual void sanity::initialize(int, char**, const char*), at src/sanity.cc:133)
----- begin 'string(lc_all)' (in virtual void sanity::initialize(int, char**, const char*), at src/sanity.cc:138)
C
-----   end 'string(lc_all)' (in virtual void sanity::initialize(int, char**, const char*), at src/sanity.cc:138)
----- begin 'full_version_string' (in virtual void mtn_sanity::initialize(int, char**, const char*), at src/mtn-sanity.cc:32)
monotone 1.0 (base revision: a7c3a1d9de1ba7a62c9dd9efee17252234bb502c)
Running on          : Linux 3.0.0-13-generic #22-Ubuntu SMP Wed Nov 2 13:27:26 UTC 2011 x86_64
C++ compiler        : GNU C++ version 4.6.1
C++ standard library: GNU libstdc++ version 20110903
Boost version       : 1_46_1
SQLite version      : 3.7.7 (compiled against 3.7.7)
Lua version         : Lua 5.1
PCRE version        : 8.12 2011-01-15 (compiled against 8.12)
Botan version       : 1.8.13 (compiled against 1.8.13)
Changes since base revision:
format_version "1"

new_manifest [b252820fde344fd3f5d023fd91de86522baa671d]

old_revision [a7c3a1d9de1ba7a62c9dd9efee17252234bb502c]

  Generated from data cached in the distribution;
  further changes may have been made.
-----   end 'full_version_string' (in virtual void mtn_sanity::initialize(int, char**, const char*), at src/mtn-sanity.cc:32)

One of my databases is attached for investigation.

Comment 1 by Thomas Keller, Nov 28, 2011

I had a look at the database and, unfortunately, it seems to be very 
badly broken. This is a kind of breakage that I have seen only very, 
very rarely in the past 5 years I have worked on this project.

The underlying problem sqlite reports is SQLITE_CORRUPT ("The 
database disk image is malformed"), i.e. one or more tables are 
damaged (revisions, revision_certs, roster_deltas, file_deltas and 
possibly a few others as well).
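
For readers who want to check a database for this kind of damage themselves, here is a minimal sketch using Python's standard sqlite3 module. It assumes the monotone database can be opened as an ordinary SQLite 3 file; the function name integrity_report is made up for this illustration and is not part of monotone. The file is opened read-only so that the check itself cannot alter it.

```python
import sqlite3
from pathlib import Path


def integrity_report(db_path):
    """Return the messages produced by SQLite's PRAGMA integrity_check.

    Hypothetical helper for illustration only; not part of monotone.
    """
    # Build a file: URI with mode=ro so SQLite opens the file read-only.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
        return [row[0] for row in rows]
    except sqlite3.DatabaseError as exc:
        # A badly damaged file can fail before the check yields any rows.
        return [f"error: {exc}"]
    finally:
        conn.close()
```

A healthy database yields the single entry "ok"; anything else lists the problems SQLite found, or the error it raised while trying to read the file.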

While it's not completely impossible that this kind of data 
corruption could have been introduced by monotone or even sqlite 
(both actually have a rather good consistency record), it's more 
likely that some faulty hardware or software system caused this 
problem. In the past, databases hosted on network drives were blamed 
in similar cases, but whether that applies to your specific case is 
hard to guess.

Can you give us more information about your setup? Is this a problem 
that you can reproduce somehow, or do you remember the last steps 
you took before the database became corrupt?

Thanks,
Thomas.

Comment 2 by Dirk Heinrichs, Dec 1, 2011

Yes, it indeed resides on a network filesystem (OpenAFS). However, 
there's never been any concurrent access by different users.

And no, unfortunately I cannot remember what I did last with these 
databases, maybe I ran db migrate after upgrading to monotone 1.0...

Bye...

    Dirk

Comment 3 by Thomas Keller, Dec 2, 2011

Network filesystems are prone to causing this kind of corruption by 
their nature, even if there is no concurrent access to the database 
file. We usually urge everybody not to host monotone databases on 
network drives, or at least to keep regular backups handy if it's 
not possible to store the database on a non-network drive.
Status: WontFix
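
As an illustration of the backup advice above, the following is a rough sketch that copies a database with SQLite's online backup API (exposed in Python 3.7+ as Connection.backup) and then checks the copy. It assumes the monotone database is a plain SQLite 3 file and that no mtn process is writing to it at the time; the function name backup_database and both paths are hypothetical. Copying the file by other means while monotone is not running may serve equally well.

```python
import sqlite3


def backup_database(src_path, dest_path):
    """Copy a SQLite database to dest_path and verify the copy.

    Hypothetical helper for illustration only; not part of monotone.
    Returns True if the copy passes PRAGMA integrity_check.
    """
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # Copy every page of the source database into the destination.
        src.backup(dest)
        # Check the copy right away so a bad backup is noticed early.
        result = dest.execute("PRAGMA integrity_check").fetchone()
        return result[0] == "ok"
    finally:
        dest.close()
        src.close()
```

The destination would ideally sit on a local, non-network drive, so that the backup is not exposed to the same risk as the original.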

Created: 4 years 7 months ago by Dirk Heinrichs

Updated: 4 years 6 months ago

Status: WontFix

Followed by: 1 person

Labels:
Type:Defect
Priority:Medium
