monotone

Issue 42: invariant 'I(utf8_validate(path))' violated

Reported by Unknown User, Apr 13, 2006

(This entry was imported from the savannah tracker, original 
location: https://savannah.nongnu.org/bugs/index.php?16337)

I get the following error:

mtn: fatal: std::logic_error: paths.cc:194: invariant 'I(utf8_validate(path))' violated

when adding a file with the following filename:
"Mike´s Report.aqt"

It's the "´" (0xb4) that's causing it, because the 
problem goes away as soon as I change it.

I found what looks like a related bug, #15748, but it's still unresolved.

Thanks,

Alex

monotone version:
-----------------
monotone 0.26 (base revision: unknown)
Running on          : Windows NT/2000/XP (5.0, build 2195) on ia32 (level 15, rev 258)
C++ compiler        : GNU C++ version 3.4.5 (mingw special)
C++ standard library: GNU libstdc++ version 20051201
Boost version       : 1_33_1
Changes since base revision:
unknown

Comment 1 by Unknown User, Jun 5, 2006

´, U+00b4?  That's an acute accent; you should be using ’, U+2019!

But, umm, anyway, as to the actual bug report.  It's entirely 
plausible that we're muffing some character encoding conversion 
somewhere; in fact, I know there are bugs in that code that we 
haven't gotten around to fixing yet.  What would be helpful here is:
  -- the exact command you ran.  E.g., it makes a difference whether 
you ran "mtn add myfolder/Mike´s Report.aqt" or "mtn add myfolder/".
  -- the dump file generated whenever monotone gives that kind of 
fatal error.  It was probably left in _MTN/dump, or you could use 
--dump=<somefile> and re-run the command.
  -- the encoding your filesystem uses, if you know it.  That would be 
_really_ helpful, but I'm not sure how you'd find that out, exactly.  
My guess, though, is that somehow that character is being written in 
UTF-16, and we're not converting from that to UTF-8, or something 
along those lines.
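
For illustration, here is a minimal sketch of why a lone 0xb4 byte would trip a UTF-8 check. It assumes the filename reached monotone in a single-byte encoding such as ISO-8859-1 or Windows-1252, which is only a guess (UTF-16 is the other possibility raised above). The names is_valid_utf8 and latin1_to_utf8 are hypothetical helpers written for this example; they are not monotone's utf8_validate or any part of its code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a UTF-8 validity check (NOT monotone's
// utf8_validate). Rejects stray continuation bytes, truncated sequences,
// overlong forms, surrogates, and code points above U+10FFFF.
bool is_valid_utf8(const std::string& s)
{
  std::size_t i = 0;
  while (i < s.size())
    {
      unsigned char lead = static_cast<unsigned char>(s[i]);
      std::size_t extra;   // number of continuation bytes expected
      unsigned long cp;    // code point being decoded

      if (lead < 0x80)                      { extra = 0; cp = lead; }
      else if (lead >= 0xC2 && lead <= 0xDF) { extra = 1; cp = lead & 0x1F; }
      else if (lead >= 0xE0 && lead <= 0xEF) { extra = 2; cp = lead & 0x0F; }
      else if (lead >= 0xF0 && lead <= 0xF4) { extra = 3; cp = lead & 0x07; }
      else
        return false;      // e.g. a lone 0xb4, which is a continuation byte

      if (s.size() - i <= extra)
        return false;      // sequence is cut off at the end of the string

      for (std::size_t k = 1; k <= extra; ++k)
        {
          unsigned char c = static_cast<unsigned char>(s[i + k]);
          if ((c & 0xC0) != 0x80)
            return false;  // expected a 10xxxxxx continuation byte
          cp = (cp << 6) | (c & 0x3F);
        }

      if (extra == 2 && cp < 0x800)    return false;  // overlong
      if (extra == 3 && cp < 0x10000)  return false;  // overlong
      if (cp >= 0xD800 && cp <= 0xDFFF) return false; // surrogate
      if (cp > 0x10FFFF)               return false;  // out of range

      i += extra + 1;
    }
  return true;
}

// Hypothetical helper: re-encode ISO-8859-1 bytes as UTF-8. Every byte
// >= 0x80 becomes a two-byte sequence.
std::string latin1_to_utf8(const std::string& s)
{
  std::string out;
  for (std::size_t i = 0; i < s.size(); ++i)
    {
      unsigned char b = static_cast<unsigned char>(s[i]);
      if (b < 0x80)
        out += static_cast<char>(b);
      else
        {
          out += static_cast<char>(0xC0 | (b >> 6));
          out += static_cast<char>(0x80 | (b & 0x3F));
        }
    }
  return out;
}
```

With these helpers, is_valid_utf8("Mike\xb4s Report.aqt") is false, while the same name passed through latin1_to_utf8 first validates, since U+00B4 is encoded in UTF-8 as the two bytes 0xc2 0xb4.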

Comment 2 by Unknown User, Dec 18, 2006

I have a similar problem. I made a fresh new empty repository. Then 
I created a file with the name
this_filename_contains_the_characters_å,_ä,_and_ö.txt
Those are the native Swedish characters "a with a ring", 
"a with two dots", and "o with two dots". Then I 
ran:

mtn list unknown

I got the following error:

mtn: fatal: std::logic_error: paths.cc:255: invariant 'I(utf8_validate(path))' violated
mtn: this is almost certainly a bug in monotone.
mtn: please send this error message, the output of 'mtn --full-version',
mtn: and a description of what you were doing to monotone-devel@nongnu.org.
mtn: wrote debugging log to /home/sven/test/_MTN/debug
mtn: if reporting a bug, please include this file

Then I patched paths.cc as follows, to see what the failing string 
was:

--- paths.cc.original   2006-11-11 20:08:11.000000000 +0100
+++ paths.cc    2006-12-18 18:33:14.996221792 +0100
@@ -252,6 +252,9 @@
 file_path::file_path(file_path::source_type type, string const & path)
 {
   MM(path);
+  if (!utf8_validate(path))
+    for (int i=0; i<path.length(); i++)
+      printf("%x ", path[i]);
   I(utf8_validate(path));
   switch (type)
     {

When I ran "mtn list unknown" again, it printed the 
following sequence of numbers, in addition to the error message:
74 68 69 73 5f 66 69 6c 65 6e 61 6d 65 5f 63 6f 6e 74 61 69 6e 73 5f 
74 68 65 5f 63 68 61 72 61 63 74 65 72 73 5f ffffffe5 2c 5f ffffffe4 
2c 5f 61 6e 64 5f fffffff6 2e 74 78 74

So it seems that my Swedish characters became negative.
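
A note on the ffffffe5-style values: they are most likely an artifact of the debugging patch rather than of monotone itself. Where plain char is signed, a byte such as 0xe5 is promoted to a negative int before printf sees it, and %x then shows it as ffffffe5. The sketch below converts each byte through unsigned char first; hex_bytes is a hypothetical helper written for this example, not part of monotone.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format each byte of s as lowercase hex followed by
// a space. Going through unsigned char keeps bytes >= 0x80 in the range
// 80..ff instead of sign-extending them to ffffff80..ffffffff.
std::string hex_bytes(const std::string& s)
{
  std::string out;
  char buf[4];  // room for two hex digits, a space, and the terminating NUL
  for (std::size_t i = 0; i < s.size(); ++i)
    {
      unsigned char b = static_cast<unsigned char>(s[i]);
      std::snprintf(buf, sizeof buf, "%x ", static_cast<unsigned int>(b));
      out += buf;
    }
  return out;
}
```

Read that way, the three odd values are the bytes e5, e4 and f6, which are the ISO-8859-1 codes for å, ä and ö. That would suggest the filename reached file_path in a single-byte encoding rather than UTF-8, though this is an inference from the dump, not something confirmed in this report.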

I attach the _MTN/debug file.


(file #11552)

Created: 18 years 8 months ago by Unknown User

Updated: 18 years 2 days ago

Status: New

Labels:
Type:Other
Component:Charset Handling
Priority:Medium
