Root/paths.hh

#ifndef __PATHS_H__
#define __PATHS_H__

// copyright (C) 2005 nathaniel smith <njs@pobox.com>
// all rights reserved.
// licensed to the public under the terms of the GNU GPL (>= 2)
// see the file COPYING for details

// safe, portable, fast, simple path handling -- in that order.
// but they all count.
//
// this file defines the vocabulary we speak in when dealing with the
// filesystem. this is an extremely complex problem by the time one worries
// about normalization, security issues, character sets, and so on;
// furthermore, path manipulation has historically been a performance
// bottleneck in monotone. so the goal here is the efficient implementation
// of a design that makes it hard or impossible to introduce as many classes
// of bugs as possible.
//
// Our approach is to have three different types of paths:
//   -- system_path
//      this is a path to anywhere in the fs. it is in native format. it is
//      always absolute. when constructed from a string, it interprets the
//      string as being relative to the directory that monotone was run in.
//      (note that this may be different from monotone's current directory,
//      as when run in a working copy, monotone chdir's to the project root.)
//
//      one can also construct a system_path from one of the below two types
//      of paths. this is intelligent, in that it knows that these sorts of
//      paths are considered to be relative to the project root. thus
//        system_path(file_path_internal("foo"))
//      is not, in general, the same as
//        system_path("foo")
//
//   -- file_path
//      this is a path representing a versioned file. it is always
//      a fully normalized relative path, that does not escape the project
//      root. it is always relative to the project root.
//      you cannot construct a file_path directly from a string; you must pick
//      a constructor:
//        file_path_internal: use this for strings that come from
//          "monotone-internal" places, e.g. parsing revisions. this turns on
//          stricter checking -- the string must already be normalized -- and
//          is extremely fast. such strings are interpreted as being relative
//          to the project root.
//        file_path_external: use this for strings that come from the user.
//          these strings are normalized before being checked, and if there is
//          a problem trigger N() invariants rather than I() invariants. such
//          strings are interpreted as being _relative to the user's original
//          directory_. this function can only be called from within a
//          working copy.
//        file_path_internal_from_user: use this for strings that come from
//          the user, _but_ are not referring to paths in the working copy,
//          but rather in some database object directly. for instance, 'cat
//          file REV PATH' uses this function. this function is exactly like
//          file_path_internal, except that it raises N() errors rather than
//          I() errors.
//      file_path's also provide optimized splitting and joining
//      functionality.
//
//   -- bookkeeping_path
//      this is a path representing something in the MT/ directory of a
//      working copy. it has the same format restrictions as a file_path,
//      except instead of being forbidden to point into the MT directory, it
//      is _required_ to point into the MT directory. the one constructor is
//      strict, and analogous to file_path_internal. however, the normal way
//      to construct bookkeeping_path's is to use the global constant
//      'bookkeeping_root', which points to the MT directory. Thus to
//      construct a path pointing to MT/options, use:
//        bookkeeping_root / "options"
//
// All path types should always be constructed from utf8-encoded strings.
//
// All path types provide an "operator /" which allows one to construct new
// paths pointing to things underneath a given path. E.g.,
//   file_path_internal("foo") / "bar" == file_path_internal("foo/bar")
//
// All path types subclass 'any_path', which provides:
//   -- emptiness checking with .empty()
//   -- a method .as_internal(), which returns the utf8-encoded string
//      representing this path for internal use. for instance, this is the
//      string that should be embedded into the text of revisions.
//   -- a method .as_external(), which returns a std::string suitable for
//      passing to filesystem interface functions. in practice, this means
//      that it is recoded into an appropriate character set, etc.
//   -- an operator<< for ostreams. this should always be used when writing
//      out paths for display to the user. at the moment it just calls one
//      of the above functions, but this is _not_ correct. there are
//      actually 3 different logical character sets -- internal (utf8),
//      user (locale-specific), and filesystem (locale-specific, except
//      when it's not, i.e., on OS X). so we need three distinct operations,
//      and you should use the correct one.
//
//      all this means that when you want to print out a path, you usually
//      want to just say:
//        F("my path is %s") % my_path
//      i.e., nothing fancy necessary, for purposes of F() just treat it as
//      if it were a string
//
//
// There is also one "not really a path" type, 'split_path'. This is a vector
// of path_component's, and semantically equivalent to a file_path --
// file_path's can be split into split_path's, and split_path's can be joined
// into file_path's.


#include <iosfwd>
#include <string>
#include <vector>

#include "numeric_vocab.hh"
#include "vocab.hh"

typedef u32 path_component;

typedef std::vector<path_component> split_path;

const path_component the_null_component = 0;

inline bool
null_name(path_component pc)
{
  return pc == the_null_component;
}

// It's possible this will become a proper virtual interface in the future,
// but since the implementation is exactly the same in all cases, there isn't
// much point ATM...
class any_path
{
public:
  // converts to native charset and path syntax
  // this is a path that you can pass to the operating system
  std::string as_external() const;
  // leaves as utf8
  std::string const & as_internal() const
  { return data(); }
  bool empty() const
  { return data().empty(); }
protected:
  utf8 data;
  any_path() {}
  any_path(any_path const & other)
    : data(other.data) {}
  any_path & operator=(any_path const & other)
  { data = other.data; return *this; }
};

std::ostream & operator<<(std::ostream & o, any_path const & a);

class file_path : public any_path
{
public:
  file_path() {}
  // join a file_path out of pieces
  file_path(split_path const & sp);

  // this currently doesn't do any normalization or anything.
  file_path operator /(std::string const & to_append) const;

  void split(split_path & sp) const;

  bool operator ==(const file_path & other) const
  { return data == other.data; }

  bool operator <(const file_path & other) const
  { return data < other.data; }

private:
  typedef enum { internal, external, internal_from_user } source_type;
  // input is always in utf8, because everything in our world is always in
  // utf8 (except interface code itself).
  // external paths:
  //   -- are converted to internal syntax (/ rather than \, etc.)
  //   -- normalized
  //   -- assumed to be relative to the user's cwd, and munged
  //      to become relative to root of the working copy instead
  // both types of paths:
  //   -- are confirmed to be normalized and relative
  //   -- not to be in MT/
  file_path(source_type type, std::string const & path);
  friend file_path file_path_internal(std::string const & path);
  friend file_path file_path_external(utf8 const & path);
  friend file_path file_path_internal_from_user(utf8 const & path);
};

// these are the public file_path constructors
inline file_path file_path_internal(std::string const & path)
{
  return file_path(file_path::internal, path);
}
inline file_path file_path_external(utf8 const & path)
{
  return file_path(file_path::external, path());
}
// this is rarely used; it is for when the user provides not a path relative
// to their position in the working directory, but instead a project-root
// relative path (e.g., in 'cat REV PATH'). It is exactly like
// file_path_internal, but counts invalid paths as naughtiness rather than
// bugs.
inline file_path file_path_internal_from_user(utf8 const & path)
{
  return file_path(file_path::internal_from_user, path());
}


class bookkeeping_path : public any_path
{
public:
  bookkeeping_path() {};
  // path _should_ contain the leading MT/
  // and _should_ look like an internal path
  // usually you should just use the / operator as a constructor!
  bookkeeping_path(std::string const & path);
  bookkeeping_path operator /(std::string const & to_append) const;
  // exposed for the use of walk_tree
  static bool is_bookkeeping_path(std::string const & path);
};

extern bookkeeping_path const bookkeeping_root;

// this will always be an absolute path
class system_path : public any_path
{
public:
  system_path() {};
  system_path(system_path const & other) : any_path(other) {};
  // the optional argument takes some explanation. this constructor takes a
  // path relative to the working copy root. the question is how to interpret
  // that path -- since it's possible to have multiple working copies over the
  // course of the program's execution (e.g., if someone runs 'checkout'
  // while already in a working copy). if 'true' is passed (the default),
  // then monotone will trigger an invariant if the working copy changes after
  // we have already interpreted the path relative to some other working
  // copy. if 'false' is passed, then the path is taken to be relative to
  // whatever the current working copy is, and will continue to reference it
  // even if the working copy later changes.
  explicit system_path(any_path const & other,
                       bool in_true_working_copy = true);
  // this path can contain anything, and it will be absolutified and
  // tilde-expanded. it will be considered to be relative to the directory
  // monotone started in. it should be in utf8.
  system_path(std::string const & path);
  system_path(utf8 const & path);
  system_path operator /(std::string const & to_append) const;
};


void
save_initial_path();

// returns true if working copy found, in which case cwd has been changed
// returns false if working copy not found
bool
find_and_go_to_working_copy(system_path const & search_root);

// this is like change_current_working_dir, but also initializes the various
// root paths that are needed to interpret paths
void
go_to_working_copy(system_path const & new_working_copy);

#endif
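
The header comment describes the restrictions on a file_path (relative, fully normalized, not escaping the project root, not pointing into MT/) and the "operator /" join, but the implementation lives elsewhere. The following is a minimal standalone sketch of what such a strict check and join could look like. It is not monotone's code: the names split_components, looks_like_internal_file_path and join_path are hypothetical, plain std::string components stand in for the interned path_component values, and treating the empty path as valid is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: split an internal-style path on '/'.
// An empty path yields no components; "a//b" yields an empty middle one.
std::vector<std::string> split_components(std::string const & path)
{
  std::vector<std::string> out;
  if (path.empty())
    return out;
  std::string::size_type start = 0;
  for (;;)
    {
      std::string::size_type stop = path.find('/', start);
      if (stop == std::string::npos)
        {
          out.push_back(path.substr(start));
          return out;
        }
      out.push_back(path.substr(start, stop - start));
      start = stop + 1;
    }
}

// Hypothetical strict check, loosely following the restrictions the header
// lists for file_path: relative, already normalized, and not inside the
// top-level MT/ bookkeeping directory.  It does not normalize anything.
bool looks_like_internal_file_path(std::string const & path)
{
  std::vector<std::string> parts = split_components(path);
  for (std::string const & c : parts)
    {
      // an empty component means a leading, trailing or doubled '/';
      // "." and ".." mean the path is not normalized (or may escape the root)
      if (c.empty() || c == "." || c == "..")
        return false;
    }
  // assumption: only a leading "MT" component counts as bookkeeping
  if (!parts.empty() && parts[0] == "MT")
    return false;
  return true;
}

// Hypothetical analogue of "operator /": append one component without
// doing any normalization, as the header says the real operator does not.
std::string join_path(std::string const & base, std::string const & to_append)
{
  assert(!to_append.empty());
  if (base.empty())
    return to_append;
  return base + "/" + to_append;
}
```

Under these assumptions, "foo/bar" passes the check while "/foo", "foo//bar", "foo/../bar" and "MT/options" are rejected, and join_path("foo", "bar") gives "foo/bar", mirroring the file_path_internal("foo") / "bar" example in the header. Platform-specific cases such as drive letters or backslashes are deliberately left out of the sketch.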
