monotone Mtn Source Tree

Root/paths.hh

#ifndef __PATHS_HH__
#define __PATHS_HH__

// Copyright (C) 2005 Nathaniel Smith <njs@pobox.com>
//
// This program is made available under the GNU GPL version 2.0 or
// greater. See the accompanying file COPYING for details.
//
// This program is distributed WITHOUT ANY WARRANTY; without even the
// implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
// PURPOSE.

// safe, portable, fast, simple path handling -- in that order.
// but they all count.
//
// this file defines the vocabulary we speak in when dealing with the
// filesystem. this is an extremely complex problem by the time one worries
// about normalization, security issues, character sets, and so on;
// furthermore, path manipulation has historically been a performance
// bottleneck in monotone. so the goal here is the efficient implementation
// of a design that makes it hard or impossible to introduce as many classes
// of bugs as possible.
//
// Our approach is to have three different types of paths:
//   -- system_path
//      this is a path to anywhere in the fs. it is in native format. it is
//      always absolute. when constructed from a string, it interprets the
//      string as being relative to the directory that monotone was run in.
//      (note that this may be different from monotone's current directory,
//      because when run in a workspace, monotone chdir's to the project
//      root.)
//
//      one can also construct a system_path from one of the below two types
//      of paths. this is intelligent, in that it knows that these sorts of
//      paths are considered to be relative to the project root. thus
//        system_path(file_path_internal("foo"))
//      is not, in general, the same as
//        system_path("foo")
//
//   -- file_path
//      this is a path representing a versioned file. it is always
//      a fully normalized relative path that does not escape the project
//      root. it is always relative to the project root.
//      you cannot construct a file_path directly from a string; you must pick
//      a constructor:
//        file_path_internal: use this for strings that come from
//          "monotone-internal" places, e.g. parsing revisions. this turns on
//          stricter checking -- the string must already be normalized -- and
//          is extremely fast. such strings are interpreted as being relative
//          to the project root.
//        file_path_external: use this for strings that come from the user.
//          these strings are normalized before being checked, and if there
//          is a problem they trigger N() invariants rather than I()
//          invariants. if in a workspace, such strings are interpreted as
//          being _relative to the user's original directory_.
//          if not in a workspace, strings are treated as referring to some
//          database object directly.
//      file_path's also provide optimized splitting and joining
//      functionality.
//
//   -- bookkeeping_path
//      this is a path representing something in the _MTN/ directory of a
//      workspace. it has the same format restrictions as a file_path,
//      except instead of being forbidden to point into the _MTN directory, it
//      is _required_ to point into the _MTN directory. the one constructor is
//      strict, and analogous to file_path_internal. however, the normal way
//      to construct bookkeeping_path's is to use the global constant
//      'bookkeeping_root', which points to the _MTN directory. Thus to
//      construct a path pointing to _MTN/options, use:
//        bookkeeping_root / "options"
//
// All path types should always be constructed from utf8-encoded strings.
//
// All path types provide an "operator /" which allows one to construct new
// paths pointing to things underneath a given path. E.g.,
//   file_path_internal("foo") / "bar" == file_path_internal("foo/bar")
//
// All path types subclass 'any_path', which provides:
//   -- emptiness checking with .empty()
//   -- a method .as_internal(), which returns the utf8-encoded string
//      representing this path for internal use. for instance, this is the
//      string that should be embedded into the text of revisions.
//   -- a method .as_external(), which returns a std::string suitable for
//      passing to filesystem interface functions. in practice, this means
//      that it is recoded into an appropriate character set, etc.
//   -- an operator<< for ostreams. this should always be used when writing
//      out paths for display to the user. at the moment it just calls one
//      of the above functions, but this is _not_ correct. there are
//      actually 3 different logical character sets -- internal (utf8),
//      user (locale-specific), and filesystem (locale-specific, except
//      when it's not, i.e., on OS X). so we need three distinct operations,
//      and you should use the correct one.
//
//      all this means that when you want to print out a path, you usually
//      want to just say:
//        F("my path is %s") % my_path
//      i.e., nothing fancy necessary, for purposes of F() just treat it as
//      if it were a string
//
//
// There is also one "not really a path" type, 'split_path'. This is a vector
// of path_component's, and semantically equivalent to a file_path --
// file_path's can be split into split_path's, and split_path's can be joined
// into file_path's.


#include <iosfwd>
#include <string>
#include <vector>
#include <set>

#include "vocab.hh"

#include <boost/filesystem/path.hpp>

namespace fs = boost::filesystem;

typedef std::vector<path_component> split_path;

const path_component the_null_component;

inline bool
null_name(path_component pc)
{
  return pc == the_null_component;
}

bool
workspace_root(split_path const & sp);

template <> void dump(split_path const & sp, std::string & out);

// It's possible this will become a proper virtual interface in the future,
// but since the implementation is exactly the same in all cases, there isn't
// much point ATM...
class any_path
{
public:
  // converts to native charset and path syntax
  // this is a path that you can pass to the operating system
  std::string as_external() const;
  // leaves as utf8
  std::string const & as_internal() const
  { return data(); }
  bool empty() const
  { return data().empty(); }
protected:
  utf8 data;
  any_path() {}
  any_path(any_path const & other)
    : data(other.data) {}
  any_path & operator=(any_path const & other)
  { data = other.data; return *this; }
};

std::ostream & operator<<(std::ostream & o, any_path const & a);
std::ostream & operator<<(std::ostream & o, split_path const & s);

class file_path : public any_path
{
public:
  file_path() {}
  // join a file_path out of pieces
  file_path(split_path const & sp);

  // this currently doesn't do any normalization or anything.
  file_path operator /(std::string const & to_append) const;

  void split(split_path & sp) const;

  bool operator ==(const file_path & other) const
  { return data == other.data; }

  bool operator <(const file_path & other) const
  { return data < other.data; }

private:
  typedef enum { internal, external } source_type;
  // input is always in utf8, because everything in our world is always in
  // utf8 (except interface code itself).
  // external paths:
  //   -- are converted to internal syntax (/ rather than \, etc.)
  //   -- normalized
  //   -- assumed to be relative to the user's cwd, and munged
  //      to become relative to root of the workspace instead
  // both types of paths:
  //   -- are confirmed to be normalized and relative
  //   -- not to be in _MTN/
  file_path(source_type type, std::string const & path);
  friend file_path file_path_internal(std::string const & path);
  friend file_path file_path_external(utf8 const & path);
};

// these are the public file_path constructors
inline file_path file_path_internal(std::string const & path)
{
  return file_path(file_path::internal, path);
}
inline file_path file_path_external(utf8 const & path)
{
  return file_path(file_path::external, path());
}

class bookkeeping_path : public any_path
{
public:
  bookkeeping_path() {};
  // path _should_ contain the leading _MTN/
  // and _should_ look like an internal path
  // usually you should just use the / operator as a constructor!
  bookkeeping_path(std::string const & path);
  bookkeeping_path operator /(std::string const & to_append) const;
  // exposed for the use of walk_tree and friends
  static bool internal_string_is_bookkeeping_path(utf8 const & path);
  static bool external_string_is_bookkeeping_path(utf8 const & path);
  bool operator ==(const bookkeeping_path & other) const
  { return data == other.data; }

  bool operator <(const bookkeeping_path & other) const
  { return data < other.data; }
};

extern bookkeeping_path const bookkeeping_root;
extern path_component const bookkeeping_root_component;
// for migration
extern file_path const old_bookkeeping_root;

// this will always be an absolute path
class system_path : public any_path
{
public:
  system_path() {};
  system_path(system_path const & other) : any_path(other) {};
  // the optional argument takes some explanation. this constructor takes a
  // path relative to the workspace root. the question is how to interpret
  // that path -- since it's possible to have multiple workspaces over the
  // course of the program's execution (e.g., if someone runs 'checkout'
  // while already in a workspace). if 'true' is passed (the default),
  // then monotone will trigger an invariant if the workspace changes after
  // we have already interpreted the path relative to some other working
  // copy. if 'false' is passed, then the path is taken to be relative to
  // whatever the current workspace is, and will continue to reference it
  // even if the workspace later changes.
  explicit system_path(any_path const & other,
                       bool in_true_workspace = true);
  // this path can contain anything, and it will be absolutified and
  // tilde-expanded. it will be considered to be relative to the directory
  // monotone started in. it should be in utf8.
  system_path(std::string const & path);
  system_path(utf8 const & path);
  system_path operator /(std::string const & to_append) const;
};

void
dirname_basename(split_path const & sp,
                 split_path & dirname, path_component & basename);

void
save_initial_path();

system_path
current_root_path();

// returns true if workspace found, in which case cwd has been changed
// returns false if workspace not found
bool
find_and_go_to_workspace(system_path const & search_root);

// this is like change_current_working_dir, but also initializes the various
// root paths that are needed to interpret paths
void
go_to_workspace(system_path const & new_workspace);

void mark_std_paths_used(void);

typedef std::set<split_path> path_set;

void
split_paths(std::vector<file_path> const & file_paths, path_set & split_paths);

// equivalent to file_path_internal(path).split(sp) but more efficient.
void
internal_string_to_split_path(std::string const & path, split_path & sp);

// Local Variables:
// mode: C++
// fill-column: 76
// c-file-style: "gnu"
// indent-tabs-mode: nil
// End:
// vim: et:sw=2:sts=2:ts=2:cino=>2s,{s,\:s,+s,t0,g0,^-2,e-2,n-2,p2s,(0,=s:

#endif
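
The header comment describes a file_path as a normalized, root-relative path that can only be built through a strict "internal" constructor or a lenient "external" one, and that can be split into components and joined back. The following standalone sketch illustrates that idea only; it is not monotone's implementation. Every name in it (demo_file_path, demo_path_internal, demo_path_external, demo_split_path, checked_components, join_components) is hypothetical. It assumes '/' as the only separator, uses std::logic_error as a stand-in for I() invariants and std::runtime_error as a stand-in for N() invariants, and leaves out the rebasing of user paths onto the workspace root as well as all character set handling.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for split_path: one string per path component.
typedef std::vector<std::string> demo_split_path;

// Split on '/', drop empty and "." components, resolve "..".
// Throws std::runtime_error (playing the role of N()) for absolute paths,
// paths that climb above the root, and paths that point into _MTN.
static demo_split_path
checked_components(std::string const & path)
{
  if (!path.empty() && path[0] == '/')
    throw std::runtime_error("absolute path not allowed: " + path);
  demo_split_path out;
  std::string::size_type start = 0;
  while (start <= path.size())
    {
      std::string::size_type stop = path.find('/', start);
      if (stop == std::string::npos)
        stop = path.size();
      std::string piece = path.substr(start, stop - start);
      if (piece == "..")
        {
          if (out.empty())
            throw std::runtime_error("path escapes root: " + path);
          out.pop_back();
        }
      else if (!piece.empty() && piece != ".")
        out.push_back(piece);
      start = stop + 1;
    }
  if (!out.empty() && out[0] == "_MTN")
    throw std::runtime_error("path points into _MTN: " + path);
  return out;
}

// Join components with '/'; an empty vector gives the empty (root) path.
static std::string
join_components(demo_split_path const & parts)
{
  std::string out;
  for (demo_split_path::size_type i = 0; i < parts.size(); ++i)
    {
      assert(!parts[i].empty());
      if (i != 0)
        out += '/';
      out += parts[i];
    }
  return out;
}

// Hypothetical stand-in for file_path. The constructor is private, so the
// two named functions below are the only way to obtain an instance.
class demo_file_path
{
public:
  std::string const & as_internal() const { return data_; }
  demo_split_path split() const { return checked_components(data_); }
  bool operator ==(demo_file_path const & other) const
  { return data_ == other.data_; }
private:
  explicit demo_file_path(std::string const & normalized)
    : data_(normalized) {}
  std::string data_;
  friend demo_file_path demo_path_internal(std::string const & path);
  friend demo_file_path demo_path_external(std::string const & path);
};

// Strict: the string must already be in normalized form. Any deviation is
// reported as std::logic_error (playing the role of I()). This sketch
// re-normalizes to compare; a cheaper check would be used in practice.
inline demo_file_path
demo_path_internal(std::string const & path)
{
  bool normalized = false;
  try
    {
      normalized = (join_components(checked_components(path)) == path);
    }
  catch (std::runtime_error const &)
    {
      normalized = false;
    }
  if (!normalized)
    throw std::logic_error("internal path not normalized: " + path);
  return demo_file_path(path);
}

// Lenient: normalize user input first, then store the result.
inline demo_file_path
demo_path_external(std::string const & path)
{
  return demo_file_path(join_components(checked_components(path)));
}
```

With these definitions, demo_path_external("./foo//bar/../baz") and demo_path_internal("foo/baz") compare equal, while demo_path_internal("foo/../bar") is rejected because the string is not already normalized. Keeping the constructor private is the point of the design: code that holds a demo_file_path never has to re-check it.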
