|
dir_it
iterator to get all files in a directory
Abstract
The Standard C++ Library does not provide any way to access
the directory structure of a computer, because some C++ target
platforms have no notion of directories at all. Many important
platforms do have directories, but the system interfaces differ
considerably between them. This class provides a standard
interface which can be extended to suit platform-specific needs,
in particular for access to file attributes.
Synopsis
|
#include <boost/directory.h>
std::string dirname(...);
boost::filesystem::dir_it begin(dirname);
boost::filesystem::dir_it end;
boost::filesystem::dir_it it(begin);
it = begin
*it
++it
*it++
it == end
it != end
prop::value_type v = boost::filesystem::get<prop>(it)
boost::filesystem::set<prop>(it, value)
|
Description
The class boost::filesystem::dir_it (dir_it for short)
is an input iterator which iterates over the entries in a directory.
A begin iterator is constructed from a valid directory name in the
platform-specific notation; an end
iterator is constructed using the default constructor of the class.
The two functions boost::filesystem::get() and
boost::filesystem::set() are used to access specific properties
of a file. The exact list of available properties depends on the
system. Below is a list of common properties and lists of properties
supported on specific systems.
Since the file properties differ between systems, an extensible
interface was chosen to allow different sets of properties to be
accessed. It is even possible for the user to add special properties.
To define a new file property, a struct is defined which
gives the property its name and its type. It is also
necessary to define the get() and/or set() functions.
Details are given below.
Basic Functionality
The main functionality of the class dir_it is to iterate
over the entries in a directory. Here is an example of how the class
can be used to print the files in a directory:
|
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <boost/directory.h>

int main(int ac, char *av[])
{
    if (ac == 2)
    {
        typedef boost::filesystem::dir_it InIt;
        typedef std::ostream_iterator<std::string> OutIt;
        // Copy every entry name of the directory av[1] to std::cout,
        // one name per line.
        std::copy(InIt(av[1]), InIt(), OutIt(std::cout, "\n"));
    }
    return 0;
}
|
Of course, it is also possible to write this loop manually, since
the class dir_it is just an input iterator. Note that
the post increment operator only returns a proxy object which
can be dereferenced (using operator*()) as
required by the input iterator specification. The
proxy object cannot be used to access file attributes
other than the name.
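A manual loop might look like the sketch below. Since
<boost/directory.h> may not be at hand, the sketch does not use the
real class: dir_it_sketch and list_directory() are hypothetical names
introduced here, built on the POSIX calls opendir(3), readdir(3) and
closedir(3). It is an assumption about how such an iterator could
behave on a POSIX system, and it does not model the copy semantics of
dir_it. Only the for loop in list_directory() is the point.

```cpp
#include <dirent.h>   // POSIX: opendir, readdir, closedir

#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for dir_it (assumption: POSIX system).
// Copy semantics of the real class are NOT modelled here.
class dir_it_sketch {
public:
    dir_it_sketch() {}  // past-the-end iterator

    // If the directory cannot be opened, the handle stays null and the
    // iterator compares equal to the past-the-end iterator.
    explicit dir_it_sketch(const std::string& dirname)
        : dir_(opendir(dirname.c_str()), [](DIR* d) { if (d) closedir(d); })
    {
        advance();
    }

    const std::string& operator*() const {
        assert(dir_ && "a past-the-end iterator cannot be dereferenced");
        return name_;
    }
    dir_it_sketch& operator++() { advance(); return *this; }

    // Two iterators differ if exactly one of them is past the end.
    bool operator!=(const dir_it_sketch& other) const {
        return static_cast<bool>(dir_) != static_cast<bool>(other.dir_);
    }

private:
    void advance() {
        if (!dir_) return;
        if (dirent* e = readdir(dir_.get()))
            name_ = e->d_name;
        else
            dir_.reset();  // no more entries: become past-the-end
    }

    std::shared_ptr<DIR> dir_;
    std::string name_;
};

// The manual loop: collect the names of all entries in dirname.
std::vector<std::string> list_directory(const std::string& dirname) {
    std::vector<std::string> names;
    const dir_it_sketch end;
    for (dir_it_sketch it(dirname); it != end; ++it)
        names.push_back(*it);
    return names;
}
```

With the real class, the same loop would be written with
boost::filesystem::dir_it in place of dir_it_sketch. Calling
list_directory() on a name which is not an accessible directory
returns an empty list, because the begin iterator is then already
equal to the end iterator.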
dir_it Members
Lifecycle
-
Default Constructor
-
The default constructor is used to create the "past the
end" iterator. This construction never fails and the
resulting iterator cannot be dereferenced.
-
Constructor taking a std::string
-
A std::string naming a directory can be used to
construct a "begin" iterator. If the argument does
not name an accessible directory, the resulting iterator
compares equal to the past the end iterator constructed
with the default constructor. On most systems this way of
indicating failure is unambiguous because even an
empty directory has entries, e.g. on POSIX systems the
directories "." (the directory itself) and ".."
(the parent directory).
-
Copy Constructor
-
The copy constructor creates a new instance which is
always positioned on the same current entry as the
original dir_it instance. This means that
advancing either the original or the newly created
iterator will advance both iterators. It is not
possible to copy a dir_it in order to iterate over the
same directory entries twice. To do this, two objects
of type dir_it have to be constructed from the
directory name.
-
Destructor
-
The destructor releases the resources associated with
the dir_it. However, if the dir_it was
copied, associated system resources are released when
the last copy is destroyed. This is because the various
copies share the same system resources.
-
Assignment
-
The assigned dir_it is always positioned on the
same entry as the original iterator. Thus, the same
restrictions apply to the assigned iterator as
to iterators created with the copy constructor.
Operations
-
Dereference (operator*())
-
Dereferencing a dir_it returns the name of the
current directory entry as a std::string. It is
only possible to dereference a dir_it if it does
not compare equal to the past the end iterator.
-
Pre Increment (operator++())
-
The major means to advance a dir_it is the pre
increment operator. This operation moves the object to
the next directory entry, if there is another entry.
Otherwise, the dir_it object compares equal to
the past the end iterator after the pre increment. The
pre increment operator returns the object itself.
-
Post Increment (operator++(int))
-
The post increment advances the dir_it to the
next entry and returns a proxy object which can be
dereferenced as if it were an object of type
dir_it. However, nothing else can be done with
this object. This method of advancing the iterator is
normally less efficient, so the pre increment
operator should be used if possible.
-
Equals Operator (operator==())
-
The equals operator returns true if two objects of
type dir_it either both indicate a
current directory entry or are both past
the end iterators. Because every dir_it turns into
a past the end iterator once all entries in the
directory have been seen, this can be used to test
whether there are any more entries. However, this
operator cannot determine whether a dir_it is
positioned on a specific directory entry (this
can be done by comparing the results of the
dereference operator).
-
Not Equal Operator (operator!=())
-
The not equal operator returns the exact negation of the
equals operator. Thus, this operator returns true
if one of the two iterators indicates a current
directory entry while the other iterator is a past the
end iterator.
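The copy, post increment and comparison rules described above can be
tried out without touching the file system. The following is a toy
sketch, not the library class: shared_pos_it and demo_shared_copy()
are hypothetical names, and the "directory" is just a list of names
held in memory. It assumes that copies share their position through a
reference-counted state object, which is one plausible way to obtain
the documented behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical toy, not dir_it: a single-pass iterator over an in-memory
// list of names whose copies share one read position.
class shared_pos_it {
    struct state {
        std::vector<std::string> names;
        std::size_t pos;
    };
    std::shared_ptr<state> s_;  // null for a default-constructed iterator

    bool at_end() const { return !s_ || s_->pos >= s_->names.size(); }

public:
    // Returned by the post increment; it can only be dereferenced.
    class proxy {
        std::string name_;
    public:
        explicit proxy(const std::string& name) : name_(name) {}
        const std::string& operator*() const { return name_; }
    };

    shared_pos_it() {}  // past-the-end iterator
    explicit shared_pos_it(const std::vector<std::string>& names)
        : s_(std::make_shared<state>(state{names, 0}))
    {}

    const std::string& operator*() const {
        assert(!at_end() && "a past-the-end iterator cannot be dereferenced");
        return s_->names[s_->pos];
    }
    shared_pos_it& operator++() {
        if (!at_end()) ++s_->pos;
        return *this;
    }
    proxy operator++(int) {
        proxy old(**this);  // remember the current name, then advance
        ++*this;
        return old;
    }

    // Equal if both are on some entry or both are past the end.
    friend bool operator==(const shared_pos_it& a, const shared_pos_it& b) {
        return a.at_end() == b.at_end();
    }
    friend bool operator!=(const shared_pos_it& a, const shared_pos_it& b) {
        return !(a == b);
    }
};

// Advances a copy and the original in turn and records what each one sees.
std::vector<std::string> demo_shared_copy() {
    shared_pos_it original(std::vector<std::string>{"a.txt", "b.txt", "c.txt"});
    shared_pos_it copy(original);

    std::vector<std::string> seen;
    seen.push_back(*original);  // "a.txt"
    ++copy;                     // advances the shared position
    seen.push_back(*original);  // "b.txt": the original moved as well
    seen.push_back(*copy++);    // "b.txt" from the proxy; both now on "c.txt"
    seen.push_back(*original);  // "c.txt"
    ++original;                 // both are now past the end
    seen.push_back(copy == shared_pos_it() ? "end" : "not end");
    return seen;
}
```

demo_shared_copy() returns "a.txt", "b.txt", "b.txt", "c.txt", "end".
Each entry is therefore seen only once, no matter which copy is
advanced, which is why a second pass needs a second iterator
constructed from the directory name.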
File Properties
Using the functions get() and set() it is
possible to access file properties. Here is an example
which prints the file sizes in addition to the name:
|
#include <iomanip>
#include <iostream>
#include <boost/directory.h>

int main(int ac, char *av[])
{
    if (ac == 2)
    {
        using namespace boost::filesystem;
        // Print the size, right-aligned in 10 columns, then the name.
        for (dir_it it(av[1]); it != dir_it(); ++it)
            std::cout << std::setw(10) << get<size>(it)
                      << " " << *it << "\n";
    }
    return 0;
}
|
Each property consists of two major components:
-
A struct which gives the name to the
property and which defines the type accessed
using the property. The type of the property is
defined using a typedef defining the type
value_type in the corresponding
struct. For the standard properties,
the corresponding structs are defined
in the namespace boost::filesystem.
-
Access functions which are just specializations
of the functions boost::filesystem::get()
and boost::filesystem::set(). Of course,
if the property can only be read or only be
written, only the corresponding access function is
defined.
Example Property
The size property used in the
above example might be defined as follows:
|
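namespace boost {
    namespace filesystem {
        // Property tag: names the property and fixes its type.
        struct size
        {
            typedef size_t value_type;
        };
        // Read access: specialization of get() for the size property.
        template <>
        size::value_type get<size>(dir_it const &it)
        {
            return ... /* environment specific code */
        }
    }
}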
|
The properties which are already provided by the
implementation normally access some data structure
internal to the dir_it objects to avoid
multiple system calls.
Details
-
Property Selection
-
The file property to be accessed is selected
using a template argument to the get() or
set() function. The template argument is
a type which defines value_type
as a nested type. The get() and set()
functions are specialized for the properties
provided by the system. By specializing additional
versions of these functions, the user may extend
the set of accessible properties.
-
Property Type
-
The type of a file property is determined from
a typedef called value_type in
the type selecting the property.
-
Reading a Property
-
To read a file property, a dir_it is
passed as argument to the template function
boost::filesystem::get(). The
template argument prop selecting the file
property to be accessed is explicitly specified.
The return type of the get()
function is prop::value_type.
-
Setting a Property
-
To set a file property, a dir_it and the
new value of the property are passed to the
template function boost::filesystem::set().
The template argument prop selecting the
file property to be accessed is explicitly specified.
The type of the second argument to the set()
function is prop::value_type const &.
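The selection mechanism described in this list can be sketched in
isolation. Everything in the block below is hypothetical: the
namespace sketch, the struct entry (an in-memory stand-in for a
dir_it) and demo_properties() are introduced only for illustration,
and taking the entry by non-const reference in set() is an
assumption. The shape follows the text above: a tag type with a
value_type, and get() and set() specialized per property.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch {  // hypothetical namespace, not boost::filesystem

// Stand-in for dir_it: one entry whose attributes are kept in memory.
struct entry {
    std::string name;
    std::size_t bytes;
    bool hidden;
};

// Property tags: each names a property and fixes its type via value_type.
struct size      { typedef std::size_t value_type; };
struct is_hidden { typedef bool value_type; };

// The primary templates are only declared; each supported property
// supplies its own specialization.
template <class Prop> typename Prop::value_type get(entry const& e);
template <class Prop> void set(entry& e, typename Prop::value_type const& v);

// size can only be read.
template <> inline size::value_type get<size>(entry const& e)
{
    return e.bytes;
}

// is_hidden can be read and written.
template <> inline is_hidden::value_type get<is_hidden>(entry const& e)
{
    return e.hidden;
}
template <> inline void set<is_hidden>(entry& e, is_hidden::value_type const& v)
{
    e.hidden = v;
}

}  // namespace sketch

// Sets the hidden flag, checks it, and returns the size read through get().
std::size_t demo_properties() {
    sketch::entry e = {"notes.txt", 42, false};
    sketch::set<sketch::is_hidden>(e, true);
    assert(sketch::get<sketch::is_hidden>(e));
    return sketch::get<sketch::size>(e);
}
```

In this sketch, a property without a matching specialization, for
example set() for size, compiles as a call but fails at link time
because the primary template is never defined. Adding a new property
means adding one tag struct and the specializations it supports.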
Standard Properties
The organization of files differs heavily between
different systems. As a result, the sets of file
properties defined on different systems vary. The
property interface is chosen such that it is
obvious how specific properties are accessed; only
the names and the exact types are still open. To
enhance portability, some common file properties are
always defined:
-
is_directory
-
A boolean read only property which can be used to
determine whether a directory entry is itself a
directory.
-
is_hidden
-
A boolean property indicating whether the file is
"hidden". By default, hidden files are not shown
to the user. However, with appropriate options,
these files may be shown anyway. On some systems,
there is a special flag for the files which indicates
that the file is hidden. On such systems this flag
is a read/write property. On other systems, e.g. on
POSIX systems, files whose names start with a dot (".") are
considered to be hidden. On such systems this flag
is a read only property.
-
size
-
A read only property of type size_t returning
the size in bytes of a file. Note that the size
returned is not necessarily identical to the number
of characters retrieved from an ifstream
created for this file: In text mode, some character
sequences are replaced by single characters during
reading. However, the number of characters in binary
mode should normally match the size of the file.
-
mtime
-
A property of type time_t returning
the last modification time of the file. It is normally
read only, but on some systems, e.g. POSIX, it can also
be written to set the modification time to an arbitrary value.
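On a POSIX system, the four common properties could be derived from a
single stat(2) call plus the leading-dot convention for hidden files.
The sketch below shows that idea only. The struct common_properties
and the function read_common_properties() are hypothetical helpers,
not part of the library, and nothing here claims to be how the
library obtains these values.

```cpp
#include <sys/stat.h>   // POSIX: stat(2)

#include <cassert>
#include <cstddef>
#include <ctime>
#include <string>

// Hypothetical helper type: the four common properties of one entry.
struct common_properties {
    bool is_directory;
    bool is_hidden;
    std::size_t size;
    std::time_t mtime;
};

// Fills out for the entry named entry inside the directory dirname.
// Returns false if stat(2) fails, e.g. because the entry does not exist.
bool read_common_properties(const std::string& dirname,
                            const std::string& entry,
                            common_properties& out)
{
    assert(!entry.empty());
    struct stat st;
    if (stat((dirname + "/" + entry).c_str(), &st) != 0)
        return false;
    out.is_directory = S_ISDIR(st.st_mode);
    out.is_hidden    = entry[0] == '.';  // POSIX convention: leading dot
    out.size         = static_cast<std::size_t>(st.st_size);
    out.mtime        = st.st_mtime;
    return true;
}
```

Gathering all values from one stat(2) result matches the remark that
the provided properties normally read from data kept inside the
iterator to avoid multiple system calls. Note that under the
leading-dot rule the entries "." and ".." count as hidden as well.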
POSIX Properties
WinNT Properties
Future Directions
In computer systems there are structures other than the
system's directories which can also be viewed as directories.
Obvious examples are archive files which store copies
of directory hierarchies, like ZIP or tar files. It might
be useful to extend the class dir_it to treat
such structures as directories too and to add
support for iterating over them.
A potential approach might be the definition of a CORBA
interface which is used internally by the class
dir_it to determine directory entries and to
figure out whether an entry is itself a directory. This
way it would be possible to even extend what is considered
to be a directory and have the same class iterate over
very different structures.
Whether this approach is reasonable will have to be
evaluated in the future. Personally, I think this is
an interesting direction and I hope that I will find
time to test this in the near future.
See Also
POSIX: opendir(3), readdir(3), closedir(3), stat(2)
Standard Template Library: Input Iterator Requirements
Dietmar Kühl <dietmar.kuehl@claas-solutions.de>
|