A comparison of several host/file integrity checkers (scanners)
By Rainer Wichmann
(last update: Oct 13, 2004)
Caveat: The author of this study is also the author of one of these file
integrity scanners (samhain). This study may therefore be biased, because:
(a) the tests in this study are based on user feedback for samhain and
the author's personal opinion on what basic functionality a file
integrity scanner should provide, and
(b) a bug in samhain found during this study was fixed (see the remark on
samhain under Remarks on individual programs).
If you think some of the results presented here are incorrect or outdated,
you are welcome to point out corrections.
The lack of a trademark sign does not imply the non-existence
of a trademark.
Content
What is the focus of this study ?
Explanation of table rows
Table of results
Remarks on individual programs
Relative speed
Logging options
Centralized management: osiris and samhain
This study compares seven freely available (mostly open-source)
file/host integrity scanners (file integrity check
programs) with respect to the implementation of the core functionality,
i.e. questions like:
- Can the program check all files that you may want to check ?
- Can the program handle corner cases of the filesystem
(that may e.g. result from an intrusion, or from simple errors of users) ?
- Does the program warn about an incorrect configuration (which may
cause it to check not in the way you intended) ?
The results presented here are based on test runs, and sometimes
also on investigation of the source code.
All tests were performed on a Debian 3.0 Linux system.
In general, tests were performed only with console logging (stdout/stderr).
Thus, while some "features" of these programs
are mentioned that may be of interest for usability, the
focus of the study was on testing the scanner's functionality,
not on listing and/or comparing their features.
- Version
-
The version number of the file integrity scanner.
- Date
-
The release date of the file integrity scanner. For PGP
signed source code, this is the date of the PGP signature,
otherwise the date listed on the web site, or in the source.
- PGP signature
-
Is the distributed source code PGP-signed ? If there
is no signature, it may be possible to put a trojan into
the source code (this has
happened in the past with several
high-profile security-related programs)!
- Language
-
The programming language of the file integrity scanner.
- Required
-
Requirements (other than compiler or interpreter).
- Log Options
-
What channels are supported for logging ?
- DB sign/crypt
-
Does the scanner support signed or encrypted baseline databases ?
- Conf sign/crypt
-
Does the scanner support signed or encrypted configuration files ?
- Name Expansion
-
Does the scanner support expansion of file names (shell-style
globbing or regular expressions) in the configuration file ?
- Duplicate Path
-
Does the scanner check the configuration file for duplicate
entries of files/directories (possibly with a different
checking policy for the duplicate) ? Strict checking of the
configuration file can help to avoid user errors.
- PATH_MAX
-
Can the scanner handle a file whose path has the maximum
allowed length (4095 on Linux) ?
- Root Inode
-
Can the scanner handle the "/" directory inode ?
This is the file with the shortest possible path, and also the
only one with a "/" in its filename, so
it may expose programming bugs (and you do want to check that
inode).
- Non-printable
-
Can the scanner handle filenames with weird or non-printable
characters ? And if it can handle them internally, can it
report results in a useful way ?
Checked filenames were:
bash$ ls -l --quoting-style=c /
drwxr-xr-x 2 root root 4096 Feb 11 20:16 "\002\002\002\002"
As "\002" is non-printable, incorrect reporting
will result in a report about removal of the root
directory ("/"), if this file is removed ...
bash$ ls -l --quoting-style=c /opt
drwxr-xr-x 2 root root 1024 Feb 11 19:51 "this is_not_a_love_song\b\b\b\b\b\b\b\b\bwrong_filename"
As "\b" is backspace, incorrect reporting will result in a report for
the non-existent file "this is_not_a_wrong_filename"
- No User
-
Can the scanner handle files owned by a non-existing user
(UID with no entry in /etc/passwd) ?
- No Group
-
Can the scanner handle files owned by a non-existing group
(GID with no entry in /etc/group) ?
- Lock
-
Can the scanner handle a file if another process has acquired a
mandatory (kernel-enforced) lock on it (yes, Linux has
that kind of lock) ?
It is possible to open() such a file for reading, but the read()
itself will block, so the scanner will hang indefinitely,
unless precautions are taken.
On Linux, mandatory locking requires a special mount option, thus
cannot usually be enforced by unprivileged users.
- Race
-
File integrity scanners first lstat() a file to determine whether
it is a regular file, then open() it to read it for checksumming.
In between these two calls, a user with write access to the
directory may replace the file with a
named pipe. As a result, the open() call will block and the
scanner may hang indefinitely,
unless precautions are taken.
- /proc
-
Is the scanner able to scan the /proc directory ?
On Linux, at least some files in /proc are
writeable and can be used to configure the kernel
at runtime, so you may want to check these files. However,
files in /proc may be listed with zero
filesize, even if you can read plenty of data from them.
Almost all scanners "optimize"
by not checksumming zero-length files, which
is incorrect in the
Linux /proc filesystem. Additionally, some files
may block on an attempt to read from them.
- /dev
-
Does the scanner have problems with the /dev directory ?
- Crea/Del
-
Can the scanner report on missing (deleted) or newly created
files ?
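The precaution referred to under 'Race' can be illustrated with a short sketch. It is not taken from any of the tested scanners; the function name checksum_regular_file and the choice of SHA-256 are assumptions made for illustration. The idea is to open() the file without blocking and then check the file type with fstat() on the open descriptor, rather than trusting the earlier lstat():

```python
import hashlib
import os
import stat


def checksum_regular_file(path):
    """Return the SHA-256 hex digest of path, or None if it is not a regular file."""
    # O_NONBLOCK keeps open() from blocking if the path is now a FIFO;
    # O_NOFOLLOW refuses to follow a symlink swapped in after lstat().
    fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK | os.O_NOFOLLOW)
    try:
        # fstat() inspects the object that was actually opened, so the
        # type check cannot be invalidated by a later replacement.
        if not stat.S_ISREG(os.fstat(fd).st_mode):
            return None
        digest = hashlib.sha256()
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:
                break
            digest.update(chunk)
        return digest.hexdigest()
    finally:
        os.close(fd)
```

With this approach a named pipe that replaces the file is opened without hanging and then skipped, while a symlink causes os.open() to raise OSError. The sketch only illustrates the 'Race' case; it makes no claim about the 'Lock' test.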
| | AIDE | FCheck | Integrit | Nabou | Osiris | Samhain | Tripwire |
|---|---|---|---|---|---|---|---|
| Version | 0.10 | 2.07.59 | 3.02 | 2.4 | 4.0.5 | 1.8.4 | 2.3.1-2 |
| Date | Nov 30, 2003 | May 03, 2001 | Sep 08, 2002 | Aug 30, 2004 | Sep 27, 2004 | Mar 17, 2004 | Mar 04, 2001 |
| PGP Signature | YES | NO | YES | YES | YES | YES | NO |
| Language | C | Perl | C | Perl | C | C | C++ |
| Required | libmhash | md5sum (or md5) | | PARI/GP library + about 11 Perl modules | OpenSSL 0.9.6j or newer | GnuPG (only if signed config/database used) | |
| Log Options | stdout, stderr, file, file descriptor | stdout, syslog | stdout | stdout, email | central log server (email+file on server side) | stderr, email, file, pipe, syslog, RDBMS, central log server, prelude, external script, IPC message queue | stdout, file, email, syslog |
| DB sign/crypt | NO | NO | NO | sign | NO | sign | sign+crypt |
| Conf sign/crypt | NO | NO | NO | NO | NO | sign | sign+crypt |
| Name Expansion | regex | NO | NO | see remarks | regex | glob (shell-style) | NO |
| Duplicate Path | NO | NO | NO | NO | NO | Warns | Exits |
| PATH_MAX | OK | OK | NO | OK | NO | OK | OK |
| Root Inode | see remarks | NO | OK | NO | OK | OK | OK |
| Non-printable | NO | NO | NO | NO | NO | OK | OK |
| No User | OK | OK | OK | see remarks | OK | OK | OK |
| No Group | OK | OK | OK | see remarks | OK | OK | OK |
| Lock | OK | Hangs | Hangs | Hangs | Hangs | Times out | Hangs |
| Race | Hangs | Hangs | Hangs | Hangs | Hangs | OK | Hangs |
| /proc | NO | NO | Hangs | Hangs | NO | OK | NO |
| /dev | OK | OK | OK | OK | OK | OK | OK |
| Crea/Del | OK | OK | OK | OK | OK | OK | OK |
AIDE
-
Segfaults on syntax error in config file (directory without
policy).
-
When specifying the root directory, apparently '/.* R' does
not match '/'; '/$ R' matches, but
only if there is no other rule, so it's useless (i.e. can't
check the root directory inode). This is either a bug or my misunderstanding of
the regex syntax in the configuration file.
-
There is no tool to list the database (however, it is
human-readable, not binary).
-
AIDE is the only scanner in this study that uses mmap() rather than
read() to read a file. This is responsible for passing
the 'Lock' test (the kernel denies mmapping a mandatorily locked file).
-
Judging from comments in the source code, AIDE
tries to fix the 'Race' problem, but the solution does not work.
-
Omits checksum if file size is zero, which is incorrect for
Linux /proc files.
-
For deleted / added files, only the path is printed.
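Several scanners in this study skip the checksum when the reported file size is zero. A sketch of the alternative, i.e. reading until read() returns no more data regardless of the size reported by stat(), could look like this (the function name checksum_until_eof and the use of SHA-256 are assumptions for illustration, not AIDE's implementation):

```python
import hashlib


def checksum_until_eof(path, chunk_size=65536):
    """Hash whatever read() returns, ignoring the size reported by stat()."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:  # an empty read, not st_size == 0, marks the end
                break
            digest.update(chunk)
    return digest.hexdigest()
```

On Linux this could be called as checksum_until_eof("/proc/modules"). Note that the sketch does nothing about files in /proc that block on read; that would need an additional timeout.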
FCheck
-
Not possible to define different policies (e.g. ignore size
change for logfiles).
-
Omits checksum if file size is zero, which is incorrect for
Linux /proc files.
-
Filenames in baseline database are not properly escaped, thus
it is not possible to check files with non-printable characters.
Some of them may even corrupt the baseline database (e.g.
filenames with newlines).
-
The config file syntax is not checked. Duplicate entries
are scanned twice, and mis-typed directives (e.g. 'Directoy ='
instead of 'Directory =') are silently ignored.
-
No tool to dump/read the baseline database, which is barely
human-readable.
-
If a directory is scanned recursively, the top level directory
inode itself is never included. Thus it is impossible to check
the root inode.
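The escaping problem noted for FCheck (and tested under 'Non-printable') can be avoided by quoting file names before they are written to a report or to a line-oriented database. A minimal sketch follows, assuming a hypothetical helper escape_path that uses three-digit octal escapes (similar to, but not identical with, the output of ls --quoting-style=c, which prints backspace as "\b"):

```python
def escape_path(path):
    """Return path in double quotes with control characters, backslash and quote escaped."""
    out = []
    for ch in path:
        code = ord(ch)
        if ch in ("\\", '"'):
            # Escape the escape character and the delimiter themselves.
            out.append("\\" + ch)
        elif code < 0x20 or code == 0x7F:
            # ASCII control characters become a fixed-width octal escape.
            out.append("\\%03o" % code)
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'
```

With such quoting, a file name containing a newline occupies exactly one line in the database, and a directory named with four "\002" characters is no longer reported as "/".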
Integrit
-
You can have only one root directory in the config file, which
makes it complicated to scan (only) some directories scattered
over the file system. You need to run one integrit instance
per root with different (per-root) configuration files.
-
Internally, all path names start with a double '/'.
This may cause the observed ENAMETOOLONG error
on valid long paths (?).
-
Usage is simple and straightforward. According to the
documentation, the lack of
features is intentional to simplify usage (which certainly is
a valid argument, as long as one does not need advanced features).
-
Judging from comments in the source code, Integrit
tries to fix the 'Race' problem, but the solution does not work.
- Comment by Ed L Cashin (integrit developer):
While there are integrit users who agree with you, I maintain that
running integrit three times using three configuration files is
cleaner and easier than running integrit once with one more cluttered
configuration file. You can even take advantage of parallel I/O if
the roots are on different devices.
Nabou
-
The script does not check whether a file is a socket, so
it tries to checksum sockets (and hangs).
-
The config file syntax is somewhat Apache-like, and
easy to understand. I liked it
best of all tested programs.
-
Filename expansion (globbing, i.e. shell-style) is (only)
supported for excluded files.
-
Nabou only prints user/group names. If there is no user (group) for
a UID (GID), it will print whitespace
rather than the numeric UID (GID).
-
With 'use_ls', nabou prints an 'ls -l'-style line for each matching file,
but the file type is incorrect (e.g. devices are listed as
regular files).
-
Dumping the database is possible (comma-separated format).
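Falling back to the numeric ID when no name exists, as tested under 'No User' and 'No Group', is straightforward. A minimal sketch (not nabou's code, which is Perl; the helper names owner_name and group_name are made up for illustration):

```python
import grp
import pwd


def owner_name(uid):
    """Return the user name for uid, or the numeric UID as a string if none exists."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:  # no passwd entry for this UID
        return str(uid)


def group_name(gid):
    """Return the group name for gid, or the numeric GID as a string if none exists."""
    try:
        return grp.getgrgid(gid).gr_name
    except KeyError:  # no group entry for this GID
        return str(gid)
```

A report line built from these helpers always identifies the owner, whether or not the account still exists.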
Osiris
-
Files with a filename length of NAME_MAX are completely
ignored (no database or log entries, apart from a possibly
modified timestamp of the parent directory).
-
Exclusion of subdirectories (option NoEntry) apparently
does not work for the root directory (of course that could also
be a user error on my side).
-
Omits checksum if file size is zero, which is incorrect for
Linux /proc files.
-
For deleted / added files, only the path is logged.
-
The management command-line interface (CLI) has no support for
non-printable chars (although 'space' is accepted).
This not only hides
the true path for records, but may also make the record details
unavailable (e.g. if a real path gets duplicated by
'path + non-printable chars'), because the
baseline database is binary (Berkeley DB) and only
readable via the CLI (but see below).
-
There is no documented way to dump the baseline database to
a human-readable format (format is Berkeley DB). Because details
for new/missing files are not in the log, one has to look up
each record individually with the CLI, which is cumbersome.
More precisely, there is a tool printdb in
src/tools/, which is not compiled by default
(cd src/tools/ && make), and does not work as-is
(edit printdb.c, remove code referencing "db2" in main(),
recompile,
and use 'printdb -a <database>' - not a big thing if you
know C ...).
Samhain
-
Suffers a bit from feature bloat, which probably causes a steeper
learning curve than for other programs in this study.
-
In the 'Lock' test, samhain will time out.
-
In the interest of full disclosure:
version 1.8.3 had a bug with formatting of long reports that
caused samhain to fail on tests with long paths. Version 1.8.4
was fixed as a result of these tests.
Tripwire
-
The makefile cannot recognize that 'make' is GNU make, it insists
on 'gmake'. Made a symlink gmake->make to fix the problem.
-
Tripwire provides no details
about modified/added/removed files, only path names,
unless one uses twprint --report-level 4,
which is pretty verbose.
-
Omits checksum if file size is zero, which is incorrect for
Linux /proc files.
-
Apparently the open-source version of Tripwire has failed to attract
any developer community. While it appears
to be more solid than most other open-source integrity scanners
in this study, I found the source code poorly commented and not
particularly lucid.
Tests were performed under Debian 3.0 by checking two datasets, one
of 206 MB, the other of 1.1 GB. Absolute times depend on the
hardware - your mileage may vary.
All integrity checkers showed non-linear behaviour: the larger dataset
was checked at a lower speed (MB / minute) than the smaller one.
Integrity checkers written in C
(AIDE, Integrit, Osiris, and Samhain) were I/O-limited (i.e. speed
was limited by the disk I/O), and
all were about equally fast (about 2 minutes for the small dataset,
18 minutes for the large one), with no significant
difference between database initialization and checking.
Tripwire (written in C++) was slower
(4 minutes / 26 minutes)
than AIDE, Integrit, and Samhain, again with no significant
difference between database initialization and checking.
For the smaller dataset, the two Perl scripts (FCheck and Nabou)
were faster than Tripwire, but slower than the C programs.
For the larger dataset, the performance of the Perl scripts
was much worse:
Nabou took 85 minutes to initialize the database,
and 41 minutes for checking.
FCheck (also Perl) needed 40 minutes for initializing, and 54 minutes
for checking.
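The non-linear behaviour can be made concrete by converting the approximate times quoted above into throughput. The short calculation below is only a sketch: it assumes that the sizes are megabytes/gigabytes and that 1.1 GB can be taken as 1100 MB, and it uses the rounded timings from the text, so the results are rough.

```python
# Approximate figures from the text; 1.1 GB is taken as 1100 MB (an assumption).
SMALL_MB, LARGE_MB = 206, 1100

# (minutes for the small dataset, minutes for the large dataset)
timings = {
    "C scanners (approx.)": (2, 18),
    "Tripwire": (4, 26),
}


def throughput(size_mb, minutes):
    """Return the scan rate in MB per minute."""
    return size_mb / minutes


for name, (small_min, large_min) in timings.items():
    small = throughput(SMALL_MB, small_min)
    large = throughput(LARGE_MB, large_min)
    print(f"{name}: {small:.0f} MB/min (small) vs. {large:.0f} MB/min (large)")
```

Under these assumptions the C scanners drop from roughly 103 to roughly 61 MB per minute between the small and the large dataset.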
This is an overview of the logging options provided by
different scanners. This information is mostly taken from the
documentation, and usually not verified.
AIDE: reports can be printed to stdout, stderr,
plaintext file, or to an open file descriptor.
Any combination of these can be used, but the verbosity level cannot
be set individually.
FCheck: reports are printed to stdout, and optionally
logged to syslog (via the logger standard utility).
Integrit: reports are printed to stdout.
Nabou: nabou prints reports to stdout, or sends
them via email.
Osiris: osiris clients only send scan results to the
central server, which
in turn logs reports to plaintext files and can send
emails.
Samhain: samhain can log to stdout, plaintext file,
and syslog. Also supported are: sending reports by email,
sending reports to a central server, inserting reports
into an RDBMS (MySQL, PostgreSQL, Oracle, or unixODBC), sending reports to
a Prelude IDS system, writing reports to a
named pipe, calling a user-defined external application to
process reports (e.g. to send an SMS to a mobile phone), and providing
reports via an IPC message queue.
For each supported logging facility, the level of logging can be
configured individually. Any combination of facilities can be used in
parallel.
Tripwire: tripwire prints reports on stdout, and
stores them in binary files. Optionally it can send reports
by email.
It can also log to
syslog (in a very terse way, where only the number of violations
is logged).
Osiris and samhain are special insofar as they are the only
host integrity scanners in this study that
provide built-in support for centralized logging and management.
Both systems are able to collect reports/data from clients on a central
server, and to store baseline databases and client configurations on
the central server. Configuration changes and updates of the baseline
database can be performed centrally rather than on individual hosts
monitored by the system.
General design differences: push vs. pull
Centralized logging and management requires a client/server system
where at least one side has to listen on the network for connections,
and thus is potentially vulnerable to remote attacks.
For osiris, scan requests are pushed from the central server to the
individual scanner clients. Thus the client, which needs root
privileges to open and checksum privileged files, also listens
on the network.
On Unix/Linux, this problem is mitigated by using
privilege separation (similar to OpenSSH): there is a
privileged process that only handles actions that require root
privileges, and an unprivileged (sub-)process that does most
of the work (including network connections).
However, on MS Windows, privilege separation is not supported.
Samhain works the other way: clients pull the baseline database from
the server, and return reports. I.e. here the server has an open
port, and the server does not need root privileges. Actually the
samhain server (called 'yule') will only run as an unprivileged
user (it drops root privileges if started with them), and can be
chrooted.
Both samhain and osiris use encrypted client/server connections.
With osiris, (only) the server must authenticate to the client.
However, similar to samhain, osiris clients negotiate a shared
secret with the server that is kept in memory after startup, thus
attempts to replace the client can be detected once it has started.
Samhain
uses mutual authentication (where the client's credentials are located
within the client executable). Upon successful authentication, a
shared secret is negotiated that is kept in memory.
With osiris, clients send back snapshots of the file system, which are
compared to the baseline on the server side, and stored in the same
location as the baseline database. Thus the server (which is potentially
vulnerable to malicious clients) needs write access to the directory
where baseline data is stored.
Samhain clients only send back reports on filesystem modifications.
These reports can be used to update the baseline database on the server
via the central management console. The server only needs read access
to the baseline data.
Additional features
In addition to file integrity checking, samhain can optionally
check for kernel rootkits, search the filesystem for SUID binaries,
check mount points (and their mount options), and
watch login/logout events.
Osiris can report if (and which) users/groups have been added to
/etc/passwd and/or /etc/group. Also, it can
report on new kernel modules loaded (in a limited way, you can do that
with samhain by monitoring the checksum of /proc/modules).
Samhain offers a large choice of different logging facilities
(both on the client as well as on the server side) that
can optionally be used simultaneously. Osiris clients only report
to the central server, which in turn logs reports to files and
optionally can send email notifications.
Both samhain and osiris support a central management interface.
In the case of osiris, this is a command-line interface (CLI) that
is part of the osiris package. For samhain, the management interface
is a PHP web-based interface that is available as a separate
package (beltane).
Osiris supports MS Windows natively, while samhain requires a POSIX
emulation (e.g. Cygwin).