Fast Web log analyzer using probabilistic data structures.
Logswan is a fast Web log analyzer using probabilistic data structures. It is targeted at very large log files, typically API logs. It has constant memory usage regardless of the log file size, and takes approximately 4MB of RAM.
Unique visitor counting is performed using two HyperLogLog counters (one for IPv4, and another one for IPv6), providing a relative accuracy of 0.10%. String representations of IP addresses are used and preferred as they offer better precision.
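To illustrate how a HyperLogLog counter estimates unique visitors in fixed memory, here is a minimal Python sketch. It is not Logswan's implementation (Logswan is written in C); the class name `HyperLogLog`, the helper `count_unique`, the SHA-1 based hash, and the precision `p=14` are all assumptions chosen for the example. The typical relative error of HyperLogLog is about 1.04 / sqrt(m) for m registers, so a counter with 2^20 registers would give roughly 0.10%, which is consistent with the figure quoted above.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog counter (hypothetical, not Logswan's code)."""

    def __init__(self, p=14):
        self.p = p                      # number of index bits (assumed value)
        self.m = 1 << p                 # number of registers
        self.registers = bytearray(self.m)

    def add(self, item):
        # Derive a 64-bit hash from the string form of the item.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                    # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining 64 - p bits
        # Rank: position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)            # bias constant for m >= 128
        z = sum(2.0 ** -r for r in self.registers)
        estimate = alpha * m * m / z
        zeros = self.registers.count(0)
        # Small-range correction: fall back to linear counting.
        if estimate <= 2.5 * m and zeros:
            estimate = m * math.log(m / zeros)
        return int(round(estimate))


def count_unique(addresses):
    """Count IPv4 and IPv6 address strings with two separate counters."""
    ipv4, ipv6 = HyperLogLog(), HyperLogLog()
    for addr in addresses:
        # Assumption for this sketch: a colon marks an IPv6 address.
        (ipv6 if ":" in addr else ipv4).add(addr)
    return ipv4.count(), ipv6.count()
```

Memory use depends only on the number of registers, not on how many addresses are added, which is what makes this kind of counter suitable for very large log files.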
Project design goals include: speed, memory-usage efficiency, and keeping the code as simple as possible.
Logswan is opinionated software:
Logswan is written with security in mind and runs sandboxed on OpenBSD (using pledge). Experimental seccomp support is available for selected architectures and can be enabled by setting the ENABLE_SECCOMP variable to 1 when invoking CMake. It has also been extensively fuzzed using AFL and Honggfuzz.
Currently implemented features:
Logswan uses the CMake build system and requires Jansson and libmaxminddb libraries and header files.
mkdir build
cd build
cmake ..
make
Logswan has been successfully built and tested on OpenBSD, NetBSD, FreeBSD, macOS, and Linux with both Clang and GCC.
Logswan packages are available for:
Logswan looks for GeoIP2 databases in ${CMAKE_INSTALL_PREFIX}/share/dbip by default, which points to /usr/local/share/dbip.
A custom directory can be set using the GEOIP2DIR variable when invoking CMake:
cmake -DGEOIP2DIR=/var/db/dbip .
The free Creative Commons licensed DB-IP IP to Country Lite database can be downloaded here.
Alternatively, the GeoLite2 Country database from MaxMind can be downloaded free of charge here, but it requires accepting an EULA and is not freely licensed.
logswan [-ghv] [-d db] logfile
If logfile is a single dash (`-'), logswan reads from the standard input.
The options are as follows:
-d db Specify path to a GeoIP database.
-g Enable GeoIP lookups.
-h Display usage.
-v Display version.
Logswan outputs JSON data to stdout.
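Because the report is plain JSON on stdout, it can be consumed from a script. The following Python sketch builds a command line from the synopsis above and parses the output. The helper names `build_command` and `run_logswan` are hypothetical, the sketch assumes a `logswan` binary is available on the PATH, and it makes no assumption about which keys the report contains.

```python
import json
import subprocess


def build_command(logfile, geoip_db=None, binary="logswan"):
    """Build an argv list following: logswan [-ghv] [-d db] logfile."""
    argv = [binary]
    if geoip_db is not None:
        # Enable GeoIP lookups and point at a specific database file.
        argv += ["-g", "-d", geoip_db]
    argv.append(logfile)                # "-" means read from standard input
    return argv


def run_logswan(logfile, geoip_db=None, stdin=None):
    """Run logswan and return its JSON report as a Python object."""
    result = subprocess.run(
        build_command(logfile, geoip_db),
        stdin=stdin,                    # optional open file when logfile is "-"
        stdout=subprocess.PIPE,
        check=True,                     # raise if logswan exits with an error
    )
    return json.loads(result.stdout)


# Example usage (requires logswan to be installed):
#   report = run_logswan("access.log")
#   print(sorted(report))             # list the top-level keys of the report
#
#   with open("access.log", "rb") as f:
#       report = run_logswan("-", stdin=f)
```

Passing an open file as `stdin` lets logswan stream the log itself, so the script never loads the whole file into memory.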
Logswan is released under the BSD 2-Clause license.
Copyright (c) 2015-2023, Frederic Cambus
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS
BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.