Access Log Analytics
Tempesta FW uses an mmap()ed memory area to deliver client access logs to
user space in a zero-copy fashion. tfw_logger is a user-space daemon that
spawns one worker thread per CPU core and gathers access log records from
the mmap()ed area with no locks and no copying. The access log records
received from the kernel are grouped into batches and sent to the ClickHouse
database. Thanks to batching, ClickHouse can absorb millions of records
per second on a modest hardware node.
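The batching idea can be illustrated with a short Python sketch. It is not tfw_logger's implementation; the `batched` helper is a hypothetical name, and `max_events` simply mirrors the configuration option described later on this page.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(records: Iterable[T], max_events: int) -> Iterator[List[T]]:
    """Yield lists of at most max_events records, preserving order."""
    if max_events <= 0:
        raise ValueError("max_events must be positive")
    batch: List[T] = []
    for record in records:
        batch.append(record)
        if len(batch) == max_events:
            yield batch
            batch = []
    if batch:  # flush the final, possibly short, batch
        yield batch


# 2500 records with max_events=1000 become three inserts instead of 2500.
sizes = [len(b) for b in batched(range(2500), max_events=1000)]
print(sizes)  # [1000, 1000, 500]
```

Each yielded list would correspond to one insert into the database, which is why a larger batch size means fewer round trips.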
ClickHouse is a powerful column-oriented DBMS for online analytical processing (OLAP). We have chosen ClickHouse because
- it can ingest an enormous number of batched records per second
- it provides a powerful SQL-based query language
- it’s open source
Advanced ClickHouse analytic queries over the stored data facilitate web performance analytics and response to security incidents (e.g. application-layer DDoS attacks).
Architecture
HTTP Request → Tempesta FW → mmap buffer → tfw_logger → ClickHouse
↓
/dev/tempesta_mmap_log
↓
[Worker Thread 1] [Worker Thread 2] ... [Worker Thread N]
↓
ClickHouse Database
Always keep your ClickHouse database separate from the edge servers that may come under a DDoS attack – this ensures you can analyze the incident in near real time.
Configuration
Tempesta FW Configuration
For the current development version 0.9 add to your tempesta_fw.conf:
access_log mmap logger_config=/path/to/tfw_logger.json;
For the stable version 0.8 use:
access_log mmap mmap_host=localhost mmap_log=/var/log/tempesta_access.log;
Do not use dmesg for access_log! This can cause the kernel to hang under heavy load and should be
used only for debugging purposes!
When using tfw_logger, you can configure the size of the per-CPU memory-mapped buffers
used for storing access log events before they are transmitted to external storage:
mmap_log_buffer_size <SIZE>;
SIZE specifies the buffer size in bytes for each CPU. The value must be a power of 2 and
a multiple of 4KB (page size). The allowed range is from 4KB to 128MB.
Default: 1M.
Examples:
mmap_log_buffer_size 4M;
mmap_log_buffer_size 512K;
mmap_log_buffer_size 16M;
Larger buffer sizes can improve performance under high load by reducing the frequency of buffer flushes to external storage, but will consume more memory. The optimal size depends on your traffic patterns and available system memory.
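The constraints on SIZE can be checked with a few lines of Python. This is only a sketch of the rules stated above (power of 2, multiple of 4KB, between 4KB and 128MB). The helper names are hypothetical, and the assumption that the K and M suffixes mean multiples of 1024 is mine, not taken from the Tempesta FW parser.

```python
PAGE_SIZE = 4 * 1024          # 4KB, the page size named above
MIN_SIZE = 4 * 1024           # 4KB lower bound
MAX_SIZE = 128 * 1024 * 1024  # 128MB upper bound

# Assumption: K and M are binary multipliers.
_SUFFIXES = {"K": 1024, "M": 1024 ** 2}


def parse_size(text: str) -> int:
    """Convert '512K', '4M' or a plain byte count to bytes."""
    text = text.strip().upper()
    if text and text[-1] in _SUFFIXES:
        return int(text[:-1]) * _SUFFIXES[text[-1]]
    return int(text)


def is_valid_buffer_size(size: int) -> bool:
    """Apply the three documented rules to a size in bytes."""
    power_of_two = size > 0 and (size & (size - 1)) == 0
    return power_of_two and size % PAGE_SIZE == 0 and MIN_SIZE <= size <= MAX_SIZE


for value in ("4M", "512K", "16M", "3M", "256M"):
    print(value, is_valid_buffer_size(parse_size(value)))
```

With these rules, 3M fails because it is not a power of 2, and 256M fails because it exceeds the 128MB limit.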
tfw_logger Configuration
0.9 (current)
Create a separate JSON configuration file (e.g., /etc/tempesta/tfw_logger.json):
{
  "log_path": "/var/log/tempesta/tfw_logger.log",
  "access_log": {
    "plugin_path": "/opt/tempesta/access_log.so",
    "host": "localhost",
    "port": 9000,
    "user": "tempesta_user",
    "password": "secure_password",
    "db_name": "default",
    "table_name": "access_log",
    "max_events": 10000
  }
}
0.8 (stable)
In 0.8, tfw_logger is configured through attributes of the access_log
option in the main Tempesta FW configuration file, tempesta_fw.conf:
access_log mmap mmap_host=localhost mmap_user=tempesta_user mmap_password=secure_password mmap_log=/var/log/tempesta_access.log;
Configuration Options
| Option | Description | Default | Example |
|---|---|---|---|
| log_path | Path to tfw_logger log file | /var/log/tempesta/tfw_logger.log | |
| plugin_path | Path to access logger plugin | – | /opt/tempesta/access_log.so |
| access_log.host | ClickHouse server hostname | localhost | clickhouse.example.com |
| access_log.port | ClickHouse native protocol port | 9000 | 9000 |
| access_log.db_name | ClickHouse database name | default | custom_default |
| access_log.table_name | ClickHouse table name | access_log | custom_access_log |
| access_log.user | ClickHouse username (optional) | default | tempesta_user |
| access_log.password | ClickHouse password (optional) | – | secure_password |
| access_log.max_events | Batch size for inserts | 1000 | 500 |
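How the documented defaults combine with a partial configuration file can be sketched in Python. `DEFAULTS` and `load_config` are hypothetical names used for illustration only; tfw_logger's own parser is not shown on this page and may behave differently.

```python
import json

# Defaults copied from the table above; plugin_path and password have none.
DEFAULTS = {
    "log_path": "/var/log/tempesta/tfw_logger.log",
    "access_log": {
        "host": "localhost",
        "port": 9000,
        "db_name": "default",
        "table_name": "access_log",
        "user": "default",
        "max_events": 1000,
    },
}


def load_config(text: str) -> dict:
    """Parse JSON text and fill missing options with the documented defaults."""
    user = json.loads(text)
    merged = {"log_path": user.get("log_path", DEFAULTS["log_path"])}
    merged["access_log"] = {**DEFAULTS["access_log"], **user.get("access_log", {})}
    return merged


sample = '{"access_log": {"host": "clickhouse.example.com", "max_events": 500}}'
config = load_config(sample)
print(config["access_log"]["host"])        # clickhouse.example.com
print(config["access_log"]["port"])        # 9000
print(config["access_log"]["max_events"])  # 500
```

Note that Python's `json` module is strict: a trailing comma after the last key makes `json.loads` raise an error.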
Buffer Size Guidelines
- Small deployments: 4MB (4194304)
- Medium traffic: 16MB (16777216)
- High traffic: 64MB+ (67108864)
- Enterprise: 256MB+ (268435456)
Buffer size must be a multiple of the page size and ≥ 4096 bytes.
Access Log Schema
tfw_logger creates the following ClickHouse table structure:
CREATE TABLE IF NOT EXISTS access_log.access_log (
timestamp DateTime64(3, 'UTC'),
address IPv6,
method UInt8,
version UInt8,
status UInt16,
response_content_length UInt64,
response_time UInt32,
vhost String,
uri String,
referer String,
user_agent String,
tft UInt64,
tfh UInt64,
dropped_events UInt64
) ENGINE = MergeTree()
ORDER BY timestamp;
Field Descriptions
| Field | Type | Description |
|---|---|---|
| timestamp | DateTime64(3) | Request timestamp with millisecond precision |
| address | IPv6 | Client IP address (IPv4 mapped to IPv6) |
| method | UInt8 | HTTP method as a numeric code (e.g. GET=3, POST=10; see the list below) |
| version | UInt8 | HTTP version as a numeric code (see the list below) |
| status | UInt16 | HTTP response status code |
| response_content_length | UInt64 | Response content length in bytes |
| response_time | UInt32 | Response time in milliseconds |
| vhost | String | Host header value |
| uri | String | Request URI path and query |
| referer | String | Referer header |
| user_agent | String | User-Agent header |
| tft | UInt64 | TF TLS hash |
| tfh | UInt64 | TF HTTP hash |
| dropped_events | UInt64 | Number of dropped events (monitoring) |
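Because the client address column is IPv6 with IPv4 clients stored in mapped form, query results may need converting back for display. The sketch below uses Python's standard `ipaddress` module and assumes the conventional `::ffff:a.b.c.d` mapping; the function names are hypothetical.

```python
import ipaddress


def to_ipv6_mapped(ip: str) -> ipaddress.IPv6Address:
    """Return the IPv6 form of an address, mapping IPv4 into ::ffff:0:0/96."""
    addr = ipaddress.ip_address(ip)
    if isinstance(addr, ipaddress.IPv4Address):
        # Assumption: the standard IPv4-mapped layout, 0xffff followed by the IPv4 bits.
        return ipaddress.IPv6Address((0xFFFF << 32) | int(addr))
    return addr


def display_form(addr: ipaddress.IPv6Address) -> str:
    """Show IPv4-mapped addresses in dotted-quad form, others unchanged."""
    mapped = addr.ipv4_mapped  # None for a native IPv6 address
    return str(mapped) if mapped is not None else str(addr)


print(display_form(to_ipv6_mapped("192.0.2.1")))    # 192.0.2.1
print(display_form(to_ipv6_mapped("2001:db8::1")))  # 2001:db8::1
```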
Field method is a numerical value (see tfw_http_meth_t in
http.h):
1: COPY
2: DELETE
3: GET
4: HEAD
5: LOCK
6: MKCOL
7: MOVE
8: OPTIONS
9: PATCH
10: POST
11: PROPFIND
12: PROPPATCH
13: PUT
14: TRACE
15: UNLOCK
16: PURGE
17: UNKNOWN
Field version is also a numerical value:
0: INVALID
1: HTTP 0.9
2: HTTP 1.0
3: HTTP 1.1
4: HTTP 2
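When post-processing query results outside ClickHouse, the numeric codes can be turned back into names with two lookup tables. The mappings below are copied from the lists above; `describe` is a hypothetical helper.

```python
# Codes as listed above for the method and version fields.
HTTP_METHODS = {
    1: "COPY", 2: "DELETE", 3: "GET", 4: "HEAD", 5: "LOCK", 6: "MKCOL",
    7: "MOVE", 8: "OPTIONS", 9: "PATCH", 10: "POST", 11: "PROPFIND",
    12: "PROPPATCH", 13: "PUT", 14: "TRACE", 15: "UNLOCK", 16: "PURGE",
    17: "UNKNOWN",
}
HTTP_VERSIONS = {
    0: "INVALID", 1: "HTTP 0.9", 2: "HTTP 1.0", 3: "HTTP 1.1", 4: "HTTP 2",
}


def describe(method: int, version: int) -> str:
    """Render the two numeric fields as readable text."""
    method_name = HTTP_METHODS.get(method, f"method#{method}")
    version_name = HTTP_VERSIONS.get(version, f"version#{version}")
    return f"{method_name} {version_name}"


print(describe(3, 3))   # GET HTTP 1.1
print(describe(10, 4))  # POST HTTP 2
```

Codes that are not in the lists fall back to a placeholder such as `method#42` rather than raising an error.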
Installation and Usage
1. Build tfw_logger
cd tempesta/logger
make build
2. Create Configuration (0.9 only)
# Generate default configuration
./tfw_logger --generate --config /etc/tempesta/tfw_logger.json
# Edit configuration as needed
sudo nano /etc/tempesta/tfw_logger.json
3. Setup ClickHouse
-- Create database and user
CREATE DATABASE access_log;
CREATE USER tempesta_user IDENTIFIED BY 'secure_password';
GRANT ALL ON access_log.* TO tempesta_user;
4. Configure Tempesta FW
Add to tempesta_fw.conf for 0.9:
access_log mmap logger_config=/etc/tempesta/tfw_logger.json;
or for 0.8:
access_log mmap mmap_host=localhost mmap_log=/var/log/tempesta_access.log;
5. Start Services
# Start Tempesta FW (creates mmap device)
sudo ./scripts/tempesta.sh --start
# tfw_logger will be started automatically by tempesta.sh
# Or start manually:
sudo ./logger/tfw_logger --config /etc/tempesta/tfw_logger.json
Command Line Interface (for 0.9 only)
# Show help
./tfw_logger --help
# Start with specific configuration
./tfw_logger --config /etc/tempesta/tfw_logger.json
# Override configuration options
./tfw_logger --config config.json --host clickhouse.example.com --port 9001 --table custom_access_log
# Test configuration
./tfw_logger --config config.json --help
CLI Options
| Option | Description |
|---|---|
| --help, -h | Show help message |
| --generate, -g | Generate default configuration file |
| --config, -c PATH | Path to JSON configuration file |
| --host, -H HOST | ClickHouse hostname (override) |
| --port, -P PORT | ClickHouse port (override) |
| --database, -d DATABASE | ClickHouse database name (override) |
| --table, -t TABLE | ClickHouse table name (override) |
| --user, -u USER | ClickHouse username (override) |
| --password, -p PASS | ClickHouse password (override) |
| --log-path, -l PATH | Log file path (override) |
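The "(override)" options take precedence over values from the JSON file. The Python sketch below illustrates that precedence only; it is not tfw_logger's argument parser. The mapping from option names to configuration keys is inferred from the descriptions in the two tables on this page, and `apply_overrides` is a hypothetical helper.

```python
import copy

# Assumed mapping from CLI option name to the configuration key it overrides.
OVERRIDE_TARGETS = {
    "host": ("access_log", "host"),
    "port": ("access_log", "port"),
    "database": ("access_log", "db_name"),
    "table": ("access_log", "table_name"),
    "user": ("access_log", "user"),
    "password": ("access_log", "password"),
    "log_path": ("log_path",),
}


def apply_overrides(config: dict, overrides: dict) -> dict:
    """Return a copy of config with every non-None override applied."""
    result = copy.deepcopy(config)
    for name, value in overrides.items():
        if value is None:  # option not given on the command line
            continue
        path = OVERRIDE_TARGETS[name]
        node = result
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value
    return result


file_config = {"access_log": {"host": "localhost", "port": 9000, "table_name": "access_log"}}
cli = {"host": "clickhouse.example.com", "port": 9001, "table": "custom_access_log", "user": None}
print(apply_overrides(file_config, cli)["access_log"])
```

Options that are not passed (represented here as `None`) leave the file values untouched, and the input dictionary is never modified.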
Performance Tuning
CPU Affinity
tfw_logger automatically sets CPU affinity for worker threads:
- Each worker thread is bound to a specific CPU core
- Number of threads automatically matches available CPU cores (respects affinity/cgroups)
- Improves cache locality and reduces context switching
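To see how many CPUs a process may actually run on, which can be fewer than the number installed when an affinity mask or cpuset applies, a quick Python check is shown below. It only illustrates the notion of "available CPU cores"; how tfw_logger itself counts cores is not described here, and `usable_cpus` is a hypothetical helper.

```python
import os


def usable_cpus() -> int:
    """Count CPUs the current process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: honours the affinity mask (e.g. taskset, cpusets).
        return len(os.sched_getaffinity(0))
    # Other platforms: fall back to the number of installed CPUs.
    return os.cpu_count() or 1


print(f"worker threads that would be started: {usable_cpus()}")
```

Running the script under `taskset -c 0,1` on Linux should report 2, regardless of how many cores the machine has.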
Buffer Sizing
Larger buffers reduce syscall overhead but increase memory usage:
| Traffic Level | Buffer Size | Memory Usage |
|---|---|---|
| Low (< 1K RPS) | 4MB | ~4MB per worker |
| Medium (1K-10K RPS) | 16MB | ~16MB per worker |
| High (10K-100K RPS) | 64MB | ~64MB per worker |
| Enterprise (100K+ RPS) | 256MB+ | ~256MB+ per worker |
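Since buffers are allocated per worker, total memory grows with the number of workers. The following Python sketch turns the table into a lookup and multiplies by the worker count. The tier boundaries are taken from the table, with exact boundary values assigned to the higher tier as an assumption; the function names are hypothetical.

```python
# Tiers copied from the table above: (upper RPS bound, buffer size in MB).
BUFFER_TIERS = [
    (1_000, 4),
    (10_000, 16),
    (100_000, 64),
]
ENTERPRISE_MB = 256  # 100K+ RPS


def suggested_buffer_mb(rps: int) -> int:
    """Pick the per-worker buffer size for a given request rate."""
    for upper_bound, size_mb in BUFFER_TIERS:
        if rps < upper_bound:
            return size_mb
    return ENTERPRISE_MB


def total_buffer_mb(rps: int, workers: int) -> int:
    """Approximate memory used by all per-worker buffers together."""
    return suggested_buffer_mb(rps) * workers


print(suggested_buffer_mb(5_000))         # 16
print(total_buffer_mb(5_000, workers=8))  # 128
```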
ClickHouse Optimization
-- Optimize table for high-frequency inserts
ALTER TABLE access_log.access_log
MODIFY SETTING merge_with_ttl_timeout = 3600;
-- Create materialized views for common queries.
-- SummingMergeTree sums numeric columns on merge, so store the response time
-- total and derive the average at query time as
-- sum(total_response_time) / sum(requests).
CREATE MATERIALIZED VIEW access_log.hourly_stats
ENGINE = SummingMergeTree()
ORDER BY (hour, status)
AS SELECT
toStartOfHour(timestamp) AS hour,
status,
count() AS requests,
sum(response_time) AS total_response_time
FROM access_log.access_log
GROUP BY hour, status;
Monitoring
Health Checks
# Check if tfw_logger is running
ps aux | grep tfw_logger
# Check log output
tail -f /var/log/tempesta/tfw_logger.log
# Verify ClickHouse connectivity
clickhouse-client --query "SELECT count() FROM access_log.access_log"
Metrics
Monitor these key metrics:
-- Request rate
SELECT
toStartOfMinute(timestamp) as minute,
count() as requests_per_minute
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 1 HOUR
GROUP BY minute
ORDER BY minute;
-- Error rates
SELECT
status,
count() as count,
count() * 100.0 / sum(count()) OVER () as percentage
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 1 HOUR
GROUP BY status
ORDER BY count DESC;
-- Dropped events (buffer overruns)
SELECT max(dropped_events) as max_dropped
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 1 HOUR;
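The percentage column in the error-rate query divides each status count by the total over all statuses. The same arithmetic in plain Python, for checking results by hand, looks roughly like this (`status_shares` is a hypothetical helper and the input is made-up sample data):

```python
from collections import Counter
from typing import Dict, Iterable


def status_shares(statuses: Iterable[int]) -> Dict[int, float]:
    """Map each status code to its share of all requests, in percent."""
    counts = Counter(statuses)
    total = sum(counts.values())
    # most_common() orders by count, like ORDER BY count DESC in the query.
    return {status: count * 100.0 / total for status, count in counts.most_common()}


sample = [200, 200, 200, 500]  # invented sample, not real log data
print(status_shares(sample))   # {200: 75.0, 500: 25.0}
```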
Troubleshooting
Common Issues
tfw_logger won’t start
# Check if Tempesta FW is running
sudo ./scripts/tempesta.sh --status
# Verify mmap device exists
ls -la /dev/tempesta_mmap_log
# Check permissions
sudo chmod 666 /dev/tempesta_mmap_log # Temporary fix
Permission denied on mmap device
# Run tfw_logger with appropriate permissions
sudo ./tfw_logger --config config.json
# Or fix device permissions permanently
sudo chown tempesta:tempesta /dev/tempesta_mmap_log
ClickHouse connection failed
# Test ClickHouse connectivity
clickhouse-client --host localhost --port 9000 --query "SELECT 1"
# Check user permissions
clickhouse-client --query "SHOW GRANTS FOR tempesta_user"
High memory usage
# Reduce the batch size in the tfw_logger configuration
{
  "access_log": {
    "max_events": 500
  }
}
Log Analysis
Common log patterns:
# Successful startup
grep "Starting Tempesta FW Logger" /var/log/tempesta/tfw_logger.log
# Worker thread info
grep "worker threads started" /var/log/tempesta/tfw_logger.log
# ClickHouse connectivity
grep "ClickHouse" /var/log/tempesta/tfw_logger.log
# Error patterns
grep -i error /var/log/tempesta/tfw_logger.log
Integration Examples
Basic Analytics Dashboard
-- Top pages by requests
SELECT
uri,
count() as requests,
avg(response_time) as avg_response_time_ms
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 1 DAY
GROUP BY uri
ORDER BY requests DESC
LIMIT 10;
-- Status code distribution
SELECT
status,
count() as requests,
count() * 100.0 / sum(count()) OVER () as percentage
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 1 DAY
GROUP BY status;
-- Traffic by hour
SELECT
toStartOfHour(timestamp) as hour,
count() as requests,
uniq(address) as unique_visitors
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 1 DAY
GROUP BY hour
ORDER BY hour;
Alerting Queries
-- High error rate alert
SELECT count() as error_count
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 5 MINUTE
AND status >= 500;
-- Slow response time alert
SELECT count() as slow_requests
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 5 MINUTE
AND response_time > 1000; -- > 1 second (response_time is in milliseconds)
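The two alerting queries return counts, and something still has to compare those counts with limits. The Python sketch below shows one way to do that over rows already fetched for the last five minutes. All limits are hypothetical parameters, and `slow_threshold` must be given in the same unit as the response_time column.

```python
from typing import Iterable, List, Tuple


def evaluate_alerts(
    rows: Iterable[Tuple[int, int]],  # (status, response_time) pairs
    max_errors: int,
    slow_threshold: int,
    max_slow: int,
) -> List[str]:
    """Return a message for every limit that the window exceeds."""
    rows = list(rows)
    errors = sum(1 for status, _ in rows if status >= 500)
    slow = sum(1 for _, response_time in rows if response_time > slow_threshold)
    alerts = []
    if errors > max_errors:
        alerts.append(f"high error rate: {errors} responses with status >= 500")
    if slow > max_slow:
        alerts.append(f"slow responses: {slow} above {slow_threshold}")
    return alerts


# Invented sample window; the limits are illustrative, not recommendations.
window = [(200, 12), (502, 30), (500, 2500), (200, 1800)]
print(evaluate_alerts(window, max_errors=1, slow_threshold=1000, max_slow=5))
```

With this sample, only the error-rate limit is exceeded (two 5xx responses against a limit of one), so a single message is returned.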
Quick Test
# Build and test
cd tempesta/logger
make build test