Tempesta Technologies

Access Log Analytics

Tempesta FW uses an mmap()ed memory area to deliver client access logs to user space in a zero-copy fashion. tfw_logger is a user-space daemon that spawns one worker thread per CPU core and gathers access log records from the mmap()ed area with no locks and no copying. The access log records received from the kernel are grouped into batches and sent to the ClickHouse database. Thanks to batching, ClickHouse can absorb millions of records per second on a modest hardware node.

ClickHouse is a powerful column-oriented DBMS for online analytical processing (OLAP). We have chosen ClickHouse because

  • it can ingest an enormous number of batched records per second
  • it provides a powerful SQL-based query language
  • it’s open source

ClickHouse's advanced analytic queries over the stored data facilitate web performance analytics and the response to security incidents (e.g. application DDoS attacks).
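The batching idea described above can be illustrated with a minimal Python sketch. This is not tfw_logger's code: the function name batch_records is hypothetical, and the batch limit simply plays the role of the max_events setting described later on this page.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batch_records(records: Iterable[T], max_events: int) -> Iterator[List[T]]:
    """Group a stream of records into lists of at most max_events items."""
    if max_events <= 0:
        raise ValueError("max_events must be positive")
    batch: List[T] = []
    for record in records:
        batch.append(record)
        if len(batch) == max_events:
            # A full batch would be sent to the database as one insert.
            yield batch
            batch = []
    if batch:
        # Flush the final, partially filled batch.
        yield batch


if __name__ == "__main__":
    sizes = [len(b) for b in batch_records(range(2500), max_events=1000)]
    print(sizes)  # [1000, 1000, 500]
```

Each yielded list stands for one bulk insert, so 2500 records with a limit of 1000 result in three inserts instead of 2500.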

Architecture

HTTP Request → Tempesta FW → mmap buffer → tfw_logger → ClickHouse
                    ↓
                /dev/tempesta_mmap_log
                    ↓
            [Worker Thread 1] [Worker Thread 2] ... [Worker Thread N]
                    ↓
                ClickHouse Database

Always keep your ClickHouse database separate from the edge servers that may come under a DDoS attack – this ensures you can analyze the incident in near real time.

Configuration

Tempesta FW Configuration

For the current development version 0.9 add to your tempesta_fw.conf:

access_log mmap logger_config=/path/to/tfw_logger.json;

For the stable version 0.8 use:

access_log mmap mmap_host=localhost mmap_log=/var/log/tempesta_access.log;

Do not use dmesg for access_log! It can lead to kernel hangs under heavy load and should be used only for debugging purposes!

When using tfw_logger, you can configure the size of the per-CPU memory-mapped buffers used for storing access log events before they are transmitted to external storage:

mmap_log_buffer_size <SIZE>;

SIZE specifies the buffer size in bytes for each CPU. The value must be a power of 2 and a multiple of 4KB (the page size). The allowed range is from 4KB to 128MB. Default: 1M.

Examples:

mmap_log_buffer_size 4M;
mmap_log_buffer_size 512K;
mmap_log_buffer_size 16M;

Larger buffer sizes can improve performance under high load by reducing the frequency of buffer flushes to external storage, but will consume more memory. The optimal size depends on your traffic patterns and available system memory.
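The size rules for mmap_log_buffer_size can be checked before editing the configuration. The following is a minimal Python sketch, assuming that only the K and M suffixes shown in the examples (or a plain byte count) are used; parse_buffer_size is a hypothetical helper and not part of Tempesta FW.

```python
import re

MIN_SIZE = 4 * 1024            # 4KB lower bound (also the page size)
MAX_SIZE = 128 * 1024 * 1024   # 128MB upper bound
UNITS = {"": 1, "K": 1024, "M": 1024 ** 2}


def parse_buffer_size(value: str) -> int:
    """Return the size in bytes, or raise ValueError if it breaks a rule."""
    match = re.fullmatch(r"(\d+)([KM]?)", value.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized size: {value!r}")
    size = int(match.group(1)) * UNITS[match.group(2)]
    if not MIN_SIZE <= size <= MAX_SIZE:
        raise ValueError(f"{value!r} is outside the 4KB..128MB range")
    # A power of two has exactly one bit set. Any power of two >= 4096
    # is automatically a multiple of the 4KB page size.
    if size & (size - 1):
        raise ValueError(f"{value!r} is not a power of 2")
    return size


if __name__ == "__main__":
    for candidate in ("4M", "512K", "16M", "3M", "256M"):
        try:
            print(candidate, "->", parse_buffer_size(candidate))
        except ValueError as err:
            print(candidate, "-> rejected:", err)
```

Under these rules 4M, 512K and 16M are accepted, 3M is rejected because it is not a power of 2, and 256M is rejected because it exceeds the 128MB limit.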

tfw_logger Configuration

0.9 (current)

Create a separate JSON configuration file (e.g., /etc/tempesta/tfw_logger.json):

{
    "log_path": "/var/log/tempesta/tfw_logger.log",
    "access_log": {
        "plugin_path": "/opt/tempesta/access_log.so",
        "host": "localhost",
        "port": 9000,
        "user": "tempesta_user",
        "password": "secure_password",
        "db_name": "default",
        "table_name": "access_log",
        "max_events": 10000
    }
}
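Before starting the daemon it can be useful to check that the file is well-formed JSON. The sketch below uses Python's standard json module, which rejects trailing commas and comments, and fills in missing access_log keys from the defaults listed in the Configuration Options table. The helper load_logger_config is hypothetical, and whether tfw_logger applies defaults in exactly this way is an assumption.

```python
import json

# Defaults taken from the Configuration Options table on this page.
DEFAULTS = {
    "host": "localhost",
    "port": 9000,
    "db_name": "default",
    "table_name": "access_log",
    "user": "default",
    "max_events": 1000,
}


def load_logger_config(text: str) -> dict:
    """Parse a tfw_logger-style JSON document and apply defaults."""
    cfg = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    access_log = {**DEFAULTS, **cfg.get("access_log", {})}
    port = access_log["port"]
    if not isinstance(port, int) or not 0 < port < 65536:
        raise ValueError(f"invalid port: {port!r}")
    max_events = access_log["max_events"]
    if not isinstance(max_events, int) or max_events <= 0:
        raise ValueError(f"invalid max_events: {max_events!r}")
    return {"log_path": cfg.get("log_path"), "access_log": access_log}


if __name__ == "__main__":
    sample = '{"access_log": {"host": "clickhouse.example.com", "max_events": 500}}'
    print(load_logger_config(sample)["access_log"])
```

Running the check against the file path used on this page would be a matter of reading the file and passing its contents to load_logger_config.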

0.8 (stable)

In 0.8, tfw_logger is configured through attributes of the access_log option in the main Tempesta FW configuration file, tempesta_fw.conf:

access_log mmap mmap_host=localhost mmap_user=tempesta_user mmap_password=secure_password mmap_log=/var/log/tempesta_access.log;

Configuration Options

| Option | Description | Default | Example |
|---|---|---|---|
| log_path | Path to tfw_logger log file | /var/log/tempesta/tfw_logger.log | |
| plugin_path | Path to access logger plugin | – | /opt/tempesta/access_log.so |
| access_log.host | ClickHouse server hostname | localhost | clickhouse.example.com |
| access_log.port | ClickHouse native protocol port | 9000 | 9000 |
| access_log.db_name | ClickHouse database name | default | custom_default |
| access_log.table_name | ClickHouse table name | access_log | custom_access_log |
| access_log.user | ClickHouse username (optional) | default | tempesta_user |
| access_log.password | ClickHouse password (optional) | – | secure_password |
| access_log.max_events | Batch size for inserts | 1000 | 500 |

Buffer Size Guidelines

  • Small deployments: 4MB (4194304)
  • Medium traffic: 16MB (16777216)
  • High traffic: 64MB+ (67108864)
  • Enterprise: 256MB+ (268435456)

Buffer size must be a multiple of the page size and ≥ 4096 bytes. See mmap_log_buffer_size above for the full constraints (a power of 2 in the range from 4KB to 128MB).

Access Log Schema

tfw_logger creates the following ClickHouse table structure:

CREATE TABLE IF NOT EXISTS access_log.access_log (
    timestamp DateTime64(3, 'UTC'),
    address IPv6,
    method UInt8,
    version UInt8,
    status UInt16, 
    response_content_length UInt64,
    response_time UInt32,
    vhost String,
    uri String,
    referer String,
    user_agent String,
    tft UInt64,
    tfh UInt64,
    dropped_events UInt64
) ENGINE = MergeTree()
ORDER BY timestamp;

Field Descriptions

| Field | Type | Description |
|---|---|---|
| timestamp | DateTime64(3) | Request timestamp with millisecond precision |
| address | IPv6 | Client IP address (IPv4 mapped to IPv6) |
| method | UInt8 | HTTP method code (e.g. GET=3, POST=10; see the list below) |
| version | UInt8 | HTTP version code (see the list below) |
| status | UInt16 | HTTP response status code |
| response_content_length | UInt64 | Response content length in bytes |
| response_time | UInt32 | Response time in milliseconds |
| vhost | String | Host header value |
| uri | String | Request URI path and query |
| referer | String | Referer header |
| user_agent | String | User-Agent header |
| tft | UInt64 | TF TLS hash |
| tfh | UInt64 | TF HTTP hash |
| dropped_events | UInt64 | Number of dropped events (monitoring) |
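Because IPv4 clients are stored as IPv4-mapped IPv6 addresses, reports are often easier to read once the mapping is undone. A minimal Python sketch using the standard ipaddress module follows; display_client_address is a hypothetical helper, and the input is assumed to be the textual form of the client address column as returned by a query.

```python
import ipaddress


def display_client_address(address: str) -> str:
    """Return dotted IPv4 for IPv4-mapped addresses, else the address as is."""
    ip = ipaddress.ip_address(address)
    if isinstance(ip, ipaddress.IPv6Address) and ip.ipv4_mapped is not None:
        # ::ffff:a.b.c.d carries an IPv4 address in its low 32 bits.
        return str(ip.ipv4_mapped)
    return str(ip)


if __name__ == "__main__":
    print(display_client_address("::ffff:192.0.2.10"))  # 192.0.2.10
    print(display_client_address("2001:db8::1"))        # 2001:db8::1
```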

Field method is a numerical value (see tfw_http_meth_t in http.h):

1: COPY
2: DELETE
3: GET
4: HEAD
5: LOCK
6: MKCOL
7: MOVE
8: OPTIONS
9: PATCH
10: POST
11: PROPFIND
12: PROPPATCH
13: PUT
14: TRACE
15: UNLOCK
16: PURGE
17: UNKNOWN

Field version is also a numerical value:

0: INVALID
1: HTTP 0.9
2: HTTP 1.0
3: HTTP 1.1
4: HTTP 2
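When exporting rows for people to read, the numeric codes can be translated back to names. The lookup tables in this Python sketch are copied from the two lists above; the helper names method_name and version_name, and the fallback label for unlisted codes, are illustrative assumptions.

```python
# Code-to-name tables copied from the method and version lists above.
HTTP_METHODS = {
    1: "COPY", 2: "DELETE", 3: "GET", 4: "HEAD", 5: "LOCK", 6: "MKCOL",
    7: "MOVE", 8: "OPTIONS", 9: "PATCH", 10: "POST", 11: "PROPFIND",
    12: "PROPPATCH", 13: "PUT", 14: "TRACE", 15: "UNLOCK", 16: "PURGE",
    17: "UNKNOWN",
}

HTTP_VERSIONS = {
    0: "INVALID", 1: "HTTP 0.9", 2: "HTTP 1.0", 3: "HTTP 1.1", 4: "HTTP 2",
}


def method_name(code: int) -> str:
    """Translate the numeric method column into a method name."""
    return HTTP_METHODS.get(code, f"UNLISTED({code})")


def version_name(code: int) -> str:
    """Translate the numeric version column into a protocol version."""
    return HTTP_VERSIONS.get(code, f"UNLISTED({code})")


if __name__ == "__main__":
    print(method_name(3), version_name(4))  # GET HTTP 2
```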

Installation and Usage

1. Build tfw_logger

cd tempesta/logger
make build

2. Create Configuration (0.9 only)

# Generate default configuration
./tfw_logger --generate --config /etc/tempesta/tfw_logger.json

# Edit configuration as needed
sudo nano /etc/tempesta/tfw_logger.json

3. Setup ClickHouse

-- Create database and user
CREATE DATABASE access_log;
CREATE USER tempesta_user IDENTIFIED BY 'secure_password';
GRANT ALL ON access_log.* TO tempesta_user;

4. Configure Tempesta FW

Add to tempesta_fw.conf for 0.9:

access_log mmap logger_config=/etc/tempesta/tfw_logger.json;

or for 0.8:

access_log mmap mmap_host=localhost mmap_log=/var/log/tempesta_access.log;

5. Start Services

# Start Tempesta FW (creates mmap device)
sudo ./scripts/tempesta.sh --start

# tfw_logger will be started automatically by tempesta.sh
# Or start manually:
sudo ./logger/tfw_logger --config /etc/tempesta/tfw_logger.json

Command Line Interface (for 0.9 only)

# Show help
./tfw_logger --help

# Start with specific configuration 
./tfw_logger --config /etc/tempesta/tfw_logger.json

# Override configuration options
./tfw_logger --config config.json --host clickhouse.example.com --port 9001 --table custom_access_log

# Test configuration
./tfw_logger --config config.json --help

CLI Options

| Option | Description |
|---|---|
| --help, -h | Show help message |
| --generate, -g | Generate default configuration file |
| --config, -c PATH | Path to JSON configuration file |
| --host, -H HOST | ClickHouse hostname (override) |
| --port, -P PORT | ClickHouse port (override) |
| --database, -d DATABASE | ClickHouse database name (override) |
| --table, -t TABLE | ClickHouse table name (override) |
| --user, -u USER | ClickHouse username (override) |
| --password, -p PASS | ClickHouse password (override) |
| --log-path, -l PATH | Log file path (override) |

Performance Tuning

CPU Affinity

tfw_logger automatically sets CPU affinity for worker threads:

  • Each worker thread is bound to a specific CPU core
  • Number of threads automatically matches available CPU cores (respects affinity/cgroups)
  • Improves cache locality and reduces context switching

Buffer Sizing

Larger buffers reduce syscall overhead but increase memory usage:

| Traffic Level | Buffer Size | Memory Usage |
|---|---|---|
| Low (< 1K RPS) | 4MB | ~4MB per worker |
| Medium (1K-10K RPS) | 16MB | ~16MB per worker |
| High (10K-100K RPS) | 64MB | ~64MB per worker |
| Enterprise (100K+ RPS) | 256MB+ | ~256MB+ per worker |
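Since the buffers are allocated per CPU, the total footprint grows with the core count. A rough estimate can be sketched in Python as follows, assuming one buffer per CPU and ignoring any other memory used by Tempesta FW or tfw_logger; total_buffer_memory is a hypothetical helper.

```python
import os

MIB = 1 << 20  # bytes in one megabyte (2**20)


def total_buffer_memory(buffer_size_bytes: int, cpu_count: int) -> int:
    """Estimate total mmap buffer memory as per-CPU size times CPU count."""
    if buffer_size_bytes <= 0 or cpu_count <= 0:
        raise ValueError("buffer size and CPU count must be positive")
    return buffer_size_bytes * cpu_count


if __name__ == "__main__":
    cpus = os.cpu_count() or 1
    for label, size in (("4MB", 4 * MIB), ("16MB", 16 * MIB), ("64MB", 64 * MIB)):
        total_mb = total_buffer_memory(size, cpus) // MIB
        print(f"{label} per CPU x {cpus} CPUs = {total_mb}MB")
```

For example, 16MB buffers on an 8-core machine add up to 128MB in total.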

ClickHouse Optimization

-- Optimize table for high-frequency inserts
ALTER TABLE access_log.access_log 
MODIFY SETTING merge_with_ttl_timeout = 3600;

-- Create materialized views for common queries.
-- The sorting key must use columns of the view itself (hour, status).
-- SummingMergeTree sums non-key numeric columns on merge, so store the total
-- response time and compute the average at query time as
-- sum(total_response_time) / sum(requests).
CREATE MATERIALIZED VIEW access_log.hourly_stats
ENGINE = SummingMergeTree()
ORDER BY (hour, status)
AS SELECT
    toStartOfHour(timestamp) as hour,
    status,
    count() as requests,
    sum(response_time) as total_response_time
FROM access_log.access_log
GROUP BY hour, status;

Monitoring

Health Checks

# Check if tfw_logger is running
ps aux | grep tfw_logger

# Check log output
tail -f /var/log/tempesta/tfw_logger.log

# Verify ClickHouse connectivity
clickhouse-client --query "SELECT count() FROM access_log.access_log"

Metrics

Monitor these key metrics:

-- Request rate
SELECT 
    toStartOfMinute(timestamp) as minute,
    count() as requests_per_minute
FROM access_log.access_log 
WHERE timestamp >= now() - INTERVAL 1 HOUR
GROUP BY minute 
ORDER BY minute;

-- Error rates  
SELECT 
    status,
    count() as count,
    count() * 100.0 / sum(count()) OVER () as percentage
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 1 HOUR
GROUP BY status
ORDER BY count DESC;

-- Dropped events (buffer overruns)
SELECT max(dropped_events) as max_dropped 
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 1 HOUR;

Troubleshooting

Common Issues

tfw_logger won’t start

# Check if Tempesta FW is running
sudo ./scripts/tempesta.sh --status

# Verify mmap device exists
ls -la /dev/tempesta_mmap_log

# Check permissions
sudo chmod 666 /dev/tempesta_mmap_log  # Temporary fix

Permission denied on mmap device

# Run tfw_logger with appropriate permissions
sudo ./tfw_logger --config config.json

# Or fix device permissions permanently
sudo chown tempesta:tempesta /dev/tempesta_mmap_log

ClickHouse connection failed

# Test ClickHouse connectivity
clickhouse-client --host localhost --port 9000 --query "SELECT 1"

# Check user permissions
clickhouse-client --query "SHOW GRANTS FOR tempesta_user"

High memory usage

# Reduce the batch size in tfw_logger.json
{
    "access_log": {
        "max_events": 500
    }
}

Log Analysis

Common log patterns:

# Successful startup
grep "Starting Tempesta FW Logger" /var/log/tempesta/tfw_logger.log

# Worker thread info
grep "worker threads started" /var/log/tempesta/tfw_logger.log  

# ClickHouse connectivity
grep "ClickHouse" /var/log/tempesta/tfw_logger.log

# Error patterns
grep -i error /var/log/tempesta/tfw_logger.log

Integration Examples

Basic Analytics Dashboard

-- Top pages by requests
SELECT 
    uri,
    count() as requests,
    avg(response_time) as avg_response_time_ms
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 1 DAY
GROUP BY uri
ORDER BY requests DESC
LIMIT 10;

-- Status code distribution
SELECT 
    status,
    count() as requests,
    count() * 100.0 / sum(count()) OVER () as percentage
FROM access_log.access_log  
WHERE timestamp >= now() - INTERVAL 1 DAY
GROUP BY status;

-- Traffic by hour
SELECT 
    toStartOfHour(timestamp) as hour,
    count() as requests,
    uniq(address) as unique_visitors
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 1 DAY  
GROUP BY hour
ORDER BY hour;

Alerting Queries

-- High error rate alert
SELECT count() as error_count
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 5 MINUTE
  AND status >= 500;

-- Slow response time alert  
SELECT count() as slow_requests
FROM access_log.access_log
WHERE timestamp >= now() - INTERVAL 5 MINUTE
  AND response_time > 1000; -- > 1 second (response_time is in milliseconds)
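Raw counts are easier to alert on once they are related to the total traffic. The Python sketch below assumes the per-status counts have already been fetched (for instance with the status code distribution query) and compares the share of 5xx responses with a threshold. The function names and the 5% default are illustrative assumptions, not part of Tempesta FW or ClickHouse.

```python
from typing import Dict


def error_rate(status_counts: Dict[int, int]) -> float:
    """Return the fraction of responses whose status code is 500 or above."""
    total = sum(status_counts.values())
    if total == 0:
        return 0.0  # no traffic in the window, so nothing to alert on
    errors = sum(n for status, n in status_counts.items() if status >= 500)
    return errors / total


def should_alert(status_counts: Dict[int, int], threshold: float = 0.05) -> bool:
    """True if the 5xx share is strictly above the threshold."""
    return error_rate(status_counts) > threshold


if __name__ == "__main__":
    window = {200: 900, 404: 40, 500: 35, 503: 25}
    print(f"{error_rate(window):.1%}", should_alert(window))  # 6.0% True
```

Using a ratio instead of an absolute count keeps the alert meaningful both at night and at peak load.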

Quick Test

# Build and test
cd tempesta/logger
make build test

