Troubleshooting Server

Purpose of the Troubleshooting Server

The goal of the server is to collect dmesg logs sent by Netconsole from each instance, analyze them, and send reports to the TFW Support Team.

Network Behavior

The server listens on both UDP and TCP sockets:

UDP is used for log messages sent by Netconsole (fast but limited in message size).
TCP is used to receive larger messages, such as system information, that do not fit in a single UDP datagram.
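
A minimal sketch of how one process could listen on both sockets, using Python's asyncio. The names `serve`, `UdpLogProtocol`, `on_line`, and `on_blob` are hypothetical and not taken from the server's source. Treating each TCP connection as one message that ends when the client closes it is also an assumption.

```python
import asyncio


class UdpLogProtocol(asyncio.DatagramProtocol):
    """Passes each received datagram to a callback as one log line."""

    def __init__(self, on_line):
        self.on_line = on_line

    def datagram_received(self, data, addr):
        # addr[0] is the sender's IP, used here to tell instances apart.
        text = data.decode("utf-8", errors="replace").rstrip("\n")
        self.on_line(addr[0], text)


async def serve(host, port, on_line, on_blob):
    """Open a UDP endpoint and a TCP server on the same host and port."""
    loop = asyncio.get_running_loop()

    async def on_client(reader, writer):
        # Read until the client closes the connection, so the message
        # size is not capped by a single datagram.
        peer_ip = writer.get_extra_info("peername")[0]
        data = await reader.read()
        on_blob(peer_ip, data.decode("utf-8", errors="replace"))
        writer.close()
        await writer.wait_closed()

    transport, _ = await loop.create_datagram_endpoint(
        lambda: UdpLogProtocol(on_line), local_addr=(host, port)
    )
    tcp_server = await asyncio.start_server(on_client, host, port)
    return transport, tcp_server
```

UDP and TCP port numbers are independent, so both listeners can share the value passed as `--port`. A caller would await `serve(...)` inside a running event loop and keep the loop alive afterwards, for example with `await tcp_server.serve_forever()`.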

Log Handling and Incidents

Logs received from each instance are buffered individually.

If an incident is detected, the server sends the entire log buffer for that instance to the Support Server.

A log message is treated as an incident if it contains one of the following markers:

Kernel Crash
BUG
WARNING
ERROR
Oops

These messages are typically generated by the TFW kernel modules or the kernel itself.
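
The buffering and detection described above can be sketched as follows. This is an illustration only: the regular expression, the `MAX_LINES` limit, the `handle_line` helper, and the choice to clear the buffer after a report are assumptions, since the server's exact matching rules are not documented here.

```python
import re
from collections import defaultdict, deque

# Markers taken from the incident list above. Case-sensitive whole-word
# matching is an assumption, not the server's documented behavior.
INCIDENT_RE = re.compile(r"\b(Kernel Crash|BUG|WARNING|ERROR|Oops)\b")

MAX_LINES = 1000  # hypothetical per-instance buffer limit

# One bounded buffer per instance, keyed by the sender's IP address.
buffers = defaultdict(lambda: deque(maxlen=MAX_LINES))


def handle_line(instance_ip, line):
    """Buffer a line; return the whole buffer if the line is an incident."""
    buf = buffers[instance_ip]
    buf.append(line)
    if INCIDENT_RE.search(line):
        report = list(buf)
        buf.clear()  # assumption: start a fresh buffer after reporting
        return report
    return None
```

Keying the buffers by source IP keeps an incident on one instance from pulling in unrelated lines from another.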

Data Sent to the Support Server

The data passed to the Support Server includes:

The full log buffer for the instance that encountered an issue
Additional system information, if it was submitted by the instance over TCP
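
As an illustration of the two items above, a report could be assembled like this. The function `build_report` and the field names `instance`, `log`, and `system_info` are hypothetical; the actual wire format expected by the Support Server is not described in this document.

```python
import json


def build_report(instance_ip, log_lines, system_info=None):
    """Serialize one incident report as JSON (hypothetical layout)."""
    report = {
        "instance": instance_ip,   # which instance hit the issue
        "log": list(log_lines),    # the full buffered log for that instance
    }
    # System information is optional: it is present only if the instance
    # submitted it over TCP before the incident.
    if system_info is not None:
        report["system_info"] = system_info
    return json.dumps(report)
```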

Server Environment Variables

Name	Example	Description
--host	192.168.0.1	IP address of the server. Must match the one configured in Netconsole.
--port	5555	Port the server listens on. Must match the one configured in Netconsole.
--support-host	support.tfw.dev	IP address or hostname of the Tempesta Support Server.
--support-port	443	Port used to connect to the Tempesta Support Server.
--support-client-id	abc123	Client ID for authentication. Obtain it from the Tempesta Support Team.
--support-api-key	xxx-yyy-zzz	API key for authentication. Obtain it from the Tempesta Support Team.
--config-file	/etc/tfw/server.conf	Path to the configuration file.
--skip-ssl-verification	flag	Skips SSL certificate verification. Use it when the Tempesta Support Server has a self-signed certificate.

Requirements

To start the server, all you need is Python 3.10+ installed on the system.

Starting the Server (Basic Usage)

Run the server manually using CLI arguments:

./app.py \
  --host=192.168.0.1 \
  --port=5556 \
  --support-host=175.x.x.x \
  --support-port=443 \
  --support-client-id=my-app-id \
  --support-api-key=my_secret

Using a JSON Config File

You can also use a configuration file instead of command-line arguments:

./app.py --config-file=/etc/tempesta-troubleshooting-server/env.json

Example env.json:

{
  "host": "192.168.0.1",
  "port": 5555,
  "support_host": "173.0.0.0",
  "support_port": 5556,
  "support_client_id": "my-app-id",
  "support_api_key": "my-api-key",
  "skip_ssl_verification": false,
  "log_level": "info"
}

In case the Tempesta Support Server uses a self-signed SSL certificate, set "skip_ssl_verification": true to bypass SSL verification.
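
The relationship between the CLI arguments and the JSON keys can be sketched with `argparse`, which maps `--support-host` to the key `support_host` automatically. This is not the server's actual code: the `load_settings` helper, the default values, and the rule that explicit CLI values override the file are all assumptions.

```python
import argparse
import json


def parse_args(argv=None):
    """Declare the options listed in the table above."""
    p = argparse.ArgumentParser()
    p.add_argument("--host")
    p.add_argument("--port", type=int)
    p.add_argument("--support-host")
    p.add_argument("--support-port", type=int)
    p.add_argument("--support-client-id")
    p.add_argument("--support-api-key")
    p.add_argument("--config-file")
    # default=None lets us tell "flag not given" apart from False.
    p.add_argument("--skip-ssl-verification", action="store_true", default=None)
    return p.parse_args(argv)


def load_settings(argv=None):
    """Merge defaults, the JSON config file, and CLI values, in that order."""
    args = vars(parse_args(argv))
    settings = {"skip_ssl_verification": False, "log_level": "info"}
    config_file = args.pop("config_file")
    if config_file:
        with open(config_file, encoding="utf-8") as f:
            settings.update(json.load(f))
    # Assumption: values given explicitly on the command line win.
    settings.update({k: v for k, v in args.items() if v is not None})
    return settings
```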

Debugging and Manual Testing

To test if messages are being captured correctly:

Write a test message to the kernel log (requires root):

echo "test-message" | sudo tee /dev/kmsg

Simulate an incident (e.g. WARNING, ERROR, etc.):

echo "WARNING test message" | sudo tee /dev/kmsg

Set "log_level" to "debug" in env.json to see incoming messages printed to the console.
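
If Netconsole is not yet configured on any instance, the UDP listener can also be exercised directly. The helper below is a hypothetical test aid, not part of the server; it sends a bare text line, which may differ from the exact framing that Netconsole uses.

```python
import socket


def send_test_line(host, port, text):
    """Send one UDP datagram containing `text`; return the bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(text.encode("utf-8"), (host, port))
```

For example, `send_test_line("192.168.0.1", 5555, "WARNING test message")` targets the host and port from the configuration example and should show up as an incoming message when the log level is "debug".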

Autostart via systemd

To run the server as a background service:

Create a unit file at /etc/systemd/system/tempesta-troubleshooting.service:

[Unit]
Description=TroubleShooting Service
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
ExecStart=/usr/bin/python3 /path/to/app.py --config-file=/etc/tempesta-troubleshooting-server/env.json
RemainAfterExit=true

[Install]
WantedBy=multi-user.target

Then enable and start the service:

sudo systemctl daemon-reload
sudo systemctl enable tempesta-troubleshooting.service
sudo systemctl start tempesta-troubleshooting.service

Running with Docker

You can also run the server in Docker:

docker build -t tempesta-troubleshooting-server .
docker run -d --network=host \
  -v /etc/tempesta-troubleshooting-server:/etc/tempesta-troubleshooting-server:ro \
  --name tempesta-troubleshooting-server tempesta-troubleshooting-server

Note: Use --network=host to preserve original source IPs. If you use -p 192.168.0.1:5556:5556, you’ll receive the Docker NAT IP instead of the real IP from Netconsole clients.