Tempesta Technologies

Memory safety and network security

By Alexander Krizhanovsky | Posted on January 22, 2025

The US government has issued documents promoting memory-safe languages (MSLs) like Rust, C#, Java, and Go, while advising a shift away from languages such as C, C++, and Assembly. Notable documents include

  • Software Memory Safety by NSA, 2022
  • Product Security Bad Practices by CISA, 2024

These documents emphasize not only software safety but also security concerns. The C++ community has been actively addressing these issues and clarifying the distinction between safety and security; see, for example, these two talks and the recent CppCon discussion panel:

  • Delivering Safe C++ – Bjarne Stroustrup – CppCon 2023
  • Herb Sutter: Safety, Security, Safety and C / C++ – C++ Evolution, ACCU 2024
  • C++ Safety And Security Panel 2024 – Hosted by Michael Wong – CppCon 2024

The safety and security issues are crucial for us because we develop both secure and security-focused Internet server software. By “security-focused”, we refer to software that actively blocks malicious traffic as a core feature – for example, our open source hybrid of web accelerator and application firewall (WAF), Tempesta FW, or our proprietary volumetric DDoS mitigation module, Tempesta xFW. By “secure”, we mean general-purpose server software, such as an S3 server, which may not implement extensive security logic but must be inherently robust enough to withstand hacker attacks. These types of software are deployed at the network edge, where reliability is crucial. A failure at this level can render a company’s services completely inaccessible to its clients.

On the other hand, we use C, C++ and Assembly languages for the development of this software. The reason is that edge software has extremely demanding performance requirements – the servers must deliver high throughput and low latency, even under high-percentile load conditions. These stringent requirements significantly limit the choice of MSLs; in particular, only Rust could be a viable alternative. However, as we demonstrate below, many tasks – particularly the most complex ones – cannot be implemented using safe Rust code.

Product Security Bad Practices by CISA, 2024 advises:

Software manufacturers should build products in a manner that systematically prevents the introduction of memory safety vulnerabilities, such as by using a memory safe language…

and states:

For existing products that are written in memory-unsafe languages, not having a published memory safety roadmap by January 1, 2026 is dangerous and significantly elevates risk to national security, national economic security, and national public health and safety.

With this article, we

  1. introduce our Memory safety guideline – our coding guideline specifically tailored for network edge server software development. While the guideline is not yet complete, it will continue to evolve over time.
  2. discuss the challenges of balancing safety and performance, and explain why the solution isn’t as simple as “just switch to Rust.”

Safety and reliability

The C++ community produces numerous insightful guides, talks, and articles on memory safety. Both C and C++ compilers offer a wide range of options to enhance and enforce program safety. Tempesta FW is a Linux kernel module and the kernel, being developed mostly in C, provides many safety configuration options.

We are still in the process of gathering and documenting all the C++ guidelines, compiler options, and Linux kernel options that contribute to safer code. These topics are addressed in our C++ memory safety wiki page, which serves as a guideline we follow during product development and custom software projects.

Software reliability isn’t only about preventing raw memory operations. Even seemingly simple practices can lead to software crashes if mishandled. For example, consider assert() in C and C++, assert!() in Rust, or BUG_ON() in the Linux kernel. These statements are abundant in open-source projects, and each one has the potential to cause a server crash. Assertions are meant to check conditions under which the program cannot continue normal operation. However, assertions are sometimes misused, leading to crashes when a program encounters erroneous input that should have been handled gracefully.
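To make the distinction concrete, here is a minimal sketch (the function names and the header-parsing scenario are our illustration, not code from any project mentioned above) contrasting an assertion on attacker-controlled input with graceful error handling:

```cpp
#include <cassert>
#include <cstdlib>
#include <optional>
#include <string>

// Risky: an assertion on external input crashes the process whenever
// a client sends a malformed header. The condition is attacker-controlled,
// so this is not an internal invariant check.
int parse_length_unsafe(const std::string& header) {
    assert(!header.empty());
    return std::atoi(header.c_str());
}

// Safer: malformed input is an expected error, not a program bug,
// so it is rejected instead of crashing the server.
std::optional<int> parse_length(const std::string& header) {
    if (header.empty())
        return std::nullopt;
    return std::atoi(header.c_str());
}
```

Assertions should be reserved for internal invariants that no input can violate; everything crossing a trust boundary deserves an error path.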

One of the key takeaways from Herb Sutter’s talk (a topic also discussed in the CppCon’24 panel) is that Rust enforces memory safety, while C++ provides safety features. While this fundamental distinction cannot be changed, we aim to improve safety by adhering to our guidelines and enforcing mandatory code reviews.

C++ is getting better

There are strong criticisms of C++; for example, MemorySafety.org writes:

Using C and C++ is bad for society, bad for your reputation, and it’s bad for your customers.

The article references Modern C++ Won’t Save Us, providing several examples that highlight flaws in the C++ language. While it’s possible to find many other examples, such as those discussed in C++ safety, in context or Compile-Time Validation in C++ Programming – Alon Wolf – CppCon 2024, these examples are representative enough. Let’s examine these cases and explore how to avoid such code flaws by leveraging the modern C++ standard, compiler options, and best practices.

The first example program is

#include <iostream>
#include <string>
#include <string_view>

int main() {
    	std::string s = "Hellooooooooooooooo ";
    	std::string_view sv = s + "World\n";
    	std::cout << sv;
}

This example was originally taken from a C++ Core Guidelines issue report from 2017. If we compile the program with Clang++ 18, the compiler reports a use-after-free warning, even with no extra options:

$ clang++ string_view_uaf.cc
string_view_uaf.cc:15:24: warning: object backing the pointer will be
destroyed at the end of the full-expression [-Wdangling-gsl]
   15 |     	std::string_view sv = s + "World\n";
      |                           	^~~~~~~~~~~~~

Just don’t ignore compiler warnings to avoid such bugs in your code.
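A straightforward fix, sketched below with an illustrative helper name of our own, is to keep the concatenation result in an owning std::string for as long as the view is used:

```cpp
#include <string>
#include <string_view>

// Safe variant of the first example: the concatenation result lives in
// an owning std::string whose lifetime covers every use of the view.
std::string build_message(const std::string& s) {
    std::string full = s + "World\n";   // owning storage
    std::string_view sv = full;         // view into a live object
    return std::string(sv);             // copy out before 'full' dies
}
```

The rule of thumb: a std::string_view must never outlive the buffer it points into, and temporaries die at the end of the full expression.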

The second example demonstrates the unsafety of C++ smart pointers, std::shared_ptr in particular:

#include <memory>
#include <iostream>
#include <functional>

std::function<int(void)> f(std::shared_ptr<int> &x) {
    return [&]() { return *x; };
}

int main() {
    std::function<int(void)> y(nullptr);
    {
        std::shared_ptr<int> x(std::make_shared<int>(4));
        y = f(x);
    }
    std::cout << y() << std::endl;
}

Shared pointers transfer or share ownership of an object. However, when you pass a shared pointer by reference, you bypass its copying mechanism, which is responsible for tracking references. In this program, the f() function essentially unwraps the managed object and returns a reference to it. This code clearly violates the C++ Core Guidelines, which state that for general use you should take T* or T& arguments rather than smart pointers.

Unfortunately, both clang++ and g++ compile this code without any warnings. Even the Clang static analyzer fails to identify the problem in the code:

$ clang++ --analyze -Xanalyzer -analyzer-output=text -Wall -std=c++23 \
  -Wextra lambda_dangling_ptr.cc
$ ./a.out
4

However, cppcheck reveals and explains the problem:

$ cppcheck lambda_dangling_ptr.cc
Checking lambda_dangling_ptr.cc ...
lambda_dangling_ptr.cc:17:15: error: Using object that points to local
variable 'x' that is out of scope. [invalidLifetime]
 std::cout << y() << std::endl;
          	^
lambda_dangling_ptr.cc:8:9: note: Return lambda.
 return [&]() { return *x; };
    	^
lambda_dangling_ptr.cc:7:51: note: Passed to reference.
std::function<int(void)> f(std::shared_ptr<int>& x) {
                                              	^
lambda_dangling_ptr.cc:8:25: note: Lambda captures variable by reference here.
 return [&]() { return *x; };
                    	^
lambda_dangling_ptr.cc:15:9: note: Passed to 'f'.
  y = f(x);
    	^
lambda_dangling_ptr.cc:14:24: note: Variable created here.
  std::shared_ptr<int> x(std::make_shared<int>(4));
                   	^
lambda_dangling_ptr.cc:17:15: note: Using object that points to local
variable 'x' that is out of scope.
 std::cout << y() << std::endl;
          	^

Clang’s AddressSanitizer also fails to reveal the problem, but Valgrind does catch it:

$ valgrind ./a.out
...
==825108== Invalid read of size 4
==825108== at 0x109568: f(std::shared_ptr<int>&)::$_0::operator()()
                        const (lambda_dangling_ptr.cc:8)
==825108== by 0x109544: int std::__invoke_impl<int,
                        f(std::shared_ptr<int>&)::$_0&>
                        (std::__invoke_other, f(std::shared_ptr<int>&)::$_0&)
                        (invoke.h:61)
==825108== by 0x1094F4: std::enable_if<is_invocable_r_v<int,
                        f(std::shared_ptr<int>&)::$_0&>, int>::type
                        std::__invoke_r<int, f(std::shared_ptr<int>&)::$_0&>
                        (f(std::shared_ptr<int>&)::$_0&) (invoke.h:114)
==825108== by 0x10940C: std::_Function_handler<int (),
                        f(std::shared_ptr<int>&)::$_0>::_M_invoke
                        (std::_Any_data const&) (std_function.h:290)
==825108== by 0x109844: std::function<int ()>::operator()() const
                        (std_function.h:591)
==825108== by 0x10932A: main (lambda_dangling_ptr.cc:17)
...
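Following the guideline above, the fix is to capture shared ownership by value so the lambda co-owns the object. A sketch (with a renamed function to keep it distinct from the broken original):

```cpp
#include <functional>
#include <memory>

// Fixed variant: the lambda captures the shared_ptr by value, so the
// reference count is bumped and the int stays alive as long as the
// lambda does.
std::function<int(void)> f_fixed(std::shared_ptr<int> x) {
    return [x]() { return *x; };   // copy extends the lifetime
}
```

With this version, calling the returned functor after the original shared_ptr goes out of scope is well-defined, because the capture still holds a reference.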

The third example is about std::optional and is quite similar to the previous one. It demonstrates the misuse of a wrapping object – in this case, std::optional – by breaking the logic when accessing the wrapped object:

#include <iostream>
#include <optional>

int f() {
    std::optional<int> x(std::nullopt);
    return *x;
}

int main() {
    std::cout << std::hex << f() << std::endl;
    return 0;
}

The problem with this code is that dereferencing an empty std::optional is undefined behavior; here it reads (and prints) an uninitialized value:

$ clang++ optional.cc
$ ./a.out
a61846e8
$ ./a.out
b6d892b8
$ ./a.out
8035018

C++26 introduces erroneous behavior for reads of uninitialized variables, giving them a defined value, but for now we can use a compiler option to initialize variables by default:

$ clang++ -ftrivial-auto-var-init=pattern optional.cc
$ ./a.out
aaaaaaaa
$ ./a.out
aaaaaaaa
$ ./a.out
aaaaaaaa
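The better fix is to never dereference an optional unchecked. A sketch of the checked alternatives (the function name is ours, for illustration):

```cpp
#include <optional>

// Checked access: value_or() is well-defined for std::nullopt, and
// x.value() would throw std::bad_optional_access instead of reading
// an indeterminate value.
int f_checked(std::optional<int> x) {
    return x.value_or(0);
}
```

Unlike the raw operator*, both value() and value_or() turn "forgot to check" into a defined, testable behavior.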

In addition to the issues described in the blog post, there are also C++ projects that interact with C code or are written in a mixed C/C++ style. These projects often use raw, unsafe pointer arithmetic, which introduces additional risks. For example:


int get_last_element(int *ptr, size_t sz) {
    return ptr[sz - 1];
}

This example is taken from the C++ safe buffers page, which describes the Clang++ hardening extension. To use this extension we need a modern C++ standard and the -Wunsafe-buffer-usage command-line option:

$ clang++ -std=c++23 -Wunsafe-buffer-usage -c safe_buffers.cc
safe_buffers.cc:9:9: warning: unsafe buffer access [-Wunsafe-buffer-usage]
 9 |     return ptr[sz - 1];
   |            ^~~
safe_buffers.cc:9:9: note: pass -fsafe-buffer-usage-suggestions to receive
code hardening suggestions
1 warning generated.

This problem can be solved by redesigning the function as:


int get_last_element(std::span<int> sp, size_t sz) {
    return sp[sz - 1];
}

Of course, it’s possible to write bad, unsafe code in any programming language, including Rust. For example, the joke project cve-rs demonstrates several memory vulnerabilities in Rust programs, even when operating in safe mode!

C++ and C safety tomorrow

The C++ Core Guidelines describe three safety profiles: type, bounds, and lifetime. Although the profiles are only sparingly documented, the lifetime safety profile appears capable of addressing most of the problems above. There was an attempt to implement a lifetime safety profile in Clang, but it was not upstreamed. And there are valid reasons why implementing these profiles directly in a compiler is challenging.

In his CppCon 2024 talk, Compile-Time Validation in C++ Programming, Alon Wolf introduces Mitzi, a compile-time validation library that seems capable of preventing many safety issues, including those mentioned earlier. However, this approach comes with a syntax overhead: not only are pointers accessed through a checker API, but code nesting is also managed with the M_() macro.


M_({)
    auto outer = M::ptr(vec.ref()[0]);
    M_({)
        auto vec2 = M::borrow(std::vector{34});
        auto inner = M::ptr(vec2.ref()[0]);
        inner.assign(outer);
        //vec.mut().clear();
        //vec2.mut().clear();
        //outer.assign(inner);
    M_(})
M_(})

The syntax overhead of Mitzi is significant, but the approach would be a good candidate for a compiler extension.

Thomas Neumann proposes integrating memory safety as a C++ standard extension, performing all safety checks directly within the compiler. Another notable proposal is Safe C++, which already has a working compiler, Circle. However, both proposals are still far from being included in the C++ standard.

For C, Clang is planning to implement bounds checking with the -fbounds-safety compiler option. Additionally, the SafeStack protection (-fsanitize=safe-stack) in Clang provides safety for both C and C++.

The Linux kernel is evolving as well

So far, we discussed C++ and its modern standards. However, there are many C-based projects, with the Linux kernel being a prominent example. Let’s take a look at recent safety improvements in the kernel.

  • Bounds checking: patches have been introduced to improve bounds checking for the sa_data field in the sockaddr structure.
  • Variable-length arrays (VLAs): VLAs are being systematically removed from the kernel to eliminate associated risks.
  • __counted_by() macro: This macro helps sanitizers track out-of-bound access in flexible arrays.
  • Address space layout randomization (ASLR): ASLR for the stack and heap increases the difficulty for exploits to succeed.
  • Explicit fallthrough: The use of explicit fallthrough annotations in switch statements improves code clarity and correctness.
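The last item has a direct user-space analogue: the standard [[fallthrough]] attribute (C++17). A small sketch (the function and its character classes are our illustration), which together with -Wimplicit-fallthrough turns every unannotated fall-through into a warning:

```cpp
// Explicit fallthrough in user-space C++: the [[fallthrough]] attribute
// documents that the missing break is intentional, so the compiler can
// flag every *unannotated* fall-through as a likely bug.
int classify_char(int c) {
    switch (c) {
    case ' ':
    case '\t':
        return 1;            // whitespace
    case '\r':
        [[fallthrough]];     // intentional: CR handled like LF
    case '\n':
        return 2;            // line break
    default:
        return 0;
    }
}
```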

Beyond the C and C++ programming languages themselves, modern CPUs provide the ENDBR64 instruction, which mitigates return/jump/call-oriented programming attacks. For more details on this instruction and its implementation in the Clang and GCC compilers, refer to our recent talk, “What C and C++ Can Do and When Do You Need Assembly”, at the P99 conference.

These are just a few of the security features introduced into the Linux kernel in recent years. Additionally, the Linux Kernel Self-Protection Project (KSPP) has been continuously pushing new security features into the upstream kernel.

Lastly, there are guidelines for configuring a secure Linux kernel – tools and recommendations that even seasoned kernel developers may find insightful:

  • Recommended Settings by KSPP
  • Kernel Hardening Checker by Alexander Popov

Security

If we define a secure system as one with no vulnerabilities, then the development of such systems is supported by the coding guidelines and hardening compilation options discussed above. Now, let’s shift our focus to security systems, which are designed to protect other systems. These include web application firewalls (WAFs), DDoS mitigation systems, and cryptography libraries.

Interestingly, the security features of these applications are closely intertwined with performance.

Cryptography is about Assembly 

If you had to sacrifice, say, 10%, 30%, or even 50% of performance to gain more security, that might seem like a reasonable trade-off, right? But what if the performance cost were 10x or even 40x? What if your web server took 40 times longer to respond to a client? At that point, the server would likely be considered completely unusable.

If you think these numbers are exaggerated, they’re not. Consider mbed TLS, a pretty secure TLS and cryptography library. Since 2015, only 12 vulnerabilities have been discovered in mbed TLS – significantly fewer than the 62 vulnerabilities found in OpenSSL over a similar period. This difference might be partly due to mbed TLS’s smaller adoption compared to OpenSSL.

However, the key takeaway is this: when we forked Tempesta TLS from mbed TLS, the original library relied almost entirely on straightforward C programming, with minimal assembly usage. This approach prioritized simplicity, making the code easier to review and less prone to bugs, ultimately aiming for enhanced security.

During our FOSDEM talk we showed a live demo in which Tempesta TLS establishes 40-80% more TLS connections per second than OpenSSL/Nginx. By heavily optimizing our fork of mbed TLS, we measured an approximately 40x performance improvement. mbed TLS is designed for embedded applications, and it is simply unsuitable for web servers due to its lack of performance at scale.

However, this isn’t just about the number of client handshakes per second. It is also about resilience against DDoS attacks. For example, a TLS-enabled server can be overwhelmed by a TLS benchmarking tool like tls-perf, which simulates heavy client loads. While Tempesta TLS can limit the rate of new TLS sessions, choosing the right rate limit is difficult, and such limits are ineffective against distributed attacks employing hundreds of thousands of low-rate bots. To handle such attacks effectively, we must provide enough performance to withstand massive traffic surges, even when the attack cannot be mitigated efficiently through other means.

Does Rust offer a fast cryptography library? Rustls is a notable implementation introduced as an alternative to “unsafe” OpenSSL:

OpenSSL is a ubiquitous TLS library, used in a large percentage of all devices connected to the Internet. Unfortunately, it’s written in C and has a long history of memory safety vulnerabilities.

…

Fortunately, there is an excellent alternative to OpenSSL for many use cases. Rustls is a high-quality TLS implementation written in Rust, a memory safe language.

Rustls outperforms OpenSSL in a benchmark. However, instead of relying on RustCrypto, which appears to be the only cryptography backend fully implemented in Rust, the benchmark used aws-lc for all cryptographic computations. aws-lc is a fork of BoringSSL (itself a fork of OpenSSL), written in C++.

RustCrypto, in contrast, is a purely Rust implementation with no assembly optimizations. In this sense, it resembles mbed TLS (which still incorporates some assembly). The likely reason the benchmark used the aws-lc backend instead of RustCrypto is its lower performance, as purely high-level implementations typically lag behind in speed.

All high-performance cryptography libraries – such as OpenSSL, WolfSSL, and Tempesta TLS – leverage hand-optimized assembly for core mathematical operations. In our p99 talk, “What C and C++ Can Do and When Do You Need Assembly”, we showcased an example of a simple function that adds one 128-bit integer to another, where both integers are represented as arrays of two unsigned long elements:


void s_mp_add(unsigned long *restrict a, unsigned long *restrict b) {
    unsigned long carry;
    a[0] += b[0];
    carry = (a[0] < b[0]);
    a[1] += b[1] + carry;
}

The modern GCC 14.2 and rustc 1.84 can emit the x86-64 ADC instruction for this code (see the equivalent Rust code in Compiler Explorer), which handles the carry bit. However, the earlier GCC 13.3 was unable to do so (refer to Compiler Explorer for the code example). The correct assembly for the function should look like:

// Pointer to a is in %RDI, pointer to b is in %RSI
movq (%rdi), %r8
movq 8(%rdi), %r9

addq (%rsi), %r8  // add; sets the carry flag
adcq 8(%rsi), %r9 // add with carry from the previous addition

movq %r8, (%rdi)
movq %r9, 8(%rdi)

Rustc 1.84 also generates optimized code for an equivalent function:


pub fn s_mp_add(a: &mut [u64; 2], b: &[u64; 2]) {
    let mut carry = 0;
    a[0] = a[0].wrapping_add(b[0]);
    if a[0] < b[0] {
        carry = 1;
    }
    a[1] = a[1].wrapping_add(b[1]).wrapping_add(carry);
}

However, both Clang and GCC fail to generate more sophisticated multiplication code involving MULX instruction with two separate carry chains using ADCX and ADOX instructions. For an optimized assembly implementation of this approach, you can refer to the Tempesta TLS code.
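A middle ground between plain C and hand-written assembly is the unsigned __int128 extension supported by GCC and Clang, which expresses a full 64x64 to 128-bit multiply and lets the compiler pick wide-multiply instruction sequences where profitable. A sketch (the helper name is ours):

```cpp
#include <cstdint>

// 64x64 -> 128-bit multiply via the GCC/Clang __int128 extension.
// The compiler lowers this to a single wide multiply (MUL/MULX on
// x86-64) without any hand-written assembly.
void mul64_wide(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo) {
    unsigned __int128 p = (unsigned __int128)a * b;
    *lo = (uint64_t)p;
    *hi = (uint64_t)(p >> 64);
}
```

This covers single multiplications well; the interleaved dual carry chains of ADCX/ADOX across a whole big-number multiply are exactly what compilers still fail to produce, which is why the hot paths remain assembly.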

Application layer DDoS and Web Application Firewalls

When people ask

What’s the difference between Tempesta FW and other web accelerators?

Our answer is simple:

Security and performance

To protect web applications, Tempesta FW performs deep analysis of incoming HTTP requests. This requires our HTTP parsing state machine to execute over 10 times more states compared to a server like Nginx. In our SCALE 18x talk (see also slides) we demonstrated that even Nginx’s small HTTP parser can become a bottleneck during an HTTP request flood attack. For sure we needed quite advanced technologies to make our HTTP parser fast enough to efficiently handle large volumes of ingress traffic, such as application layer DDoS attacks. In the same talk we described our SIMD strings parsing algorithms and the applications of multiple GCC extensions to generate a fast state machine. While it might be possible to implement these techniques in Rust, it is not feasible to achieve this level of performance and complexity in safe Rust.
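As a much-simplified illustration of the fast state machine idea (the function, the token set, and all names below are ours, not Tempesta FW code), a table-driven recognizer does a single table load per input byte instead of branching over character classes:

```cpp
#include <array>
#include <cstdint>
#include <string_view>

enum CharClass : uint8_t { TOKEN, DONE, ERR };

// Build a 256-entry classification table once; the hot loop then does
// one load per input byte with no per-character-class branches.
static std::array<uint8_t, 256> make_table() {
    std::array<uint8_t, 256> t{};
    t.fill(ERR);
    for (unsigned char c :
         std::string_view("!#$%&'*+-.^_`|~0123456789"
                          "abcdefghijklmnopqrstuvwxyz"
                          "ABCDEFGHIJKLMNOPQRSTUVWXYZ"))
        t[c] = TOKEN;
    t[static_cast<unsigned char>(':')] = DONE;   // header-name terminator
    return t;
}

// Returns the length of the leading HTTP token, or -1 on an invalid byte.
int parse_token(std::string_view s) {
    static const auto table = make_table();
    for (size_t i = 0; i < s.size(); ++i) {
        switch (table[static_cast<unsigned char>(s[i])]) {
        case DONE: return static_cast<int>(i);
        case ERR:  return -1;
        }
    }
    return static_cast<int>(s.size());
}
```

Real parsers extend this scalar skeleton with SIMD classification of 16-64 bytes at a time and compiler-assisted state machine generation; this sketch only shows why the per-byte work can be made branch-light.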

Tempesta FW was designed as a platform for web application firewalls (WAFs) or a standalone WAF accelerator. Regular expressions remain the core logic of any WAF product: a typical WAF runs hundreds or thousands of regular expressions for each HTTP request, so it is no surprise that they are a typical WAF bottleneck. The Hyperscan regular expression engine is the most widely used solution for WAFs to process numerous rules in parallel (we are also integrating Hyperscan into Tempesta FW). Hyperscan itself relies heavily on SIMD logic to achieve its performance. However, such complex SIMD implementations cannot currently be written in a memory-safe manner.

A smart enough DDoS attacker targets the slowest resource. If your WAF, designed to protect against web and DDoS attacks, cannot handle high traffic volume, it risks becoming a DDoS attack vector itself.

Performance

System programming for high-performance server software involves more than just using a low-level programming language like C, C++, or Rust. Achieving outstanding performance often requires implementing custom memory allocators, specialized data structures, efficient interactions with OS system calls, and more.

Rust crates use plenty of unsafe code, particularly for interfacing with other libraries, including OS APIs, and for optimizing performance. For more details, refer to the article Unsafe Rust in the Wild. Even basic data structures, such as a linked list in Rust, typically require the use of unsafe code:

Alright, so that’s implementing a 100% safe doubly-linked list in Rust. It was a nightmare to implement, leaks implementation details, and doesn’t support several fundamental operations.

Blart, an adaptive radix tree Rust crate, heavily relies on unsafe code blocks. Similarly, in Tempesta FW, we utilize numerous custom data structures, including lock-free HTrie and ring-buffer, hash tables with LRU lists, memory pools, system page allocators with advanced reference counting, and many other low-level techniques.

Implementing such techniques in Rust, even with unsafe code, would be extremely complex. In contrast, the simpler code in C is easier to review and debug, resulting in fewer bugs and making it inherently safer.

For a more detailed discussion of the performance capabilities of programming languages, refer to our post Fast programming languages: C, C++, Rust, and Assembly.

Conclusion

Let’s conclude this article with a few thoughts on how to develop software that is performant, reliable, safe and secure. Achieving all these properties simultaneously already seems like an almost impossible task, especially when working under strict deadlines – which is almost always the case nowadays. But that’s not the whole story. The modern, highly complex tech landscape evolves at an incredible pace: HTTP/1.1 is now almost exclusively used over TLS, HTTP/2 has largely replaced HTTP/1, HTTP/3 is replacing HTTP/2, and TLS 1.3 has succeeded TLS 1.2. And these are just the most notable changes. Many additional updates, such as changes to RFCs, significantly impact protocols – for example, the modification of stream scheduling algorithms in HTTP/2. In addition to the four core requirements mentioned above, two more challenges arise: the need for quick releases and the reality of a constantly growing codebase.

It’s no surprise that hyperscalers like Google are affected by these challenges at great scale. Their products are expected to be highly performant, reliable, and delivered at an incredible pace. In this context, “hyperscale” refers to “hundreds of millions of lines of C++ code,” as noted in Google’s paper, Secure by Design: Google’s Perspective on Memory Safety. In the paper, Google discusses a potential transition toward memory-safe languages. While the focus is on all C++ code, the paper highlights that approximately 70% of vulnerabilities in Chrome and Android are attributed to memory unsafety. By contrast, memory unsafety accounts for only 16-29% of vulnerabilities in server-side software, much of which is written in memory-safe languages like Java, Kotlin, and Go.

Projects like Tempesta FW, Nginx, or Envoy typically consist of 100-300 thousand lines of C or C++ code (with some assembly in Tempesta’s case). In comparison, Chromium exceeds 10 million lines of code.

The key difference lies in scale:

  • Cryptography, which involves complex mathematical operations, comprises only a few thousand lines of code but requires assembly language for high performance.
  • Web or database server engines, with hundreds of thousands of lines of code, rely on C or C++ to implement advanced memory allocators, data structures, and seamless OS integration.
  • Application software like web browsers, with millions of lines of code and still performance-sensitive, benefits from safer languages like C++ or Rust.

Like Rust, Go was also born from dissatisfaction with C++. Now Go has a strong niche in server software, particularly for monitoring systems. While it has taken some market share from C and C++, it hasn’t replaced them entirely. Similarly, Rust will likely capture a share of the market – possibly from Go, as well as from C and C++.

However, C and C++ will remain dominant in the high-performance niche. At Tempesta FW, we will continue to refine our memory safety guideline to ensure that we produce not only the fastest but also the safest code.

