WireGuard kernel module vs. user space: speed and performance
Dragan Čečavac
September 26, 2023
The WireGuard kernel module has been enabled in AOSP (Android Open Source Project) since 2020 and should, in theory, be preferred over user space implementations. The Nord Security team explored this hypothesis by running several tests that compared key metrics for the kernel module and user space implementations. This article provides a brief overview of the collected results.
Differences between user space and kernel
What is the difference between these two approaches? In terms of sending and receiving traffic, there should be no real differences. The same network and cryptographic operations are being performed in both cases, but they are occurring in different parts of the operating system.
User space is a mode intended for applications and similar software, while kernel space code runs in a privileged mode with direct access to low-level parts of the system. Direct access usually means fewer CPU instructions are executed, which can noticeably improve speed and reduce power consumption. These improvements are why device drivers, for example, usually come in the form of a kernel module.
Setup
The entire test setup was located within one physical LAN, where a laptop running Ubuntu 22.04 was used as a WireGuard and iperf3 server. Client devices were connected using the official Android WireGuard application and had to be rooted in order to utilize the kernel module.
Client device #1 was a Raspberry Pi 4 - Model B (RPi 4) running Android 13 (KonstaKANG 20230412). It had a USB-C voltmeter attached to its power connector, which was used to capture power consumption data.
Client device #2 was a NUC11PHKi7C (NUC11) running Android 12 — Bliss 15.8.5 — and was booted directly from a USB drive. Its main purpose was to provide supplementary data in cases where client device #1 did not suffice.
Tests
We were primarily interested in the differences in the following categories:
Latency - Measured by pinging 8.8.8.8
Throughput - Measured by iperf3
Power draw - Measured directly using the voltmeter
Each of these tests was performed in its relevant test modes:
Idle - No network traffic initiated
Base - Network without a VPN tunnel
Kernel - Network with a VPN tunnel that uses the kernel implementation
Userspace - Network with a VPN tunnel that uses the user space implementation
All of the tests listed, including the test mode variations, were primarily performed on client device #1 (RPi 4) using a Wi-Fi connection. The throughput test was the exception: additional data was collected using an Ethernet connection and client device #2 (NUC11).
Additionally, the WireGuard application was being displayed on the screen during all test cases. It was not moved to the background and the device screen remained turned on.
Latency
Ping latency proved to be slightly in favor of the user space implementation. The latency was measured over 18-minute intervals for each of the listed test modes.
RPi 4 - Ping
RPi 4 - Ping - Increase relative to base
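An "increase relative to base" figure of this kind can be derived from raw round-trip times with a few lines of arithmetic. The sketch below is only an illustration: the function name `relative_increase` and the sample RTT values are hypothetical and are not the measurements collected in these tests.

```python
from statistics import mean


def relative_increase(base: float, measured: float) -> float:
    """Return how much `measured` exceeds `base`, as a percentage of `base`."""
    if base <= 0:
        raise ValueError("base must be positive")
    return (measured - base) / base * 100.0


# Hypothetical RTT samples in milliseconds (illustrative only, not test data).
rtt_ms = {
    "base": [10.0, 12.0, 11.0],
    "kernel": [12.0, 13.0, 14.0],
    "userspace": [11.0, 13.0, 12.0],
}

base_avg = mean(rtt_ms["base"])
for mode in ("kernel", "userspace"):
    increase = relative_increase(base_avg, mean(rtt_ms[mode]))
    print(f"{mode}: {increase:.1f}% above base")
```

Averaging each mode first and then comparing against the base average keeps a single outlier ping from dominating the comparison.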
Throughput
The initial iperf3 throughput comparison did not show significant differences, although based on our experience with other platforms, we expected to see some. The results suggested that we had hit a limitation unrelated to WireGuard and the VPN, and that in this particular case the CPU was able to offset most of the overhead, which made the implementation type less relevant in the throughput context.
RPi 4 - Iperf3 - WiFi
RPi 4 - WiFi throughput - decrease relative to base
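For readers who want to reproduce a throughput comparison like this, iperf3 can emit its report as JSON via the `--json` flag. The sketch below assumes a TCP test, where the receiver-side rate is reported under `end.sum_received.bits_per_second`. The helper names and the trimmed sample reports are hypothetical, and real reports contain many more fields.

```python
import json


def received_mbps(iperf3_json: str) -> float:
    """Extract the receiver-side throughput in Mbit/s from an iperf3 JSON report."""
    report = json.loads(iperf3_json)
    return report["end"]["sum_received"]["bits_per_second"] / 1e6


def relative_decrease(base: float, measured: float) -> float:
    """Return how far `measured` falls below `base`, as a percentage of `base`."""
    if base <= 0:
        raise ValueError("base must be positive")
    return (base - measured) / base * 100.0


# Trimmed, hypothetical reports (illustrative only, not test data).
base_report = '{"end": {"sum_received": {"bits_per_second": 100000000}}}'
vpn_report = '{"end": {"sum_received": {"bits_per_second": 90000000}}}'

base = received_mbps(base_report)
vpn = received_mbps(vpn_report)
print(f"base: {base:.1f} Mbit/s, vpn: {vpn:.1f} Mbit/s")
print(f"decrease relative to base: {relative_decrease(base, vpn):.1f}%")
```

Reading the receiver-side sum rather than the sender-side one reflects what actually arrived at the other end, which is the more conservative number when comparing tunnel implementations.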
For the next step, we performed the same test using the Ethernet interface, which yielded results in favor of the WireGuard kernel implementation. The base throughput also increased significantly.
RPi 4 - Iperf3 - Ethernet
RPi 4 - Ethernet throughput - relative to base
On the Android platform, Ethernet is not used as commonly as Wi-Fi, so we wanted to confirm that Wi-Fi can also benefit in this regard. Other online resources, including MagPi, suggested that the Wi-Fi limitation we had hit was due to Raspberry Pi 4 hardware, so the Wi-Fi throughput test was repeated using client device #2 (NUC11). This time the overall variation increased, but with significantly more processing power available, the differences between the kernel and user space implementations were not as stark as they were in the RPi 4 Ethernet throughput test.
NUC11 - Iperf3 - WiFi
NUC11 - WiFi throughput - decrease relative to base
Power draw
Power consumption was measured over 1-hour intervals for each of the listed modes. A Wi-Fi download was running at maximum speed throughout, and the previously discussed RPi 4 Wi-Fi limitation worked in our favor for this particular test.
RPi 4 - Power consumption [mWh]
RPi 4 - Power consumption overhead - relative to idle
Since power draw can be measured even when no download is taking place, the plain network power consumption can be calculated by subtracting the idle value from the base value. This lets us take a closer look and highlight the actual VPN power consumption overhead relative to the power spent on downloading in the base case.
RPi 4 - Power consumption overhead - relative to WiFi consumption
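The subtraction described above can be written out as a short calculation. This is a sketch of one reading of the method: network consumption is taken as base minus idle, and the VPN overhead as the VPN-mode reading minus base. The function name and the mWh values are hypothetical and are not the measured results.

```python
def vpn_overhead_vs_network(idle_mwh: float, base_mwh: float, vpn_mwh: float) -> float:
    """Return the VPN energy overhead as a percentage of plain network consumption.

    Plain network consumption is assumed to be base minus idle;
    VPN overhead is assumed to be the VPN-mode reading minus base.
    """
    network_mwh = base_mwh - idle_mwh
    if network_mwh <= 0:
        raise ValueError("base reading must exceed idle reading")
    return (vpn_mwh - base_mwh) / network_mwh * 100.0


# Hypothetical 1-hour energy readings in mWh (illustrative only, not test data).
idle, base = 3000.0, 3500.0
readings = {"kernel": 3550.0, "userspace": 3700.0}

for mode, value in readings.items():
    overhead = vpn_overhead_vs_network(idle, base, value)
    print(f"{mode}: {overhead:.1f}% overhead relative to Wi-Fi consumption")
```

Expressing the overhead against network consumption rather than against total device consumption avoids diluting the VPN cost with the power the device draws while doing nothing.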
Conclusion
Overall, the WireGuard kernel implementation has shown staggering improvements in throughput and power consumption, at the cost of a minor latency increase.
Utilizing the WireGuard kernel module could provide better download speeds to all Android users and increase their battery life while still leaving them protected within a VPN tunnel.
Server compatibility would be preserved, and the implementation switch could be introduced without adding complexity to the user experience.
Results
Having the WireGuard kernel module available in the Android kernel is just the first step of this journey. Additional effort will be required on both the system and application sides to make it available to everyone without needing a rooted device. We look forward to seeing more progress in the future.