When VPP handles IPsec dataplane traffic, it maintains its own per-SA byte and packet counters internally. The Linux kernel’s XFRM subsystem, however, never sees those packets and has no way to update its own accounting. The result: ip -s xfrm state reports zero or stale counters, and any Linux-side tooling that depends on XFRM accounting — monitoring dashboards, external rekey decisions based on traffic volume, soft/hard lifetime tracking — sees no activity at all.
In this post I describe how the XFRM plugin for VPP now periodically pushes VPP’s SA counters into the Linux kernel using XFRM_MSG_NEWAE netlink messages, closing this visibility gap.
Background: XFRM Accounting Events
The Linux XFRM subsystem provides a dedicated netlink message type for updating SA accounting data: XFRM_MSG_NEWAE. Originally designed for HA migration of SA state between machines, NEWAE allows user space to inject lifetime counters directly into kernel XFRM states.
The message structure follows the pattern:
nlmsghdr : xfrm_aevent_id : optional TLVs
The xfrm_aevent_id header identifies the target SA: its embedded sa_id holds the destination address, SPI, address family, and protocol, and the header also carries the source address and reqid. The TLV we care about is XFRMA_LTIME_VAL, which carries an xfrm_lifetime_cur structure with byte and packet counters.
One important kernel semantic: XFRM_MSG_NEWAE with NLM_F_REPLACE performs assignment, not addition. The kernel’s xfrm_update_ae_params() does x->curlft.bytes = ltime->bytes — a straight overwrite. This means we must always send cumulative totals, not deltas.
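As a small illustration of why the overwrite semantics force cumulative totals, the sketch below models the update as a plain assignment and replays two sync rounds. The names model_curlft_t, model_newae_apply() and model_replay() are hypothetical stand-ins invented for this example; they are not kernel or plugin code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's per-SA current lifetime counters. */
typedef struct
{
  uint64_t bytes;
  uint64_t packets;
} model_curlft_t;

/* Models the NEWAE update as described above: assignment, not addition. */
static void
model_newae_apply (model_curlft_t *curlft, uint64_t bytes, uint64_t packets)
{
  assert (curlft != NULL);
  curlft->bytes = bytes;
  curlft->packets = packets;
}

/* Replays two sync rounds (1000 B / 50 pkts, then 200 B / 10 pkts more)
 * and returns the byte count the modelled kernel state ends up with. */
static uint64_t
model_replay (int send_cumulative)
{
  model_curlft_t k = { 0, 0 };
  uint64_t total_bytes = 0, total_packets = 0;
  const uint64_t delta_bytes[] = { 1000, 200 };
  const uint64_t delta_packets[] = { 50, 10 };

  for (int i = 0; i < 2; i++)
    {
      total_bytes += delta_bytes[i];
      total_packets += delta_packets[i];
      if (send_cumulative)
        model_newae_apply (&k, total_bytes, total_packets);
      else
        model_newae_apply (&k, delta_bytes[i], delta_packets[i]);
    }
  return k.bytes;
}
```

With send_cumulative set, the modelled state ends at the true total; with deltas, each message overwrites the previous one and only the last delta survives.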
The Problem With VPP Counter Resets
The XFRM plugin already had a check_for_expiry() function that monitors VPP’s per-SA counters to detect when soft or hard byte/packet lifetime limits are reached. When a soft limit is hit, the plugin sends an XFRM_MSG_EXPIRE to the kernel (triggering strongSwan to initiate rekeying) and then zeroes the VPP counter so the same threshold can fire again for the next lifetime window.
This counter zeroing creates a challenge for NEWAE sync: if we simply read VPP’s current counter and push it to the kernel, we’d lose all the traffic that was counted before the last reset. We need a mechanism to preserve the running total across these resets.
Design Overview
The solution consists of four components:
- Per-SA sync state: a set of fields embedded in the existing sa_life_limits_t structure that tracks cumulative counters and last-synced values
- Counter accumulation: a hook in check_for_expiry() that snapshots the current counter into a running total before zeroing
- NEWAE message builder: constructs and sends the raw netlink message with the correct SA identification and lifetime TLV
- Periodic sync loop: a time-gated call in the existing VPP process node that pushes updated counters at a configurable interval
Here is how these components interact:
                 VPP Dataplane
                       |
             SA byte/packet counters
                       |
          +------------+-------------+
          |                          |
  check_for_expiry()       lcp_xfrm_sync_counters()
  (every 2s wake)          (every N seconds)
          |                          |
  On soft/hard expiry:     For each SA:
  1. send EXPIRE           1. total = cumulative + live counter
  2. accumulate into       2. skip if unchanged
     cumulative            3. send XFRM_MSG_NEWAE
  3. zero VPP counter         with total
                           4. record last_synced
Both functions execute in the same VPP process node on the main thread, so there are no race conditions between accumulation and sync.
Per-SA Sync State
Each SA already has a sa_life_limits_t structure that stores soft/hard byte and packet thresholds. The counter sync state is embedded directly into this structure:
typedef struct sa_counter_sync
{
  u64 cumulative_bytes;   /* running total, survives VPP counter zeroing */
  u64 cumulative_packets;
  u64 last_synced_bytes;  /* value at last NEWAE send */
  u64 last_synced_packets;
} sa_counter_sync_t;

typedef struct sa_life_limits
{
  u64 soft_byte_limit;
  u64 hard_byte_limit;
  u64 soft_packet_limit;
  u64 hard_packet_limit;
  u32 sa_id;
  u32 reqid;
  int tun_sw_if_idx;
  u8 sa_in_tunnel;
  sa_counter_sync_t sync; /* XFRM counter sync state */
} sa_life_limits_t;
The cumulative_* fields hold the running total of all traffic counted before each VPP counter reset. The last_synced_* fields record what was last pushed via NEWAE, allowing us to suppress no-op updates for idle SAs.
Counter Accumulation
The key integration point is in check_for_expiry(). Before the existing vlib_zero_combined_counter() call that resets the VPP counter after sending an expire message, we fold the current counter snapshot into the cumulative totals:
if (rv)
  {
    lcp_xfrm_counter_sync_accumulate (&life->sync, &count);
    vlib_zero_combined_counter (&ipsec_sa_counters, sa->stat_index);
  }
Where the accumulate function simply adds the current snapshot:
static inline void
lcp_xfrm_counter_sync_accumulate (sa_counter_sync_t *sync,
                                  vlib_counter_t *count)
{
  sync->cumulative_bytes += count->bytes;
  sync->cumulative_packets += count->packets;
}
This ensures no traffic is lost across counter resets. To verify correctness, consider this sequence:
- Traffic flows, VPP counter reaches {1000 bytes, 50 packets}
- Soft expiry fires: accumulate saves {1000, 50} into cumulative, then VPP counter is zeroed
- More traffic arrives, VPP counter reaches {200, 10}
- Sync computes: total = cumulative {1000, 50} + live {200, 10} = {1200, 60}, which is correct
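The sequence above can be replayed in a few lines of standalone C. This is only a sketch: demo_counter_t and demo_sync_t are simplified stand-ins (assumptions) for VPP's vlib_counter_t and the plugin's sa_counter_sync_t, and demo_soft_expiry() folds the accumulate and zero steps into one helper for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for vlib_counter_t (assumption: two fields suffice). */
typedef struct
{
  uint64_t packets;
  uint64_t bytes;
} demo_counter_t;

/* Simplified stand-in for the cumulative half of sa_counter_sync_t. */
typedef struct
{
  uint64_t cumulative_bytes;
  uint64_t cumulative_packets;
} demo_sync_t;

/* Soft-expiry path: fold the live counter into the running total, then
 * zero it, mirroring accumulate + vlib_zero_combined_counter(). */
static void
demo_soft_expiry (demo_sync_t *sync, demo_counter_t *live)
{
  assert (sync != NULL && live != NULL);
  sync->cumulative_bytes += live->bytes;
  sync->cumulative_packets += live->packets;
  live->bytes = 0;
  live->packets = 0;
}

/* Value the sync loop would report: cumulative + live. */
static demo_counter_t
demo_total (const demo_sync_t *sync, const demo_counter_t *live)
{
  demo_counter_t t;
  t.bytes = sync->cumulative_bytes + live->bytes;
  t.packets = sync->cumulative_packets + live->packets;
  return t;
}

/* Replays the four steps listed above and returns the reported total. */
static demo_counter_t
demo_replay_sequence (void)
{
  demo_sync_t sync = { 0, 0 };
  demo_counter_t live = { 0, 0 };

  live.bytes = 1000;               /* traffic before the soft expiry */
  live.packets = 50;
  demo_soft_expiry (&sync, &live); /* counter is zeroed here */
  live.bytes += 200;               /* traffic after the reset */
  live.packets += 10;
  return demo_total (&sync, &live);
}
```

Calling demo_replay_sequence() yields {1200 bytes, 60 packets}, matching the hand calculation.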
Building the NEWAE Message
The lcp_xfrm_build_newae_msg() function constructs the raw netlink message. Here is the wire format:
Offset  Size  Field
------  ----  -----
0       16    nlmsghdr (type=XFRM_MSG_NEWAE, flags=NLM_F_REQUEST|NLM_F_REPLACE)
16      48    xfrm_aevent_id (sa_id: daddr+spi+family+proto, saddr, reqid,
              flags=XFRM_AE_LVAL)
64      4     nlattr (nla_len=36, nla_type=XFRMA_LTIME_VAL)
68      32    xfrm_lifetime_cur (bytes, packets, add_time=0, use_time=0)
        ------
Total: 100 bytes
The SA is identified by the same tuple the kernel uses: SPI, protocol (ESP/AH), address family, destination address, source address, and reqid. The XFRM_AE_LVAL flag in ae_id->flags tells the kernel this message carries lifetime values. The NLM_F_REPLACE flag in the netlink header instructs the kernel to overwrite existing counter values rather than treating this as a new SA event.
The message is sent unicast to the kernel (nl_groups=0) through the same XFRM netlink socket the plugin already uses for expire messages.
One implementation detail worth noting: the message is built as a raw byte buffer rather than using struct composition. This avoids compiler alignment padding between struct nlattr (4 bytes) and struct xfrm_lifetime_cur (which contains u64 fields) that would break the NLA wire format expected by the kernel.
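To make the layout concrete, here is a minimal sketch of a raw-buffer builder using the Linux UAPI headers. It is not the plugin's lcp_xfrm_build_newae_msg(); the names sketch_build_newae(), NEWAE_MSG_LEN and struct padded_attr are hypothetical, and the caller is assumed to have zeroed and filled the xfrm_aevent_id (with the SPI already in network byte order). Sending the buffer over the netlink socket is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>    /* AF_INET / AF_INET6 for callers */
#include <linux/netlink.h> /* nlmsghdr, nlattr, NLM_F_* */
#include <linux/xfrm.h>    /* xfrm_aevent_id, xfrm_lifetime_cur */

/* 16 (nlmsghdr) + 48 (xfrm_aevent_id) + 4 (nlattr) + 32 (lifetime) = 100 */
#define NEWAE_MSG_LEN                                                         \
  ((size_t) NLMSG_HDRLEN + sizeof (struct xfrm_aevent_id) +                   \
   (size_t) NLA_HDRLEN + sizeof (struct xfrm_lifetime_cur))

/* What struct composition would produce: where u64 is 8-byte aligned, the
 * compiler pads after nla, so this is larger than the 36 bytes on the wire. */
struct padded_attr
{
  struct nlattr nla;
  struct xfrm_lifetime_cur lt;
};

/* Writes a NEWAE message into buf; returns its length, or 0 if buf is too
 * small. 'id' must be zero-initialised and filled in by the caller. */
static size_t
sketch_build_newae (uint8_t *buf, size_t buf_len,
                    const struct xfrm_aevent_id *id, uint64_t bytes,
                    uint64_t packets)
{
  struct nlmsghdr nlh;
  struct xfrm_aevent_id ae;
  struct nlattr nla;
  struct xfrm_lifetime_cur lt;
  size_t off = 0;

  assert (buf != NULL && id != NULL);
  if (buf_len < NEWAE_MSG_LEN)
    return 0;
  memset (buf, 0, NEWAE_MSG_LEN);

  memset (&nlh, 0, sizeof (nlh));
  nlh.nlmsg_len = (uint32_t) NEWAE_MSG_LEN;
  nlh.nlmsg_type = XFRM_MSG_NEWAE;
  nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_REPLACE;

  ae = *id;
  ae.flags = XFRM_AE_LVAL; /* message carries lifetime values */

  nla.nla_len = (uint16_t) (NLA_HDRLEN + sizeof (lt));
  nla.nla_type = XFRMA_LTIME_VAL;

  memset (&lt, 0, sizeof (lt)); /* add_time = use_time = 0 */
  lt.bytes = bytes;
  lt.packets = packets;

  /* Copy each piece at its wire offset; no padding between nla and lt. */
  memcpy (buf + off, &nlh, sizeof (nlh));
  off += NLMSG_HDRLEN;
  memcpy (buf + off, &ae, sizeof (ae));
  off += sizeof (ae);
  memcpy (buf + off, &nla, sizeof (nla));
  off += NLA_HDRLEN;
  memcpy (buf + off, &lt, sizeof (lt));
  off += sizeof (lt);
  return off;
}
```

Copying each piece with memcpy() at an explicit offset keeps the attribute header and its payload contiguous, which is the point of avoiding struct composition here.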
The Sync Loop
The sync function iterates all SAs in VPP’s SA pool, computes the total counter (cumulative + live), and sends NEWAE only for SAs whose counters have changed since the last push:
static void
lcp_xfrm_sync_counters (void)
{
  sa_life_limits_t *life;
  vlib_counter_t count;
  ipsec_sa_t *sa;
  ipsec_main_t *im = &ipsec_main;

  pool_foreach (sa, im->sa_pool)
    {
      /* look up our per-SA lifetime/sync tracking */
      life = ...;

      vlib_get_combined_counter (&ipsec_sa_counters, sa->stat_index, &count);
      u64 total_bytes = life->sync.cumulative_bytes + count.bytes;
      u64 total_packets = life->sync.cumulative_packets + count.packets;

      /* skip idle SAs */
      if (total_bytes == life->sync.last_synced_bytes &&
          total_packets == life->sync.last_synced_packets)
        continue;

      /* record the pushed values only if the send succeeded */
      if (lcp_xfrm_build_newae_msg (sa, life, total_bytes, total_packets))
        {
          life->sync.last_synced_bytes = total_bytes;
          life->sync.last_synced_packets = total_packets;
        }
    }
}
The idle-SA check is important: without it, every sync cycle would send NEWAE messages for all SAs, including those with no traffic. With hundreds of SAs, this would generate unnecessary netlink traffic.
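The suppression logic can be exercised in isolation with a small model. In the sketch below, idle_sync_state_t mirrors the four sync fields, while idle_sync_one(), idle_send_ok() and idle_send_fail() are hypothetical helpers standing in for one loop iteration and for the netlink send; none of them are plugin functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the four per-SA sync fields (stand-in for sa_counter_sync_t). */
typedef struct
{
  uint64_t cumulative_bytes;
  uint64_t cumulative_packets;
  uint64_t last_synced_bytes;
  uint64_t last_synced_packets;
} idle_sync_state_t;

/* Hypothetical send hook: returns nonzero on success. */
typedef int (*idle_send_fn) (uint64_t bytes, uint64_t packets);

static int
idle_send_ok (uint64_t bytes, uint64_t packets)
{
  (void) bytes;
  (void) packets;
  return 1;
}

static int
idle_send_fail (uint64_t bytes, uint64_t packets)
{
  (void) bytes;
  (void) packets;
  return 0;
}

/* One loop iteration for one SA. Returns 1 if a message was sent and
 * recorded, 0 if the SA was idle or the send failed. */
static int
idle_sync_one (idle_sync_state_t *s, uint64_t live_bytes,
               uint64_t live_packets, idle_send_fn send)
{
  assert (s != NULL && send != NULL);
  uint64_t total_bytes = s->cumulative_bytes + live_bytes;
  uint64_t total_packets = s->cumulative_packets + live_packets;

  /* Idle SA: the total has not moved since the last successful push. */
  if (total_bytes == s->last_synced_bytes &&
      total_packets == s->last_synced_packets)
    return 0;

  /* On failure last_synced_* stays put, so the next cycle retries. */
  if (!send (total_bytes, total_packets))
    return 0;

  s->last_synced_bytes = total_bytes;
  s->last_synced_packets = total_packets;
  return 1;
}
```

Two properties fall out of comparing totals rather than live counters: a counter reset with no new traffic does not trigger a message, and a failed send is retried on the next cycle because last_synced_* is only updated on success.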
Scheduling
Rather than creating a new VPP process node, the counter sync piggybacks on the existing ipsec_xfrm_expire_process that already wakes every 2 seconds to check for SA lifetime expiry. A separate timer gates the NEWAE sends at a configurable interval:
uword
ipsec_xfrm_expire_process (vlib_main_t *vm, ...)
{
  f64 last_sync_time = 0;

  while (1)
    {
      vlib_process_wait_for_event_or_clock (vm, 2);
      vlib_process_get_events (vm, NULL);

      check_for_expiry ();

      /* an interval of 0 disables counter sync */
      if (nm->counter_sync_interval_s > 0)
        {
          f64 now = vlib_time_now (vm);
          if ((now - last_sync_time) >= (f64) nm->counter_sync_interval_s)
            {
              lcp_xfrm_sync_counters ();
              last_sync_time = now;
            }
        }
    }
}
The default sync interval is 10 seconds. Because the process node wakes every 2 seconds, a sync fires on the first wake-up at or after the configured interval, so it runs at most about 2 seconds late. This is a control-plane-only operation running in the VPP main thread, so it stays out of the packet-processing path on the worker threads.
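The interaction between the 2-second wake-up and the configured interval can be checked with a small simulation. This is a sketch under simplifying assumptions: wake-ups are treated as exactly periodic (in VPP an event can wake the process earlier), time starts at 0, and demo_sync_due() and demo_count_syncs() are hypothetical names.

```c
#include <assert.h>

/* Time gate: nonzero when a sync is due. interval_s == 0 disables sync. */
static int
demo_sync_due (double now, double last_sync_time, unsigned interval_s)
{
  assert (now >= last_sync_time);
  return interval_s > 0 && (now - last_sync_time) >= (double) interval_s;
}

/* Simulates a process that wakes every wake_s seconds for duration_s
 * seconds and returns how many syncs the gate lets through. */
static unsigned
demo_count_syncs (unsigned interval_s, double wake_s, double duration_s)
{
  double last = 0.0;
  unsigned n = 0;

  for (double now = wake_s; now <= duration_s; now += wake_s)
    {
      if (demo_sync_due (now, last, interval_s))
        {
          n++;
          last = now;
        }
    }
  return n;
}
```

Under these assumptions, an interval of 10 over 60 seconds gives 6 syncs, while an interval of 5 gives 10 rather than 12, because each sync waits for the next 2-second wake-up and the effective period becomes 6 seconds. Intervals that are a multiple of the wake period behave exactly as configured.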
Configuration
The sync interval is configurable via VPP’s startup configuration:
linux-cp {
  ...
  counter-sync-interval 10
}
Setting the value to 0 disables counter synchronization entirely. The default is 10 seconds, which provides a reasonable balance between counter freshness and netlink message overhead.
Bonus Fix: Memory Leak in send_nl_msg()
While implementing NEWAE support, a pre-existing memory leak was discovered in send_nl_msg(): the nlmsg object allocated by nlmsg_alloc_simple() was never freed after sending. Since send_nl_msg() is called for every expire message and now for every NEWAE message, this leak would grow over time proportional to SA activity. The fix is a single nlmsg_free(nlmsg) call after nl_sendmsg().
Verification
After deploying the change, counter sync can be verified in several ways:
Check XFRM state counters:
watch -n1 ip -s xfrm state
The byte and packet counters should now update every sync interval instead of showing zeros.
Monitor NEWAE events in real time:
ip xfrm monitor
You should see accounting events appearing at the configured interval for SAs with active traffic.
Verify idle SA suppression: SAs with no traffic since the last sync should not generate NEWAE messages.
Test with counter-sync-interval 0: Disabling the feature should stop all NEWAE messages while leaving expire functionality unaffected.
Summary
| Aspect | Detail |
|---|---|
| Message type | XFRM_MSG_NEWAE with NLM_F_REPLACE |
| Sync interval | Configurable, default 10 seconds |
| Counter semantics | Cumulative totals (kernel does assignment, not addition) |
| Counter reset handling | Accumulation before each VPP counter zero |
| Idle SA optimization | Skip NEWAE when counters unchanged |
| Performance impact | Zero dataplane impact (control-plane process node only) |
| Files modified | 3 files, ~150 lines added |
The counter sync feature closes an important observability gap in VPP-based IPsec deployments. Linux tools and monitoring systems that rely on XFRM accounting data now see accurate, up-to-date counters, and external lifetime management decisions based on traffic volume work correctly.




















