Add Final Displacement Error (FDE) and minFDE metrics #101
Open
lonexreb wants to merge 1 commit into NVlabs:main from
ADE / minADE measure average L2 error along the trajectory, but the
companion metric for trajectory prediction -- Final Displacement Error
-- was missing. FDE is the L2 distance between predicted and
ground-truth positions at the *last* timestep, and minFDE is its K-min.
ADE alone understates models that drift early but recover, and
overstates models that nail the early steps but veer off at the end.
This PR adds two functions to src/alpamayo_r1/metrics/distance_metrics.py
that mirror the existing compute_ade / compute_minade APIs exactly:
- compute_fde(pred_xyz, gt_xyz, timestep_horizon=None, only_xy=True)
-> [B, N, K] tensor of FDE per sample. timestep_horizon=H reports
FDE at the H-th timestep (1-indexed) instead of the final one,
enabling FDE@3s and similar single-horizon reports.
- compute_minfde(pred_xyz, gt_xyz, disable_summary=False,
timestep_horizons=[5, 10, 30, 50], only_xy=True, time_step=0.1)
-> dict[str, Tensor] with `min_fde` plus `min_fde/by_t={H:.1f}` per
valid horizon, plus `_std` variants from summarize_metric when N > 1.
Mirrors compute_minade key naming exactly.
compute_fde raises ValueError on out-of-range horizons, while
compute_minfde silently skips them, matching the existing ADE
behavior so callers can swap with no surprises.
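
As a rough illustration of the semantics described above (not the code in this PR), here is a single-sample, pure-Python sketch. The names fde_single and min_fde_single are hypothetical, the PR's functions operate on batched torch tensors rather than lists, and the key formula t * time_step is an assumption inferred from the FDE@3s example and the default horizons.

```python
import math


def fde_single(pred, gt, timestep_horizon=None, only_xy=True):
    """FDE for one predicted trajectory against one ground truth.

    pred, gt: sequences of (x, y, z) points, one per timestep.
    timestep_horizon: 1-indexed timestep to evaluate; None means the last one.
    """
    T = len(gt)
    if timestep_horizon is None:
        timestep_horizon = T
    if timestep_horizon > T:
        raise ValueError(f"timestep_horizon={timestep_horizon} exceeds T={T}")
    dims = 2 if only_xy else 3
    idx = timestep_horizon - 1  # 1-indexed horizon -> 0-indexed position
    return math.dist(pred[idx][:dims], gt[idx][:dims])


def min_fde_single(preds, gt, timestep_horizons=(5, 10, 30, 50),
                   time_step=0.1, only_xy=True):
    """K-min FDE over candidate trajectories, plus per-horizon entries."""
    T = len(gt)
    out = {"min_fde": min(fde_single(p, gt, None, only_xy) for p in preds)}
    for t in timestep_horizons:
        if t > T:
            continue  # out-of-range horizons are skipped, not raised
        # Assumption: the key reports the horizon in seconds.
        key = f"min_fde/by_t={t * time_step:.1f}"
        out[key] = min(fde_single(p, gt, t, only_xy) for p in preds)
    return out


# Hypothetical 10-step ground truth along the x-axis.
gt = [(float(i), 0.0, 0.0) for i in range(1, 11)]
offset = [(x, y + 1.0, z) for x, y, z in gt]   # constant 1 m lateral error
late_miss = gt[:-1] + [(10.0, 3.0, 0.0)]       # exact until the last step
result = min_fde_single([offset, late_miss], gt)
print(result)
```

With T = 10, the horizons 30 and 50 are dropped, so only the 0.5 s and 1.0 s entries appear next to min_fde.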
Also adds src/alpamayo_r1/metrics/test_fde.py with 11 pytest cases
covering: shape, non-negativity, manual L2 equivalence, only_xy=False
3D path, horizon selection at index t-1, ValueError on horizon > T,
independence from intermediate timesteps, key emission and _std gating
in compute_minfde, and that minFDE actually takes the K-min.
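
The K-min and _std gating that these tests check can be pictured with a small pure-Python sketch for one batch element. The helper name summarize_min_fde is hypothetical. The use of the sample standard deviation is an assumption; the behavior of summarize_metric is not shown in this PR description.

```python
import statistics


def summarize_min_fde(fde_nk, disable_summary=False):
    """fde_nk: N groups, each a list of K per-mode FDE values."""
    per_group_min = [min(group) for group in fde_nk]      # K-min per group
    out = {"min_fde": statistics.fmean(per_group_min)}    # mean over N groups
    if len(per_group_min) > 1 and not disable_summary:
        # Assumption: sample (n - 1) standard deviation across the N groups.
        out["min_fde_std"] = statistics.stdev(per_group_min)
    return out


# Two groups (N = 2) of three modes each (K = 3); values are made up.
fde_nk = [[2.0, 1.0, 3.0], [4.0, 3.0, 5.0]]
summary = summarize_min_fde(fde_nk)
print(summary)
```

Here the per-group minima are 1.0 and 3.0, so min_fde is their mean, 2.0, and the _std entry appears only because N > 1 and disable_summary is False.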
Verified locally against torch (without GPU/HF) by exec'ing the new
functions in isolation -- all 6 functional invariants pass.
Signed-off-by: lonexreb <[email protected]>
Why
alpamayo_r1.metrics.distance_metrics already implements ADE / minADE / corner_distance, but the standard companion metric for trajectory prediction, Final Displacement Error, is missing.
ADE measures average L2 error along the trajectory; FDE measures L2 error at the last timestep. Reporting one without the other distorts the picture in two opposite ways:
- ADE alone under-reports models that drift early but recover.
- ADE alone over-reports models that nail the early steps but veer off at the end.
Every major trajectory-prediction benchmark (Waymo Open Motion, nuScenes, Argoverse) reports both. Adding FDE here closes a real evaluation gap and aligns Alpamayo's metric outputs with the literature.
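
A toy example makes the ADE / FDE disagreement concrete. This is a minimal pure-Python sketch with made-up 2D trajectories; the helper names ade and fde are illustrative and are not the functions added by this PR.

```python
import math


def ade(pred, gt):
    """Mean L2 distance over all timesteps."""
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)


def fde(pred, gt):
    """L2 distance at the last timestep only."""
    return math.dist(pred[-1], gt[-1])


# Hypothetical 4-step ground truth moving along the x-axis.
gt = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]

# Drifts sideways early, then recovers onto the ground truth.
recovers = [(1.0, 1.0), (2.0, 1.0), (3.0, 0.0), (4.0, 0.0)]

# Tracks the early steps exactly, then veers off at the end.
veers = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 1.0)]

print(ade(recovers, gt), fde(recovers, gt))  # 0.5 0.0
print(ade(veers, gt), fde(veers, gt))        # 0.25 1.0
```

ADE prefers the trajectory that veers off (0.25 vs 0.5), while FDE prefers the one that recovers (0.0 vs 1.0), so the two metrics rank these predictions in opposite order.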
What
Two new functions in src/alpamayo_r1/metrics/distance_metrics.py that mirror compute_ade / compute_minade exactly: same shapes, same kwargs, same error semantics.

compute_fde(pred_xyz, gt_xyz, timestep_horizon=None, only_xy=True) -> torch.Tensor

Returns [B, N, K]. timestep_horizon=H evaluates FDE at the H-th timestep (1-indexed) instead of the final one, which is useful for reporting FDE@3s. Raises ValueError on timestep_horizon > T (mirroring compute_ade).

compute_minfde(pred_xyz, gt_xyz, disable_summary=False, timestep_horizons=[5, 10, 30, 50], only_xy=True, time_step=0.1) -> dict[str, Tensor]

Returns per-batch tensors of shape [B]:
- min_fde: K-min FDE at the final timestep, averaged over the N groups.
- min_fde/by_t={H:.1f}: K-min FDE at the timestep int(H / time_step), for each valid t in timestep_horizons. Out-of-range horizons are silently dropped (mirrors compute_minade).
- <key>_std: stdev across the N groups, added by summarize_metric when N > 1 and disable_summary=False.

Key naming is parallel to compute_minade, so dashboards / loggers that already group on min_ade* automatically pick up min_fde*.

Tests
src/alpamayo_r1/metrics/test_fde.py: 11 pytest cases, no GPU / no HF auth:
- compute_fde shape, non-negativity, manual final-step L2 equivalence.
- only_xy=False uses the 3D vector and returns a value >= the XY-projection FDE.
- timestep_horizon=H selects index H-1.
- timestep_horizon > T raises ValueError.
- compute_minfde emits all documented keys + _std variants when N > 1.
- Out-of-range horizons in compute_minfde are silently skipped.
- disable_summary=True drops _std.
- min_fde equals compute_fde(...).min(K).mean(N).

Local verification
Validated against torch (no GPU/HF needed) by exec'ing the new functions in isolation: all 6 core invariants pass. Output:

Migration
None. This is purely additive: existing call sites that import compute_ade / compute_minade are untouched.