Training / Generation / Efficiency
AGVBench: A Reliability-Oriented Benchmark of Data Augmentation for Vein Recognition
Authors: Haiyang Li, Yuming Fu, Qun Song, Hongchao Liao, Jing Chen, Mounim A. El-Yacoubi, Xin Jin
arXiv ID: 2607.02271
Problem: Data augmentations inherited from natural image tasks can disrupt the fine-grained vascular topology and textures critical for identity discrimination in vein recognition.
Key Methodology:
- Evaluated 30 augmentation strategies across 5 public palm/finger-vein datasets (VERA220, TJU600, SCUT1100, FV-USM, SDUMLA-HMT) with 7 backbone architectures (ResNet18, MobileNetV2, ViT-S, Swin-T, plus vein-specific FVRASNet, AMPVNet, StarLKNet-S)
- Multi-dimensional assessment across recognition performance, calibration (ECE), corruption robustness (19 corruptions at 3 severity levels), adversarial robustness (FGSM & PGD white-box attacks at ϵ=0.2/255), occlusion robustness (0–50% masking), and computational efficiency
- Introduces a Pareto-based APEX rank to jointly evaluate the trade-offs between accuracy and training time, memory, and FLOPs
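For readers unfamiliar with the calibration metric named above, the following is a minimal sketch of expected calibration error (ECE) using equal-width confidence bins. The function name, the default of 10 bins, and the half-open binning convention are assumptions made for illustration; the benchmark's exact ECE configuration is not stated in this summary.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE = sum over bins of (bin weight) * |bin accuracy - bin mean confidence|.

    confidences: top-1 predicted probabilities in [0, 1]
    correct:     truthy where the top-1 prediction was right
    n_bins:      number of equal-width bins (assumed default, not from the paper)
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # Assign each prediction to a bin [k/n_bins, (k+1)/n_bins);
    # a confidence of exactly 1.0 is placed in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1.0 if hit else 0.0))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece
```

A model that is 90% confident on four predictions but right on only three of them has an ECE of 0.15 under this definition, so lower values indicate better calibration.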
Key Results:
- Multi-image mixing methods dominate clean accuracy: PuzzleMix 95.55% (VERA220, R18), MixUp 95.27% (VERA220, R18), StarMixup 96.27% (VERA220, APN), and PuzzleMix 96.02% (TJU600, SLK-S)
- These same top-accuracy mixup methods show poor calibration (high ECE) and severe vulnerability to FGSM/PGD adversarial attacks, a clear accuracy–security decoupling
- Severe geometric transforms (Flip, Rotate, Translation) consistently degrade recognition across datasets
- Vein models suffer catastrophic performance collapse even at the lowest corruption severities (levels C1/C2), motivating custom low-intensity corruption tiers
- Label enhancement methods (LabelSmoothing, DirichletLabelSmooth) offer strong calibration benefits while staying competitive on recognition (e.g., LabelSmoothing reaches 94.88% on TJU600 R18 with competitive EER)
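The two families contrasted in these results differ in what they modify: multi-image mixing blends both inputs and targets, while label enhancement softens only the target. The sketch below shows the standard textbook forms of MixUp and label smoothing on flattened images represented as plain lists. The function names and the default values for `eps` and `alpha` are assumptions for illustration, not the benchmark's settings, and PuzzleMix, StarMixup, and DirichletLabelSmooth are not reproduced here.

```python
import random


def one_hot(label, num_classes):
    """Hard target vector for a single identity label."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def smooth_labels(label, num_classes, eps=0.1):
    """Standard label smoothing: (1 - eps) * one_hot + eps / num_classes."""
    return [(1.0 - eps) * v + eps / num_classes
            for v in one_hot(label, num_classes)]


def sample_lambda(alpha=1.0, rng=random):
    """Draw the mixing coefficient from Beta(alpha, alpha), as in standard MixUp."""
    return rng.betavariate(alpha, alpha)


def mixup(x_a, y_a, x_b, y_b, lam):
    """Convex combination of two flattened images and their target vectors."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y
```

With `lam = 0.25`, a sample from identity 0 mixed with a sample from identity 1 receives the soft target `[0.25, 0.75]`, so the network is trained on an image that contains vein patterns from two different people.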
Applied Context: Builders should not optimize for clean accuracy alone when selecting augmentations for vein recognition. Multi-image mixing yields top accuracy but introduces calibration and adversarial risk, so production systems need a multi-objective evaluation covering calibration, corruption robustness, and attack robustness before deployment.
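One simple way to put a multi-objective selection into practice is a Pareto filter that keeps only augmentations not beaten on every axis by some other candidate. The sketch below is a generic Pareto-front computation and is not the paper's APEX rank, whose exact definition is not given in this summary. The objective names, the candidate labels (`aug_A` to `aug_D`), and all numbers are hypothetical placeholders, not benchmark results.

```python
# Direction of each objective: "max" means higher is better, "min" means lower is better.
OBJECTIVES = {"accuracy": "max", "ece": "min", "train_hours": "min"}

# Hypothetical placeholder measurements, NOT results from the benchmark.
CANDIDATES = {
    "aug_A": {"accuracy": 0.95, "ece": 0.08, "train_hours": 3.0},
    "aug_B": {"accuracy": 0.94, "ece": 0.02, "train_hours": 1.0},
    "aug_C": {"accuracy": 0.93, "ece": 0.05, "train_hours": 2.0},
    "aug_D": {"accuracy": 0.90, "ece": 0.01, "train_hours": 1.0},
}


def dominates(a, b, objectives=OBJECTIVES):
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = True
    strictly_better = False
    for key, direction in objectives.items():
        # Flip the sign of "min" objectives so that larger is always better.
        av, bv = (a[key], b[key]) if direction == "max" else (-a[key], -b[key])
        if av < bv:
            no_worse = False
        if av > bv:
            strictly_better = True
    return no_worse and strictly_better


def pareto_front(candidates, objectives=OBJECTIVES):
    """Names of candidates that no other candidate dominates, in input order."""
    return [
        name
        for name, metrics in candidates.items()
        if not any(
            dominates(other_metrics, metrics, objectives)
            for other_name, other_metrics in candidates.items()
            if other_name != name
        )
    ]


front = pareto_front(CANDIDATES)
```

In this toy data `aug_C` is dropped because `aug_B` is better on all three axes, while the highest-accuracy candidate `aug_A` survives only as one option among several. Additional axes such as corruption or attack robustness can be added by extending `OBJECTIVES`.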