22 Bytes Poison ML Malware Detectors via Label Spoofing
EURECOM researchers show that injecting 22 to 55 bytes into benign Android apps tricks antivirus engines into mislabeling them, poisoning the ML training datasets that the malware research community depends on.

A team of researchers from EURECOM, Université Paris Cité, and INRIA has published a paper in IEEE Transactions on Information Forensics and Security that should concern anyone building or relying on ML-based malware detection. Their attack is absurdly simple: inject a few dozen bytes of known malware signatures into otherwise clean Android apps, and antivirus engines will mislabel them as malicious. Those false labels then flow into the public datasets that train the next generation of ML malware detectors.
TL;DR
- Injecting just 22 to 55 bytes of known malware signatures into benign Android APKs causes antivirus engines to flag them as malicious
- Those corrupted labels flow into public datasets like AndroZoo (24 million APKs, used by 880+ research teams) and poison ML training pipelines
- With 1% poisoned samples, two widely used ML detectors (DREBIN and MaMaDroid) lose roughly 15 percentage points of accuracy
- At just 0.015% poisoning, attackers can trigger targeted false positives against specific apps
- The injected bytes don't change app behavior at all - they sit in dead code or unused resources
The Attack in 55 Bytes
The paper, titled "Trust Under Siege: Label Spoofing Attacks Against Machine Learning for Android Malware Detection," targets a blind spot that the field has largely ignored: the labeling pipeline itself.
ML malware detectors don't label their own training data. Instead, researchers download apps from repositories like AndroZoo - which hosts roughly 24 million APKs - and use VirusTotal's aggregation of 70+ antivirus engines to decide what's malware and what isn't. If enough engines flag an app, it gets the malware label. If few or none do, it's benign.
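The labeling heuristic described above can be sketched in a few lines. This is an illustrative model of the common threshold rule, not an actual VirusTotal or AndroZoo API; the engine names and the threshold of 4 are made up for the example:

```python
# Illustrative sketch of the common labeling heuristic: an APK is labeled
# malware when at least `threshold` of the aggregated AV engines flag it.
# Engine names and the threshold value are hypothetical.

def label_from_av_verdicts(engine_flags: dict, threshold: int = 4) -> str:
    """engine_flags maps engine name -> whether that engine flagged the APK."""
    positives = sum(engine_flags.values())
    return "malware" if positives >= threshold else "benign"

verdicts = {"EngineA": True, "EngineB": True, "EngineC": False,
            "EngineD": True, "EngineE": True}
print(label_from_av_verdicts(verdicts))  # -> malware (4 of 5 engines flagged it)
```

The attack exploits exactly this rule: it doesn't need to fool every engine, only enough of them to cross the consensus threshold.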
The EURECOM team, led by Tianwei Lan and Simone Aonzo, realized that antivirus engines don't just scan behavior. Many rely on simple byte-pattern signatures - fixed sequences that identify known malware families. An attacker can take a perfectly clean app, inject one of these signature sequences into a location that never executes (dead code, an unused resource file, padding bytes), and the app's functionality stays identical. But the AV engines see the signature and raise the alarm.
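Because an APK is just a ZIP archive, the injection idea can be sketched with the standard library. This is a hedged illustration of the mechanism, not the paper's tooling: the payload bytes and entry name are placeholders, and re-signing the modified APK (which a real attacker would need to do) is omitted here.

```python
# Hedged sketch of the injection idea: an APK is a ZIP archive, so a new
# entry under assets/ that the app never loads leaves runtime behavior
# unchanged, while byte-pattern scanners still see the payload.
# PAYLOAD and the entry name are placeholders, not a real AV signature.
import shutil
import zipfile

PAYLOAD = b"\xde\xad\xbe\xef" * 8  # 32 bytes, a stand-in for a 22-55 byte signature

def inject_unused_asset(apk_in: str, apk_out: str, payload: bytes) -> None:
    shutil.copyfile(apk_in, apk_out)           # keep the original untouched
    with zipfile.ZipFile(apk_out, "a") as zf:  # append a new archive entry
        zf.writestr("assets/cache.bin", payload)
```

Nothing in the app's code ever reads `assets/cache.bin`, so functionality is identical; only the scanner's view of the bytes changes.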
The result: a benign app gets labeled as malware in VirusTotal, that label spreads to AndroZoo and every dataset built on top of it, and any ML model trained on that data now has a poisoned sample.
ML malware detectors inherit their training labels from antivirus engines - if the labels are wrong, everything downstream breaks.
DREBIN and MaMaDroid Both Fall
The researchers tested their attack against two of the most cited ML malware detectors in the Android security literature.
DREBIN (Arp et al., NDSS 2014) uses static analysis to extract features from Android manifests and disassembled code, then classifies with a support vector machine. It's the standard baseline - nearly every subsequent Android ML malware detection paper benchmarks against it. The DREBIN dataset itself is publicly available and has been used by hundreds of research groups.
MaMaDroid (Mariconti et al., NDSS 2017) takes a different approach, building Markov chains from sequences of abstracted API calls and classifying with random forests. It was specifically designed for temporal robustness - the ability to maintain accuracy as malware evolves over time.
Both crumbled under the poisoning attack. At a 1% poisoning rate - meaning just 1 in 100 training samples were corrupted - DREBIN's detection accuracy dropped by roughly 15 percentage points. MaMaDroid showed similar degradation. The models weren't learning to detect malware; they were learning to associate benign app features with the malware label, because those features now appeared in "malware" samples.
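The mechanic behind that degradation is plain label flipping, which can be reproduced on synthetic data. This is a toy demonstration with scikit-learn and a linear SVM, not the paper's DREBIN or MaMaDroid experiment; the dataset, model, and poisoning rate are illustrative choices, and the size of the accuracy drop depends entirely on the data:

```python
# Toy label-flip poisoning demo: flip a fraction of benign training labels
# to "malware" (class 1) and compare test accuracy against a clean baseline.
# All parameters here are illustrative, not the paper's experimental setup.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=4000, n_features=50, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

clean = LinearSVC(dual=False).fit(X_tr, y_tr)  # baseline on correct labels

rng = np.random.default_rng(0)
y_bad = y_tr.copy()
benign = np.flatnonzero(y_tr == 0)
flipped = rng.choice(benign, size=int(0.05 * len(y_tr)), replace=False)
y_bad[flipped] = 1  # benign samples now carry the malware label

poisoned = LinearSVC(dual=False).fit(X_tr, y_bad)
print(f"clean: {clean.score(X_te, y_te):.3f}  "
      f"poisoned: {poisoned.score(X_te, y_te):.3f}")
```

The poisoned model is pulled toward associating benign feature patterns with the malware class - the same failure mode the paper documents at scale.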
The more surgical attack is even more concerning. At a 0.015% poisoning rate - a fraction of a fraction of the training set - the researchers could trigger targeted false positives. That means an attacker could cause a specific legitimate app to be classified as malware by a trained detector, without materially affecting the detector's overall performance. The global metrics look fine. The targeted app gets flagged.
Why This Matters Beyond Academic Papers
The vulnerability chain here isn't theoretical. It runs through infrastructure that real security teams depend on.
Researchers worldwide download an average of 500,000 APKs per day from AndroZoo. The original dataset paper (Li et al., MSR 2016) has accumulated over 880 citations on Google Scholar. VirusTotal is the de facto source of ground truth labels for virtually all Android malware research published in the last decade. If those labels can be corrupted cheaply and at scale, the entire training pipeline for ML-based Android malware detection is compromised.
Android's open ecosystem makes it a prime target for both malware and the adversarial attacks that exploit malware detectors.
The attack cost is negligible. The injected payloads are 22 to 55 bytes. No exploit development, no reverse engineering of the target detector, no access to the model's architecture or weights. The attacker only needs the ability to upload modified APKs to a repository that VirusTotal scans - or even just to submit them directly to VirusTotal.
This connects to a broader pattern in AI security research. Most adversarial ML work on malware detection has focused on evasion - changing malware so detectors miss it. Luca Demetrio, one of the paper's co-authors and an assistant professor at the University of Genova's sAIfer Lab, previously demonstrated that Windows malware detectors could be fooled by changing just tens of bytes in file headers (published in ACM TOPS, 2021). That work targeted the inference side. This new paper targets the training side, which is arguably harder to defend because the corruption happens upstream, before the model ever sees the data.
The Supply Chain Angle
This is, at its core, a supply chain attack on ML training data. The analogy to software supply chain compromises is direct: you don't attack the model, you attack the data it's built on. Just as a compromised npm package poisons every application that depends on it, a corrupted training label poisons every model trained on that dataset.
The paper's authors note that defenses exist in theory - label sanitization, outlier detection during training, ensemble-based label verification - but none are standard practice in the Android malware detection pipeline. Most researchers simply trust the VirusTotal consensus and move on.
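One of those defenses, ensemble-based label verification, can be sketched as a stricter labeling rule that quarantines disputed samples instead of trusting a bare threshold. The function and its parameters are hypothetical, a minimal sketch of the idea rather than any published defense:

```python
# Hedged sketch of ensemble-based label verification: accept "malware" only
# when a broad consensus AND every engine in a trusted subset agree; samples
# with scattered detections are quarantined and excluded from training.
# The trusted-engine set and consensus ratio are illustrative parameters.

def verify_label(engine_flags: dict, trusted: set, consensus: float = 0.5) -> str:
    positives = sum(engine_flags.values())
    ratio = positives / len(engine_flags)
    trusted_hits = sum(engine_flags.get(e, False) for e in trusted)
    if ratio >= consensus and trusted_hits == len(trusted):
        return "malware"
    if positives == 0:
        return "benign"
    return "quarantine"  # disputed label: keep out of the training set
```

Under a rule like this, a signature-stuffed benign app that trips only a handful of pattern-matching engines would land in quarantine rather than being trained on as malware - at the cost of shrinking the usable dataset.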
For practitioners building or launching ML-based malware detectors, the implications are practical. You can't treat AV labels as ground truth without verification. Label auditing needs to be part of the training pipeline, not an afterthought. And any benchmark score achieved on a dataset with potentially corrupted labels should be taken with appropriate skepticism.
Who's Behind the Research
The core team spans three French institutions, with one co-author in Italy. Simone Aonzo (EURECOM, Sophia Antipolis) is an assistant professor specializing in malware analysis and reverse engineering who previously worked as a penetration tester in banking. Yufei Han (INRIA, Rennes) brings eight years of work bridging ML and security, including a stint at NortonLifeLock. Tianwei Lan and Farid Nait-Abdesselam are at Université Paris Cité. Luca Demetrio, now at the University of Genova, contributes deep expertise in adversarial attacks against security classifiers.
The paper's appearance in IEEE TIFS (Transactions on Information Forensics and Security) rather than a workshop or preprint gives it additional weight. This is a Tier 1 venue for security research, and the peer review process for TIFS is notoriously thorough.
The core question the paper leaves open: if 22 bytes can break the labeling pipeline for Android malware detection, what other ML training pipelines have similar blind spots in their data provenance chains?
Sources:
- Trust Under Siege: Label Spoofing Attacks Against Machine Learning for Android Malware Detection - Lan et al., IEEE TIFS 2026, DOI: 10.1109/TIFS.2026.3671128
- DREBIN: Effective and Explainable Detection of Android Malware in Your Pocket - Arp et al., NDSS 2014
- MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Models - Mariconti et al., ACM TOPS 2019
- AndroZoo: Collecting Millions of Android Apps for the Research Community - Li et al., MSR 2016
- Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection - Demetrio et al., ACM TOPS 2021
