Microsoft Copilot Spent Two Weeks Reading Confidential Emails It Was Supposed to Ignore
A Microsoft 365 Copilot bug (CW1226324) let the AI summarize emails with sensitivity labels in Sent Items and Drafts, bypassing DLP policies for two weeks. The NHS was affected. It's the second time in eight months.

For roughly two weeks starting January 21, Microsoft 365 Copilot quietly ignored the sensitivity labels that enterprises use to keep confidential emails out of AI systems. Emails in Sent Items and Drafts marked as confidential were summarized in Copilot Chat as if the labels did not exist.
Microsoft disclosed the issue on February 3 as service health advisory CW1226324. A fix began rolling out February 10, with broader remediation deployed by February 19. No one knows how many organizations were affected because Microsoft has not said.
This is the second time in eight months that Copilot has bypassed its own sensitivity label enforcement.
TL;DR
| Detail | Value |
|---|---|
| Advisory | CW1226324 |
| Active window | January 21 - February 10, 2026 (~2 weeks of exposure) |
| Affected folders | Sent Items and Drafts in Outlook |
| What happened | Copilot Chat summarized emails with sensitivity labels, bypassing DLP |
| Who confirmed it | Microsoft via M365 admin center advisory |
| NHS tracking | INC46740412 |
| Prior incident | EchoLeak (CVE-2025-32711), patched June 2025 |
| Fix | Server-side configuration update, rolling out worldwide |
What broke
Microsoft Purview sensitivity labels are the enterprise mechanism for classifying documents and emails as Confidential, Highly Confidential, Internal Only, and similar tiers. When properly enforced, these labels tell Microsoft 365 services - including Copilot - not to process or surface the content in AI-generated summaries.
A code defect in Copilot Chat's "Work" tab broke that enforcement for two specific Outlook folders: Sent Items and Drafts. The bug did not affect the Inbox directly, but because Sent Items and Drafts routinely contain entire conversation threads (including received messages quoted in replies), the effective exposure was broader than just outgoing mail.
Microsoft's root cause description:
"A code issue is allowing items in the sent items and draft folders to be picked up by Copilot even though confidential labels are set in place."
The result: when enterprise users asked Copilot Chat a question, the retrieval pipeline pulled in confidential emails alongside unrestricted content and generated summaries without distinguishing between them. Users received AI-synthesized answers that included information their organization had explicitly marked as off-limits for AI processing.
What Microsoft says
Microsoft's official statement, provided to BleepingComputer and TechCrunch:
"We identified and addressed an issue where Microsoft 365 Copilot Chat could return content from emails labeled confidential authored by a user and stored within their Draft and Sent Items in Outlook desktop. While our access controls and data protection policies remained intact, this behavior did not meet our intended Copilot experience."
The company added: "This did not provide anyone access to information they weren't already authorized to see."
That framing is technically true - Copilot operates within a user's existing Microsoft Graph permissions. But it misses the point. Sensitivity labels exist specifically to prevent AI systems from ingesting certain content, even when the human user has access. A user authorized to view a confidential email is not the same as that email being fed into an AI summarization engine that blends it with other context and presents it as part of a synthesized response.
The NHS connection
The UK National Health Service logged the incident as INC46740412, confirming that the bug reached healthcare environments. The NHS clarified that patient data was not exposed, noting that its policy already prohibits using Copilot for clinical activity or processing patient or staff data.
But the fact that a healthcare system had to verify whether an AI tool had accessed patient records because of a DLP bypass illustrates the stakes. Organizations adopted sensitivity labels precisely because they needed hard boundaries, not probabilistic ones, around what AI could see.
EchoLeak: the first time
This is not Copilot's first encounter with label bypass. In June 2025, Microsoft patched CVE-2025-32711, a critical (CVSS 9.3) zero-click vulnerability discovered by Aim Security researchers and published as "EchoLeak".
EchoLeak worked differently - it was an adversarial prompt injection attack rather than a product bug. An attacker could send an innocuous-looking email containing hidden malicious prompts to a target's Outlook inbox. When any employee in the organization asked Copilot a business question, the RAG pipeline would ingest the malicious email alongside legitimate data. The injected prompt then instructed Copilot to silently exfiltrate sensitive data through crafted Teams and SharePoint URLs.
The attack bypassed four distinct defense layers: Microsoft's cross-prompt injection classifier, external link redaction, Content-Security-Policy controls, and reference mention safeguards.
Two failures, one pattern
| Incident | Type | When | Sensitivity labels bypassed |
|---|---|---|---|
| EchoLeak (CVE-2025-32711) | Adversarial prompt injection (zero-click) | Patched June 2025 | Yes |
| CW1226324 | Product bug in retrieval pipeline | Jan-Feb 2026 | Yes |
As VentureBeat reported, the pattern is the same: Copilot's retrieval pipeline violated its own trust boundary, and neither standard DLP monitoring nor any third-party security stack caught either failure. In both cases, the bypass was discovered through external reports, not Microsoft's own detection systems.
Why DLP cannot catch AI trust failures
The CW1226324 incident exposes a structural problem with how enterprises think about data protection in AI-augmented environments.
Traditional DLP monitors data in transit - emails being sent, files being uploaded, clipboard operations crossing trust boundaries. It watches the pipes. But Copilot's retrieval pipeline operates inside the trust boundary. The data never leaves Microsoft's environment. It moves from Exchange to the Microsoft Graph index to the Copilot inference engine, all within the same tenant. From DLP's perspective, nothing happened.
This is the gap that sensitivity labels were supposed to fill: a metadata layer that tells AI services "this content exists in the tenant, the user can see it, but you should not process it." When that enforcement breaks, whether through an adversarial attack or a code bug, there is no backup system watching.
Neither EchoLeak nor CW1226324 triggered security alerts. No DLP stack caught either one.
For the 72% of S&P 500 companies that now cite AI as a material risk in regulatory filings, this is the concrete scenario behind the disclosure language. The AI tool they deployed to improve productivity silently exceeded its access scope, and the security infrastructure they built to prevent exactly this had no visibility into it.
The fix
Microsoft deployed a server-side configuration update that re-asserts Purview DLP policy enforcement for Sent Items and Drafts in the Copilot Chat pipeline. The fix updates the Microsoft Graph connectors that feed data into the Copilot index to strictly respect sensitivity labels for those folders.
Deployment began February 10, with Microsoft confirming worldwide rollout by February 19. A scheduled follow-up status update is expected February 24. The company has been contacting subsets of affected tenants to validate the remediation.
What Microsoft has not disclosed:
- How many organizations or users were affected
- Whether any confidential data was surfaced in Copilot outputs during the two-week window
- Whether audit logs captured which labeled emails were processed
- A timeline for any compensatory measures
Two sensitivity label bypasses in eight months - one adversarial, one accidental - point to a retrieval pipeline where data classification is enforced at a layer that can fail silently. For enterprises that bet their compliance posture on Microsoft Purview, the question is not whether the labels work most of the time. It's what happens when they don't, and whether anyone notices.
Sources:
- Microsoft says bug causes Copilot to summarize confidential emails - BleepingComputer
- Microsoft says Office bug exposed customers' confidential emails to Copilot AI - TechCrunch
- Copilot Chat bug bypasses DLP on 'Confidential' email - The Register
- Microsoft Copilot ignored sensitivity labels twice in eight months - VentureBeat
- DLP Policy for Copilot Bug Exposes Confidential Email - Office365ItPros
- Copilot Privacy Flaw CW1226324 - WindowsForum
- CVE-2025-32711 EchoLeak - SOC Prime
- EchoLeak: Zero-Click Prompt Injection in Production LLM - arXiv
- NHS M365 Copilot Support FAQ
