MITRE doesn’t crown winners. It doesn’t score products. And it doesn’t endorse vendors. That hasn’t stopped a familiar ritual from playing out each year when the MITRE ATT&CK® Enterprise Evaluation results are released: every vendor with a recognizable name finds a way to claim victory.
Some highlight specific phases of the attack chain, others trumpet detection volume, speed, or coverage. The point is, with enough caveats and a willing marketing team, just about everyone walks away with a headline that reads “we won.” As I wrote a few years ago, it’s a little like hearing a baseball stat that says a player leads the league in batting average against left-handed pitchers on Tuesdays in August. It’s technically true — but not exactly definitive.
That said, some vendors objectively perform better than others when you look at the raw data. This year, CrowdStrike announced it achieved 100% detection, 100% protection and zero false positives. On its face, that’s impressive. But like everything in cybersecurity, context matters.
A More Demanding Evaluation
The 2025 MITRE ATT&CK® Evaluations were notably more complex than in years past. This round introduced cross-domain scenarios that required visibility and control across endpoint, identity and cloud layers. In short: it wasn’t enough to just be good at one thing.
“The testing went beyond just traditional endpoint techniques,” Michael Sentonas, president of CrowdStrike, explained to me. “You had cloud adversary emulation. So you needed to have a cloud security product to be able to do well, and attacks that had an identity component to the tradecraft, touched the endpoint, touched the cloud.”
MITRE emulated tactics used by two sophisticated threat actors: Mustang Panda, a Chinese state-sponsored espionage group, and Scattered Spider, an eCrime group known for targeting cloud environments. This mix added real-world pressure to the test scenarios, mimicking the layered attacks organizations increasingly face.
CrowdStrike’s Results — And Their Implications
In these scenarios, CrowdStrike reported 100% detection and protection, with zero false positives. These numbers suggest that Falcon, the company’s unified platform, was able to identify and block every tested behavior while minimizing unnecessary alerts. Notably, Sentonas emphasized that the company also aimed for — and believes it achieved — one of the lowest alert volumes in the test.
Why does that matter? Because in the real world, signal fatigue is a real threat to analyst effectiveness. “If you have just an incredible number of alerts because you’re trying to block and detect everything humanly possible to game the test, that doesn’t necessarily yield a great outcome,” said Sentonas. “In a real-world deployment, you’d never be able to use that product.”
He’s not wrong. Alert overload can paralyze SOC teams, even if underlying detection efficacy is high. The ability to reduce noise while maintaining fidelity is critical for platforms intended for enterprise-wide use.
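To make that tradeoff concrete, here is a minimal, purely illustrative sketch with hypothetical numbers — the sub-step counts, alert totals, and product names are assumptions, not MITRE data. It shows how two products can post identical detection coverage while imposing very different triage loads on a SOC team:

```python
# Hypothetical illustration: detection coverage alone doesn't capture analyst workload.

def summarize(name, substeps_total, substeps_detected, alerts_raised, false_positives):
    """Print coverage and noise metrics for a made-up evaluation result."""
    detection_rate = substeps_detected / substeps_total
    # Alerts per detected sub-step: a rough proxy for the noise an analyst must triage.
    alerts_per_detection = alerts_raised / max(substeps_detected, 1)
    print(f"{name}: detection={detection_rate:.0%}, "
          f"false positives={false_positives}, "
          f"alerts per detection={alerts_per_detection:.1f}")

# Both hypothetical products detect every sub-step, but one does so with far less noise.
summarize("Product A", substeps_total=80, substeps_detected=80, alerts_raised=95,  false_positives=0)
summarize("Product B", substeps_total=80, substeps_detected=80, alerts_raised=640, false_positives=12)
```

On paper both score “100% detection”; in practice, only one of them is something an analyst could live with day to day.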
MITRE Isn’t a Magic Bullet
Let’s be clear: achieving high marks in MITRE testing is meaningful. But it isn’t comprehensive. The evaluations are transparent and well-structured, but they occur in a controlled environment. They don’t replicate the chaos of a production network, with encrypted traffic, unpredictable user behavior and overlapping signals.
That’s why buyers need to interpret results through the lens of operational context. A platform that detects everything in the lab but floods analysts with non-actionable alerts may not be usable at scale. Conversely, a vendor that didn’t score top marks in a specific area might still integrate better into an organization’s broader stack or security workflows.
The goal of MITRE testing is to expose behavioral telemetry — not to hand out trophies.
“MITRE isn’t a scoreboard—it’s a microscope,” explained Den Jones, founder and CEO at 909Cyber. “Security leaders shouldn’t just ask who scored highest, but how those results translate to day-to-day resilience in their own environment.”
What Security Leaders Should Take Away
CISOs and security teams evaluating this year’s results would do well to look beyond the headlines. Ask questions like:
- How were detections surfaced?
- Were alerts context-rich and actionable?
- Could this platform scale without tuning or overwhelming analysts?
The shift toward cross-domain simulation in this year’s evaluations mirrors how modern threats behave. It also puts pressure on vendors to unify capabilities across layers — endpoint, identity and cloud — rather than excel in isolation.
“It gives a transparent view into how your architecture works. It gives a transparent view as to how you block things,” Sentonas said. “When it becomes a little bit more of a sophisticated test, I think there’s a lot of meaningful things that you can take away.”
A Useful Lens, Not the Full Picture
CrowdStrike’s performance this year is undeniably strong under MITRE’s framework. But the real story isn’t just about one vendor’s results — it’s about how the evaluation itself is evolving. This year’s emphasis on emulating real-world, cross-domain threats adds much-needed relevance to a process that has, at times, encouraged “teaching to the test.”
As security leaders confront increasingly complex environments and increasingly capable adversaries, the ability to understand how a platform behaves across domains, under pressure, with minimal noise, is key. MITRE won’t make that decision for you — but it can help you ask better questions.