M4E1: From Inbox Overload to Machine Triage
Module 4, Episode 1: From Inbox Overload to Machine Triage
The Filter Is the Analysis
Automated triage is the first and most consequential act of judgment in any collection pipeline, performed at machine speed, largely invisible to the analyst who inherits its outputs. Every subsequent step — every link diagram, every finished assessment, every warning — sits downstream of decisions the machine already made about what mattered. The analyst believes she is working with information. She is working with a curated residue.
That point deserves to be stated flatly: the moment you deploy an AI model to triage your collection, you have delegated the definition of "important" to that model's training history. Whatever that history did not represent, did not weight, or did not encounter with sufficient frequency becomes invisible — not redacted, not unavailable, but silently absent from the analyst's universe. The problem is not that AI triage is unreliable. The problem is that it is reliable in patterned, systematic ways that reproduce the coverage assumptions of whoever built and trained the system, and those assumptions compound over time.
This matters more now than it did eighteen months ago because the tooling has matured to the point of operational integration. Palantir's AIP (Artificial Intelligence Platform) combines large language models with operational data; for Security Operations Centers, this produces concrete use cases including automated threat correlation — correlating millions of log entries, network flows, and endpoint data in real time, identifying in seconds what a team of analysts would take hours to process. AI now handles triage and correlation of alerts — tasks that currently make up roughly 70 percent of SOC (Security Operations Center) work — transforming the human analyst into a supervisor and strategist who evaluates complex incidents requiring context and experience. That is the pitch. The reality is that a 70-percent reduction in manual review is also a 70-percent reduction in the analyst's unmediated contact with raw collection. She is no longer sampling the stream. She is reading the stream's editorial selections.
The Machinery of Ingest
To understand what triage does, you need to understand what ingest looks like without it. The volume problem is not theoretical. Analysts at RH-ISAC (Retail and Hospitality Information Sharing and Analysis Center) found themselves spending ten hours a week just collecting threat intelligence, manually tracking blogs, RSS feeds, and social media channels — too long to separate useful signals from irrelevant chatter. After adopting AI-assisted aggregation, the team reduced that time by more than 70 percent, dropping from ten hours to two to three hours per week. Multiply that across a team of twenty analysts, and you are reclaiming the equivalent of a full-time analyst's annual capacity from the shredder of manual collection. The efficiency case is legitimate and not in dispute.
What the efficiency case obscures is the mechanism. AI-driven platforms like Feedly Threat Intelligence work by deploying networks of models against open-source streams. Feedly continuously scans more than 10,000 open web sources — government advisories, vendor blogs, vulnerability databases, news sites, social media, and the dark web — with 1,000 AI models extracting unstructured data such as TTPs (Tactics, Techniques, and Procedures), CVEs (Common Vulnerabilities and Exposures, the standard catalog of known security flaws), and indicators of compromise, then adding them to a real-time Threat Graph that analysts can query with a CTI-trained (Cyber Threat Intelligence-trained) large language model to generate reports with citations back to original sources.
The operational logic is elegant: entity extraction, clustering, and deduplication reduce a thousand near-identical stories about the same CVE to a single enriched entry, and the analyst reads the enriched entry rather than the thousand stories. Story clustering groups related documents into coherent stories, significantly reducing redundancy in the analyst's workflow — major events like new software vulnerabilities or critical cloud outages are reported by many sources in varying styles and languages, and clustering collapses that redundancy. At the infrastructure level, an AI Management Summary ingests, filters, and reviews OSINT (open-source intelligence) threat intelligence before it reaches the analyst.
Alongside entity extraction, collection pipelines also perform event extraction: pulling structured descriptions of discrete actions from unstructured prose. Where entity extraction identifies that "APT-41" and "financial sector" are relevant terms, event extraction goes further — assembling the full predicate, for example "APT-X conducted spear-phishing targeting financial sector organizations on 2026-02-15" from a threat blog that never stated that sentence in so many words. The distinction matters because events have temporal and relational structure that bare entity lists do not; missing event extraction means knowing who appeared in a report without knowing what they did or when they did it.
The NewsCatcher API, one of the underlying data infrastructure layers commonly feeding these pipelines, illustrates the architectural choices that sit beneath the analyst's interface. The clustering behavior changed on January 1, 2026 — for articles published from that date onward, the system uses a new embedding model and clustering algorithm. That sentence, buried in an API changelog, represents a unilateral architectural decision that changed the shape of what every downstream user of that infrastructure would see in their feeds. No analyst was notified. No coverage impact assessment was circulated. The world of surfaced events quietly shifted.
The underlying clustering mechanism combines embedding models and graph-based community detection — specifically the Leiden algorithm, a method for detecting tightly-connected communities within large networks — with adjustable similarity thresholds. At a cosine similarity threshold of 0.6, you get broad clusters: a lot of articles grouped together as covering the "same story." At 0.8, you get tighter clusters and more distinct stories surfaced. The threshold choice determines whether a slightly divergent account — a regional outlet with a different framing, a non-English source with a substantively different angle — ends up in the same cluster as the Western wire service version, and therefore invisible, or surfaces as a separate story worthy of separate attention. Every number in that configuration is an opinion about what counts as different.
De-duplication, the related process of collapsing near-identical reports into a single canonical entry, introduces its own hidden assumptions about what counts as "the same event." When two reports describe the same CVE but in meaningfully different contexts — one framing it as a supply chain compromise affecting firmware vendors, another documenting direct exploitation against financial sector endpoints — de-duplication may collapse that distinction into a single enriched record. The analyst receives one entry about the CVE. The operationally significant difference between how it is being used, by whom, and against what target class disappears into the merge.
Downstream, at the entity-extraction layer, platforms extract who, what, where, and when at scale. Contify's technology includes machine learning classification, deduplication, topics and industry categorization, summarization, business event detection, Named Entity Recognition (a technique for identifying persons, organizations, locations, and events in unstructured text), and others to deliver contextually relevant news feeds enriched with metadata. Named Entity Recognition is the backbone of this — it transforms unstructured text into queryable structured data. That is an enormous capability. It also performs differently across languages, scripts, and domains, in ways that are rarely surfaced to the analyst querying the downstream product.
The Triage Problem, Formally Stated
Before examining how triage fails, it is worth stating precisely what it is trying to accomplish. The triage problem is this: separating signal from noise under resource constraints while preserving coverage of rare but critical events. Each element of that definition creates a distinct difficulty. Separating signal from noise requires a model of what signal looks like — but novel threats, by definition, do not look like previously observed signals. Operating under resource constraints means that not everything can be surfaced; the system must rank, and ranking requires a scoring function that encodes someone's theory of importance. Preserving coverage of rare but critical events is in direct tension with statistical learning, which rewards models that perform well on frequent cases and penalizes them, relatively speaking, for errors on infrequent ones. A model optimized on historical threat data will be well-calibrated for the threats it has seen most often and poorly calibrated for the threat that has never appeared before — which is the only threat that genuinely surprises anyone.
The Structural Warp: How Triage Defines the Analytic Universe
The strongest objection to the claim that triage constitutes analysis is that triage is merely selection, not interpretation — that the model doesn't reason about significance, it just routes. This objection fails. Selection at scale is interpretation. When a system decides that 400 of 10,000 daily articles are worth surfacing to an analyst, it has made 9,600 judgments that the analyst will never review. The analytic universe is not what is available; it is what gets through.
Researchers are designing AI-powered systems to automatically select and summarize the reports most relevant to each analyst — which raises the issue of bias in the information presented. This involves recommendation: selecting relevant reports without an explicit query, a task that draws on previous work documenting the existence of human-machine feedback loops in recommender systems. The intelligence context differs from commercial recommendation in one critical respect: the stakes of a missed signal are not a disappointed reader but an unwarned policymaker.
Existing recommendation systems largely select content based on previously observed short-term behavioral responses — clicks, likes, favorites — collectively known as engagement. This reliance on engagement produces several types of bias. User interface designers have long understood that users are more likely to click items that appear earlier in a list, even if later items are just as relevant — known as "position bias." Recommenders usually rank clicked items higher for other users, which can result in runaway amplification of initially random ranking choices.
In a commercial news context, position bias means someone reads the wrong op-ed. In an intelligence context, it means the analyst systematically under-surfaces articles from sources that generate fewer clicks — which typically means non-English sources, niche regional outlets, and platforms with smaller readerships. These are often exactly the sources where early signals of emerging threats appear before they cross into the mainstream wire services.
A related phenomenon — "popularity bias" — leads a recommender to show only those items most likely to be clicked by a broad group of users. Popular items are the ones the largest number of people will engage with. Popular, in a collection pipeline, correlates with well-resourced, English-language, and Western in origin. The signal that matters — the single Kazakh-language forum post, the Telegram channel in Tigrinya, the Nigerian news site reporting on a militia movement — is by definition not popular. It will not be clicked by most users. It will not be recommended to the next analyst. It will be buried.
The problem compounds through the language layer. Stanford research documents that ChatGPT and Gemini work well for 1.52 billion English speakers but severely underperform for 97 million Vietnamese speakers and 1.5 million Nahuatl speakers — the main culprit is data scarcity, not algorithmic deficiency. Models trained predominantly on English-language text are not neutral instruments when applied to multilingual collection. They perform translation and extraction tasks with high confidence on English and major European languages, and with quietly degraded performance on languages that didn't flood their training corpora. The degradation doesn't announce itself. The model doesn't flag low confidence on a Dari-language article the way a human translator would tell you she's uncertain about a dialectal phrase. It produces output that looks finished, and the analyst accepts it.
Consider the input layer of a collection pipeline oriented toward Central Asia, the Sahel, or Southeast Asia. The Arabic analyst who in an earlier era would have personally reviewed the raw feed is now downstream of a triage model that produces a curated English-language summary. That summary was assembled by extracting entities from Arabic text, translating them, clustering the results against other articles, and presenting the deduplicated output. Each step in that chain introduces a distortion that is invisible at the final display layer. The analyst reads something that looks like a coherent brief. It may be missing the paragraph that mattered.
The Amplification Problem: Frequent Noise Gets Louder, Rare Signal Gets Quieter
The intuition behind AI-augmented collection is that more data, processed faster, means better coverage. The actual effect, in the absence of careful design, is the opposite: the sources that produce the most content get amplified, and the sources with the best signal-to-noise ratio but low volume get buried. The machine optimizes for volume and pattern-match, not for intelligence value.
The research on AI feedback loops makes this mechanism concrete. In a series of experiments with 1,401 participants, researchers revealed a feedback loop where human-AI interactions alter processes underlying human perceptual, emotional, and social judgments — subsequently amplifying biases in humans. This amplification is significantly greater than that observed in interactions between humans, due to both the tendency of AI systems to amplify biases and the way humans perceive AI systems. Participants are often unaware of the extent of the AI's influence, rendering them more susceptible to it.
The mechanism runs as follows: a triage model surfaces articles; the analyst engages with what the model surfaces; her engagement signals — what she reads, how long she spends on it, what she flags — feed back into the model's scoring function; the model learns her preferences; it surfaces more of what she already found interesting; she develops increasingly narrow visibility. She doesn't notice, because the model performs reliably from her subjective vantage point. She is always seeing something. She just doesn't know what she's not seeing.
Training an AI algorithm on a slightly biased dataset results in the algorithm not only adopting the bias but further amplifying it. When humans interact with the biased AI, their initial bias increases. The human-AI feedback loop causes AI to amplify subtle human biases, which are then further internalized by humans — a cycle that leads to substantial increases in bias over time. In an intelligence context, this is not a slow drift. It is a ratchet that tightens every time the analyst trusts the queue.
The predictive policing analogy is instructive precisely because it is well understood outside intelligence circles. Algorithms that analyze past crime data to predict future crime hotspots, when applied to historically over-policed neighborhoods, produce a feedback loop: a predictive model sees those areas as "high risk" and sends even more police there. The collection pipeline equivalent: a triage model trained on historical threat intelligence from a particular region surfaces more articles from that region; analysts focus on that region; collection is tasked toward that region; the model gets reinforced; other regions go dark. The model isn't wrong about what it learned. It's wrong about the shape of the world.
Feedly's staff threat intelligence advisor Josh Darby MacLellan described the platform's role as reducing the time spent on high-volume, low-complexity tasks like extracting entities, deduplication, and clustering — freeing analysts to interpret and communicate findings. That framing is accurate but incomplete. High-volume, low-complexity tasks are also the tasks where the analyst's direct contact with raw material enforces humility about coverage. The analyst who manually reviewed 200 articles a day knew, viscerally, that she was reading a subset and that the subset had a shape. The analyst who reads 40 AI-curated summaries may believe, falsely, that she has broader coverage than before.
Tools that reduce the burden of overwhelming data volume while preserving accuracy are critical — but accuracy at what? Accuracy at pattern recognition within the training distribution? Almost certainly yes. Accuracy at detecting novel threats outside that distribution? Unknown, and probably not — and the novel threat, by definition, is the one that matters most.
How Triage Creates Invisible Gaps
The amplification problem and the bias problem combine into something more consequential than either alone: a self-reinforcing cycle in which the triage model's initial coverage assumptions become the permanent shape of the analyst's world. The mechanism is worth tracing through a concrete scenario.
Suppose a triage model is trained primarily on English-language vendor advisories, U.S. government CERT bulletins, and Western cybersecurity conference proceedings. This is not a hypothetical choice; it reflects the actual distribution of labeled training data that is most readily available to developers building CTI-focused models. That training corpus contains extensive coverage of threat actors operating against North American and Western European targets — groups like APT28, Lazarus Group, and FIN7 are richly represented, with thousands of labeled examples of their TTPs, infrastructure patterns, and targeting behavior. It contains sparse coverage of threat actors primarily active in sub-Saharan Africa, Central Asia, or against non-Western targets, because those actors generate less English-language vendor reporting, fewer CVE disclosures, and fewer conference presentations.
The model, having learned what "important threat activity" looks like from that corpus, assigns lower relevance scores to reporting that doesn't pattern-match against its training examples. A Francophone security researcher's blog post documenting a novel initial access technique used by a West African cybercriminal group scores below the threshold. A Swahili-language forum thread discussing exploitation of a telecommunications provider clears neither the language-processing layer nor the relevance classifier. These articles exist in the 10,000-source feed. They do not appear in the analyst's queue.
The analyst, seeing no signal from those regions, does not task additional collection toward them. The model, receiving no analyst engagement on that regional content, receives no corrective signal. When the model is retrained on updated data six months later, the new training set reflects the analyst's engagement history — which is, by now, almost entirely focused on the regions the model already weighted heavily. The blind spot has become structurally embedded. An incident that originates in that underrepresented region — a supply chain compromise moving through a regional telecom, a financially motivated group expanding its targeting westward — arrives without warning precisely because the triage layer had quietly stopped watching.
Feedback Loops: When the Baseline Becomes the World
The amplification problem describes what happens in the first months of deployment. The feedback loop problem describes what happens in year two and beyond, when the model has been continuously updated on the analyst's engagement data, when the collection tasking has shifted toward what the model thinks is important, and when the original diversity of source coverage has eroded without anyone having made a decision to narrow it.
Feedback loops in the collection of intelligence information are possible precisely because users may also be responsible for tasking collection. Avoiding misalignment requires an alternate, ongoing, non-engagement signal of information quality. This is the key structural point: if the same system that curates collection also absorbs the analyst's engagement signals and those signals influence collection tasking, you have a closed loop in which the model's prior assumptions about importance become self-fulfilling. The world as seen by the model becomes the world as searched by the collectors, which becomes the world as known by the analysts, which becomes the world as reported to policymakers.
Feedback loops between humans and recommender systems create biases, and there is good reason to believe these biases apply to intelligence applications — biasing the information presented to analysts, the information tasked for collection as a result, and ultimately what analysts believe and their subsequent conclusions.
Microsoft's research on algorithmic feedback loops found that "all stable long-term outcomes will disadvantage some group" — that under realistic conditions of heterogeneous populations and imperfect measurement, even well-intentioned systems trend toward structural disadvantage for groups not well-represented in the training data. For a collection pipeline, the groups that get disadvantaged are not demographic categories but geographic regions, linguistic communities, and topic domains that the model wasn't trained to weight. They don't disappear from the world. They disappear from the analyst's picture of it.
This creates a failure mode qualitatively different from the familiar problem of collection gaps. Traditional collection gaps are known-unknowns: the analyst is aware that coverage of North Korean internal communications is limited by access constraints, or that economic data from certain countries is unreliable. She builds that uncertainty into her analytic judgments. The feedback-loop gap is an unknown-unknown: the analyst has no reason to suspect that her AI-curated feed has systematically de-weighted signals from, say, Sahel-based jihadist communications, because she never received a notification that the model's coverage had drifted in that direction, and the stream she receives looks full and complete. It's producing summaries. It's flagging entities. It looks like it's working.
The 2025 reporting on Jama'at Nusrat al-Islam wal-Muslimin (JNIM, the dominant jihadist coalition operating across the Sahel region) and its operations across the Sahel illustrates the exposure created by this kind of drift. The April 2026 attack was the worst jihadist strike the region had seen in years. The conditions that produced it — JNIM's territorial expansion, the political vacuum left by Western military withdrawal, the limitations of local intelligence services — had been visible in open-source African and Francophone reporting for months before the attack. The question worth asking is whether the analysts whose triage queues were dominated by Ukraine, China, and the Middle East received adequate signal from that region in the preceding quarter. Not whether the information existed. Whether it cleared the triage threshold.
Adversarial Triage: When Someone Else Controls the Filter
The triage-as-analysis problem has a passive form — bias, drift, feedback loops — and an active form: adversarial manipulation of the collection pipeline itself. Both are consequential. The active form is less discussed and more immediately dangerous.
Indirect prompt injections occur when a large language model accepts input from external sources, such as websites or files, and that external content contains data that, when interpreted by the model, alters its behavior in unintended ways. In a collection pipeline, this attack surface is not theoretical. The pipeline is definitionally ingesting untrusted external content at scale — web pages, scraped documents, translated materials, aggregated news feeds. Every one of those inputs is a potential injection vector.
Among AI security vulnerabilities reported to Microsoft, indirect prompt injection is one of the most widely used techniques — and it holds the top entry in the OWASP (Open Web Application Security Project) Top 10 for LLM Applications and Generative AI 2025. It earned that position because it is practically effective and architecturally difficult to defend. The risk is fundamental to how LLM-based systems process untrusted data: an attacker can provide specially crafted data that the LLM misinterprets as instructions, with impacts ranging from exfiltration of user data to performing unintended actions using user credentials.
The 2025 EchoLeak vulnerability in Microsoft 365 Copilot is the first well-documented production case. In June 2025, researchers at Aim Security disclosed EchoLeak, a zero-click vulnerability in Microsoft 365 Copilot that allowed a remote attacker to steal confidential data simply by sending an email — representing the first known case of a prompt injection weaponized to cause concrete data exfiltration in a production AI system. The significance for intelligence practitioners is not the Microsoft-specific details but the proof-of-concept it represents: an adversary can inject instructions into content that an LLM will subsequently ingest, causing the model to behave in attacker-specified ways without the user's knowledge.
In early 2025, researchers discovered that some academic papers contained hidden prompts designed to manipulate AI-powered peer review systems into generating favorable reviews. The same technique applied to an OSINT pipeline — a hostile state actor seeding open-source content with instructions designed to influence how a collection model classifies or routes subsequent material — would be difficult to detect and potentially highly effective. Adversaries embed hidden or manipulated instructions within website content that is later ingested by an LLM, exploiting benign features like webpage summarization or content analysis and causing the LLM to execute attacker-controlled prompts without the user's awareness. Real-world cases have now been observed, including AI-based ad review evasion.
The state-level implications are direct. Consider a collection pipeline monitoring open-source reporting from a country whose government understands that foreign intelligence services aggregate its public media. If that government seeds certain outlets with content containing natural-language instructions designed to cause the downstream triage model to deprioritize categories of reporting — not block them, just deprioritize, below the threshold of analyst review — the effect is a targeted coverage gap with no obvious fingerprints. The analyst's queue looks normal. The gap accumulates invisibly.
Despite defenses designed for indirect prompt injection attacks, their resilience remains questionable — researchers evaluated eight different defenses and bypassed all of them using adaptive attacks, consistently achieving an attack success rate of over 50 percent. The defense landscape is not static: preventative techniques like hardened system prompts and Spotlighting (a method for isolating untrusted inputs so the model treats them as data rather than instructions) combined with detection tools such as Microsoft Prompt Shields and impact mitigation through data governance and deterministic blocking of known exfiltration methods represent current best practice. Current best practice is not sufficient, and it is almost certainly not what is running in most deployed collection pipelines, which were typically built before the attack surface was well understood.
What the Analyst Who Inherits the Queue Must Do Differently
The argument assembled here has a practical endpoint. If AI triage is a consequential filter that determines the analyst's universe, and if that filter reproduces training biases, amplifies frequent sources, degrades over time through feedback loops, and can be actively manipulated by adversaries — then the analyst who inherits a triage-mediated queue cannot treat it as she would treat a raw feed. She has to treat it as an analysis product, with all the source-assessment discipline that entails.
Concretely: an analyst working with an AI-curated collection queue should, as a matter of tradecraft, periodically audit what the queue is not surfacing. That means spending time in raw feeds, not because the raw feed is pleasant to work through, but because the gap between raw and curated is where the structural biases live. It means mapping the linguistic composition of the source set behind the triage model and asking whether the model's extraction performance has been tested against the languages most relevant to the account. Feedly's own advisors describe the goal as freeing analysts to interpret and communicate — not replacing their judgment about what the source universe should look like. That judgment has to be actively exercised, not assumed to have been handled upstream.
Organizationally, the people who configure and retrain triage models need to be treated as part of the analytic workforce, not as IT support. The decisions embedded in a clustering threshold, a similarity score, a language-weighting scheme, or a training cutoff are intelligence decisions. Avoiding misalignment feedback loops requires an alternate, ongoing, non-engagement signal of information quality — which means the organization needs deliberate mechanisms to surface information the model didn't prioritize, separate from the engagement data the model will otherwise use to reinforce its own priors.
On the adversarial dimension: AI that consumes internal and external data is a bridge across trust boundaries and must be in threat models — data flows and abuse cases must be mapped, and application security reviews must extend to prompts, retrieval pipelines, and render surfaces, not just APIs and user interfaces. A collection pipeline that ingests adversarially controlled open-source content through an LLM layer is a collection pipeline with an unguarded perimeter. The ingested content is the attack surface.
The triage model will not tell you it has been compromised. Neither will the feedback loop, and neither will the geographic drift. The analytic product will look complete, coherent, and internally consistent right up to the moment a real event occurs in a place the queue had quietly stopped watching.
The analyst who cannot describe the shape of her triage model — what it weights, what languages it handles well, what its training distribution looked like, when it was last updated, and what feedback signals are currently influencing its rankings — does not know what she does not know. That is not a failure of analysis. It is a failure of collection awareness, and in a triage-mediated world, those are no longer separate problems.