M12E3: AI Geopolitics and the Chinese Frontier


The central analytic error Western intelligence organizations are making about Chinese AI is not underestimation. The error is the wrong kind of overestimation—focused on the wrong variables, drawing on the wrong evidence, and shaped by the distorting pressure of geopolitical anxiety from both directions simultaneously. Beijing wants the West to believe China's AI is either harmless consumer entertainment or an unstoppable sovereign capability, depending on which narrative is more useful in the moment. Washington has its own institutional incentives either to inflate the threat, justifying budgets, or to dismiss it, protecting the export control rationale. The analyst who trusts either narrative will be wrong. The analyst who treats the capability question as an evidence problem, rather than a geopolitical one, has a fighting chance of getting it right.

This episode stakes a specific, falsifiable claim: as of May 2026, the PRC has built genuinely formidable AI capability across a broad frontier, but it lags the U.S. frontier by a measurable and analytically significant margin on independently verified benchmarks—while simultaneously being closer to parity than Western organizational planning has assumed on the tasks most relevant to intelligence operations: language generation, influence content production, coding, and adversarial information at scale. Analysts need a more granular model—where the gap is real, where it isn't, and where the gap is essentially irrelevant to threat.


China's AI Capability in May 2026

The most important single data point for assessing PRC AI capability in May 2026 came from NIST's Center for AI Standards and Innovation (CAISI). In April 2026, CAISI evaluated the open-weight model DeepSeek V4 Pro and determined that its capabilities lag the frontier by about eight months. That eight-month figure immediately became a political object. The U.S. government needed it to demonstrate that export controls are working; Chinese state media ignored it; and a significant segment of the AI research community contested the methodology. The critique with real analytic teeth is that CAISI's results cannot be independently reproduced, because two of the nine benchmarks are non-public—and it is on those two benchmarks that the gap is widest. For example, GPT-5.5 scored 71% on CTF-Archive-Diamond, one of CAISI's cybersecurity tests, while DeepSeek registered around 32%.

So we have a government evaluation with classified-benchmark weighting that produces an eight-month lag figure, and we have public benchmarks that produce something much closer to parity. Both are true, simultaneously, and the difference between them is analytically significant.

On public benchmarks, the picture is striking. GPQA Diamond, which measures PhD-level science reasoning, placed DeepSeek V4 at 90%, one point behind Claude Opus 4.6's 91%; on the same benchmark, GPT-5.5 scores 93.6% against DeepSeek V4 Pro's 90.1%. Three math olympiad benchmarks placed DeepSeek at 97%, 96%, and 96%. A 3.5-point gap in graduate-level scientific reasoning is evidence of a capable system trailing at the margin, not a crippled one.

The specific domains where the gap widens matter enormously for threat modeling. DeepSeek V4 is text-only, with the separate vision variant lagging GPT-5.5 and Gemini 3.1 Pro substantially. On raw factuality, DeepSeek V4 scores 57.9 on SimpleQA-Verified versus Gemini 3.1 Pro's 75.6, and V4 hallucinates more on closed-book factual queries than the closed leaders. For intelligence applications requiring reliable factual recall—OSINT synthesis, document exploitation, watch officer support—this matters. For applications requiring fluent persuasive text generation in multiple languages, it matters considerably less.

The DeepSeek V4 story is also an economics story. V4 ships in two variants—V4 Pro, with 1.6 trillion total parameters but only 49 billion active at inference time, and V4 Flash—both with a 1-million-token context window, MIT licensing, and pricing that lands V4 Pro at roughly one-seventh the cost of Claude Opus 4.7 and one-sixth the cost of GPT-5.5 on coding workloads. The architectural mechanism behind this efficiency is a Hybrid Attention Architecture combining Compressed Sparse Attention and Heavily Compressed Attention, which DeepSeek says cuts inference FLOPs (floating-point operations, the basic unit of computational work) at 1 million tokens to 27% of what the previous version required, and the KV cache (the stored computational context a model maintains across a conversation) to just 10%. For an intelligence operation running millions of language generation tasks at scale—for influence operations, for translation of intercepted communications, for automated OSINT collection—cost is an operational variable. DeepSeek V4 Pro is less capable than GPT-5.5 at the margin and roughly six times cheaper on coding workloads. That tradeoff will appear attractive to any adversary running high-volume content operations.
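To make the cost variable concrete, here is a back-of-envelope sketch. The frontier per-million-token price is a hypothetical placeholder, not a published rate; the 1/7 ratio is the V4 Pro versus Claude Opus 4.7 comparison cited above.

```python
# Illustrative cost arithmetic only. The frontier per-million-token price is
# a hypothetical placeholder, not a published rate; the 1/7 ratio is the
# V4 Pro vs. Claude Opus 4.7 comparison cited in the text.
FRONTIER_PRICE_PER_M_TOKENS = 15.00  # USD, assumed
V4_PRO_PRICE_PER_M_TOKENS = FRONTIER_PRICE_PER_M_TOKENS / 7

def campaign_cost(tasks: int, tokens_per_task: int, price_per_m: float) -> float:
    """Total generation cost for a high-volume content operation."""
    total_tokens = tasks * tokens_per_task
    return total_tokens / 1_000_000 * price_per_m

# One month of 10 million generation tasks at ~800 output tokens each:
frontier = campaign_cost(10_000_000, 800, FRONTIER_PRICE_PER_M_TOKENS)
cheap = campaign_cost(10_000_000, 800, V4_PRO_PRICE_PER_M_TOKENS)
print(f"closed frontier model: ${frontier:,.0f}")  # $120,000
print(f"V4 Pro at 1/7 price:   ${cheap:,.0f}")     # $17,143
```

Whatever the absolute prices turn out to be, the multiplier is the point: at tens of millions of generated items, a six-to-seven-fold price gap is the difference between a rounding error and a budget line.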

Alongside DeepSeek, Alibaba's Qwen series has become a genuinely parallel frontier track. Qwen 3.6 Plus Preview is Alibaba's next-generation flagship language model, released March 30-31, 2026, built on a new hybrid architecture designed for improved efficiency, stronger reasoning, and more reliable agentic behavior. Qwen3.6-Max-Preview claims the top position on six major coding benchmarks as of its April 20, 2026 release—the third major Qwen release in April 2026 alone. Three major flagship releases in a single month is not iterative improvement. It is a pace of development that reflects organizational urgency and deep engineering bench strength.

Qwen3.6 features expanded support to 201 languages and dialects. For an analyst assessing China's AI capability as an intelligence instrument, that specification—201 languages—is an influence operations capability specification, not a developer convenience feature.

Moonshot's Kimi models complete the picture of China's commercial AI frontier. Kimi 2.5 leads on LiveCodeBench for competitive programming, with each Chinese flagship model now holding specific domain leadership: Qwen on front-end coding, GLM on long-horizon autonomous tasks, and Kimi on competitive algorithm challenges. The Chinese open-source AI ecosystem is not a monolith chasing a single benchmark; it is a distributed competition among specialized labs that collectively cover the frontier in overlapping sectors. By early 2026, among the top six global open-source AI model families, China holds three seats—Qwen, DeepSeek, and Yi—and leads the U.S. in both download volume and community activity. A growing number of AI applications worldwide are being built on foundation models developed in China, expanding China's influence at the foundational AI technology layer.

That last point deserves to stand on its own. The analytic community has spent considerable energy on the question of frontier capability—which country's best model scores highest on which benchmark. The more operationally significant question is about deployment breadth. When Chinese-developed open-weight models are the foundational layer for applications in Southeast Asia, Africa, and Latin America, the signals intelligence (SIGINT) and OSINT implications extend well beyond what any benchmark captures.


The Military-Civil Fusion Architecture

Western analysts sometimes treat military-civil fusion as a rhetorical frame rather than as an operational mechanism with documented outputs. The evidence from public PLA procurement records makes that dismissal indefensible.

Drawing on 2,857 AI-related defense contract award notices published by the PLA between January 2023 and December 2024, Georgetown's Center for Security and Emerging Technology (CSET) found that while China's legacy defense sector players lead AI-related military procurement, an emerging set of nontraditional vendors and research institutions plays a consequential role as well. Some of the top suppliers are state-owned enterprises (SOEs) in the defense sector, but the majority are civilian companies and universities developing dual-use technologies. The structural shift is from a model in which defense capability flows from dedicated defense SOEs to one in which civilian commercial AI firms—firms building products for the consumer market, firms whose engineers work on DeepSeek and Qwen and iFlytek—are simultaneously contracted by the PLA to deliver military AI capabilities.

iFlytek, which built China's first mass automated voice recognition and monitoring system and played a key role in government surveillance programs in Xinjiang and Tibet, has been on the U.S. Entity List since late 2019. The CSET dataset of PLA contracts shows that iFlytek Digital emerged as an important military supplier, winning 20 contracts in 2023 and 2024, including one for the development of an "Intelligent Speech Processing and Translation System" and another for an "Auxiliary Decision-Support System."

The analytic implication deserves explicit statement. iFlytek is building consumer speech recognition products for hundreds of millions of Chinese users; it is also under sanctions for its role in Xinjiang surveillance; it is also delivering speech processing and decision-support systems to the PLA under contract. These are not separate organizations. They are the same engineers, the same training data pipelines, the same models, deployed across civilian, surveillance, and military contexts simultaneously. Military-civil fusion is a procurement record.

CSET's February 2026 report examined thousands of PLA requests for proposal (RFPs) published between January 2023 and December 2024. The RFPs offer insight into the PLA's priorities for AI-enabled military technologies associated with C5ISRT: command, control, communications, computers, cyber, intelligence, surveillance, reconnaissance, and targeting. Beyond the maritime and space domains, the RFPs show the PLA aiming to acquire increasingly sophisticated surveillance and cognitive domain capabilities, including facial and gait recognition systems, digital surveillance tools capable of recovering deleted data, and technologies for generating and detecting deepfakes—pointing to ongoing efforts to develop AI-enabled psychological warfare and cognitive targeting tools.

China's military doctrine describes the "cognitive domain" as a legitimate theater of operations, distinct from the kinetic, cyber, and electromagnetic domains. PLA procurement records show active acquisition of deepfake generation technology—not deepfake detection, but generation—for what the contracts term psychological warfare and cognitive targeting. The direct line from civilian large language model (LLM) capability to military cognitive operations capability runs through the MCF (military-civil fusion) procurement system, and the procurement records are public.

Under Xi's direct oversight through the Central Commission for Military-Civil Fusion Development, China has moved beyond the traditional model of distinct civilian and military technology sectors toward what Beijing terms a "fused" system. The emerging 15th Five-Year Plan framework institutionalizes MCF as the primary pathway for achieving what Chinese strategists call an "intelligentized" PLA by 2035. The plan calls for creating interoperable civilian-defense standards and shared infrastructure, and for establishing a "green channel" that allows scientific and technological advancements made in the civilian sector to move rapidly into military applications. This is a designed system for accelerating the transfer of commercial innovation into defense capability, not incidental dual-use.

For Western analytic organizations, the operational implication is that assessments of Chinese AI capability focusing only on explicitly military AI programs will systematically underestimate total PRC AI capacity. The capability available to the PLA is not limited to what the defense SOEs produce. It is the full output of the Chinese commercial AI ecosystem, redirected on demand through legally mandated fusion mechanisms. Every advance in DeepSeek's reasoning capability, every improvement in Qwen's multilingual fluency, every refinement in Kimi's long-context performance is, under MCF doctrine, available to PLA procurement the following quarter.


The Chip War: Where Constraints Bind

The U.S. export control effort on advanced semiconductors is the most consequential technology policy action the West has taken against Chinese AI development. It is working less well than its architects intended in one specific respect, while in another it is working better than its critics claim.

Where the controls constrain: the raw compute available for training frontier models. The Council on Foreign Relations published a widely noted report in early 2026 that systematically assessed the U.S.-China AI compute gap. Its core conclusion: measured by "effective compute available for frontier AI training," U.S. available AI compute could reach 17 times China's by the end of 2027. This gap stems not only from single-chip performance differences but from systemic deficits across three dimensions: the generational gap in advanced process nodes (TSMC at 3nm versus SMIC at 7nm—node designations that index chip density and efficiency), supply bottlenecks for critical components like HBM (high-bandwidth memory), and the maturity gap in software ecosystems.
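The "effective compute" framing is multiplicative: modest per-dimension edges compound. The sketch below illustrates the arithmetic only—every factor value is an invented placeholder, chosen to show how an aggregate ratio on the order of the cited 17x can emerge from several smaller gaps.

```python
# Hypothetical decomposition of an aggregate "effective compute" advantage
# into multiplicative factors. Every value below is an illustrative
# assumption -- the point is the compounding structure, not the numbers.
factors = {
    "per_chip_training_throughput": 2.4,  # process-node and design edge
    "deployed_fleet_size":          3.1,  # count of frontier-class accelerators
    "software_stack_utilization":   1.5,  # mature tooling extracts more FLOPs
    "memory_and_interconnect":      1.5,  # HBM supply, cluster networking
}

effective_advantage = 1.0
for name, ratio in factors.items():
    effective_advantage *= ratio

# Four modest edges compound to roughly the 17x order of magnitude cited.
print(f"compound effective-compute ratio: {effective_advantage:.1f}x")
```

The compounding structure is why closing any single dimension—say, domestic HBM supply—moves the aggregate ratio far less than intuition suggests.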

A 17-to-1 compute advantage by 2027 is a structural constraint on China's ability to train frontier-scale models through brute-force scaling. DeepSeek's breakthrough with R1 in January 2025 demonstrated that architectural efficiency can substitute partially for raw compute—but "partially" is doing real work in that sentence. DeepSeek R2's extended delay was caused primarily by difficulties training the model on domestically produced Huawei Ascend AI chips, a move reportedly encouraged by Chinese authorities. The chips exhibited instability and performance problems, forcing DeepSeek to pivot back to Nvidia hardware for the critical training phase. The Ascend dependency issue is a direct export control effect: the PRC cannot run its most capable open-weight models through their full training pipeline on domestic hardware without hitting the performance ceilings that forced DeepSeek back to Nvidia.

Where the controls fail to constrain: inference. Once a model is trained, running it—serving responses, generating text at scale, operating OSINT pipelines—requires inference compute, not training compute. DeepSeek V4's architectural innovations in compressed sparse attention reduce inference cost dramatically, cutting inference FLOPs at 1 million tokens to 27% of what the previous generation required, and the KV cache to just 10%. A model that can be run at a quarter of the inference cost of its predecessor can be run on proportionally more modest hardware. Export controls on NVIDIA A100 and H100 chips constrain the training of new frontier models; they do not prevent inference-time operation of already-trained models at very large scale on domestically produced alternatives, including Huawei Ascend chips that, while inferior for training, are adequate for inference at the reduced computational load.
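The cited efficiency ratios translate directly into serving-hardware arithmetic. In the sketch below, only the 27% and 10% ratios come from the text; the baseline cluster size and sessions-per-node figures are hypothetical placeholders.

```python
# Back-of-envelope serving arithmetic. The 27% FLOPs and 10% KV-cache
# ratios are the figures cited in the text; the baseline cluster size and
# sessions-per-node numbers are hypothetical placeholders.
FLOP_RATIO = 0.27  # inference FLOPs at 1M-token context vs. prior generation
KV_RATIO = 0.10    # KV-cache memory vs. prior generation

# If (hypothetically) the prior generation needed 1,000 accelerators to
# serve a given request volume, and the deployment is compute-bound:
prev_accelerators = 1_000
new_accelerators = prev_accelerators * FLOP_RATIO

# Concurrent long-context sessions per node are bounded by KV-cache memory:
prev_sessions_per_node = 8
new_sessions_per_node = prev_sessions_per_node / KV_RATIO

print(f"accelerators for the same volume: {new_accelerators:.0f}")      # 270
print(f"long-context sessions per node:   {new_sessions_per_node:.0f}")  # 80
```

The same arithmetic explains why training-inferior domestic chips can still be operationally sufficient: the compute bar for serving an already-trained efficient model is a fraction of the bar for training one.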

White House OSTP (Office of Science and Technology Policy) Director Michael Kratsios published a memo accusing China of running "deliberate, industrial-scale campaigns to distill U.S. frontier AI systems," using tens of thousands of proxy accounts and jailbreaking techniques. Distillation—using the outputs of a more capable model to train a less capable one—allows Chinese labs to extract capability from GPT-5.5 and Claude Opus 4.7 without direct access to the weights. If the accusation is accurate, export controls on chips and weights are being partially circumvented at the model capability level.
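Distillation, mechanically, is a two-step loop: collect a stronger model's outputs at scale, then train a weaker model to reproduce them. The toy sketch below (standard library only—no real models or APIs) uses a fixed function as a stand-in "teacher" to show the mechanism: the student ends up imitating the teacher's behavior despite never seeing its parameters.

```python
import math
import random

random.seed(0)

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

# "Teacher": a fixed function standing in for a stronger model's behavior.
def teacher(x: float) -> float:
    return sigmoid(3.0 * x - 1.0)

# Step 1 -- collection: query the teacher at scale, keep (input, output) pairs.
xs = [random.uniform(-2.0, 2.0) for _ in range(200)]
soft_labels = [teacher(x) for x in xs]

# Step 2 -- training: fit a student of the same form but unknown parameters
# to the teacher's soft outputs, by full-batch gradient descent on squared error.
w, b, lr = 0.0, 0.0, 1.0

def mean_gap(w: float, b: float) -> float:
    return sum(abs(sigmoid(w * x + b) - y) for x, y in zip(xs, soft_labels)) / len(xs)

before = mean_gap(w, b)
for _ in range(5000):
    gw = gb = 0.0
    for x, y in zip(xs, soft_labels):
        p = sigmoid(w * x + b)
        g = (p - y) * p * (1.0 - p)  # d(squared error)/d(logit)
        gw += g * x
        gb += g
    w -= lr * gw / len(xs)
    b -= lr * gb / len(xs)
after = mean_gap(w, b)

# The student imitates the teacher from its outputs alone -- no weight access.
print(f"mean |student - teacher|: {before:.3f} -> {after:.3f}")
```

The operational analogue swaps the toy function for a frontier model behind an API and the two-parameter student for a full LLM fine-tune, which is why output access at scale—proxy accounts, jailbroken endpoints—is itself a capability-transfer channel.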

The honest summary of the chip war as of May 2026: the controls are delaying PRC frontier AI development by a margin independent experts estimate at six to eight months for the most capable models, while doing essentially nothing to constrain PRC AI inference capacity, influence operation scale, or the deployment of already-trained models for intelligence applications. The NIST CAISI evaluation estimates an eight-month lag. DeepSeek itself claims it trails state-of-the-art closed models by only three to six months. The vendor-claimed and government-verified numbers bracket a gap that reads as anything from "existentially significant" to "analytically irrelevant," depending on which task you are assessing.

For intelligence organizations, the relevant question is not which country has the better model for coding tasks or mathematical reasoning. It is whether the PRC has sufficient AI capability to execute at scale across the domains—language generation, translation, synthetic media production, OSINT automation, and adversarial content campaigns—where it is trying to establish operational advantage. On those specific tasks, the gap between a six-month-old U.S. frontier model and a current Chinese frontier model is essentially zero.


Chinese LLMs and the Information Environment

The Pentagon's 2025 China military report to Congress stated directly: "In 2024, China's commercial and academic AI sectors made progress on large language models and LLM-based reasoning models, which has narrowed the performance gap between China's models and the U.S. models currently leading the field." From an influence operations perspective, that capability narrowing is more precisely described as capability saturation. The PRC does not need GPT-5.5-level text fluency to run effective content operations at scale. It needed to clear a quality threshold that made AI-generated content indistinguishable from human-written content in the target languages of its operations. That threshold was cleared by 2024.

The documented operational record is specific. OpenAI disrupted four China-linked operations from March to June 2025 that had used ChatGPT. Ben Nimmo, principal investigator on OpenAI's intelligence and investigations team, told NPR that the China-linked operations "combined elements of influence operations, social engineering, and surveillance." The use of ChatGPT—a U.S. commercial model—for Chinese-linked operations shows why the "Chinese models" framing is analytically incomplete. PRC influence infrastructure is using whatever tools are available and adequate for the task, including U.S. frontier models accessed through commercial APIs.

The "Falsos Amigos" report from Graphika (a network analysis firm specializing in influence operations) identified a network of 11 fake websites, established between late December 2024 and March 2025, that used AI-generated pictures as logos or cover images to enhance credibility. These examples point to how China-linked information operations are increasingly using generative AI tools to refine previously deployed tactics, including content laundering, covert dissemination of state propaganda, smear campaigns, and development of fake social media personas.

The Taiwan case shows the progression most clearly. In 2024, 2.159 million instances of disinformation were recorded, up from the 2023 total of 1.329 million. Facebook remained the primary dissemination platform, with volume up 40% over 2023, while disinformation on video platforms, forums, and X grew by 151%, 664%, and 244% respectively—a platform shift indicating that the younger generation is the primary target.

The DOD report notes that Beijing continued to invest in AI for military applications including ISR (intelligence, surveillance, and reconnaissance), decision-making assistance, cyber operations, and information campaigns, and that PLA publications have argued LLMs can boost efficiencies for creating synthetic media, including deepfakes. PRC military researchers have complained that the PLA lacks the necessary staff with adequate foreign-language skills and cross-cultural understanding for authentic content generation—and that leading generative AI technologies offer a potential technical solution to this deficiency.

That Pentagon assessment is more revealing than it may appear. The PLA's acknowledged weakness is not model capability at the margin—it is the human expertise required to generate culturally authentic foreign-language content, content that sounds like a real Taiwanese citizen, a real Japanese student, a real Filipino nationalist, rather than a translated Chinese government narrative. The PLA recognizes this gap and is explicitly pursuing LLMs as the technical solution. Once that solution is adequate—once the models can produce culturally fluent persuasive content in Filipino, in Swahili, in Brazilian Portuguese without detectable machine authorship—the volume of operations currently documented in Taiwan scales globally, at marginal cost per additional market.

A paper from Stanford's Jennifer Pan and Princeton's Xu Xu examines how government regulation shapes output from Chinese companies' LLM chatbots, finding that "China's AI regulations are an extension of its censorship regime, building on and reinforcing existing government censorship efforts." Their findings confirm that Chinese models are substantially more likely to refuse to answer questions on sensitive political topics, or to give brief, selective, or otherwise misleading answers. This has a direct implication for intelligence operations: analysts using Chinese LLMs for research support will encounter systematic information gaps on topics the CCP (Chinese Communist Party) deems sensitive—structured omissions that track exactly the Chinese state's political priorities, not random hallucinations. The model does not lie chaotically. It lies coherently, in the direction of party-defined acceptable narratives.

When investigators asked five major LLM chatbots—ChatGPT, Copilot, Gemini, DeepSeek-R1, and Grok—to provide information on topics the PRC deems controversial in English and simplified Chinese, all chatbots sometimes returned responses indicative of censorship and bias aligning with the CCP. Among U.S.-hosted chatbots, Microsoft's Copilot was more likely to present CCP talking points as authoritative. The contamination is not limited to Chinese-developed models. The scale of Chinese state media and official content in global training corpora means Western models are also absorbing CCP framing on politically sensitive topics—at reduced but non-zero levels. In testing between October and November 2025 across approximately 180 questions about three conflicts, state-aligned propaganda appeared in 57 percent of responses across major AI platforms.

For an analyst using any commercial LLM for research on Taiwan, Tibet, Xinjiang, the South China Sea, or the legitimacy of PRC governance, the appropriate analytic posture is to treat LLM outputs as potentially contaminated sources requiring corroboration—exactly as one would treat a report from any source with known systematic bias.


Implications for Western Analytic Organizations

The threat model that most Western analytic organizations are operating against has a structural flaw. It envisions a bounded adversary capability—Chinese AI that is capable but inferior, deployed through recognizable channels, targetable through existing detection methods. The evidence in 2026 points to something more difficult: a capability that is adequate for most influence and collection tasks, deployed through channels that are partly domestic and partly through global commercial AI infrastructure that Western organizations also depend on, and producing content increasingly indistinguishable from authentic human output.

Three specific implications for organizational practice follow from this assessment.

The first is source environment modeling. The standard intelligence approach treats source reliability as a characteristic of the source—this outlet is generally reliable, that official is a known fabricator. AI-assisted adversarial content operations require treating the source environment itself as adversarially managed, not merely as containing some unreliable sources. If PRC influence operations are producing content at volumes that dwarf authentic production in specific topic areas—Taiwan security, PLA capability assessments, South China Sea territorial claims—then random sampling of open-source reporting on those topics will be systematically biased toward adversarially crafted narratives. The analyst needs to know which topic-geography combinations are high-confidence adversarial content targets, and weight accordingly. That knowledge requires dedicated OSINT analysis of the influence operation landscape, not just subject-matter analysis.
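One hedged way to make "weight accordingly" concrete is to discount naive open-source sample counts by an estimated adversarial share per topic-audience combination. Everything in this sketch—category names, prevalence values—is a hypothetical placeholder, not an actual assessment.

```python
# Hypothetical prevalence estimates: assessed fraction of open-source items
# on a (topic, audience) combination that is adversarially produced.
# All keys and values here are illustrative placeholders.
ADVERSARIAL_PREVALENCE = {
    ("taiwan_security", "zh-TW"):         0.50,
    ("pla_capability_assessments", "en"): 0.30,
    ("scs_territorial_claims", "tl"):     0.45,
}

def effective_sample_size(raw_items: int, topic: str, audience: str) -> float:
    """Discount a naive open-source sample by its estimated adversarial share.

    If half the items on a topic are assessed as adversarially produced, a
    1,000-item sample carries the evidentiary weight of ~500 authentic items.
    """
    p = ADVERSARIAL_PREVALENCE.get((topic, audience), 0.0)
    return raw_items * (1.0 - p)

print(effective_sample_size(1000, "taiwan_security", "zh-TW"))  # 500.0
```

The hard analytic work sits in the table, not the function: producing and maintaining defensible prevalence estimates is exactly the dedicated influence-operation OSINT analysis the paragraph above calls for.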

The second implication is human intelligence (HUMINT) exposure. OpenAI's Nimmo noted that China-linked operations combined elements of influence operations, social engineering, and surveillance. AI-generated personas operating across social media platforms can develop rapport with analysts and researchers over months—responding thoughtfully, sharing ostensibly exclusive information, building the appearance of a genuine source relationship—before being used to surface disinformation, extract information through targeted questions, or map the analyst's collection priorities. Classic source validation tradecraft applies but requires updating: the test for a competent AI-generated persona running in 2026 is not whether the text reads as human. It is whether the relationship pattern over time exhibits the consistent, contextually coherent, self-interested behavior that distinguishes a real person from an automated content system.
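A toy illustration of the "consistency over time" test: track the factual self-claims a contact makes across interactions and flag attributes claimed inconsistently. The log, field names, and claim values are all hypothetical, and real tradecraft would have to extract such claims from free text—the hard part this sketch deliberately skips.

```python
from collections import defaultdict

def find_contradictions(interactions):
    """Flag attributes a contact has claimed inconsistently over time.

    `interactions` is a list of (timestamp, {attribute: claimed_value})
    pairs; extracting such structured claims from message text is the
    hard problem this illustrative sketch assumes away.
    """
    history = defaultdict(set)
    flags = []
    for ts, claims in interactions:
        for attr, value in claims.items():
            if history[attr] and value not in history[attr]:
                flags.append((ts, attr, sorted(history[attr]), value))
            history[attr].add(value)
    return flags

# Hypothetical contact log: a persona that "forgets" its own cover story.
log = [
    ("2026-01-10", {"home_city": "Taipei", "employer": "NTU"}),
    ("2026-02-02", {"employer": "NTU", "age": "29"}),
    ("2026-04-17", {"home_city": "Kaohsiung"}),  # contradicts January claim
]
for ts, attr, prior, new in find_contradictions(log):
    print(f"{ts}: '{attr}' was {prior}, now '{new}'")
```

Exact string matching is obviously too brittle for operational use; the point is the shape of the test—longitudinal self-consistency rather than per-message fluency—which is precisely where automated content systems remain weakest.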

The third implication is model dependency risk in analytic workflows. If frontier models are being trained on outputs extracted from other frontier models under adversarial conditions—as the Kratsios memo formally accuses—then the analytic tools that Western organizations are deploying on classified networks are not fully auditable against adversarial contamination through the training data pipeline. This is a reason to demand provenance transparency from model providers and to treat AI-generated analysis of high-sensitivity geopolitical topics with the same structured skepticism that classical tradecraft demands of any source with unknown collection provenance.

The Pentagon's China report notes directly that LLMs and LLM-based reasoning models are useful for military applications including coding tasks to assist cyber operations, question-answering tasks to assist military decision-making, and synthetic content tailoring to assist influence operations—and that "the PLA continues to use MCF mechanisms to ensure China's academic and commercial AI communities provide robust, continuous support to military research and development projects."

The word "continuous" in that formulation is where analytic organizations should focus their attention. It describes not a discrete capability transfer but a structural pipeline in which civilian AI advancement feeds automatically and continuously into military capability. There is no clean boundary between the DeepSeek that handles customer service queries and the DeepSeek that assists in drafting operational messaging for Taiwan-focused influence campaigns. They are the same model, subject to the same MCF routing mechanisms.


Where Capability Sits in 2026, and Why the Uncertainty Matters

The honest answer to the frontier capability question is that analysts without access to classified assessments are working with bounded but genuine uncertainty—and that uncertainty itself carries analytic weight.

On the public-benchmark evidence: the Artificial Analysis Intelligence Index v4.0 (a standardized ranking of AI model performance across multiple dimensions) shows OpenAI near 60 points and DeepSeek in the low 50s as of May 2026—compressed far tighter than a year ago, and still narrowing. On the government-assessed evidence: CAISI places the gap at eight months, weighted toward domains measured by non-public benchmarks, including cybersecurity tasks where DeepSeek scores roughly half of the leading U.S. models. DeepSeek itself claims the gap is three to six months. The honest synthesis: somewhere in the range of three to eight months on general capability, with the gap compressed toward near-parity on the specific tasks most relevant to influence operations and widest on the cybersecurity tasks most relevant to classified network operations.

China's AI startup DeepSeek kicked off 2026 with research that analysts say improves scaling without increasing instability or cost. DeepSeek released a paper introducing a training method called Manifold-Constrained Hyper-Connections, co-authored by founder Liang Wenfeng. A lab that is publishing architectural innovations while simultaneously shipping frontier-competitive models is not at the limit of its capabilities. It is investing in the research infrastructure for the next generation while competing at the current one.

DeepSeek is focused on keeping its team lean—roughly 160 people—propelling development toward new models with the goal of reaching AGI (artificial general intelligence, the point at which AI systems can perform any intellectual task a human can), rather than scaling up commercial product offerings. One hundred sixty engineers producing models that the U.S. government independently assesses as the most capable Chinese-developed AI it has evaluated. The organizational model is itself an intelligence data point: DeepSeek's resource efficiency shows that the Western assumption—that AI frontier competition requires the resource base of a major hyperscaler—is wrong. That assumption has been comforting to those who reason that China's disadvantages in capital, chips, and talent will constrain its frontier development. The DeepSeek case contradicts it, in documented, reproducible form.

The more fundamental uncertainty is about what PRC intelligence and military AI systems look like in the classified domain. Everything assessed in this episode—DeepSeek, Qwen, Kimi, and their benchmarks—reflects the open-source PRC AI ecosystem. President Xi Jinping has emphasized the modernization of the PLA at the core of China's future planning, with the military expected to play a substantial role in the 15th Five-Year Plan guiding development from 2026 to 2030. The classified military AI programs, operating under MCF doctrine with direct access to commercial AI talent and models, are not visible in any public benchmark. They may be substantially more capable than the commercial models in specific military-relevant domains. Or they may be less capable, given the historically sluggish performance of China's defense SOEs in rapid technology adoption. The MCF procurement records suggest the latter risk is being deliberately addressed, but they do not resolve it.

This uncertainty asymmetrically affects planning. If Western analytic organizations assume PRC classified AI is merely an extension of the commercial frontier, they are likely underestimating capability in specific military-relevant domains. If they assume it dramatically exceeds the commercial frontier, they are likely overestimating and will make planning decisions based on a threat larger than the evidence supports. The disciplined analytic response is to maintain explicit, calibrated uncertainty—to reason about the range of likely capability rather than collapsing to a point estimate driven by the political convenience of either narrative.

That means building collection strategies that prioritize indicators of PRC classified AI deployment: procurement patterns, personnel movements between commercial AI labs and defense research institutions, and the specific capability signatures that distinguish commercial from military-grade AI in deployed applications. It means stress-testing Western analytic AI tools against the specific failure modes—factual hallucination, adversarial narrative embedding, structured political omission—that the evidence documents in Chinese-adjacent AI systems. And it means maintaining the institutional capacity to update assessments rapidly when new capability evidence emerges, because the pace of change in this domain has made any point estimate stale within months.


The frontier question is not simply "how good is Chinese AI?" The more important question, the one that shapes both organizational planning and specific analytic production, is this: which specific capabilities can the PRC execute reliably at scale, against which target populations, using which combination of domestic and global AI infrastructure, in what operational timeframe? The answer as of May 2026 is that PRC actors can execute adversarial content generation at large scale in dozens of target languages, with sufficient quality to pass casual detection, using commercially available models including both Chinese-developed and U.S.-developed systems accessed through global APIs. That capability exists today, is documented in operational reporting, and is improving on a timeline measured in months, not years.

Every analytic organization that treats this as a future threat rather than a current operating environment is miscalibrating its baseline. The influence operation volumes documented in Taiwan—2.159 million logged disinformation instances in 2024, a roughly 62 percent increase over the prior year, with video and forum-based dissemination rising by triple digits—are not a preview of what AI-assisted adversarial content operations will look like. They are a measurement of what those operations already look like when conducted against a single target population of 23 million people with a well-resourced domestic counterintelligence apparatus actively counting cases. Scale that operational tempo to a target environment without Taiwan's dedicated monitoring infrastructure, and the documented numbers are a floor, not a ceiling.

This episode opened with the claim that Western intelligence organizations are making the wrong kind of error about Chinese AI—overestimating it on the wrong variables while underestimating it on the variables that matter operationally. The evidence supports that framing precisely. Frontier benchmark gaps on mathematical reasoning and graduate-level science are real but analytically secondary. The capability already deployed, already operational, and already producing measurable effects on information environments across the Indo-Pacific is the language generation and synthetic content capability—and on those tasks, the gap between Chinese and U.S. AI systems is, for practical operational purposes, closed.

The relevant question is no longer whether China has sufficient AI capability to conduct influence operations at scale. It does. The question now is what detection, attribution, and countermeasure infrastructure exists to operate in an environment where that is the baseline condition—and whether Western analytic organizations have built the collection priorities, tradecraft adaptations, and institutional habits appropriate to that environment. On the current evidence, most have not.