ChatGPT DNS Side Channel Attack: Securing LLM Integrations

ChatGPT DNS side channel attacks work by encoding data into DNS subdomain queries and sending them outward through the resolver chain, completely bypassing the outbound network controls that OpenAI told users were in place. Check Point Research disclosed the technique on March 30, 2026, confirming that a single malicious prompt was enough to turn any ChatGPT conversation into a covert exfiltration channel. OpenAI fixed the underlying flaw on February 20, 2026.

For security engineers and enterprise architects, the flaw is a case study in what happens when you build an isolated execution environment and assume that blocking HTTP traffic is the same as blocking all outbound communication. It is not. DNS is a separate protocol that does not depend on HTTP, and it was available inside the ChatGPT container the entire time.

This article walks through exactly how the attack worked, what the two compounding flaws were, how an attacker could weaponize a custom GPT to extract patient records or financial data without triggering a single warning, and what your enterprise LLM deployment needs to implement right now to avoid a similar blind spot in your own infrastructure.

What Check Point Research Actually Found

Check Point Research published their technical disclosure on March 30, 2026. The research uncovered two distinct but compounding vulnerabilities in ChatGPT’s code execution runtime.

The first was a side-channel exfiltration path. OpenAI’s sandboxed Python execution environment, used for the Data Analysis feature, was designed to block direct outbound internet access. The company’s own documentation stated: “The ChatGPT code execution environment is unable to generate outbound network requests directly.” Check Point found that statement to be only partially true. While standard TCP/IP connections to external hosts were blocked, DNS resolution remained fully available inside the container.

The second flaw was a model-layer blind spot. Because the AI model had been trained to believe the execution environment had no outbound access, it did not treat DNS-based data transmission as an external data transfer. The model’s own safeguards, the ones that would normally require user confirmation before sending data out, simply did not fire. The model was not lying when it denied exfiltrating data; it genuinely did not understand that DNS could be used as a transport channel.

The result: a conversation that appeared completely normal from the user’s perspective was quietly sending encoded data to an attacker-controlled server through DNS queries. No warning dialog. No approval prompt. No indication of any kind.

How DNS Tunneling Works Inside a Sandboxed Runtime

DNS tunneling is a well-documented attack technique in network security, but its application inside an AI runtime is less commonly discussed. The mechanics here are worth understanding precisely, because the same pattern will reappear in other LLM platforms.

Standard DNS exists to resolve domain names into IP addresses. A client asks a resolver: what is the IP address of example.com? The resolver queries the authoritative name server and returns the answer. The key point is that the resolver chain carries the full hostname in each query, including any subdomain labels that prefix the root domain.

Attackers exploit this by encoding data into those subdomain labels. Instead of resolving a normal hostname, malicious code generates queries where the subdomain contains base64-encoded payload data. The attacker runs an authoritative name server for their domain, which logs every incoming query. Each DNS lookup delivers a fragment of the stolen data to the attacker’s server without any direct TCP connection being established between the victim container and the external destination.
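To make the pattern concrete for defenders reading resolver logs, here is a minimal sketch of how a payload maps onto query names and how a logging name server could reassemble it. It performs no network activity. The function names, the sequence-number label, and the apex `t.example.net` are illustrative assumptions, not details from the Check Point research. The sketch also uses base32 rather than base64, because hostnames are case-insensitive and base64's `+`, `/`, and `=` characters are not valid in them.

```python
import base64

MAX_LABEL = 63   # DNS limit on the length of a single label
MAX_CHUNK = 39   # 39 raw bytes encode to exactly 63 base32 characters


def encode_to_hostnames(data: bytes, apex: str, chunk: int = 30) -> list[str]:
    """Split data into chunks and place each chunk in one subdomain label."""
    if not 0 < chunk <= MAX_CHUNK:
        raise ValueError("chunk must be between 1 and 39 bytes")
    names = []
    for seq, start in enumerate(range(0, len(data), chunk)):
        piece = data[start:start + chunk]
        # Strip '=' padding (invalid in hostnames) and lowercase the label.
        label = base64.b32encode(piece).decode("ascii").rstrip("=").lower()
        # The leading sequence number lets the receiver restore the order.
        names.append(f"{seq}.{label}.{apex}")
    return names


def decode_from_hostnames(names: list[str], apex: str) -> bytes:
    """Rebuild the payload from logged query names (the receiver's view)."""
    parts = {}
    for name in names:
        prefix = name[: -len(apex) - 1]          # drop ".<apex>"
        seq, label = prefix.split(".")
        padded = label.upper() + "=" * (-len(label) % 8)  # restore padding
        parts[int(seq)] = base64.b32decode(padded)
    return b"".join(parts[i] for i in sorted(parts))
```

The detection-relevant point is visible in the output of `encode_to_hostnames`: every name shares one apex domain and carries a long, random-looking label, which is exactly what length and entropy checks on resolver logs look for.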

To create a bidirectional channel (needed for the remote shell proof-of-concept), the attacker encodes command payloads into DNS TXT or CNAME responses. The process inside the container reads those responses, decodes the instructions, executes them, and sends results back through the next outbound DNS query. Check Point demonstrated this as a working remote shell inside the ChatGPT Linux environment. Commands executed through the side channel bypassed the model’s safety mechanisms entirely, because the model never saw them.

The encoding overhead is significant. DNS labels are limited to 63 characters each, and queries travel through resolver infrastructure that rate-limits and caches aggressively. But for exfiltrating selected text, such as a patient name, a financial summary, or a contract extract, the bandwidth is more than adequate.
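A rough capacity estimate shows why the bandwidth is adequate. The sketch below assumes base32 encoding (5 raw bytes per 8 characters), the 63-character label limit, and the 253-character limit on a full DNS name. It ignores sequence numbers and any other framing, so it is an upper bound rather than a measurement of the Check Point proof-of-concept. The function names and the apex `t.example.com` are hypothetical.

```python
import math

MAX_NAME = 253   # maximum length of a full DNS name in textual form
MAX_LABEL = 63   # maximum length of a single label


def payload_bytes_per_query(apex: str) -> int:
    """Upper bound on raw bytes one query name can carry in base32 labels."""
    # Characters available to the left of the apex, minus the joining dot.
    budget = MAX_NAME - len(apex) - 1
    if budget <= 0:
        return 0
    # Each full label costs 63 characters plus one separating dot.
    full_labels = (budget + 1) // (MAX_LABEL + 1)
    remainder = max(budget - full_labels * (MAX_LABEL + 1), 0)
    encoded_chars = full_labels * MAX_LABEL + remainder
    # Base32 carries 5 bits per character, so 8 characters hold 5 bytes.
    return encoded_chars * 5 // 8


def queries_needed(payload_bytes: int, apex: str) -> int:
    """Number of queries needed to carry a payload of the given size."""
    per_query = payload_bytes_per_query(apex)
    if per_query == 0:
        raise ValueError("apex domain leaves no room for payload labels")
    return math.ceil(payload_bytes / per_query)
```

Under these assumptions, an apex of `t.example.com` leaves room for 147 bytes per query, so a 2,000-byte summary fits in 14 queries.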

The Dual-Flaw Architecture That Made This Possible

Understanding why this vulnerability existed requires separating the two independent failures that had to coexist for the attack to work.

Flaw 1: Incomplete network isolation. The ChatGPT code execution container ran on Linux with outbound TCP connections blocked. DNS resolution was permitted because the environment needed it for legitimate internal operations. This is a common misconfiguration pattern in containerized workloads: teams restrict direct internet access but forget that DNS queries route through resolver infrastructure that exits the network perimeter. The assumption that “no direct outbound access = no exfiltration path” is operationally incorrect. DNS, ICMP, and NTP can all carry data outward under the right conditions.

Flaw 2: Model-layer safety assumption failure. OpenAI’s safeguards included behavioral controls in the model itself. The AI was trained to resist requests that would send sensitive data to external parties without user confirmation. Those controls worked correctly for HTTP-based exfiltration. They failed here because the model operated under the documented assumption that the execution environment could not reach external servers. The model had no trained awareness that DNS resolution could itself be a transport channel. An attacker who knew this gap could craft a prompt that instructed the model to encode and resolve DNS hostnames containing conversation data, and the model would comply, not recognizing the action as external data transfer.

These two flaws are independent. Either one alone would not necessarily have led to silent exfiltration. Together, they created an attack that was both technically feasible and invisible to the model’s own safety layer.

Attack Scenario: The Malicious GPT as a Data Collector

Check Point Research built three proof-of-concept attacks. The most consequential for enterprise risk is the malicious custom GPT scenario, because it does not require the victim to do anything unusual.

A GPT (a customized version of ChatGPT configured with instructions, knowledge files, and optionally external API integrations) can be published to the GPT Store and made available to any ChatGPT user. Officially, GPT builders do not have access to individual user conversations. The vulnerability created an unofficial path around that boundary.

In the proof-of-concept, Check Point built a GPT configured as a personal health analyst. A user uploaded a PDF containing laboratory results, a document that included the patient’s name, date of birth, and test values. The user described symptoms and asked the GPT to interpret the results. From the user’s perspective, the interaction was indistinguishable from any other specialized assistant. The GPT analyzed the data and produced a medical assessment.

Simultaneously, and without triggering any warning, the GPT’s embedded instructions caused the model to encode the patient’s name and the generated medical assessment into DNS queries directed at the attacker’s name server. When the user directly asked whether any data had been sent externally, ChatGPT answered that it had not, that the file was stored in a secure internal location only. That response was technically accurate from the model’s perspective: it had no awareness that DNS queries were carrying data outward.

The attacker’s server received the extracted data regardless. For healthcare organizations deploying AI assistants, this scenario maps directly to a potential HIPAA breach. For financial services firms, it maps to a potential MiFID II or FCA data handling violation. For legal teams using AI for contract review, it represents possible privilege disclosure.

The Prompt Distribution Attack Vector

The custom GPT attack is not the only delivery mechanism. Check Point also documented a prompt-based variant that does not require the attacker to publish a GPT at all.

The attack begins when a user encounters a malicious prompt distributed through ordinary channels: a blog post listing productivity prompts, a LinkedIn post, a Reddit thread, or a Slack message from a trusted colleague who copied it from somewhere else. The prompt appears legitimate, perhaps promising to improve ChatGPT’s reasoning quality or enable a specific workflow. The user pastes it into a new conversation.

From that point forward, each new message in the conversation becomes a potential exfiltration target. The malicious prompt instructs the model to summarize subsequent messages and encode those summaries into DNS queries. The scope is configurable: raw user text, content extracted from uploaded files, or the model’s own generated output can all be targeted.

This vector is harder to defend against at the organizational level because it does not require any specific platform to be compromised. Any user with a ChatGPT subscription who encounters the prompt is potentially affected. Browser extensions, clipboard managers, and productivity tools that pipe content into ChatGPT automatically are particularly high-risk delivery mechanisms. For a broader treatment of how prompt-based attacks reach LLM deployments, see our guide on prompt injection attacks and how hackers manipulate AI systems.

Why This Matters Beyond the ChatGPT Patch

OpenAI fixed this specific vulnerability on February 20, 2026. That fix addressed the DNS exfiltration path in the code execution container. It does not address the broader class of attack.

The structural problem is that LLMs are increasingly deployed as execution environments. They read files, run code, call APIs, and process documents containing genuinely sensitive data. Every new capability added to an LLM platform expands its attack surface. The ChatGPT DNS side channel was one instance of a pattern that will recur: a covert outbound path that exists because the environment was designed to do useful things, and useful things require some form of external communication.

DNS tunneling has been a known attack technique in traditional network security for over two decades. Organizations deploy DNS monitoring, DNS filtering through dedicated DNS security solutions, and anomaly detection for exactly this reason. The contribution of the Check Point research is demonstrating that the same technique is directly applicable inside AI execution runtimes, and that the model-layer safety controls that vendors present as a second line of defense are blind to it.

For security architects evaluating LLM integrations, the relevant question is not whether this specific vulnerability has been patched. It is: what outbound communication paths exist in this platform, and which of them bypass the model’s safety layer? The answer will differ by platform, deployment model, and configuration. The patch applied to ChatGPT does not answer it for your private LLM deployment on AWS, Azure, or your on-premise infrastructure.

This is a theme covered in depth in our analysis of LLM security vulnerabilities and how large language models get exploited. The attack surface for enterprise AI deployments extends well beyond what most security teams currently monitor.

Enterprise LLM Integration Security Checklist

The following controls address the specific attack class described by Check Point Research, as well as the broader category of covert side-channel exfiltration from LLM execution environments. These apply whether you are deploying OpenAI APIs, Azure OpenAI Service, self-hosted open-weight models, or custom RAG pipelines.

Network Layer Controls

Block all outbound DNS from AI execution containers to public resolvers. Route all DNS through an internal resolver that you control, with full query logging enabled. This does not prevent DNS tunneling attempts, but it makes them visible and auditable. Any DNS query containing unusual entropy in the subdomain labels (base64-encoded strings, long random-looking tokens, or structured data patterns) is a detection signal worth investigating.

Apply explicit egress filtering at the container level. Blocking TCP 80/443 does not provide network isolation; you need to enumerate every permitted outbound protocol and deny everything else by default. For most LLM code execution environments, legitimate operations require very little outbound access. If your vendor cannot articulate what outbound traffic is necessary and why, treat that as a significant risk indicator.
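The difference between blocking known-bad ports and denying by default can be shown with a small policy model. This is an illustration of the reasoning only, not a firewall configuration. The rule type, both function names, and the internal resolver address `10.0.0.53` are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EgressRule:
    protocol: str      # "tcp" or "udp"
    destination: str   # destination IP address
    port: int


# Denylist approach: block web ports and implicitly permit everything else.
BLOCKED_PORTS = {("tcp", 80), ("tcp", 443)}

# Allowlist approach: permit only DNS to the internal logging resolver
# (hypothetical address) and deny everything else by default.
ALLOWLIST = {
    EgressRule("udp", "10.0.0.53", 53),
    EgressRule("tcp", "10.0.0.53", 53),
}


def denylist_allows(protocol: str, destination: str, port: int) -> bool:
    """Permit traffic unless its protocol and port are explicitly blocked."""
    return (protocol.lower(), port) not in BLOCKED_PORTS


def allowlist_allows(protocol: str, destination: str, port: int) -> bool:
    """Permit traffic only if it exactly matches an enumerated rule."""
    return EgressRule(protocol.lower(), destination, port) in ALLOWLIST
```

With the denylist, a UDP query to a public resolver such as `8.8.8.8` on port 53 is permitted even though web traffic is blocked. With the allowlist, the same query is denied and only the internal resolver is reachable.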

Segment AI workloads from production data networks. If your LLM deployment can reach databases, internal APIs, or file shares containing sensitive data, a successful side-channel exfiltration attack has a much larger blast radius. The LLM should operate with the minimum necessary data access, behind the same network segmentation controls you would apply to any internet-facing service.

DNS Monitoring and Detection

Implement DNS anomaly detection specifically tuned for tunneling patterns. Commercial DNS security solutions and open-source tools such as Suricata or Zeek, loaded with detection rules for tunneling tools like dnscat2, can identify the characteristic signatures: high query volume to a single domain, subdomains with high entropy, unusually long hostnames, and repeated queries with incrementing subdomain labels.

Log and analyze DNS query length distribution. Legitimate DNS traffic generates queries with an average subdomain label length of roughly 8 to 15 characters. DNS tunneling typically produces labels of 40 to 63 characters (the maximum per label). A simple histogram of subdomain lengths, sampled over a 24-hour window, will surface unusual patterns without requiring deep packet inspection.
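A histogram like the one described above can be built from a plain list of logged query names. The sketch below is a starting point under simplifying assumptions: it treats the last two labels of each name as the registered domain, which is wrong for suffixes such as `co.uk`, and the function name and bucket size are arbitrary choices.

```python
from collections import Counter


def label_length_histogram(query_names, bucket: int = 10) -> dict[int, int]:
    """Count subdomain labels per length bucket (0-9, 10-19, and so on)."""
    counts = Counter()
    for name in query_names:
        labels = name.rstrip(".").split(".")
        # Simplification: everything left of the last two labels is
        # treated as a subdomain label.
        for label in labels[:-2]:
            counts[(len(label) // bucket) * bucket] += 1
    return dict(sorted(counts.items()))
```

Fed a day of resolver logs, a healthy host should show nearly all labels in the lowest buckets. A visible population in the 40 to 63 range is the pattern worth investigating.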

Alert on DNS queries to newly registered domains, domains with no historical lookup pattern, and domains where the authoritative name server is a cloud VPS rather than an established hosting provider. These infrastructure characteristics are common in attacker-operated DNS tunneling setups.

A practical starting threshold: flag any DNS query where the subdomain Shannon entropy exceeds 3.5 bits per character, or where a single apex domain receives more than 500 queries per hour from a single source. Both thresholds are well above what legitimate DNS traffic generates from a single host, and both are consistent with the data volumes demonstrated in the Check Point proof-of-concept.
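Both thresholds can be checked against a one-hour slice of resolver logs with a few lines of code. The following is a minimal sketch, assuming the logs have already been parsed into `(source_ip, query_name)` pairs. The function names are hypothetical, and the last two labels are again treated as the apex domain, which does not handle multi-part public suffixes.

```python
import math
from collections import Counter, defaultdict

ENTROPY_THRESHOLD = 3.5   # bits per character, from the text above
RATE_THRESHOLD = 500      # queries per hour per source and apex domain


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def split_name(name: str) -> tuple[str, str]:
    """Return (concatenated subdomain labels, apex domain) for a query name."""
    labels = name.rstrip(".").lower().split(".")
    return "".join(labels[:-2]), ".".join(labels[-2:])


def flag_queries(records) -> set[tuple[str, str, str]]:
    """Flag (source, apex, reason) tuples from one hour of (source, name) pairs."""
    per_apex = defaultdict(int)
    flagged = set()
    for source, name in records:
        subdomain, apex = split_name(name)
        per_apex[(source, apex)] += 1
        if shannon_entropy(subdomain) > ENTROPY_THRESHOLD:
            flagged.add((source, apex, "entropy"))
    for (source, apex), count in per_apex.items():
        if count > RATE_THRESHOLD:
            flagged.add((source, apex, "rate"))
    return flagged
```

One property of the entropy measure is worth knowing when tuning: a string of n characters can never exceed log2(n) bits per character, so subdomains shorter than 12 characters cannot cross the 3.5 threshold at all. Short labels therefore need the rate check or the length histogram to surface.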

Prompt and Content Controls

Treat externally sourced prompts as untrusted input. This applies to prompts distributed via productivity blogs, community forums, colleague sharing, and any automated pipeline that ingests prompts from external sources. Establishing an approved-prompt library for enterprise ChatGPT or API deployments reduces exposure to prompt-based attack delivery, in much the same way that managing an approved software list reduces exposure to supply chain attacks.

Audit custom GPT configurations before allowing organizational use. If your organization permits employees to use GPTs from the GPT Store, you need a review process analogous to the app vetting process you apply to mobile or desktop software. GPT instructions are not visible to end users by default, and as the Check Point research demonstrates, those instructions can contain malicious logic without the user’s knowledge.

Implement output monitoring for AI-generated content that contains encoded strings. Legitimate AI output does not typically include long base64-encoded blocks, hexadecimal sequences, or structured subdomain-like strings. DLP rules tuned to detect these patterns in AI output can catch exfiltration attempts before the data reaches a resolver.
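A DLP rule of this kind can start as a handful of regular expressions. The sketch below is one possible shape, with pattern lengths (32 hex characters, 40 base64 characters, 30-character hostname labels) chosen as assumptions for illustration rather than validated thresholds. Expect false positives on long URL paths, hashes, and identifiers, and expect a single encoded label to match more than one pattern.

```python
import re

# Runs of hexadecimal characters at least 32 long (for example, hash-sized).
HEX_RUN = re.compile(r"\b[0-9a-fA-F]{32,}\b")
# Runs of base64 alphabet characters at least 40 long, with optional padding.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")
# Anything shaped like a dotted hostname; label lengths are checked in code.
HOSTNAME = re.compile(r"[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def scan_output(text: str, min_label: int = 30) -> list[tuple[str, str]]:
    """Return (kind, matched_string) findings for encoded-looking content."""
    findings = []
    for match in HEX_RUN.finditer(text):
        findings.append(("hex", match.group()))
    for match in BASE64_RUN.finditer(text):
        # Skip runs already reported as pure hexadecimal.
        if not HEX_RUN.fullmatch(match.group()):
            findings.append(("base64-like", match.group()))
    for match in HOSTNAME.finditer(text):
        if any(len(label) >= min_label for label in match.group().split(".")):
            findings.append(("long-label hostname", match.group()))
    return findings
```

Ordinary prose such as a sentence about lab values produces no findings, while a generated code snippet that embeds a 45-character label under an unfamiliar domain is reported and can be routed for review.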

Vendor Security Posture Assessment

For every LLM platform in your supply chain, require documented answers to four questions. What outbound network paths exist from the execution environment? Which of those paths are logged and monitored by the vendor? What is the vendor’s process for disclosing and patching vulnerabilities discovered by third-party researchers? What was the timeline between internal discovery and public disclosure for the ChatGPT DNS side channel?

That last question is pointed. OpenAI confirmed to Check Point that it had already identified the underlying problem internally before the research was disclosed publicly. The fix was deployed on February 20, 2026, and Check Point published their findings on March 30, 2026. The timeline suggests OpenAI was working on the fix before external disclosure, which is the preferable sequence. The gap between internal discovery and public documentation is worth understanding when assessing vendor transparency.

For a structured approach to evaluating AI security across your vendor stack, our AI security guide covering threats and defenses in 2026 covers the full threat model for enterprise AI deployments, including vendor risk assessment frameworks.

Regulatory Exposure: GDPR, HIPAA, and FCA Implications

The ChatGPT DNS side channel is not merely a technical curiosity. For organizations in regulated industries, the proof-of-concept scenario creates direct regulatory exposure.

Under GDPR Article 32, data controllers and processors are required to implement appropriate technical and organizational measures to ensure a level of security appropriate to the risk. A covert exfiltration channel that sends personal data to an attacker-controlled server without user knowledge or consent is a clear Article 32 failure, regardless of whether the channel was documented by the vendor. Using a third-party AI service that contains this kind of vulnerability, without conducting a Data Protection Impact Assessment and implementing compensating controls, creates GDPR liability for the data controller.

Under HIPAA Security Rule 45 CFR 164.312(e)(1), covered entities must implement technical security measures to guard against unauthorized access to electronic protected health information (ePHI) transmitted over electronic communications networks. The health analyst GPT proof-of-concept demonstrated exactly this scenario: ePHI transmitted to an external server without authorization, without user knowledge, and without any of the controls required by the Security Rule.

For FCA-regulated firms in the UK, the Senior Managers and Certification Regime (SMCR) places direct accountability for operational resilience, including data security, on named individuals. An AI-enabled data breach resulting from an unreviewed vendor vulnerability is precisely the kind of incident that surfaces in regulatory correspondence.

Frequently Asked Questions

Has the ChatGPT DNS side channel vulnerability been fully fixed?

OpenAI deployed a fix to the specific DNS exfiltration path in ChatGPT’s code execution environment on February 20, 2026, as confirmed by both OpenAI and Check Point Research. The fix addresses the particular technique demonstrated in the proof-of-concept. However, the broader attack class, covert side channels from LLM execution runtimes using protocols other than HTTP/HTTPS, has not been comprehensively addressed across the industry. Security teams should assume that similar patterns may exist in other LLM platforms and conduct their own network-level audits accordingly.

Does this vulnerability affect the OpenAI API, or only the ChatGPT web interface?

The vulnerability resided in ChatGPT’s sandboxed code execution environment, which is invoked when users upload files for analysis or use the Data Analysis feature. This environment is accessible through the ChatGPT web interface and apps. Organizations consuming the OpenAI API directly and not using the code interpreter or data analysis tools were not exposed to this specific attack path. Custom GPTs built on the API that incorporated the code execution capability were potentially affected.

What does DNS tunneling look like in network logs, and how can security teams detect it?

DNS tunneling generates high query volume to a single apex domain, subdomains with high Shannon entropy above 3.5 bits per character, unusually long hostnames approaching the 253-character total DNS name limit, and time-series patterns where queries arrive at regular short intervals. Network detection tools including Zeek, Suricata with custom signatures, and commercial DNS security platforms like Cisco Umbrella or Infoblox BloxOne can be tuned to alert on these indicators. Reviewing your DNS resolver logs for outlier domains by query volume and subdomain entropy is the starting point for any detection effort.

Can similar side-channel attacks affect self-hosted LLM deployments?

Yes, and the risk is arguably higher in self-hosted environments, because the security of the execution sandbox depends entirely on your own configuration. If you run an LLM with a code execution capability, whether that is Ollama with a Python plugin, a LangChain agent with shell access, or a custom RAG pipeline with tool-use capabilities, the same DNS tunneling attack class is applicable. The attacker’s delivery mechanism changes to a prompt injection through the input pipeline rather than a malicious GPT, but the underlying technique is identical. Network segmentation and DNS egress filtering are non-negotiable controls for any LLM deployment that can execute code or make network calls.

Was any user data actually stolen through this vulnerability before the patch?

There is no public evidence that the vulnerability was exploited in the wild before the February 20, 2026 patch. However, the absence of evidence is not evidence of absence: DNS-based exfiltration is particularly difficult to detect without purpose-built monitoring, and the attack left no visible trace in the ChatGPT interface. Organizations that had employees upload sensitive documents to ChatGPT before February 2026 cannot conclusively rule out exposure without reviewing their DNS resolver logs for that period.

How do I assess whether my organization’s AI vendor has similar blind spots?

Request a security architecture document covering network isolation controls for execution environments, outbound protocol allowlists, DNS resolver configuration, logging coverage for all outbound traffic, and the process for third-party vulnerability disclosure and patching. Conduct your own network-level testing against any AI sandbox your organization operates. Subscribe to the vendor’s security advisory feed and treat AI platform updates with the same urgency you apply to operating system patches.

Secure Your LLM Deployments Before the Next Side Channel Surfaces

The ChatGPT DNS side channel closed on February 20, 2026. The attack class it demonstrated has not. Every LLM platform that provides a code execution environment, tool-use capability, or file processing feature has an analogous attack surface that deserves the same scrutiny your team would apply to any other internet-facing service processing sensitive data.

If you need help auditing your organization’s AI integration security posture, reviewing vendor contracts for security disclosure obligations, or implementing the DNS monitoring and network segmentation controls described above, Shield Operations works with enterprise security teams across the UK and Europe. Contact us to discuss a structured AI security assessment.