AI vs AI: The Autonomous Cybersecurity Arms Race Reshaping the SOC

Autonomous AI Attacks Emerge: Cybercriminals are weaponizing generative AI (e.g. WormGPT, FraudGPT) to automate phishing campaigns, craft polymorphic malware, and even discover zero-day exploits with minimal skill, drastically lowering the barrier to entry for advanced attacks ^[1] ^[2]. Experts warn an AI agent could soon execute an entire cyberattack autonomously, including finding a unique vulnerability and exploiting it without human hackers ^[3].
AI-Powered Defense in the SOC: Security teams are embracing AI copilots and “agentic” AI systems that augment or automate traditional SOC functions. Tools like Microsoft Security Copilot (built on GPT-4) let analysts query incidents in natural language and summarize threats from billions of signals ^[4] ^[5]. Meanwhile, SentinelOne’s Purple AI “Athena” can autonomously triage alerts and orchestrate responses in seconds, reducing dwell time and fatigue for defenders ^[6] ^[7].
Red Team vs Blue Team – AI Arms Race: Offensively, AI assists attackers in writing exploits and evasive payloads at machine speed, even helping novices mimic nation-state level techniques ^[8] ^[9]. Defensively, AI-driven detection and response systems are attempting to match that speed – spotting AI-generated threats, adapting to novel tactics, and even automating countermeasures (e.g. on-the-fly creation of new detection rules when a fresh attack is identified) ^[10]. This dynamic is fundamentally changing the traditional SOC stack, moving from manual, human-driven analysis to AI-augmented workflows and automation.
Expert Insights – Promise and Perils: “LLMs and generative AI are likely to have a major impact on the zero-day exploit ecosystem,” says cybersecurity researcher Chris Kubecka, noting these tools can analyze code at scale and accelerate vulnerability discovery – but also teach more people how to create exploits ^[11] ^[12]. Conversely, defensive experts like Tomer Weingarten (CEO of SentinelOne) highlight that AI finally allows security teams to detect and respond at machine speed to keep up with nation-state adversaries ^[13]. Yet caution abounds: LLMs can hallucinate or err, so human oversight and trustworthy AI design remain crucial in this high-stakes domain.
Follow the Money – Soaring Investment in AI Security: Venture capital is flooding into “AI-native” security startups building next-gen SOC platforms. Since 2022, AI-driven detection & response (AI-DR) firms have raised over $730 million ^[14]. In Q1 2025 alone, cybersecurity startups pulled in $2.7 B in funding (up 29% from late 2024), with investors specifically eyeing agentic AI innovations that automate cyber defense ^[15] ^[16]. Analysts predict that by 2028, 70% of threat detection/response systems will incorporate multi-agent AI, up from just 5% today ^[17]. For CISOs and SOC leaders, the message is clear: adapt to the AI revolution or risk being outpaced by attackers – and competitors – wielding autonomous cyber tools.

Introduction: The Dawn of Autonomous AI in Cybersecurity

Imagine a near future where a malware strain is not hand-coded by a human, but generated on the fly by an AI, and where the security system defending your network is itself an AI that detects and neutralizes the threat in milliseconds. This scenario is quickly moving from science fiction to reality. Recent advances in artificial intelligence – particularly large language models (LLMs) and generative AI – are transforming the cyber battlefield on both offense and defense. Attackers are equipping themselves with AI tools that can write phishing emails, find software vulnerabilities, and create polymorphic malicious code at a scale and speed impossible for humans to match. In parallel, defenders are deploying AI copilots and autonomous “blue team” agents that can sift through massive alert volumes, hunt for anomalies, and even initiate responses without waiting for human direction.

This report examines the emergence of autonomous AI in cybersecurity, focusing on the escalating arms race between AI-powered offensive tools and AI-enabled defensive systems. We explore how these technologies are upending the traditional Security Operations Center (SOC) stack, review real-world developments (from AI-discovered zero-days to GPT-4-powered security copilots), and assess strategic implications for security leaders. In the words of veteran cyber executive John Watters, “The security gap is the difference between the innovation pace of the adversary and the innovation pace of the defender… Adversaries lead. We all think we’re innovators — we’re not.” ^[18]. As autonomous AI becomes the new battleground, defenders will need to innovate faster – or risk falling behind in a world where attacks unfold at machine speed.

AI-Powered Offense: Generative Malware, AI-Discovered Zero-Days, and Autonomous Hackers

On the offensive side, AI is proving to be a force multiplier for cybercriminals and penetration testers alike. Large language models can analyze and write code, making them surprisingly adept at finding security holes and creating exploit payloads. “LLMs and generative AI are likely to have a major impact on the zero-day exploit ecosystem,” warns Chris Kubecka, a cybersecurity author and former USAF cyber officer ^[19]. These tools can pore over huge swaths of source code or binaries to pinpoint weaknesses far faster than a human, and even suggest exploit strategies in plain language ^[20]. In one experiment, Kubecka developed a custom “Zero Day GPT” system and uncovered 25 previously unknown vulnerabilities in a matter of months, a task that might have taken years otherwise ^[21]. One of the AI-found bugs was a severe flaw in the Zimbra email platform – the AI analyzed a security patch and not only identified a way to bypass it, but even wrote a working exploit for the new zero-day on the spot ^[22]. “By golly, it worked,” Kubecka said of the AI-generated exploit code ^[23].

This democratization of elite hacking skills has far-reaching implications. Tasks that once required deep expertise – like crafting a memory corruption exploit or devising evasion techniques – can now be assisted by generative AI. Novice hackers can use AI chatbots to produce sophisticated attacks, lowering the entry bar for cybercrime. Lucian Nițescu, red-team lead at a penetration testing firm, observes that “AI tools can help less experienced individuals create more sophisticated exploits and obfuscations… This lowers the entry barrier… while also assisting experienced exploit developers by suggesting improvements and novel attack vectors” ^[24]. In other words, an eager script-kiddie with access to an LLM can start approaching the capabilities of an APT crew. At the 2024 DefCamp hacking competition, even top-tier teams admitted leaning on ChatGPT for help solving challenges and filling knowledge gaps ^[25] ^[26]. When confronted with an unfamiliar cloud service during an attack scenario, the AI was “a great help in guiding them in the post-exploitation stage,” effectively mentoring the hackers on what to do next ^[27].

Perhaps the most headline-grabbing offensive AI tools are the new generative malware and social engineering engines circulating on the dark web. In mid-2023, researchers revealed WormGPT, an unscrupulous AI model built on the GPT-J architecture and designed explicitly for cybercrime ^[28]. WormGPT was essentially an evil twin of ChatGPT: an LLM with no ethical guardrails, marketed to criminals for tasks like writing convincing phishing emails, launching business email compromise (BEC) scams, and generating malware code. For a subscription fee (around $110/month at the time), buyers could leverage WormGPT to do “all sorts of illegal stuff” from phishing to virus creation ^[29]. Though the original WormGPT service was shut down after media exposure in 2023, it spawned copycats. By late 2024, threat actors were hijacking mainstream AI models to build new variants of WormGPT. Security analysts at Cato Networks uncovered WormGPT versions riding on Elon Musk’s xAI Grok model and the open-source Mistral model (dubbed “Mixtral”) ^[30]. These clandestine bots, accessible via Telegram, use clever prompt hacks to strip out the safety filters of the base models and spew out unrestricted malicious content ^[31] ^[32]. “Our analysis shows these new iterations of WormGPT are not bespoke models built from the ground up, but rather the result of threat actors skillfully adapting existing LLMs,” notes Vitaly Simonovich, the Cato researcher who analyzed them ^[33]. In testing, these AI agents readily produced working phishing lures and even PowerShell scripts to harvest Windows credentials, all at the behest of a user with no particular hacking skill ^[34].

Another black-hat offering, FraudGPT, has been advertised on Telegram since 2023 as an “AI for scammers,” priced around $200/month ^[35]. Its interface mimics ChatGPT, but instead of writing term papers it provides step-by-step instructions for fraud: e.g. generating phishing pages or crafting fake bank emails complete with prompts where to insert malicious links ^[36]. During one test, FraudGPT was able to churn out a very authentic-looking bank phishing email and advise exactly how to tweak it for maximum credulity ^[37]. It will even list popular target websites and common vulnerabilities for criminals planning their next campaign ^[38]. These malicious AIs are often trained on extensive malware and fraud datasets, giving them an encyclopedic knowledge of hacking techniques to share with users ^[39].

It’s not just about scams and scripts – AI engines are also inventing new attack techniques on the fly. WormGPT and its ilk can adjust their outputs in real-time to evade detection, potentially generating polymorphic code that changes with each iteration to slip past antivirus signatures ^[40]. One concerning capability noted in reports: an AI like WormGPT could autonomously scan a target network for vulnerabilities and exploit them to propagate an attack, functioning like an AI-powered worm ^[41]. Security experts have long feared self-spreading malware, and AI may provide the “brains” to make it a reality. In fact, former Mandiant CEO Kevin Mandia recently warned that we will likely soon see autonomous AI agents that can breach networks and pivot within them without any direct hacker involvement – essentially, automated digital intruders that learn as they go. “The day is near when bad actors will use AI to hijack another AI system companies rely on… forcing it to go rogue,” echoes John Watters, ex-Mandiant executive, who predicts an AI-driven attack will occur “within months” ^[42]. Even more alarming, Watters says such an attack “won’t be generic. It will be built uniquely for its victim, exploiting a zero-day vulnerability tailored to that company’s systems” ^[43]. In other words, the nightmare scenario is an AI that can identify a novel, victim-specific flaw (a zero-day), write an exploit for it, and execute a breach end-to-end – all potentially faster than a human team could respond.

Evidence is mounting that advanced threat actors (APTs) are already experimenting with AI in their toolkits. In early 2023, Microsoft and OpenAI published a joint cyber-intelligence report noting that “well-known APT groups had been using LLMs” to enhance their operations ^[44]. Detected techniques included LLM-informed reconnaissance (using AI to analyze reconnaissance data), LLM-enhanced scripting (AI-written malicious scripts or macros), and even LLM-assisted vulnerability research by adversaries ^[45]. Although it’s often hard to prove a given piece of malware or phishing email was AI-generated, researchers have spotted telltale signs – such as uniquely worded phishing lures that hint at AI authorship. What’s clear is that the offensive use of AI is rapidly advancing, creating more frequent and more sophisticated threats. “Cybercriminals leverage generative AI techniques to create polymorphic malware, zero-day exploits, and phishing attacks. These tactics are difficult to detect and mitigate,” notes a 2024 threat briefing by Radware ^[46]. Unlike the “spray and pray” attacks of old, AI allows bad actors to customize and personalize attacks at scale. We may soon face waves of intrusions where each malware sample is one-of-a-kind, crafted by an AI to exploit a specific weakness in each target environment ^[47].

AI-Augmented Defense: From GPT-4 Copilots to Autonomous SOC Agents

As attackers sharpen their knives with AI, defenders are responding in kind by arming their SOCs with AI copilots and autonomous detection & response systems. The goal: dramatically cut down reaction times and handle the skyrocketing volume and complexity of alerts that modern enterprises face. “AI and automation have long held the promise of fundamentally transforming security operations and supercharging analysts to detect and respond – at machine speed – to threats from even the most sophisticated adversaries,” says Tomer Weingarten, CEO of SentinelOne ^[48]. Now, that promise is being realized through new AI-driven defense platforms.

One of the most prominent examples is Microsoft’s Security Copilot, unveiled in March 2023 as a GPT-4-powered assistant for cybersecurity professionals. Security Copilot is essentially a specialized chatbot integrated into Microsoft’s security ecosystem, designed to help human analysts make sense of the vast streams of data and alerts. Powered by OpenAI’s GPT-4 and Microsoft’s own security-specific models, Security Copilot lets a defender ask questions like “What are all the security incidents in my enterprise this week?” and instantly get a summarized report ^[49]. Behind the scenes it taps into Microsoft’s troves of telemetry – some 65 trillion security signals collected daily – and the user’s local security logs to compile an answer ^[50]. Rather than replacing analysts, Security Copilot acts as a force multiplier: it can draft incident reports, explain a vulnerability in plain English, or even analyze a suspicious file or URL when provided by the user ^[51]. All interactions are logged for an audit trail, which is crucial for trust and oversight ^[52]. “This is like having a junior analyst that never sleeps, helping sift through data and highlight what matters,” is how one might view it. Microsoft stresses that Copilot is there to assist, not automate away, the human element ^[53]. For example, an analyst could ask, “Summarize the alerts from yesterday that relate to Log4j vulnerability,” or, “Analyze this executable for malicious behavior,” and the Copilot will produce an organized answer or report that the team can validate and act upon.

Other security vendors have launched similar AI assistant features. For instance, Palo Alto Networks introduced an AI Ops solution within its Cortex platform and heavily markets its Cortex XSIAM (Extended Security Intelligence & Automation Management) as an “AI-driven SOC platform” ^[54]. XSIAM ingests data from across an organization’s endpoints, network, cloud, and uses machine learning to correlate alerts and detect hidden threats, with automation workflows to respond faster ^[55] ^[56]. Think of it as a next-gen SIEM + SOAR where many analysis and response steps are handled by AI logic. Google’s cloud security (Chronicle) and IBM’s QRadar have also added AI-driven analytics to help prioritize threats, but Microsoft’s GPT-4 Copilot arguably grabbed the most attention by bringing an easy chat interface to incident response.

The real leap, however, is moving from AI assistance to AI autonomy in defense – what some call “agentic AI.” Agentic AI refers to AI systems that can not only answer questions, but take independent actions in response to situations ^[57]. In the SOC context, this means an AI that can observe an emerging attack and automatically enact countermeasures (isolating hosts, blocking IPs, updating firewall rules) without needing a human to ask first. SentinelOne’s “Purple AI” is at the forefront of this autonomous SOC movement. Initially, Purple AI was an LLM chatbot integrated into SentinelOne’s Singularity XDR platform, allowing analysts to conversationally hunt threats (e.g., “show me all devices communicating with this malicious domain”). But in April 2025 at RSA Conference, SentinelOne unveiled Purple AI Athena, which they describe as a full agentic AI solution for cyber defense ^[58] ^[59]. Athena takes Purple AI beyond just Q&A into automated decision-making. It continuously monitors the environment and, upon spotting suspicious activity, can determine the best response by orchestrating across various tools (endpoint agents, network controls, etc.) ^[60]. In essence, Athena tries to mimic the “iterative thought process and deductive reasoning of experienced SOC analysts” – but at machine speed ^[61]. According to SentinelOne, Athena can detect, triage, and even remediate incidents in seconds with minimal human oversight, drastically shrinking the mean time to respond (MTTR) ^[62].

Athena is built on three pillars ^[63]: first, deep analysis at machine speed, where it autonomously investigates anomalies across data sources and correlates them (like a seasoned threat hunter connecting dots); second, full-loop remediation, meaning it doesn’t stop at raising an alert but will take containment actions and even learn from new attacks by auto-generating detection rules to prevent similar future incidents ^[64]; and third, seamless integration with the security stack, so it can pull data from third-party SIEMs or data lakes and push response actions to various enforcement points ^[65]. This kind of AI agent effectively acts as an extra tier of analyst that works 24/7, handling tier-1 triage and response. A human SOC engineer might only get involved when the AI either neutralizes a threat (and presents a summary of what it did) or when it encounters something truly novel or ambiguous and escalates for confirmation.

SentinelOne’s not alone. Across the industry, we see a push towards autonomous SOC components. CrowdStrike, for example, has been infusing AI into its Falcon platform for automated threat scoring and proactively discovering attack patterns. IBM’s Security QRadar Suite now includes an AI assistant named “QRadar Advisor with Watson” that can investigate offenses and enrich them with threat intel automatically. And startups are entering the fray: startups like iCounter (headed by John Watters) are developing LLM-based tools explicitly to spot and block AI-driven attacks in real time ^[66] ^[67]. Watters predicts that come RSA Conference 2026, “the term AI-DR, or AI Detection and Response, will dominate the trade show floor” ^[68]. The concept of AI-DR is like the next evolution of EDR/XDR – whereas EDR watches endpoints for known threats, AI-DR solutions watch for signs of AI-driven or novel attacks and can dynamically respond. One focus area is detecting when an attacker is manipulating or “hijacking” an organization’s own AI systems ^[69]. For instance, if an intruder uses prompt injection to turn a company’s customer service chatbot into a malicious insider, an AI-DR tool’s job is to catch that behavior. In a recent example, the AI sales assistant of a SaaS firm (Salesloft) was compromised, illustrating how even benign AI agents can be turned against their owners ^[70]. This creates a whole new class of signals for defenders to monitor, and traditional tools aren’t built for it – hence the emergence of dedicated AI-DR products.

The traditional SOC stack – SIEM, SOAR, EDR, NDR, etc. – is thus being rapidly augmented (and in parts, replaced) by these AI-driven capabilities. Instead of static correlation rules and playbooks that humans have to maintain, AI systems can learn and adapt to new threats on the fly. They excel at noise reduction, parsing millions of events to bubble up the few truly important incidents (thus tackling the alert fatigue problem). They can also provide a level of insight that normally requires a seasoned analyst – for example, explaining in natural language why a cluster of alerts might indicate a ransomware attack, or suggesting the most likely path an intruder took through a network. The endgame that vendors are pitching is a kind of autonomous SOC, where human analysts work alongside AI agents. Mundane tasks like log parsing, initial triage, and report writing get offloaded to the AI, freeing humans to focus on creative problem-solving, complex investigations, and strategic defense improvements. It’s a compelling vision, especially as organizations struggle with cybersecurity talent shortages and an overstretched workforce.

However, these AI defenses are not silver bullets. Attackers will no doubt try to outwit defensive AI just as they circumvent other security measures. There’s already talk of “AI vs AI” skirmishes – e.g., malware designed to confuse or overwhelm AI detectors, or adversarial attacks on the machine learning models themselves. One known risk is that AI-based detectors can be vulnerable to adversarial inputs (e.g., a carefully crafted log entry or network packet sequence that causes an AI to miss an attack). Additionally, false positives and AI errors can cause new headaches – an overzealous autonomous system could, say, shut down critical servers mistakenly if it misinterprets benign activity as malicious. Early adopters have noted that while these systems are powerful, they still require supervision. A misbehaving “security AI” can create chaos, so many organizations introduce AI-driven actions gradually, often in a “human approve before enforcement” mode initially.

Red Team vs Blue Team: Implications of AI Autonomy on Both Sides

The advent of autonomous AI for both attackers (red team) and defenders (blue team) is fundamentally altering the cat-and-mouse dynamic of cybersecurity. Several key implications emerge from this AI-vs-AI escalation:

1. Speed and Scale of Attacks vs. Responses: Autonomous AI agents can operate at machine speed – far faster than any human. On offense, this means attacks can unfold and mutate extremely quickly. An AI worm could identify a target, penetrate, and spread across an enterprise in minutes, using a zero-day exploit it just discovered. On defense, only another machine-speed system has a chance of containing such an outbreak. Automation is emerging as a critical solution in these environments, notes Umesh Padval, a cyber venture capitalist, because AI “has the potential to make cybersecurity professionals more effective, streamline operations and reduce the time to resolution of critical hacks” ^[71]. In a sense, we’re approaching a era of “machine fights machine” in cyberspace – with human oversight around the edges. The traditional SOC processes (manual incident investigation, writing detection rules after an attack, etc.) are too slow when confronting AI-accelerated threats. This is driving adoption of SOAR playbooks and autonomous response features that can execute in seconds. It also raises the stakes for early detection: if your AI can catch the intruder’s AI in the act (e.g., detect the anomaly of an AI agent doing reconnaissance), you might thwart the attack; if not, by the time human analysts notice, it could be game over.

2. Erosion of the Skill Gap: Historically, sophisticated cyber attacks were the province of nation-states or elite hackers, and skilled defenders could often recognize the telltale patterns of less-skilled intruders. With AI, a relatively unskilled attacker can launch highly advanced attacks (because the expertise is coming from the AI’s training data). Likewise, a less-experienced junior analyst can investigate and respond to incidents like a seasoned pro with an AI copilot whispering in their ear. This erosion of skill gap cuts both ways. Security leaders can’t assume that “unsophisticated” threats will be low-quality – an amateur threat actor with an AI tool might craft malware that defeats your antivirus or write phishing lures that fool even savvy users ^[72]. On the flip side, organizations might not need as large a team of deeply experienced analysts if they effectively leverage AI assistants. The role of humans in both red and blue teams shifts more towards strategy, intuition, and oversight, while AI handles grunt work and even some creative work (like generating new exploit variants or correlations). In practice, this could democratize hacking (more adversaries with serious capabilities) and democratize defense (smaller companies able to do decent security with AI help). But it also means there’s less of a “gulf” between script-kiddy and nation-state – the baseline capability on both sides is rising.

3. Changes to the SOC Workflow and Stack: As AI systems take on tasks, the day-to-day operations in a SOC will change. We can expect fewer Level-1 analysts triaging alerts and more AI overseeing that initial triage. Those entry-level SOC analysts may evolve into “AI controllers” – managing and tuning AI systems, validating their findings, and handling exceptions or complex cases. The traditional hierarchy of Tier 1, 2, 3 analysts could blur, since AI might handle much of Tier 1 and Tier 2 work instantly. Tools-wise, the consolidation of the SOC stack is likely. Why maintain separate SIEM, user behavior analytics, threat intelligence platform, and SOAR tool, if one AI-driven platform can ingest data and do all of those functions coherently? Indeed, investors predict a rise of “new broad cybersecurity platforms” that use AI to combine capabilities, potentially edging out point solutions ^[73]. This integrated approach, exemplified by platforms like Cortex XSIAM, also solves one of SOCs’ biggest pain points: context-switching between tools. An AI can seamlessly pull data from multiple sources and present a unified story (something humans struggle with when juggling dozens of dashboards).

4. Emerging AI-Specific Threats: When both sides use AI, we get scenarios where attackers try to exploit defender AI, and vice versa. For example, prompt injection attacks – feeding malicious input to an AI like Security Copilot to make it spill secrets or malfunction – become a real concern. An attacker who gains initial access to a network might try to tamper with the logs or data that a defensive AI consumes, hoping to confuse it or blind it. Alternatively, attackers might unleash adversarial examples (specially crafted artifacts) that cause an AI-based detector to misclassify an attack as benign. Defensive AIs will need hardening against these tactics, including robust input validation and perhaps ensembles of models that cross-check each other. Likewise, defenders might deploy honeypot data or deception specifically to mislead offensive AI agents, tricking them into revealing themselves. It’s a new kind of chess match. John Watters highlighted that organizations’ own AI tools have “a huge target on their back because they can be overtaken and forced to hallucinate or go rogue” ^[74]. That implies part of cyber defense will be monitoring AI systems for signs of compromise or misuse, a concept that barely existed a few years ago. This is precisely what AI-DR aims to address – detecting when an AI is doing something it shouldn’t, whether due to error or adversary manipulation ^[75].

5. Policy, Oversight, and Ethics: The introduction of autonomous decision-making in security also raises policy and ethical questions. Who is accountable if an AI defensive system mistakenly shuts down a hospital’s network because it falsely detected an attack? How do we ensure AI recommendations are unbiased and don’t overlook attacks due to skewed training data? Enterprises will need to set governance rules for AI in the SOC, deciding which actions can be fully automated vs. which require human sign-off. Many are adopting an approach of “human on the loop” – the AI does everything up to the final action, which a human quickly reviews. Over time, as confidence in AI grows, more fully automated responses might be allowed for specific scenarios (e.g., isolating a clearly infected machine). Transparency is vital: AI systems should be able to explain why they flagged something, to build trust with human analysts. This is an active area of development (Explainable AI for cybersecurity). Furthermore, defenders must avoid an overreliance on AI that could deskill their human team – a balance has to be struck where AI is a tool, not a crutch.

6. The Adaptation Imperative for Security Teams: Perhaps the most important implication is strategic: CISOs and SOC leaders must adapt quickly to the AI era. That means investing in the right AI-enabled tools, but also training their staff to use them effectively and rethinking processes to integrate AI. It also means being realistic about the threat landscape: assuming that if you don’t leverage AI, your adversaries (or competitors) certainly will. Some organizations are already creating dedicated “AI for cybersecurity” research units, experimenting with both offensive and defensive AI in controlled environments to understand their capabilities and limits. Others are collaborating in industry groups to share AI threat intelligence – for example, if someone’s AI detects a novel phishing email written by ChatGPT, how can that info be shared quickly across the community? Security strategies and budgets are starting to reflect these priorities, with many 2024–2025 cyber roadmaps including projects for AI-driven security analytics, automated threat hunting, and so forth.

Recent Developments and News: AI-Driven Threats and AI-Based Detection in the Headlines

The past two years have provided a flurry of real-world examples that illustrate the trends discussed:

WormGPT and Dark AI Chatbots: In July 2023, news of WormGPT broke, describing it as “a blackhat alternative to ChatGPT” being sold on hacker forums ^[76]. Its creator even advertised it as a way to do “all sorts of illegal stuff” with AI. By August 2023, WormGPT’s author reportedly shut it down under scrutiny, but by late 2024, new WormGPT variants re-emerged on BreachForums, now built on top of more advanced models (xAI’s Grok and Mistral AI) to continue the service ^[77]. This cat-and-mouse shows how attempts to clamp down on misuse of GPT models led criminals to adapt and find new hosts for their malicious AI. Around the same time, FraudGPT was discovered, with cybersecurity firms like Netenrich testing its capabilities to generate phishing content en masse ^[78]. These incidents were widely covered in cybersecurity media, raising awareness that AI isn’t just helping defenders – it’s also turbocharging attackers.
AI-Discovered Vulnerabilities: In late 2023 and 2024, there were multiple reports of critical vulnerabilities found with AI assistance. For instance, a critical Linux kernel zero-day was reportedly identified by an AI system in an experiment, beating human researchers to the punch (this was often cited as a “proof of concept” that AI can do preventative security). More concretely, in 2024 Protect AI released VulnHunter, an open-source LLM-powered code analysis tool that was used to find over a dozen zero-day flaws in popular open-source projects ^[79] ^[80]. One such VulnHunter-discovered bug (CVE-2024-10099) was a remote code execution in a widely starred GitHub project; the AI provided the full exploitation chain to achieve it ^[81]. These stories hit tech news and demonstrated the dual potential of AI: it can help attackers find exploitable holes, but it can equally help defenders and researchers to preemptively catch and fix those holes. Microsoft’s AI research division also showcased using GPT-4 to assist in code review and vulnerability scanning, hinting that future DevSecOps might involve AI pair programmers that flag security issues as code is written.
Microsoft Security Copilot and Others Debut: On the defensive news front, Microsoft’s launch of Security Copilot in March 2023 garnered large headlines in both tech press and mainstream media. It was framed as “GPT-4 for cybersecurity”, and commentators speculated on how it could change SOC operations. Throughout 2024, Microsoft expanded trials of Security Copilot with enterprise customers, and by 2025 it announced integrations of Copilot into tools like Intune (for device management) and Entra (identity) – effectively bringing agentic AI directly into endpoint and identity management ^[82]. In interviews, Microsoft’s security VP hinted at “Security Copilot Agents”, suggesting the future where Copilot can proactively take actions (for example, an agent for reverse engineering malware automatically) ^[83] ^[84]. This indicates Microsoft’s strategy to move from a pure assistant to more automated “security AI agents” across its ecosystem. Similarly, other vendors made announcements: SentinelOne’s Purple AI Athena launch in April 2025 was covered by security outlets and pitched as “bringing autonomous decision-making to the SOC” ^[85]. Even the U.S. government got involved – agencies like the NSA and DHS in 2024 spoke about leveraging AI for national cyber defense, and DARPA launched a program called “AI Cyber Challenge (AIxCC)” to incentivize the creation of AI systems that can secure critical code (this challenge was highlighted at DEF CON 2023 where teams demonstrated AI-driven vulnerability discovery).
Notable AI-Powered Attacks: While fully autonomous attacks have not yet been publicly confirmed, we saw glimpses of AI aiding threat campaigns. In mid-2025, a breach at a financial firm was linked to a highly convincing phishing email that internal analysis determined was likely AI-generated due to linguistic patterns – the email bypassed both technical and human detection because it was uniquely crafted. There was also the case of the Salesloft AI agent compromise in 2025: Salesloft, a sales engagement platform, had introduced an AI feature to automate some sales communications. Attackers managed to manipulate that AI agent via a prompt injection-like flaw, causing it to send out unauthorized messages to customers, including malicious links. Major security companies (Dynatrace, Qualys, CyberArk, Cato Networks) were reportedly impacted as clients of Salesloft who received these rogue AI messages ^[86]. This incident made waves as it demonstrated a supply-chain like attack via AI, and underscored Watters’ point that organizations must now defend not just their own systems, but also their interconnected AI services from being turned against them.
Gartner and Industry Analysts’ Reports: In March 2025, Gartner released a report highlighting “AI-augmented security” as a top trend, with the statistic that more than $730 million has been invested in AI-based detection and response startups since 2022 ^[87]. Gartner predicted that within a few years, a majority of threat management workflows will involve multiple AI agents collaborating (e.g., one AI might detect an anomaly, hand off to another AI to enrich the data, and a third to suggest remediation). This report was cited in many industry blogs and likely spurred even more interest among investors and customers in AI-first security solutions.

In summary, the news cycle of 2023–2025 in cybersecurity has been dominated by the rise of AI on both sides of the fence. What was theoretical just a couple years prior – AI writing malware or AI running a SOC – has quickly become very tangible through these stories.

Venture Capital Flows: Big Bets on AI-Native Security Platforms

The surge of AI in cyber has not gone unnoticed by the investment community. Venture capital funding in cybersecurity, especially for AI-driven startups, has reached unprecedented levels. After a sluggish period in 2022–2023, investors came roaring back into cyber in 2024, largely due to the hype and promise of AI. According to Crunchbase data, VC-backed cybersecurity companies raised $2.7 billion in just the first quarter of 2025, a jump of 29% from the previous quarter ^[88]. Early-stage deals in particular are hot: “We’re witnessing competitive Series A and B rounds for companies demonstrating clear market traction,” says Ofer Schreiber of YL Ventures ^[89]. And what area of traction are investors seeking? AI is front and center.

Umesh Padval of Thomvest Ventures notes that many VCs are “scrutinizing how they could replicate Wiz’s success” ^[90] (referring to cloud security firm Wiz’s record-breaking $32B acquisition by Google, which showed the market’s appetite for cyber unicorns). He adds that “one significant trend driving interest in this space is AI, and more specifically, agentic AI” ^[91]. The belief is that AI-powered security platforms could reshape enterprise defense, and those that succeed will be hugely valuable. This has led to a proliferation of startups branding themselves as “AI-driven X” – whether it’s AI-driven cloud security, AI-driven identity threat detection, or AI-driven SOC automation.

Concrete numbers back the trend: Gartner’s March 2025 analysis found over $730 million had been poured specifically into AI-focused detection and response startups since 2022 ^[92]. Examples include firms like Vectra AI, which raised large rounds for its AI-driven network detection, and ReliaQuest (which acquired an AI analytics startup to bolster its platform). Another example is Huntress launching an AI-assisted threat hunting tool after raising funding. Many of the “Top Cybersecurity Startups of 2025” lists are filled with AI-centric companies, from those using machine learning to protect machine learning (e.g., HiddenLayer, which does “AI Detection & Response” for ML models ^[93]) to those offering AI copilots for security analysts (like BishopFox’s Cosmos platform announced at Black Hat, incorporating generative AI to assist pentesters).

Interestingly, it’s not just startups; incumbents are also investing heavily in AI. Big cybersecurity vendors (Palo Alto, Fortinet, Cisco, etc.) have acquired smaller AI firms to turbocharge their own AI capabilities. For instance, SentinelOne’s acquisition of Obsero AI was a move to enrich its Purple AI offering ^[94]. These acquisitions and investments signal that the market expects AI to be a core feature of any serious security product in the coming years.

From a macro perspective, AI startups across all sectors dominated VC funding in 2025 – nearly 58% of all global VC investment in Q1 2025 went to AI-related companies ^[95]. Cybersecurity is a major slice of that pie due to the urgent need for innovation. A SoftBank-led $40B investment into OpenAI in 2024 (making headlines) further validated the space ^[96]. And although some warn of a possible AI hype bubble, the consensus is that in security, the problems AI aims to solve (breaches, shortage of analysts, skyrocketing data) are very real and pressing.

For enterprise buyers (CISOs and procurement teams), this wave of VC-backed innovation can be both exciting and overwhelming. There’s suddenly a flood of startups pitching AI solutions that claim to revolutionize your SOC. CISOs need to do serious due diligence to cut through marketing buzz and identify which tools will actually integrate well and deliver value. We’re seeing many run pilots or proof-of-concepts with AI-driven products, while also nudging their existing vendors to incorporate similar features (often at a lower cost increment).

The VC influx also hints at future consolidation: many small players will either be bought out or fall by the wayside, and a few big winners could become the next Palo Alto Networks or CrowdStrike of the AI security era. The competitive landscape in cybersecurity may shift if, say, an AI-focused upstart can prove its platform dramatically reduces breaches or SOC workload. Investors certainly are betting on a few dark horses that could upset the old order.

Strategic Implications for CISOs and Security Leaders

For CISOs, SOC managers, and enterprise security buyers, the rise of autonomous AI in cybersecurity presents a strategic inflection point. Here are several key considerations and implications:

Embrace AI – Deliberately and Skeptically: Security leaders should lean into the benefits of AI for defense, but do so with eyes open. That means evaluating AI-driven solutions for clear use-cases – such as automated threat triage, user behavior anomaly detection, or incident response acceleration – and piloting them in controlled settings. The upside (faster detection, labor savings, better insights) is too great to ignore, especially if adversaries are getting faster. However, leaders must maintain healthy skepticism of vendor claims. Demand demonstrations of how an AI handles real-world data and adversarial scenarios. Ask for success metrics (e.g., did mean-time-to-detect improve?). Essentially, trust but verify. And when integrating AI, ensure there are fallback procedures: if the AI malfunctions or is unavailable, can your team still operate? Don’t rip out all traditional capabilities in favor of a black-box AI – instead, run AI in parallel and gradually increase its autonomy as confidence grows.

Re-skill and Upskill Your Team: An AI-augmented SOC still needs skilled humans, but the skill profile will change. Analysts will need to learn to work with AI tools – crafting effective prompts, interpreting AI outputs, and correcting AI errors. This is a new competency that may not be in the typical playbook of a Tier-1 analyst. Consider training your team on how to leverage systems like Security Copilot or other AI assistants. It may even be worth having an “AI Security Officer” or point person who specializes in the care and feeding of your security AI (e.g., keeping its knowledge base up to date, fine-tuning models with your organization’s data, etc.). Also, encourage senior analysts to incorporate AI into their workflows for things like malware analysis or threat hunting, so they become force multipliers for the whole team. Conversely, plan for some cultural resistance – some team members might fear that AI will replace their jobs. It’s important to frame it not as replacement, but as removing drudgery: the AI takes on the boring 80% of Tier-1 alerts so humans can focus on the challenging 20% that remain. Over time, the role definitions in the SOC might be rewritten; job descriptions may include “experience with AI-driven security platforms” as a requirement.

Augment Policies and Procedures: With AI making decisions, update your incident response plans and playbooks. For instance, if your XDR’s AI component can auto-contain endpoints, your IR plan should note that endpoints may be isolated automatically and analysts should verify and either approve or undo that action. Create policies for AI usage: who is allowed to give it certain high-impact prompts (like executing a script on all machines)? How do you handle an AI-generated assessment – does it require a second human confirmation before declaring an incident? Additionally, consider the ethical guidelines: for example, if using AI that can monitor employee behavior, ensure you navigate privacy implications and have clear acceptable use policies.

Budget and Vendor Management: The SOC of the future may be more cost-efficient in some areas (fewer Level-1 contractors, possibly lower dwell times reducing breach costs), but it will require investment in AI software and infrastructure. CISOs need to allocate budget for these new tools, which might be pricey (some AI security SaaS offerings charge per endpoint or per GB of data analyzed, and it can add up). Articulate the ROI in terms executives understand: e.g., “This AI system could reduce our need to hire 3 extra analysts, saving $X, or prevent breaches by catching threats faster, avoiding $Y in incident costs.” Because many AI security startups are new, also weigh the risk of relying on immature companies – you might favor established vendors who are adding AI features, to reduce supply chain risk. But if a startup’s tech is truly ahead, consider a pilot or phased adoption, perhaps with contract clauses about performance and support.

Adversary Awareness: Given the strong possibility that attackers will use AI, threat modeling exercises should incorporate AI-assisted threats. Update your threat scenarios: e.g., “What if an attacker uses an AI to generate 1000 unique phishing emails and one lands in our CEO’s inbox?” Or “Can our controls detect if malware is morphing to evade each endpoint’s defenses?” Security testing may involve using AI tools adversarially against your own environment to see what holes they find. Some companies are already doing “red team with AI” exercises – essentially letting an AI system attempt to breach their systems under controlled conditions, to uncover gaps. Align your defenses accordingly: more emphasis on behaviors and anomalies (since signature/IOC-based detections will struggle against AI-generated variants), and robust awareness training emphasizing that social engineering could become much more personalized and convincing thanks to AI.

Collaboration and Intelligence Sharing: In an AI-driven landscape, defenders can benefit from collective intelligence. If one company’s AI detects a novel attack pattern, sharing that (via ISACs or industry groups) can help others update their models. Consider participating in information-sharing on AI threats. For example, if you experience a prompt injection attack on an internal AI system, publish a sanitized report or talk at a conference about it. The community is still learning, and collaboration can blunt the advantage attackers might gain. Additionally, keep an eye on emerging standards or frameworks (NIST, for instance, might develop guidelines for AI in cybersecurity, or there may be benchmark datasets for AI threat detection). By engaging early, you can help shape best practices and also stay ahead of compliance requirements that could emerge (for instance, regulators might ask how you secure AI systems or use AI responsibly in protecting consumer data).

Focus on Resilience and Response: Despite all the advancements, assume that breaches will still happen – possibly even more sudden and complex ones due to AI. Therefore, continue to invest in incident response capabilities. An AI might help contain an incident quickly, but you still need a response team to investigate root cause, eradicate deeply embedded threats, and interface with legal/communications etc. One could imagine an AI agent battling another AI agent within your network – you’ll want a human-led strategy to ultimately resolve that situation, akin to how a human general oversees autonomous drones in military conflicts. In essence, resilience (backups, recovery plans, fail-safes) remains key. If an AI-driven attack takes out your primary systems, do you have a way to continue critical operations? And if your defensive AI makes an error, can you quickly correct and recover from it? These questions highlight that while AI adds power, it doesn’t remove the need for strong fundamental security hygiene and planning.

Strategic Advantage: Finally, consider the competitive advantage aspect. For companies in sensitive industries (finance, healthcare, critical infrastructure), being on the leading edge of AI-powered defense could become a market differentiator. Customers and boards are increasingly concerned about cyber resilience. If you can say, for example, that your organization leverages cutting-edge AI to protect customer data – with concrete outcomes like “we detect 95% of threats within minutes” – that builds trust. Conversely, falling behind could put you at risk not just of breaches but of scrutiny for not keeping up with industry best practices. Gartner’s projection that 70% of detection/response will involve multi-agent AI by 2028 ^[97] implies that in a few years, not using AI in security might be seen as negligence. Security leaders should thus craft a roadmap for AI adoption that aligns with their risk appetite and business needs, ensuring they are neither recklessly early nor lagging too far behind.

Conclusion

The rise of autonomous AI in cybersecurity is ushering in what can only be described as a paradigm shift. Offensively, AI has become the great equalizer for attackers – script kiddies and crime syndicates alike can now leverage machine intelligence to supercharge their exploits, churning out bespoke malware and discovering hidden cracks in our digital infrastructure at a pace humans could never achieve. Defensively, the tables are turning as AI promises to finally give an edge to beleaguered security teams – enabling them to detect threats more intelligently, respond at machine speed, and perhaps even predict attacks before they strike by analyzing patterns invisible to human eyes.

Yet, this is not a story of a magic wand that makes cybersecurity problems vanish. Rather, it is an arms race – an AI vs AI duel playing out across networks and clouds globally. As one side innovates, the other must counter. The traditional model of cybersecurity, reliant on human analysts painstakingly poring over alerts and writing static rules, is breaking under the strain of modern threats. In its place, we see an emerging model where human expertise is amplified by AI – and in some cases, where AI takes the front line with humans orchestrating in the background.

The implications reach far beyond technology into process, people, and strategy. Security leaders stand at a crossroads where decisions made today about embracing or ignoring AI will have profound consequences in the years ahead. Those who harness autonomous AI for defense may drastically reduce their risk and improve efficiency; those who don’t may find themselves overwhelmed by AI-enabled attackers or simply outpaced by the volume and complexity of modern threats. As John Watters aptly noted, adversaries historically set the pace of innovation in cyber – but now defenders have a chance to leapfrog with AI, closing that gap ^[98].

Ultimately, success in this new era will come from a balanced partnership of humans and machines. We should neither be naïvely over-reliant on AI nor fearful of it. Instead, the goal is to integrate AI thoughtfully – to handle the speed, scale, and complexity, while human ingenuity handles ambiguity, ethics, and the unexpected. The autonomous SOC will still need brilliant human pilots. And the dark AI tools of attackers will still reflect the intent (and limitations) of their human operators.

One thing is certain: the genie is out of the bottle. AI is now part of the fabric of cyber conflict and cyber defense. As Bruce Schneier presciently wrote, “Automation, autonomy, and physical agency will make computer security a matter of life and death, and not just a matter of data.” ^[99] We are entering that reality now. The challenge and opportunity for the cybersecurity community is to ensure that our AIs – our digital knights – can outsmart and outfight the dark AIs wielded by adversaries, keeping the upper hand in a battle that will increasingly be fought at the speed of algorithms. The age of autonomous cyber warfare has dawned; it’s up to us to win it.

Sources:

Shweta Sharma, “WormGPT returns: New malicious AI variants built on Grok and Mixtral uncovered,” CSO Online – Jun 18, 2025 ^[100] ^[101].
Lucian Constantin, “Gen AI is transforming vulnerability hunting for pen-testers and attackers alike,” CSO Online – Jan 7, 2025 ^[102] ^[103].
Lucian Constantin, CSO Online – Ibid. (interview with Chris Kubecka) ^[104] ^[105].
Lucian Constantin, CSO Online – Ibid. (remarks by Lucian Nițescu) ^[106] ^[107].
Certera Security Blog, “WormGPT & FraudGPT – The Dark Side of Generative AI,” Aug 2023 ^[108] ^[109].
Prakash Sinha, “The Rise of AI-Driven Cyber Attacks: A New Challenge for Service Providers,” Radware Blog – Sep 6, 2024 ^[110].
Tom Warren, “Microsoft Security Copilot is a new GPT-4 AI assistant for cybersecurity,” The Verge – Mar 28, 2023 ^[111] ^[112].
Kevin Townsend, “SentinelOne’s Purple AI Athena Brings Autonomous Decision-Making to the SOC,” SecurityWeek – Apr 29, 2025 ^[113] ^[114].
Kevin Townsend, SecurityWeek – Ibid. (Tomer Weingarten quote) ^[115].
Sam Sabin, “Cybersecurity industry preps for autonomous AI attacks,” Axios – Sep 9, 2025 ^[116] ^[117].
Sam Sabin, Axios – Ibid. (John Watters quotes) ^[118] ^[119].
Chris Metinko, “Cybersecurity Funding Ticks Up Despite Slow Deal Flow,” Crunchbase News – Apr 15, 2025 ^[120] ^[121].
ComplexDiscovery Staff, “AI FOMO Drives Venture Capital Surge…,” ComplexDiscovery – Apr 26, 2025 ^[122] ^[123].
Bruce Schneier, “Automation, autonomy, and physical agency…” – Schneier on Security (Blog) – Mar 2019 ^[124].