OpenAI Unleashes ChatGPT Agent – The AI Assistant That Can Plan, Shop and Make PowerPoints For You

ChatGPT Evolves From Chatbot to Personal Assistant
OpenAI has rolled out a major upgrade to ChatGPT, turning the popular chatbot into a personal AI assistant capable of not just answering questions, but taking actions on a user’s behalf. Dubbed the ChatGPT “Agent”, the new feature launched on Thursday (July 17, 2025) and allows ChatGPT to “think” and act using its own virtual computer theguardian.com. In practical terms, users can now ask ChatGPT to handle multi-step tasks – from finding restaurant reservations or shopping online to generating entire spreadsheets and slide deck presentations autonomously theguardian.com. OpenAI says this agent mode lets the AI navigate websites, control web browsers and apps, manage files, and produce outputs like Excel files or PowerPoint slides, rather than just returning text responses techmeme.com.
Why this matters: The upgrade is OpenAI’s boldest step yet to move beyond a static Q&A chatbot toward an AI that functions like a digital assistant or “agent.” Unlike standard chatbots, AI agents can carry out complex, multi-step workflows by interacting with software and websites on the user’s behalf theverge.com. “The hope is that agents are able to bring some real utility to users – to actually do things for them rather than just outputting polished text and sounding impressive,” notes Niamh Burns, a senior media analyst at Enders Analysis theguardian.com. In essence, ChatGPT’s new agent mode aims to fulfill that promise by doing real online work for users, not just chatting.
OpenAI’s ChatGPT Agent uses a built-in “virtual computer” to browse the web, fill forms, run code, and even produce Excel spreadsheets or PowerPoint presentations on behalf of the user openai.com techmeme.com.
What Can the New ChatGPT Agent Do?
OpenAI bills ChatGPT Agent as a general-purpose digital assistant that can tackle a wide variety of computer-based tasks techcrunch.com. Some examples of what it can do include:
- Manage Schedules and Plans: Check your calendar and brief you on upcoming meetings, or find an evening when you’re free and then search for restaurant reservations on OpenTable openai.com theverge.com. It can plan events like a date night by cross-referencing your schedule with restaurant availability.
- Online Research and Reports: Conduct deep web research on a topic and compile a concise report or analysis. For instance, it could analyze trends (e.g. “the rise of Beanie Babies vs. Labubus”) and generate a summary or a detailed research paper theverge.com.
- Shopping and Orders: The agent can go online shopping for you. You can ask it to find products with certain criteria, compare options, and even place orders (with your permission) theguardian.com wired.com. OpenAI’s research lead Isa Fulford even had the agent order a batch of cupcakes by following her specific instructions – a task that “took almost an hour” but was still easier for her than doing it manually wired.com.
- Office Tasks – Spreadsheets and Presentations: Perhaps most notably, ChatGPT Agent can produce editable files. It can generate an Excel spreadsheet or a PowerPoint slide deck from scratch based on your prompt openai.com. For example, you could ask it to analyze data on your competitors and create a slide deck with charts summarizing the findings openai.com. It can also update spreadsheets with new data or convert a set of screenshots into a formatted presentation openai.com. The output files are downloadable and meant to be opened in standard office software (though OpenAI cautions the slide generation feature is still in beta) openai.com.
- Use Developer Tools and APIs: Under the hood, the agent has access to a programming terminal and can call public APIs. This means it could run code to perform custom calculations or query external services. It can integrate with apps like Gmail or GitHub via “connectors,” pulling in information (with user permission) to use in its responses openai.com. OpenAI says ChatGPT Agent can even fill out online forms and interface with services like Google Drive or SharePoint by making API calls wired.com.
All these capabilities are orchestrated by giving the AI its own “virtual browser/computer” to work in. When you assign a task, ChatGPT will navigate websites, click links or buttons, scroll pages, fill text fields, write and execute code, and so on – whatever steps are needed to complete the assignment openai.com techmeme.com. It works iteratively and autonomously, deciding which tool or website to use next. For instance, planning a Japanese dinner might involve searching recipes on Google, then opening a grocery site to order ingredients, and finally generating a shopping list spreadsheet – all done by the agent without the user micromanaging each step.
How Does ChatGPT Agent Work?
Behind the scenes, ChatGPT Agent is powered by a new AI model OpenAI built specifically for agent tasks, separate from the base GPT-4 model theverge.com. The model was trained via reinforcement learning to handle complex tasks that require using multiple tools (like browsers, APIs, and code) in sequence theverge.com. In fact, OpenAI merged two earlier experimental systems – Operator (a browsing/automation tool) and Deep Research (an in-depth analysis tool) – into this unified agent. “We realized that the two products are very complementary, and basically decided to combine teams,” Fulford says wired.com. The result is an agent that combines Operator’s ability to click around the web with Deep Research’s skill in synthesizing information into one workflow wired.com.
Toolbox of Skills: ChatGPT Agent comes equipped with multiple specialized tools it can wield openai.com:
- A Visual Browser for interacting with websites through a normal GUI, as a human would (clicking buttons, navigating pages).
- A Text-based Browser for sending quick HTTP requests and parsing raw text (useful for faster reading of large text or when visual rendering isn’t needed) openai.com.
- A Terminal/Console that lets it run code, manipulate files, or use command-line utilities within its sandboxed environment openai.com.
- Direct API Access, allowing it to call external services’ APIs (e.g., posting to Google Calendar, querying a database, or fetching data from an online service) openai.com.
- Connectors to User Accounts: Users can connect their own apps (like email or GitHub). With permission, the agent can pull in relevant info from your emails, calendar, or other accounts to accomplish tasks openai.com. For example, it might scan your Gmail for recent messages if that’s needed to draft a summary, or check your calendar via an API to find free time slots.
These tools allow the AI to choose the optimal approach for a task. It might use the API to quickly check your calendar availability, then switch to the visual browser to navigate an OpenTable reservation page that requires clicking and human-like interaction openai.com. It could download a file via the text browser or API, run code on it in the terminal to analyze or reformat it, then open the results in the visual browser to present them to you openai.com. All of this happens within the agent’s virtual machine, isolated from your actual device – so it’s like the AI has its own computer where it carries out your instructions openai.com.
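To make the tool-selection idea concrete, here is a minimal Python sketch. It is an illustration only: OpenAI has not published how the agent picks tools (the reporting above says the choice comes from a model trained with reinforcement learning), so the `Tool`, `Step` and `choose_tool` names and the hand-written rules below are all assumptions invented for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Tool(Enum):
    """The five tool types described above (names are illustrative)."""
    VISUAL_BROWSER = "visual browser"
    TEXT_BROWSER = "text browser"
    TERMINAL = "terminal"
    API = "direct API call"
    CONNECTOR = "account connector"


@dataclass(frozen=True)
class Step:
    """One step of a task, with hypothetical hints about what it needs."""
    description: str
    needs_code: bool = False          # step runs code or manipulates files
    needs_clicks: bool = False        # page requires human-like GUI interaction
    reads_user_account: bool = False  # data lives in a connected account
    has_public_api: bool = False      # the service exposes an API for this


def choose_tool(step: Step) -> Tool:
    """Pick a tool with simple fixed rules; the real agent uses a learned policy."""
    if step.needs_code:
        return Tool.TERMINAL
    if step.needs_clicks:
        return Tool.VISUAL_BROWSER
    if step.reads_user_account:
        return Tool.CONNECTOR
    if step.has_public_api:
        return Tool.API
    return Tool.TEXT_BROWSER  # default: fast, text-only reading


# A plan loosely modeled on the dinner-reservation example in the text.
plan = [
    Step("check calendar availability", has_public_api=True),
    Step("book a table on a reservation page", needs_clicks=True),
    Step("reformat a downloaded file", needs_code=True),
    Step("skim a long article"),
]
for step in plan:
    print(f"{step.description} -> {choose_tool(step).value}")
```

The fixed rule order is a simplification chosen for readability. A learned policy could weigh speed, cost and reliability per step instead of following a hard-coded priority.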
User Experience: To the end user, using ChatGPT Agent is straightforward. The feature is available via a new “Agent mode” in ChatGPT’s interface for those with access openai.com. You simply start a prompt with a task (you can also type a slash command “/agent”) and the AI takes it from there theverge.com. As it works, an on-screen narration shows what the agent is doing step by step – for example, “Browsing to maps.google.com… Searching for ‘Italian restaurants near me’…” – so you can follow along openai.com. Notably, you can interrupt or steer the agent at any time: you can pause the process to clarify your instructions or ask it to take a different approach, and it will adapt mid-task without losing progress openai.com. This collaborative loop is meant to keep the AI aligned with your goals.
Safety Features: Keeping AI Actions in Check
Empowering an AI to take actions online raises obvious safety concerns, and OpenAI acknowledges this new mode comes with “more risks than previous models” theguardian.com. To mitigate these, OpenAI has implemented a stack of safeguards and limitations:
- User Permission for Sensitive Actions: “You’re always in control,” OpenAI emphasizes theguardian.com. ChatGPT Agent will request explicit confirmation before doing anything with serious consequences, such as making a purchase, sending an email, or booking a reservation on your behalf theguardian.com theverge.com. The user must approve these irreversible steps, preventing the AI from, say, impulsively ordering $1,000 of gadgets on Amazon without you knowing.
- “High-Risk” Content Restrictions (Bio/Chem): Given the agent’s enhanced capabilities, OpenAI has classified it under a “High Biological and Chemical Risk” category, even though they have “no definitive evidence” it could help create a bioweapon theguardian.com theverge.com. This precaution (part of OpenAI’s Preparedness Framework) means additional guardrails are active. Specifically, OpenAI runs a real-time content classifier on every agent prompt to check if it’s related to biology or chemistry, and if so, the agent’s response is vetted by a second safety model to ensure it isn’t providing dangerous instructions techcrunch.com. In other words, if someone tried to misuse the agent to, say, cook up a toxic substance, the system is designed to catch and block it.
- Trained to Refuse Harmful Tasks: The agent has been trained to reject certain suspicious or malicious requests. For example, it will refuse if prompted to carry out something obviously dangerous or unethical, like performing a bank transfer to an unknown account or executing destructive commands theguardian.com. OpenAI says red-teamers and domain experts helped test the system against “realistic scenarios” to harden these refusals openai.com.
- Disabled Long-Term Memory: One interesting limitation – ChatGPT’s long-term chat memory is turned off in agent mode techcrunch.com. Normally, ChatGPT can remember information from earlier in a conversation or past sessions (if enabled), but OpenAI worried that a clever attacker could exploit this during agent tasks (via so-called prompt injections) to make the agent leak sensitive data or do unwanted things techcrunch.com. As a result, the agent currently operates statelessly, not carrying over info from previous chats. OpenAI may re-enable memory in the future once they’re confident it’s safe, but for now this “extra precaution” avoids potential data leaks wired.com.
- Financial Transactions Off-Limits: OpenAI has also restricted financial operations for now. The agent will not execute money transfers or stock trades, for example, even if asked theverge.com. In fact, there’s a safeguard called “Watch Mode” that kicks in if the agent is browsing certain sensitive websites (like banks or trading platforms) – it will pause its activity if the user navigates away from the agent’s browser tab, to prevent any sneaky moves in the background theverge.com.
- Extensive Testing and Bounty Program: OpenAI touts that this model has their “most comprehensive safety stack to date” in terms of threat modeling and monitoring openai.com. They collaborated with outside biosecurity experts and had domain specialists red-team the agent before launch openai.com. Alongside release, OpenAI also published a detailed system card explaining risks and is offering a bug bounty to encourage external researchers to report vulnerabilities openai.com.
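The permission and refusal rules in the list above amount to a gate that sits between the agent's decision and the action itself. A minimal Python sketch of that pattern follows. The action names, the `run_action` function and the `confirm` callback are hypothetical and are not taken from OpenAI's implementation; only the categories (purchases, emails and bookings need approval; money transfers and stock trades are off-limits) come from the reporting.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical labels for actions the article says require user approval.
CONSEQUENTIAL = {"purchase", "send_email", "book_reservation"}
# Hypothetical labels for actions the article says are refused for now.
BLOCKED = {"money_transfer", "stock_trade"}


@dataclass(frozen=True)
class Action:
    kind: str     # one of the labels above, or anything else for low-risk steps
    summary: str  # human-readable text shown in the confirmation prompt


def run_action(action: Action, confirm: Callable[[Action], bool]) -> str:
    """Return 'refused', 'cancelled' or 'executed' for a proposed action."""
    if action.kind in BLOCKED:
        return "refused"      # never executed, even if the user asks
    if action.kind in CONSEQUENTIAL and not confirm(action):
        return "cancelled"    # the user declined the confirmation prompt
    return "executed"         # low-risk step, or the user approved it


# Usage: the confirm callback stands in for a UI prompt.
order = Action("purchase", "Order cupcakes")
print(run_action(order, confirm=lambda a: True))
print(run_action(order, confirm=lambda a: False))
print(run_action(Action("stock_trade", "Buy shares"), confirm=lambda a: True))
```

Checking the blocked set before asking for confirmation means a user cannot approve their way past a hard restriction, which matches the article's description of financial actions being refused "even if asked".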
Despite these precautions, OpenAI knows unexpected behaviors may still emerge when an AI is operating in the wild internet. The company says it will iteratively refine the agent and adjust safeguards as needed. For now, users are advised to supervise the agent’s actions (the interface encourages this by narrating every step). “With this model there are more risks than with previous models,” OpenAI admits, which is why they are “exercising caution and implementing the needed safeguards now” theguardian.com.
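The two-stage biology and chemistry check described earlier in this section (a topic classifier on every prompt, then a second model that vets the response) can be sketched in a few lines of Python. This is an assumption-laden illustration: the function names are invented, and the keyword-matching stand-ins below are toys that take the place of the trained models OpenAI says it uses.

```python
from typing import Callable

REFUSAL = "This request can't be completed."


def screen_response(
    prompt: str,
    draft: str,
    is_bio_chem: Callable[[str], bool],
    is_safe: Callable[[str], bool],
) -> str:
    """Release the draft unless the prompt is flagged AND the monitor rejects it."""
    if not is_bio_chem(prompt):
        return draft   # stage 1: prompt is unrelated to biology or chemistry
    if is_safe(draft):
        return draft   # stage 2: flagged topic, but the answer itself is benign
    return REFUSAL     # flagged topic and the monitor rejected the answer


# Toy stand-ins so the sketch runs; real systems would use trained classifiers.
def toy_topic_classifier(prompt: str) -> bool:
    return any(word in prompt.lower() for word in ("toxin", "pathogen", "synthesis"))


def toy_safety_monitor(draft: str) -> bool:
    return "step-by-step" not in draft.lower()


print(screen_response("What is a toxin?", "A toxin is a poisonous substance.",
                      toy_topic_classifier, toy_safety_monitor))
```

Running the cheaper topic check first means the second model only has to examine the small share of traffic that touches the sensitive domain.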
Early Limitations: Speed and Reliability
Don’t fire your human assistant just yet. In early demos and testing, ChatGPT Agent has shown impressive capabilities but also notable limitations:
- Slow and Steady: The agent often takes a while to complete tasks. It might spend several minutes clicking and browsing to gather information, far longer than a direct chatbot answer. In one demonstration, having the agent sift through a Google Calendar and restaurant sites to suggest dinner options took about 10–15 minutes theguardian.com. Generating a complex slide deck or conducting extensive research could take even longer (OpenAI staff noted a slides task took ~25 minutes in testing) wired.com. “Even if it takes 15 minutes, half an hour, it’s quite a big speed-up compared to how long it would take you to do it,” argues Fulford, pointing out that users can kick off a task and then do other things while the agent works theverge.com. Still, patience is required; the agent is not instantaneous. OpenAI’s Yash Kumar estimates an average task takes ~10–15 minutes in the current version wired.com.
- Occasional Hiccups: As with any AI, the agent can make mistakes or get “stuck” on a task. Early users have reported mixed results. Some complex workflows might confuse it, or it might misinterpret an instruction halfway through. One early tester commented that the agent “failed at the three different tasks I gave it… A nice glimpse of the future, but not normally useful yet.” techmeme.com This underscores that the technology, while advanced, is not infallible. OpenAI itself notes the agent is “still in its early stages” and “can still make mistakes.” openai.com Future updates are expected to improve its reliability and reasoning.
- Basic Output Quality: The PowerPoint/slide generation feature is currently in beta, which means the slides it creates may look quite plain or require polishing openai.com. OpenAI focused first on getting the content and structure right, rather than flashy design. They warn that formatting might be rudimentary and occasionally there are discrepancies between the slide preview and the exported PowerPoint file openai.com. Similarly, while the agent can edit spreadsheets and maintain formulas, it’s not yet an Excel wizard at the level of a skilled human. OpenAI is already training the next version to produce more “polished, sophisticated outputs” in presentations openai.com.
- No European Launch (Yet): Notably, ChatGPT Agent did not launch in the EU. OpenAI is “still working on enabling access for the European Economic Area and Switzerland” openai.com. Users elsewhere (including the US and UK) gained access immediately, but European users are left waiting indefinitely. OpenAI hasn’t given a firm timeline for EU rollout theverge.com. This is likely tied to regulatory concerns – the EU’s stringent data and AI regulations may require additional compliance steps from OpenAI before unleashing an autonomous agent. For now, Europeans see only a message that the feature is unavailable in their region.
On the positive side, OpenAI claims the new agent’s underlying model is far more capable than previous versions, which bodes well for handling complexity. The model reportedly achieved state-of-the-art scores on several tough benchmarks techcrunch.com. For example, it scored 41.6% on “Humanity’s Last Exam,” a massive expert-level test spanning 100+ subjects – roughly double the score of OpenAI’s prior models on that test techcrunch.com. On a notoriously difficult math benchmark (FrontierMath), it managed 27.4% accuracy with tool use, versus just 6.3% by the best earlier model techcrunch.com. These improvements suggest the agent is much better at solving complex, multi-step problems when it can use tools. “OpenAI says ChatGPT agent is far more capable than its previous offerings,” TechCrunch reports techcrunch.com – though until more users push it to its limits in real-world scenarios, it remains to be seen how “capable” it truly is outside controlled tests techcrunch.com.
Availability: Who Can Use ChatGPT Agent?
OpenAI is initially rolling out ChatGPT Agent as a perk for paying subscribers only. As of this week, the feature is being enabled for users on the ChatGPT Pro, Plus, and Team plans (roughly equivalent to premium tiers) techcrunch.com. Pro users were slated to receive access first (on launch day), followed by Plus and Team subscribers over the next few days openai.com. Enterprise and Education plan customers will get it “in the coming weeks” once the kinks are worked out openai.com theverge.com. There is no announced timeline for free users to receive agent capabilities – and it’s possible it will remain a paid feature for the foreseeable future, given the added value and high compute costs involved.
Along with tiered access, OpenAI has imposed monthly usage limits. Pro subscribers (the highest tier) can run up to 400 agent tasks per month, while Plus and Team users get 40 tasks per month included wired.com. The cap keeps these compute-heavy operations in check, though additional usage may be available for purchase via a credit system if users need more openai.com. The tasks are counted per “agentic prompt,” meaning each time you activate the agent to do something counts as one.
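As a rough illustration of how such a cap behaves, the Python sketch below counts one unit per agentic prompt against the plan limits quoted above (400 for Pro, 40 for Plus and Team). The `AgentQuota` class is hypothetical, and the sketch deliberately leaves out the credit system and the monthly reset, since the article gives no details on either.

```python
# Monthly agent-task limits as reported in the article.
MONTHLY_TASK_LIMITS = {"pro": 400, "plus": 40, "team": 40}


class AgentQuota:
    """Counts agent tasks for one user within one billing month (illustrative)."""

    def __init__(self, plan: str) -> None:
        if plan not in MONTHLY_TASK_LIMITS:
            raise ValueError(f"no agent access on plan: {plan!r}")
        self.limit = MONTHLY_TASK_LIMITS[plan]
        self.used = 0

    @property
    def remaining(self) -> int:
        return self.limit - self.used

    def start_task(self) -> bool:
        """Record one agentic prompt; return False once the cap is reached."""
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


quota = AgentQuota("plus")
for _ in range(40):
    quota.start_task()
print(quota.remaining)     # 0
print(quota.start_task())  # False
```

At these limits a Plus subscriber averages a little over one agent task per day, while a Pro subscriber gets ten times as many.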
As mentioned, European users cannot access ChatGPT Agent at launch theguardian.com. When non-EU users toggle agent mode, they are warned about the feature’s experimental nature and then can proceed. EU users, however, are simply blocked. OpenAI’s note that it’s working on EEA access suggests the hold-up is likely compliance with EU regulations (perhaps related to privacy and the AI Act). This geofenced rollout is reminiscent of how some previous ChatGPT features (like web browsing) were temporarily withheld in regions over legal uncertainty. For now, anyone in the EU will have to wait until OpenAI ensures the agent meets local requirements.
An AI Agent Arms Race – Google, Anthropic & Others
OpenAI’s push into “agentic” AI comes amid a broader industry trend toward autonomous AI assistants. In fact, competitors have been gearing up their own agent-like features:
- Anthropic’s Claude: Last year, Anthropic (maker of the Claude chatbot) introduced a capability called “Computer Use” – essentially giving Claude the ability to use a computer like a human, such as browsing websites and performing tasks on a user’s machine theverge.com. Just two months ago (May 2025), Anthropic launched its latest model, Claude Opus 4, with agentic features and similarly activated special bio-safety measures to prevent misuse theverge.com. This shows even smaller AI startups are cognizant of both the power and the risks of autonomous agents.
- Google’s AI Extensions: Google has been working on integrating its generative AI (like Bard and Assistant) with direct actions. They’ve demoed AI that can draft emails in Gmail, summarize documents in Google Drive, and even control a browser through their experimental “Duet AI” for Workspace. The Guardian notes that Google recently launched similar assistant “agents” that can juggle between apps to complete user tasks theguardian.com. Additionally, just last week Google hired key staff from a startup (Windsurf) specifically to bolster its agentic AI projects theverge.com, underscoring the competitive race to build Jarvis-like assistants.
- Other Players: Meta (Facebook) and Amazon have also mentioned AI agent ambitions on earnings calls, indicating everyone in Big Tech sees this as the next big thing theverge.com. For instance, e-commerce companies imagine AI agents that can handle customer service chats or shopping requests end-to-end. In a striking early example, fintech company Klarna reported in early 2024 that its AI customer-service agent handled two-thirds of all customer chats, doing the work of approximately 700 humans theverge.com. That success story helped popularize the term “AI agent” in corporate circles, and since then many CEOs have been touting agent-based AI as a goal theverge.com.
- Past Experiments: OpenAI itself dipped its toes into agents earlier. In January 2025 it released Operator as a research preview, described as “an agent that can go to the web to perform tasks for you” theverge.com. Operator could click and scroll through websites. There was also the Deep Research mode that could write long-form analyses. These precursors, however, were limited in scope and sometimes brittle. Other startups (like Adept AI’s ACT-1) have shown agents that can execute commands in software like a human, but none have yet become mainstream products. The early generations of AI agents struggled with complex tasks and reliability techcrunch.com – often requiring lots of hand-holding. Tech execs painted visions of AI assistants that could do anything, but the reality lagged behind the hype techcrunch.com.
Now with ChatGPT Agent, OpenAI is attempting to leapfrog those earlier efforts. By combining strengths (web browsing + analysis) and using GPT-4-level intelligence, they claim to have an agent finally approaching the grand vision. “This is the best UX for an agent ever. ABSOLUTELY INSANE. BEAT THIS!!” one excited user posted after the launch techmeme.com. While that sentiment is obviously hyperbolic, it captures the excitement in some corners of the AI community that we’re inching closer to a “J.A.R.V.I.S.” – Iron Man’s fictional AI butler – in real life theverge.com. For now, ChatGPT Agent and its peers are still early steps toward that ideal, mostly handling research, coding and basic online errands rather than truly open-ended autonomy theverge.com. But the competitive momentum is unmistakable: every AI company wants to be first to crack the AI assistant that people will actually use daily.
Monetization: Will Agents Make Money for OpenAI?
With the launch of ChatGPT Agent, OpenAI is not only showcasing new tech – it’s also eyeing potential revenue streams. The company has heavily subsidized ChatGPT’s development (with Microsoft investing billions), and needs to turn its hugely popular AI into a “money-making product” wired.com. Agents could be key to that monetization in a few ways:
- Subscription Upsell: Simply put, agent mode is a premium feature that could drive more users to paid plans. By limiting it to Plus/Pro subscribers, OpenAI makes the $20+ monthly fee more attractive to power users who want an AI assistant to offload work. This is the straightforward immediate monetization: get more people paying for ChatGPT access.
- Transaction Fees: OpenAI’s CEO Sam Altman has hinted at earning commissions from commerce done via its AI. He speculated that OpenAI could “charge a 2% fee on sales generated” through its assistant’s efforts theguardian.com. In other words, if ChatGPT Agent helps you buy a product or book a hotel, OpenAI might take a small cut (from the merchant or via affiliate links). This model would turn AI-driven shopping or booking into a revenue generator. The recent agent demo showing it guiding a user to retail checkouts immediately fueled chatter that OpenAI might integrate such affiliate or referral fees down the line theguardian.com.
- Sponsored Results/Ads: AI assistants could become a new platform for advertising. If an agent suggests products or restaurants, will brands pay to be recommended? “Some version of ads or sponsored placement feels inevitable,” observes analyst Niamh Burns, noting the “growing pressure [on AI companies] to monetise their products.” theguardian.com There is a precedent – search engines make money from ads, so an AI that replaces search might too. However, OpenAI denies any current use of sponsored content in ChatGPT Agent’s recommendations theguardian.com. They stated the agent does not include paid product placements, and “there are no plans to change that.” theguardian.com For now, results are supposed to be purely based on user’s criteria and the AI’s judgment. Still, the door remains open for future ad models once the assistant ecosystem matures.
- Enterprise Services: OpenAI could also monetize by offering the agent as part of enterprise software solutions. For instance, companies might pay to integrate ChatGPT Agent into their internal tools or to have it handle customer support. OpenAI is already in contract negotiations with Microsoft about continued partnership, and one can imagine advanced agents being packaged into Microsoft’s offerings (which could indirectly bring revenue or favorable terms to OpenAI) wired.com. The “enterprise use cases” were a big consideration in the agent’s design, according to product lead Yash Kumar wired.com, meaning OpenAI is likely thinking about how businesses can leverage (and pay for) this tech.
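To put numbers on the commission idea in the list above, here is a small Python sketch of a 2% fee calculation. The 2% rate is the figure Altman speculated about; the sale amounts, the `commission` function and the rounding rule are assumptions made purely for illustration, not a description of any actual OpenAI pricing.

```python
from decimal import Decimal, ROUND_HALF_UP


def commission(sale_amount: str, rate: str = "0.02") -> Decimal:
    """Fee on one sale, rounded to the nearest cent (rounding rule is assumed)."""
    fee = Decimal(sale_amount) * Decimal(rate)
    return fee.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(commission("150.00"))   # 3.00
print(commission("1000.00"))  # 20.00
```

At that rate, each purchase brings in very little, so the model would only matter at very large transaction volumes.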
In the near term, OpenAI’s focus is likely on refining the agent and driving subscriptions. But eventually, if agents become as useful as promised, they could facilitate entire transactions or workflows – and OpenAI will certainly look to capture some value from that. The company has to balance user trust (an agent recommending products must feel unbiased to be trusted) with monetization. How they do so will be closely watched. As analyst Burns mused, if an agent finds products for you, “what goes into the process of that system finding the products? Would there be commercial deals where brands pay to be featured by assistants…?” theguardian.com. OpenAI insists not for now, but the economic incentive is there.
Expert Reactions and Outlook
The debut of ChatGPT Agent has generated both excitement and cautious commentary from experts and early users. Ethan Mollick, a professor at Wharton known for experimenting with AI in education, was part of a small group with early access. His verdict was upbeat: “ChatGPT agent is, I think, a big step forward for getting AIs to do real work. Even at this stage, it does a good job autonomously doing research & assembling Excel files (with formulas!), PowerPoint, etc.” techmeme.com. Mollick said it gave a glimpse of how various agent abilities are “coming together,” even if it’s not perfect yet techmeme.com. Other AI researchers echoed that sentiment, impressed by the way ChatGPT Agent can chain tasks and produce usable outputs that previously took many manual steps.
At the same time, there’s recognition that real-world testing has just begun. How reliably the agent handles the messy open internet, whether it can avoid falling for scams or misinformation as it browses, and to what extent average users find it genuinely useful – those are open questions. “It remains to be seen how capable it truly is in the real world,” TechCrunch noted, given that prior agents were brittle when facing unexpected scenarios techcrunch.com. There is also the broader societal concern of handing more agency to AI: even with permission checks, stories of AI making odd or risky decisions will surely surface. OpenAI’s own system card acknowledges “novel risks” with such autonomy and pledges ongoing research into mitigating them openai.com.
For now, the introduction of ChatGPT Agent represents a milestone in AI’s march from purely assistive text generation to actual task execution. It’s part of a paradigm shift from “chatbots” to “agents” – AI systems that can take initiative and complete goals in the digital world, not just converse. “Agent is the buzziest of buzzwords right now,” writes WIRED, precisely because so many companies are chasing that vision wired.com. OpenAI has planted a flag firmly in this new territory, leveraging the popularity and familiarity of ChatGPT to push an agent to the masses (or at least the paying masses).
The bottom line: If you’re an eligible ChatGPT user, you can now offload certain tedious or complex tasks to an AI helper and watch it work through them step-by-step. It can feel a bit magical – like having a diligent intern who never sleeps – and also a bit unnerving to see the AI roam the web on its own. This launch is the beginning of a grand experiment in how everyday people might use AI agents. As one early adopter put it: “[It] does a good job autonomously… It gives a sense of how agents are coming together.” techmeme.com In the coming months, we’ll see if ChatGPT Agent truly delivers on its promise of convenience and productivity, and how it stacks up against the growing field of rival AI assistants. One thing is for sure: the era of AI that acts, not just chats, has officially begun.
Sources:
- Booth, R. (2025, July 17). The Guardian – OpenAI launches personal assistant capable of controlling files and web browsers. theguardian.com
- OpenAI. (2025, July 17). OpenAI Blog – Introducing ChatGPT Agent: Bridging Research and Action. openai.com
- Field, H. (2025, July 17). The Verge – OpenAI’s new ChatGPT Agent can control an entire computer and do tasks for you. theverge.com
- Zeff, M. (2025, July 17). TechCrunch – OpenAI launches a general purpose agent in ChatGPT. techcrunch.com
- Rogers, R. (2025, July 17). WIRED – OpenAI’s New ChatGPT Agent Tries to Do It All. wired.com
- Techmeme. (2025, July 17). Aggregated tech news on ChatGPT Agent launch (including Ethan Mollick commentary). techmeme.com