Control even more Mac apps with AI and voice

While we wait for OpenAI’s next big thing — presumably ChatGPT’s GPT-5 model (or whatever it’s called) — we might get to see and use another big upgrade for genAI products. AI agents are programs that let you manage and control Mac or Windows apps using written ChatGPT prompts and voice.

That’s the future of AI computing I want from ChatGPT and similar AI products. I want to tell AI to look at my screen and do whatever I ask it to. I also want it to perform internet searches on my behalf, whether in the ChatGPT app or a browser. More importantly, I want to do most of it by voice, with the help of Advanced Voice Mode and future versions of it.

OpenAI isn’t quite there, but the demos the company showed during its penultimate “12 Days” announcement tell me we’re heading in that direction.

The ChatGPT app for Mac now works with even more coding apps. OpenAI also added support for note-taking apps, including the Apple Notes app that comes on all Apple devices. The best part about it is that Advanced Voice Mode can be used to talk to ChatGPT about the contents on your screen.

We’re still not getting full agentic functionality from ChatGPT. This is an iterative upgrade over what OpenAI announced about a month ago for the Mac app. At the time, the company gave ChatGPT the ability to look at the contents of the screen in certain coding apps on the Mac. That feature was less about agentic control of the computer and more about OpenAI using a built-in macOS accessibility API to read text.

But OpenAI has built on that, adding support for more coding apps, including Warp, IntelliJ IDEA, PyCharm, and others. As you can see in the short clips below, you can use the Mac app to instruct ChatGPT to look at the screen in a specific app. You’ll be able to choose which app you’re giving the AI access to, and that’s the only thing it’ll see.

In addition to expanding support to more apps, OpenAI is also bringing o1 and o1 pro mode to the mix of models that can be used to code for you. Advanced Voice Mode wasn’t a part of these demos, as the OpenAI engineers still used text-based prompts to demo the new functionality.

These updates aren’t exciting if you don’t code for a living. But you probably already write all sorts of text, and that’s where ChatGPT might come in handy if you use a Mac. It’s also where the Advanced Voice Mode feature shines.

OpenAI showed a demo of ChatGPT’s agentic behavior for note-taking apps. The list of supported apps includes Apple Notes, Notion, and Quip. The demo below shows ChatGPT looking at a Notion note to understand the context and answer a prompt. The AI then performs a search for factual information about a historic event and generates text trying to mimic the voice of the note’s author.

Rather than typing your prompts, you might want to talk to ChatGPT with Advanced Voice Mode. This might boost your productivity, assuming everything works as intended. As you’ll see in the second demo, the AI quickly understands your needs, but it might fail here and there.

Then again, we’re still in the early days here. What OpenAI has shown only scratches the surface of what ChatGPT might do one day on Mac and PC.

Until then, you can try these new ChatGPT capabilities in the Mac app as long as you have access to a premium tier (Plus, Pro, Team, Enterprise, and Edu). OpenAI will bring these features to the Windows app in the future. ChatGPT Free users will also get them next year.
