Microsoft has significantly enhanced Copilot with voice and vision capabilities, making it a more personalized AI assistant. New features include a virtual news presenter mode, visual perception of surroundings, and natural voice interaction similar to OpenAI’s Advanced Voice Mode.
Copilot’s redesign spans mobile, web, and the dedicated Windows app, with a card-based user interface reminiscent of Inflection AI’s work on its Pi personalized AI assistant. Earlier this year, Microsoft hired several people from Inflection AI, including Mustafa Suleyman, co-founder of Google DeepMind, who now serves as CEO of Microsoft AI. This is Suleyman’s first major change to Copilot since taking charge of the consumer side of the AI assistant.
In an open letter today, Suleyman wrote, “At Microsoft AI, we are crafting an AI companion for everyone. I am confident we can usher in a more serene, supportive era of technology, unlike anything we have seen before.”
Microsoft’s Copilot offers a new, warmer interface and a personalized Copilot Discover page. The personalized homepage, customized based on conversation history, provides a more inviting and useful experience.
Earlier this year, Microsoft moved its consumer Copilot to Suleyman’s team to experiment with personality and customization. Yusuf Mehdi, Microsoft’s consumer chief marketing officer, said the company learned a lot from the Pi and Inflection AI teams, particularly about focusing on customer needs and supporting long conversations in Copilot.
Along with Copilot’s new look, Microsoft is adding voice capabilities similar to those in OpenAI’s ChatGPT. Users can now talk with the AI, ask it questions, and interrupt it as they would in a real conversation. Four voice options are available, and users are prompted to choose one the first time they use the updated Copilot.
Mehdi stated, “We are heavily investing in voice.” He added, “When used in the way we’ve designed it, you really start to engage in conversations. This provides a glimpse of our long-term vision, where the AI can assist you and see what you see if you desire.”
Copilot Vision, a key focus of Microsoft’s redesign, lets the AI see the same webpage content you do. Users can ask about the text, images, and other content on a page and get natural-sounding answers through Copilot Voice. The feature can help with online shopping, for example, by recommending products and suggesting options.
Microsoft says Copilot Vision sessions are temporary and require user consent, and that none of the content is stored or used for training. Initially, it works only on a limited list of popular websites for safety reasons, and it won’t work on paywalled or sensitive content during the preview phase.
Microsoft plans to integrate the new voice and vision capabilities into Copilot. Copilot Vision can, for example, analyze old handwritten recipes, explaining the food involved and offering tips on cooking times. Earlier this year, Microsoft also demonstrated Copilot helping players navigate Xbox games.
The next phase of Copilot introduces Copilot Daily, an audio summary of news and weather read in the style of a CNN anchor. It is intended for brief morning listening and features content only from authorized news and weather providers. Initially, Microsoft is working with Reuters, Axel Springer, Hearst, and the Financial Times, and it plans to add more sources over time.
Microsoft is also introducing Think Deeper in Copilot Labs for testing. Copilot Vision will be available in Labs as well, with Microsoft taking a cautious approach after the security and privacy concerns raised over Recall. Microsoft recently revamped Recall with improved security and privacy options, and users can now disable the feature or uninstall it entirely.