Microsoft aims to win over developers who use artificial intelligence

Inside the Windows Copilot Runtime and How Developers Can Build on It (Plus a Quick Note on Custom Emoji)

Microsoft had a lot to say about Windows and AI during the Build 2024 keynote, and a little to say about custom emoji. The company is working to build artificial intelligence into every corner of Windows it can, from Copilot watching your screen to help you through a game to Copilot agents that can work on your behalf.

Developers will also be able to extend Windows’ new Recall feature by adding contextual information from their apps to the database that powers it. “This integration helps users pick up where they left off in your app, improving app engagement and users’ seamless flow between Windows and your app,” says Pavan Davuluri, Microsoft’s head of Windows and devices.

At Microsoft Build today, the company is sharing a lot more detail about exactly how this Windows Copilot Runtime works. The runtime includes a library of APIs that developers can tap into from their own apps, along with AI frameworks and toolchains designed to help developers ship their own on-device models on Windows.
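
Microsoft hasn’t published the full API surface here, but the on-device toolchain it points to includes ONNX Runtime with the DirectML execution provider. As a rough sketch only, and assuming the onnxruntime-directml package plus a model file you supply yourself (neither is part of Microsoft’s announcement), running a local ONNX model on Windows looks roughly like this:

```python
# Illustrative sketch: running a local ONNX model on Windows with ONNX Runtime.
# Assumes the onnxruntime-directml package and a "model.onnx" file you provide;
# neither is part of Microsoft's announcement.
import numpy as np
import onnxruntime as ort

# Prefer DirectML (GPU/NPU) when available, otherwise fall back to CPU.
providers = [p for p in ("DmlExecutionProvider", "CPUExecutionProvider")
             if p in ort.get_available_providers()]
session = ort.InferenceSession("model.onnx", providers=providers)

# Read the model's declared input so the sketch works regardless of the graph.
inp = session.get_inputs()[0]
shape = [d if isinstance(d, int) else 1 for d in inp.shape]  # dynamic dims -> 1
dummy = np.zeros(shape, dtype=np.float32)  # assumes a float32 input model

outputs = session.run(None, {inp.name: dummy})
print([o.shape for o in outputs])
```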

Developers will be able to use the Windows Copilot Library to integrate features like Studio Effects, including filters and portrait blur, into their apps. Meta is adding Windows Studio Effects to WhatsApp, so you’ll get features like background blur and eye contact during video calls. Developers can also tap into Live Captions and the new translation feature with little to no extra code.

Copilot Plus PCs, PowerToys’ Advanced Paste, and a Vector Embeddings API for Developers

Yesterday, Microsoft demonstrated Copilot Plus PCs recording and storing snapshots of your activity so that you can go back in time and look at it again. This is all powered by a new Windows Semantic Index that stores the data locally, and Microsoft plans to let developers build something similar.

“We will make this capability available for developers with Vector Embeddings API to build their own vector store and RAG within their applications and with their app data,” says Davuluri.
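
Microsoft hasn’t detailed that Vector Embeddings API yet, so here’s a minimal, hypothetical Python sketch of the pattern Davuluri is describing: embed your app data, keep the vectors in a local store, and retrieve the closest matches to ground a prompt. The embed() function below is a toy stand-in for whatever embedding model the real API ends up exposing.

```python
# Minimal RAG-style retrieval sketch; embed() is a placeholder, not Microsoft's API.
import hashlib
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy deterministic embedding: hash character trigrams into a fixed-size vector.
    A real app would call an embedding model here."""
    vec = np.zeros(dim)
    for i in range(len(text) - 2):
        vec[int(hashlib.md5(text[i:i + 3].encode()).hexdigest(), 16) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# "Vector store": app data kept alongside its embeddings.
documents = [
    "Quarterly report draft, last edited Tuesday",
    "Trip itinerary for the Seattle conference",
    "Notes from the design review meeting",
]
store = [(doc, embed(doc)) for doc in documents]

def retrieve(query: str, k: int = 2):
    """Return the k documents whose embeddings are closest (cosine similarity) to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: float(np.dot(q, item[1])), reverse=True)
    return [doc for doc, _ in ranked[:k]]

# The retrieved snippets would then ground a prompt sent to a local or cloud model.
context = retrieve("what did I write in the design meeting?")
print("Answer using this context:\n" + "\n".join(context))
```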

Microsoft’s new Advanced Paste feature is available now as part of the PowerToys suite for Windows 11, letting you convert the contents of your clipboard as you paste. The Advanced Paste menu is triggered with Windows Key + Shift + V, and from there you can convert the paste to a variety of formats using keyboard shortcuts. A prompt box lets you change or summarize the text before you paste it. The catch: you’ll need an OpenAI API key and credits in your OpenAI account for the AI part.
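
PowerToys itself is a C# app, but the AI path boils down to sending your clipboard text to OpenAI with an instruction and copying the response back. A rough Python equivalent of that flow, assuming the pyperclip and openai packages and an OPENAI_API_KEY in your environment (the model name here is an assumption, not necessarily what PowerToys uses), might look like:

```python
# Sketch of the Advanced Paste idea: clipboard text -> LLM transform -> clipboard.
# Not PowerToys' implementation; assumes pyperclip, openai, and OPENAI_API_KEY are set up.
import pyperclip
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def transform_clipboard(instruction: str, model: str = "gpt-4o-mini") -> str:
    """Apply a natural-language instruction (e.g. 'summarize this') to the clipboard contents."""
    text = pyperclip.paste()
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "Rewrite the user's clipboard text as instructed. Return only the result."},
            {"role": "user",
             "content": f"Instruction: {instruction}\n\nClipboard:\n{text}"},
        ],
    )
    result = response.choices[0].message.content
    pyperclip.copy(result)  # ready to paste
    return result

print(transform_clipboard("summarize this in one sentence"))
```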

You’ll soon be able to use Microsoft’s File Explorer to keep track of your coding projects, as the company is integrating Git into the file system browser. Microsoft says developers will be able to see file status, commit messages, and the current branch from within File Explorer. The app is also gaining support for creating 7-zip and TAR archives.
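
File Explorer will surface that Git information in its own UI, but the same three pieces of data are easy to pull from the command line today. This hypothetical helper just shells out to standard git commands:

```python
# Illustrative: fetch the Git details File Explorer will show (status, branch, last commit).
import subprocess

def git(args, repo="."):
    """Run a git command in the given repository and return its trimmed output."""
    return subprocess.run(
        ["git", "-C", repo] + args, capture_output=True, text=True, check=True
    ).stdout.strip()

def repo_summary(repo="."):
    return {
        "branch": git(["branch", "--show-current"], repo),
        "last_commit": git(["log", "-1", "--pretty=%h %s"], repo),
        "changed_files": git(["status", "--porcelain"], repo).splitlines(),
    }

print(repo_summary("."))
```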

Snapdragon X Elite Dev Kits, Real-Time Video Translation in Edge, and Phi-3-vision, a Multimodal Small Language Model

Microsoft also announced a Snapdragon Dev Kit for Windows with a Snapdragon X Elite chip inside. It has 32GB of RAM, a 512GB SSD, and plenty of ports, though it’s not clear whether just anyone will be able to buy one.

Microsoft is adding the ability to create your own custom emoji in Microsoft Teams, which will make it easier for people to express themselves. Custom emoji won’t be visible outside of your organization’s domain, and, like in Slack, admins can limit who is allowed to add them. The feature is coming in July.

Microsoft’s Edge browser is getting an automatic real-time video translation feature that can translate videos from sites like YouTube. The feature works with a handful of languages at launch: Spanish to English and vice versa, plus English to German, Hindi, Italian, and Russian. Microsoft says more languages and video platforms will be added in the future.

Microsoft announced Phi-3, its family of small language models that can run on a mobile device, back in April. At Build, it’s introducing Phi-3-vision, a multimodal version that can read text and analyze images. Image analysis is a big use case for artificial intelligence companies, and a smartphone is about as good a place as any to use it. Phi-3-vision is available now in preview as part of Microsoft’s Phi-3 family of models.
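
This announcement doesn’t come with sample code, but the preview checkpoint is published on Hugging Face as microsoft/Phi-3-vision-128k-instruct. A sketch that follows that model card’s transformers usage (the prompt convention, image URL, and generation settings here are assumptions drawn from the card and may change) would look roughly like this:

```python
# Sketch of querying Phi-3-vision with Hugging Face transformers.
# Model id and the <|image_1|> prompt convention follow the preview model card and may change.
import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Phi-3-vision-128k-instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto", trust_remote_code=True
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

# Placeholder image URL; substitute any photo you want the model to describe.
image = Image.open(requests.get("https://example.com/photo.jpg", stream=True).raw)
messages = [{"role": "user", "content": "<|image_1|>\nWhat is shown in this image?"}]
prompt = processor.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = processor(prompt, [image], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=100)
# Decode only the newly generated tokens, not the prompt.
answer = processor.batch_decode(
    output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(answer)
```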