Everything announced at Google I/O

The return of Now on Tap: contextual answers based on what’s on your screen

Google has introduced a new AI model to its lineup: Gemini 1.5 Flash. The new model is nearly as capable as its predecessor but is designed for narrow, high-frequency, low-latency tasks, which makes it faster at generating responses. Google also made changes to Gemini 1.5 that it says will improve its ability to translate, reason, and code, and it says it has doubled the context window of the 1.5 Pro version.

If you set Gemini as the default assistant on your Android phone, it can already summarize or answer questions about a webpage or a screenshot. Soon, it’ll also be able to detect when there’s a video on your screen and prompt you to ask questions about it. (You could already do this in a more roundabout way by asking about the video’s automatic captions.)

Asking questions about a PDF requires the paid version of Gemini. That’s because the feature ingests the entire PDF, so it needs the long context window available to Gemini Advanced subscribers. Once it has read the document, Gemini effectively becomes an expert on it, whether it’s a dishwasher owner’s manual or the local curbside recycling guidelines. Gemini Advanced is part of the Google One AI Premium plan, which costs $20 per month.

Nearly a decade ago, Google showed off a feature called Now on Tap in Android Marshmallow: tap and hold the home button, and Google would surface helpful contextual information related to what’s on the screen. Talking about a movie with a friend? Now on Tap could pull up details about the title without your leaving the messaging app. Looking at a restaurant on Yelp? The phone could suggest an OpenTable reservation with a tap.

“I think what’s exciting is we now have the technology to build really exciting assistants,” Dave Burke, vice president of engineering on Android, tells me over a Google Meet video call. “I don’t think we had the technology back then to do it well. You need a computer system that comprehends what it sees. Now we do.”

At its I/O developer conference in Mountain View, California, today, Google announced new features for its Android operating system that feel like a return to the Now on Tap of old, this time powered by a decade of advances in large language models.

Circle to Search gets step-by-step homework help

Google’s Sameer Samat says the company has received positive feedback from consumers, but Circle to Search’s latest feature comes specifically from student feedback: users can now circle a physics or math problem on their screen to get step-by-step instructions for solving it.

Samat made it clear Gemini isn’t just providing answers; it shows students how to solve the problems themselves. Later this year, Circle to Search will be able to handle more complex problems involving diagrams and graphs. All of this is powered by Google’s LearnLM models, which are fine-tuned for education.

This week, everyone in the US will start seeing “AI Overviews,” formerly known as the “Search Generative Experience”: results pages that open with summarized answers compiled from the web, similar to search tools like Perplexity or Arc Search.

Google already lets you search by pointing your camera at things; now you can search with video, too. You can take a video of something, ask a question while recording, and Google’s AI will try to pull answers from the web.

Arriving this summer, “Ask Photos” lets Gemini pore over your Google Photos library in response to your questions, a boon for anyone with a decade or more of pictures piled up. The feature goes beyond just pulling up photos of dogs and cats: in an onstage demo, CEO Sundar Pichai asked it for his license plate number, and it responded with the number itself plus a picture of the plate so he could confirm it was correct.

The company believes that Project Astra will become a do-everything virtual assistant, with the ability to watch, understand, and remember what it sees through your device’s camera, and to act on it. It’s powering many of the most impressive demos from I/O this year, and the company’s aim is for it to be an honest-to-goodness AI agent that can not just talk to you but actually do things on your behalf.

Veo, a new generative AI model, can turn text, image, and video prompts into high-definition video. Clips can be generated in a variety of styles and refined with additional prompts. The company is already offering Veo to some creators for use in YouTube videos and is also pitching it to Hollywood for use in films.

Everything else: Gems, scam detection, and Gemini Nano on the desktop

Google is also introducing custom chatbots called “Gems.” Users can tell a Gem how to respond and what to do, much like with OpenAI’s GPTs. Gemini Advanced subscribers will be able to set one up as, say, a positive and insistent running coach.

If you’re on an Android phone or tablet, you can now circle a math problem on your screen and get help solving it. The AI won’t simply hand over the answer; it breaks the problem down into steps to make it easier to work through.

Using on-device Gemini Nano AI smarts, Google says Android phones will be able to help you avoid scam calls by watching for red flags, like common scammer conversation patterns, and then popping up real-time warnings mid-call. The company promises to offer more details on the feature later this year.

Gemini Nano, the lightweight version of the Gemini model, is also coming to the desktop as a built-in assistant in Chrome, using on-device AI to help you generate text for social media posts and more.

The company says it will add watermarking to videos created with its Veo generator so that they can be detected as AI-made.