Google has announced an experiment to embed a large language model in Chrome. To give web applications and browser add-ons access to the model, it proposes the Prompt API, which accepts requests in natural language, much like a chatbot. A large language model built into the browser is expected to simplify AI tasks in web applications and to spare developers from installing and managing language models themselves.
The experiment uses Gemini Nano, the most compact model in the Gemini family. Expert models can also be installed; they extend the base model with additional knowledge needed for specific tasks and implement skills such as machine translation and summarization. Models run locally on the user's system without contacting external services.
The runtime that executes the model automatically uses the GPUs and NPUs available in the system to speed up the model, and otherwise falls back to running it on the CPU. The advantages of running the model on the user's system include keeping the processed data confidential, continuing to work offline or over an unreliable connection, lower request latency, and no dependence on external services.

The Prompt API, developed for interacting with the model, supports not only simple one-off requests in natural language but also context-aware processing and classification of data, taking into account the requests and data sent earlier in the session. The model can also be used to select the best option from a set (for example, picking an emoji from a list to match a specific comment on a site). In addition, the API is planned to be extended for content creation and for tasks such as paraphrasing, proofreading, and grammar correction.
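The emoji-selection scenario above can be sketched as follows. Because the API is experimental and its surface may change, the `PromptSession` interface and the `StubSession` class are hypothetical names invented for this illustration and are not Chrome's actual API; the stub stands in for the built-in model so the pattern (one session, a constrained prompt, a validated reply) can be run anywhere.

```typescript
// Hypothetical shape of a prompt session. This is an assumption made for
// illustration, NOT the real Chrome API surface.
interface PromptSession {
  prompt(text: string): Promise<string>;
}

// Stand-in for the built-in model: records every prompt it receives
// (mimicking session history) and always answers with a fixed reply.
class StubSession implements PromptSession {
  readonly history: string[] = [];
  private readonly cannedReply: string;

  constructor(cannedReply: string) {
    this.cannedReply = cannedReply;
  }

  async prompt(text: string): Promise<string> {
    this.history.push(text);
    return this.cannedReply;
  }
}

// Build a prompt that asks the model to choose one emoji from a fixed list.
function buildEmojiPrompt(comment: string, candidates: string[]): string {
  return (
    "Pick the single best emoji for the comment below. " +
    "Answer with one emoji from this list only: " +
    candidates.join(" ") +
    "\nComment: " +
    comment
  );
}

// Accept the reply only if it contains one of the allowed emoji;
// return null when the model answered with something else.
function parseEmojiChoice(reply: string, candidates: string[]): string | null {
  for (const emoji of candidates) {
    if (reply.indexOf(emoji) !== -1) {
      return emoji;
    }
  }
  return null;
}

// Send the prompt through the session and validate the answer.
async function pickEmoji(
  session: PromptSession,
  comment: string,
  candidates: string[]
): Promise<string | null> {
  const reply = await session.prompt(buildEmojiPrompt(comment, candidates));
  return parseEmojiChoice(reply, candidates);
}
```

Validating the reply against the candidate list matters because a language model is not guaranteed to follow the instruction literally; a caller can fall back to a default icon when `pickEmoji` returns `null`.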
In general, two types of API are being developed for interacting with the built-in AI model: Task and Exploratory. Task APIs provide capabilities for specific tasks, such as translating text from one language to another (Translation API) or summarizing a text (Summarization API). Exploratory APIs are intended for building and testing experimental prototypes while new Task APIs are being developed. Work is also underway on a LoRA (Low-Rank Adaptation) API for adapting the weights of the base model to improve its effectiveness on particular tasks.
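For readers unfamiliar with LoRA, the general idea can be shown with toy numbers: the base weight matrix W stays frozen, and a scaled low-rank product B x A is added to it, so only the small matrices A and B have to be trained and distributed. The sketch below illustrates that generic technique only; it says nothing about how Chrome's LoRA API will be exposed, and all names in it are illustrative.

```typescript
type Matrix = number[][];

// Plain matrix product: (n x r) times (r x m) gives (n x m).
function matMul(b: Matrix, a: Matrix): Matrix {
  return b.map((row) =>
    a[0].map((_, col) =>
      row.reduce((sum, value, k) => sum + value * a[k][col], 0)
    )
  );
}

// Adapted weights: W' = W + scale * (B x A).
// A new matrix is returned; the base matrix W is left untouched.
function applyLora(w: Matrix, b: Matrix, a: Matrix, scale: number): Matrix {
  const delta = matMul(b, a);
  return w.map((row, i) =>
    row.map((value, j) => value + scale * delta[i][j])
  );
}

// Toy example: a 3 x 3 base matrix adapted with rank r = 1.
const baseWeights: Matrix = [
  [1, 0, 0],
  [0, 1, 0],
  [0, 0, 1],
];
const loraB: Matrix = [[1], [2], [3]]; // 3 x 1
const loraA: Matrix = [[1, 0, -1]];    // 1 x 3

const adaptedWeights = applyLora(baseWeights, loraB, loraA, 0.5);
// adaptedWeights is [[1.5, 0, -0.5], [1, 1, -1], [1.5, 0, -0.5]]
```

For a d x k weight matrix, a full update has d * k values, while a rank-r adapter has only r * (d + k); the saving is small at this toy size but large for real model layers when r is much smaller than d and k.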
Participation in the experiment is granted after submitting an application. The API is under active development and will be extended and changed in response to user feedback until a final version is adopted. In the future, there are plans to offer more accessible testing through the Origin Trials mechanism, which allows experimental APIs to be used by applications loaded from localhost or 127.0.0.1, or after registering and receiving a special token that is valid for a limited time for a specific site. Work is also underway with other browser vendors to standardize the APIs.
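As a rough illustration of the token-based route: an Origin Trial token is normally delivered to the browser either in an `Origin-Trial` HTTP response header or in a `<meta http-equiv="origin-trial">` tag in the page. The helper names below are hypothetical, and the token string is a placeholder rather than a real token.

```typescript
// Escape characters that would break a double-quoted HTML attribute value.
function escapeAttribute(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// Build the <meta> tag that carries an Origin Trial token in the page head.
function originTrialMetaTag(token: string): string {
  return (
    '<meta http-equiv="origin-trial" content="' +
    escapeAttribute(token) +
    '">'
  );
}

// Build the equivalent HTTP response header as a [name, value] pair.
function originTrialHeader(token: string): [string, string] {
  return ["Origin-Trial", token];
}

// Placeholder value; a real token is issued after registering a site.
const placeholderToken = "TOKEN_GOES_HERE";
const metaTag = originTrialMetaTag(placeholderToken);
const header = originTrialHeader(placeholderToken);
```

Either form is sufficient on its own; the meta tag is convenient for static pages, while the header suits sites that already control their server responses.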
Source: opennet.ru
