OpenChatKit, a toolkit for building chatbots, has been released

The open OpenChatKit toolkit has been introduced, aimed at simplifying the creation of chatbots for both specialized and general applications. The system can be adapted to tasks such as answering questions, conducting multi-turn dialogues, summarizing, extracting information, and classifying text. The code is written in Python and distributed under the Apache 2.0 license. The project includes a ready-made model, code for training your own models, utilities for testing model output, tools for supplementing the model with context from an external index, and tools for adapting the base model to your own problems.

The bot is built on the GPT-NeoXT-Chat-Base-20B base model, a language model with about 20 billion parameters that has been fine-tuned for conversational use. The model was trained on data collected by the LAION, Together, and Ontocord.ai projects.
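As a rough illustration, the published checkpoint can be queried through the Hugging Face transformers library. The sketch below is not the project's own interface: the repository id and the `<human>`/`<bot>` turn markers are assumptions based on the project's published materials.

```python
# A minimal sketch of querying GPT-NeoXT-Chat-Base-20B via Hugging Face
# transformers. The repo id and the <human>/<bot> prompt markers are
# assumptions, not a definitive recipe.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "togethercomputer/GPT-NeoXT-Chat-Base-20B"  # assumed HF repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # 20B parameters: roughly 40 GB in fp16
    device_map="auto",          # spread layers across available devices
)

# The chat variant is tuned on dialogue formatted as alternating turns.
prompt = "<human>: Summarize the idea of relativity in one sentence.\n<bot>:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
new_tokens = output[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```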

To extend the built-in knowledge base, a retrieval system is provided that can pull additional information from external repositories, APIs, and other sources; for example, answers can be kept up to date using data from Wikipedia and news feeds. In addition, a 6-billion-parameter moderation model based on GPT-JT is available, designed to filter out inappropriate questions or restrict discussion to certain topics.
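A sketch of how these two auxiliary stages might fit together is shown below: a moderation gate in front of the chat model, plus retrieval of external context from Wikipedia. The repository id, the label strings, and the prompt layout are illustrative assumptions, not the project's exact pipeline.

```python
# A minimal sketch of a moderation gate and Wikipedia-based context retrieval.
# Repo id, labels, and prompt layout are illustrative assumptions.
import wikipedia
from transformers import pipeline

# Assumed Hugging Face repo id for the 6B moderation model based on GPT-JT.
moderator = pipeline(
    "text-generation",
    model="togethercomputer/GPT-JT-Moderation-6B",
    device_map="auto",
)

def is_allowed(message: str) -> bool:
    # Hypothetical convention: the moderation model is itself a language
    # model, so classification is phrased as text completion.
    prompt = f"Possible labels: casual, needs intervention\nInput: {message}\nOutput:"
    out = moderator(prompt, max_new_tokens=5, return_full_text=False)
    return out[0]["generated_text"].strip().startswith("casual")

def build_prompt(question: str) -> str:
    # Fetch a short Wikipedia summary and prepend it as external context.
    hits = wikipedia.search(question)
    context = wikipedia.summary(hits[0], sentences=3) if hits else ""
    return f"Context: {context}\n<human>: {question}\n<bot>:"

if is_allowed("Who discovered penicillin?"):
    print(build_prompt("Who discovered penicillin?"))  # forward to the chat model
```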

Separately worth noting is the ChatLLaMA project, which offers a library for building ChatGPT-like intelligent assistants. The project is developed with an eye toward running on one's own hardware and creating personalized solutions that cover narrow domains (for example, medicine, law, games, scientific research, and so on). The ChatLLaMA code is licensed under GPLv3.

The project supports models based on the LLaMA (Large Language Model Meta AI) architecture proposed by Meta. The full LLaMA model covers 65 billion parameters, but for ChatLLaMA it is recommended to use the 7- and 13-billion-parameter variants, or the GPT-J (6 billion), GPT-NeoX (20 billion), OPT (1.3 billion), BLOOM (7.1 billion), and Galactica (6.7 billion) models. Initially, the LLaMA models were supplied only to researchers upon special request, but since torrents were used to deliver the data, enthusiasts have prepared a script that lets anyone download the model.
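Assuming a recent transformers release with LLaMA support and weights already converted to the Hugging Face format, loading one of the smaller variants might look like the sketch below; the local path is a placeholder, not a distributed file.

```python
# A minimal sketch of loading a 7B LLaMA variant with Hugging Face
# transformers (requires a version with LLaMA support and locally
# converted weights). The path is a placeholder.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

weights_dir = "./llama-7b-hf"  # placeholder: locally converted checkpoint
tokenizer = LlamaTokenizer.from_pretrained(weights_dir)
model = LlamaForCausalLM.from_pretrained(
    weights_dir,
    torch_dtype=torch.float16,  # 7B parameters: roughly 14 GB in fp16
    device_map="auto",
)

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```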

Source: opennet.ru
