OpenChatKit, a toolkit for building chatbots, has been released

The open-source toolkit OpenChatKit has been introduced, aimed at simplifying the creation of chatbots for both specialized and general-purpose use. The system is adapted to tasks such as answering questions, conducting multi-turn dialogues, summarizing, extracting information, and classifying text. The code is written in Python and distributed under the Apache 2.0 license. The project includes a ready-made model, code for training your own model, utilities for testing the model's results, tools for supplementing the model with context from an external index, and tools for adapting the base model to your own tasks.

The bot is based on a machine learning model (GPT-NeoXT-Chat-Base-20B), built on a language model with about 20 billion parameters and optimized for conversational interaction. The model was trained on data obtained from the collections of the LAION, Together, and Ontocord.ai projects.

To extend the existing knowledge base, a system is provided that can retrieve additional information from external repositories, APIs, and other sources. For example, information can be kept up to date using data from Wikipedia and news feeds. An optional moderation model, with 6 billion parameters and based on the GPT-JT model, is available to filter out inappropriate questions or restrict discussions to specific topics.
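
The interplay of retrieval and moderation can be illustrated with a minimal sketch. This is not OpenChatKit's actual code or API: the document list, the blocked-term filter, the word-overlap scoring, the prompt layout, and all function names below are hypothetical stand-ins chosen only to show the general idea of checking a question first and then prepending retrieved context to it.

```python
import re

# Hypothetical in-memory "external source"; a real system would query
# a search index, an API, or a news feed instead.
DOCUMENTS = [
    "OpenChatKit is released under the Apache 2.0 license.",
    "GPT-NeoXT-Chat-Base-20B has about 20 billion parameters.",
    "The moderation model is based on GPT-JT and has 6 billion parameters.",
]

# Hypothetical keyword filter standing in for the moderation model.
BLOCKED_TERMS = {"password", "exploit"}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def is_allowed(question):
    """Reject questions that contain any blocked term."""
    return not (tokenize(question) & BLOCKED_TERMS)


def retrieve(question, documents, top_k=1):
    """Return the top_k documents sharing the most tokens with the question."""
    query = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(query & tokenize(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, documents=DOCUMENTS):
    """Return a context-augmented prompt, or None if the question is filtered."""
    if not is_allowed(question):
        return None
    context = "\n".join(retrieve(question, documents))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    print(build_prompt("What license is OpenChatKit released under?"))
    print(build_prompt("Tell me the admin password"))
```

In a real deployment the keyword filter would be replaced by a call to a moderation model and the overlap scoring by a proper search index; the sketch only shows where each step sits in the pipeline.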

Separately, we can mention the ChatLLaMA project, which offers a library for building ChatGPT-like intelligent assistants. The project is being developed with an eye to running on your own hardware and creating personalized solutions designed to cover narrow areas of knowledge (for example, medicine, law, games, scientific research, and so on). The ChatLLaMA code is licensed under GPLv3.

The project supports models based on the LLaMA (Large Language Model Meta AI) architecture proposed by Meta. The full LLaMA model has 65 billion parameters, but for ChatLLaMA it is recommended to use the variants with 7 and 13 billion parameters, or the GPTJ (6 billion), GPTNeoX (1.3 billion, 20 billion), OPT (13 billion), BLOOM (7.1 billion), and Galactica (6.7 billion) models. Initially, LLaMA models were supplied only to researchers on special request, but since torrents were used to deliver the data, enthusiasts prepared a script that allows anyone to download the model.
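
To give a rough sense of what these parameter counts mean for running a model on your own hardware, the following sketch estimates the memory needed for the weights alone. It assumes 16-bit weights (2 bytes per parameter), which is an assumption of this example and not a figure from the ChatLLaMA project; activations, optimizer state, and other overhead are ignored, so real requirements are higher.

```python
# Assumption: weights are stored as 16-bit floats (2 bytes per parameter).
BYTES_PER_PARAM_FP16 = 2


def weight_memory_gb(params_billions, bytes_per_param=BYTES_PER_PARAM_FP16):
    """Approximate memory for the weights alone, in GB (10**9 bytes)."""
    total_bytes = params_billions * 1e9 * bytes_per_param
    return total_bytes / 1e9


if __name__ == "__main__":
    # Parameter counts taken from the article text.
    for name, size in [("LLaMA 7B", 7), ("LLaMA 13B", 13), ("LLaMA 65B", 65)]:
        print(f"{name}: ~{weight_memory_gb(size):.0f} GB for weights")
```

Under this assumption, the 7 and 13 billion parameter variants need roughly 14 GB and 26 GB for weights, against about 130 GB for the full 65 billion parameter model.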

source: budenet.ru
