In this article we will look at the ins and outs of an I/O reactor and how it works, write an implementation in fewer than 200 lines of code, and make a simple HTTP server handle over 40 million requests per minute.
Preface
The article is written to help you understand how an I/O reactor works, and therefore the risks of using one.
To follow the article you need to know the basics of the C language and have some experience in developing network applications.
All code is written in C strictly according to the C11 standard (caution: long PDF) for Linux and is available on GitHub.
Why is this needed?
As the Internet grew in popularity, web servers had to handle a large number of connections at the same time, so two approaches were tried: blocking I/O on a large number of OS threads, and non-blocking I/O combined with an event notification system, also called a "system selector" (epoll/kqueue/IOCP/etc.).
The first approach creates a new OS thread for each incoming connection. Its disadvantage is poor scalability: the operating system has to perform many context switches and system calls. These are expensive operations, and with an impressive number of connections they can exhaust free RAM.
A modified version allocates a fixed number of threads (a thread pool), which keeps the system from aborting execution but introduces a new problem: if all threads in the pool are currently blocked by long read operations, other sockets that already have data available cannot be served.
The second approach uses the event notification system (system selector) provided by the OS. This article discusses the most common kind of system selector, based on alerts (events, notifications) about readiness for I/O operations rather than on notifications about their completion. A simplified example of its use can be represented by the following block diagram:
The difference between these approaches is as follows:
Blocking I/O operations suspend the user thread until the OS has reassembled the incoming IP packets into a byte stream (TCP, receiving data) or until enough space is available in the internal write buffers for subsequent sending through the NIC (sending data).
The system selector notifies the program some time later that the OS has already reassembled the IP packets (TCP, receiving data) or that enough space is already available in the internal write buffers (sending data).
To sum up, reserving an OS thread for each I/O operation wastes computing power, because in reality the threads are not doing useful work (this is where the term "software interrupt" comes from). The system selector solves this problem and lets the user program use CPU resources much more economically.
The I/O reactor model
The I/O reactor acts as a layer between the system selector and user code. Its principle of operation is described by the following block diagram:
Let me remind you that an event is a notification that a certain socket is able to perform a non-blocking I/O operation.
An event handler is a function that the I/O reactor calls when an event is received, and which then performs a non-blocking I/O operation.
It is important to note that an I/O reactor is single-threaded by definition, but nothing prevents the concept from being used in a multi-threaded environment in a ratio of 1 thread : 1 reactor, thereby utilizing all CPU cores.
Implementation
We will put the public interface in the file reactor.h and the implementation in reactor.c. reactor.h will consist of the following declarations:
#include <stdint.h> /* uint32_t */
#include <time.h>   /* time_t */

typedef struct reactor Reactor;

/*
 * Pointer to a function that the I/O reactor calls when an event arrives
 * from the system selector.
 */
typedef void (*Callback)(void *arg, int fd, uint32_t events);

/*
 * Returns `NULL` on error, a non-`NULL` pointer to a `Reactor` otherwise.
 */
Reactor *reactor_new(void);

/*
 * Releases the system selector, all currently registered sockets and the
 * I/O reactor itself.
 *
 * The following functions return -1 on error and 0 on success.
 */
int reactor_destroy(Reactor *reactor);

int reactor_register(const Reactor *reactor, int fd, uint32_t interest,
                     Callback callback, void *callback_arg);

int reactor_deregister(const Reactor *reactor, int fd);

int reactor_reregister(const Reactor *reactor, int fd, uint32_t interest,
                       Callback callback, void *callback_arg);

/*
 * Runs the event loop with the timeout `timeout`.
 *
 * This function returns control to the caller when the allotted time has
 * elapsed and/or when there are no registered sockets.
 */
int reactor_run(const Reactor *reactor, time_t timeout);
The I/O reactor structure consists of the file descriptor of the epoll selector and a GHashTable hash table that maps each socket to a CallbackData (a structure holding an event handler and the user argument for it).
Note that we rely on the ability to work with an incomplete type through a pointer. In reactor.h we declare the structure reactor, and in reactor.c we define it, which prevents the user from changing its fields directly. This is one of the data-hiding patterns, and it fits neatly into C semantics.
The functions reactor_register, reactor_deregister and reactor_reregister update the list of sockets of interest and the corresponding event handlers in the system selector and in the hash table.
After the I/O reactor intercepts an event for the descriptor fd, it calls the corresponding event handler, passing it fd, the bit mask of the generated events and the user's void pointer.
The reactor_run() function:
int reactor_run(const Reactor *reactor, time_t timeout) {
    int result;
    struct epoll_event *events;
    if ((events = calloc(MAX_EVENTS, sizeof(*events))) == NULL)
        abort();

    time_t start = time(NULL);

    while (true) {
        // Assumption: `timeout` is in seconds, like the values of time().
        // epoll_wait() expects milliseconds and blocks forever on a
        // negative value, so stop as soon as the time is used up.
        time_t remaining = timeout - (time(NULL) - start);
        if (remaining <= 0) {
            result = 0;
            goto cleanup;
        }
        int nfds = epoll_wait(reactor->epoll_fd, events, MAX_EVENTS,
                              (int)(remaining * 1000));

        switch (nfds) {
        // Error
        case -1:
            perror("epoll_wait");
            result = -1;
            goto cleanup;
        // Time is up
        case 0:
            result = 0;
            goto cleanup;
        // Success
        default:
            // Call the event handlers
            for (int i = 0; i < nfds; i++) {
                int fd = events[i].data.fd;

                CallbackData *callback =
                    g_hash_table_lookup(reactor->table, &fd);
                callback->callback(callback->arg, fd, events[i].events);
            }
        }
    }

cleanup:
    free(events);
    return result;
}
To sum up, the chain of function calls in user code will take the following form:
Single-threaded server
To test the I/O reactor under high load, we will write a simple HTTP web server that answers every request with an image.
A quick reference to the HTTP protocol
HTTP is an application-level protocol, used primarily for server-browser interaction.
HTTP can easily be used on top of the TCP transport protocol, sending and receiving messages in the format defined by the specification.
Request format:
<COMMAND> <URI> <HTTP VERSION>CRLF
<HEADER 1>CRLF
...
<HEADER N>CRLF
CRLF
<DATA>
CRLF is a sequence of two characters, \r and \n, that separates the first line of the request, the headers and the data.
<COMMAND> is one of CONNECT, DELETE, GET, HEAD, OPTIONS, PATCH, POST, PUT, TRACE. The browser will send our server the GET command, meaning "send me the contents of the file".
<URI> is the uniform resource identifier. For example, if URI = /index.html, the client is requesting the site's main page.
<HTTP VERSION> is the version of the HTTP protocol in the format HTTP/X.Y. The most commonly used version today is HTTP/1.1.
<HEADER N> is a key-value pair in the format <KEY>: <VALUE>, sent to the server for further analysis.
<DATA> is the data the server needs to perform the operation. It is often simply JSON or some other format.
Response format:
<HTTP VERSION> <STATUS CODE> <STATUS DESCRIPTION>CRLF
<HEADER 1>CRLF
...
<HEADER N>CRLF
CRLF
<DATA>
<STATUS CODE> is a number that represents the result of the operation. Our server will always return status 200 (successful operation).
<STATUS DESCRIPTION> is the string representation of the status code. For status code 200 this is OK.
<HEADER N> is a header in the same format as in the request. We will return the headers Content-Length (file size) and Content-Type: text/html (type of the returned data).
<DATA> is the data requested by the user. In our case, this is the path to the image inside HTML.
The file http_server.c (the single-threaded server) includes the file common.h, which contains the following function prototypes:
#include <stdbool.h>     /* bool */
#include <stdint.h>      /* uint32_t */
#include <stdnoreturn.h> /* noreturn */

/*
 * Event handler that is called once the socket is ready to accept a new
 * connection.
 */
static void on_accept(void *arg, int fd, uint32_t events);

/*
 * Event handler that is called once the socket is ready to send an HTTP
 * response.
 */
static void on_send(void *arg, int fd, uint32_t events);

/*
 * Event handler that is called once the socket is ready to receive a part
 * of an HTTP request.
 */
static void on_recv(void *arg, int fd, uint32_t events);

/*
 * Switches an incoming connection to non-blocking mode.
 */
static void set_nonblocking(int fd);

/*
 * Prints the passed arguments to stderr and exits the process with the
 * code `EXIT_FAILURE`.
 */
static noreturn void fail(const char *format, ...);

/*
 * Returns the file descriptor of a socket capable of accepting new TCP
 * connections.
 */
static int new_server(bool reuse_port);
The function-like macro SAFE_CALL() is also declared there, and the function fail() is defined. The macro compares the value of an expression with an error value and, if they are equal, calls fail():
#define SAFE_CALL(call, error)                                                 \
    do {                                                                       \
        if ((call) == error) {                                                 \
            fail("%s", #call);                                                 \
        }                                                                      \
    } while (false)
The function fail() prints the passed arguments to the terminal (like printf()) and terminates the program with the code EXIT_FAILURE.
The function new_server() returns the file descriptor of a "server" socket that is created with the system calls socket(), bind() and listen() and can accept incoming connections in non-blocking mode.
Note that the socket is created in non-blocking mode from the start, using the SOCK_NONBLOCK flag, so that in the function on_accept() (read on) the accept() system call does not stop the thread.
If reuse_port is true, this function configures the socket with the SO_REUSEPORT option via setsockopt() so that the same port can be used in a multi-threaded environment (see the section "Multi-threaded server").
The event handler on_accept() is called after the OS generates an EPOLLIN event, which in this case means that a new connection can be accepted. on_accept() accepts the new connection, switches it to non-blocking mode and registers it with the on_recv() event handler in the I/O reactor.
The event handler on_recv() is called after the OS generates an EPOLLIN event, which in this case means that the connection registered by on_accept() is ready to receive data.
on_recv() reads data from the connection until the HTTP request has been received in full, then registers the on_send() handler to send the HTTP response. If the client drops the connection, the socket is deregistered and closed with close().
The on_recv() function:
static void on_recv(void *arg, int fd, uint32_t events) {
    RequestBuffer *buffer = arg;

    // Receive input until recv() returns 0 or an error
    ssize_t nread;
    while ((nread = recv(fd, buffer->data + buffer->size,
                         REQUEST_BUFFER_CAPACITY - buffer->size, 0)) > 0)
        buffer->size += nread;

    // The client dropped the connection
    if (nread == 0) {
        SAFE_CALL(reactor_deregister(reactor, fd), -1);
        SAFE_CALL(close(fd), -1);
        request_buffer_destroy(buffer);
        return;
    }

    // recv() returned an error other than the one that means the call
    // would block the thread
    if (errno != EAGAIN && errno != EWOULDBLOCK) {
        request_buffer_destroy(buffer);
        fail("recv");
    }

    // A complete HTTP request has been received from the client. Now
    // register the event handler for sending data
    if (request_buffer_is_complete(buffer)) {
        request_buffer_clear(buffer);
        SAFE_CALL(reactor_reregister(reactor, fd, EPOLLOUT, on_send, buffer),
                  -1);
    }
}
The event handler on_send() is called after the OS generates an EPOLLOUT event, meaning that the connection registered by on_recv() is ready to send data. This function sends the client an HTTP response containing HTML with an image and then switches the event handler back to on_recv().
And finally, in the file http_server.c, in the function main(), we create an I/O reactor with reactor_new(), create the server socket and register it, start the reactor with reactor_run() for exactly one minute, and then release the resources and exit the program.
Let's check that everything works as expected. We compile (chmod a+x compile.sh && ./compile.sh in the project root) and launch the self-written server, open http://127.0.0.1:18470 in a browser and see what we expected:
Let's measure the performance of the single-threaded server. We open two terminals: in one we run ./http_server, in the other wrk. After a minute, the following statistics are displayed in the second terminal:
Our single-threaded server was able to process over 11 million requests per minute coming from 100 connections. Not a bad result, but can it be improved?
Multi-threaded server
As mentioned above, an I/O reactor can be created in separate threads, thereby utilizing all CPU cores. Let's put this approach into practice:
Note that the argument passed to new_server() is true. This means that we assign the SO_REUSEPORT option to the server socket so that it can be used in a multi-threaded environment. You can read more details here.
Second run
Now let's measure the performance of the multi-threaded server:
The number of requests processed in 1 minute increased ~3.28 times! But we fell a few million short of a round number, so let's try to fix that.
First, let's look at the statistics generated by perf:
Using CPU affinity, compiling with -march=native, PGO, increasing the number of cache hits, increasing MAX_EVENTS and using EPOLLET did not give a significant performance gain. But what happens if we increase the number of simultaneous connections?
The desired result is achieved, and with it comes an interesting graph showing how the number of requests processed in 1 minute depends on the number of connections:
We see that after a couple of hundred connections the number of requests processed by both servers drops sharply (in the multi-threaded version this is more noticeable). Is this related to the implementation of the Linux TCP/IP stack? Feel free to write your assumptions about this behavior of the graph, and about optimizations for the multi-threaded and single-threaded versions, in the comments.
As noted in the comments, this performance test does not show how the I/O reactor behaves under real load, because a server almost always interacts with a database, writes logs, uses cryptography with TLS and so on, and as a result the load becomes non-uniform (dynamic). Tests with third-party components will be carried out in the article about the I/O proactor.
Disadvantages of the I/O reactor
You need to understand that the I/O reactor is not without its drawbacks, namely:
Using an I/O reactor in a multi-threaded environment is somewhat harder, because you have to manage the threads manually.
Practice shows that in most cases the load is non-uniform, which can leave one thread idle while another is busy with work.
If one event handler blocks the thread, the system selector itself is blocked too, which can lead to bugs that are hard to track down.
These problems are solved by the I/O proactor, which often has a scheduler that distributes the load evenly across a pool of threads, and which also has a more convenient API. We will talk about it later, in another article of mine.
Conclusion
This is where our journey from theory straight to profiler output comes to an end.
You should not stop here, because there are many other equally interesting approaches to writing network software, with different levels of convenience and speed. Links that I find interesting are given below.