Linux has many tools for debugging the kernel and applications. Most of them hurt application performance and cannot be used in production.
A few years ago another tool appeared: eBPF.
There are already many utilities built on eBPF, and in this article we will look at how to write your own profiling utility using the BCC library.
Ceph Is Slow
A new host was added to the Ceph cluster. After migrating some of the data to it, we noticed that it processed requests much more slowly than the other servers.
Unlike the other platforms, this host used bcache and the new Linux 4.15 kernel. It was the first time a host with this configuration had been used here, and at that point it was clear that the root of the problem could be anywhere.
Investigating the Host
Let's start by looking at what happens inside the ceph-osd process. For this we will use [...]
The picture tells us that fdatasync() spent a long time submitting requests in generic_make_request(). This means the cause of the problem most likely lies somewhere outside the OSD daemon itself: it could be the kernel or the disks. The iostat output showed high request latency on the bcache devices.
While examining the host, we found that the systemd-udevd daemon was consuming a lot of CPU time: about 20% on several cores. This is odd behavior, so we need to find out why. Since systemd-udevd works with uevents, we decided to watch them with udevadm monitor. It turned out that a large number of change events were being generated for every block device in the system. This is quite unusual, so we have to find out what generates all these events.
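To get a feel for how many change events each device receives, the monitor output can be tallied with a few lines of Python. This is a minimal sketch that assumes the usual udevadm monitor line layout (source, timestamp, action, device path, subsystem). The sample lines and device names are made up for illustration.

```python
import re
from collections import Counter

# Assumed line layout, e.g.:
#   KERNEL[1042.318210] change   /devices/virtual/block/bcache0 (block)
LINE_RE = re.compile(
    r"^(KERNEL|UDEV)\s*\[\s*[\d.]+\]\s+(\w+)\s+(\S+)\s+\((\w+)\)"
)

def count_block_changes(lines, source="KERNEL"):
    """Count 'change' events per block device for one event source."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # banner lines and blank lines are skipped
        src, action, devpath, subsystem = match.groups()
        if src == source and action == "change" and subsystem == "block":
            # The last path component is the kernel device name
            counts[devpath.rsplit("/", 1)[-1]] += 1
    return counts

# Hypothetical sample input, not taken from the affected host
SAMPLE = [
    "KERNEL[1042.318210] change   /devices/virtual/block/bcache0 (block)",
    "UDEV  [1042.325114] change   /devices/virtual/block/bcache0 (block)",
    "KERNEL[1042.330455] change   /devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda (block)",
    "KERNEL[1042.340001] add      /devices/virtual/net/veth0 (net)",
    "KERNEL[1043.318900] change   /devices/virtual/block/bcache0 (block)",
]

for device, count in count_block_changes(SAMPLE).most_common():
    print(device, count)
```

On a live system the same function could be fed the lines read from udevadm monitor instead of SAMPLE.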
Using the BCC Toolkit
As we have already found out, the kernel (and the ceph daemon inside a system call) spends a lot of time in generic_make_request(). Let's measure the latency of this function using the BCC toolkit.
This function usually runs quickly. All it does is pass the request to the device driver's queue.
Bcache is a complex device that actually consists of three devices:
- a backing device (the cached disk), in this case a slow HDD;
- a caching device (the caching disk), here a partition of an NVMe device;
- the bcache virtual device that the application works with.
We know that request submission is slow, but for which of these devices? We will deal with this a bit later.
We now know that the uevents are likely to cause problems. Finding what generates them is not so easy. Let's assume it is some piece of software that is launched periodically. Let's see what programs are run on the system using the execsnoop script from the same BCC toolkit.
For example, like this:
/usr/share/bcc/tools/execsnoop | tee ./execdump
We won't show the full execsnoop output here, but one line of interest to us looked like this:
sh 1764905 5802 0 sudo arcconf getconfig 1 AD | grep Temperature | awk -F '[:/]' '{print $2}' | sed 's/^ \([0-9]*\) C.*/\1/'
The third column is the PPID (parent PID) of the process. The process with PID 5802 turned out to be one of the threads of our monitoring system. When checking the monitoring system configuration, incorrect parameters were found. The temperature of the HBA adapter was being read every 30 seconds, which is far more often than necessary. After changing the check interval to a longer one, we found that request latency on this host no longer stood out compared with the other hosts.
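Picking the parent out of a saved execsnoop dump is easy to automate. The sketch below assumes the five-column layout seen in the line above (command, PID, PPID, return value, arguments) and that the command name contains no spaces. The sample row is a shortened copy of that line.

```python
from collections import Counter

def parse_execsnoop_line(line):
    """Split one execsnoop row into its columns: PCOMM PID PPID RET ARGS."""
    pcomm, pid, ppid, ret, args = line.split(None, 4)
    return {
        "pcomm": pcomm,
        "pid": int(pid),
        "ppid": int(ppid),
        "ret": int(ret),
        "args": args,
    }

def launches_by_parent(lines):
    """Count how many processes each parent PID started."""
    counts = Counter()
    for line in lines:
        try:
            counts[parse_execsnoop_line(line)["ppid"]] += 1
        except ValueError:
            continue  # header row or a line that does not fit the layout
    return counts

# Shortened copy of the row shown in the text
ROW = "sh 1764905 5802 0 sudo arcconf getconfig 1 AD"
record = parse_execsnoop_line(ROW)
print(record["ppid"], record["args"])
```

Running launches_by_parent over the whole dump shows which parent starts processes most often, which is a quick way to spot a periodic job.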
But it is still unclear why the bcache device was so slow. We prepared a test platform with an identical configuration and tried to reproduce the problem by running fio on bcache while periodically running udevadm trigger to generate uevents.
Writing BCC-Based Tools
Let's try to write a simple utility that traces and prints the slowest calls to generic_make_request(). We are also interested in the name of the disk the function was called for.
The plan is simple:
- Register a kprobe on generic_make_request():
  - save the disk name, available through the function's argument;
  - save the timestamp.
- Register a kretprobe on the return from generic_make_request():
  - get the current timestamp;
  - look up the saved timestamp and compare it with the current one;
  - if the difference is greater than the given threshold, look up the saved disk name and print it to the terminal.
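Before touching the kernel, the plan above can be rehearsed in plain Python. The sketch below is only a user-space model of the entry/return bookkeeping: the class name, method names, thread IDs and timestamps are all hypothetical, and no probes are involved.

```python
class LatencyTracer:
    """Toy model of the kprobe/kretprobe bookkeeping described above."""

    def __init__(self, threshold_us):
        self.threshold_us = threshold_us
        self.inflight = {}  # key (e.g. thread ID) -> (disk name, entry time in ns)
        self.slow = []      # (key, disk name, latency in us) for slow calls

    def on_entry(self, key, disk, now_ns):
        # Entry probe: remember the disk name and the timestamp
        self.inflight[key] = (disk, now_ns)

    def on_return(self, key, now_ns):
        # Return probe: find the matching entry and compare timestamps
        entry = self.inflight.pop(key, None)
        if entry is None:
            return  # a return without a recorded entry is ignored
        disk, start_ns = entry
        latency_us = (now_ns - start_ns) // 1000
        if latency_us > self.threshold_us:
            self.slow.append((key, disk, latency_us))

# Two interleaved calls from different threads (made-up numbers)
tracer = LatencyTracer(threshold_us=1000)
tracer.on_entry(101, "sdb", 0)
tracer.on_entry(102, "nvme0n1p1", 100)
tracer.on_return(102, 200_100)    # 200 us, below the threshold
tracer.on_return(101, 5_000_000)  # 5000 us, reported
print(tracer.slow)
```

Keying the saved state by the calling thread is what keeps the two interleaved calls from being mixed up.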
Kprobes and kretprobes use a breakpoint mechanism to change the function's code on the fly. You can read [...]
The eBPF program text inside the Python script looks like this:
bpf_text = """ # Here will be the bpf program code """
To exchange data between functions, eBPF programs use hash tables. We will do the same, with a structure of our own as the value:
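#include <uapi/linux/ptrace.h>
#include <linux/blkdev.h>

// Per-call state shared between the entry and return probes
struct data_t {
    u64 pid;
    u64 ts;
    char comm[TASK_COMM_LEN];
    u64 lat;
    char disk[DISK_NAME_LEN];
};

BPF_HASH(p, u64, struct data_t);
BPF_PERF_OUTPUT(events);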
Here we register a hash table named p with a u64 key and a value of type struct data_t. The table will be accessible from the context of our BPF program. The BPF_PERF_OUTPUT macro registers another table named events, which is used to pass data to user space.
When measuring the latency between a function call and its return, or between calls to different functions, you must make sure that the data you saved and the data you read later belong to the same context. In other words, you have to keep possible parallel executions of the function in mind. We could measure the latency between a call in the context of one process and a return in the context of another, but that is most likely useless. A good example here would be [...]
Next, we need to write the code that will run when the function is called:
void start(struct pt_regs *ctx, struct bio *bio) {
    u64 pid = bpf_get_current_pid_tgid();  // 64-bit value used as the hash key
    struct data_t data = {};
    u64 ts = bpf_ktime_get_ns();           // entry timestamp, nanoseconds
    data.pid = pid;
    data.ts = ts;
    // Copy the disk name out of the bio passed to generic_make_request()
    bpf_probe_read_str(&data.disk, sizeof(data.disk), (void*)bio->bi_disk->disk_name);
    p.update(&pid, &data);
}
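Here the first argument of the traced function is passed in as the second argument of our handler, start(). The handler records the PID and the current timestamp in nanoseconds in a freshly initialized struct data_t, reads the disk name from the bio structure, and stores the result in the hash table.
The following function will be called on return from generic_make_request():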
void stop(struct pt_regs *ctx) {
    u64 pid = bpf_get_current_pid_tgid();
    u64 ts = bpf_ktime_get_ns();
    struct data_t* data = p.lookup(&pid);
    if (data != 0 && data->ts > 0) {
        bpf_get_current_comm(&data->comm, sizeof(data->comm));
        data->lat = (ts - data->ts)/1000;  // nanoseconds to microseconds
        if (data->lat > MIN_US) {
            FACTOR
            data->pid >>= 32;              // keep only the thread group ID
            events.perf_submit(ctx, data, sizeof(struct data_t));
        }
        p.delete(&pid);
    }
}
This function is similar to the previous one: we get the PID of the process and the timestamp, but we do not allocate memory for a new data structure. Instead, we search the hash table for an existing structure using key == current PID. If the structure is found, we get the name of the running process and add it to the structure.
The binary shift we use here is needed to get the thread group ID, i.e. the PID of the main process that started the thread in whose context we are running. The function we call, bpf_get_current_pid_tgid(), returns both the thread group ID and the thread's own ID in a single 64-bit value.
When printing to the terminal we are not interested in the individual thread, only in the main process. After comparing the resulting latency with the given threshold, we pass our data structure to user space through the events table, and then delete the entry from p.
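The shift is easier to see with concrete numbers. The sketch below reproduces the arithmetic in Python: the thread group ID sits in the upper 32 bits of the value returned by bpf_get_current_pid_tgid() and the thread ID in the lower 32 bits. The helper name and the thread ID 5810 are made up for illustration.

```python
def split_pid_tgid(pid_tgid):
    """Split the 64-bit value into (thread group ID, thread ID)."""
    tgid = pid_tgid >> 32        # upper 32 bits: the PID of the main process
    tid = pid_tgid & 0xFFFFFFFF  # lower 32 bits: the individual thread
    return tgid, tid

# A thread 5810 that belongs to process 5802 (hypothetical thread ID)
combined = (5802 << 32) | 5810
print(split_pid_tgid(combined))
```

Using the full 64-bit value as the hash key while printing only the upper half keeps concurrent threads of one process apart in the table without cluttering the output.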
In the Python script that loads this code, we need to replace MIN_US and FACTOR with the latency threshold and the time-unit conversion, which we pass in through the arguments:
bpf_text = bpf_text.replace('MIN_US', str(min_usec))
if args.milliseconds:
    bpf_text = bpf_text.replace('FACTOR', 'data->lat /= 1000;')
    label = "msec"
else:
    bpf_text = bpf_text.replace('FACTOR', '')
    label = "usec"
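The snippet above relies on args and min_usec, whose definitions are not shown here. One possible way to provide them is sketched below. The option names, the positional threshold and its default of 1000 are assumptions, not necessarily what the original script uses.

```python
import argparse

def parse_args(argv=None):
    """Hypothetical command line: [-m] [min_usec]."""
    parser = argparse.ArgumentParser(
        description="Trace slow generic_make_request() calls")
    parser.add_argument("-m", "--milliseconds", action="store_true",
                        help="print latency in milliseconds")
    parser.add_argument("min_usec", nargs="?", type=int, default=1000,
                        help="report calls slower than this many microseconds")
    return parser.parse_args(argv)

def fill_template(template, args):
    """Apply the same MIN_US/FACTOR substitutions as in the text."""
    text = template.replace("MIN_US", str(args.min_usec))
    if args.milliseconds:
        return text.replace("FACTOR", "data->lat /= 1000;"), "msec"
    return text.replace("FACTOR", ""), "usec"

demo_args = parse_args(["-m", "2000"])
demo_text, demo_label = fill_template(
    "if (data->lat > MIN_US) { FACTOR }", demo_args)
print(demo_text, demo_label)
```

Plain string replacement works here because MIN_US and FACTOR are unique tokens in the program text.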
Now we need to prepare the BPF program via BPF() and attach the probes:
from bcc import BPF

b = BPF(text=bpf_text)
b.attach_kprobe(event="generic_make_request", fn_name="start")
b.attach_kretprobe(event="generic_make_request", fn_name="stop")
We also need to define struct data_t in our script, otherwise we won't be able to read anything:
import ctypes as ct

TASK_COMM_LEN = 16  # linux/sched.h
DISK_NAME_LEN = 32  # linux/genhd.h

class Data(ct.Structure):
    _fields_ = [("pid", ct.c_ulonglong),
                ("ts", ct.c_ulonglong),
                ("comm", ct.c_char * TASK_COMM_LEN),
                ("lat", ct.c_ulonglong),
                ("disk", ct.c_char * DISK_NAME_LEN)]
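The Python-side structure has to mirror the C layout field for field, and this can be checked without loading any BPF program. The sketch below builds a raw 72-byte record by hand and decodes it with the same ctypes definition. The field values are made up, and the helper name decode_event is hypothetical.

```python
import ctypes as ct
import struct

TASK_COMM_LEN = 16
DISK_NAME_LEN = 32

class Data(ct.Structure):
    _fields_ = [("pid", ct.c_ulonglong),
                ("ts", ct.c_ulonglong),
                ("comm", ct.c_char * TASK_COMM_LEN),
                ("lat", ct.c_ulonglong),
                ("disk", ct.c_char * DISK_NAME_LEN)]

def decode_event(raw):
    """Interpret a raw byte buffer as one Data record."""
    return Data.from_buffer_copy(raw)

# Hand-built record in native byte order: pid, ts, comm, lat, disk
raw = struct.pack("=QQ16sQ32s", 5802, 123456789, b"ceph-osd", 8, b"sdb")
event = decode_event(raw)
print(ct.sizeof(Data), event.pid, event.comm, event.lat, event.disk)
```

If a field is added on the C side but not here, the sizes stop matching and every field after the change decodes as garbage, which is why the two definitions must be kept in sync.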
The final step is printing the data to the terminal:
def print_event(cpu, data, size):
    global start
    event = ct.cast(data, ct.POINTER(Data)).contents
    if start == 0:
        start = event.ts
    time_s = (float(event.ts - start)) / 1000000000
    # comm and disk arrive as bytes; decode them so Python 3 prints plain text
    print("%-18.9f %-16s %-6d %-1s %s %s" % (time_s, event.comm.decode("utf-8", "replace"), event.pid, event.lat, label, event.disk.decode("utf-8", "replace")))

b["events"].open_perf_buffer(print_event)
# format output
start = 0
while 1:
    try:
        b.perf_buffer_poll()
    except KeyboardInterrupt:
        exit()
The script itself is available at [...]
Finally! Now we see that what looked like a stalling bcache device is actually a stalling generic_make_request() call for the backing disk.
Digging into the Kernel
What exactly is slowing things down during request submission? We see that the delay occurs even before request accounting starts, i.e. the accounting of a specific request for later statistics output (/proc/diskstats or iostat) has not yet begun. This can be easily verified by running iostat while reproducing the problem, or [...]
If we look at generic_make_request(), we see that two other functions are called before request accounting starts. The first, generic_make_request_checks(), checks the validity of the request against the disk settings. The second is blk_queue_enter(), which contains the following wait:
ret = wait_event_interruptible(q->mq_freeze_wq,
    (atomic_read(&q->mq_freeze_depth) == 0 &&
     (preempt || !blk_queue_preempt_only(q))) ||
    blk_queue_dying(q));
In it, the kernel waits for the queue to be unfrozen. Let's measure the latency of blk_queue_enter():
~# /usr/share/bcc/tools/funclatency blk_queue_enter -i 1 -m
Tracing 1 functions for "blk_queue_enter"... Hit Ctrl-C to end.
msecs : count distribution
0 -> 1 : 341 |****************************************|
msecs : count distribution
0 -> 1 : 316 |****************************************|
msecs : count distribution
0 -> 1 : 255 |****************************************|
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 1 | |
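The row labels in this output (0 -> 1, 2 -> 3, 4 -> 7, 8 -> 15) are power-of-two buckets. The sketch below reproduces that bucketing in Python to show why a single slow call lands in the 8 -> 15 row. It only mimics the labels, the real tool does its counting elsewhere, and the sample values are made up.

```python
from collections import Counter

def log2_bucket(value):
    """Return the (low, high) power-of-two range that contains value."""
    if value <= 1:
        return (0, 1)
    low = 1 << (value.bit_length() - 1)  # largest power of two <= value
    return (low, 2 * low - 1)

def histogram(samples):
    """Count samples per power-of-two bucket."""
    return Counter(log2_bucket(v) for v in samples)

def format_rows(hist):
    """Format non-empty buckets as 'low -> high : count' rows."""
    return ["%d -> %d : %d" % (low, high, count)
            for (low, high), count in sorted(hist.items())]

# 255 fast calls and one call that took 12 ms (hypothetical values)
latencies_ms = [0] * 255 + [12]
for row in format_rows(histogram(latencies_ms)):
    print(row)
```

Unlike the output above, this sketch omits the empty rows between the first and the last non-empty bucket.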
It looks like we are close to the answer. The functions used to freeze/unfreeze the queue are [...]
The time needed to drain the queue is equivalent to the disk latency, because the kernel waits for all queued operations to complete. Once the queue is empty, the settings are applied. After that, [...] is called.
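The effect on a submitter can be illustrated with a toy timeline. This is a deliberately simplified model, not kernel code: a request that arrives while the queue is frozen waits until the in-flight requests have drained and the settings have been applied. The function name, parameters and all numbers are hypothetical.

```python
def submit_delay_ms(arrival_ms, freeze_start_ms, inflight_done_ms, apply_ms=0.0):
    """Delay seen by a request that arrives at arrival_ms.

    freeze_start_ms:  moment the freeze begins
    inflight_done_ms: completion times of requests in flight at that moment
    apply_ms:         time spent applying the settings after the drain
    """
    # The queue is drained when the last in-flight request completes
    drained_ms = max([freeze_start_ms] + list(inflight_done_ms))
    unfreeze_ms = drained_ms + apply_ms
    if freeze_start_ms <= arrival_ms < unfreeze_ms:
        return unfreeze_ms - arrival_ms  # blocked until the queue is unfrozen
    return 0.0                           # arrived before or after the freeze

# Freeze at t=0 with two slow HDD requests completing at 4 ms and 9 ms
print(submit_delay_ms(1.0, 0.0, [4.0, 9.0]))   # arrives during the freeze
print(submit_delay_ms(10.0, 0.0, [4.0, 9.0]))  # arrives after the unfreeze
```

In this model the waiting time is bounded by the slowest in-flight request, which matches the observation that a slow backing device makes the stall long.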
Now we know enough to fix the situation. The udevadm trigger command causes the block device settings to be applied. These settings are described in the udev rules. We can find out which settings freeze the queue by trying to change them through sysfs or by looking at the kernel source code. We can also try the BCC trace utility, which prints the kernel and user-space stacks for every call to blk_freeze_queue, for example:
~# /usr/share/bcc/tools/trace blk_freeze_queue -K -U
PID TID COMM FUNC
3809642 3809642 systemd-udevd blk_freeze_queue
blk_freeze_queue+0x1 [kernel]
elevator_switch+0x29 [kernel]
elv_iosched_store+0x197 [kernel]
queue_attr_store+0x5c [kernel]
sysfs_kf_write+0x3c [kernel]
kernfs_fop_write+0x125 [kernel]
__vfs_write+0x1b [kernel]
vfs_write+0xb8 [kernel]
sys_write+0x55 [kernel]
do_syscall_64+0x73 [kernel]
entry_SYSCALL_64_after_hwframe+0x3d [kernel]
__write_nocancel+0x7 [libc-2.23.so]
[unknown]
3809631 3809631 systemd-udevd blk_freeze_queue
blk_freeze_queue+0x1 [kernel]
queue_requests_store+0xb6 [kernel]
queue_attr_store+0x5c [kernel]
sysfs_kf_write+0x3c [kernel]
kernfs_fop_write+0x125 [kernel]
__vfs_write+0x1b [kernel]
vfs_write+0xb8 [kernel]
sys_write+0x55 [kernel]
do_syscall_64+0x73 [kernel]
entry_SYSCALL_64_after_hwframe+0x3d [kernel]
__write_nocancel+0x7 [libc-2.23.so]
[unknown]
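With many stacks in the output, it helps to reduce each one to the sysfs handler that led to the freeze. The sketch below scans trace output of the shape shown above and, for every blk_freeze_queue stack, keeps the first kernel frame whose name ends in _store. The sample is a trimmed copy of the output above and the function name is hypothetical.

```python
import re

# Matches kernel frames such as "elv_iosched_store+0x197 [kernel]"
FRAME_RE = re.compile(r"^\s*(\w+)\+0x[0-9a-f]+\s+\[kernel\]")

def freeze_callers(trace_text):
    """Return the first *_store frame of each blk_freeze_queue stack."""
    callers = []
    waiting_for_store = False
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line)
        if not match:
            continue
        func = match.group(1)
        if func == "blk_freeze_queue":
            waiting_for_store = True  # a new stack starts here
        elif waiting_for_store and func.endswith("_store"):
            callers.append(func)
            waiting_for_store = False
    return callers

SAMPLE = """\
3809642 3809642 systemd-udevd blk_freeze_queue
blk_freeze_queue+0x1 [kernel]
elevator_switch+0x29 [kernel]
elv_iosched_store+0x197 [kernel]
queue_attr_store+0x5c [kernel]
sysfs_kf_write+0x3c [kernel]
3809631 3809631 systemd-udevd blk_freeze_queue
blk_freeze_queue+0x1 [kernel]
queue_requests_store+0xb6 [kernel]
queue_attr_store+0x5c [kernel]
sysfs_kf_write+0x3c [kernel]
"""

print(freeze_callers(SAMPLE))
```

These two handlers appear to correspond to writes to the scheduler and nr_requests attributes under the device's queue directory in sysfs, which points at the udev rules that set them.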
Udev rules change quite rarely, and this usually happens in a controlled manner. So we see that even applying values that are already set causes a spike in the latency of passing a request from the application to the disk. Of course, generating udev events when there are no changes in the disk configuration (for example, no device is being attached or detached) is not good practice. However, we can help the kernel avoid unnecessary work and not freeze the request queue when it is not needed.
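The same "do nothing if nothing changes" idea can be illustrated from user space. The sketch below is only an analogy and not the kernel-side change: a hypothetical helper that writes a value to an attribute file only when it differs from the current one. A temporary file stands in for a sysfs attribute such as nr_requests, and the values are made up.

```python
import tempfile
from pathlib import Path

def write_if_changed(attr_path, new_value):
    """Write new_value to attr_path only if it differs; return True if written."""
    path = Path(attr_path)
    current = path.read_text().strip()
    if current == str(new_value).strip():
        return False  # same value: skip the write and the work it triggers
    path.write_text("%s\n" % new_value)
    return True

# Demonstration on a temporary file standing in for a sysfs attribute
with tempfile.TemporaryDirectory() as tmp:
    attr = Path(tmp) / "nr_requests"
    attr.write_text("256\n")
    first = write_if_changed(attr, 256)   # unchanged, no write
    second = write_if_changed(attr, 128)  # changed, written
    final = attr.read_text().strip()

print(first, second, final)
```

Attributes whose read format differs from their write format would need a smarter comparison than this plain string check.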
Conclusion
eBPF is a very flexible and powerful tool. In this article we looked at one practical case and showed a small part of what can be done with it. If you are interested in developing BCC utilities, it is worth taking a look at [...]
There are also other interesting eBPF-based tools for debugging and profiling. One of them is [...]
Source: www.habr.com