The scientific poke method, or how to choose a database configuration using benchmarks and an optimization algorithm

Hello.

I decided to share what I found: the fruit of thinking, trial and error.
By and large this is no discovery, of course. All of it should have been known for a long time to those who work in applied statistical data processing and in the optimization of any systems, not specifically DBMSs.
And yes, they do know, and they write interesting articles about their research, for example (UPD: in the comments, readers pointed out a very interesting project: ottertune).
On the other hand, offhand I do not see any widespread mention or adoption of this approach on the internet among IT specialists and DBAs.

So, to the point.

Suppose we have a task: to configure a certain service system so that it serves a certain kind of work.

What is known about this work: what it is, how the quality of this work is measured, and what the criterion for measuring that quality is.

Let us also assume that it is more or less known and understood how the work is performed in (or with) this service system.

"More or less" means that it is possible to prepare (or obtain somewhere) a tool, utility, or service that can be synthesized and applied to the system with a test load that is adequate enough to what will be in production, under conditions adequate enough to production.

Well, suppose a set of tuning parameters for this service system is known, which can be used to tune the system with respect to the productivity of its work.

And what is the problem? There is no sufficiently complete understanding of this service system that would let you expertly configure its settings for the future load on a given platform and get the required performance.

Well. This is almost always the case.

What can be done here?

Well, the first thing that comes to mind is to look at the documentation for this system. Understand what the acceptable ranges of the tuning parameter values are. And then, for example, using the coordinate descent method, select values for the system parameters in tests.

That is, give the system some configuration in the form of a specific set of values for its configuration parameters.

Apply a test load to it, using that very tool, the load generator.
And look at the value: the response, or the quality metric of the system.
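A minimal sketch of that loop, assuming a hypothetical `measure` callback that stands for "apply the configuration, run the load generator, read the metric". The parameter names `a` and `b` and the toy metric are invented for illustration only.

```python
from typing import Callable, Dict, List


def coordinate_descent(
    grid: Dict[str, List[float]],
    measure: Callable[[Dict[str, float]], float],
    passes: int = 2,
) -> Dict[str, float]:
    """Tune one parameter at a time while the others stay fixed."""
    # Start from the first allowed value of every parameter.
    current = {name: values[0] for name, values in grid.items()}
    best = measure(current)
    for _ in range(passes):
        for name, values in grid.items():
            for value in values:
                if value == current[name]:
                    continue
                candidate = dict(current, **{name: value})
                score = measure(candidate)  # one full load test per call
                if score > best:
                    best, current = score, candidate
    return current


# Hypothetical stand-in for a real load test; its maximum is at a=3, b=20.
def toy_metric(cfg: Dict[str, float]) -> float:
    return -(cfg["a"] - 3) ** 2 - (cfg["b"] - 20) ** 2


result = coordinate_descent({"a": [1, 2, 3, 4], "b": [10, 20, 30]}, toy_metric)
print(result)
```

Every call to `measure` costs one complete load test, which is what makes the method slow on a real system.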

The second thought may be the conclusion that this takes a very long time.

That is: if there are many tuning parameters, if the ranges of values to be covered are large, and if each individual load test takes a long time to complete, then yes, all of this can take an unacceptably long time.
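A rough back-of-the-envelope estimate shows how quickly this grows. The numbers below are assumptions chosen for illustration (14 parameters, 4 candidate values each, 8 minutes per load test, 3 passes of coordinate descent); restart overhead between tests is ignored.

```python
from math import prod


def exhaustive_runs(values_per_parameter):
    """Number of load tests needed to try every combination."""
    return prod(values_per_parameter)


def coordinate_descent_runs(values_per_parameter, passes):
    """Upper bound on load tests when tuning one parameter at a time."""
    return passes * sum(values_per_parameter)


def hours(runs, minutes_per_test):
    return runs * minutes_per_test / 60


grid = [4] * 14        # assumption: 14 parameters, 4 candidate values each
minutes_per_test = 8   # assumption: one load test takes 8 minutes

full = exhaustive_runs(grid)
partial = coordinate_descent_runs(grid, passes=3)
print(full, "runs,", hours(full, minutes_per_test), "hours")
print(partial, "runs,", hours(partial, minutes_per_test), "hours")
```

Under these assumptions the full grid is far out of reach, and even coordinate descent needs about a day of continuous testing.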

Well, here is what you can understand and remember.

You can notice that the set of values of the service system's settings is a vector, an ordered sequence of values.

Each such vector, other things being equal (in that they are not affected by this vector), corresponds to a completely definite value of the metric: the indicator of the quality of the system's operation under the test load.

That is:

Let us denote the system configuration vector as $S = (s_1, s_2, \dots, s_n)$, where $s_i$ is the value of the $i$-th configuration parameter and $n$ is the number of configuration parameters of the system.

And let us denote the value of the metric corresponding to this $S$ as
$M$; then we get a function: $M = f(S)$
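Written as an optimization problem, the task can be sketched as follows. The notation is chosen here for illustration: $S$ is the configuration vector, $f$ is the unknown mapping from a configuration to the metric, and $l_i$, $u_i$ are the lower and upper bounds of the $i$-th parameter taken from the documentation.

```latex
S^{*} = \arg\max_{S \in D} f(S),
\qquad
D = \{\, (s_1, \dots, s_n) : l_i \le s_i \le u_i,\ i = 1, \dots, n \,\}
```

Each evaluation of $f$ is one load test, so the practical goal is to approach $S^{*}$ with as few evaluations as possible.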

Well then, everything immediately comes down to, in my case, the algorithms for finding the extremum of a function, almost forgotten since my student days.

Fine, but here an organizational and applied question arises: which algorithm to use.

  1. In the sense of having to code less by hand.
  2. And so that it works, i.e. finds the extremum (if there is one), well, at least faster than coordinate descent.

The first point hints that we need to look at environments in which such algorithms are already implemented and, in some form, ready for use in code.
Well, I know python and cran-r

The second point means that you need to read about the algorithms themselves: what they are, what their requirements are, and what the peculiarities of their operation are.

Also consider what else they give you: there can be useful side results, either directly from the algorithm itself.

Or derivable from the outcome of its run.

A lot depends on the input conditions.

For example, if for some reason you need to get a result faster, you should look towards gradient descent algorithms and choose one of them.

Or, if time is not so important, you can, for example, use stochastic optimization methods, such as a genetic algorithm.

I propose to consider how this approach works, selecting a system configuration using a genetic algorithm, in the following, so to speak, laboratory work.
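To make the idea concrete, here is a minimal sketch of a real-valued genetic algorithm with elitism, tournament selection, blend crossover, and uniform mutation. It is not the GA package used later in the article; the function name, the default probabilities, and the toy fitness function are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple


def genetic_maximize(
    fitness: Callable[[Sequence[float]], float],
    lower: Sequence[float],
    upper: Sequence[float],
    pop_size: int = 10,
    generations: int = 50,
    p_crossover: float = 0.8,
    p_mutation: float = 0.1,
    seed: int = 1,
) -> Tuple[List[float], List[float]]:
    """Return the best vector found and the best fitness per generation."""
    rng = random.Random(seed)
    n = len(lower)
    population = [
        [rng.uniform(lower[i], upper[i]) for i in range(n)] for _ in range(pop_size)
    ]
    history: List[float] = []
    for _ in range(generations):
        # A real load test is expensive; a real implementation would cache scores.
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        next_gen = [ranked[0][:]]  # elitism: the best vector always survives
        while len(next_gen) < pop_size:
            # Tournament selection: the better of two random individuals.
            a = max(rng.sample(ranked, 2), key=fitness)
            b = max(rng.sample(ranked, 2), key=fitness)
            child = a[:]
            if rng.random() < p_crossover:
                w = rng.random()  # blend crossover keeps the child within bounds
                child = [w * x + (1 - w) * y for x, y in zip(a, b)]
            for i in range(n):
                if rng.random() < p_mutation:
                    child[i] = rng.uniform(lower[i], upper[i])
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


# Toy fitness with its maximum at (3, -1); a stand-in for a measured metric.
def toy_fitness(v: Sequence[float]) -> float:
    return -(v[0] - 3) ** 2 - (v[1] + 1) ** 2


best, history = genetic_maximize(toy_fitness, [-10, -10], [10, 10])
print(best, history[-1])
```

Because of elitism the best fitness never decreases from one generation to the next, which is the property that makes the later "metric over successive tests" curve trend upwards.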

Initial setup:

  1. Let there be, as the service system: oracle xe 18c
  2. Let it serve transactional activity, with the goal of obtaining the highest possible throughput of the DBMS, in transactions per second.
  3. Transactions can differ greatly in the nature of their work with data and in the context of that work.
    Let us agree that these are transactions that do not process a large amount of table data.
    In the sense that they do not generate more undo data than redo, and do not process large percentages of rows or large tables.

These are transactions that change one row in a more or less large table, with a small number of indexes on that table.

In this situation, the performance of the DBMS in processing transactions will, with a reservation, be determined by the quality of its redo processing.

The reservation: if we are talking specifically about DBMS settings.

Because, in the general case, there may be, for example, transactional locks between SQL sessions, caused by the design of the user's work with the table data and/or by the table model.

These will of course have a depressing effect on the tps metric, and this will be an exogenous factor relative to the DBMS: well, that is how the table model and the work with the data in it were designed, so that blocking occurs.

Therefore, for the purity of the experiment, we will exclude this factor, and below I will explain exactly how.

  1. Let us assume, for definiteness, that 100% of the SQL commands submitted to the database are DML commands.
    Let the characteristics of the user's work with the DBMS be the same across tests.
    Namely: the number of SQL sessions, the table data, and how the SQL sessions work with it.
  2. The DBMS runs in FORCE LOGGING and ARCHIVELOG modes. Flashback database mode is turned off at the DBMS level.
  3. Redo logs: located in a separate file system, on a separate "disk";
    The rest of the physical components of the database: in another, separate file system, on a separate "disk":

More details about the physical layout of the lab database components

SQL> select status||' '||name from v$controlfile;
 /db/u14/oradata/XE/control01.ctl
SQL> select GROUP#||' '||MEMBER from v$logfile;
1 /db/u02/oradata/XE/redo01_01.log
2 /db/u02/oradata/XE/redo02_01.log
SQL> select FILE_ID||' '||TABLESPACE_NAME||' '||round(BYTES/1024/1024,2)||' '||FILE_NAME as col from dba_data_files;
4 UNDOTBS1 2208 /db/u14/oradata/XE/undotbs1_01.dbf
2 SLOB 128 /db/u14/oradata/XE/slob01.dbf
7 USERS 5 /db/u14/oradata/XE/users01.dbf
1 SYSTEM 860 /db/u14/oradata/XE/system01.dbf
3 SYSAUX 550 /db/u14/oradata/XE/sysaux01.dbf
5 MONITOR 128 /db/u14/oradata/XE/monitor.dbf
SQL> !cat /proc/mounts | egrep "/db/u[0-2]"
/dev/vda1 /db/u14 ext4 rw,noatime,nodiratime,data=ordered 0 0
/dev/mapper/vgsys-ora_redo /db/u02 xfs rw,noatime,nodiratime,attr2,nobarrier,inode64,logbsize=256k,noquota 0 0

Initially, for these load conditions, I wanted to use the transactional load utility SLOB
It has a wonderful feature; I will quote the author:

At the heart of SLOB is the "SLOB method." The SLOB Method aims to test platforms
without application contention. One cannot drive maximum hardware performance
using application code that is, for example, bound by application locking or even
sharing Oracle Database blocks. That's right, there is overhead when sharing data
in data blocks! But SLOB, in its default deployment, is immune to such contention.

This statement holds true; that is how it is.
It is convenient to regulate the degree of parallelism of the SQL sessions: this is the -t key of the runit.sh utility from SLOB
The percentage of DML commands, among the SQL statements that each session sends to the DBMS, is regulated by the UPDATE_PCT parameter
Separately, and very conveniently: SLOB itself, before and after the load session, takes a statspack or AWR snapshot (whichever it is configured to take).

However, it turned out that SLOB does not support SQL sessions with a duration of less than 30 seconds.
So I first coded my own no-frills version of the loader, and it then stayed in use.

Let me explain what the loader does and how it does it, for clarity.
In essence, the loader looks like this:

Worker code

function dotx()
{
local v_period="$2"
[ -z "$v_period" ] && v_period="0"
source "/home/oracle/testingredotracе/config.conf"

$ORACLE_HOME/bin/sqlplus -S system/${v_system_pwd} << __EOF__
whenever sqlerror exit failure
set verify off
set echo off
set feedback off

define wnum="$1"
define period="$v_period"
set appinfo worker_&&wnum

declare
 v_upto number;
 v_key  number;
 v_tots number;
 v_cts  number;
begin
 select max(col1) into v_upto from system.testtab_&&wnum;
 SELECT (( SYSDATE - DATE '1970-01-01' ) * 86400 ) into v_cts FROM DUAL;
 v_tots := &&period + v_cts;
 while v_cts <= v_tots
 loop
  v_key:=abs(mod(dbms_random.random,v_upto));
  if v_key=0 then
   v_key:=1;
  end if;
  update system.testtab_&&wnum t
  set t.object_name=translate(dbms_random.string('a', 120), 'abcXYZ', '158249')
  where t.col1=v_key
  ;
  commit;
  SELECT (( SYSDATE - DATE '1970-01-01' ) * 86400 ) into v_cts FROM DUAL;
 end loop;
end;
/

exit
__EOF__
}
export -f dotx

The workers are launched like this:

Launching the workers

echo "starting test, duration: ${TEST_DURATION}" >> "$v_logfile"
for((i=1;i<="$SQLSESS_COUNT";i++))
do
 echo "sql-session: ${i}" >> "$v_logfile"
 dotx "$i" "${TEST_DURATION}" &
done
echo "waiting..." >> "$v_logfile"
wait
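The same launch pattern, sketched in Python for readers who prefer it to bash: start N workers, let each loop until a shared deadline, then wait for all of them. The `do_transaction` callback is a hypothetical placeholder for the per-worker update-and-commit; nothing here talks to a real database.

```python
import threading
import time
from typing import Callable, Dict


def worker(worker_id: int, duration_s: float,
           do_transaction: Callable[[int], None], counts: Dict[int, int]) -> None:
    """Repeat one transaction against the worker's own table until time is up."""
    deadline = time.monotonic() + duration_s
    done = 0
    while time.monotonic() <= deadline:
        do_transaction(worker_id)
        done += 1
    counts[worker_id] = done


def run_load(n_workers: int, duration_s: float,
             do_transaction: Callable[[int], None]) -> Dict[int, int]:
    counts: Dict[int, int] = {}
    threads = [
        threading.Thread(target=worker, args=(i, duration_s, do_transaction, counts))
        for i in range(1, n_workers + 1)
    ]
    for t in threads:
        t.start()
    for t in threads:   # the equivalent of the shell's `wait`
        t.join()
    return counts


# Placeholder transaction that does nothing; a real one would update and commit.
counts = run_load(n_workers=3, duration_s=0.1, do_transaction=lambda worker_id: None)
print(counts)
```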

And the tables for the workers are prepared like this:

Creating the tables

function createtable() {
source "/home/oracle/testingredotracе/config.conf"
$ORACLE_HOME/bin/sqlplus -S system/${v_system_pwd} << __EOF__
whenever sqlerror continue
set verify off
set echo off
set feedback off

define wnum="$1"
define ts_name="slob"

begin
 execute immediate 'drop table system.testtab_&&wnum';
exception when others then null;
end;
/

create table system.testtab_&&wnum tablespace &&ts_name as
select rownum as col1, t.*
from sys.dba_objects t
where rownum<1000
;
create index testtab_&&wnum._idx on system.testtab_&&wnum (col1);
--alter table system.testtab_&&wnum nologging;
--alter index system.testtab_&&wnum._idx nologging;
exit
__EOF__
}
export -f createtable

seq 1 1 "$SQLSESS_COUNT" | xargs -n 1 -P 4 -I {} -t bash -c "createtable "{}"" | tee -a "$v_logfile"
echo "createtable done" >> "$v_logfile"

That is, for each worker (practically: a separate SQL session in the database) a separate table is created, which that worker works with.

This ensures the absence of transactional locks between the worker sessions.
Each worker does the same thing, with its own table; the tables are all identical.
All workers do the work for the same amount of time.
Moreover, for long enough that, for example, a log switch definitely occurs, and more than once.
And, accordingly, so that the associated costs and effects show up.
In my case, I set the duration of the workers' run to 8 minutes.

A fragment of a statspack report describing the operation of the DBMS under load

Database    DB Id    Instance     Inst Num  Startup Time   Release     RAC
~~~~~~~~ ----------- ------------ -------- --------------- ----------- ---
          2929910313 XE                  1 07-Sep-20 23:12 18.0.0.0.0  NO

Host Name             Platform                CPUs Cores Sockets   Memory (G)
~~~~ ---------------- ---------------------- ----- ----- ------- ------------
     billing.izhevsk1 Linux x86 64-bit           2     2       1         15.6

Snapshot       Snap Id     Snap Time      Sessions Curs/Sess Comment
~~~~~~~~    ---------- ------------------ -------- --------- ------------------
Begin Snap:       1630 07-Sep-20 23:12:27       55        .7
  End Snap:       1631 07-Sep-20 23:20:29       62        .6
   Elapsed:       8.03 (mins) Av Act Sess:       8.4
   DB time:      67.31 (mins)      DB CPU:      15.01 (mins)

Cache Sizes            Begin        End
~~~~~~~~~~~       ---------- ----------
    Buffer Cache:     1,392M              Std Block Size:         8K
     Shared Pool:       288M                  Log Buffer:   103,424K

Load Profile              Per Second    Per Transaction    Per Exec    Per Call
~~~~~~~~~~~~      ------------------  ----------------- ----------- -----------
      DB time(s):                8.4                0.0        0.00        0.20
       DB CPU(s):                1.9                0.0        0.00        0.04
       Redo size:        7,685,765.6              978.4
   Logical reads:           60,447.0                7.7
   Block changes:           47,167.3                6.0
  Physical reads:                8.3                0.0
 Physical writes:              253.4                0.0
      User calls:               42.6                0.0
          Parses:               23.2                0.0
     Hard parses:                1.2                0.0
W/A MB processed:                1.0                0.0
          Logons:                0.5                0.0
        Executes:           15,756.5                2.0
       Rollbacks:                0.0                0.0
    Transactions:            7,855.1
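The metric of interest is the "Transactions" row of the Load Profile. A minimal sketch of pulling that number out of the report text follows; the article does not show how its own getmetric helper works, so this parser is only an assumption about one possible approach.

```python
import re


def transactions_per_second(report_text: str) -> float:
    """Extract the per-second value from the 'Transactions:' Load Profile row."""
    match = re.search(
        r"^\s*Transactions:\s+([\d,]+(?:\.\d+)?)", report_text, re.MULTILINE
    )
    if match is None:
        raise ValueError("no 'Transactions:' row found in the report")
    # Statspack prints thousands separators, e.g. 7,855.1
    return float(match.group(1).replace(",", ""))


sample = """\
       Rollbacks:                0.0                0.0
    Transactions:            7,855.1
"""
tps = transactions_per_second(sample)
print(tps)
```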

Returning to the lab work.
Other things being equal, we will vary the values of the following parameters of the lab database:

  1. Size of the redo log groups. Value range: [32, 1024] MB;
  2. Number of redo log groups in the database. Value range: [2,32];
  3. log_archive_max_processes value range: [1,8];
  4. commit_logging two values are allowed: batch|immediate;
  5. commit_wait two values are allowed: wait|nowait;
  6. log_buffer value range: [2,128] MB.
  7. log_checkpoint_timeout value range: [60,1200] seconds
  8. db_writer_processes value range: [1,4]
  9. undo_retention value range: [30;300] seconds
  10. transactions_per_rollback_segment value range: [1,8]
  11. disk_asynch_io two values are allowed: true|false;
  12. filesystemio_options the following values are allowed: none|setall|directIO|asynch;
  13. db_block_checking the following values are allowed: OFF|LOW|MEDIUM|FULL;
  14. db_block_checksum the following values are allowed: OFF|TYPICAL|FULL;
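The list above can be captured as a small data structure, which is handy for validating candidate settings before touching the database. This is a sketch only; the keys `rg_size` and `rg_count` for the first two items are borrowed from the names used in the R code, and the `("range", ...)` / `("choice", ...)` tagging scheme is an assumption.

```python
SEARCH_SPACE = {
    "rg_size": ("range", 32, 1024),              # redo log group size, MB
    "rg_count": ("range", 2, 32),                # number of redo log groups
    "log_archive_max_processes": ("range", 1, 8),
    "commit_logging": ("choice", ["batch", "immediate"]),
    "commit_wait": ("choice", ["wait", "nowait"]),
    "log_buffer": ("range", 2, 128),             # MB
    "log_checkpoint_timeout": ("range", 60, 1200),   # seconds
    "db_writer_processes": ("range", 1, 4),
    "undo_retention": ("range", 30, 300),        # seconds
    "transactions_per_rollback_segment": ("range", 1, 8),
    "disk_asynch_io": ("choice", ["true", "false"]),
    "filesystemio_options": ("choice", ["none", "setall", "directIO", "asynch"]),
    "db_block_checking": ("choice", ["OFF", "LOW", "MEDIUM", "FULL"]),
    "db_block_checksum": ("choice", ["OFF", "TYPICAL", "FULL"]),
}


def is_allowed(name, value):
    """Check a candidate value against the declared range or list of choices."""
    spec = SEARCH_SPACE[name]
    if spec[0] == "range":
        return spec[1] <= value <= spec[2]
    return value in spec[1]


print(len(SEARCH_SPACE), is_allowed("log_buffer", 64), is_allowed("commit_wait", "maybe"))
```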

A person experienced in maintaining Oracle databases can certainly already say what, and to which values, should be set, from the listed parameters and their acceptable values, in order to get higher database performance for the work with data described by the application code above.

But.

The point of the lab work is to show that the optimization algorithm itself will figure this out for us relatively quickly.

For us, it remains only to look into the documentation for the system being configured, just enough to find out which parameters to change and in what ranges.
And also: to write the code that will be used to work with the system being configured from the chosen optimization algorithm.

So, now about the code.
I mentioned cran-r above; that is, all manipulations with the system being configured are orchestrated as an R script.

The actual task, the analysis and selection by metric value of the system state vectors: this is the GA package (documentation)
The package, in this case, is not a very good fit, in the sense that it expects the vectors (chromosomes, in the package's terms) to be specified as sequences of real numbers (numbers with a fractional part).

And my vector of setting parameter values is 14 quantities: integers and string values.

The problem, of course, is easily circumvented by assigning specific numbers to the string values.
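One way to do that mapping is to split the numeric range of a gene into equal-width buckets, one per string value. The sketch below is a generalization written for illustration, not the author's exact thresholds: the R code further down uses rounded values with strict "greater than" comparisons, so values exactly on a boundary can land in a different bucket there.

```python
from typing import Sequence


def decode_choice(gene: float, choices: Sequence[str], upper: float) -> str:
    """Map a real-valued gene from [0, upper] onto one of the string choices."""
    width = upper / len(choices)
    index = int(gene // width)
    # Clamp so that gene == upper (or slightly out of range) stays valid.
    index = max(0, min(index, len(choices) - 1))
    return choices[index]


options = ["none", "setall", "directIO", "asynch"]
print(decode_choice(0, options, 40))
print(decode_choice(15, options, 40))
print(decode_choice(40, options, 40))
```

The optimizer keeps working with plain numbers, and only the fitness function needs to know what the numbers mean.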

So, in the end, the main piece of the R script looks like this:

Calling GA::ga

cat( "", file=v_logfile, sep="\n", append=F)

pSize = 10
elitism_value=1
pmutation_coef=0.8
pcrossover_coef=0.1
iterations=50

gam=GA::ga(type="real-valued", fitness=evaluate,
lower=c(32,2, 1,1,1,2,60,1,30,1,0,0, 0,0), upper=c(1024,32, 8,10,10,128,800,4,300,8,10,40, 40,30),
popSize=pSize,
pcrossover = pcrossover_coef,
pmutation = pmutation_coef,
maxiter=iterations,
run=4,
keepBest=T)
cat( "GA-session is done" , file=v_logfile, sep="\n", append=T)
gam@solution

Here, with the help of the lower and upper attributes of the ga subroutine, the region of the search space is essentially specified, within which the search will be carried out for the vector (or vectors) for which the maximum value of the fitness function is obtained.

The ga subroutine performs a search that maximizes the fitness function.

Well, it turns out that, in this case, the fitness function, interpreting the vector as a set of values for certain DBMS parameters, must obtain a metric from the DBMS.

That is: how many transactions per second the DBMS processes, with the given DBMS configuration and the given load on the DBMS.

That is, when unfolded, the following multi-step procedure must be carried out inside the fitness function:

  1. Processing the input vector of numbers: converting it into values for the DBMS parameters.
  2. An attempt to create the given number of redo groups of the given size. Moreover, the attempt may fail.
    The redo log groups that already existed in the DBMS, in some number and of some size, must be removed for the purity of the experiment.
  3. If the previous step succeeded: setting the values of the configuration parameters in the database (again: there may be a failure)
  4. If the previous step succeeded: stopping the DBMS and starting the DBMS so that the newly specified parameter values take effect. (again: there may be a failure)
  5. If the previous step succeeded: perform the load test and get the metric from the DBMS.
  6. Return the DBMS to its original state, i.e. remove the additional redo log groups and restore the original database configuration.
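The control flow of these steps can be sketched compactly: every step may fail, a failure yields a fitness of zero, and the restore step runs in all cases. The sketch below uses injected callables and fake actions so it can run without a database; the action names, the helper `make_fake_actions`, and the metric value 7800.0 are all hypothetical.

```python
from typing import Callable, Dict, List, Optional


def evaluate(groups: Dict[str, int], parameters: Dict[str, object],
             actions: Dict[str, Callable]) -> float:
    """Run one fitness evaluation; return 0.0 if any preparation step fails."""
    try:
        if actions["make_groups"](groups) != 0:           # step 2
            return 0.0
        for name, value in parameters.items():            # step 3
            if actions["set_parameter"](name, value) != 0:
                return 0.0
        if actions["restart"]() != 0:                      # step 4
            return 0.0
        actions["run_test"]()                              # step 5
        return float(actions["get_metric"]())
    finally:
        actions["restore"]()                               # step 6, always runs


def make_fake_actions(calls: List[str], failing: Optional[str] = None,
                      metric: float = 7800.0) -> Dict[str, Callable]:
    """Fake actions that only record their name; one of them can be made to fail."""
    def action(name: str, result: float = 0):
        def run(*args):
            calls.append(name)
            return 1 if name == failing else result
        return run
    names = ["make_groups", "set_parameter", "restart", "run_test", "restore"]
    fakes = {name: action(name) for name in names}
    fakes["get_metric"] = action("get_metric", metric)
    return fakes


ok_calls: List[str] = []
ok = evaluate({"rg_size": 512, "rg_count": 4}, {"commit_wait": "NOWAIT"},
              make_fake_actions(ok_calls))
bad_calls: List[str] = []
bad = evaluate({"rg_size": 512, "rg_count": 4}, {"commit_wait": "NOWAIT"},
               make_fake_actions(bad_calls, failing="restart"))
print(ok, ok_calls)
print(bad, bad_calls)
```

Putting the restore step in a `finally` clause is a design choice of this sketch: it avoids repeating the cleanup sequence in every failure branch.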

Fitness function code

evaluate=function(p_par) {
v_module="evaluate"
v_metric=0
opn=NULL
opn$rg_size=round(p_par[1],digit=0)
opn$rg_count=round(p_par[2],digit=0)
opn$log_archive_max_processes=round(p_par[3],digit=0)
opn$commit_logging="BATCH"
if ( round(p_par[4],digit=0) > 5 ) {
 opn$commit_logging="IMMEDIATE"
}
opn$commit_logging=paste("'", opn$commit_logging, "'",sep="")

opn$commit_wait="WAIT"
if ( round(p_par[5],digit=0) > 5 ) {
 opn$commit_wait="NOWAIT"
}
opn$commit_wait=paste("'", opn$commit_wait, "'",sep="")

opn$log_buffer=paste(round(p_par[6],digit=0),"m",sep="")
opn$log_checkpoint_timeout=round(p_par[7],digit=0)
opn$db_writer_processes=round(p_par[8],digit=0)
opn$undo_retention=round(p_par[9],digit=0)
opn$transactions_per_rollback_segment=round(p_par[10],digit=0)
opn$disk_asynch_io="true"
if ( round(p_par[11],digit=0) > 5 ) {
 opn$disk_asynch_io="false"
} 

opn$filesystemio_options="none"
if ( round(p_par[12],digit=0) > 10 && round(p_par[12],digit=0) <= 20 ) {
 opn$filesystemio_options="setall"
}
if ( round(p_par[12],digit=0) > 20 && round(p_par[12],digit=0) <= 30 ) {
 opn$filesystemio_options="directIO"
}
if ( round(p_par[12],digit=0) > 30 ) {
 opn$filesystemio_options="asynch"
}

opn$db_block_checking="OFF"
if ( round(p_par[13],digit=0) > 10 && round(p_par[13],digit=0) <= 20 ) {
 opn$db_block_checking="LOW"
}
if ( round(p_par[13],digit=0) > 20 && round(p_par[13],digit=0) <= 30 ) {
 opn$db_block_checking="MEDIUM"
}
if ( round(p_par[13],digit=0) > 30 ) {
 opn$db_block_checking="FULL"
}

opn$db_block_checksum="OFF"
if ( round(p_par[14],digit=0) > 10 && round(p_par[14],digit=0) <= 20 ) {
 opn$db_block_checksum="TYPICAL"
}
if ( round(p_par[14],digit=0) > 20 ) {
 opn$db_block_checksum="FULL"
}

v_vector=paste(round(p_par[1],digit=0),round(p_par[2],digit=0),round(p_par[3],digit=0),round(p_par[4],digit=0),round(p_par[5],digit=0),round(p_par[6],digit=0),round(p_par[7],digit=0),round(p_par[8],digit=0),round(p_par[9],digit=0),round(p_par[10],digit=0),round(p_par[11],digit=0),round(p_par[12],digit=0),round(p_par[13],digit=0),round(p_par[14],digit=0),sep=";")
cat( paste(v_module," try to evaluate vector: ", v_vector,sep="") , file=v_logfile, sep="\n", append=T)

rc=make_additional_rgroups(opn)
if ( rc!=0 ) {
 cat( paste(v_module," make_additional_rgroups failed",sep="") , file=v_logfile, sep="\n", append=T)
 return (0)
}

v_rc=0
rc=set_db_parameter("log_archive_max_processes", opn$log_archive_max_processes)
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("commit_logging", opn$commit_logging )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("commit_wait", opn$commit_wait )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("log_buffer", opn$log_buffer )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("log_checkpoint_timeout", opn$log_checkpoint_timeout )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("db_writer_processes", opn$db_writer_processes )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("undo_retention", opn$undo_retention )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("transactions_per_rollback_segment", opn$transactions_per_rollback_segment )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("disk_asynch_io", opn$disk_asynch_io )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("filesystemio_options", opn$filesystemio_options )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("db_block_checking", opn$db_block_checking )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("db_block_checksum", opn$db_block_checksum )
if ( rc != 0 ) {  v_rc=1 }

# v_rc accumulates failures from all set_db_parameter calls above
if ( v_rc!=0 ) {
 cat( paste(v_module," can not startup db with that vector of settings",sep="") , file=v_logfile, sep="\n", append=T)
 rc=stop_db("immediate")
 rc=create_spfile()
 rc=start_db("")
 rc=remove_additional_rgroups(opn)
 return (0)
}

rc=stop_db("immediate")
rc=start_db("")
if ( rc!=0 ) {
 cat( paste(v_module," can not startup db with that vector of settings",sep="") , file=v_logfile, sep="\n", append=T)
 rc=stop_db("abort")
 rc=create_spfile()
 rc=start_db("")
 rc=remove_additional_rgroups(opn)
 return (0)
}

rc=run_test()
v_metric=getmetric()

rc=stop_db("immediate")
rc=create_spfile()
rc=start_db("")
rc=remove_additional_rgroups(opn)

cat( paste("result: ",v_metric," ",v_vector,sep="") , file=v_logfile, sep="\n", append=T)
return (v_metric)
}

That is, all the work is performed in the fitness function.

The ga subroutine processes vectors, or, more correctly, chromosomes.
In which, what matters most to us is the selection of chromosomes with genes for which the fitness function produces large values.

This, in essence, is the process of searching for the optimal set of chromosomes using a vector in an N-dimensional search space.

There is a very clear, detailed explanation, with examples of R code, of how a genetic algorithm works.

I would like to note separately two technical points.

The auxiliary calls from the evaluate function, for example stop/start or setting the value of a DBMS parameter, are performed on the basis of the cran-r function system2

With its help, some bash script or command is called.

For example:

set_db_parameter

set_db_parameter=function(p1, p2) {
v_module="set_db_parameter"
v_cmd="/home/oracle/testingredotracе/set_db_parameter.sh"
v_args=paste(p1," ",p2,sep="")

x=system2(v_cmd, args=v_args, stdout=T, stderr=T, wait=T)
if ( length(attributes(x)) > 0 ) {
 cat(paste(v_module," failed with: ",attributes(x)$status," ",v_cmd," ",v_args,sep=""), file=v_logfile, sep="\n", append=T)
 return (attributes(x)$status)
}
else {
 cat(paste(v_module," ok: ",v_cmd," ",v_args,sep=""), file=v_logfile, sep="\n", append=T)
 return (0)
}
}
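For comparison, the same "call a helper and return its exit status" pattern in Python uses `subprocess.run`. The sketch below is an illustration only: instead of the author's shell scripts it launches the Python interpreter itself, so that it can run anywhere.

```python
import subprocess
import sys
from typing import Sequence


def run_helper(command: Sequence[str]) -> int:
    """Run an external command and return its exit status (0 means success)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    if completed.returncode != 0:
        print(f"failed with {completed.returncode}: {' '.join(command)}")
    return completed.returncode


# Stand-ins for a helper script that succeeds and one that fails.
ok_status = run_helper([sys.executable, "-c", "pass"])
failed_status = run_helper([sys.executable, "-c", "import sys; sys.exit(3)"])
print(ok_status, failed_status)
```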

The second point is the line, in the evaluate function, that saves the specific metric value and its corresponding tuning vector to the log file:

cat( paste("result: ",v_metric," ",v_vector,sep="") , file=v_logfile, sep="\n", append=T)

This is important, because from this data array it will be possible to obtain additional information about which components of the tuning vector have a greater or lesser effect on the metric value.

That is: it will be possible to conduct an attribute-importance analysis.
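Turning those log lines into a table is straightforward. A minimal sketch, assuming the "result: metric vector" layout written by the fitness function, with the vector components separated by semicolons; the two sample lines are made up for illustration and are not measurements.

```python
from typing import Iterable, List


def parse_results(lines: Iterable[str]) -> List[List[float]]:
    """Return one row per test: the metric followed by the vector components."""
    rows = []
    for line in lines:
        if not line.startswith("result: "):
            continue  # skip progress and error messages
        _, metric, vector = line.strip().split(" ", 2)
        rows.append([float(metric)] + [float(v) for v in vector.split(";")])
    return rows


# Made-up sample lines in the same layout as the log file.
log_lines = [
    "evaluate try to evaluate vector: 512;8;4;3;7;64;300;2;120;4;2;15;5;12",
    "result: 1000.5 512;8;4;3;7;64;300;2;120;4;2;15;5;12",
    "result: 0 64;2;1;9;2;4;60;1;30;1;8;35;25;5",
]
rows = parse_results(log_lines)
print(len(rows), rows[0][:3])
```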

So what can come of it?

In graph form, if you order the tests by ascending metric value, the picture looks like this:

[Figure: tps metric value across the tests, ordered by ascending metric]

Some data corresponding to the extreme values of the metric:
[Screenshot: tuning vectors for the extreme metric values]
Here, in the screenshot with the results, I will clarify: the values of the tuning vector are given in terms of the fitness function code, not in terms of the numbered list of parameters and their value ranges formulated above in the text.

Well. Whether ~8 thousand tps is a lot or a little is a separate question.
Within the framework of the lab work, this figure is not important; what matters is the dynamics, how this value changes.

The dynamics here are good.
It is obvious that at least one factor significantly influences the value of the metric, and the ga algorithm, sorting through the chromosome vectors, has covered it.
Judging by the rather vigorous fluctuation of the curve values, there is at least one more factor that, although much smaller, also has an effect.

This is where you need attribute-importance analysis, to understand which attributes (well, in this case, components of the tuning vector) influence the metric value, and by how much.
And from this information: to understand which factors were affected by changes in the significant attributes.

Attribute importance can be computed in different ways.

For these purposes, I like the randomForest algorithm, from the R package of the same name (documentation)
randomForest, as I understand its work in general and its approach to assessing the importance of attributes in particular, builds a certain model of the dependence of the response variable on the attributes.

In our case, the response variable is the metric obtained from the database in the load tests: tps;
And the attributes are the components of the tuning vector.

So randomForest assesses the importance of each model attribute with two numbers: %IncMSE, which is how the presence/absence of this attribute in the model changes the MSE quality of this model (Mean Squared Error);

And IncNodePurity, which is a number that reflects how well, based on the values of this attribute, a dataset with observations can be split so that one part contains data with one value of the metric being explained, and the other part with another value of the metric.
Well, that is: to what extent this is a classifying attribute (I saw the clearest Russian-language explanation of RandomForest here).
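The idea behind a permutation-style importance score such as %IncMSE can be shown without any library: shuffle one attribute, leave the rest alone, and see how much the model's error grows. This is a simplified sketch; randomForest itself computes the score per tree on out-of-bag samples and normalizes it, and the toy model and data below are invented for illustration.

```python
import random
from typing import Callable, List, Sequence


def mse(model: Callable[[Sequence[float]], float],
        rows: List[List[float]], targets: List[float]) -> float:
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model: Callable[[Sequence[float]], float],
                           rows: List[List[float]], targets: List[float],
                           column: int, seed: int = 0) -> float:
    """Increase in MSE after shuffling the values of one attribute."""
    rng = random.Random(seed)
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
    return mse(model, permuted, targets) - mse(model, rows, targets)


# Toy data: the target depends only on attribute 0; attribute 1 is noise.
rows = [[float(i), float((i * 7) % 10)] for i in range(10)]
toy_model = lambda r: 10.0 * r[0]
targets = [toy_model(r) for r in rows]

importance_0 = permutation_importance(toy_model, rows, targets, column=0)
importance_1 = permutation_importance(toy_model, rows, targets, column=1)
print(importance_0, importance_1)
```

An attribute the model does not use gets a score of zero, while shuffling an attribute the model relies on makes the error grow.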

No-frills R code for processing the dataset with the results of the load tests:

x=NULL
v_data_file=paste('/tmp/data1.dat',sep="")
x=read.table(v_data_file, header = TRUE, sep = ";", dec=",", quote = "\"'", stringsAsFactors=FALSE)
colnames(x)=c('metric','rgsize','rgcount','lamp','cmtl','cmtw','lgbffr','lct','dbwrp','undo_retention','tprs','disk_async_io','filesystemio_options','db_block_checking','db_block_checksum')

idxTrain=sample(nrow(x),as.integer(nrow(x)*0.7))
idxNotTrain=which(! 1:nrow(x) %in% idxTrain )
TrainDS=x[idxTrain,]
ValidateDS=x[idxNotTrain,]

library(randomForest)
#mtry=as.integer( sqrt(dim(x)[2]-1) )
rf=randomForest(metric ~ ., data=TrainDS, ntree=40, mtry=3, replace=T, nodesize=2, importance=T, do.trace=10, localImp=F)
ValidateDS$predicted=predict(rf, newdata=ValidateDS[,colnames(ValidateDS)!="metric"], type="response")
sum((ValidateDS$metric-ValidateDS$predicted)^2)
rf$importance

You can select the hyperparameters of the algorithm directly by hand and, focusing on the quality of the model, choose the model that most accurately makes predictions on the validation dataset.
You can write some kind of function for this work (by the way, again, using some kind of optimization algorithm).

You can use the R package caret; that is not the important point.

As a result, in this case, the following result is obtained for assessing the degree of importance of the attributes:

[Figure: attribute importance estimates from randomForest]

Well. So we can begin the global reflections:

  1. It turns out that the most significant parameter, under these testing conditions, was commit_wait
    Technically, it specifies the execution mode of the io operation that writes redo data from the DBMS log buffer to the current log group: synchronous or asynchronous.
    The value nowait results in an almost vertical, multiple increase in the value of the tps metric: this is the switching on of asynchronous io mode for the redo groups.
    Whether you should do this in a production database is a separate question. Here I limit myself to stating: this is a significant factor.
  2. It is logical that the size of the DBMS log buffer turns out to be a significant factor.
    The smaller the size of the log buffer, the smaller its buffering capacity, the more often it overflows and/or it is impossible to allocate free space in it for a portion of new redo data.
    This means: delays associated with allocating space in the log buffer and/or flushing redo data from it to the redo groups.
    These delays, of course, should and do affect the throughput of the database for transactions.
  3. The db_block_checksum parameter: well, also, in general, it is clear. Transaction processing leads to the formation of dirty blocks in the buffer cache of the DBMS.
    Which, when checking of data block checksums is enabled, the database has to process: calculate these checksums from the body of the data block and check them against what is written in the data block header: matches/does not match.
    Such work, again, cannot but delay data processing, and accordingly the parameter, and the mechanism that this parameter controls, turn out to be significant.
    That is why the vendor offers, in the documentation for this parameter, different values for it and notes that yes, there will be an impact, but you can choose different values, up to "off", with different impacts.

Well, a global conclusion.

The approach, in general, turns out to be quite workable.

It quite allows you, in the early stages of load testing of a certain service system, to select its optimal configuration for the load without delving too deeply into the specifics of tuning the system for the load.

But it does not exclude that entirely. At least at the level of understanding, the "tuning knobs" of the system and the permissible ranges of rotation of these knobs must be known.

The approach can relatively quickly find the optimal system configuration.
And based on the results of the testing, it is possible to obtain information about the nature of the relationship between the system's performance metrics and the values of the system's setting parameters.

Which, of course, should contribute to the emergence of this very deep understanding of the system and its operation, at least under the given load.

In practice, this is an exchange of the costs of understanding the system being configured for the costs of preparing such testing of the system.

I would like to note separately: in this approach, the degree of adequacy of the system testing to the operating conditions that it will have in commercial operation is critically important.

Thank you for your attention and time.

Source: www.habr.com
