The scientific poke method, or how to choose a database configuration using benchmarks and an optimization algorithm

Hello.

I decided to share my find: the fruit of thought, trial, and error.
By and large, this is not a find, of course: all of this has long been known to those who work with applied statistical data processing and with the optimization of systems of any kind, not specifically DBMSs.
And yes, they know it and write interesting articles about their research, for example (UPD: the comments pointed out a very interesting project: ottertune).
On the other hand, offhand I do not see this approach widely mentioned or spread on the internet among IT specialists and DBAs.

So, to the point.

Suppose we have a task: to configure a certain service system to serve a certain kind of work.

This work is known: what it is, how the quality of this work is measured, and what criterion is used to measure that quality.

Let us also assume that it is more or less known and understood exactly how the work is performed in (or with) this service system.

"More or less" means that it is possible to prepare (or get from somewhere) a certain tool, utility, or service that can be synthesized and applied to the system with a test load that is adequate enough to what will happen in production, under conditions adequate enough to production operation.

Well, let us assume that a set of tuning parameters for this service system is known, which can be used to configure the system in terms of the productivity of its work.

And what is the problem? There is no sufficiently complete understanding of this service system, one that would allow you to expertly configure its settings for the future load on a given platform and obtain the required productivity of the system.

Well. This is almost always the case.

What can you do here?

Well, the first thing that comes to mind is to look at the documentation for the system. Understand what the acceptable ranges are for the values of the tuning parameters. And, for example, using the coordinate descent method, select values for the system parameters in tests.

That is, give the system some configuration, in the form of a specific set of values for its configuration parameters.

Apply a test load to it, using that very tool-utility, the load generator.
And look at the value: the response, or the quality metric of the system.
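The loop described above (set a configuration, run a load test, read the metric, move one parameter at a time) can be sketched roughly as follows. This is an illustration in Python rather than the author's code: `measure` stands in for a full load test, and the parameter names, candidate values, and the toy `fake_tps` function are all made up for the example.

```python
from typing import Callable, Dict, List

def coordinate_descent(
    measure: Callable[[Dict[str, int]], float],
    candidates: Dict[str, List[int]],
    start: Dict[str, int],
    sweeps: int = 2,
) -> Dict[str, int]:
    """Maximize `measure` by varying one parameter at a time."""
    best = dict(start)
    best_score = measure(best)
    for _ in range(sweeps):
        for name, values in candidates.items():
            for value in values:
                trial = dict(best, **{name: value})
                score = measure(trial)  # in real life: one full load test
                if score > best_score:
                    best, best_score = trial, score
    return best

# Toy stand-in for a load test (hypothetical): peaks at log_buffer=64, writers=2.
def fake_tps(cfg: Dict[str, int]) -> float:
    return 8000 - (cfg["log_buffer"] - 64) ** 2 - 100 * (cfg["writers"] - 2) ** 2

result = coordinate_descent(
    fake_tps,
    {"log_buffer": [2, 16, 64, 128], "writers": [1, 2, 3, 4]},
    {"log_buffer": 2, "writers": 1},
)
print(result)
```

Every call to `measure` costs one load test, so the number of calls, not the bookkeeping around them, dominates the running time.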

The second thought may be the conclusion that this takes a very long time.

That is: if there are many configuration parameters, if the ranges of their values to be covered are large, and if each individual load test takes a long time to complete, then yes, all of this can take an unacceptably long time.

Well, here is what you can understand and recall.

You can notice that the set of values of the service system's configuration parameters is a vector, as a sequence of some values.

Each such vector, other things being equal (in that they are not affected by this vector), corresponds to a completely definite value of the metric: an indicator of the quality of the system's operation under the test load.

That is:

Let us denote the system configuration vector as $x = (x_1, x_2, \dots, x_n)$, where $x_i$ is the value of the $i$-th configuration parameter and $n$ is the number of configuration parameters of the system, however many there are.

And the value of the metric corresponding to this $x$ we denote as $y$; then we get a function: $y = f(x)$

Well, then everything immediately comes down to, in my case, the algorithms for finding the extremum of a function, almost forgotten since my student days.
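Written out as a sketch, the task becomes a maximization over a box of allowed values. The notation here is assumed for illustration: $x$ is the configuration vector, $f$ the measured metric, $n$ the number of parameters, and $l_i$, $u_i$ the lower and upper bound of the $i$-th parameter.

```latex
x^{*} = \arg\max_{x \in D} f(x), \qquad D = [l_1, u_1] \times [l_2, u_2] \times \dots \times [l_n, u_n]
```

Each evaluation of $f$ costs one full load test, so what matters most in an algorithm is how few evaluations it needs.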

Fine, but here an organizational and applied question arises: which algorithm to use.

  1. In the sense of: so that less has to be coded by hand.
  2. And so that it works, i.e. finds the extremum (if there is one), well, at least faster than coordinate descent.

The first point hints that we need to look towards environments in which such algorithms have already been implemented and are, in some form, ready for use in code.
Well, I know python and cran-r.

The second point means that you need to read about the algorithms themselves: what they are, what their requirements are, and what the features of their operation are.

And what they give can be useful side effects: results, or something directly from the algorithm itself.

Or it can be obtained from the results of the algorithm.

Much depends on the input conditions.

For example, if for some reason you need to get a result faster, well, you need to look towards gradient descent algorithms and choose one of them.

Or, if time is not so important, you can, for example, use stochastic optimization methods, such as a genetic algorithm.
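To make the idea of a genetic algorithm concrete, here is a deliberately small, generic real-valued GA in Python. It is only a sketch under stated assumptions: the operators (elitism, tournament selection, blend crossover, single-gene mutation), the defaults, and the toy fitness function are chosen for brevity and are not taken from any particular library.

```python
import random

def genetic_maximize(fitness, lower, upper, pop_size=10, generations=50,
                     p_mutation=0.8, seed=1):
    """Tiny real-valued GA: elitism, tournament selection, blend crossover."""
    rng = random.Random(seed)
    dim = len(lower)
    cache = {}

    def score(vector):
        # Each fitness call stands for one expensive load test, so cache it.
        key = tuple(vector)
        if key not in cache:
            cache[key] = fitness(vector)
        return cache[key]

    population = [[rng.uniform(lower[i], upper[i]) for i in range(dim)]
                  for _ in range(pop_size)]
    history = []  # best score at the start of each generation
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        history.append(score(ranked[0]))
        next_gen = [ranked[0]]  # elitism: the best vector always survives
        while len(next_gen) < pop_size:
            # Tournament selection: the better of two random vectors wins.
            a = max(rng.sample(ranked, 2), key=score)
            b = max(rng.sample(ranked, 2), key=score)
            w = rng.random()
            child = [w * x + (1 - w) * y for x, y in zip(a, b)]
            if rng.random() < p_mutation:
                i = rng.randrange(dim)
                child[i] = rng.uniform(lower[i], upper[i])  # mutate one gene
            # Clamp to the search box to guard against rounding drift.
            child = [min(max(g, lower[i]), upper[i]) for i, g in enumerate(child)]
            next_gen.append(child)
        population = next_gen
    return max(population, key=score), history

# Toy fitness (hypothetical): a single peak at (3, -1).
def toy_fitness(v):
    return -((v[0] - 3) ** 2) - (v[1] + 1) ** 2

best, history = genetic_maximize(toy_fitness, lower=[-10, -10], upper=[10, 10])
print(best, history[-1])
```

Because the best vector is always carried over, the best score can never get worse from one generation to the next; whether it gets close to the true optimum depends on the budget of evaluations.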

I propose to look at how this approach works, selecting a system configuration with a genetic algorithm, in the following, so to speak, laboratory work.

Given:

  1. Let there be, as the service system: oracle xe 18c
  2. Let it serve transactional activity, with the goal of obtaining the highest possible throughput of the database, in transactions/sec.
  3. Transactions can be very different in the nature of their work with data and in the context of that work.
    Let us agree that these are transactions that do not process a large amount of tabular data.
    In the sense that they do not generate more undo data than redo, and do not process large percentages of rows or large tables.

These are transactions that change one row in a more or less large table, with a small number of indexes on that table.

In this situation, the throughput of the database for processing transactions will, with a reservation, be determined by the quality of redo data processing by the database.

The reservation: if we are talking specifically about database settings.

Because, in general, there can be, for example, transactional locks between SQL sessions, caused by the design of the user's work with the tabular data and/or the table model.

Which, of course, will have a depressing effect on the TPS metric, and this will be an exogenous factor relative to the database: well, that is how the table model and the work with the data in it were designed, so that blocking occurs.

Therefore, for the purity of the experiment, we will exclude this factor, and below I will clarify exactly how.

  1. Let us assume, for definiteness, that 100% of the SQL commands submitted to the database are DML commands.
    Let the characteristics of user work with the database be the same across tests.
    Namely: the number of SQL sessions, the tabular data, and how the SQL sessions work with it.
  2. The database runs in FORCE LOGGING and ARCHIVELOG modes. Flashback database mode is turned off, at the database level.
  3. Redo logs: located in a separate file system, on a separate "disk";
    The rest of the physical components of the database: in another, separate file system, on a separate "disk":

More details about the physical layout of the laboratory database components

SQL> select status||' '||name from v$controlfile;
 /db/u14/oradata/XE/control01.ctl
SQL> select GROUP#||' '||MEMBER from v$logfile;
1 /db/u02/oradata/XE/redo01_01.log
2 /db/u02/oradata/XE/redo02_01.log
SQL> select FILE_ID||' '||TABLESPACE_NAME||' '||round(BYTES/1024/1024,2)||' '||FILE_NAME as col from dba_data_files;
4 UNDOTBS1 2208 /db/u14/oradata/XE/undotbs1_01.dbf
2 SLOB 128 /db/u14/oradata/XE/slob01.dbf
7 USERS 5 /db/u14/oradata/XE/users01.dbf
1 SYSTEM 860 /db/u14/oradata/XE/system01.dbf
3 SYSAUX 550 /db/u14/oradata/XE/sysaux01.dbf
5 MONITOR 128 /db/u14/oradata/XE/monitor.dbf
SQL> !cat /proc/mounts | egrep "/db/u[0-2]"
/dev/vda1 /db/u14 ext4 rw,noatime,nodiratime,data=ordered 0 0
/dev/mapper/vgsys-ora_redo /db/u02 xfs rw,noatime,nodiratime,attr2,nobarrier,inode64,logbsize=256k,noquota 0 0

Initially, for these load conditions, I wanted to use the transactional SLOB utility for the database.
It has a wonderful feature; I will quote the author:

The heart of SLOB is "the SLOB method." The SLOB Method aims to test platforms
without application contention. One cannot drive maximum hardware performance
using application code that is, for example, bound by application locking or even
sharing Oracle Database blocks. That's right: there is overhead when sharing data
in data blocks! But SLOB, in its default deployment, is immune to such contention.

This statement holds, as is.
It is convenient to adjust the degree of parallelism of the SQL sessions: this is the -t switch when launching the runit.sh utility from SLOB.
The percentage of DML commands is adjustable, as a share of the SQL statements that each SQL session sends to the database: the UPDATE_PCT parameter.
Separately, and very conveniently: SLOB itself, before and after the load session, prepares statspack or awr snapshots (whichever is set up to be prepared).

However, it turned out that SLOB does not support SQL sessions with a duration of less than 30 seconds.
Therefore, I first coded my own rough-and-ready version of the loader, and it then remained in use.

Let me clarify what the loader does and how it does it, for clarity.
Essentially, the loader looks like this:

Worker code

function dotx()
{
local v_period="$2"
[ -z "$v_period" ] && v_period="0"
source "/home/oracle/testingredotrace/config.conf"

$ORACLE_HOME/bin/sqlplus -S system/${v_system_pwd} << __EOF__
whenever sqlerror exit failure
set verify off
set echo off
set feedback off

define wnum="$1"
define period="$v_period"
set appinfo worker_&&wnum

declare
 v_upto number;
 v_key  number;
 v_tots number;
 v_cts  number;
begin
 select max(col1) into v_upto from system.testtab_&&wnum;
 SELECT (( SYSDATE - DATE '1970-01-01' ) * 86400 ) into v_cts FROM DUAL;
 v_tots := &&period + v_cts;
 while v_cts <= v_tots
 loop
  v_key:=abs(mod(dbms_random.random,v_upto));
  if v_key=0 then
   v_key:=1;
  end if;
  update system.testtab_&&wnum t
  set t.object_name=translate(dbms_random.string('a', 120), 'abcXYZ', '158249')
  where t.col1=v_key
  ;
  commit;
  SELECT (( SYSDATE - DATE '1970-01-01' ) * 86400 ) into v_cts FROM DUAL;
 end loop;
end;
/

exit
__EOF__
}
export -f dotx

The workers are launched like this:

Running the workers

echo "starting test, duration: ${TEST_DURATION}" >> "$v_logfile"
for((i=1;i<="$SQLSESS_COUNT";i++))
do
 echo "sql-session: ${i}" >> "$v_logfile"
 dotx "$i" "${TEST_DURATION}" &
done
echo "waiting..." >> "$v_logfile"
wait

And the tables for the workers are prepared like this:

Creating the tables

function createtable() {
source "/home/oracle/testingredotrace/config.conf"
$ORACLE_HOME/bin/sqlplus -S system/${v_system_pwd} << __EOF__
whenever sqlerror continue
set verify off
set echo off
set feedback off

define wnum="$1"
define ts_name="slob"

begin
 execute immediate 'drop table system.testtab_&&wnum';
exception when others then null;
end;
/

create table system.testtab_&&wnum tablespace &&ts_name as
select rownum as col1, t.*
from sys.dba_objects t
where rownum<1000
;
create index testtab_&&wnum._idx on system.testtab_&&wnum (col1);
--alter table system.testtab_&&wnum nologging;
--alter index system.testtab_&&wnum._idx nologging;
exit
__EOF__
}
export -f createtable

seq 1 1 "$SQLSESS_COUNT" | xargs -n 1 -P 4 -I {} -t bash -c "createtable "{}"" | tee -a "$v_logfile"
echo "createtable done" >> "$v_logfile"

That is, for each worker (practically: a separate SQL session in the DB) a separate table is created, and the worker works with that table.

This ensures the absence of transactional locks between worker sessions.
Each worker does the same thing, with its own table, and the tables are all the same.
All workers perform work for the same amount of time.
Moreover, for long enough that, for example, a log switch is certain to occur, and more than once.
Well, and accordingly, the associated costs and effects arise.
In my case, I set the duration of the workers' run to 8 minutes.

A fragment of a statspack report describing the operation of the database under load

Database    DB Id    Instance     Inst Num  Startup Time   Release     RAC
~~~~~~~~ ----------- ------------ -------- --------------- ----------- ---
          2929910313 XE                  1 07-Sep-20 23:12 18.0.0.0.0  NO

Host Name             Platform                CPUs Cores Sockets   Memory (G)
~~~~ ---------------- ---------------------- ----- ----- ------- ------------
     billing.izhevsk1 Linux x86 64-bit           2     2       1         15.6

Snapshot       Snap Id     Snap Time      Sessions Curs/Sess Comment
~~~~~~~~    ---------- ------------------ -------- --------- ------------------
Begin Snap:       1630 07-Sep-20 23:12:27       55        .7
  End Snap:       1631 07-Sep-20 23:20:29       62        .6
   Elapsed:       8.03 (mins) Av Act Sess:       8.4
   DB time:      67.31 (mins)      DB CPU:      15.01 (mins)

Cache Sizes            Begin        End
~~~~~~~~~~~       ---------- ----------
    Buffer Cache:     1,392M              Std Block Size:         8K
     Shared Pool:       288M                  Log Buffer:   103,424K

Load Profile              Per Second    Per Transaction    Per Exec    Per Call
~~~~~~~~~~~~      ------------------  ----------------- ----------- -----------
      DB time(s):                8.4                0.0        0.00        0.20
       DB CPU(s):                1.9                0.0        0.00        0.04
       Redo size:        7,685,765.6              978.4
   Logical reads:           60,447.0                7.7
   Block changes:           47,167.3                6.0
  Physical reads:                8.3                0.0
 Physical writes:              253.4                0.0
      User calls:               42.6                0.0
          Parses:               23.2                0.0
     Hard parses:                1.2                0.0
W/A MB processed:                1.0                0.0
          Logons:                0.5                0.0
        Executes:           15,756.5                2.0
       Rollbacks:                0.0                0.0
    Transactions:            7,855.1
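As a quick sanity check of the fragment above, the derived columns can be recomputed from the per-second figures. The sketch below is in Python and only copies numbers that appear in the report; the variable names are made up for the example.

```python
# Figures copied from the statspack fragment above.
redo_bytes_per_sec = 7_685_765.6
transactions_per_sec = 7_855.1
db_time_min = 67.31
elapsed_min = 8.03

# Redo generated per transaction: the "Per Transaction" column of "Redo size".
redo_per_txn = redo_bytes_per_sec / transactions_per_sec

# Average active sessions: DB time divided by elapsed time.
avg_active_sessions = db_time_min / elapsed_min

print(f"redo per transaction: {redo_per_txn:.1f} bytes")
print(f"average active sessions: {avg_active_sessions:.1f}")
```

Both values agree with the report (978.4 bytes of redo per transaction, Av Act Sess 8.4), which fits the intended load profile: many tiny single-row updates.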

Returning to the laboratory work.
We will, other things being equal, vary the values of the following parameters of the laboratory database:

  1. Size of the database log groups. Value range: [32, 1024] MB;
  2. Number of log groups in the database. Value range: [2, 32];
  3. log_archive_max_processes value range: [1, 8];
  4. commit_logging two values are allowed: batch|immediate;
  5. commit_wait two values are allowed: wait|nowait;
  6. log_buffer value range: [2, 128] MB.
  7. log_checkpoint_timeout value range: [60, 1200] seconds
  8. db_writer_processes value range: [1, 4]
  9. undo_retention value range: [30, 300] seconds
  10. transactions_per_rollback_segment value range: [1, 8]
  11. disk_asynch_io two values are allowed: true|false;
  12. filesystemio_options the following values are allowed: none|setall|directIO|asynch;
  13. db_block_checking the following values are allowed: OFF|LOW|MEDIUM|FULL;
  14. db_block_checksum the following values are allowed: OFF|TYPICAL|FULL;
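To get a feeling for why exhaustive enumeration is out of the question, the size of this search space can be estimated. The sketch below is in Python and assumes that every integer range is stepped in units of 1 (1 MB, 1 second, 1 process) and that a test costs only the 8 minutes of worker run time; both are simplifying assumptions, and the dictionary keys are just labels.

```python
from math import prod

# Number of distinct values per parameter, counted from the list above
# (integer ranges inclusive, at 1-unit granularity: an assumption).
value_counts = {
    "redo group size, MB": 1024 - 32 + 1,
    "redo group count": 32 - 2 + 1,
    "log_archive_max_processes": 8,
    "commit_logging": 2,
    "commit_wait": 2,
    "log_buffer, MB": 128 - 2 + 1,
    "log_checkpoint_timeout, s": 1200 - 60 + 1,
    "db_writer_processes": 4,
    "undo_retention, s": 300 - 30 + 1,
    "transactions_per_rollback_segment": 8,
    "disk_asynch_io": 2,
    "filesystemio_options": 4,
    "db_block_checking": 4,
    "db_block_checksum": 3,
}

total_configs = prod(value_counts.values())
minutes_per_test = 8  # worker run time only; database restarts are ignored
years = total_configs * minutes_per_test / (60 * 24 * 365)
print(f"{total_configs:.3e} configurations, about {years:.3e} years of testing")
```

Even with much coarser steps the product stays far beyond what can be tested directly, which is the practical argument for a search algorithm that samples the space instead of enumerating it.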

A person with experience in maintaining Oracle databases can certainly already say which of the listed parameters should be set, and to which of their acceptable values, in order to get greater database throughput for the work with data shown in the application code above.

But.

The point of the laboratory work is to show that the optimization algorithm itself will clarify this for us relatively quickly.

For us, all that remains is to look into the documentation for the system being tuned, just enough to find out which parameters to change and in what ranges.
And also: to write the code that will be used to work with the tuned system from the selected optimization algorithm.

So, now about the code.
I spoke above about cran-r, that is: all manipulations with the tuned system are orchestrated in the form of an R script.

The actual task, the analysis, the selection of system state vectors by metric value: this is the GA package (documentation).
The package, in this case, is not a perfect fit, in the sense that it expects the vectors (chromosomes, in the package's terms) to be specified as strings of numbers with a fractional part.

And my vector of setting parameter values consists of 14 quantities: integers and string values.

The problem, of course, is easily avoided by assigning specific numbers to the string values.
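One way to picture this mapping is the threshold scheme the R fitness function in this article uses: a real-valued gene is rounded, and bands of its range select a string value. The helper below is a Python sketch of that idea; the function name and the sample gene values are illustrative, while the band edges (10, 20, 30 and 5) are the ones used in the article's R code.

```python
def decode_choice(gene, cutoffs, labels):
    """Map a real-valued gene to a label.

    labels[i] is chosen while the rounded gene is <= cutoffs[i];
    the last label covers everything above the last cutoff.
    """
    value = round(gene)
    for cutoff, label in zip(cutoffs, labels):
        if value <= cutoff:
            return label
    return labels[-1]

# Gene in [0, 40] -> filesystemio_options, in bands of 10.
filesystemio = decode_choice(17.3, [10, 20, 30],
                             ["none", "setall", "directIO", "asynch"])
# Gene in [1, 10] -> commit_wait, split at 5.
commit_wait = decode_choice(7.9, [5], ["WAIT", "NOWAIT"])
print(filesystemio, commit_wait)
```

The optimizer only ever sees numbers; the decoding happens at the last moment, just before the values are handed to the database.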

Thus, in the end, the main part of the R script looks like this:

Calling GA::ga

cat( "", file=v_logfile, sep="\n", append=F)

pSize = 10
elitism_value=1
pmutation_coef=0.8
pcrossover_coef=0.1
iterations=50

gam=GA::ga(type="real-valued", fitness=evaluate,
lower=c(32,2, 1,1,1,2,60,1,30,1,0,0, 0,0), upper=c(1024,32, 8,10,10,128,800,4,300,8,10,40, 40,30),
popSize=pSize,
pcrossover = pcrossover_coef,
pmutation = pmutation_coef,
maxiter=iterations,
run=4,
keepBest=T)
cat( "GA-session is done" , file=v_logfile, sep="\n", append=T)
gam@solution

Here, with the help of the lower and upper attributes of the ga subroutine, essentially, an area of the search space is specified, within which the search will look for the vector (or vectors) for which the maximum value of the fitness function is obtained.

The ga subroutine performs a search that maximizes the fitness function.

Well, then, it turns out that, in this case, the fitness function, understanding the vector as a set of values for certain database parameters, has to obtain a metric from the database.

That is: how many transactions per second the database processes, with the given database configuration and the given load on the database.

That is, when unfolded, the following multi-step procedure must be performed inside the fitness function:

  1. Processing the input vector of numbers: converting it into values for the database parameters.
  2. An attempt to create the given number of redo groups of the given size. Moreover, the attempt may be unsuccessful.
    The log groups that already existed in the database, in whatever number and of whatever size, should be deleted for the purity of the experiment.
  3. If the previous step succeeds: specifying the values of the configuration parameters to the database (again: there may be a failure)
  4. If the previous step succeeds: stopping the database and starting it again so that the newly specified parameter values take effect (again: there may be a failure)
  5. If the previous step succeeds: perform a load test and get the metric from the database.
  6. Return the database to its original state, i.e. delete the additional log groups and restore the original database configuration.
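The control flow of these steps can be sketched independently of Oracle. In the Python outline below, `steps` is a hypothetical object whose methods wrap the real actions (shell scripts, sqlplus calls) and return True on success; `FakeSteps` exists only so the sketch can run. A failed configuration yields a fitness of 0 so that the optimizer steers away from it.

```python
def evaluate(vector, steps):
    """Return the metric for `vector`, or 0.0 if a preparation step fails."""
    config = steps.decode(vector)                # step 1
    if not steps.create_redo_groups(config):     # step 2
        return 0.0
    try:
        if not steps.set_parameters(config):     # step 3
            return 0.0
        if not steps.restart_database():         # step 4
            return 0.0
        return steps.run_load_test()             # step 5
    finally:
        # Step 6: always undo the changes, so the next vector
        # is tested from the same starting state.
        steps.restore_original_state(config)

class FakeSteps:
    """Stand-in for the real database actions (all names hypothetical)."""
    def __init__(self, fail_restart=False):
        self.fail_restart = fail_restart
        self.restored = 0
    def decode(self, vector):
        return {"rg_count": round(vector[0])}
    def create_redo_groups(self, config):
        return True
    def set_parameters(self, config):
        return True
    def restart_database(self):
        return not self.fail_restart
    def run_load_test(self):
        return 1234.5  # made-up metric value
    def restore_original_state(self, config):
        self.restored += 1

ok, bad = FakeSteps(), FakeSteps(fail_restart=True)
print(evaluate([4.2], ok), evaluate([4.2], bad))
```

Putting the restore step in `finally` is a design choice of this sketch: it guarantees the cleanup runs on every path after the redo groups were created, including the failure paths.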

Fitness function code

evaluate=function(p_par) {
v_module="evaluate"
v_metric=0
opn=NULL
opn$rg_size=round(p_par[1],digit=0)
opn$rg_count=round(p_par[2],digit=0)
opn$log_archive_max_processes=round(p_par[3],digit=0)
opn$commit_logging="BATCH"
if ( round(p_par[4],digit=0) > 5 ) {
 opn$commit_logging="IMMEDIATE"
}
opn$commit_logging=paste("'", opn$commit_logging, "'",sep="")

opn$commit_wait="WAIT"
if ( round(p_par[5],digit=0) > 5 ) {
 opn$commit_wait="NOWAIT"
}
opn$commit_wait=paste("'", opn$commit_wait, "'",sep="")

opn$log_buffer=paste(round(p_par[6],digit=0),"m",sep="")
opn$log_checkpoint_timeout=round(p_par[7],digit=0)
opn$db_writer_processes=round(p_par[8],digit=0)
opn$undo_retention=round(p_par[9],digit=0)
opn$transactions_per_rollback_segment=round(p_par[10],digit=0)
opn$disk_asynch_io="true"
if ( round(p_par[11],digit=0) > 5 ) {
 opn$disk_asynch_io="false"
} 

opn$filesystemio_options="none"
if ( round(p_par[12],digit=0) > 10 && round(p_par[12],digit=0) <= 20 ) {
 opn$filesystemio_options="setall"
}
if ( round(p_par[12],digit=0) > 20 && round(p_par[12],digit=0) <= 30 ) {
 opn$filesystemio_options="directIO"
}
if ( round(p_par[12],digit=0) > 30 ) {
 opn$filesystemio_options="asynch"
}

opn$db_block_checking="OFF"
if ( round(p_par[13],digit=0) > 10 && round(p_par[13],digit=0) <= 20 ) {
 opn$db_block_checking="LOW"
}
if ( round(p_par[13],digit=0) > 20 && round(p_par[13],digit=0) <= 30 ) {
 opn$db_block_checking="MEDIUM"
}
if ( round(p_par[13],digit=0) > 30 ) {
 opn$db_block_checking="FULL"
}

opn$db_block_checksum="OFF"
if ( round(p_par[14],digit=0) > 10 && round(p_par[14],digit=0) <= 20 ) {
 opn$db_block_checksum="TYPICAL"
}
if ( round(p_par[14],digit=0) > 20 ) {
 opn$db_block_checksum="FULL"
}

v_vector=paste(round(p_par[1],digit=0),round(p_par[2],digit=0),round(p_par[3],digit=0),round(p_par[4],digit=0),round(p_par[5],digit=0),round(p_par[6],digit=0),round(p_par[7],digit=0),round(p_par[8],digit=0),round(p_par[9],digit=0),round(p_par[10],digit=0),round(p_par[11],digit=0),round(p_par[12],digit=0),round(p_par[13],digit=0),round(p_par[14],digit=0),sep=";")
cat( paste(v_module," try to evaluate vector: ", v_vector,sep="") , file=v_logfile, sep="\n", append=T)

rc=make_additional_rgroups(opn)
if ( rc!=0 ) {
 cat( paste(v_module," make_additional_rgroups failed",sep="") , file=v_logfile, sep="\n", append=T)
 return (0)
}

v_rc=0
rc=set_db_parameter("log_archive_max_processes", opn$log_archive_max_processes)
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("commit_logging", opn$commit_logging )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("commit_wait", opn$commit_wait )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("log_buffer", opn$log_buffer )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("log_checkpoint_timeout", opn$log_checkpoint_timeout )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("db_writer_processes", opn$db_writer_processes )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("undo_retention", opn$undo_retention )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("transactions_per_rollback_segment", opn$transactions_per_rollback_segment )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("disk_asynch_io", opn$disk_asynch_io )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("filesystemio_options", opn$filesystemio_options )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("db_block_checking", opn$db_block_checking )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("db_block_checksum", opn$db_block_checksum )
if ( rc != 0 ) {  v_rc=1 }

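# v_rc accumulates failures from all set_db_parameter calls above; rc holds only the last one
if ( v_rc!=0 ) {
 cat( paste(v_module," can not set db parameters for that vector of settings",sep="") , file=v_logfile, sep="\n", append=T)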
 rc=stop_db("immediate")
 rc=create_spfile()
 rc=start_db("")
 rc=remove_additional_rgroups(opn)
 return (0)
}

rc=stop_db("immediate")
rc=start_db("")
if ( rc!=0 ) {
 cat( paste(v_module," can not startup db with that vector of settings",sep="") , file=v_logfile, sep="\n", append=T)
 rc=stop_db("abort")
 rc=create_spfile()
 rc=start_db("")
 rc=remove_additional_rgroups(opn)
 return (0)
}

rc=run_test()
v_metric=getmetric()

rc=stop_db("immediate")
rc=create_spfile()
rc=start_db("")
rc=remove_additional_rgroups(opn)

cat( paste("result: ",v_metric," ",v_vector,sep="") , file=v_logfile, sep="\n", append=T)
return (v_metric)
}

That is, all the work is performed in the fitness function.

The ga subroutine processes vectors or, more correctly, chromosomes.
In which what matters most to us is the selection of chromosomes with genes for which the fitness function produces large values.

This, in essence, is the process of searching for the optimal set of chromosomes, using a vector in an N-dimensional search space.

A very clear, detailed explanation, with examples of R code, of how a genetic algorithm works.

I would like to note two technical points separately.

Auxiliary calls from the evaluate function, for example stop-start and setting the value of a database parameter, are performed on the basis of the cran-r function system2

With the help of which some bash script or command is called.

For example:

set_db_parameter

set_db_parameter=function(p1, p2) {
v_module="set_db_parameter"
v_cmd="/home/oracle/testingredotrace/set_db_parameter.sh"
v_args=paste(p1," ",p2,sep="")

# system2 attaches a "status" attribute to its result only on a non-zero exit status
x=system2(v_cmd, args=v_args, stdout=T, stderr=T, wait=T)
if ( length(attributes(x)) > 0 ) {
 cat(paste(v_module," failed with: ",attributes(x)$status," ",v_cmd," ",v_args,sep=""), file=v_logfile, sep="\n", append=T)
 return (attributes(x)$status)
}
else {
 cat(paste(v_module," ok: ",v_cmd," ",v_args,sep=""), file=v_logfile, sep="\n", append=T)
 return (0)
}
}
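For readers who would drive the same helper scripts from Python instead of R, a roughly equivalent wrapper can be sketched with the standard subprocess module. The function name `run_helper` is made up for the example, and the Python interpreter itself is used as a stand-in for a helper script so that the sketch runs anywhere.

```python
import subprocess
import sys

def run_helper(command, *args):
    """Run an external helper and return (exit_status, combined_output)."""
    completed = subprocess.run(
        [command, *map(str, args)],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout, as the R call captures both
        text=True,
        check=False,               # report failures via the status, not an exception
    )
    return completed.returncode, completed.stdout

# Stand-in for a helper script: the Python interpreter itself.
status, output = run_helper(sys.executable, "-c", "print('log_buffer 64m')")
failed, _ = run_helper(sys.executable, "-c", "import sys; sys.exit(3)")
print(status, output.strip(), failed)
```

As in the R version, the caller decides what a non-zero status means; here it would translate into a fitness of 0 for the vector being tested.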

The second point is the line, in the evaluate function, that saves a specific metric value and the tuning vector corresponding to it to the log file:

cat( paste("result: ",v_metric," ",v_vector,sep="") , file=v_logfile, sep="\n", append=T)

This is important because, from this array of data, it will be possible to obtain additional information about which components of the tuning vector have a greater or lesser effect on the metric value.

That is: it will be possible to conduct an attribute-importance analysis.
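Turning those log lines into a dataset is a small parsing job. The Python sketch below assumes the line format produced by the evaluate function ("result: ", the metric, a space, then the rounded vector joined with ";"); the sample lines and their numbers are made up for the example.

```python
def parse_results(lines):
    """Yield (metric, [gene values]) for each 'result:' line of the log."""
    prefix = "result: "
    for line in lines:
        if not line.startswith(prefix):
            continue  # skip progress and error messages
        metric_text, vector_text = line[len(prefix):].split(" ", 1)
        genes = [int(v) for v in vector_text.strip().split(";")]
        yield float(metric_text), genes

# Made-up sample lines in the assumed format.
sample_log = [
    "evaluate try to evaluate vector: 36;4;2;7;8;64;120;2;60;4;3;12;5;25",
    "result: 1234.5 36;4;2;7;8;64;120;2;60;4;3;12;5;25",
]
rows = list(parse_results(sample_log))
print(rows)
```

Each row then becomes one observation: the metric is the response, and the 14 genes are the attributes.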

So what can come out of this?

In graph form, if you arrange the tests in order, the picture is as follows:

[Figure: value of the tps metric across the load tests]

Some data corresponding to the extreme values of the metric:
[Screenshot: tuning vectors for the extreme values of the metric]
Here, in the screenshot with the results, let me clarify: the values of the tuning vector are given in terms of the fitness function code, not in terms of the numbered list of parameters and their value ranges that was formulated above in the text.

Well. Whether ~8 thousand tps is a lot or a little is a separate question.
Within the framework of the laboratory work, this figure is not important; what is important is the dynamics, how this value changes.

The dynamics here are good.
It is obvious that at least one factor significantly affects the value of the metric, and that the ga algorithm, sorting through the chromosome vectors, has covered it.
Judging by the fairly vigorous dynamics of the curve, there is at least one more factor that, although much smaller, also has an influence.

This is where you need an attribute-importance analysis, to understand which attributes (well, in this case, components of the tuning vector) affect the metric value, and how much.
And from this information: to understand which factors were affected by changes in the significant attributes.

An attribute-importance analysis can be performed in different ways.

For these purposes, I like the randomForest algorithm, from the R package of the same name (documentation)
randomForest, as I understand its work in general and its approach to assessing the importance of attributes in particular, builds a certain model of the dependence of the response variable on the attributes.

In our case, the response variable is the metric obtained from the database in load tests: tps;
And the attributes are the components of the tuning vector.

So here randomForest evaluates the importance of each model attribute with two numbers: %IncMSE is how the presence/absence of this attribute in the model changes the MSE quality of this model (Mean Squared Error);

And IncNodePurity is a number that reflects how well, based on the values of this attribute, a dataset of observations can be divided so that one part holds data with one explained value of the metric, and the other part holds data with another value of the metric.
Well, that is: to what extent this is a classifying attribute (the clearest explanation of RandomForest that I have seen, in Russian, is here).
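The idea behind an MSE-based importance score can be shown without any forest at all: shuffle one attribute's column, so that it no longer carries information, and see how much the model's mean squared error grows. The Python sketch below illustrates only that permutation idea; it is not the randomForest package's implementation, and the toy data and the hand-written "model" are invented for the example.

```python
import random

def mse(model, rows, targets):
    """Mean squared error of `model` over the dataset."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(model, rows, targets, column, seed=0):
    """Increase in MSE after shuffling one column of the inputs."""
    rng = random.Random(seed)
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    permuted = [r[:column] + [v] + r[column + 1:]
                for r, v in zip(rows, shuffled)]
    return mse(model, permuted, targets) - mse(model, rows, targets)

# Toy data: the target depends only on column 0; column 1 is irrelevant.
rows = [[x, (x * 7) % 5] for x in range(20)]
targets = [100.0 * r[0] for r in rows]

def model(row):
    return 100.0 * row[0]  # a "perfect" model of the toy data

important = permutation_importance(model, rows, targets, column=0)
irrelevant = permutation_importance(model, rows, targets, column=1)
print(important, irrelevant)
```

An attribute the model really relies on produces a large increase in error when shuffled, while an attribute the model ignores produces none.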

Rough-and-ready R code for processing the dataset with the results of the load tests:

x=NULL
v_data_file=paste('/tmp/data1.dat',sep="")
x=read.table(v_data_file, header = TRUE, sep = ";", dec=",", quote = "\"'", stringsAsFactors=FALSE)
colnames(x)=c('metric','rgsize','rgcount','lamp','cmtl','cmtw','lgbffr','lct','dbwrp','undo_retention','tprs','disk_async_io','filesystemio_options','db_block_checking','db_block_checksum')

idxTrain=sample(nrow(x),as.integer(nrow(x)*0.7))
idxNotTrain=which(! 1:nrow(x) %in% idxTrain )
TrainDS=x[idxTrain,]
ValidateDS=x[idxNotTrain,]

library(randomForest)
#mtry=as.integer( sqrt(dim(x)[2]-1) )
rf=randomForest(metric ~ ., data=TrainDS, ntree=40, mtry=3, replace=T, nodesize=2, importance=T, do.trace=10, localImp=F)
ValidateDS$predicted=predict(rf, newdata=ValidateDS[,colnames(ValidateDS)!="metric"], type="response")
sum((ValidateDS$metric-ValidateDS$predicted)^2)
rf$importance

You can select the hyperparameters of the algorithm directly by hand and, focusing on the quality of the model, pick a model that predicts more accurately on the validation dataset.
You can write some kind of function for this work (by the way, again, using some kind of optimization algorithm).

You can use the R package caret; which tool you use is not the point.

As a result, in this case, the following result is obtained for assessing the degree of importance of the attributes:

[Figure: randomForest attribute importance (%IncMSE, IncNodePurity) for the components of the tuning vector]

Well. Now we can begin the global reflection:

  1. It turns out that the most significant parameter, under these testing conditions, was commit_wait
    Technically, it specifies the execution mode of the io operation that writes redo data from the database log buffer to the current log group: synchronous or asynchronous.
    The value nowait, which results in an almost vertical, multiple increase in the value of the tps metric, is the inclusion of asynchronous io mode for the redo groups.
    Whether or not you should do this in a production database is a separate question. Here I limit myself to just stating: this is a significant factor.
  2. It is logical that the size of the database log buffer turns out to be a significant factor.
    The smaller the log buffer, the smaller its buffering capacity, and the more often it overflows and/or cannot provide a free area for a new portion of redo data.
    This means: delays associated with allocating space in the log buffer and/or with flushing redo data from it into the redo groups.
    These delays, of course, should and do affect the throughput of the database for transactions.
  3. The db_block_checksum parameter: well, this too is, in general, clear: transaction processing leads to the formation of dirty blocks in the database buffer cache.
    Which, when checking of data block checksums is enabled, the database has to process: calculate these checksums from the body of the data block and check them against what is written in the data block header: matches/does not match.
    Such work, again, cannot but delay data processing, and accordingly, the parameter and the mechanism that this parameter controls turn out to be significant.
    That is why the vendor offers, in the documentation for this parameter, different values, and notes that yes, there will be an impact, but you can choose different values, up to "off", with different impacts.

Well, a global conclusion.

The approach, in general, turns out to be quite workable.

It quite allows you, in the early stages of load testing of a certain service system, to select the system's optimal configuration for the load without delving too deeply into the specifics of tuning the system for that load.

But it does not exclude this completely. At least at the level of understanding, you have to know the system's "adjustment knobs" and the permissible ranges of rotation of those knobs.

The approach can then find the optimal system configuration relatively quickly.
And based on the results of the testing, it is possible to obtain information about the nature of the relationship between the system performance metrics and the values of the system's setting parameters.

Which, of course, should contribute to the emergence of that very deep understanding of the system and its operation, at least under a given load.

In practice, this is an exchange of the costs of understanding the tuned system for the costs of preparing such testing of the system.

I would like to note separately: in this approach, the degree to which the system testing is adequate to the operating conditions the system will have in commercial operation is critically important.

Thank you for your attention and time.

Source: www.habr.com
