The scientific poke method, or how to choose a database configuration using benchmarks and an optimization algorithm

Hello

I decided to share my find: the fruit of thought, trial and error.
By and large, this is of course not a discovery: all of this should have been known for a long time to those who deal with applied statistical data processing and the optimization of any systems, not necessarily DBMSs specifically.
And yes, they do know, and they write interesting articles about their research, for example (UPD: in the comments a very interesting project was pointed out: ottertune).
On the other hand, I do not see any wide mention or spread of this approach on the Internet among IT specialists and DBAs.

So, to the point.

Let us assume that we have a task: to configure a certain service system to serve some kind of work.

About this work it is known what it is, how the quality of this work is measured, and what the criterion for measuring that quality is.

Let us also assume that it is more or less known and understood exactly how the work is performed in (or with) this service system.

"More or less" means that it is possible to prepare (or obtain from somewhere) a tool, utility or service that can be applied to the system with a test load sufficiently adequate to what will be in production, under conditions sufficiently adequate to production operation.

Well, let us assume that a set of tuning parameters for this service system is known, which can be used to configure the system in terms of the productivity of its work.

And the problem is this: there is no sufficiently complete understanding of this service system, one that would allow you to expertly configure its settings for the future load on the given platform and obtain the required productivity.

Well. This is almost always the case.

What can you do here?

Well, the first thing that comes to mind is to look at the documentation for this system. Understand what the acceptable ranges for the values of the tuning parameters are. And then, for example, using the coordinate descent method, select values for the system parameters in tests.

That is, give the system some configuration, in the form of a specific set of values for its configuration parameters.

Apply a test load to it, using that very tool-utility, the load generator.
And look at the value: the response, or the metric of the system's quality.
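
As a rough illustration of the coordinate-wise search just described, here is a minimal Python sketch. The toy_metric function is a made-up stand-in for a real load test, and the parameter names and grids are hypothetical; the sketch only shows the shape of the loop: change one parameter at a time, keep the value that gives the best metric, and repeat until nothing improves.

```python
def coordinate_descent(measure, grids, start, max_sweeps=10):
    """Maximize measure(config) by tuning one parameter at a time."""
    best = dict(start)
    best_score = measure(best)
    evaluations = 1
    for _ in range(max_sweeps):
        improved = False
        for name, values in grids.items():
            for value in values:
                if value == best[name]:
                    continue
                candidate = dict(best, **{name: value})
                score = measure(candidate)  # one full load test per candidate
                evaluations += 1
                if score > best_score:
                    best, best_score, improved = candidate, score, True
        if not improved:
            break
    return best, best_score, evaluations


# Hypothetical stand-in for "apply the test load and read the metric".
def toy_metric(cfg):
    return -(cfg["log_buffer_mb"] - 64) ** 2 - (cfg["writers"] - 2) ** 2


grids = {"log_buffer_mb": [2, 16, 32, 64, 128], "writers": [1, 2, 3, 4]}
start = {"log_buffer_mb": 2, "writers": 1}
best, best_score, evaluations = coordinate_descent(toy_metric, grids, start)
print(best, best_score, evaluations)
```

Every call to measure stands for one complete load test, so the evaluation count is what determines how long such a search takes in practice.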

The second thought may be the conclusion that this takes a very long time.

That is: if there are many tuning parameters, if the ranges of their values are large, and if each individual load test takes a long time to complete, then yes, all this can take an unacceptably long time.

Well, here is what you can understand and remember.

You can notice that the set of values of the service system's settings parameters is a vector, a sequence of some values.

Each such vector corresponds to a quite definite value of the metric: the indicator of the quality of the system's operation under the test load.

That is:

Let us denote the system configuration vector as $X = (x_1, x_2, \ldots, x_n)$, where $n$ is the number of configuration parameters of the system, however many of them there are.

And let us denote the value of the metric corresponding to this $X$ as $Y$; then we obtain a function: $Y = F(X)$

Well, then everything immediately comes down, in my case, to the algorithms for finding the extremum of a function, almost forgotten since my student days.

Fine, but here an organizational and applied question arises: which algorithm to use.

  1. In the sense of: so that there is less coding by hand.
  2. And so that it works, i.e. finds the extremum (if there is one), well, at least faster than coordinate descent.

The first point hints that we need to look towards environments in which such algorithms are already implemented and, in some form, are ready for use in code.
Well, I know python and cran-r.

The second point means that you need to read about the algorithms themselves: what they are, what their requirements are, and the features of their operation.

And what they yield may include useful side effects: results that come directly from the algorithm itself.

Or that can be obtained from the results of the algorithm's work.

Much depends on the input conditions.

For example, if for some reason you need to get a result faster, you need to look towards gradient descent algorithms and choose one of them.

Or, if time is not so important, you can, for example, use stochastic optimization methods, such as a genetic algorithm.

I propose to consider how this approach works, selecting a system configuration using a genetic algorithm, in the following, so to speak, laboratory work.

Initial data:

  1. Let the service system be: oracle xe 18c
  2. Let it serve transactional activity, and let the goal be to obtain the highest possible throughput of the DBMS, in transactions/sec.
  3. Transactions can differ greatly in the nature of their work with data and in the context of that work.
    Let us agree that these are transactions that do not process a large amount of tabular data.
    In the sense that they do not generate more undo data than redo and do not process large percentages of rows or large tables.

These are transactions that change one row in a more or less large table, with a small number of indexes on this table.

In this situation the productivity of the DBMS in processing transactions will, with a reservation, be determined by the quality of processing in the redo subsystem of the database.

Disclaimer: if we talk specifically about the DBMS settings.

Because, in the general case, there may be, for example, transactional locks between SQL sessions, due to the design of user work with the tabular data and/or the tabular model.

These, of course, will have a depressing effect on the TPS metric, and this will be an exogenous factor relative to the DBMS: well, this is how the tabular model and the work with the data in it were designed, so that locking occurs.

Therefore, for the purity of the experiment, we will exclude this factor, and below I will clarify exactly how.

  4. Let us assume, for definiteness, that 100% of the SQL commands submitted to the DBMS are DML commands.
    Let the characteristics of user work with the DBMS be the same across tests.
    Namely: the number of SQL sessions, the tabular data, and how the SQL sessions work with it.
  5. The DBMS operates in FORCE LOGGING and ARCHIVELOG modes. Flashback database mode is turned off at the DBMS level.
  6. Redo logs: located in a separate file system, on a separate "disk";
    The rest of the physical components of the database: in another, separate file system, on a separate "disk":

More details about the physical layout of the laboratory database components

SQL> select status||' '||name from v$controlfile;
 /db/u14/oradata/XE/control01.ctl
SQL> select GROUP#||' '||MEMBER from v$logfile;
1 /db/u02/oradata/XE/redo01_01.log
2 /db/u02/oradata/XE/redo02_01.log
SQL> select FILE_ID||' '||TABLESPACE_NAME||' '||round(BYTES/1024/1024,2)||' '||FILE_NAME as col from dba_data_files;
4 UNDOTBS1 2208 /db/u14/oradata/XE/undotbs1_01.dbf
2 SLOB 128 /db/u14/oradata/XE/slob01.dbf
7 USERS 5 /db/u14/oradata/XE/users01.dbf
1 SYSTEM 860 /db/u14/oradata/XE/system01.dbf
3 SYSAUX 550 /db/u14/oradata/XE/sysaux01.dbf
5 MONITOR 128 /db/u14/oradata/XE/monitor.dbf
SQL> !cat /proc/mounts | egrep "/db/u[0-2]"
/dev/vda1 /db/u14 ext4 rw,noatime,nodiratime,data=ordered 0 0
/dev/mapper/vgsys-ora_redo /db/u02 xfs rw,noatime,nodiratime,attr2,nobarrier,inode64,logbsize=256k,noquota 0 0

Initially, for these load conditions, I wanted to use the SLOB utility against the transactional DBMS.
It has such a wonderful feature; I will quote the author:

At the heart of SLOB is the "SLOB method." The SLOB Method aims to test platforms
without application contention. One cannot drive maximum hardware performance
using application code that is, for example, bound by application locking or even
sharing Oracle Database blocks. That's right - there is overhead when sharing data
in data blocks! But SLOB - in its default deployment - is immune to such contention.

This declaration holds true; it is so.
It is convenient to regulate the degree of parallelism of the SQL sessions: this is the -t key when launching the runit.sh utility from SLOB.
The percentage of DML commands, out of the number of SQL statements that each SQL session sends to the DBMS, is regulated by the UPDATE_PCT parameter.
Separately, and very conveniently: SLOB itself, before and after the load session, prepares a statspack or AWR snapshots (whichever is configured).

However, it turned out that SLOB does not support SQL sessions with a duration of less than 30 seconds.
Therefore, I first coded my own, worker-peasant version of the loader, and then it simply remained in operation.

Let me clarify what the loader does and how it does it.
Essentially the loader looks like this:

Worker code

function dotx()
{
local v_period="$2"
[ -z "$v_period" ] && v_period="0"
source "/home/oracle/testingredotracе/config.conf"

$ORACLE_HOME/bin/sqlplus -S system/${v_system_pwd} << __EOF__
whenever sqlerror exit failure
set verify off
set echo off
set feedback off

define wnum="$1"
define period="$v_period"
set appinfo worker_&&wnum

declare
 v_upto number;
 v_key  number;
 v_tots number;
 v_cts  number;
begin
 select max(col1) into v_upto from system.testtab_&&wnum;
 SELECT (( SYSDATE - DATE '1970-01-01' ) * 86400 ) into v_cts FROM DUAL;
 v_tots := &&period + v_cts;
 while v_cts <= v_tots
 loop
  v_key:=abs(mod(dbms_random.random,v_upto));
  if v_key=0 then
   v_key:=1;
  end if;
  update system.testtab_&&wnum t
  set t.object_name=translate(dbms_random.string('a', 120), 'abcXYZ', '158249')
  where t.col1=v_key
  ;
  commit;
  SELECT (( SYSDATE - DATE '1970-01-01' ) * 86400 ) into v_cts FROM DUAL;
 end loop;
end;
/

exit
__EOF__
}
export -f dotx

The workers are launched like this:

Running the workers

echo "starting test, duration: ${TEST_DURATION}" >> "$v_logfile"
for((i=1;i<="$SQLSESS_COUNT";i++))
do
 echo "sql-session: ${i}" >> "$v_logfile"
 dotx "$i" "${TEST_DURATION}" &
done
echo "waiting..." >> "$v_logfile"
wait

And the tables for the workers are prepared like this:

Creating the tables

function createtable() {
source "/home/oracle/testingredotracе/config.conf"
$ORACLE_HOME/bin/sqlplus -S system/${v_system_pwd} << __EOF__
whenever sqlerror continue
set verify off
set echo off
set feedback off

define wnum="$1"
define ts_name="slob"

begin
 execute immediate 'drop table system.testtab_&&wnum';
exception when others then null;
end;
/

create table system.testtab_&&wnum tablespace &&ts_name as
select rownum as col1, t.*
from sys.dba_objects t
where rownum<1000
;
create index testtab_&&wnum._idx on system.testtab_&&wnum (col1);
--alter table system.testtab_&&wnum nologging;
--alter index system.testtab_&&wnum._idx nologging;
exit
__EOF__
}
export -f createtable

seq 1 1 "$SQLSESS_COUNT" | xargs -n 1 -P 4 -I {} -t bash -c "createtable "{}"" | tee -a "$v_logfile"
echo "createtable done" >> "$v_logfile"

That is, for each worker (practically: a separate SQL session in the DB) a separate table is created, and the worker works with that table.

This ensures the absence of transactional locks between worker sessions.
Each worker does the same thing, with its own table; the tables are all the same.
All workers perform work for the same amount of time.
Moreover, for a long enough time that, for example, a log switch definitely occurs, and more than once.
Well, accordingly, the associated costs and effects arise.
In my case, I set the duration of the workers' work to 8 minutes.

A fragment of a statspack report describing the operation of the DBMS under load

Database    DB Id    Instance     Inst Num  Startup Time   Release     RAC
~~~~~~~~ ----------- ------------ -------- --------------- ----------- ---
          2929910313 XE                  1 07-Sep-20 23:12 18.0.0.0.0  NO

Host Name             Platform                CPUs Cores Sockets   Memory (G)
~~~~ ---------------- ---------------------- ----- ----- ------- ------------
     billing.izhevsk1 Linux x86 64-bit           2     2       1         15.6

Snapshot       Snap Id     Snap Time      Sessions Curs/Sess Comment
~~~~~~~~    ---------- ------------------ -------- --------- ------------------
Begin Snap:       1630 07-Sep-20 23:12:27       55        .7
  End Snap:       1631 07-Sep-20 23:20:29       62        .6
   Elapsed:       8.03 (mins) Av Act Sess:       8.4
   DB time:      67.31 (mins)      DB CPU:      15.01 (mins)

Cache Sizes            Begin        End
~~~~~~~~~~~       ---------- ----------
    Buffer Cache:     1,392M              Std Block Size:         8K
     Shared Pool:       288M                  Log Buffer:   103,424K

Load Profile              Per Second    Per Transaction    Per Exec    Per Call
~~~~~~~~~~~~      ------------------  ----------------- ----------- -----------
      DB time(s):                8.4                0.0        0.00        0.20
       DB CPU(s):                1.9                0.0        0.00        0.04
       Redo size:        7,685,765.6              978.4
   Logical reads:           60,447.0                7.7
   Block changes:           47,167.3                6.0
  Physical reads:                8.3                0.0
 Physical writes:              253.4                0.0
      User calls:               42.6                0.0
          Parses:               23.2                0.0
     Hard parses:                1.2                0.0
W/A MB processed:                1.0                0.0
          Logons:                0.5                0.0
        Executes:           15,756.5                2.0
       Rollbacks:                0.0                0.0
    Transactions:            7,855.1
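
The figures in this fragment can be cross-checked with simple arithmetic. The Python sketch below uses only numbers copied from the Load Profile above; the variable names are mine.

```python
# Values copied from the "Load Profile" section of the statspack fragment.
redo_bytes_per_sec = 7_685_765.6
redo_bytes_per_txn = 978.4
block_changes_per_sec = 47_167.3
reported_tps = 7_855.1

# Redo bytes per second divided by redo bytes per transaction
# should roughly reproduce the reported transaction rate.
estimated_tps = redo_bytes_per_sec / redo_bytes_per_txn

# Block changes per second divided by transactions per second
# should roughly reproduce the "Per Transaction" column.
block_changes_per_txn = block_changes_per_sec / reported_tps

print(round(estimated_tps, 1), round(block_changes_per_txn, 1))
```

The estimate agrees with the reported Transactions line to within rounding, which is a quick sanity check that tps is a reasonable metric to read from such a report.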

Returning to the laboratory work.
Other things being equal, we will vary the values of the following parameters of the laboratory DBMS:

  1. Size of the database redo log groups. Value range: [32,1024] MB;
  2. Number of redo log groups in the database. Value range: [2,32];
  3. log_archive_max_processes value range: [1,8];
  4. commit_logging two values are allowed: batch|immediate;
  5. commit_wait two values are allowed: wait|nowait;
  6. log_buffer value range: [2,128] MB;
  7. log_checkpoint_timeout value range: [60,1200] seconds;
  8. db_writer_processes value range: [1,4];
  9. undo_retention value range: [30,300] seconds;
  10. transactions_per_rollback_segment value range: [1,8];
  11. disk_asynch_io two values are allowed: true|false;
  12. filesystemio_options the following values are allowed: none|setall|directIO|asynch;
  13. db_block_checking the following values are allowed: OFF|LOW|MEDIUM|FULL;
  14. db_block_checksum the following values are allowed: OFF|TYPICAL|FULL.

A person with experience in maintaining Oracle databases can certainly already say what should be set, and to which values, from the listed parameters and their acceptable values, in order to get higher database productivity for the work with data that is described by the application code above.

But.

The point of the laboratory work is to show that the optimization algorithm itself will clarify this for us relatively quickly.

For us, all that remains is to look into the documentation for the system being configured, just enough to find out which parameters to change and in what ranges.
And also: to write the code that will be used to work with the customized system from the selected optimization algorithm.

So, now about the code.
I spoke above about cran-r, that is: all manipulations with the customized system are orchestrated in the form of an R script.

The actual task of analysing and selecting system state vectors by the value of the metric is done by the GA package (documentation).
The package, in this case, is not a perfect fit, in the sense that it expects vectors (chromosomes, in the package's terms) to be specified as sequences of numbers with a fractional part.

And my vector, made of the values of the setting parameters, consists of 14 quantities: integers and string values.

The problem, of course, is easily avoided by assigning specific numbers to the string values.
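
One way to picture this encoding: each string-valued parameter gets a numeric interval, and the real-valued gene is mapped back to a string by thresholds. The Python sketch below mirrors the thresholds that the evaluate function in this article applies to filesystemio_options (gene range 0..40, cut points at 10, 20 and 30). The helper name decode_choice is mine and is not part of any package.

```python
def decode_choice(gene, choices, step=10):
    """Map a real-valued gene to one of several string values.

    The gene is rounded first; values up to `step` select choices[0],
    values in (step, 2*step] select choices[1], and so on.
    Anything above the last cut point selects the last choice.
    """
    value = round(gene)
    index = 0
    while index < len(choices) - 1 and value > step * (index + 1):
        index += 1
    return choices[index]


FILESYSTEMIO = ["none", "setall", "directIO", "asynch"]

for gene in (3.2, 10.4, 17.9, 25.0, 38.6):
    print(gene, "->", decode_choice(gene, FILESYSTEMIO))
```

The same helper covers the three-valued db_block_checksum by passing a shorter list of choices.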

Thus, in the end, the main piece of the R script looks like this:

Calling GA::ga

cat( "", file=v_logfile, sep="\n", append=F)

pSize = 10
elitism_value=1
pmutation_coef=0.8
pcrossover_coef=0.1
iterations=50

gam=GA::ga(type="real-valued", fitness=evaluate,
lower=c(32,2, 1,1,1,2,60,1,30,1,0,0, 0,0), upper=c(1024,32, 8,10,10,128,800,4,300,8,10,40, 40,30),
popSize=pSize,
pcrossover = pcrossover_coef,
pmutation = pmutation_coef,
maxiter=iterations,
run=4,
keepBest=T)
cat( "GA-session is done" , file=v_logfile, sep="\n", append=T)
gam@solution

Here, with the help of the lower and upper attributes of the ga subroutine, a region of the search space is essentially specified, within which the search is performed for the vector (or vectors) that gives the maximum value of the fitness function.

The ga subroutine performs the search by maximizing the fitness function.
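
To make the idea concrete, here is a deliberately small real-valued genetic algorithm in Python. It is not the GA package's implementation, only a sketch of the same kind of loop (score the population, keep the best, cross over, mutate, stay within lower/upper). The toy fitness function and all names are assumptions for illustration.

```python
import random


def genetic_maximize(fitness, lower, upper, pop_size=10, iterations=50,
                     p_mutation=0.8, elitism=1, seed=0):
    """Tiny real-valued GA: returns (best_vector, best_fitness)."""
    rng = random.Random(seed)
    dim = len(lower)

    def random_vector():
        return [rng.uniform(lower[i], upper[i]) for i in range(dim)]

    population = [random_vector() for _ in range(pop_size)]
    for _ in range(iterations):
        scored = sorted(population, key=fitness, reverse=True)
        next_gen = [list(v) for v in scored[:elitism]]  # elitism: keep the best
        parents = scored[: max(2, pop_size // 2)]       # truncation selection
        while len(next_gen) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [a[i] if rng.random() < 0.5 else b[i] for i in range(dim)]
            if rng.random() < p_mutation:
                i = rng.randrange(dim)
                child[i] = rng.uniform(lower[i], upper[i])  # mutate one gene
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    return best, fitness(best)


# Toy fitness with a known maximum at (64, 2); a real one would run a load test.
def toy_fitness(v):
    return -((v[0] - 64) ** 2) - ((v[1] - 2) ** 2)


best, score = genetic_maximize(toy_fitness, lower=[2, 1], upper=[128, 4])
print(best, score)
```

In this sketch the fitness function is called more than once per individual, which is harmless for a toy function. When each call is a load test lasting several minutes, the results would need to be cached.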

Well, then it turns out that, in this case, the fitness function must interpret the vector as a set of values for certain DBMS parameters and obtain a metric from the DBMS.

That is: how many transactions per second the DBMS processes with the given DBMS configuration and the given load on the DBMS.

That is, when unfolded, the following multi-step procedure must be performed inside the fitness function:

  1. Processing the input vector of numbers: converting it into values for the DBMS parameters.
  2. An attempt to create the given number of redo groups of the given size. Moreover, the attempt may be unsuccessful.
    Redo groups that already existed in the DBMS, in some quantity and of some size, must be deleted for the purity of the experiment.
  3. If the previous step succeeded: setting the values of the configuration parameters in the DBMS (again: there may be a failure).
  4. If the previous step succeeded: stopping the DBMS and starting it again, so that the newly specified parameter values take effect (again: there may be a failure).
  5. If the previous step succeeded: performing the load test and getting the metric from the DBMS.
  6. Returning the DBMS to its original state, i.e. deleting the additional log groups and restoring the original DBMS configuration.
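
The sequence above can be sketched as a small Python skeleton. The step functions passed in are hypothetical placeholders (in this article the real ones are R wrappers around shell scripts). The point is only the control flow: any failed step yields a fitness of 0, and in this sketch the database is restored in every case.

```python
def evaluate_config(config, steps):
    """Run one tuning experiment; return the metric, or 0 on any failure.

    `steps` is a dict of callables supplied by the caller (all hypothetical):
    create_redo_groups, apply_parameters and restart_db return True on
    success, run_load_test returns the metric, restore_db undoes the changes.
    """
    metric = 0
    try:
        if (steps["create_redo_groups"](config)
                and steps["apply_parameters"](config)
                and steps["restart_db"]()):
            metric = steps["run_load_test"]()
    finally:
        # Step 6: always return the database to its original state.
        steps["restore_db"](config)
    return metric


# Usage with stub steps that only record what was called.
calls = []
stub_steps = {
    "create_redo_groups": lambda cfg: calls.append("groups") or True,
    "apply_parameters": lambda cfg: calls.append("params") or False,  # fails
    "restart_db": lambda: calls.append("restart") or True,
    "run_load_test": lambda: 7855.1,
    "restore_db": lambda cfg: calls.append("restore"),
}
print(evaluate_config({"rg_size": 256}, stub_steps), calls)
```

Returning 0 for a configuration that cannot be applied is a simple way to tell a maximizing search that this region of the space is worthless, without stopping the whole run.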

Fitness function code

evaluate=function(p_par) {
v_module="evaluate"
v_metric=0
opn=NULL
opn$rg_size=round(p_par[1],digit=0)
opn$rg_count=round(p_par[2],digit=0)
opn$log_archive_max_processes=round(p_par[3],digit=0)
opn$commit_logging="BATCH"
if ( round(p_par[4],digit=0) > 5 ) {
 opn$commit_logging="IMMEDIATE"
}
opn$commit_logging=paste("'", opn$commit_logging, "'",sep="")

opn$commit_wait="WAIT"
if ( round(p_par[5],digit=0) > 5 ) {
 opn$commit_wait="NOWAIT"
}
opn$commit_wait=paste("'", opn$commit_wait, "'",sep="")

opn$log_buffer=paste(round(p_par[6],digit=0),"m",sep="")
opn$log_checkpoint_timeout=round(p_par[7],digit=0)
opn$db_writer_processes=round(p_par[8],digit=0)
opn$undo_retention=round(p_par[9],digit=0)
opn$transactions_per_rollback_segment=round(p_par[10],digit=0)
opn$disk_asynch_io="true"
if ( round(p_par[11],digit=0) > 5 ) {
 opn$disk_asynch_io="false"
} 

opn$filesystemio_options="none"
if ( round(p_par[12],digit=0) > 10 && round(p_par[12],digit=0) <= 20 ) {
 opn$filesystemio_options="setall"
}
if ( round(p_par[12],digit=0) > 20 && round(p_par[12],digit=0) <= 30 ) {
 opn$filesystemio_options="directIO"
}
if ( round(p_par[12],digit=0) > 30 ) {
 opn$filesystemio_options="asynch"
}

opn$db_block_checking="OFF"
if ( round(p_par[13],digit=0) > 10 && round(p_par[13],digit=0) <= 20 ) {
 opn$db_block_checking="LOW"
}
if ( round(p_par[13],digit=0) > 20 && round(p_par[13],digit=0) <= 30 ) {
 opn$db_block_checking="MEDIUM"
}
if ( round(p_par[13],digit=0) > 30 ) {
 opn$db_block_checking="FULL"
}

opn$db_block_checksum="OFF"
if ( round(p_par[14],digit=0) > 10 && round(p_par[14],digit=0) <= 20 ) {
 opn$db_block_checksum="TYPICAL"
}
if ( round(p_par[14],digit=0) > 20 ) {
 opn$db_block_checksum="FULL"
}

v_vector=paste(round(p_par[1],digit=0),round(p_par[2],digit=0),round(p_par[3],digit=0),round(p_par[4],digit=0),round(p_par[5],digit=0),round(p_par[6],digit=0),round(p_par[7],digit=0),round(p_par[8],digit=0),round(p_par[9],digit=0),round(p_par[10],digit=0),round(p_par[11],digit=0),round(p_par[12],digit=0),round(p_par[13],digit=0),round(p_par[14],digit=0),sep=";")
cat( paste(v_module," try to evaluate vector: ", v_vector,sep="") , file=v_logfile, sep="\n", append=T)

rc=make_additional_rgroups(opn)
if ( rc!=0 ) {
 cat( paste(v_module," make_additional_rgroups failed",sep="") , file=v_logfile, sep="\n", append=T)
 return (0)
}

v_rc=0
rc=set_db_parameter("log_archive_max_processes", opn$log_archive_max_processes)
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("commit_logging", opn$commit_logging )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("commit_wait", opn$commit_wait )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("log_buffer", opn$log_buffer )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("log_checkpoint_timeout", opn$log_checkpoint_timeout )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("db_writer_processes", opn$db_writer_processes )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("undo_retention", opn$undo_retention )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("transactions_per_rollback_segment", opn$transactions_per_rollback_segment )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("disk_asynch_io", opn$disk_asynch_io )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("filesystemio_options", opn$filesystemio_options )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("db_block_checking", opn$db_block_checking )
if ( rc != 0 ) {  v_rc=1 }
rc=set_db_parameter("db_block_checksum", opn$db_block_checksum )
if ( rc != 0 ) {  v_rc=1 }

if ( v_rc!=0 ) {
 cat( paste(v_module," can not startup db with that vector of settings",sep="") , file=v_logfile, sep="\n", append=T)
 rc=stop_db("immediate")
 rc=create_spfile()
 rc=start_db("")
 rc=remove_additional_rgroups(opn)
 return (0)
}

rc=stop_db("immediate")
rc=start_db("")
if ( rc!=0 ) {
 cat( paste(v_module," can not startup db with that vector of settings",sep="") , file=v_logfile, sep="\n", append=T)
 rc=stop_db("abort")
 rc=create_spfile()
 rc=start_db("")
 rc=remove_additional_rgroups(opn)
 return (0)
}

rc=run_test()
v_metric=getmetric()

rc=stop_db("immediate")
rc=create_spfile()
rc=start_db("")
rc=remove_additional_rgroups(opn)

cat( paste("result: ",v_metric," ",v_vector,sep="") , file=v_logfile, sep="\n", append=T)
return (v_metric)
}

That is, all the work is performed inside the fitness function.

The ga subroutine processes vectors or, more correctly, chromosomes.
In this process, what matters most to us is the selection of chromosomes with genes for which the fitness function produces large values.

This, in essence, is the process of searching for the optimal set of chromosomes using a vector in an N-dimensional search space.

There is a very clear, detailed explanation, with examples of R code, of how a genetic algorithm works.

I would like to note two technical points separately.

Auxiliary calls from the evaluate function, for example stop and start, or setting the value of a DBMS parameter, are performed via the cran-r function system2

With its help some bash script or command is called.

For example:

set_db_parameter

set_db_parameter=function(p1, p2) {
v_module="set_db_parameter"
v_cmd="/home/oracle/testingredotracе/set_db_parameter.sh"
v_args=paste(p1," ",p2,sep="")

x=system2(v_cmd, args=v_args, stdout=T, stderr=T, wait=T)
if ( length(attributes(x)) > 0 ) {
 cat(paste(v_module," failed with: ",attributes(x)$status," ",v_cmd," ",v_args,sep=""), file=v_logfile, sep="\n", append=T)
 return (attributes(x)$status)
}
else {
 cat(paste(v_module," ok: ",v_cmd," ",v_args,sep=""), file=v_logfile, sep="\n", append=T)
 return (0)
}
}
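
For readers more at home in Python, the same pattern (run an external script, log the outcome, return its exit status) could look roughly like this. It is a sketch built on the standard subprocess module; the function name and the log format are mine, and a Python one-liner stands in for the real shell script.

```python
import subprocess
import sys


def run_helper(cmd, args, log):
    """Run an external command; return its exit status (0 means success)."""
    result = subprocess.run([cmd, *args], capture_output=True, text=True)
    if result.returncode == 0:
        status = "ok"
    else:
        status = f"failed with: {result.returncode}"
    log.append(f"run_helper {status}: {cmd} {' '.join(args)}")
    return result.returncode


log = []
# Stand-in for a script such as set_db_parameter.sh: exits with the given code.
rc_ok = run_helper(sys.executable, ["-c", "raise SystemExit(0)"], log)
rc_bad = run_helper(sys.executable, ["-c", "raise SystemExit(3)"], log)
print(rc_ok, rc_bad)
```

Passing the arguments as a list avoids a shell and any quoting problems with parameter values such as 'BATCH'.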

The second point is the line of the evaluate function that saves the specific metric value and its corresponding tuning vector to the log file:

cat( paste("result: ",v_metric," ",v_vector,sep="") , file=v_logfile, sep="\n", append=T)

This is important, because from this data array it will be possible to obtain additional information about which of the components of the tuning vector has a greater or lesser effect on the metric value.

That is: it will be possible to carry out an attribute-importance analysis.
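
Since every experiment is written as a "result: METRIC VECTOR" line, the log can be turned into a table with a few lines of code. The Python sketch below assumes exactly that line format (the metric, a space, then 14 semicolon-separated integers). The sample lines are invented for illustration and are not taken from the author's log.

```python
def parse_results(lines):
    """Extract (metric, vector) pairs from 'result: ...' log lines."""
    rows = []
    for line in lines:
        if not line.startswith("result: "):
            continue  # skip progress and error messages
        metric_text, vector_text = line[len("result: "):].split(" ", 1)
        vector = [int(x) for x in vector_text.strip().split(";")]
        rows.append((float(metric_text), vector))
    return rows


# Invented sample lines in the format produced by the evaluate function.
sample_log = [
    "evaluate try to evaluate vector: 100;4;2;3;7;64;300;2;120;4;1;15;5;12",
    "result: 7900.5 100;4;2;3;7;64;300;2;120;4;1;15;5;12",
    "result: 2100 512;8;4;8;2;16;600;1;60;2;9;35;25;25",
]
rows = parse_results(sample_log)
rows.sort(key=lambda row: row[0])  # ascending metric order
print(rows[0][0], rows[-1][0])
```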

So what can come out of this?

In graph form, if you order the tests by ascending metric value, the picture is as follows:

[Figure: metric values of the tests, ordered by ascending metric]

Some data corresponding to the extreme values of the metric:
[Figure: screenshot with the results for the extreme metric values]
Here, in the screenshot with the results, I should clarify: the values of the tuning vector are given in terms of the fitness function code, not in terms of the numbered list of parameters and ranges of parameter values formulated above in the text.

Well. Whether ~8 thousand tps is a lot or a little is a separate question.
Within the framework of the laboratory work this figure is not important; what is important is the dynamics, how this value changes.

The dynamics here are good.
It is obvious that at least one factor significantly influences the value of the metric, and the ga algorithm, sorting through the chromosome vectors, has covered it.
Judging by the fairly vigorous dynamics of the curve, there is at least one more factor that, although much smaller, also has an influence.

This is where you need attribute-importance analysis, to understand which attributes (well, in this case, components of the tuning vector) influence the metric value, and how strongly.
And from this information: to understand which factors were affected by the changes in the significant attributes.

Attribute-importance analysis can be run in different ways.

For these purposes, I like the randomForest algorithm, from the R package of the same name (documentation)
randomForest, as I understand its work in general and its approach to assessing the importance of attributes in particular, builds a certain model of the dependence of the response variable on the attributes.

In our case, the response variable is the metric obtained from the database in the load tests: tps;
And the attributes are the components of the tuning vector.

So here randomForest evaluates the importance of each model attribute with two numbers: %IncMSE shows how the presence or absence of this attribute in the model changes the MSE (Mean Squared Error) quality of this model;

And IncNodePurity is a number that reflects how well, based on the values of this attribute, a dataset of observations can be split so that one part contains data with one value of the metric being explained, and the other part contains data with another value of the metric.
Well, that is: to what extent this is a classifying attribute (I saw the clearest Russian-language explanation of RandomForest here).
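
As a rough intuition for the node-purity idea, the Python sketch below computes, for a single attribute, the largest drop in the residual sum of squares that one threshold split can achieve. This is only the per-split building block, not randomForest's actual procedure (which aggregates such decreases over many trees). The data and all names are invented.

```python
def sum_sq(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split_gain(attribute, metric):
    """Largest decrease in sum of squares from splitting on one threshold."""
    total = sum_sq(metric)
    best = 0.0
    for threshold in sorted(set(attribute))[:-1]:
        left = [m for a, m in zip(attribute, metric) if a <= threshold]
        right = [m for a, m in zip(attribute, metric) if a > threshold]
        best = max(best, total - sum_sq(left) - sum_sq(right))
    return best


# Invented observations: tps depends strongly on commit_wait, weakly on undo.
tps = [2000, 2100, 2050, 7800, 7900, 7850]
commit_wait = [1, 2, 3, 8, 9, 7]          # gene value; > 5 means NOWAIT
undo_retention = [30, 300, 120, 60, 250, 90]

gain_commit_wait = best_split_gain(commit_wait, tps)
gain_undo = best_split_gain(undo_retention, tps)
print(gain_commit_wait > gain_undo)
```

An attribute whose values separate low-tps runs from high-tps runs cleanly gets a large gain, which is the sense in which it is a "classifying" attribute.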

Worker-peasant R code for processing the dataset with the results of the load tests:

x=NULL
v_data_file=paste('/tmp/data1.dat',sep="")
x=read.table(v_data_file, header = TRUE, sep = ";", dec=",", quote = "\"'", stringsAsFactors=FALSE)
colnames(x)=c('metric','rgsize','rgcount','lamp','cmtl','cmtw','lgbffr','lct','dbwrp','undo_retention','tprs','disk_async_io','filesystemio_options','db_block_checking','db_block_checksum')

idxTrain=sample(nrow(x),as.integer(nrow(x)*0.7))
idxNotTrain=which(! 1:nrow(x) %in% idxTrain )
TrainDS=x[idxTrain,]
ValidateDS=x[idxNotTrain,]

library(randomForest)
#mtry=as.integer( sqrt(dim(x)[2]-1) )
rf=randomForest(metric ~ ., data=TrainDS, ntree=40, mtry=3, replace=T, nodesize=2, importance=T, do.trace=10, localImp=F)
ValidateDS$predicted=predict(rf, newdata=ValidateDS[,colnames(ValidateDS)!="metric"], type="response")
sum((ValidateDS$metric-ValidateDS$predicted)^2)
rf$importance

You can select the hyperparameters of the algorithm by hand and, focusing on the quality of the model, choose the model that predicts most accurately on the validation dataset.
You can write some kind of function for this work (by the way, again using some kind of optimization algorithm).

You can use the R package caret; which one you choose is not the point.

As a result, in this case, the following estimate of the degree of importance of the attributes is obtained:

[Figure: attribute importance estimates for the components of the tuning vector]

Well. Thus, we can begin global reflections:

  1. It turns out that the most significant parameter, under these testing conditions, was commit_wait
    Technically, it specifies the execution mode of the I/O operation that writes redo data from the DBMS log buffer to the current log group: synchronous or asynchronous.
    The value nowait, which results in an almost vertical, multiple increase in the tps metric, means enabling asynchronous I/O mode for the redo groups.
    A separate question is whether or not you should do this in a production database. Here I limit myself to just stating: this is a significant factor.
  2. It is logical that the size of the DBMS log buffer turns out to be a significant factor.
    The smaller the log buffer, the lower its buffering capacity, and the more often it overflows and/or a free area cannot be allocated in it for a portion of new redo data.
    This means delays associated with allocating space in the log buffer and/or flushing redo data from it into the redo groups.
    These delays, of course, should and do affect the throughput of the database for transactions.
  3. The parameter db_block_checksum: well, this is also generally clear. Transaction processing leads to the formation of dirty blocks in the buffer cache of the DBMS.
    When checksums of data blocks are enabled, the database has to process them: calculate these checksums from the body of the data block and check them against what is written in the data block header: matches or does not match.
    Such work, again, cannot but delay data processing, and accordingly the parameter, and the mechanism it controls, turn out to be significant.
    That is why the vendor offers, in the documentation for this parameter, different values and notes that yes, there will be an impact, but you can choose different values, up to "off", with different impacts.

Well, the global conclusion.

The approach, in general, turns out to be quite workable.

It makes it quite possible, in the early stages of load testing a certain service system, to select an optimal configuration of that system for the load without delving too deeply into the specifics of tuning the system for the load.

But it does not exclude this completely: at least at the level of understanding, you must know the system's "adjustment knobs" and the permissible ranges of rotation of these knobs.

The approach can then find the optimal system configuration relatively quickly.
And based on the results of the testing, it is possible to obtain information about the nature of the relationship between the system's performance metrics and the values of the system's setting parameters.

This, of course, should contribute to the emergence of that very deep understanding of the system and its operation, at least under the given load.

In practice, this is an exchange of the costs of understanding the customized system for the costs of preparing such testing of the system.

I would like to note separately: in this approach, the degree to which the system testing is adequate to the operating conditions it will have in commercial operation is critically important.

Thank you for your attention and time.

source: www.habr.com
