re2c lexer generator 1.2 released

re2c, a free lexical analyzer generator for C and C++, has been released. Recall that re2c was written in 1993 by Peter Bumbulis as an experimental generator of very fast lexical analyzers. It differs from other generators in the speed of the generated code and in an unusually flexible user interface that lets analyzers be embedded easily and efficiently into an existing codebase. Since then the project has been developed by the community, and it remains a platform for experiments and research in formal grammars and finite automata.

The release took almost a year to prepare. As usual, most of that time went into developing the theoretical framework and writing the paper "Efficient POSIX Submatch Extraction on NFA". The algorithms described in the paper are implemented in the experimental library libre2c (building the library and the benchmarks is disabled by default and is enabled with the configure option "--enable-libs"). The library is not intended as a competitor to existing projects such as RE2, but as a research platform for developing new algorithms (which can then be used in re2c or in other projects). It is also convenient for testing, for benchmarking, and for creating bindings to other languages.

Main new features in re2c 1.2:

  • A new, simplified way to check for the end of input has been added
    (the "EOF rule"). It consists of the "re2c:eof" configuration,
    which lets you choose the terminating character,
    and the special rule "$", which fires if the lexer
    has successfully reached the end of input.
    Historically, re2c has offered a choice of several methods for checking
    for the end of input, which differ in their limitations, efficiency and
    ease of use. The new method is designed to simplify writing code
    while remaining efficient and widely applicable. The old methods
    still work and may be preferable in some cases.

  • Added the ability to include external files with the directive
    "/*!include:re2c "file.re" */", where "file.re" is the name of the file to include.
    Re2c looks for files in the directory of the including file,
    as well as in the list of paths specified with the "-I" option.
    Included files may in turn include other files.
    Re2c ships "standard" files in the "include/" directory
    of the project. The expectation is that useful definitions of
    regular expressions will accumulate there, forming something like a standard library.
    So far, by popular demand, one file with definitions of Unicode categories has been added.

  • Added the ability to generate header files with arbitrary
    content using the "-t --type-header" options (or the corresponding
    configurations) and the new directives "/*!header:re2c:on*/" and
    "/*!header:re2c:off*/". This can be useful when
    re2c needs to generate definitions of variables, structures and macros
    that are used in other translation units.

  • Re2c now understands UTF8 literals and character classes in regular expressions.
    By default, re2c parses an expression such as "∀x ∃y" as
    a sequence of 1-byte ASCII characters "e2 88 80 78 20 e2 88 83 79"
    (hex codes), and users have to escape Unicode characters by hand:
    "\u2200x \u2203y". This is inconvenient and unexpected for many
    users (as the regular bug reports confirm). So re2c now
    provides the option "--input-encoding {ascii | utf8}",
    which changes this behavior and parses "∀x ∃y" as
    "2200 78 20 2203 79".

  • Re2c now allows ordinary re2c blocks to be used in "-r --reuse" mode.
    This is convenient when the input file contains many blocks and only some of them
    need to be reused.

  • The format of warnings and error messages can now be set
    with the new option "--location-format {gnu | msvc}". The GNU format is printed
    as "filename:line:column:", and the MSVC format as "filename(line,column)".
    This feature may be useful to IDE users.
    A "--verbose" option has also been added; it prints a short message on success.

  • The flex "compatibility" mode has been improved: some parsing errors and
    incorrect operator precedence in rare cases have been fixed.
    Historically, the "-F --flex-support" option has allowed writing code
    in a mix of flex style and re2c style, which makes parsing somewhat harder.
    The compatibility mode is rarely used in new code,
    but re2c continues to support it for backward compatibility.

  • The character class difference operator "\" is now applied
    before the encoding is expanded, which allows it to be used in more cases
    when a variable-length encoding (for example UTF8) is in use.

  • The output file is now created atomically: re2c first creates a temporary file
    and writes the result into it, then renames the temporary file to the output file
    in a single operation.

  • The documentation has been completed and rewritten; in particular, new
    chapters have been added on buffer refilling (http://re2c.org/manual/manual.html#buffer-refilling)
    and on the methods for checking for the end of input.
    The new documentation is assembled as
    a comprehensive single-page manual
    with examples (the same sources are rendered into the manpage and into the online documentation).
    Modest attempts have been made to improve the readability of the site on phones.

  • From the developers' point of view, re2c has gained a more complete debugging
    subsystem. Debug code is now disabled in release builds and
    can be enabled with the configure option "--enable-debug".
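
To make the two readings of "∀x ∃y" from the UTF8 item above concrete, here is a minimal C sketch that decodes a byte sequence into Unicode code points. It is an illustration only and not re2c's actual implementation: the function names utf8_decode and utf8_to_code_points are hypothetical, and the decoder is simplified (it does not reject overlong forms or surrogates).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-8 sequence starting at s, with at
 * most n bytes available. Stores the code point in *cp and returns the
 * number of bytes consumed, or 0 if the sequence is truncated or
 * malformed. Simplified: overlong forms and surrogates are not rejected. */
static size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    size_t len, i;
    uint32_t v;

    if (n == 0) return 0;

    /* The lead byte determines the sequence length and the payload bits. */
    if (s[0] < 0x80)                { len = 1; v = s[0]; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; v = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; v = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; v = s[0] & 0x07; }
    else return 0; /* stray continuation byte or invalid lead byte */

    if (len > n) return 0; /* truncated sequence */

    /* Each continuation byte has the form 10xxxxxx and adds six bits. */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;
        v = (v << 6) | (uint32_t)(s[i] & 0x3F);
    }

    *cp = v;
    return len;
}

/* Hypothetical helper: decode a whole buffer into out (capacity cap).
 * Returns the number of code points written, or (size_t)-1 on malformed
 * input or insufficient capacity. */
static size_t utf8_to_code_points(const unsigned char *s, size_t n,
                                  uint32_t *out, size_t cap)
{
    size_t count = 0;

    while (n > 0) {
        uint32_t cp;
        size_t used = utf8_decode(s, n, &cp);
        if (used == 0 || count == cap) return (size_t)-1;
        out[count++] = cp;
        s += used;
        n -= used;
    }
    return count;
}
```

Given the nine bytes e2 88 80 78 20 e2 88 83 79, utf8_to_code_points produces the five code points 2200 78 20 2203 79, which are the two readings of the same input described in the UTF8 item.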

source: budenet.ru
