What is happening with RDF stores now?

The Semantic Web and Linked Data are like outer space: there is no life there. As for going there for any length of time... I don't know what you were told as a child in response to "I want to be an astronaut." But you can observe what is happening while staying on Earth; becoming an amateur astronomer, or even a professional one, is much easier.

This article focuses on recent trends, no more than a few months old, in the world of RDF stores. The metaphor in the first paragraph was inspired by the epic-sized advertising image under the cut.


Epic image

I. GraphQL for RDF access

GraphQL is said to be aiming to become a universal data access language. What about accessing RDF through GraphQL?

Some stores provide this capability out of the box.

If a store does not provide it, you can implement it yourself by writing a suitable "resolver". This is what was done, for example, in the French DataTourism project. Or you can write nothing at all and simply take HyperGraphQL.
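
As a rough illustration of what such a "resolver" has to do, the Python sketch below turns a flat GraphQL-style field selection into a SPARQL query string. The field-to-predicate mapping, the ex: prefix and the function name are hypothetical and are not taken from any of the projects mentioned; a real resolver would also have to handle nesting, arguments and result shaping.

```python
# Hypothetical mapping from GraphQL field names to RDF predicates.
FIELD_TO_PREDICATE = {
    "name": "ex:name",
    "city": "ex:locatedIn",
}

def selection_to_sparql(type_iri, fields):
    """Build a SPARQL SELECT for one GraphQL type and a flat list of fields.

    PREFIX declarations are omitted for brevity.
    """
    variables = " ".join(f"?{f}" for f in fields)
    patterns = [f"?s a {type_iri} ."]
    for f in fields:
        # OPTIONAL mirrors a nullable GraphQL field: a missing value
        # should not drop the whole object from the result.
        patterns.append(f"OPTIONAL {{ ?s {FIELD_TO_PREDICATE[f]} ?{f} }}")
    body = "\n  ".join(patterns)
    return f"SELECT ?s {variables} WHERE {{\n  {body}\n}}"

query = selection_to_sparql("ex:Museum", ["name", "city"])
print(query)
```

The sketch also hints at where the rigidity comes from: every field the client may ask for has to be known to the mapping in advance.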

From the point of view of an orthodox adherent of the Semantic Web and Linked Data, all of this is, of course, sad, because it seems designed for integrations built around yet another data silo rather than around proper platforms (RDF stores, of course).

Comparing GraphQL with SPARQL leaves a twofold impression.

  • On the one hand, GraphQL looks like a distant relative of SPARQL: it solves the over-fetching and multiple-request problems typical of REST, without which it probably could not be considered a query language, at least for the web;
  • On the other hand, GraphQL's rigid schema is disappointing. As a result, its "introspectiveness" looks very limited compared with the full reflexivity of RDF. And it has no analogue of property paths, so it is not even clear why it is "Graph-".

II. Adapters for MongoDB

A trend complementary to the previous one.

  • In Stardog it is now possible, notably through the same GraphQL, to configure mappings of MongoDB data into virtual RDF graphs;
  • Ontotext GraphDB has recently started to allow fragments written in MongoDB Query to be embedded in SPARQL.

Speaking more broadly about adapters to JSON sources, which make it possible to present the JSON stored in those sources as RDF more or less "on the fly", one can recall the long-standing SPARQL-Generate, which can be attached, for example, to Apache Jena.
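
To make "presenting JSON as RDF on the fly" concrete, here is a minimal Python sketch. It is not how SPARQL-Generate or the MongoDB adapters above work; the base namespace, the function name and the mapping rules (keys become predicates, nested objects become blank nodes) are assumptions made for illustration only.

```python
import json
from itertools import count

BASE = "http://example.org/"  # hypothetical namespace

def json_to_triples(doc, subject, counter=None):
    """Yield (subject, predicate, object) triples for a JSON object."""
    counter = counter if counter is not None else count(1)
    for key, value in doc.items():
        predicate = f"<{BASE}{key}>"
        if value is None:
            continue  # JSON null has no direct RDF counterpart; skip it
        if isinstance(value, dict):
            # A nested object becomes a blank node with its own triples.
            node = f"_:b{next(counter)}"
            yield (subject, predicate, node)
            yield from json_to_triples(value, node, counter)
        elif isinstance(value, list):
            # Each array element yields a separate triple; order is lost.
            for item in value:
                yield from json_to_triples({key: item}, subject, counter)
        else:
            # Strings, numbers and booleans are serialized as Turtle-style literals.
            yield (subject, predicate, json.dumps(value))

doc = json.loads('{"name": "Alice", "age": 30, "address": {"city": "Paris"}, "tags": ["a", "b"]}')
triples = list(json_to_triples(doc, "<http://example.org/alice>"))
for t in triples:
    print(*t, ".")
```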

Summing up the first two trends, we can say that RDF stores show full readiness for integration and for operating in "polyglot persistence" conditions. It is known, however, that the latter has long been out of fashion and is being replaced by multi-model. What about multi-modeling in the world of RDF stores?

In short, nothing. I would like to devote a separate article to multi-model DBMSs, but for now it can be noted that there are currently no multi-model DBMSs "based" on a graph model (RDF can be considered a variety of one). A kind of small-scale multi-modeling, namely RDF stores' support for the alternative LPG graph model, is discussed in Section V.

III. OLTP vs. OLAP

However, the same Gartner writes that multi-model is a sine qua non primarily for operational DBMSs. This is understandable: under "polyglot persistence", the main problems arise with transactionality.

But where are RDF stores on the OLTP-OLAP scale? I would answer this way: neither here nor there. To denote what they are intended for, a third abbreviation is needed. As an option, I would suggest OLIP (Online Intellectual Processing).

Nevertheless:

  • the MongoDB integration mechanisms implemented in GraphDB are intended, not least, to work around write performance problems;
  • Stardog goes even further and completely rewrites its engine, again with the goal of improving write performance.

Now let me introduce a new player on the market. From the creators of IBM Netezza and Amazon Redshift: AnzoGraph™. The image from an advertisement for a product based on it appears at the beginning of this article. AnzoGraph positions itself as a GOLAP solution. How do you like SPARQL with window functions?

SELECT ?month (COUNT(?event) OVER (PARTITION BY ?month) AS ?events) WHERE {  …  }
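
For readers who have not met window functions, the following Python sketch emulates what COUNT(?event) OVER (PARTITION BY ?month) computes, assuming AnzoGraph follows the usual SQL window semantics. The sample bindings are invented, since the WHERE clause above is elided.

```python
from collections import Counter

# Hypothetical (event, month) bindings standing in for the elided WHERE clause.
rows = [("e1", "2019-01"), ("e2", "2019-01"), ("e3", "2019-02")]

def count_over_partition(rows):
    """Emulate COUNT(?event) OVER (PARTITION BY ?month)."""
    per_month = Counter(month for _event, month in rows)
    # Unlike GROUP BY, a window function keeps one output row per input row
    # and attaches the partition-wide count to each of them.
    return [(month, per_month[month]) for _event, month in rows]

result = count_over_partition(rows)
print(result)
```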

IV. RocksDB

The Stardog 7 Beta announcement linked above says that Stardog is going to use RocksDB as its underlying storage system. RocksDB is a key-value store, Facebook's fork of Google's LevelDB. Why is it worth talking about a trend here?

First, judging by the Wikipedia article, RDF stores are not the only ones being "transplanted" onto RocksDB. There are projects to use RocksDB as a storage engine in ArangoDB, MongoDB, MySQL and MariaDB, and Cassandra.

Second, projects (that is, not products) on relevant topics are being built on RocksDB.

For example, eBay uses RocksDB in its platform for a "knowledge graph". By the way, it is amusing to read that the query language started out as a home-grown format but has recently been transitioning to be much more like SPARQL. As in the joke: no matter how many knowledge graphs we build, we still end up with RDF.

Another example is the Wikidata History Query Service, which appeared a few months ago. Before its introduction, historical Wikidata information had to be accessed through MWAPI calls to the standard MediaWiki API. Now a lot is possible with pure SPARQL. "Under the hood" there is RocksDB as well. By the way, WDHQS was made, it seems, by the person who imported Freebase into the Google Knowledge Graph.

V. LPG support

Let me remind you of the main difference between LPG graphs and RDF graphs.

In LPG, scalar properties can be assigned to edge instances, while in RDF they can only be assigned to edge "types" (though not only scalar properties, but also ordinary relations). This limitation of RDF compared with LPG is overcome by one modeling technique or another. The limitations of LPG compared with RDF are harder to overcome, but LPG graphs look more like the pictures in Harary's textbook than RDF graphs do, which is why people want them.

Obviously, the task of "LPG support" falls into two parts:

  1. making changes to the RDF model that allow LPG constructs to be emulated in it;
  2. making changes to the RDF query language that allow data in this modified model to be accessed, or implementing the ability to query this model in popular LPG query languages.

V.1. Data model

Several approaches are possible here.

V.1.1. Singleton Property

The most head-on approach to reconciling RDF and LPG is probably the singleton property:

  • Instead of, for example, the predicate :isMarriedTo, the predicates :isMarriedTo1, :isMarriedTo2, etc. are used.
  • These predicates then become subjects of new triples: :isMarriedTo1 :since "2013-09-13"^^xsd:date, etc.
  • The link between these predicate instances and the common predicate is established by triples of the form :isMarriedTo1 rdf:singletonPropertyOf :isMarriedTo.
  • Obviously, rdf:singletonPropertyOf rdfs:subPropertyOf rdf:type, but think about why you should not simply write :isMarriedTo1 rdf:type :isMarriedTo.
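
The bullet points above can be sketched in a few lines of Python. The helper name and the numbering scheme are assumptions for illustration; only the :isMarriedTo example and rdf:singletonPropertyOf come from the text.

```python
from itertools import count

_suffix = count(1)  # source of the numeric suffixes: 1, 2, ...

def singleton_triples(s, p, o, annotations):
    """Return the triples stating (s, p, o) together with edge-level annotations."""
    sp = f"{p}{next(_suffix)}"  # a fresh predicate, e.g. :isMarriedTo1
    triples = [
        (s, sp, o),                          # the statement itself
        (sp, "rdf:singletonPropertyOf", p),  # link back to the generic predicate
    ]
    # The singleton predicate is the subject of the annotation triples.
    triples += [(sp, key, value) for key, value in annotations.items()]
    return triples

triples = singleton_triples(
    ":bob", ":isMarriedTo", ":alice", {":since": '"2013-09-13"^^xsd:date'}
)
for t in triples:
    print(*t, ".")
```

One visible cost of the technique: finding all marriages now means going through rdf:singletonPropertyOf, because no triple uses :isMarriedTo as its predicate any more.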

Here the "LPG support" problem is solved at the RDFS level. Such a solution needs to be included in the corresponding standard. Some changes may be required in RDF stores that support entailment, but for now Singleton Property can be regarded as just another modeling technique.

V.1.2. Reification Done Right

Less naive approaches stem from the realization that property instances are fully identified by triples. By being able to say something about triples, we can talk about property instances.

The most robust of these approaches is RDF*, also known as RDR, born in the depths of Blazegraph. AnzoGraph also chose it from the start. The robustness of the approach comes from the fact that corresponding changes to RDF Semantics are proposed within its framework. The idea, however, is very simple. In the Turtle serialization of RDF you can now write something like this:

<<:bob :isMarriedTo :alice>> :since "2013-09-13"^^xsd:date .
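
One way to picture this, sketched in Python under the assumption that a quoted triple can be modelled as a nested tuple (this is an illustration, not how Blazegraph or AnzoGraph store data): the triple itself sits in subject position, so no extra identifier or helper predicate is needed.

```python
# A quoted triple is modelled as a plain tuple used in subject position.
marriage = (":bob", ":isMarriedTo", ":alice")

graph = {
    (marriage, ":since", '"2013-09-13"^^xsd:date'),
}

def annotations_of(graph, triple):
    """Return {predicate: object} for statements whose subject is `triple`."""
    return {p: o for s, p, o in graph if s == triple}

since = annotations_of(graph, marriage)
print(since)
```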

V.1.3. Other approaches

You can leave formal semantics aside and simply assume that triples have certain identifiers, which are, of course, URIs, and create new triples with these URIs. All that remains is to provide access to these URIs in SPARQL. This is what Stardog does.

AllegroGraph took an intermediate path. It is known that triple identifiers exist in AllegroGraph, but the implementation of triple attributes does not expose them. Still, this is quite far from formal semantics. It is noteworthy that triple attributes are not URIs, and their values can only be literals. LPG adherents get exactly what they want. In the specially designed NQX format, an example similar to the one above for RDF* looks like this:

:bob :marriedTo :alice {"since" : "2013-09-13"}

V.2. Query languages

Having supported LPG one way or another at the model level, you need to make it possible to query data in such a model.

  • Blazegraph supports SPARQL* and Gremlin for querying RDF*. A SPARQL* query looks like this:

 SELECT * { <<:bob :isMarriedTo ?wife>> :since ?since }

  • AnzoGraph also supports SPARQL* and is going to support Cypher, the query language of Neo4j.
  • Stardog supports its own SPARQL extension and, again, Gremlin. You can get the URI of a triple and its "meta-information" in SPARQL using something like this:

SELECT * {
    BIND (stardog:identifier(:bob, :isMarriedTo, ?wife) AS ?id)
    ?id :since ?since
}

  • AllegroGraph also supports its own SPARQL extension:

 SELECT * { ("since" ?since)  franz:attributesNameValue  ( :bob :marriedTo ?wife ) }

By the way, GraphDB at one time supported Tinkerpop/Gremlin without supporting LPG, but that stopped in version 8.0 or 8.1.

VI. Tightening of licenses

There have been no recent additions to the intersection of the sets "triplestore of choice" and "open-source triplestore". New open-source RDF stores are far from being a good choice for everyday use, and the new triplestores I would like to use (such as AnzoGraph) are closed-source. If anything, we can speak of a reduction...

Of course, open source is not being closed retroactively, but some open-source stores are gradually ceasing to be seen as worth choosing. Virtuoso, which has an open-source edition, is, in my opinion, drowning in bugs. Blazegraph was bought by AWS and formed the basis of Amazon Neptune; it is now unclear whether there will be even one more release. Only Jena remains...

If open source is not that important and you just want to try things out, then everything is also less rosy than before. For example:

  • Stardog has stopped distributing its free version (although the trial period of the regular version has doubled);
  • in GraphDB Cloud, where you could previously choose a free basic plan, new user registrations have been suspended.

In general, for the average IT person, space is becoming more and more inaccessible; its exploration is becoming the lot of corporations.

source: www.habr.com