Nthawi zambiri, mukamagwira ntchito ndi yankho lolandiridwa kuchokera ku API, kapena ndi data ina iliyonse yomwe ili ndi mtengo wovuta, mumakumana ndi mawonekedwe a JSON ndi XML.
Mawonekedwewa ali ndi zabwino zambiri: amasunga deta molumikizana bwino ndikupewa kubwereza kosafunikira.
Choyipa cha mawonekedwewa ndizovuta kuzikonza ndikuzisanthula. Zosasinthika sizingagwiritsidwe ntchito powerengera kapena kuwona.

Nkhaniyi ndi kupitiriza komveka kwa kufalitsa Zikuthandizani kuti musinthe ma data osakhazikika kukhala mawonekedwe odziwika bwino komanso osasunthika mosavuta pogwiritsa ntchito phukusi. tidyr, yomwe ili gawo la maziko a library tidyverse, ndi ntchito zake za m’banja unnest_*().
Zamkatimu
Ngati mukufuna kusanthula deta, mungakhale ndi chidwi changa и njira. Zambiri mwazomwe zimaperekedwa ku chilankhulo cha R.
Mau oyamba
Rectangling (Zomwe womasulira: Sindinapeze zomasulira zokwanira za liwuli, ndiye tisiya momwe zilili.) - ndi njira yosinthira deta yosasinthika yokhala ndi zisa kukhala tebulo lamitundu iwiri yokhala ndi mizere yodziwika bwino. tidyr Pali ntchito zingapo zomwe zingakuthandizeni kukulitsa mizati ya mndandanda ndikusintha deta kukhala mawonekedwe amakona anayi, tabular:
unnest_longer()imatenga chinthu chilichonse chamndandanda ndikupanga mzere watsopano.unnest_wider()amatenga chinthu chilichonse chamndandanda wazambiri ndikupanga gawo latsopano.unnest_auto()zimangozindikira kuti ndi ntchito iti yomwe ili yabwino kugwiritsa ntchito
unnest_longer()kapenaunnest_wider().hoist()ofanana ndiunnest_wider()koma amasankha zigawo zomwe zafotokozedwa ndikulola kugwira ntchito ndi magawo angapo a zisa.
Mavuto ambiri okhudzana ndi kutembenuza deta yosasinthika ndi magawo angapo a zisa ku tebulo lamitundu iwiri akhoza kuthetsedwa mwa kuphatikiza ntchito zomwe zalembedwa ndi dplyr.
Kuti tiwonetse njirazi, tidzagwiritsa ntchito phukusi repurrrsive, yomwe imapereka mindandanda yambiri yovuta, yamagulu angapo yotengedwa kuchokera pa intaneti ya API.
library(tidyr)
library(dplyr)
library(repurrrsive)Ogwiritsa ntchito a GitHub
Yambani ndi gh_ogwiritsa, mndandanda womwe uli ndi zambiri za ogwiritsa ntchito a GitHub asanu ndi limodzi. Choyamba, tiyeni tisinthe mndandanda gh_ogwiritsa в tibble chimango:
users <- tibble( user = gh_users ) Izi zikuwoneka ngati zotsutsana: chifukwa chiyani mumapereka mndandanda? gh_ogwiritsa, ku dongosolo la deta lovuta kwambiri? Koma chimango cha deta chili ndi ubwino waukulu: chimaphatikizapo ma vector angapo, kotero chirichonse chimatsatiridwa mu chinthu chimodzi.
Chigawo chilichonse cha chinthucho users ndi mndandanda wamatchulidwe pomwe chinthu chilichonse chikuyimira mzati.
names(users$user[[1]])
#> [1] "login" "id" "avatar_url"
#> [4] "gravatar_id" "url" "html_url"
#> [7] "followers_url" "following_url" "gists_url"
#> [10] "starred_url" "subscriptions_url" "organizations_url"
#> [13] "repos_url" "events_url" "received_events_url"
#> [16] "type" "site_admin" "name"
#> [19] "company" "blog" "location"
#> [22] "email" "hireable" "bio"
#> [25] "public_repos" "public_gists" "followers"
#> [28] "following" "created_at" "updated_at"Pali njira ziwiri zosinthira zigawo za mndandanda kukhala mizati. unnest_wider() amatenga gawo lililonse ndikupanga gawo latsopano:
users %>% unnest_wider(user)
#> # A tibble: 6 x 30
#> login id avatar_url gravatar_id url html_url followers_url
#> <chr> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 gabo… 6.60e5 https://a… "" http… https:/… https://api.…
#> 2 jenn… 5.99e5 https://a… "" http… https:/… https://api.…
#> 3 jtle… 1.57e6 https://a… "" http… https:/… https://api.…
#> 4 juli… 1.25e7 https://a… "" http… https:/… https://api.…
#> 5 leep… 3.51e6 https://a… "" http… https:/… https://api.…
#> 6 masa… 8.36e6 https://a… "" http… https:/… https://api.…
#> # … with 23 more variables: following_url <chr>, gists_url <chr>,
#> # starred_url <chr>, subscriptions_url <chr>, organizations_url <chr>,
#> # repos_url <chr>, events_url <chr>, received_events_url <chr>,
#> # type <chr>, site_admin <lgl>, name <chr>, company <chr>, blog <chr>,
#> # location <chr>, email <chr>, public_repos <int>, public_gists <int>,
#> # followers <int>, following <int>, created_at <chr>, updated_at <chr>,
#> # bio <chr>, hireable <lgl>Pachifukwa ichi tili ndi tebulo lokhala ndi mizati 30, ndipo sitidzafunikira zambiri, kotero tikhoza unnest_wider() ntchito hoist(). hoist() imatithandiza kuchotsa zigawo zosankhidwa pogwiritsa ntchito mawu ofanana ndi purrr::pluck():
users %>% hoist(user,
followers = "followers",
login = "login",
url = "html_url"
)
#> # A tibble: 6 x 4
#> followers login url user
#> <int> <chr> <chr> <list>
#> 1 303 gaborcsardi https://github.com/gaborcsardi <named list [27]>
#> 2 780 jennybc https://github.com/jennybc <named list [27]>
#> 3 3958 jtleek https://github.com/jtleek <named list [27]>
#> 4 115 juliasilge https://github.com/juliasilge <named list [27]>
#> 5 213 leeper https://github.com/leeper <named list [27]>
#> 6 34 masalmon https://github.com/masalmon <named list [27]>hoist() imachotsa zigawo zomwe zatchulidwa pamndandanda wosuta, kotero inu mukhoza kulingalira hoist() momwe mungasunthire zigawo kuchokera pamndandanda wamkati wa chimango cha data kupita kumlingo wake wapamwamba.
Zosungirako za Github
Kuwongolera mndandanda gh_repos timayamba mofanana, kusandulika kukhala tibble:
repos <- tibble(repo = gh_repos)
repos
#> # A tibble: 6 x 1
#> repo
#> <list>
#> 1 <list [30]>
#> 2 <list [30]>
#> 3 <list [30]>
#> 4 <list [26]>
#> 5 <list [30]>
#> 6 <list [30]>Nthawi iyi zinthu wosuta kuyimira mndandanda wa nkhokwe za wogwiritsa ntchitoyu. Chosungira chilichonse chimakhala chowonera padera, motero malinga ndi lingaliro la data yokonzedwa bwino (chidziwitso: data yokhazikika) ziyenera kukhala mizere yatsopano, chifukwa chake timagwiritsa ntchito unnest_longer() koma ayi unnest_wider():
repos <- repos %>% unnest_longer(repo)
repos
#> # A tibble: 176 x 1
#> repo
#> <list>
#> 1 <named list [68]>
#> 2 <named list [68]>
#> 3 <named list [68]>
#> 4 <named list [68]>
#> 5 <named list [68]>
#> 6 <named list [68]>
#> 7 <named list [68]>
#> 8 <named list [68]>
#> 9 <named list [68]>
#> 10 <named list [68]>
#> # … with 166 more rowsTsopano titha kugwiritsa ntchito unnest_wider() kapena hoist() :
repos %>% hoist(repo,
login = c("owner", "login"),
name = "name",
homepage = "homepage",
watchers = "watchers_count"
)
#> # A tibble: 176 x 5
#> login name homepage watchers repo
#> <chr> <chr> <chr> <int> <list>
#> 1 gaborcsardi after <NA> 5 <named list [65]>
#> 2 gaborcsardi argufy <NA> 19 <named list [65]>
#> 3 gaborcsardi ask <NA> 5 <named list [65]>
#> 4 gaborcsardi baseimports <NA> 0 <named list [65]>
#> 5 gaborcsardi citest <NA> 0 <named list [65]>
#> 6 gaborcsardi clisymbols "" 18 <named list [65]>
#> 7 gaborcsardi cmaker <NA> 0 <named list [65]>
#> 8 gaborcsardi cmark <NA> 0 <named list [65]>
#> 9 gaborcsardi conditions <NA> 0 <named list [65]>
#> 10 gaborcsardi crayon <NA> 52 <named list [65]>
#> # … with 166 more rowsSamalani kugwiritsa ntchito c("owner", "login"): izi zimatipatsa mwayi wopeza mulingo wachiwiri kuchokera pamndandanda wokhazikika ownerNjira ina ndiyo kupeza mndandanda wonse owner ndiyeno kugwiritsa ntchito unnest_wider() ikani chilichonse mwazinthu zake muzambiri:
repos %>%
hoist(repo, owner = "owner") %>%
unnest_wider(owner)
#> # A tibble: 176 x 18
#> login id avatar_url gravatar_id url html_url followers_url
#> <chr> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 gabo… 660288 https://a… "" http… https:/… https://api.…
#> 2 gabo… 660288 https://a… "" http… https:/… https://api.…
#> 3 gabo… 660288 https://a… "" http… https:/… https://api.…
#> 4 gabo… 660288 https://a… "" http… https:/… https://api.…
#> 5 gabo… 660288 https://a… "" http… https:/… https://api.…
#> 6 gabo… 660288 https://a… "" http… https:/… https://api.…
#> 7 gabo… 660288 https://a… "" http… https:/… https://api.…
#> 8 gabo… 660288 https://a… "" http… https:/… https://api.…
#> 9 gabo… 660288 https://a… "" http… https:/… https://api.…
#> 10 gabo… 660288 https://a… "" http… https:/… https://api.…
#> # … with 166 more rows, and 11 more variables: following_url <chr>,
#> # gists_url <chr>, starred_url <chr>, subscriptions_url <chr>,
#> # organizations_url <chr>, repos_url <chr>, events_url <chr>,
#> # received_events_url <chr>, type <chr>, site_admin <lgl>, repo <list>M'malo moganizira kusankha ntchito yoyenera unnest_longer() kapena unnest_wider() mungagwiritse ntchito unnest_auto()Ntchitoyi imagwiritsa ntchito ma heuristic angapo kuti isankhe ntchito yoyenera kwambiri yosinthira deta ndikuwonetsa uthenga wokhudza njira yosankhidwa.
tibble(repo = gh_repos) %>%
unnest_auto(repo) %>%
unnest_auto(repo)
#> Using `unnest_longer(repo)`; no element has names
#> Using `unnest_wider(repo)`; elements have 68 names in common
#> # A tibble: 176 x 67
#> id name full_name owner private html_url description fork url
#> <int> <chr> <chr> <lis> <lgl> <chr> <chr> <lgl> <chr>
#> 1 6.12e7 after gaborcsa… <nam… FALSE https:/… Run Code i… FALSE http…
#> 2 4.05e7 argu… gaborcsa… <nam… FALSE https:/… Declarativ… FALSE http…
#> 3 3.64e7 ask gaborcsa… <nam… FALSE https:/… Friendly C… FALSE http…
#> 4 3.49e7 base… gaborcsa… <nam… FALSE https:/… Do we get … FALSE http…
#> 5 6.16e7 cite… gaborcsa… <nam… FALSE https:/… Test R pac… TRUE http…
#> 6 3.39e7 clis… gaborcsa… <nam… FALSE https:/… Unicode sy… FALSE http…
#> 7 3.72e7 cmak… gaborcsa… <nam… FALSE https:/… port of cm… TRUE http…
#> 8 6.80e7 cmark gaborcsa… <nam… FALSE https:/… CommonMark… TRUE http…
#> 9 6.32e7 cond… gaborcsa… <nam… FALSE https:/… <NA> TRUE http…
#> 10 2.43e7 cray… gaborcsa… <nam… FALSE https:/… R package … FALSE http…
#> # … with 166 more rows, and 58 more variables: forks_url <chr>,
#> # keys_url <chr>, collaborators_url <chr>, teams_url <chr>,
#> # hooks_url <chr>, issue_events_url <chr>, events_url <chr>,
#> # assignees_url <chr>, branches_url <chr>, tags_url <chr>,
#> # blobs_url <chr>, git_tags_url <chr>, git_refs_url <chr>,
#> # trees_url <chr>, statuses_url <chr>, languages_url <chr>,
#> # stargazers_url <chr>, contributors_url <chr>, subscribers_url <chr>,
#> # subscription_url <chr>, commits_url <chr>, git_commits_url <chr>,
#> # comments_url <chr>, issue_comment_url <chr>, contents_url <chr>,
#> # compare_url <chr>, merges_url <chr>, archive_url <chr>,
#> # downloads_url <chr>, issues_url <chr>, pulls_url <chr>,
#> # milestones_url <chr>, notifications_url <chr>, labels_url <chr>,
#> # releases_url <chr>, deployments_url <chr>, created_at <chr>,
#> # updated_at <chr>, pushed_at <chr>, git_url <chr>, ssh_url <chr>,
#> # clone_url <chr>, svn_url <chr>, size <int>, stargazers_count <int>,
#> # watchers_count <int>, language <chr>, has_issues <lgl>,
#> # has_downloads <lgl>, has_wiki <lgl>, has_pages <lgl>,
#> # forks_count <int>, open_issues_count <int>, forks <int>,
#> # open_issues <int>, watchers <int>, default_branch <chr>,
#> # homepage <chr>Otchulidwa a Game of Thrones
got_chars ali ndi mawonekedwe ofanana ndi gh_users: ndi mndandanda wa mndandanda wa mayina, pomwe gawo lililonse la mndandanda wamkati limafotokoza zamtundu wa Game of Thrones. got_chars Kuti tipange mawonedwe a tebulo, timayamba ndikupanga chithunzi cha data, monga momwe zilili m'zitsanzo zomwe zili pamwambapa, kenako timasuntha chinthu chilichonse kukhala gawo lapadera:
chars <- tibble(char = got_chars)
chars
#> # A tibble: 30 x 1
#> char
#> <list>
#> 1 <named list [18]>
#> 2 <named list [18]>
#> 3 <named list [18]>
#> 4 <named list [18]>
#> 5 <named list [18]>
#> 6 <named list [18]>
#> 7 <named list [18]>
#> 8 <named list [18]>
#> 9 <named list [18]>
#> 10 <named list [18]>
#> # … with 20 more rows
chars2 <- chars %>% unnest_wider(char)
chars2
#> # A tibble: 30 x 18
#> url id name gender culture born died alive titles aliases father
#> <chr> <int> <chr> <chr> <chr> <chr> <chr> <lgl> <list> <list> <chr>
#> 1 http… 1022 Theo… Male Ironbo… In 2… "" TRUE <chr … <chr [… ""
#> 2 http… 1052 Tyri… Male "" In 2… "" TRUE <chr … <chr [… ""
#> 3 http… 1074 Vict… Male Ironbo… In 2… "" TRUE <chr … <chr [… ""
#> 4 http… 1109 Will Male "" "" In 2… FALSE <chr … <chr [… ""
#> 5 http… 1166 Areo… Male Norvos… In 2… "" TRUE <chr … <chr [… ""
#> 6 http… 1267 Chett Male "" At H… In 2… FALSE <chr … <chr [… ""
#> 7 http… 1295 Cres… Male "" In 2… In 2… FALSE <chr … <chr [… ""
#> 8 http… 130 Aria… Female Dornish In 2… "" TRUE <chr … <chr [… ""
#> 9 http… 1303 Daen… Female Valyri… In 2… "" TRUE <chr … <chr [… ""
#> 10 http… 1319 Davo… Male Wester… In 2… "" TRUE <chr … <chr [… ""
#> # … with 20 more rows, and 7 more variables: mother <chr>, spouse <chr>,
#> # allegiances <list>, books <list>, povBooks <list>, tvSeries <list>,
#> # playedBy <list>kapangidwe got_chars zovuta pang'ono kuposa gh_users, chifukwa zigawo zina za mndandanda char Ndiwo mndandanda, chifukwa chake timapeza mindandanda - mindandanda:
chars2 %>% select_if(is.list)
#> # A tibble: 30 x 7
#> titles aliases allegiances books povBooks tvSeries playedBy
#> <list> <list> <list> <list> <list> <list> <list>
#> 1 <chr [3]> <chr [4]> <chr [1]> <chr [3]> <chr [2]> <chr [6]> <chr [1]>
#> 2 <chr [2]> <chr [11]> <chr [1]> <chr [2]> <chr [4]> <chr [6]> <chr [1]>
#> 3 <chr [2]> <chr [1]> <chr [1]> <chr [3]> <chr [2]> <chr [1]> <chr [1]>
#> 4 <chr [1]> <chr [1]> <???> <chr [1]> <chr [1]> <chr [1]> <chr [1]>
#> 5 <chr [1]> <chr [1]> <chr [1]> <chr [3]> <chr [2]> <chr [2]> <chr [1]>
#> 6 <chr [1]> <chr [1]> <???> <chr [2]> <chr [1]> <chr [1]> <chr [1]>
#> 7 <chr [1]> <chr [1]> <???> <chr [2]> <chr [1]> <chr [1]> <chr [1]>
#> 8 <chr [1]> <chr [1]> <chr [1]> <chr [4]> <chr [1]> <chr [1]> <chr [1]>
#> 9 <chr [5]> <chr [11]> <chr [1]> <chr [1]> <chr [4]> <chr [6]> <chr [1]>
#> 10 <chr [4]> <chr [5]> <chr [2]> <chr [1]> <chr [3]> <chr [5]> <chr [1]>
#> # … with 20 more rowsMasitepe anu otsatirawa amadalira zolinga zanu zowunikira. Mungafunike kuyika zambiri za buku lililonse ndi mndandanda wapa TV momwe munthu amawonekera:
chars2 %>%
select(name, books, tvSeries) %>%
pivot_longer(c(books, tvSeries), names_to = "media", values_to = "value") %>%
unnest_longer(value)
#> # A tibble: 180 x 3
#> name media value
#> <chr> <chr> <chr>
#> 1 Theon Greyjoy books A Game of Thrones
#> 2 Theon Greyjoy books A Storm of Swords
#> 3 Theon Greyjoy books A Feast for Crows
#> 4 Theon Greyjoy tvSeries Season 1
#> 5 Theon Greyjoy tvSeries Season 2
#> 6 Theon Greyjoy tvSeries Season 3
#> 7 Theon Greyjoy tvSeries Season 4
#> 8 Theon Greyjoy tvSeries Season 5
#> 9 Theon Greyjoy tvSeries Season 6
#> 10 Tyrion Lannister books A Feast for Crows
#> # … with 170 more rowsKapena mukufuna kupanga tebulo lomwe limakupatsani mwayi wofananiza munthu ndi ntchito:
chars2 %>%
select(name, title = titles) %>%
unnest_longer(title)
#> # A tibble: 60 x 2
#> name title
#> <chr> <chr>
#> 1 Theon Greyjoy Prince of Winterfell
#> 2 Theon Greyjoy Captain of Sea Bitch
#> 3 Theon Greyjoy Lord of the Iron Islands (by law of the green lands)
#> 4 Tyrion Lannister Acting Hand of the King (former)
#> 5 Tyrion Lannister Master of Coin (former)
#> 6 Victarion Greyjoy Lord Captain of the Iron Fleet
#> 7 Victarion Greyjoy Master of the Iron Victory
#> 8 Will ""
#> 9 Areo Hotah Captain of the Guard at Sunspear
#> 10 Chett ""
#> # … with 50 more rows(Dziwani zinthu zopanda pake "" m'munda title, izi zimachitika chifukwa cha zolakwika zomwe zimachitika polowetsa deta got_chars: m'malo mwake, otchulidwa omwe alibe buku lofananira kapena mitu yapa TV pamunda title iyenera kukhala ndi vector yautali 0, osati vekitala yautali 1 yokhala ndi chingwe chopanda kanthu.)
Tikhoza kulembanso chitsanzo pamwambapa pogwiritsa ntchito ntchitoyi unnest_auto()Njirayi ndi yabwino kuwunikira kamodzi, koma musadalire unnest_auto() kugwiritsa ntchito nthawi zonse. Mfundo ndi yakuti ngati deta yanu isintha unnest_auto() ikhoza kusintha njira yosinthira deta yosankhidwa ngati idakulitsa mindandanda kukhala mizere pogwiritsa ntchito unnest_longer(), ndiye posintha mawonekedwe a data yomwe ikubwera, malingaliro angasinthidwe mokomera unnest_wider(), ndipo kugwiritsa ntchito njirayi nthawi zonse kungayambitse zolakwika zosayembekezereka.
tibble(char = got_chars) %>%
unnest_auto(char) %>%
select(name, title = titles) %>%
unnest_auto(title)
#> Using `unnest_wider(char)`; elements have 18 names in common
#> Using `unnest_longer(title)`; no element has names
#> # A tibble: 60 x 2
#> name title
#> <chr> <chr>
#> 1 Theon Greyjoy Prince of Winterfell
#> 2 Theon Greyjoy Captain of Sea Bitch
#> 3 Theon Greyjoy Lord of the Iron Islands (by law of the green lands)
#> 4 Tyrion Lannister Acting Hand of the King (former)
#> 5 Tyrion Lannister Master of Coin (former)
#> 6 Victarion Greyjoy Lord Captain of the Iron Fleet
#> 7 Victarion Greyjoy Master of the Iron Victory
#> 8 Will ""
#> 9 Areo Hotah Captain of the Guard at Sunspear
#> 10 Chett ""
#> # … with 50 more rowsGeocoding ndi Google
Kenako, tiwonanso zovuta za data zomwe zatengedwa kuchokera ku ntchito ya Google ya geocoding. Zidziwitso zosungira zimasemphana ndi mfundo za ntchito za Google Maps API, kotero ndilemba kaye chopukutira chosavuta cha API. Chovala ichi chimadalira kusunga kiyi ya Google Maps API mumitundu yosiyanasiyana. Ngati mulibe kiyi ya Google Maps API yosungidwa m'malo omwe muli, mawu achidule omwe ali mugawoli sagwira.
has_key <- !identical(Sys.getenv("GOOGLE_MAPS_API_KEY"), "")
if (!has_key) {
message("No Google Maps API key found; code chunks will not be run")
}
# https://developers.google.com/maps/documentation/geocoding
geocode <- function(address, api_key = Sys.getenv("GOOGLE_MAPS_API_KEY")) {
url <- "https://maps.googleapis.com/maps/api/geocode/json"
url <- paste0(url, "?address=", URLencode(address), "&key=", api_key)
jsonlite::read_json(url)
}Mndandanda womwe ntchitoyi imabwerera ndi yovuta kwambiri:
houston <- geocode("Houston TX")
str(houston)
#> List of 2
#> $ results:List of 1
#> ..$ :List of 5
#> .. ..$ address_components:List of 4
#> .. .. ..$ :List of 3
#> .. .. .. ..$ long_name : chr "Houston"
#> .. .. .. ..$ short_name: chr "Houston"
#> .. .. .. ..$ types :List of 2
#> .. .. .. .. ..$ : chr "locality"
#> .. .. .. .. ..$ : chr "political"
#> .. .. ..$ :List of 3
#> .. .. .. ..$ long_name : chr "Harris County"
#> .. .. .. ..$ short_name: chr "Harris County"
#> .. .. .. ..$ types :List of 2
#> .. .. .. .. ..$ : chr "administrative_area_level_2"
#> .. .. .. .. ..$ : chr "political"
#> .. .. ..$ :List of 3
#> .. .. .. ..$ long_name : chr "Texas"
#> .. .. .. ..$ short_name: chr "TX"
#> .. .. .. ..$ types :List of 2
#> .. .. .. .. ..$ : chr "administrative_area_level_1"
#> .. .. .. .. ..$ : chr "political"
#> .. .. ..$ :List of 3
#> .. .. .. ..$ long_name : chr "United States"
#> .. .. .. ..$ short_name: chr "US"
#> .. .. .. ..$ types :List of 2
#> .. .. .. .. ..$ : chr "country"
#> .. .. .. .. ..$ : chr "political"
#> .. ..$ formatted_address : chr "Houston, TX, USA"
#> .. ..$ geometry :List of 4
#> .. .. ..$ bounds :List of 2
#> .. .. .. ..$ northeast:List of 2
#> .. .. .. .. ..$ lat: num 30.1
#> .. .. .. .. ..$ lng: num -95
#> .. .. .. ..$ southwest:List of 2
#> .. .. .. .. ..$ lat: num 29.5
#> .. .. .. .. ..$ lng: num -95.8
#> .. .. ..$ location :List of 2
#> .. .. .. ..$ lat: num 29.8
#> .. .. .. ..$ lng: num -95.4
#> .. .. ..$ location_type: chr "APPROXIMATE"
#> .. .. ..$ viewport :List of 2
#> .. .. .. ..$ northeast:List of 2
#> .. .. .. .. ..$ lat: num 30.1
#> .. .. .. .. ..$ lng: num -95
#> .. .. .. ..$ southwest:List of 2
#> .. .. .. .. ..$ lat: num 29.5
#> .. .. .. .. ..$ lng: num -95.8
#> .. ..$ place_id : chr "ChIJAYWNSLS4QIYROwVl894CDco"
#> .. ..$ types :List of 2
#> .. .. ..$ : chr "locality"
#> .. .. ..$ : chr "political"
#> $ status : chr "OK"Mwamwayi, titha kudutsa vuto losintha deta iyi kukhala tebulo pogwiritsa ntchito ntchito. tidyrKuti ntchitoyi ikhale yovuta komanso yotheka, ndiyamba ndikuyika mizinda ingapo:
city <- c ( "Houston" , "LA" , "New York" , "Chicago" , "Springfield" ) city_geo <- purrr::map (city, geocode) Ndimasintha zotsatira zomwe mwapeza kukhala tibbleKuti zitheke, ndiwonjezera mzati wokhala ndi dzina lofananira la mzinda.
loc <- tibble(city = city, json = city_geo)
loc
#> # A tibble: 5 x 2
#> city json
#> <chr> <list>
#> 1 Houston <named list [2]>
#> 2 LA <named list [2]>
#> 3 New York <named list [2]>
#> 4 Chicago <named list [2]>
#> 5 Springfield <named list [2]>Mulingo woyamba uli ndi zigawo status и result, zomwe titha kugwiritsa ntchito unnest_wider() :
loc %>%
unnest_wider(json)
#> # A tibble: 5 x 3
#> city results status
#> <chr> <list> <chr>
#> 1 Houston <list [1]> OK
#> 2 LA <list [1]> OK
#> 3 New York <list [1]> OK
#> 4 Chicago <list [1]> OK
#> 5 Springfield <list [1]> OKZindikirani kuti results ndi mndandanda wamagulu ambiri. Mizinda yambiri ili ndi chinthu chimodzi (choyimira mtengo wapadera womwe umafanana ndi geocoding API), koma Springfield ili ndi ziwiri. Tikhoza kuwachotsa m'mizere yosiyana pogwiritsa ntchito unnest_longer() :
loc %>%
unnest_wider(json) %>%
unnest_longer(results)
#> # A tibble: 5 x 3
#> city results status
#> <chr> <list> <chr>
#> 1 Houston <named list [5]> OK
#> 2 LA <named list [5]> OK
#> 3 New York <named list [5]> OK
#> 4 Chicago <named list [5]> OK
#> 5 Springfield <named list [5]> OKTsopano onse ali ndi zigawo zofanana, zomwe zingathe kutsimikiziridwa pogwiritsa ntchito unnest_wider():
loc %>%
unnest_wider(json) %>%
unnest_longer(results) %>%
unnest_wider(results)
#> # A tibble: 5 x 7
#> city address_componen… formatted_addre… geometry place_id types status
#> <chr> <list> <chr> <list> <chr> <lis> <chr>
#> 1 Houst… <list [4]> Houston, TX, USA <named … ChIJAYWN… <lis… OK
#> 2 LA <list [4]> Los Angeles, CA… <named … ChIJE9on… <lis… OK
#> 3 New Y… <list [3]> New York, NY, U… <named … ChIJOwg_… <lis… OK
#> 4 Chica… <list [4]> Chicago, IL, USA <named … ChIJ7cv0… <lis… OK
#> 5 Sprin… <list [5]> Springfield, MO… <named … ChIJP5jI… <lis… OKTitha kupeza mayendedwe a latitude ndi longitude a mzinda uliwonse pokulitsa mndandanda geometry:
loc %>%
unnest_wider(json) %>%
unnest_longer(results) %>%
unnest_wider(results) %>%
unnest_wider(geometry)
#> # A tibble: 5 x 10
#> city address_compone… formatted_addre… bounds location location_type
#> <chr> <list> <chr> <list> <list> <chr>
#> 1 Hous… <list [4]> Houston, TX, USA <name… <named … APPROXIMATE
#> 2 LA <list [4]> Los Angeles, CA… <name… <named … APPROXIMATE
#> 3 New … <list [3]> New York, NY, U… <name… <named … APPROXIMATE
#> 4 Chic… <list [4]> Chicago, IL, USA <name… <named … APPROXIMATE
#> 5 Spri… <list [5]> Springfield, MO… <name… <named … APPROXIMATE
#> # … with 4 more variables: viewport <list>, place_id <chr>, types <list>,
#> # status <chr>Ndiyeno malo amene muyenera kuwonjezera location:
loc %>%
unnest_wider(json) %>%
unnest_longer(results) %>%
unnest_wider(results) %>%
unnest_wider(geometry) %>%
unnest_wider(location)
#> # A tibble: 5 x 11
#> city address_compone… formatted_addre… bounds lat lng location_type
#> <chr> <list> <chr> <list> <dbl> <dbl> <chr>
#> 1 Hous… <list [4]> Houston, TX, USA <name… 29.8 -95.4 APPROXIMATE
#> 2 LA <list [4]> Los Angeles, CA… <name… 34.1 -118. APPROXIMATE
#> 3 New … <list [3]> New York, NY, U… <name… 40.7 -74.0 APPROXIMATE
#> 4 Chic… <list [4]> Chicago, IL, USA <name… 41.9 -87.6 APPROXIMATE
#> 5 Spri… <list [5]> Springfield, MO… <name… 37.2 -93.3 APPROXIMATE
#> # … with 4 more variables: viewport <list>, place_id <chr>, types <list>,
#> # status <chr>Ndiponso, unnest_auto() imathandizira ntchito yomwe yafotokozedwayo ndi zoopsa zina zomwe zingayambike chifukwa cha kusintha kwa data yomwe ikubwera:
loc %>%
unnest_auto(json) %>%
unnest_auto(results) %>%
unnest_auto(results) %>%
unnest_auto(geometry) %>%
unnest_auto(location)
#> Using `unnest_wider(json)`; elements have 2 names in common
#> Using `unnest_longer(results)`; no element has names
#> Using `unnest_wider(results)`; elements have 5 names in common
#> Using `unnest_wider(geometry)`; elements have 4 names in common
#> Using `unnest_wider(location)`; elements have 2 names in common
#> # A tibble: 5 x 11
#> city address_compone… formatted_addre… bounds lat lng location_type
#> <chr> <list> <chr> <list> <dbl> <dbl> <chr>
#> 1 Hous… <list [4]> Houston, TX, USA <name… 29.8 -95.4 APPROXIMATE
#> 2 LA <list [4]> Los Angeles, CA… <name… 34.1 -118. APPROXIMATE
#> 3 New … <list [3]> New York, NY, U… <name… 40.7 -74.0 APPROXIMATE
#> 4 Chic… <list [4]> Chicago, IL, USA <name… 41.9 -87.6 APPROXIMATE
#> 5 Spri… <list [5]> Springfield, MO… <name… 37.2 -93.3 APPROXIMATE
#> # … with 4 more variables: viewport <list>, place_id <chr>, types <list>,
#> # status <chr>Titha kungoyang'ananso adilesi yoyamba ya mzinda uliwonse:
loc %>%
unnest_wider(json) %>%
hoist(results, first_result = 1) %>%
unnest_wider(first_result) %>%
unnest_wider(geometry) %>%
unnest_wider(location)
#> # A tibble: 5 x 11
#> city address_compone… formatted_addre… bounds lat lng location_type
#> <chr> <list> <chr> <list> <dbl> <dbl> <chr>
#> 1 Hous… <list [4]> Houston, TX, USA <name… 29.8 -95.4 APPROXIMATE
#> 2 LA <list [4]> Los Angeles, CA… <name… 34.1 -118. APPROXIMATE
#> 3 New … <list [3]> New York, NY, U… <name… 40.7 -74.0 APPROXIMATE
#> 4 Chic… <list [4]> Chicago, IL, USA <name… 41.9 -87.6 APPROXIMATE
#> 5 Spri… <list [5]> Springfield, MO… <name… 37.2 -93.3 APPROXIMATE
#> # … with 4 more variables: viewport <list>, place_id <chr>, types <list>,
#> # status <chr>Kapena ntchito hoist() kuti mutsike m'madzi ambiri kuti mupiteko lat и lng.
loc %>%
hoist(json,
lat = list("results", 1, "geometry", "location", "lat"),
lng = list("results", 1, "geometry", "location", "lng")
)
#> # A tibble: 5 x 4
#> city lat lng json
#> <chr> <dbl> <dbl> <list>
#> 1 Houston 29.8 -95.4 <named list [2]>
#> 2 LA 34.1 -118. <named list [2]>
#> 3 New York 40.7 -74.0 <named list [2]>
#> 4 Chicago 41.9 -87.6 <named list [2]>
#> 5 Springfield 37.2 -93.3 <named list [2]>Zolemba za Sharla Gelfand
Pomaliza, tiwona zojambula zovuta kwambiri - zojambula za Sharla Gelfand. Monga m'zitsanzo zomwe zili pamwambapa, tiyamba ndikusintha mndandandawo kukhala gawo limodzi la data, kenako ndikukulitsa kuti chigawo chilichonse chikhale chosiyana. Ndisinthanso gawoli date_added mumpangidwe woyenerera wa tsiku ndi nthawi mu R.
discs <- tibble(disc = discog) %>%
unnest_wider(disc) %>%
mutate(date_added = as.POSIXct(strptime(date_added, "%Y-%m-%dT%H:%M:%S")))
discs
#> # A tibble: 155 x 5
#> instance_id date_added basic_information id rating
#> <int> <dttm> <list> <int> <int>
#> 1 354823933 2019-02-16 17:48:59 <named list [11]> 7496378 0
#> 2 354092601 2019-02-13 14:13:11 <named list [11]> 4490852 0
#> 3 354091476 2019-02-13 14:07:23 <named list [11]> 9827276 0
#> 4 351244906 2019-02-02 11:39:58 <named list [11]> 9769203 0
#> 5 351244801 2019-02-02 11:39:37 <named list [11]> 7237138 0
#> 6 351052065 2019-02-01 20:40:53 <named list [11]> 13117042 0
#> 7 350315345 2019-01-29 15:48:37 <named list [11]> 7113575 0
#> 8 350315103 2019-01-29 15:47:22 <named list [11]> 10540713 0
#> 9 350314507 2019-01-29 15:44:08 <named list [11]> 11260950 0
#> 10 350314047 2019-01-29 15:41:35 <named list [11]> 11726853 0
#> # … with 145 more rowsPamulingo uwu, talandira zambiri za nthawi yomwe chimbale chilichonse chidawonjezedwa ku discography ya Sharla, koma sitikuwona chilichonse chokhudza ma disc awa. Kuti tiwone izi, tifunika kukulitsa ndime. basic_information:
discs %>% unnest_wider(basic_information)
#> Column name `id` must not be duplicated.
#> Use .name_repair to specify repair.Tsoka ilo, tipeza cholakwika chifukwa mkati mwa mndandanda basic_information pali mzati wa dzina lomwelo basic_informationNgati cholakwa choterocho chikuchitika, kuti mudziwe mwamsanga chifukwa chake, mungagwiritse ntchito names_repair = "unique":
discs %>% unnest_wider(basic_information, names_repair = "unique")
#> New names:
#> * id -> id...6
#> * id -> id...14
#> # A tibble: 155 x 15
#> instance_id date_added labels year artists id...6 thumb title
#> <int> <dttm> <list> <int> <list> <int> <chr> <chr>
#> 1 354823933 2019-02-16 17:48:59 <list… 2015 <list … 7.50e6 http… Demo
#> 2 354092601 2019-02-13 14:13:11 <list… 2013 <list … 4.49e6 http… Obse…
#> 3 354091476 2019-02-13 14:07:23 <list… 2017 <list … 9.83e6 http… I
#> 4 351244906 2019-02-02 11:39:58 <list… 2017 <list … 9.77e6 http… Oído…
#> 5 351244801 2019-02-02 11:39:37 <list… 2015 <list … 7.24e6 http… A Ca…
#> 6 351052065 2019-02-01 20:40:53 <list… 2019 <list … 1.31e7 http… Tash…
#> 7 350315345 2019-01-29 15:48:37 <list… 2014 <list … 7.11e6 http… Demo
#> 8 350315103 2019-01-29 15:47:22 <list… 2015 <list … 1.05e7 http… Let …
#> 9 350314507 2019-01-29 15:44:08 <list… 2017 <list … 1.13e7 "" Sub …
#> 10 350314047 2019-01-29 15:41:35 <list… 2017 <list … 1.17e7 http… Demo
#> # … with 145 more rows, and 7 more variables: formats <list>,
#> # cover_image <chr>, resource_url <chr>, master_id <int>,
#> # master_url <chr>, id...14 <int>, rating <int>Vuto ndiloti basic_information ikubwerezanso ndime ya id yomwe imasungidwanso pamwamba, kuti tingoyichotsa:
discs %>%
select(-id) %>%
unnest_wider(basic_information)
#> # A tibble: 155 x 14
#> instance_id date_added labels year artists id thumb title
#> <int> <dttm> <list> <int> <list> <int> <chr> <chr>
#> 1 354823933 2019-02-16 17:48:59 <list… 2015 <list … 7.50e6 http… Demo
#> 2 354092601 2019-02-13 14:13:11 <list… 2013 <list … 4.49e6 http… Obse…
#> 3 354091476 2019-02-13 14:07:23 <list… 2017 <list … 9.83e6 http… I
#> 4 351244906 2019-02-02 11:39:58 <list… 2017 <list … 9.77e6 http… Oído…
#> 5 351244801 2019-02-02 11:39:37 <list… 2015 <list … 7.24e6 http… A Ca…
#> 6 351052065 2019-02-01 20:40:53 <list… 2019 <list … 1.31e7 http… Tash…
#> 7 350315345 2019-01-29 15:48:37 <list… 2014 <list … 7.11e6 http… Demo
#> 8 350315103 2019-01-29 15:47:22 <list… 2015 <list … 1.05e7 http… Let …
#> 9 350314507 2019-01-29 15:44:08 <list… 2017 <list … 1.13e7 "" Sub …
#> 10 350314047 2019-01-29 15:41:35 <list… 2017 <list … 1.17e7 http… Demo
#> # … with 145 more rows, and 6 more variables: formats <list>,
#> # cover_image <chr>, resource_url <chr>, master_id <int>,
#> # master_url <chr>, rating <int>Kapena, tikhoza kugwiritsa ntchito hoist():
discs %>%
hoist(basic_information,
title = "title",
year = "year",
label = list("labels", 1, "name"),
artist = list("artists", 1, "name")
)
#> # A tibble: 155 x 9
#> instance_id date_added title year label artist
#> <int> <dttm> <chr> <int> <chr> <chr>
#> 1 354823933 2019-02-16 17:48:59 Demo 2015 Tobi… Mollot
#> 2 354092601 2019-02-13 14:13:11 Obse… 2013 La V… Una B…
#> 3 354091476 2019-02-13 14:07:23 I 2017 La V… S.H.I…
#> 4 351244906 2019-02-02 11:39:58 Oído… 2017 La V… Rata …
#> 5 351244801 2019-02-02 11:39:37 A Ca… 2015 Kato… Ivy (…
#> 6 351052065 2019-02-01 20:40:53 Tash… 2019 High… Tashme
#> 7 350315345 2019-01-29 15:48:37 Demo 2014 Mind… Desgr…
#> 8 350315103 2019-01-29 15:47:22 Let … 2015 Not … Phant…
#> 9 350314507 2019-01-29 15:44:08 Sub … 2017 Not … Sub S…
#> 10 350314047 2019-01-29 15:41:35 Demo 2017 Pres… Small…
#> # … with 145 more rows, and 3 more variables: basic_information <list>,
#> # id <int>, rating <int>Apa ndimachotsa mwachangu dzina lachilembo choyamba ndi wojambula ndi index, ndikudumphira pamndandanda wokhala ndi zisa.
Njira yokhazikika ndikupanga matebulo osiyana a ojambula ndi zilembo:
discs %>%
hoist(basic_information, artist = "artists") %>%
select(disc_id = id, artist) %>%
unnest_longer(artist) %>%
unnest_wider(artist)
#> # A tibble: 167 x 8
#> disc_id join name anv tracks role resource_url id
#> <int> <chr> <chr> <chr> <chr> <chr> <chr> <int>
#> 1 7496378 "" Mollot "" "" "" https://api.discog… 4.62e6
#> 2 4490852 "" Una Bèstia… "" "" "" https://api.discog… 3.19e6
#> 3 9827276 "" S.H.I.T. (… "" "" "" https://api.discog… 2.77e6
#> 4 9769203 "" Rata Negra "" "" "" https://api.discog… 4.28e6
#> 5 7237138 "" Ivy (18) "" "" "" https://api.discog… 3.60e6
#> 6 13117042 "" Tashme "" "" "" https://api.discog… 5.21e6
#> 7 7113575 "" Desgraciad… "" "" "" https://api.discog… 4.45e6
#> 8 10540713 "" Phantom He… "" "" "" https://api.discog… 4.27e6
#> 9 11260950 "" Sub Space … "" "" "" https://api.discog… 5.69e6
#> 10 11726853 "" Small Man … "" "" "" https://api.discog… 6.37e6
#> # … with 157 more rows
discs %>%
hoist(basic_information, format = "formats") %>%
select(disc_id = id, format) %>%
unnest_longer(format) %>%
unnest_wider(format) %>%
unnest_longer(descriptions)
#> # A tibble: 280 x 5
#> disc_id descriptions text name qty
#> <int> <chr> <chr> <chr> <chr>
#> 1 7496378 Numbered Black Cassette 1
#> 2 4490852 LP <NA> Vinyl 1
#> 3 9827276 "7"" <NA> Vinyl 1
#> 4 9827276 45 RPM <NA> Vinyl 1
#> 5 9827276 EP <NA> Vinyl 1
#> 6 9769203 LP <NA> Vinyl 1
#> 7 9769203 Album <NA> Vinyl 1
#> 8 7237138 "7"" <NA> Vinyl 1
#> 9 7237138 45 RPM <NA> Vinyl 1
#> 10 13117042 "7"" <NA> Vinyl 1
#> # … with 270 more rowsMutha kuwajowinanso ku dataset yoyambirira ngati pakufunika.
Pomaliza
Pakatikati pa laibulale tidyverse imaphatikizanso maphukusi ambiri othandiza ophatikizidwa ndi filosofi wamba yokonza deta.
M'nkhani ino takambirana za banja la ntchito unnest_*(), zomwe cholinga chake ndi kuchotsa zinthu kuchokera pamindandanda yosungidwa. Phukusili lili ndi ntchito zina zambiri zothandiza zomwe zimathandizira kusintha kwa data molingana ndi lingaliro Dongosolo Ladongosolo.
Source: www.habr.com
