The R package tidyr and its new functions pivot_longer and pivot_wider
The tidyr package is part of the core of one of the most popular libraries in the R language: tidyverse.
The main purpose of the package is to bring data into tidy form.
A publication devoted to this package is already available on Habr, but it dates from 2015. I want to tell you about the latest changes, which were announced a few days ago by the package author, Hadley Wickham.
SJK: Will gather() and spread() be removed?
Hadley Wickham: To some extent. We will no longer recommend using these functions or fix bugs in them, but they will remain in the package in their current state.
If you are interested in data analysis, you may also be interested in my Telegram and YouTube channels. Most of the content is devoted to the R language.
The purpose of tidyr is to help you bring data into so-called tidy form. Tidy data is data where:
Each variable is in a column.
Each observation is a row.
Each value is a cell.
Data in tidy form is much simpler and more convenient to work with during analysis.
Main functions included in the tidyr package
tidyr contains a set of functions designed to transform tables:
fill() - fills missing values in a column with the previous value;
separate() - splits one field into several using a separator;
unite() - combines several fields into one, the inverse of separate();
pivot_longer() - converts data from wide format to long format;
pivot_wider() - converts data from long format to wide format, the inverse of pivot_longer();
gather() (deprecated) - converts data from wide format to long format;
spread() (deprecated) - converts data from long format to wide format, the inverse of gather().
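As a quick illustration of the first three functions in the list, here is a minimal sketch. The `sales_raw` table and its column names are made up for this example and are not from the original article.

```r
library(tidyr)

# Hypothetical table: the region is written only on the first row of each group
sales_raw <- tibble(
  region = c("North", NA, "South", NA),
  period = c("2019-Q1", "2019-Q2", "2019-Q1", "2019-Q2"),
  sales  = c(10, 12, 7, 9)
)

# fill(): carry the last non-missing region down into the gaps
filled <- fill(sales_raw, region)

# separate(): split "2019-Q1" into a year field and a quarter field
split_up <- separate(filled, period, into = c("year", "quarter"), sep = "-")

# unite(): glue the two fields back together, the inverse of separate()
joined <- unite(split_up, "period", year, quarter, sep = "-")
```

After the last step the `period` column of `joined` holds the same strings as in `sales_raw`, which shows that unite() undoes separate().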
A new concept for converting data from wide to long format and back
Previously, the gather() and spread() functions were used for this kind of transformation. Over the years these functions have existed, it became clear that for most users, including the package author, their names and arguments were not obvious. It was hard to remember them and to work out which function converts a data frame from wide to long format and which does the reverse.
For this reason, two new, important functions designed to transform data frames have been added to tidyr.
The new functions pivot_longer() and pivot_wider() were inspired by some of the features of the cdata package, created by John Mount and Nina Zumel.
Installing the current version of tidyr, 0.8.3.9000
To install the newest, current version of the tidyr package, 0.8.3.9000, where the new functions are available, use the following code.
devtools::install_github("tidyverse/tidyr")
At the time of writing, these functions are available only in the dev version of the package on GitHub.
Migrating to the new functions
In fact, porting old scripts to the new functions is not difficult. For better understanding, I will take an example from the documentation of the old functions and show how the same operations are performed with the new pivot_*() functions.
Converting wide format to long format.
Example code from the gather() function documentation
# example
library(tidyr)  # gather(), spread(), pivot_longer(), pivot_wider()
library(dplyr)
stocks <- data.frame(
  time = as.Date('2009-01-01') + 0:9,
  X = rnorm(10, 0, 1),
  Y = rnorm(10, 0, 2),
  Z = rnorm(10, 0, 4)
)
# old
stocks_gather <- stocks %>% gather(key = stock,
                                   value = price,
                                   -time)
# new
stocks_long <- stocks %>% pivot_longer(cols = -time,
                                       names_to = "stock",
                                       values_to = "price")
Converting long format to wide format.
Example code from the spread() function documentation
# old
stocks_spread <- stocks_gather %>% spread(key = stock,
                                          value = price)
# new
stock_wide <- stocks_long %>% pivot_wider(names_from = "stock",
                                          values_from = "price")
In the pivot_longer() and pivot_wider() examples above, the columns named in the names_to and values_to arguments do not exist in the original stocks table, so their names must be given in quotes.
The table will help you quickly work out how to switch to the new tidyr concept.
Note from the author
All the text below is adapted; I would even call it a free translation of the vignettes from the official tidyverse website.
A simple example of converting data from wide to long format
pivot_longer() - makes a dataset longer by reducing the number of columns and increasing the number of rows.
To run the examples in this article, first load the required packages:
library(tidyr)
library(dplyr)
library(readr)
Suppose we have a table with the results of a survey that (among other things) asked people about their religion and annual income:
This table contains the respondents' religion in rows, while the income levels are spread across the column names. The number of respondents in each category is stored in the cell values at the intersection of religion and income level. To bring the table into tidy form, it is enough to use pivot_longer():
The first argument, cols, describes which columns need to be merged. In this case, all columns except religion.
The names_to argument gives the name of the variable that will be created from the names of the columns we merged.
values_to gives the name of the variable that will be created from the data stored in the cell values of the merged columns.
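The call described by the three arguments above can be sketched as follows. This assumes the survey table is the `relig_income` dataset bundled with tidyr; the output names `income` and `count` are chosen here for illustration.

```r
library(tidyr)

# relig_income: one row per religion, one column per income band
relig_long <- relig_income %>%
  pivot_longer(
    cols = -religion,      # pivot every column except religion
    names_to = "income",   # former column names go here
    values_to = "count"    # former cell values go here
  )

head(relig_long)
```

Each religion now appears once per income band, so the result has as many rows as the original table had value cells.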
Specifications
This is new functionality in the tidyr package that was not available with the legacy functions.
A specification is a data frame in which each row corresponds to one column of the wide-format version of the data, with two special columns whose names begin with a dot:
.name contains the original column name.
.value contains the name of the column that will hold the cell values.
The remaining columns of the specification show how the new columns will represent the information encoded in the column names listed in .name.
A specification describes the metadata stored in the column names, with one row per column and one column per variable, combined with the column name. This definition may seem confusing right now, but it becomes much clearer after a few examples.
The point of a specification is that you can retrieve, modify, and define new metadata for the data frame being transformed.
To work with specifications when converting a table from wide to long format, use the pivot_longer_spec() function.
This function takes any data frame and generates its metadata in the form described above.
As an example, let's take the who dataset that ships with the tidyr package. This dataset contains information provided by the World Health Organization on tuberculosis incidence.
who
#> # A tibble: 7,240 x 60
#> country iso2 iso3 year new_sp_m014 new_sp_m1524 new_sp_m2534
#> <chr> <chr> <chr> <int> <int> <int> <int>
#> 1 Afghan… AF AFG 1980 NA NA NA
#> 2 Afghan… AF AFG 1981 NA NA NA
#> 3 Afghan… AF AFG 1982 NA NA NA
#> 4 Afghan… AF AFG 1983 NA NA NA
#> 5 Afghan… AF AFG 1984 NA NA NA
#> 6 Afghan… AF AFG 1985 NA NA NA
#> 7 Afghan… AF AFG 1986 NA NA NA
#> 8 Afghan… AF AFG 1987 NA NA NA
#> 9 Afghan… AF AFG 1988 NA NA NA
#> 10 Afghan… AF AFG 1989 NA NA NA
#> # … with 7,230 more rows, and 53 more variables
Let's build its specification.
spec <- who %>%
pivot_longer_spec(new_sp_m014:newrel_f65, values_to = "count")
The fields country, iso2, iso3, and year are already variables. Our task is to pivot the columns from new_sp_m014 through newrel_f65.
The names of these columns store the following information:
The new_ prefix indicates that the column contains data on new tuberculosis cases. The current data frame contains information on new cases only, so in this context the prefix carries no meaning.
sp/sn/ep/rel describes how the case was diagnosed.
m/f is the patient's gender.
014/1524/2534/3544/4554/5564/65 is the patient's age range.
We can split these columns using the extract() function and a regular expression.
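A minimal sketch of that step is shown here. It assumes a released version of tidyr (1.0.0 or later), where the function that builds a specification is called build_longer_spec(); the dev version described in this article calls it pivot_longer_spec().

```r
library(tidyr)

# Build the specification for the 56 count columns of the who dataset
spec <- who %>%
  build_longer_spec(new_sp_m014:newrel_f65, values_to = "count")

# Split the helper column `name` into three variables with a regular expression:
#   new_?  the optional prefix, (.*) the diagnosis, (.) the gender, (.*) the age
spec <- spec %>%
  extract(name, c("diagnosis", "gender", "age"), "new_?(.*)_(.)(.*)")

spec
```

The `.name` column is left untouched, while `name` is replaced by the three new columns.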
#> # A tibble: 56 x 5
#> .name .value diagnosis gender age
#> <chr> <chr> <chr> <chr> <chr>
#> 1 new_sp_m014 count sp m 014
#> 2 new_sp_m1524 count sp m 1524
#> 3 new_sp_m2534 count sp m 2534
#> 4 new_sp_m3544 count sp m 3544
#> 5 new_sp_m4554 count sp m 4554
#> 6 new_sp_m5564 count sp m 5564
#> 7 new_sp_m65 count sp m 65
#> 8 new_sp_f014 count sp f 014
#> 9 new_sp_f1524 count sp f 1524
#> 10 new_sp_f2534 count sp f 2534
#> # … with 46 more rows
Note that the .name column must remain unchanged, because it is our index into the column names of the original dataset.
Gender and age (the gender and age columns) have a fixed, known set of values, so it is recommended to convert these columns to factors:
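The conversion could look like the sketch below. It rebuilds the specification first so that the block runs on its own, and again assumes the released name build_longer_spec(). The choice of level order is an assumption made for this example.

```r
library(tidyr)
library(dplyr)

spec <- who %>%
  build_longer_spec(new_sp_m014:newrel_f65, values_to = "count") %>%
  extract(name, c("diagnosis", "gender", "age"), "new_?(.*)_(.)(.*)") %>%
  mutate(
    # gender: plain factor with two known levels
    gender = factor(gender, levels = c("f", "m")),
    # age: ordered factor, levels taken in order of first appearance
    age = factor(age, levels = unique(age), ordered = TRUE)
  )
```

Because the factors live in the specification, the pivoted result inherits them without any further conversion.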
Finally, to apply the specification we created to the original data frame, we need to use the spec argument of the pivot_longer() function.
who %>% pivot_longer(spec = spec)
#> # A tibble: 405,440 x 8
#> country iso2 iso3 year diagnosis gender age count
#> <chr> <chr> <chr> <int> <chr> <fct> <ord> <int>
#> 1 Afghanistan AF AFG 1980 sp m 014 NA
#> 2 Afghanistan AF AFG 1980 sp m 1524 NA
#> 3 Afghanistan AF AFG 1980 sp m 2534 NA
#> 4 Afghanistan AF AFG 1980 sp m 3544 NA
#> 5 Afghanistan AF AFG 1980 sp m 4554 NA
#> 6 Afghanistan AF AFG 1980 sp m 5564 NA
#> 7 Afghanistan AF AFG 1980 sp m 65 NA
#> 8 Afghanistan AF AFG 1980 sp f 014 NA
#> 9 Afghanistan AF AFG 1980 sp f 1524 NA
#> 10 Afghanistan AF AFG 1980 sp f 2534 NA
#> # … with 405,430 more rows
Everything we have just done can be summarized as follows:
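In released versions of tidyr the same steps can also be written as a single call, without building the specification by hand. This is a sketch under that assumption; the names_pattern argument is not mentioned in the original article, and the factor conversion is left out for brevity.

```r
library(tidyr)

who_long <- who %>%
  pivot_longer(
    cols = new_sp_m014:newrel_f65,
    names_to = c("diagnosis", "gender", "age"),
    names_pattern = "new_?(.*)_(.)(.*)",  # same regular expression as in extract()
    values_to = "count"
  )
```

Internally this builds the same kind of specification and applies it, so the result has one row per country, year, diagnosis, gender, and age group.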
Specifications that use multiple values (.value)
In the example above, the .value column of the specification contained only one value, and in most cases that is how it is.
Occasionally, though, you need to collect data from columns whose values have different data types. With the legacy gather() function this would be quite difficult to do.
The example below is taken from the vignettes of the data.table package.
The data frame created here contains one family per row, with data on that family's children. A family can have one or two children. For each child, the date of birth and the gender are given, and the data for each child sits in separate columns. Our task is to bring this data into the right form for analysis.
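A data frame of this shape can be sketched as follows. The values are illustrative and the gender codes 1 and 2 are arbitrary; only the column layout (dob_child1, dob_child2, gender_child1, gender_child2) follows the text.

```r
library(tidyr)

# Illustrative data: family 2 has only one child, so its second slot is NA
family <- tribble(
  ~family, ~dob_child1,  ~dob_child2,  ~gender_child1, ~gender_child2,
  1L,      "1998-11-26", "2000-01-29", 1L,             2L,
  2L,      "1996-06-22", NA,           2L,             NA,
  3L,      "2002-07-11", "2004-04-05", 2L,             2L
)

# Store the dates of birth as Date values rather than text
family$dob_child1 <- as.Date(family$dob_child1)
family$dob_child2 <- as.Date(family$dob_child2)

family
```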
Note that we have two variables with information about each child: their gender and their date of birth (columns with the prefix dob contain the date of birth, columns with the prefix gender contain the child's gender). In the expected result they should end up in separate columns. We can do this by generating a specification in which the .value column has two different values.
spec <- family %>%
  pivot_longer_spec(-family) %>%
  separate(col = name, into = c(".value", "child")) %>%
  mutate(child = parse_number(child))
#> # A tibble: 4 x 3
#> .name .value child
#> <chr> <chr> <dbl>
#> 1 dob_child1 dob 1
#> 2 dob_child2 dob 2
#> 3 gender_child1 gender 1
#> 4 gender_child2 gender 2
So, let's go through the code above step by step.
pivot_longer_spec(-family) - creates a specification that pivots all existing columns except the family column.
separate(col = name, into = c(".value", "child")) - splits the name column, which contains the names of the source fields, on the underscore and puts the resulting values into the .value and child columns.
mutate(child = parse_number(child)) - converts the values of the child field from text to a numeric data type.
Now we can apply the resulting specification to the original data frame and bring the table into the desired form.
We use the na.rm = TRUE argument because the current shape of the data forces rows to be created for observations that do not exist. Family 2 has only one child, so na.rm = TRUE ensures that family 2 has only one row in the result.
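The whole sequence can be sketched with the released tidyr API (1.0.0 or later), which is an assumption relative to the dev version used in this article: build_longer_spec() builds the specification, pivot_longer_spec() applies it, and na.rm is called values_drop_na. The family data is illustrative.

```r
library(tidyr)

# Illustrative data: family 2 has only one child
family <- tribble(
  ~family, ~dob_child1,  ~dob_child2,  ~gender_child1, ~gender_child2,
  1L,      "1998-11-26", "2000-01-29", 1L,             2L,
  2L,      "1996-06-22", NA,           2L,             NA,
  3L,      "2002-07-11", "2004-04-05", 2L,             2L
)

# Split column names on "_": the first part becomes .value (dob or gender),
# the second part becomes the child variable
spec <- family %>%
  build_longer_spec(-family, names_to = c(".value", "child"), names_sep = "_")

# Turn "child1"/"child2" into the numbers 1 and 2
spec$child <- as.integer(sub("child", "", spec$child))

# Apply the specification and drop rows for children that do not exist
family_long <- pivot_longer_spec(family, spec, values_drop_na = TRUE)
```

Because .value takes two values, the result has two value columns, dob and gender, and family 2 contributes a single row.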
Converting data frames from long to wide format
pivot_wider() is the inverse transformation: it increases the number of columns in a data frame by reducing the number of rows.
This kind of transformation is rarely used to bring data into tidy form. However, it can be useful for building pivot tables for presentations or for integrating with other tools.
In fact, pivot_longer() and pivot_wider() are symmetric and perform inverse operations. That is, df %>% pivot_longer(spec = spec) %>% pivot_wider(spec = spec) and df %>% pivot_wider(spec = spec) %>% pivot_longer(spec = spec) will both return the original df.
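The symmetry can be checked on a small table. This sketch uses the released function names pivot_longer_spec() and pivot_wider_spec(), which replaced the spec argument of the dev version; the `wide` table and the names `key` and `val` are made up for the example.

```r
library(tidyr)

wide <- tibble(id = 1:2, x = c(10, 20), y = c(30, 40))

# One specification drives both directions
spec <- build_longer_spec(wide, c(x, y), names_to = "key", values_to = "val")

long <- pivot_longer_spec(wide, spec)  # wide -> long
back <- pivot_wider_spec(long, spec)   # long -> wide again

# TRUE when the round trip reproduces the original table
isTRUE(all.equal(as.data.frame(back), as.data.frame(wide)))
```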
A simple example of converting a table to wide format
To demonstrate how pivot_wider() works, we will use the fish_encounters dataset, which stores information on how various stations record the movement of fish along a river.
#> # A tibble: 114 x 3
#> fish station seen
#> <fct> <fct> <int>
#> 1 4842 Release 1
#> 2 4842 I80_1 1
#> 3 4842 Lisbon 1
#> 4 4842 Rstr 1
#> 5 4842 Base_TD 1
#> 6 4842 BCE 1
#> 7 4842 BCW 1
#> 8 4842 BCE2 1
#> 9 4842 BCW2 1
#> 10 4842 MAE 1
#> # … with 104 more rows
In most cases this table will be more informative and easier to use if the information for each station is presented in a separate column.
fish_encounters %>% pivot_wider(names_from = station, values_from = seen)
#> # A tibble: 19 x 12
#> fish Release I80_1 Lisbon Rstr Base_TD BCE BCW BCE2 BCW2 MAE
#> <fct> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
#> 1 4842 1 1 1 1 1 1 1 1 1 1
#> 2 4843 1 1 1 1 1 1 1 1 1 1
#> 3 4844 1 1 1 1 1 1 1 1 1 1
#> 4 4845 1 1 1 1 1 NA NA NA NA NA
#> 5 4847 1 1 1 NA NA NA NA NA NA NA
#> 6 4848 1 1 1 1 NA NA NA NA NA NA
#> 7 4849 1 1 NA NA NA NA NA NA NA NA
#> 8 4850 1 1 NA 1 1 1 1 NA NA NA
#> 9 4851 1 1 NA NA NA NA NA NA NA NA
#> 10 4854 1 1 NA NA NA NA NA NA NA NA
#> # … with 9 more rows, and 1 more variable: MAW <int>
This dataset records information only when a fish was detected by a station. If a fish was not recorded by some station, that data is simply absent from the table, which means the output will be filled with NA.
In this case, however, we know that the absence of a record means the fish was not seen, so we can use the values_fill argument of pivot_wider() and fill these missing values with zeros:
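A minimal sketch of that call follows. The named-list form of values_fill is used here on the assumption that it is accepted by the installed tidyr version.

```r
library(tidyr)

fish_wide <- fish_encounters %>%
  pivot_wider(
    names_from = station,
    values_from = seen,
    values_fill = list(seen = 0L)  # a missing record means "not seen"
  )
```

Every station column is now filled with 0 or 1 and no NA values remain.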
Generating column names from multiple source variables
Imagine we have a table with combinations of product, country, and year. To generate a test data frame, you can use the following code:
df <- expand_grid(
  product = c("A", "B"),
  country = c("AI", "EI"),
  year = 2000:2014
) %>%
  filter((product == "A" & country == "AI") | product == "B") %>%
  mutate(value = rnorm(nrow(.)))
#> # A tibble: 45 x 4
#> product country year value
#> <chr> <chr> <int> <dbl>
#> 1 A AI 2000 -2.05
#> 2 A AI 2001 -0.676
#> 3 A AI 2002 1.60
#> 4 A AI 2003 -0.353
#> 5 A AI 2004 -0.00530
#> 6 A AI 2005 0.442
#> 7 A AI 2006 -0.610
#> 8 A AI 2007 -2.77
#> 9 A AI 2008 0.899
#> 10 A AI 2009 -0.106
#> # … with 35 more rows
Our task is to widen the data frame so that there is one column for each combination of product and country. To do this, simply pass a vector of the names of the fields to be combined to the names_from argument.
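A sketch of that call, with the test data rebuilt so the block runs on its own (the seed is arbitrary and only makes the random values repeatable):

```r
library(tidyr)
library(dplyr)

set.seed(1)  # arbitrary seed, for repeatable random values only
df <- expand_grid(
  product = c("A", "B"),
  country = c("AI", "EI"),
  year = 2000:2014
) %>%
  filter((product == "A" & country == "AI") | product == "B") %>%
  mutate(value = rnorm(n()))

# Column names are built from product and country, joined with "_"
df_wide <- df %>%
  pivot_wider(names_from = c(product, country), values_from = value)
```

Only the combinations that occur in the data get a column, so there is no A_EI column in this result.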
You can also apply specifications to pivot_wider(). When passed to pivot_wider(), however, a specification performs the opposite transformation to pivot_longer(): the columns named in .name are created, using the values from .value and the other columns.
For this dataset, you can generate a custom specification if you want every possible combination of country and product to have its own column, not only those present in the data:
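One way to build such a specification by hand is sketched below. It uses expand_grid() over the distinct products and countries, which is an assumption about how the original code was written, and applies the result with the released pivot_wider_spec() function.

```r
library(tidyr)
library(dplyr)

set.seed(1)  # arbitrary seed, for repeatable random values only
df <- expand_grid(
  product = c("A", "B"),
  country = c("AI", "EI"),
  year = 2000:2014
) %>%
  filter((product == "A" & country == "AI") | product == "B") %>%
  mutate(value = rnorm(n()))

# Every product/country pair, including A/EI, which never occurs in df
spec <- expand_grid(
  product = unique(df$product),
  country = unique(df$country)
) %>%
  mutate(.value = "value") %>%
  unite(".name", product, country, remove = FALSE)

df_wide <- pivot_wider_spec(df, spec)
```

The A_EI column exists in the output even though it contains only NA.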
#> # A tibble: 4 x 4
#> .name product country .value
#> <chr> <chr> <chr> <chr>
#> 1 A_AI A AI value
#> 2 A_EI A EI value
#> 3 B_AI B AI value
#> 4 B_EI B EI value
df %>% pivot_wider(spec = spec) %>% head()
#> # A tibble: 6 x 5
#> year A_AI A_EI B_AI B_EI
#> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 2000 -2.05 NA 0.607 1.20
#> 2 2001 -0.676 NA 1.65 -0.114
#> 3 2002 1.60 NA -0.0245 0.501
#> 4 2003 -0.353 NA 1.30 -0.459
#> 5 2004 -0.00530 NA 0.921 -0.0589
#> 6 2005 0.442 NA -1.55 0.594
More advanced examples of working with the new tidyr concept
Tidying data, using the US Census income and rent dataset as an example.
The us_rent_income dataset contains median income and rent information for each US state for 2017 (the dataset is available in the tidycensus package).
us_rent_income
#> # A tibble: 104 x 5
#> GEOID NAME variable estimate moe
#> <chr> <chr> <chr> <dbl> <dbl>
#> 1 01 Alabama income 24476 136
#> 2 01 Alabama rent 747 3
#> 3 02 Alaska income 32940 508
#> 4 02 Alaska rent 1200 13
#> 5 04 Arizona income 27517 148
#> 6 04 Arizona rent 972 4
#> 7 05 Arkansas income 23789 165
#> 8 05 Arkansas rent 709 5
#> 9 06 California income 29454 109
#> 10 06 California rent 1358 3
#> # … with 94 more rows
The form in which the data is stored in the us_rent_income dataset is very inconvenient to work with, so we would like to create a dataset with the columns rent, rent_moe, income, and income_moe. There are many ways to create this specification, but the key point is that we need to generate every combination of the variable values and estimate/moe, and then generate the column name.
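One possible way to build that specification is sketched here. It assumes the copy of us_rent_income bundled with released tidyr and the released pivot_wider_spec() function; the naming rule in the mutate() step is written for this example.

```r
library(tidyr)
library(dplyr)

# One spec row per (variable, estimate/moe) combination
spec <- expand_grid(
  variable = c("income", "rent"),
  .value = c("estimate", "moe")
) %>%
  # estimate columns keep the bare variable name, moe columns get a suffix
  mutate(.name = paste0(variable, ifelse(.value == "moe", "_moe", "")))

rent_wide <- pivot_wider_spec(us_rent_income, spec)
```

Each state now occupies one row, with the estimate and its margin of error side by side for both income and rent.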
Sometimes bringing a dataset into the desired form takes several steps.
The world_bank_pop dataset contains World Bank data on the population of each country between 2000 and 2018.
Our goal is to create a tidy dataset with each variable in its own column. It is not obvious exactly which steps are needed, but we will start with the most obvious problem: the year is spread across many columns.
To fix this, use the pivot_longer() function.
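A sketch of this first step follows. The year columns are selected by exclusion so the code does not depend on exactly which years the installed copy of the dataset contains; the name pop2 matches the one used in the next step.

```r
library(tidyr)

pop2 <- world_bank_pop %>%
  pivot_longer(
    cols = -c(country, indicator),  # every remaining column is a year
    names_to = "year",
    values_to = "value"
  )
```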
The next step is to look at the indicator variable.
pop2 %>% count(indicator)
#> # A tibble: 4 x 2
#> indicator n
#> <chr> <int>
#> 1 SP.POP.GROW 4752
#> 2 SP.POP.TOTL 4752
#> 3 SP.URB.GROW 4752
#> 4 SP.URB.TOTL 4752
Here SP.POP.GROW is population growth, SP.POP.TOTL is total population, and SP.URB.* are the same measures, but only for urban areas. Let's split these values into two variables: area, the type of area (total or urban), and a variable holding the actual data (population or growth):
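The split can be sketched with separate(), followed by what is presumably the final step, a pivot_wider() that gives growth and total their own columns. The names pop3 and pop_tidy are chosen for this example.

```r
library(tidyr)

pop2 <- world_bank_pop %>%
  pivot_longer(-c(country, indicator), names_to = "year", values_to = "value")

# "SP.POP.GROW" -> drop "SP", keep "POP" as area and "GROW" as variable
pop3 <- pop2 %>%
  separate(indicator, c(NA, "area", "variable"))

# One column per measure: GROW and TOTL
pop_tidy <- pop3 %>%
  pivot_wider(names_from = variable, values_from = value)
```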
Putting this list into tabular form is quite difficult, because there is no variable that identifies which data belongs to which contact. We can fix this by noting that the data for each new contact starts with "name", so we can create a unique identifier and increment it by one each time the field column contains the value "name":
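A sketch of that identifier, using cumsum() over a logical test. The contact list is rebuilt here so the block runs on its own, and the email address is a hypothetical placeholder, not the value from the original data.

```r
library(tidyr)
library(dplyr)

contacts <- tribble(
  ~field,    ~value,
  "name",    "Jiena McLellan",
  "company", "Toyota",
  "name",    "John Smith",
  "company", "google",
  "email",   "john@example.com",  # hypothetical placeholder address
  "name",    "Huxley Ratcliffe"
)

# field == "name" is TRUE at the start of each contact;
# the running total of those TRUEs numbers the contacts 1, 2, 3
contacts <- contacts %>%
  mutate(person_id = cumsum(field == "name"))
```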
#> # A tibble: 6 x 3
#> field value person_id
#> <chr> <chr> <int>
#> 1 name Jiena McLellan 1
#> 2 company Toyota 1
#> 3 name John Smith 2
#> 4 company google 2
#> 5 email [email protected] 2
#> 6 name Huxley Ratcliffe 3
Now that we have a unique ID for each contact, we can pivot field and value into columns:
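The pivot itself can be sketched as below, again with a rebuilt contact list and a hypothetical placeholder email address.

```r
library(tidyr)
library(dplyr)

contacts <- tribble(
  ~field,    ~value,
  "name",    "Jiena McLellan",
  "company", "Toyota",
  "name",    "John Smith",
  "company", "google",
  "email",   "john@example.com",  # hypothetical placeholder address
  "name",    "Huxley Ratcliffe"
) %>%
  mutate(person_id = cumsum(field == "name"))

# person_id is the only remaining column, so it identifies each output row
contacts_wide <- contacts %>%
  pivot_wider(names_from = field, values_from = value)
```

Fields that a contact does not have, such as a missing email or company, come out as NA.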
#> # A tibble: 3 x 4
#> person_id name company email
#> <int> <chr> <chr> <chr>
#> 1 1 Jiena McLellan Toyota <NA>
#> 2 2 John Smith google [email protected]
#> 3 3 Huxley Ratcliffe <NA> <NA>
Conclusion
In my opinion, the new tidyr concept is much more intuitive and significantly more functional than the legacy spread() and gather() functions. I hope this article has helped you get to grips with pivot_longer() and pivot_wider().