R์„ ์‚ฌ์šฉํ•˜์—ฌ ์• ๋‹ˆ๋ฉ”์ด์…˜ ํžˆ์Šคํ† ๊ทธ๋žจ ๋งŒ๋“ค๊ธฐ

R์„ ์‚ฌ์šฉํ•˜์—ฌ ์• ๋‹ˆ๋ฉ”์ด์…˜ ํžˆ์Šคํ† ๊ทธ๋žจ ๋งŒ๋“ค๊ธฐ

Animated bar charts that can be embedded directly into a post on any website are becoming increasingly popular. They show how a given characteristic changes over a period of time, and they do it clearly. Let's see how to create them using R and commonly used packages.

Packages

R์—๋Š” ํŒจํ‚ค์ง€๊ฐ€ ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค.

์ด ๋‘ ๊ฐ€์ง€๋Š” ๋งค์šฐ ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค. ๋˜ํ•œ ๋ฐ์ดํ„ฐ๋ฅผ ๊ด€๋ฆฌํ•˜๊ณ  ๊ทธ์— ๋”ฐ๋ผ ๋ฐฐ์—ด์„ ์ •๋ฆฌํ•˜๊ณ  ํฌ๋งทํ•˜๋ ค๋ฉด tidyverse, ๊ด€๋ฆฌ์ธ ๋ฐ ์ €์šธ์ด ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค.
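
As a minimal sketch, the check below lists which of these packages are not yet installed. The helper name missing_packages is hypothetical, and gifski is included only on the assumption that GIF output will be wanted later.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!installed])
}

required <- c("ggplot2", "gganimate", "tidyverse", "janitor", "scales", "gifski")
to_install <- missing_packages(required)

# Uncomment to install whatever is missing:
# if (length(to_install) > 0) install.packages(to_install)
print(to_install)
```

The install call is left commented out so that running the check never changes the local library by accident.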

๋ฐ์ดํ„ฐ

The original dataset used in this project is downloaded from the World Bank website (World Bank Data). If you need the same data already prepared, it can be downloaded from the project folder.

์ด๊ฒƒ์€ ์–ด๋–ค ์ข…๋ฅ˜์˜ ์ •๋ณด์ž…๋‹ˆ๊นŒ? ์ƒ˜ํ”Œ์—๋Š” ์ˆ˜๋…„๊ฐ„(2000๋…„๋ถ€ํ„ฐ 2017๋…„๊นŒ์ง€) ๋Œ€๋ถ€๋ถ„์˜ ๊ตญ๊ฐ€์˜ GDP ๊ฐ€์น˜๊ฐ€ ํฌํ•จ๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค.

๋ฐ์ดํ„ฐ ์ฒ˜๋ฆฌ

The code below prepares the data in the required format: it cleans the column names, converts the figures to numeric type, and reshapes the data with the gather() function. The result is saved to gdp_tidy.csv for later use.

library(tidyverse)
library(janitor)

gdp <- read_csv("./data/GDP_Data.csv")

# Select the required columns
gdp <- gdp %>% select(3:15)

# Keep only the country rows
gdp <- gdp[1:217, ]

gdp_tidy <- gdp %>%
  mutate_at(vars(contains("YR")), as.numeric) %>%
  gather(year, value, 3:13) %>%
  janitor::clean_names() %>%
  mutate(year = as.numeric(stringr::str_sub(year, 1, 4)))

write_csv(gdp_tidy,"./data/gdp_tidy.csv")
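
To make the reshaping step concrete, here is a small sketch of what gather() does to a wide table. The toy data frame and its values are invented for illustration, and the column names only assume the "2000 [YR2000]" pattern that the code above appears to rely on.

```r
library(tidyr)

# Toy wide table: one row per country, one column per year (invented values).
wide <- data.frame(
  country_name = c("A", "B"),
  country_code = c("AAA", "BBB"),
  `2000 [YR2000]` = c(1, 2),
  `2001 [YR2001]` = c(3, 4),
  check.names = FALSE
)

# Stack the year columns (positions 3 and 4) into key/value pairs.
long <- gather(wide, year, value, 3:4)

# Keep only the leading four-digit year, as the article's code does.
long$year <- as.numeric(substr(long$year, 1, 4))

print(long)
```

Each country now has one row per year, which is the layout the ranking and plotting steps expect.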

์• ๋‹ˆ๋ฉ”์ด์…˜ ํžˆ์Šคํ† ๊ทธ๋žจ

์ƒ์„ฑ์—๋Š” ๋‘ ๋‹จ๊ณ„๊ฐ€ ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค.

  • ggplot2๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์‹ค์ œ ํžˆ์Šคํ† ๊ทธ๋žจ์˜ ์ „์ฒด ์„ธํŠธ๋ฅผ ํ”Œ๋กฏํ•ฉ๋‹ˆ๋‹ค.
  • gganimate๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์›ํ•˜๋Š” ๋งค๊ฐœ๋ณ€์ˆ˜๋กœ ์ •์  ํžˆ์Šคํ† ๊ทธ๋žจ์— ์• ๋‹ˆ๋ฉ”์ด์…˜์„ ์ ์šฉํ•ฉ๋‹ˆ๋‹ค.

๋งˆ์ง€๋ง‰ ๋‹จ๊ณ„๋Š” GIF ๋˜๋Š” MP4๋ฅผ ํฌํ•จํ•˜์—ฌ ์›ํ•˜๋Š” ํ˜•์‹์œผ๋กœ ์• ๋‹ˆ๋ฉ”์ด์…˜์„ ๋ Œ๋”๋งํ•˜๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค.

๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ ๋กœ๋“œ ์ค‘

  • ๋„์„œ๊ด€(ํƒ€์ด๋””๋ฒ„์Šค)
  • ๋„์„œ๊ด€(gganimate)

๋ฐ์ดํ„ฐ ๊ด€๋ฆฌ

In this step we filter the data to keep the top 10 countries for each year. We also add a few columns that are used to label the bars in the histogram.

gdp_tidy <- read_csv("./data/gdp_tidy.csv")

gdp_formatted <- gdp_tidy %>%
  group_by(year) %>%
  # Rank countries within each year (1 = largest GDP); tied values get averaged ranks
  mutate(rank = rank(-value),
         Value_rel = value / value[rank == 1],
         Value_lbl = paste0(" ", round(value / 1e9))) %>%
  group_by(country_name) %>%
  filter(rank <= 10) %>%
  ungroup()
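
The ranking logic can be checked on a toy table. This is only a sketch with invented values, keeping the top 2 instead of the top 10 so that the result is easy to read.

```r
library(dplyr)

# Invented GDP values for three countries over two years.
toy <- data.frame(
  year = c(2000, 2000, 2000, 2001, 2001, 2001),
  country_name = rep(c("A", "B", "C"), times = 2),
  value = c(3e9, 1e9, 2e9, 1e9, 5e9, 4e9)
)

top2 <- toy %>%
  group_by(year) %>%
  mutate(rank = rank(-value)) %>%  # negate so the largest value gets rank 1
  filter(rank <= 2) %>%
  ungroup()

print(top2)
```

Because the ranking runs inside group_by(year), each year gets its own top list, so a country can enter or leave it from one year to the next.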

Building the static histograms

Now that the data is in the required format, we can start drawing the static histograms. The basic information shown is the top 10 countries with the highest GDP for the selected period. We build one plot for each year.

staticplot = ggplot(gdp_formatted, aes(rank, group = country_name,
                                       fill = as.factor(country_name), color = as.factor(country_name))) +
  geom_tile(aes(y = value/2,
                height = value,
                width = 0.9), alpha = 0.8, color = NA) +
  geom_text(aes(y = 0, label = paste(country_name, " ")), vjust = 0.2, hjust = 1) +
  geom_text(aes(y = value, label = Value_lbl, hjust = 0)) +
  coord_flip(clip = "off", expand = FALSE) +
  scale_y_continuous(labels = scales::comma) +
  scale_x_reverse() +
  # "none" replaces the deprecated FALSE for switching guides off
  guides(color = "none", fill = "none") +
  theme(axis.line = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        legend.position = "none",
        panel.background = element_blank(),
        panel.border = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.grid.major.x = element_line(size = .1, color = "grey"),
        panel.grid.minor.x = element_line(size = .1, color = "grey"),
        plot.title = element_text(size = 25, hjust = 0.5, face = "bold", colour = "grey", vjust = -1),
        plot.subtitle = element_text(size = 18, hjust = 0.5, face = "italic", color = "grey"),
        plot.caption = element_text(size = 8, hjust = 0.5, face = "italic", color = "grey"),
        plot.background = element_blank(),
        plot.margin = margin(2, 2, 2, 4, "cm"))

Creating plots with ggplot2 is quite simple. As the code above shows, there are a few key points in the theme() call. They are needed so that all elements animate without problems, and some elements can be hidden if necessary. For example, only the vertical grid lines are drawn, while the legend, the axis titles and several other components are removed from the plot area.
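
For a quick look at the core of that plot without the theme details, the sketch below draws the same kind of geom_tile() bars for a single invented year. The toy data frame is an assumption made for illustration. Each bar is a tile centred at value/2 with height value, so it spans from 0 to value.

```r
library(ggplot2)

# Invented data for one year: three countries, already ranked.
toy <- data.frame(
  country_name = c("A", "B", "C"),
  rank = c(1, 2, 3),
  value = c(30, 20, 10)
)

p <- ggplot(toy, aes(rank, fill = country_name)) +
  geom_tile(aes(y = value / 2, height = value, width = 0.9), alpha = 0.8) +
  coord_flip(clip = "off", expand = FALSE) +  # horizontal bars
  scale_x_reverse() +                          # rank 1 at the top
  theme(legend.position = "none")

# print(p)  # uncomment to draw the plot on the active graphics device
```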

์ƒ๊ธฐ

The key function here is transition_states(), which stitches the separate static plots together. view_follow() makes the view follow the data range from frame to frame, so the grid lines appear to move as the animation progresses.

anim = staticplot +
  transition_states(year, transition_length = 4, state_length = 1) +
  view_follow(fixed_x = TRUE) +
  labs(title = 'GDP per Year : {closest_state}',
       subtitle = "Top 10 Countries",
       caption = "GDP in Billions USD | Data Source: World Bank Data")
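
The same pattern can be tried on a tiny invented dataset before running it on the full plot. This is only a sketch: the toy data frame and the name anim_toy are assumptions, and nothing is rendered until animate() is called.

```r
library(ggplot2)
library(gganimate)

# Invented values: two countries whose ranks swap between two years.
toy <- data.frame(
  year = rep(c(2000, 2001), each = 2),
  country_name = rep(c("A", "B"), times = 2),
  rank = c(2, 1, 1, 2),
  value = c(10, 20, 30, 15)
)

anim_toy <- ggplot(toy, aes(rank, value, fill = country_name, group = country_name)) +
  geom_col() +
  # One state per year; transitions last four times as long as the pauses.
  transition_states(year, transition_length = 4, state_length = 1) +
  view_follow(fixed_x = TRUE) +
  labs(title = "Year: {closest_state}")

# animate(anim_toy, nframes = 40, fps = 10)  # uncomment to render
```

transition_length and state_length are relative values, so 4 and 1 mean that moving between years takes four times as long as each pause on a year.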

Rendering

์• ๋‹ˆ๋ฉ”์ด์…˜์ด ์ƒ์„ฑ๋˜์–ด anim ๊ฐ์ฒด์— ์ €์žฅ๋˜๋ฉด animate() ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋ Œ๋”๋งํ•  ์ฐจ๋ก€์ž…๋‹ˆ๋‹ค. animate()์— ์‚ฌ์šฉ๋˜๋Š” ๋ Œ๋”๋Ÿฌ๋Š” ํ•„์š”ํ•œ ์ถœ๋ ฅ ํŒŒ์ผ ์œ ํ˜•์— ๋”ฐ๋ผ ๋‹ค๋ฅผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

GIF

# For GIF

animate(anim, 200, fps = 20, width = 1200, height = 1000,
        renderer = gifski_renderer("gganim.gif"))

MP4

# For MP4

animate(anim, 200, fps = 20, width = 1200, height = 1000,
        renderer = ffmpeg_renderer()) -> for_mp4

anim_save("animation.mp4", animation = for_mp4 )
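
As a rough sketch of how the numbers in those calls relate, the snippet below works out the clip length and checks whether the tools behind the two renderers appear to be available. It assumes that gifski_renderer() needs the gifski package and that ffmpeg_renderer() needs an ffmpeg executable on the PATH.

```r
nframes <- 200  # second argument of animate() in the calls above
fps <- 20

# Playback duration in seconds: frames divided by frames per second.
duration_sec <- nframes / fps
print(duration_sec)

# Availability checks (assumptions about what each renderer needs).
has_gifski <- requireNamespace("gifski", quietly = TRUE)
has_ffmpeg <- nzchar(Sys.which("ffmpeg"))
print(c(gifski = has_gifski, ffmpeg = unname(has_ffmpeg)))
```

With 200 frames at 20 fps the animation runs for 10 seconds. Raising nframes at the same fps makes it longer and smoother, and rendering takes longer.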

Result

R์„ ์‚ฌ์šฉํ•˜์—ฌ ์• ๋‹ˆ๋ฉ”์ด์…˜ ํžˆ์Šคํ† ๊ทธ๋žจ ๋งŒ๋“ค๊ธฐ

As you can see, there is nothing complicated about it. The whole project is available on my GitHub, and you can use it as you see fit.

Source: habr.com