Because of the quarantine, many people now spend most of their time at home, and that time can, and even should, be put to good use.
At the beginning of the quarantine, I decided to finish some projects I had started a few months earlier. One of them was the video course "R Language for Excel Users". With this course, I wanted to lower the barrier to entry for R and to partly make up for the shortage of training materials on this topic in Russian.
If all data work at the company you work for is still done in Excel, I suggest you get acquainted with a more modern and completely free data analysis tool.
If you are interested in data analysis, you might be interested in my Telegram and YouTube channels, most of whose content is devoted to the R language.
The course is built around the tidyverse architecture and the packages included in it: readr, vroom, dplyr, tidyr, ggplot2. Of course, R has other good packages that perform similar operations, for example data.table, but tidyverse syntax is intuitive and easy to read even for an inexperienced user, so I think it is best to start learning R with the tidyverse.
The course will guide you through all data analysis operations, from loading to visualizing the finished result.
Why R and not Python? Because R is a functional language, it is easier for Excel users to switch to it: there is no need to delve into traditional object-oriented programming.
At the moment, 12 video lessons are planned, lasting from 5 to 20 minutes each.
Lessons will open gradually. Every Monday I will open access to a new lesson on my YouTube channel, in a separate playlist.
Who is this course for
I think this is clear from the title, but I will describe it in more detail.
The course is aimed at those who actively use Microsoft Excel in their work and do all their data work there. In general, if you open Microsoft Excel at least once a week, the course will suit you.
You do not need any programming skills to complete the course. It is aimed at beginners.
That said, starting from lesson 4 there may be material of interest to active R users as well, since the main functionality of packages such as dplyr and tidyr will be covered in some detail.
Course program
Lesson 1: Installing the R Language and RStudio Development Environment
Description:
An introductory lesson in which we download and install the necessary software and take a brief look at the features and interface of the RStudio development environment.
Description:
This lesson will help you understand which data structures exist in the R language. We will look at vectors, data frames and lists in detail, and learn how to create them and access their individual elements.
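To give a feel for these three structures, here is a minimal sketch in base R. It is not taken from the course itself, and all names and values (sales, orders, order_info) are invented for illustration.

```r
# A vector holds values of a single type.
sales <- c(120, 85, 240)
second_sale <- sales[2]            # access by position (R counts from 1)

# A data frame is a table whose columns are vectors of equal length.
orders <- data.frame(
  product = c("apple", "pear", "plum"),
  qty     = c(3, 5, 2)
)
qty_column  <- orders$qty          # a whole column, by name
second_item <- orders[2, "product"] # a single cell: row 2, column "product"

# A list can mix elements of different types and lengths.
order_info <- list(id = 101, items = c("apple", "pear"), paid = TRUE)
first_item <- order_info$items[1]  # element by name, then by position
is_paid    <- order_info[["paid"]] # double brackets return the element itself
```

For an Excel user, a data frame is the closest analogue of a worksheet table, and a vector is the closest analogue of a single column.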
Lesson 3: Reading data from TSV, CSV, Excel files and Google Sheets
Description:
Working with data, regardless of the tool, begins with extracting it. In this lesson we use the vroom, readxl and googlesheets4 packages to load data into the R environment from CSV, TSV and Excel files and from Google Sheets.
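As a rough sketch of what loading a file looks like, the example below writes a small TSV file to a temporary location and reads it back with vroom. The file contents and the names tsv_path and sales are assumptions made only so the example is runnable; the commented lines use placeholder paths and are not part of the course material.

```r
library(vroom)

# Hypothetical sample file, created here only so the example can be run.
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c("product\tqty", "apple\t3", "pear\t5"), tsv_path)

# vroom() can guess the delimiter; it is set explicitly here for clarity.
sales <- vroom(tsv_path, delim = "\t")

# Excel files and Google Sheets are read in a similar way
# (placeholder path and URL, shown for orientation only):
# readxl::read_excel("path/to/file.xlsx", sheet = 1)
# googlesheets4::read_sheet("<spreadsheet URL or ID>")
```

In every case the result is a table in the R session that the later lessons can filter, aggregate and plot.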
Lesson 4: Filtering Rows, Selecting and Renaming Columns, Pipelines in R
Description:
In this video we continue getting acquainted with the tidyverse library and the dplyr package.
We will look at the mutate() family of functions and learn how to use them to add new calculated columns to a table.
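A calculated column in dplyr is roughly what a formula column is in Excel. The sketch below assumes a made-up orders table; the column names revenue and big_order are invented for the example.

```r
library(dplyr)

# Hypothetical input table.
orders <- tibble(
  product = c("apple", "pear", "plum"),
  price   = c(2.5, 3.0, 4.0),
  qty     = c(4, 2, 5)
)

# mutate() adds new columns; later columns can use earlier ones.
orders_total <- orders %>%
  mutate(
    revenue   = price * qty,   # like =B2*C2 filled down a column
    big_order = revenue >= 10  # logical flag based on the new column
  )
```

The original columns are kept, and the number of rows does not change.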
Description:
This lesson is devoted to one of the main operations of data analysis: grouping and aggregation. During the lesson we will use the dplyr package and the functions group_by() and summarise().
We will cover the entire summarise() family of functions, i.e. summarise(), summarise_if() and summarise_at().
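As a minimal sketch of grouping and aggregation, assuming an invented sales table with a region column: summarise() collapses each group to one row, and summarise_if() applies one function to every column that passes a test. Note that in recent dplyr versions the scoped variants summarise_if() and summarise_at() are marked as superseded in favour of across(), although they still work.

```r
library(dplyr)

# Hypothetical input table.
sales <- tibble(
  region  = c("north", "north", "south", "south", "south"),
  revenue = c(100, 150, 80, 120, 200),
  qty     = c(1, 2, 1, 3, 4)
)

# One row per region, with named aggregates.
by_region <- sales %>%
  group_by(region) %>%
  summarise(total_revenue = sum(revenue), orders = n())

# Sum every numeric column within each region.
numeric_totals <- sales %>%
  group_by(region) %>%
  summarise_if(is.numeric, sum)
```

This is close to what a simple PivotTable with one row field and a few value fields does in Excel.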
Lesson 7: Joining Tables Vertically and Horizontally in R
Description:
Window functions are similar in meaning to aggregate functions: they also take an array of values as input and perform arithmetic operations on it, but they do not change the number of rows in the output.
In this lesson we continue to explore the dplyr package and the functions group_by() and mutate(), as well as the new cumsum(), lag(), lead() and arrange().
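The difference from aggregation can be sketched as follows, assuming an invented monthly sales table: the calculations run inside each group, but every input row is kept in the result.

```r
library(dplyr)

# Hypothetical input table.
sales <- tibble(
  region  = c("north", "north", "north", "south", "south"),
  month   = c(1, 2, 3, 1, 2),
  revenue = c(100, 150, 120, 80, 200)
)

windowed <- sales %>%
  arrange(region, month) %>%          # window functions depend on row order
  group_by(region) %>%
  mutate(
    running_total = cumsum(revenue),  # cumulative sum within the region
    prev_revenue  = lag(revenue),     # value from the previous row (NA for the first)
    next_revenue  = lead(revenue)     # value from the next row (NA for the last)
  ) %>%
  ungroup()
```

Here windowed has the same five rows as sales, plus three new columns.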
Description:
Most Excel users work with PivotTables, a handy tool that turns an array of raw data into readable reports in a matter of seconds.
In this lesson, we'll learn how to pivot tables in R and convert them from wide to long format and back.
Most of the lesson is devoted to the tidyr package and the functions pivot_longer() and pivot_wider().
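A minimal sketch of the two directions, assuming an invented table with one column per month: pivot_longer() stacks the month columns into rows, and pivot_wider() spreads them back out.

```r
library(tidyr)

# Hypothetical wide table: one column per month.
wide <- data.frame(
  product = c("apple", "pear"),
  jan     = c(10, 7),
  feb     = c(12, 9)
)

# Wide to long: month names go to "month", cell values go to "qty".
long <- pivot_longer(wide, cols = c(jan, feb),
                     names_to = "month", values_to = "qty")

# Long back to wide: one column per distinct month.
wide_again <- pivot_wider(long, names_from = month, values_from = qty)
```

The long form is usually the more convenient one for grouping and plotting, while the wide form is closer to a finished report.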
Lesson 10: Loading JSON Files into R and Converting Lists to Tables
Description:
JSON and XML are extremely popular formats for storing and exchanging information, usually due to their compactness.
But it is difficult to analyze data presented in such formats, therefore, before analysis, they must be brought to a tabular form, which is exactly what we will learn in this video.
The lesson covers the tidyr package, which is part of the core of the tidyverse library, and the functions unnest_longer(), unnest_wider() and hoist().
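As a rough sketch of bringing nested data to tabular form, the example below starts from a hand-written nested list of the kind a JSON parser typically returns. The list contents and the names users, flat and hoisted are assumptions made for illustration.

```r
library(tidyr)

# Hypothetical nested list, shaped like parsed JSON.
users <- list(
  list(name = "Anna",  city = "Kyiv", tags = list("admin", "dev")),
  list(name = "Boris", city = "Lviv", tags = list("analyst"))
)

# Put the list into a one-column table: one row per user.
raw <- tibble::tibble(user = users)

# unnest_wider(): each named element becomes its own column.
wide <- unnest_wider(raw, user)

# unnest_longer(): each element of the tags list becomes its own row.
flat <- unnest_longer(wide, tags)

# hoist(): pull out only selected elements, by name or by position.
hoisted <- hoist(raw, user, name = "name", first_tag = list("tags", 1L))
```

In this sketch flat has one row per user and tag, while hoisted keeps one row per user and leaves the remaining elements in the user column.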
Lesson 11: Plotting Quickly with the qplot() Function
Description:
The lesson demonstrates the full power of the ggplot2 package and its grammar for building plots in layers.
We will analyze the main geometries that are present in the package and learn how to overlay layers to build a graph.
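The idea of layers can be sketched like this, assuming an invented monthly sales table: ggplot() sets up the data and the mapping of columns to axes and colours, and each geom added with + draws one more layer on top.

```r
library(ggplot2)

# Hypothetical input table.
sales <- data.frame(
  month   = rep(1:6, times = 2),
  region  = rep(c("north", "south"), each = 6),
  revenue = c(100, 120, 150, 130, 170, 190,
              80, 95, 90, 110, 140, 135)
)

p <- ggplot(sales, aes(x = month, y = revenue, colour = region)) +
  geom_line() +            # layer 1: one line per region
  geom_point(size = 2) +   # layer 2: a point for every observation
  labs(title = "Monthly revenue by region", x = "Month", y = "Revenue")

# print(p) draws the plot; in RStudio it appears in the Plots pane.
```

Swapping geom_line() for another geometry, such as geom_col(), changes the chart type without touching the rest of the code.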
Conclusion
I tried to keep the course program as concise as possible and to include only the essential information you need to take your first steps in learning such a powerful data analysis tool as the R language.
This course is not meant to be a complete guide to data analysis with R, but it will help you understand all the techniques you need to do so.
The course program is designed for 12 weeks: every Monday I will open access to a new lesson, so I recommend subscribing to the YouTube channel so as not to miss one.