A skill-building plan for becoming a Data Engineer

For the last eight years I have worked as a project manager (I do not write code at work), which has naturally hurt my technical background. I decided to close this gap and become a Data Engineer. The core skill of a Data Engineer is the ability to design, build, and maintain data warehouses.

I put together a study plan that I think will be useful to others as well. The plan is built around self-study courses, with priority given to free courses in Russian.

Sections:

  • Algorithms and data structures. This is the key section: master it and everything else will follow. It is important to get hands-on practice writing code and using the basic data structures and algorithms.
  • Databases and data warehouses, Business Intelligence. Here we move from algorithms to storing and processing data.
  • Hadoop and Big Data. Big data begins when the database no longer fits on a single hard drive, or when the data needs to be analyzed but Excel can no longer load it. In my opinion, you should move on to this section only after studying the previous two in depth.

Algorithms and data structures

My plan includes learning Python and reviewing the basics of mathematics and algorithm design.
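
To give a feel for the kind of exercise that gets your hands on the code, here is a minimal sketch of a classic algorithm, binary search over a sorted list. The function name and the sample data are only illustrative and do not come from any particular course.

```python
from typing import Optional, Sequence


def binary_search(items: Sequence[int], target: int) -> Optional[int]:
    """Return the index of target in a sorted sequence, or None if it is absent."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            low = mid + 1   # target can only be in the right half
        else:
            high = mid - 1  # target can only be in the left half
    return None


sorted_ids = [3, 8, 15, 23, 42, 57]
found = binary_search(sorted_ids, 23)    # position of an existing value
missing = binary_search(sorted_ids, 10)  # value that is not in the list
print(found, missing)
```

Each step halves the search range, so the lookup takes a logarithmic number of comparisons. This is the same idea that makes indexed lookups in a database fast.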

Databases and data warehouses, Business Intelligence

Topics such as building data warehouses, ETL, and OLAP cubes depend heavily on the tools involved, so I do not link to courses for them in this document. It is best to study such systems while working on a specific project at a specific company. To get acquainted with ETL, you can try Talend or Airflow.
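
Before picking up a dedicated tool, it can help to see the extract, transform, load steps in plain Python. The sketch below uses only the standard library (csv and sqlite3). The sample data, table name, and function names are made up for illustration, and this is not how Talend or Airflow structure a pipeline.

```python
import csv
import io
import sqlite3

# Hypothetical source data; in a real pipeline this would come from a file or an API.
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,3.20
3,alice,not_a_number
"""


def extract(text):
    """Extract: read raw rows from CSV text."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize names, cast amounts, skip rows that fail to parse."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # a real pipeline would log or quarantine the bad row
        clean.append((int(row["order_id"]), row["customer"].strip().lower(), amount))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

ETL tools add scheduling, retries, monitoring, and connectors on top of this basic shape, which is why the choice of tool matters so much in practice.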

In my opinion, it is important to study the modern Data Vault design methodology (link 1, link 2). The best way to learn it is to implement it on a simple example. There are several Data Vault implementation examples on GitHub (link). A book on the modern data warehouse: Modeling the Agile Data Warehouse with Data Vault by Hans Hultgren.
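
As a starting point for such a simple example, here is a rough sketch of the three basic Data Vault table types (hub, link, satellite) in SQLite. The table and column names, the use of SHA-256 for hash keys, and the loading functions are my own assumptions for illustration. They are not taken from the book or from the GitHub examples.

```python
import hashlib
import sqlite3


def hash_key(*business_keys):
    """Derive a deterministic key from one or more business keys (illustrative choice)."""
    joined = "|".join(k.strip().upper() for k in business_keys)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_customer (           -- hub: one row per business key
    customer_hk TEXT PRIMARY KEY,
    customer_id TEXT NOT NULL,
    load_date TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE hub_product (
    product_hk TEXT PRIMARY KEY,
    product_id TEXT NOT NULL,
    load_date TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE link_purchase (          -- link: relationship between hubs
    purchase_hk TEXT PRIMARY KEY,
    customer_hk TEXT NOT NULL REFERENCES hub_customer (customer_hk),
    product_hk TEXT NOT NULL REFERENCES hub_product (product_hk),
    load_date TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE sat_customer_details (   -- satellite: descriptive attributes with history
    customer_hk TEXT NOT NULL REFERENCES hub_customer (customer_hk),
    load_date TEXT NOT NULL,
    name TEXT,
    city TEXT,
    record_source TEXT NOT NULL,
    PRIMARY KEY (customer_hk, load_date)
);
""")


def load_customer(conn, customer_id, name, city, source, load_date):
    hk = hash_key(customer_id)
    # Hubs are insert-only: a business key is registered once and never updated.
    conn.execute("INSERT OR IGNORE INTO hub_customer VALUES (?, ?, ?, ?)",
                 (hk, customer_id, load_date, source))
    # Satellites keep history: add a row only when the attributes have changed.
    latest = conn.execute(
        "SELECT name, city FROM sat_customer_details "
        "WHERE customer_hk = ? ORDER BY load_date DESC LIMIT 1", (hk,)).fetchone()
    if latest != (name, city):
        conn.execute("INSERT INTO sat_customer_details VALUES (?, ?, ?, ?, ?)",
                     (hk, load_date, name, city, source))


def load_purchase(conn, customer_id, product_id, source, load_date):
    c_hk, p_hk = hash_key(customer_id), hash_key(product_id)
    conn.execute("INSERT OR IGNORE INTO hub_product VALUES (?, ?, ?, ?)",
                 (p_hk, product_id, load_date, source))
    conn.execute("INSERT OR IGNORE INTO link_purchase VALUES (?, ?, ?, ?, ?)",
                 (hash_key(customer_id, product_id), c_hk, p_hk, load_date, source))


load_customer(conn, "C-1", "Alice", "Moscow", "crm", "2024-01-01")
load_customer(conn, "C-1", "Alice", "Moscow", "crm", "2024-01-15")  # unchanged, no new row
load_customer(conn, "C-1", "Alice", "Kazan", "crm", "2024-02-01")   # city changed, new row
load_purchase(conn, "C-1", "P-9", "shop", "2024-02-02")

hub_count = conn.execute("SELECT COUNT(*) FROM hub_customer").fetchone()[0]
history = conn.execute(
    "SELECT load_date, city FROM sat_customer_details ORDER BY load_date").fetchall()
print(hub_count, history)
```

The point of the exercise is to see that the hub stays stable while the satellite accumulates history, so new sources and attributes can be added without reworking existing tables.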

To get acquainted with Business Intelligence tools for end users, you can use Power BI Desktop, a free tool for building reports, dashboards, and small data warehouses. Learning materials: link 1, link 2.

Hadoop and Big Data

Conclusion

Not everything you learn can be applied at work, so you need a capstone project in which to apply your new knowledge.

The plan does not cover data analysis or Machine Learning; those belong more to the Data Scientist profession. It also does not cover cloud platforms such as AWS and Azure, since those topics depend heavily on the choice of platform.

Questions to the community:
How adequate is my study plan? What should I remove or add?
What project would you recommend as a capstone?

Source: habr.com
