Painting in the style of Frida Kahlo of two women sitting on bench holding notebooks

Data Engineer vs Data Scientist (Responsibilities, Salary & Career Outlook 2024)

11.9.2024 | data science

TLDR

Data engineer vs data scientist: key differences

  • Responsibilities & toolkit:
    • data engineers facilitate data analysis by setting up data infrastructure and by preprocessing the data (toolkit: SQL, Python, databases, cloud (AWS/Azure/GCP)).
    • data scientists conduct data analysis using modern statistical methods (toolkit: Python (Pandas, Jupyter Notebook), Excel, SQL, Apache Superset/Tableau/Power BI, ML).
  • Salary: $ data scientist > $ data engineer
  • Career paths:
    • junior --> senior --> staff
    • in big tech: choose between the IC (individual contributor) track and the management track

What does a data engineer do?

Data engineers are responsible for a company's data infrastructure, i.e. for the systems that pull, store and organize data. They build data pipelines and manage databases, often in the cloud. They also prepare data so that other data experts or reporting tools can analyze it (a.k.a. data transformation). SQL is the data engineer's "love language".

What does a data scientist do?

Data scientists develop data-based solutions for complex problems (e.g. demand forecasts or personalized recommendations). They employ modern statistical methods, programmed in Python. In order to share their results, data scientists compile reports or build interactive dashboards.

Skill set: data scientist vs data engineer

Figure: Radar chart of the data engineer's and the data scientist's skill sets (6 dimensions: statistics, software engineering, data transformation, data modeling, data visualization, domain knowledge).

Data scientists focus on analyzing data and searching for suitable methods and models. Thus, knowledge of statistical theory, but also of programming languages, is essential for them. Data engineers generally spend even more time coding than data scientists. Conversely, statistical methodology takes a backseat for data engineers (at least when compared to data scientists).

Data engineer vs data scientist: responsibilities & toolkit

Data engineers write Python scripts to automatically load data into a database. In the database, they transform and aggregate data using SQL to organize it in such a way that all data workers can access the data in a flexible and efficient manner.
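As a rough illustration of that load-then-transform workflow, here is a minimal sketch. It assumes an in-memory SQLite database from Python's standard library as a stand-in for a real warehouse, and the `orders` table, its columns and the sample rows are all made up for this example.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a file or an API.
RAW_CSV = """order_id,customer,amount
1,alice,20.0
2,bob,35.5
3,alice,14.5
"""


def load_orders(conn, raw_csv):
    """Load the raw CSV rows into an `orders` table and return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    rows = [
        (int(r["order_id"]), r["customer"], float(r["amount"]))
        for r in csv.DictReader(io.StringIO(raw_csv))
    ]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def revenue_per_customer(conn):
    """Aggregate in SQL so downstream users can query one tidy result."""
    cur = conn.execute(
        "SELECT customer, SUM(amount) AS revenue, COUNT(*) AS n_orders "
        "FROM orders GROUP BY customer ORDER BY customer"
    )
    return cur.fetchall()


# SQLite in memory is an assumption made to keep the sketch self-contained.
conn = sqlite3.connect(":memory:")
n_loaded = load_orders(conn, RAW_CSV)
summary = revenue_per_customer(conn)
print(n_loaded, summary)
```

In a real pipeline the same two steps would typically run on a schedule against a cloud database, but the split stays the same: Python moves the data, SQL shapes it.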

What's new? Ten years ago, data engineers were still working on the company's own servers. Today it's common to use cloud services. Accordingly, experience with cloud environments makes a great addition to the CV of any data engineer.

Data scientists import Excel files or entire databases with SQL and carry out complex statistical analyses (from linear regression to k-nearest-neighbors (KNN) classification) using Pandas and Jupyter Notebook (i.e. Python packages and environments).
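To give a feel for the two methods named above, here is a small sketch of a linear regression and a KNN classification. It is written in plain Python so that it runs without extra installs; in day-to-day work a data scientist would more likely reach for Pandas and a library such as scikit-learn. The helper names `fit_line` and `knn_predict` and all data points are hypothetical.

```python
import math
from collections import Counter


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: ad spend (thousand EUR) vs. units sold.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units_sold = [12.0, 14.0, 16.0, 18.0, 20.0]
slope, intercept = fit_line(ad_spend, units_sold)

# Hypothetical customers: (orders per month, average basket in EUR) -> segment.
points = [(1, 10), (2, 12), (1, 15), (8, 60), (9, 55), (10, 70)]
labels = ["casual", "casual", "casual", "loyal", "loyal", "loyal"]
segment = knn_predict(points, labels, query=(7, 50))

print(slope, intercept, segment)
```

The regression answers "how much does the outcome change per unit of input?", while KNN assigns a new observation to the group its closest neighbors belong to.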

What's new? Similarly to data engineers, more skills are required of data scientists today than in the past. Since the AI boom of 2023, even more companies expect data scientists to not only master classic data science methods, but to also be familiar with machine learning, and, preferably, deep learning as well.

Who makes more? Salaries of data scientists and data engineers

A data scientist in Germany makes around €78,000 (gross median salary), a data engineer around €71,000. Junior data scientists and junior data engineers in Germany both make around €61,000 (gross median salary).

Best majors for data engineers and data scientists

All study programs that cover statistics and/or teach programming skills are good choices for data scientists. The following majors would be a good fit: data science, statistics or computer science, but also other STEM subjects, sociology, psychology, economics or business studies.

These majors would all be valid options for data engineers as well. However, as data engineering involves a lot of coding (and not so much complex statistics), computer science programs are the most obvious choice.

Where to work as a data engineer? Where to work as a data scientist?

Data scientists tend to be hired by companies with a higher level of digitalization. In Germany, these are mainly large companies or start-ups (medium-sized companies are slowly catching up).

Career outlook 2024: data scientist vs data engineer

German companies want to (and have to) become data-driven. They won't achieve this without data scientists and data engineers. As of 2024, data scientists and data engineers therefore have the best chances of being hired or headhunted, and demand won't die down anytime soon. If anything, demand for data engineers will increase even further, as more and more companies start using the cloud.

Career paths of data engineers and data scientists

Most companies have senior and junior positions for data engineers and data scientists. Your career opportunities beyond the senior level depend on company specifics: Does the company have its own data department or are you integrated into another department, say marketing? How big is the company? How serious is management about data analytics?

In large tech companies, both data engineers and data scientists can progress on their career path, either on the management track or on the individual contributor track (less personnel responsibility, more technical responsibility).

Can I switch between data engineer and data scientist jobs?

Data scientists sometimes slip into a data engineer role. When companies hire data scientists but no data engineers, data scientists often take on tasks such as building and managing databases. They practically train on the job to become data engineers.

Switching in the other direction is also possible, with data engineers taking on more data analysis tasks. BTW, for those with the combined skill set of data scientist and data engineer, a new job title has emerged: they are called analytics engineers.

P.S. The lines between data engineer and data scientist are blurred

What exactly you'll work on as a data scientist or data engineer will vary from company to company. The broad distinctions we have drawn in this post apply to most data engineers and data scientists. But not every company in Germany uses the job titles unambiguously. If there are no data engineers in the company, then the data scientists might have to build the pipelines themselves.