If you Google “data engineering” and “boring” together, you’ll find plenty of people complaining about how repetitive or stale this job is.
Is it really just an endless cycle of building data pipelines? Will you spend your entire career working only with SQL and Python?
There was a phase earlier in my career when I felt this intensely. “Did I choose the wrong area?” I kept asking myself.
However, the more projects I got involved in, the more I saw that data engineering is actually a craft.
Also, I've become convinced that although my job as a Data Engineer will always be “pipeline-centered” (as in, “my goal will always be processing tons of data”), there's a hidden coefficient that makes a project cool or not: its platform.
Dante's Inferno of Data Engineering
Like Alighieri, let's start our journey in the Hades of our job. Let's paint the general picture of a boredom-prone role.
The firefighter Data Engineer is either stressed or bored, period.
Their only role is to do the bidding of the people who analyze the data. Bonus stress points if those people are meticulous about silly details and not technical enough to transform the data into the format they want themselves.
Bonus boredom points if this Data Engineer is trapped in a low-code form-filling system like Talend or even Airbyte to some extent. “YAML Data Engineering” can be a stale position, as well.
This Data Engineer's job is to create spreadsheets, views, aggregations, or tables for business stakeholders on request, week after week. If they're lucky, a new system will be implemented at their company, and they'll get a say on things like architecture, schema, or data quality. But that's it.
This is definitely a negative coefficient, one that will lead (a) to burnout or (b) to permanent Data Engineering attrition.
If your role is not proactive, but reactive to others’ needs, you'll lose your sense of purpose and quickly become demotivated to work. Engineering should mean building, period.
The Paraíso of Data Engineering
My favorite projects are the ones built around platform patterns.
I love a “coding-first” architecture where the Engineer has to code things like: infrastructure, CI/CD, test suite, data quality checks, DDL migration, common data processing functions, DAGs, monitoring, and everything in between.
I just love this sense of bringing Software Engineering best practices to Data Engineering - and we can discuss later whether Data Engineering is Software Engineering.
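To make that concrete, here is a minimal sketch of what a coded data quality check with its own test might look like. Everything in it is hypothetical and made up for illustration: the `find_null_keys` function, the `order_id` column, and the sample rows. A real project would more likely lean on a framework such as Great Expectations for this.

```python
from typing import Any


def find_null_keys(rows: list[dict[str, Any]], key: str) -> list[int]:
    """Return the positions of rows whose `key` is missing or None."""
    return [i for i, row in enumerate(rows) if row.get(key) is None]


def test_find_null_keys_flags_bad_rows() -> None:
    # Hypothetical sample batch: the last two rows have no usable order_id
    # (one is explicitly None, the other lacks the column entirely).
    rows = [
        {"order_id": 1, "amount": 10.0},
        {"order_id": None, "amount": 5.5},
        {"amount": 7.25},
    ]
    assert find_null_keys(rows, "order_id") == [1, 2]
```

Pytest would pick up the `test_` function automatically, so a check like this can run in CI on every commit, just like any other piece of software.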
This is the type of project where I truly feel like a builder. It's in this context where I feel the purpose of building beautiful, scalable software. In a sense, it's a bit like what I feel when I'm drawing or painting something.
Lest anyone say I ignored it, the data must be interesting, too. There are some domains that simply attract me. For instance, when I'm extracting interesting data from an API, I feel just like a miner finding a vein of ore.
I truly believe that Data Engineering shines when it's developed in the company of quality software. It's great for the Engineer to have a broader toolset, especially when you mix in DevOps and Cloud Engineering practices.
The keywords that automatically get me excited about a project: (1) GitHub Actions or GitLab Pipelines; (2) Terraform or Pulumi; (3) Pytest; (4) Event-driven architecture; (5) MLOps pipeline; (6) Great Expectations.
It's not always that I get to use these cool software products AND still build high-quality datasets.
Does that reinforce the point of Data Engineering being boring?
After all, if you need things added, then the core must suck.
Well, here in Brazil there's a native berry called açaí. We process its pulp by freezing, pasteurizing and blending it, until it becomes ice-cream-like. But the catch is: it's not exactly sweet.
We normally eat açaí with some toppings like actual ice-cream, sweetened condensed milk, bananas, granola, etc. You’ll rarely see someone eating a bowl of pure açaí.
But does that mean that açaí tastes bad?
Of course not. It’s just better with the right add-ons.
I feel Data Engineering is the same. Alone, it's not bad, but it can become bitter very quickly. If you sprinkle in a little bit of DevOps, Cloud Engineering, MLOps and Back-End, you'll hit the spot!