As a data engineer, you will support the efficient processing of a growing volume of operational data, both for day-to-day operations and for product improvements. You will work closely with the backend and product teams to translate key product requirements into scalable data processing pipelines.
Your contributions will be essential to further strengthening our value proposition. As part of an international team of experts, you will extend our solutions with additional data-centric features. You will also contribute to the research and development of operational improvements by defining new data services that optimize installation steps and post-installation monitoring.
Key Qualifications:
3-5 years of proven experience and current, hands-on expertise building production-grade solutions, not just prototypes.
ETL pipeline design, implementation, and health monitoring.
Reporting using Metabase and Grafana.
Scientific programming in Python using pandas, NumPy, and SciPy.
Database management and event engineering using SQL, MongoDB, Redis, Kafka, InfluxDB, PySpark, and Liquibase.
Databricks and Azure, including their CLIs.
Software engineering in Python and Node.js, testing with pytest, and dependency and package management with Poetry or Conda.
Knowledge of database infrastructure patterns for high performance and scalability (replication, caching, sharding, etc.).
Git or other version control solutions.
You get extra points if you master:
MLOps (MLflow, TensorBoard, etc.)
Statistics and ML, including (but not limited to) digital signal processing, scientific computing, and feature engineering.
Advanced statistical modeling and machine learning engineering (scikit-learn, statsmodels, XGBoost, TensorFlow, PyTorch).
Implementation and monitoring of deployment pipelines (Dockerized development, Kubernetes, Azure DevOps, etc.).
Knowledge of different PostgreSQL index types and the ability to assess their performance by analyzing query plans.
AWS, including the AWS CLI.
Leveraging AWS RDS Performance Insights to identify potential issues and areas for improvement.
What we offer: