SRE - DataPlatform

Juniors accepted
Permanent contract
Site Reliability Engineer (SRE)
Kubernetes
Grafana
Kafka


🌟 About the Role 🌟

Being an SRE at VeepeeTech means being part of a cross-team SRE community while joining a product-oriented Data Platform team.

You will contribute to the reliability, scalability, and operability of critical data services by applying SRE and DevOps practices, while sharing knowledge across teams.

The Data Platform is currently evolving toward a modern lakehouse architecture deployed on VeepeeCloud (our on-prem platform), based on technologies such as Trino, Iceberg, and object storage, with strong ambitions around performance, cost efficiency, and platform ownership.

You will work in a distributed environment (France & Spain), within a team of 40–50 data professionals across engineering, analytics, data science, and governance.

You will play a key role in ensuring the reliability and scalability of this next-generation data platform, while supporting the transition from public cloud to hybrid/on-prem architectures.


🎯 TASKS 🎯

☁️ Platform Reliability & Operations

  • Ensure reliability and performance of our data platform services (Trino, Iceberg, S3, Kafka, Flink)
  • Define and implement SRE best practices: SLIs/SLOs, error budgets, observability
  • Build and maintain monitoring, alerting, and incident response frameworks (Prometheus, Grafana, etc.)

🚀 Cloud Migration & Architecture

  • Contribute to the migration from the public cloud data warehouse to the VeepeeCloud lakehouse stack
  • Support the coexistence of cloud and on-prem systems, ensuring consistency and reliability across both
  • Help design resilient architectures for ingestion, transformation, and serving layers

☸️ Kubernetes & Infrastructure

  • Operate and improve services running on Kubernetes (GKE/EKS & on-prem clusters)
  • Automate infrastructure provisioning using Terraform, Atlantis, and/or Crossplane
  • Improve GitOps workflows for platform deployment and configuration

💰 FinOps & Performance Optimization

  • Collaborate with teams to optimize compute/storage usage (Trino queries, BigQuery slots, etc.)
  • Build tools and dashboards to track cost, usage, and efficiency
  • Support the transition toward cost-efficient on-prem workloads

🤝 Developer Enablement

  • Improve self-service capabilities for data teams (e.g., provisioning Trino/Iceberg resources)
  • Help teams adopt best practices in reliability, observability, and deployment
  • Write clear technical documentation and runbooks

🛡️ Resilience & DRP

  • Contribute to the definition and implementation of the Disaster Recovery Plan (DRP)
  • Ensure multi-DC resilience (FR1 / NL1) and sound data replication strategies
  • Participate in incident management and postmortems

👉 MUST HAVE skills 👈

  • Strong experience with Kubernetes in production environments
  • Experience with distributed data systems (or strong willingness to learn)
  • Solid understanding of SRE principles (monitoring, alerting, SLAs/SLOs)
  • Experience with Infrastructure as Code (Terraform or similar)
  • Familiarity with GitOps workflows
  • Experience with observability tools (Prometheus, Grafana, logging systems)
  • Comfortable working in cloud environments
  • Strong collaboration mindset and ability to work across teams
  • Fluent in English

👉 NICE TO HAVE skills 👈

  • Experience with Trino, Iceberg, or data lakehouse architectures
  • Experience with Ceph S3 or object storage systems
  • Knowledge of Kafka / Flink / Airflow
  • Experience with FinOps practices and cost optimization
  • Experience with Crossplane or platform self-service models
  • Programming skills (Python, Java, or Go)
  • Experience with multi-region / multi-DC architectures

BENEFITS

  • Variable bonus
  • A dynamic and creative environment within international teams
  • A variety of self-paced courses on our e-learning platform
  • Participation in local and international meetups and conferences
  • Flexible office policy with up to 2 days at home

⚙️ RECRUITMENT PROCESS ⚙️

  1. 30-minute HR screen with a VeepeeTech recruiter
  2. General Technical exchange
  3. Technical exchange with the manager
  4. Team Interview

We are convinced that it is up to you to define the way you work, to develop yourself and to progress.
At Veepee we guarantee that you can just be yourself!
In support of diversity and inclusion, Veepee is committed to reviewing all applications on an equal basis.

🔗 COMPANY
For more information about our ecosystem: https://careers.veepee.com/en/home-page-en/

Reference: veepeetech+Veepee-SRE-DataPlatform

Skills

Data
Grafana
Kafka
Progress
Backend
Go
Java
Python
Ops
Kubernetes
Terraform
Cloud
Prometheus
Project Management
Management
