Kestra: The modern data workflow orchestrator

Event date: Mar. 03 2025

Discover Kestra, the new open-source cloud-native orchestrator that simplifies and automates your data workflows with flexibility, scalability, and observability.

In a world where data flow management has become critical, workflow orchestration is a key element in ensuring the performance, reliability, and scalability of data pipelines. Among emerging solutions, Kestra stands out as a modern open-source orchestrator, designed for cloud-native environments and distributed architectures.

Kestra allows workflows to be created either by coding them directly or by using an intuitive visual interface, thus providing greater accessibility to both technical and business teams.

What is Kestra?

Kestra is a workflow orchestration platform designed to simplify the management of complex tasks by enabling declarative and scalable modeling of data pipelines. It supports nearly 200 native connectors, covering databases (PostgreSQL, MySQL, MongoDB), cloud services (AWS, GCP, Azure), messaging systems (Kafka, RabbitMQ), and much more.

Kestra architecture

Kestra is built on a distributed and cloud-native architecture that enables efficient orchestration of large-scale workflows. Here are the key components:

Multi-tenant: Kestra is designed to manage multiple independent workspaces (namespaces), allowing different teams or departments to run their own workflows without interference. Each namespace can be configured with its own resources and permissions.
Workers: They execute workflow tasks. They are distributed and scalable, so Kestra can handle a large number of tasks in parallel. Workers can be deployed on Kubernetes or on dedicated machines, which gives flexibility in how capacity is scaled.
Executors: They manage the launching and scheduling of workflows. They track the state of running workflows, restart failed tasks, and ensure tasks run in the intended order. Kestra uses an event-driven architecture that allows executors to react immediately to task state changes.
Storage and persistence: Kestra uses a centralized database (often PostgreSQL) to store workflow definitions, execution logs, and metadata. Keeping this state in one place supports high availability and quick recovery after an incident.
API and user interface: Kestra provides a REST API for interacting with the orchestration engine. A comprehensive graphical user interface enables users to design, monitor, and troubleshoot workflows without writing a single line of code.
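
To make the worker model more concrete, here is a minimal sketch of a flow whose branches can be picked up by workers at the same time. The flow ID, task IDs, and messages are hypothetical, and the task type names assume a recent Kestra release that uses the `io.kestra.plugin.core` package; older releases may use different type names.

```yaml
# Illustrative sketch only: IDs and messages are made up for this example.
id: parallel_sketch
namespace: entreprise

tasks:
  # Child tasks of a Parallel task do not wait for one another.
  - id: fan_out
    type: io.kestra.plugin.core.flow.Parallel
    tasks:
      - id: branch_a
        type: io.kestra.plugin.core.log.Log
        message: "Branch A running"
      - id: branch_b
        type: io.kestra.plugin.core.log.Log
        message: "Branch B running"

  # Runs only once every branch above has finished.
  - id: after_fan_out
    type: io.kestra.plugin.core.log.Log
    message: "Both branches finished"
```

The executor decides which tasks are ready to run, and the workers execute them; the final task is held back until the parallel group has completed.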

Example: workflow implementation

Let's take the example of a mid-sized company looking to orchestrate a sales data processing pipeline. The Kestra workflow can be defined in YAML and includes the following steps:

Data extraction: A task connects to a PostgreSQL database and extracts the day's sales.
Transformation: A Python task cleans and enriches the data using Pandas.
Loading: The data is sent to a Data Warehouse like BigQuery or Snowflake.
Notification: A message is sent via Slack or Microsoft Teams to alert analysts.
Monitoring: Executions are tracked using Prometheus and Grafana to detect errors and monitor performance.

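Example of a YAML workflow definition (the connection URL, secret names, and GCP project ID are placeholders):

id: workflow-ventes
namespace: entreprise

inputs:
  - id: date
    type: STRING

tasks:
  # Extract the day's sales and store the result set in Kestra's internal storage.
  - id: extraction
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: jdbc:postgresql://localhost:5432/ventes
    username: "{{ secret('POSTGRES_USERNAME') }}"
    password: "{{ secret('POSTGRES_PASSWORD') }}"
    sql: "SELECT * FROM ventes WHERE date = '{{ inputs.date }}'"
    fetchType: STORE

  # Convert the stored result (Ion format) to CSV so that pandas can read it.
  - id: conversion
    type: io.kestra.plugin.serdes.csv.IonToCsv
    from: "{{ outputs.extraction.uri }}"

  # Clean the data with pandas and expose the cleaned file as a task output.
  - id: transformation
    type: io.kestra.plugin.scripts.python.Script
    beforeCommands:
      - pip install pandas
    inputFiles:
      data.csv: "{{ outputs.conversion.uri }}"
    outputFiles:
      - clean_data.csv
    script: |
      import pandas as pd
      df = pd.read_csv('data.csv')
      df = df.dropna()
      df.to_csv('clean_data.csv', index=False)

  # Load the cleaned file into BigQuery.
  - id: chargement
    type: io.kestra.plugin.gcp.bigquery.Load
    from: "{{ outputs.transformation.outputFiles['clean_data.csv'] }}"
    destinationTable: "my-project.ventes.ventes_journalieres"
    format: CSV
    csvOptions:
      skipLeadingRows: 1

  # Notify analysts through a Slack incoming webhook.
  - id: notification
    type: io.kestra.plugin.notifications.slack.SlackIncomingWebhook
    url: "{{ secret('SLACK_WEBHOOK_URL') }}"
    payload: |
      {"channel": "#data-alerts", "text": "Sales pipeline completed successfully!"}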
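
Since the pipeline processes the day's sales, it would typically run on a schedule rather than by hand. The sketch below shows one way this might look, using a separate, hypothetical flow with a single logging task so that it stays self-contained. The flow ID, trigger ID, and cron expression are assumptions chosen for illustration.

```yaml
# Illustrative sketch only: IDs and the cron expression are made up for this example.
id: scheduled_sketch
namespace: entreprise

tasks:
  - id: announce
    type: io.kestra.plugin.core.log.Log
    message: "Scheduled run for {{ trigger.date }}"

triggers:
  # Start one execution every day at 06:00.
  - id: every_morning
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 6 * * *"
```

The same `triggers` block could be appended to the sales flow; the trigger would then also need to supply a value for the `date` input.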

Why Kestra?

1. Simplicity

Kestra allows workflows to be defined as YAML files, making integration into DevOps and DataOps processes easier. Unlike orchestrators such as Apache Airflow, Kestra focuses on a more intuitive and readable approach, reducing the complexity of developing and maintaining pipelines.

Kestra is language-agnostic, enabling developers to write tasks in various languages such as Go, Python, R, Java, Node.js, and many others, providing great flexibility for technical teams.
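
As a rough illustration of this language-agnostic approach, the sketch below chains three script tasks written in different languages within one flow. The flow ID, task IDs, and script contents are hypothetical; the task types come from Kestra's script plugins and assume those plugins are installed.

```yaml
# Illustrative sketch only: IDs and script bodies are made up for this example.
id: polyglot_sketch
namespace: entreprise

tasks:
  - id: python_step
    type: io.kestra.plugin.scripts.python.Script
    script: |
      print("Hello from Python")

  - id: node_step
    type: io.kestra.plugin.scripts.node.Script
    script: |
      console.log("Hello from Node.js");

  - id: shell_step
    type: io.kestra.plugin.scripts.shell.Commands
    commands:
      - echo "Hello from the shell"
```

Each task runs in its own process, so the languages do not need to share a runtime; data is passed between them through files and task outputs.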

2. Cloud-native scalability

Designed to run on Kubernetes, Kestra enables smooth and automatic scaling to meet growing data processing needs. It offers fine-grained resource management with distributed executions, ensuring high availability and optimized workload management.

3. Flexibility and versatility

Kestra supports various use cases:

ETL & ELT : Data integration and transformation in batch or real-time mode.
Data engineering : Automation of complex workflows for data management.
Machine learning & AI : Complete orchestration of machine learning pipelines, including data preparation, model training, validation, deployment to production, and continuous monitoring.

4. Observability and security

Native integration with Prometheus, Grafana, and Elasticsearch enables precise execution tracking and detailed performance analysis. Kestra also provides error management mechanisms and automatic recovery in case of failure, ensuring increased reliability.
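
The error management and recovery mechanisms can be sketched as follows: a task-level retry policy, plus a flow-level `errors` block that runs when the execution fails. The flow ID, task IDs, URL, and retry values are assumptions chosen for illustration, not recommended settings.

```yaml
# Illustrative sketch only: IDs, URL, and retry values are made up for this example.
id: retry_sketch
namespace: entreprise

tasks:
  - id: flaky_call
    type: io.kestra.plugin.core.http.Request
    uri: "https://example.com/api/sales"  # placeholder endpoint
    retry:
      type: constant    # wait the same interval between attempts
      interval: PT30S   # ISO 8601 duration: 30 seconds
      maxAttempt: 3

# Tasks listed here run only if the execution fails.
errors:
  - id: on_failure
    type: io.kestra.plugin.core.log.Log
    level: ERROR
    message: "Execution {{ execution.id }} of flow {{ flow.id }} failed"
```

In practice the `errors` block would usually send a notification rather than only log a message.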

Kestra vs Airflow: a new standard?

While Apache Airflow is a leading workflow orchestration solution, Kestra positions itself as a modern alternative tailored for cloud-native architectures. Its simple configuration and ability to handle massively parallel workflows make it a relevant choice for organizations looking to modernize their data stack.

Kestra is an innovative and robust solution that enhances data workflow management while reducing technical debt. With its declarative approach, seamless integration with cloud tools, and scalable architecture, Kestra is a strategic choice for any company looking to optimize its data processes.

Jamel Ben Amar

CTO, Smile

For more information about this solution, feel free to contact us.
