"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

April 17, 2023

Sample Python code vs. Kubeflow pipeline

 

When a pipeline is submitted, Kubeflow Pipelines creates a Kubernetes pod for each step in the pipeline. Each pod runs the code associated with its step. That code is typically packaged as a Docker container image. Kubernetes pulls the image onto the node where the pod is scheduled and runs the container, and the step's status and outputs are then reported back to the Kubeflow Pipelines platform.

The Kubeflow Pipelines platform manages the execution of the pipeline. It monitors the status of each step and ensures that steps run in the correct order, so a step starts only after the steps it depends on have finished. It can also retry a failed step when retries are configured for it.
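To make "correct order" and "retries" concrete, here is a small conceptual sketch in plain Python. It is not how Kubeflow Pipelines is implemented internally. It only mimics the idea with the standard library's graphlib module (Python 3.9+). The function run_pipeline, its parameters, and the step names in the usage note are all made up for illustration.

```python
from graphlib import TopologicalSorter


def run_pipeline(steps, dependencies, max_retries=2):
    """Run steps in dependency order, retrying each failed step.

    steps:        dict mapping step name -> callable with no arguments
    dependencies: dict mapping step name -> set of upstream step names
    Returns a log of (step name, attempt number, status) tuples.
    """
    # Every step gets an entry, even if it has no upstream steps.
    graph = {name: dependencies.get(name, set()) for name in steps}
    order = TopologicalSorter(graph).static_order()

    log = []
    for name in order:
        for attempt in range(1, max_retries + 2):
            try:
                steps[name]()
                log.append((name, attempt, "Succeeded"))
                break
            except Exception:
                log.append((name, attempt, "Failed"))
        else:
            # No attempt succeeded, so downstream steps must not run.
            raise RuntimeError(
                f"step {name!r} failed after {max_retries + 1} attempts"
            )
    return log
```

For example, with three hypothetical steps load, train, and evaluate chained one after another, a train step that fails once shows up in the log as one "Failed" entry followed by a "Succeeded" entry before evaluate runs.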

dsl.pipeline is part of the domain-specific language (DSL) in the Kubeflow Pipelines SDK for Python. It is a decorator that marks a Python function as a pipeline definition: a set of steps, each of which runs a container image. The function's arguments become pipeline parameters, and steps can pass parameters and artifacts to one another. The DSL is designed to give an easy, consistent way to define and manage pipelines.

dsl.ContainerOp is the class in the Kubeflow Pipelines v1 SDK that represents a pipeline step running a containerized workload. It describes the container that Kubernetes will run in the step's pod, and lets users specify the container image, the command, command-line arguments, environment variables, and other settings needed to run the workload.

The pipeline code defines a function called add_two_numbers_pipeline, which takes two parameters, a and b. Inside it, the code creates a ContainerOp, which is a type of operation in Kubeflow Pipelines. This ContainerOp names the Docker image to use, which in this case is python:3.7, and the command to run, which is a Python script that calls the add_two_numbers function with the two parameters a and b. Finally, the code compiles the pipeline into a YAML file.
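A pipeline matching this description could look like the sketch below. It assumes the KFP v1 SDK (kfp<2), where dsl.ContainerOp is available. The step name, the default values for a and b, the output filename, and the helpers container_command, build_pipeline, and compile_pipeline are illustrative choices, not taken from the original code.

```python
def add_two_numbers(a, b):
    """The plain Python logic that the pipeline step wraps."""
    return a + b


# Inline script executed inside the container. It re-declares the function
# because the stock python:3.7 image does not contain our code.
STEP_SCRIPT = (
    "import sys\n"
    "def add_two_numbers(a, b):\n"
    "    return a + b\n"
    "print(add_two_numbers(float(sys.argv[1]), float(sys.argv[2])))\n"
)


def container_command():
    """Command the step's container runs; a and b are passed as arguments."""
    return ["python", "-c", STEP_SCRIPT]


def build_pipeline():
    """Return the pipeline function.

    kfp is imported lazily so the plain Python parts above can be used
    without the SDK installed.
    """
    from kfp import dsl  # assumption: KFP SDK v1 (kfp<2)

    @dsl.pipeline(
        name="add-two-numbers",
        description="Adds two numbers in a single container step.",
    )
    def add_two_numbers_pipeline(a: float = 1.0, b: float = 2.0):
        # References an existing image by name; nothing is built here.
        dsl.ContainerOp(
            name="add-two-numbers",
            image="python:3.7",
            command=container_command(),
            arguments=[a, b],
        )

    return add_two_numbers_pipeline


def compile_pipeline(package_path="add_two_numbers_pipeline.yaml"):
    """Compile the pipeline definition into a YAML file."""
    from kfp import compiler

    compiler.Compiler().compile(build_pipeline(), package_path)
```

Calling compile_pipeline() with the v1 SDK installed should write the YAML definition, which can then be uploaded to Kubeflow Pipelines. The plain add_two_numbers function stays usable on its own, which is the "sample Python code" side of the comparison.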

Note that the pipeline code does not build a Docker image. It only refers to an existing image by name (here python:3.7) and records that reference in the compiled pipeline definition. Kubeflow Pipelines does not build Docker images at run time. It uses existing Docker images to run the pipeline steps.

Keep Exploring!!!
