A very good read - link
Below are a few lines and a short summary from it, chosen because they echo my views.
Perception - I believed that Kubernetes was essential to the DS/ML workflow.
Experience - However, as I learned more about low-level infrastructure, I realized how unreasonable it is to expect data scientists to know about it.
Fact / Reality - In theory, you can learn both sets of skills. In practice, the more time you spend on one means the less time you spend on another.
My perspective - We can know only a few things in depth, and mastering them takes many experiments. You can master a few areas and keep a broad understanding of the rest. Compiling knowledge, customizing knowledge, and solving a problem in your own way are three different things.
Interesting Analogy - I became a data scientist because I wanted to spend more time with data, not with spinning up AWS instances, writing Dockerfiles, scheduling/scaling clusters, or debugging YAML configuration files.
Recommendations
- Have a separate team to manage production
- Infrastructure abstraction tools such as Kubeflow, Metaflow, and Google Vertex AI are useful for non-trivial workflows and for running multiple models in production.