Session: Simplifying Production Machine Learning with Delta Lake and MLflow
Building and deploying machine learning (ML) models is complex. The concept of MLOps (Machine Learning Operations) has evolved rapidly in the last few years and now encompasses DataOps, ModelOps and DevOps. Various aspects of the data, such as type (structured vs. unstructured), file format, consistency, quality, optimizations, schema updates and regulatory requirements, strongly influence both the performance and the governance of the model. Data engineers, data scientists and MLOps engineers spend significant time deploying models to production before they realize business value. Additionally, tracking and monitoring the life cycle of a model and its artifacts end to end can be challenging because of the variety and complexity of the process.
Adopting Delta Lake can alleviate many of these data problems and bring performance and consistency to ML models. Combined with MLflow, the popular open source ModelOps package, it makes the path from model prototyping to deployment, along with model performance and drift, easier to manage. In this talk we discuss and demo Delta Lake and MLflow capabilities that supercharge the development of robust, production-grade models.