Room: Room 228
April 5
11:00–11:25
Machine learning (ML) model serialization can reduce inference latency, memory use, and disk space requirements, and it opens up more options for model deployment. We will explore the use cases that benefit most from this technique, along with some of its drawbacks.
Basic understanding of MLOps and ML model types.
This presentation discusses ML model serialization and its role in making machine learning systems more efficient and flexible. We will explore the most common serialization formats for neural networks and boosted tree models (TorchScript, ONNX, Treelite). We will see how model serialization helps in restricted environments such as AWS Lambda and examine some drawbacks of the approach. Real-life case studies will ground these concepts in practice.