Model Pipeline - Model optimization (quantization, pruning)
This pipeline shows how a neural network model is made smaller and faster using two techniques: quantization and pruning. Quantization reduces the numerical precision of the model's weights, for example storing them as 8-bit integers instead of 32-bit floats. Pruning removes weights or connections that contribute little to the model's output. Together these reduce memory footprint and compute cost, which helps the model run efficiently on low-power devices.
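To make the two techniques concrete, here is a minimal sketch in plain Python. It assumes a simple symmetric int8 quantization scheme and magnitude-based pruning; real pipelines (e.g., in PyTorch or TensorFlow) use library routines and operate on whole tensors, but the core idea is the same.

```python
def quantize_int8(weights):
    """Symmetric quantization: map floats to int8 [-127, 127] via one scale factor."""
    scale = max(abs(w) for w in weights) / 127
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in quantized]

def prune_by_magnitude(weights, fraction=0.5):
    """Zero out the given fraction of weights with the smallest magnitudes."""
    k = int(len(weights) * fraction)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

# Hypothetical weights from one layer of a model.
weights = [0.9, -0.05, 0.4, -0.7, 0.01, 0.3]

quantized, scale = quantize_int8(weights)   # int8 values plus a float scale
restored = dequantize(quantized, scale)     # close to the originals, small rounding error
pruned = prune_by_magnitude(weights, 0.5)   # half the weights set to exactly zero
```

After quantization, each weight is stored in one byte instead of four; after pruning, the zeroed weights can be skipped or stored sparsely. The rounding introduces a small approximation error (at most half the scale per weight), which is why compressed models are usually re-evaluated or fine-tuned afterward.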