Article
Tags: tensorflow-lite, model-conversion, on-device-ml, edge-ai, python, inference, mobile-ml, keras
Deploy Your First Model with TensorFlow Lite
Convert a standard TensorFlow model to the lightweight .tflite format for on-device deployment. Learn to use the TFLiteConverter and run inference with the optimized model, preparing it for mobile or embedded systems.
beginner · 15 min · 4 steps
The play
- Load a Pre-Trained Keras Model: Start with a standard `tf.keras` model. We'll use MobileNetV2, a common choice for mobile vision applications, as the base model to convert, so there is no need to train a model from scratch.
- Convert the Model to TensorFlow Lite Format: Use `tf.lite.TFLiteConverter` to transform the Keras model into the `.tflite` format. This is the core step in preparing a model for on-device inference with TensorFlow Lite.
- Save the .tflite Model File: Write the converted model to a binary file. This `.tflite` file is the final, portable artifact you'll deploy to your edge device, such as a mobile app or a Raspberry Pi.
- Run Inference with the TFLite Interpreter: Verify the conversion by loading the `.tflite` file and running inference. This simulates how an application on a device would use the model. We'll use the Python `tf.lite.Interpreter` API for a quick check.
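The verification step can go a little further than confirming that inference runs: the TFLite output can be compared against the original Keras output for the same input. The following is a minimal NumPy-only sketch of such a comparison. The helper name `compare_outputs`, the `1e-4` tolerance, and the small example arrays are assumptions made for illustration; they are not part of the tutorial's code or of the TensorFlow Lite API.

```python
import numpy as np


def compare_outputs(reference: np.ndarray, candidate: np.ndarray, atol: float = 1e-4) -> dict:
    """Summarize how closely two model outputs of the same shape agree."""
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    # Largest element-wise difference between the two outputs.
    max_abs_diff = float(np.max(np.abs(reference - candidate)))
    # Whether both outputs pick the same class for every row in the batch.
    same_top1 = bool(np.array_equal(reference.argmax(axis=-1), candidate.argmax(axis=-1)))
    return {
        "max_abs_diff": max_abs_diff,
        "same_top1": same_top1,
        "within_tolerance": max_abs_diff <= atol,  # atol is an assumed threshold
    }


# Hypothetical stand-ins for the Keras output and the TFLite interpreter output.
keras_out = np.array([[0.10, 0.70, 0.20]], dtype=np.float32)
tflite_out = np.array([[0.10002, 0.69997, 0.20001]], dtype=np.float32)

report = compare_outputs(keras_out, tflite_out)
print(report)
```

With the names used in the starter code, the reference would be `keras_model.predict(input_data)` and the candidate would be `output_data`. A suitable tolerance depends on the model and on any optimizations applied during conversion.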
Starter code
import tensorflow as tf
import numpy as np
# 1. Load a pre-trained Keras model
print("Loading Keras MobileNetV2 model...")
keras_model = tf.keras.applications.MobileNetV2(input_shape=(224, 224, 3), include_top=True)
# 2. Convert the model using TensorFlow Lite Converter
print("Converting model to TensorFlow Lite format...")
converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
tflite_model = converter.convert()
# 3. Save the converted model to a .tflite file
model_path = 'mobilenet_v2.tflite'
with open(model_path, 'wb') as f:
    f.write(tflite_model)
print(f"Model saved to {model_path}")
# 4. Verify the model by running inference
print("\nVerifying the TFLite model...")
# Load the TFLite model
interpreter = tf.lite.Interpreter(model_path=model_path)
interpreter.allocate_tensors()
# Get input and output tensor details
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print("Input details:", input_details[0]['shape'], input_details[0]['dtype'])
print("Output details:", output_details[0]['shape'], output_details[0]['dtype'])
# Create a random dummy input that matches the model's expected input shape
input_shape = input_details[0]['shape']
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
# Set the input tensor and run inference
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
# Retrieve the output tensor
output_data = interpreter.get_tensor(output_details[0]['index'])
print("\nInference executed successfully!")
print("Output shape:", output_data.shape)
print("Sample output (first 5 values):", output_data[0][:5])
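The random input above is enough to confirm that the model runs, but it says nothing about prediction quality. For real images, the Keras MobileNetV2 weights expect pixel values scaled to the range [-1, 1]. The following is a minimal NumPy sketch of that scaling, assuming a frame that has already been decoded and resized to 224x224 RGB. The helper name `preprocess_for_mobilenet_v2` and the random stand-in frame are illustrative, not part of the tutorial's code.

```python
import numpy as np


def preprocess_for_mobilenet_v2(image_uint8: np.ndarray) -> np.ndarray:
    """Scale a 224x224x3 uint8 image to float32 in [-1, 1] and add a batch axis."""
    if image_uint8.shape != (224, 224, 3):
        raise ValueError(f"expected shape (224, 224, 3), got {image_uint8.shape}")
    # Map [0, 255] to [-1, 1].
    scaled = image_uint8.astype(np.float32) / 127.5 - 1.0
    # The model takes a batch, so the result has shape (1, 224, 224, 3).
    return scaled[np.newaxis, ...]


# Stand-in for a decoded, already-resized camera frame (hypothetical data).
frame = np.random.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)
batch = preprocess_for_mobilenet_v2(frame)
print(batch.shape, batch.dtype)
```

The resulting `batch` has the shape and dtype reported by `input_details` for this float model, so it could be passed to `interpreter.set_tensor` in place of `input_data`.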