Deploy Your First Model with TensorFlow Lite

Convert a standard TensorFlow model to the lightweight `.tflite` format for on-device deployment. You'll use `tf.lite.TFLiteConverter` to convert the model, then run inference on the converted file to confirm it is ready for mobile or embedded systems.

beginner · 15 min · 4 steps

The play

Load a Pre-Trained Keras Model
Start with a standard `tf.keras` model. We'll use MobileNetV2, a common choice for mobile vision applications, as the model to convert. Loading it with pre-trained ImageNet weights (downloaded on first use) avoids training a model from scratch.
Convert the Model to TensorFlow Lite Format
Use `tf.lite.TFLiteConverter` to transform the Keras model into the `.tflite` format. Calling `convert()` returns the serialized model as a `bytes` object. This is the core step in preparing a model for on-device inference with TensorFlow Lite.
Save the .tflite Model File
Save the converted model to a binary file. This `.tflite` file is the final, portable artifact you'll deploy to your edge device, such as a mobile app or a Raspberry Pi.
Run Inference with the TFLite Interpreter
Verify the conversion by loading the `.tflite` file and running inference. This simulates how an application on a device would use the model. We'll use the Python `tf.lite.Interpreter` API for a quick check.
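Because step 3 writes a binary file, one cheap sanity check before loading it into an interpreter is to look at the file header. The sketch below assumes the file is a standard (not size-prefixed) FlatBuffer that carries the TensorFlow Lite file identifier `TFL3` at byte offset 4. The helper names `has_tflite_identifier` and `looks_like_tflite` are hypothetical and not part of TensorFlow.

```python
# File identifier declared in the TensorFlow Lite FlatBuffer schema.
TFLITE_FILE_IDENTIFIER = b"TFL3"


def has_tflite_identifier(header):
    """Return True if `header` (the first bytes of a file) carries the identifier.

    A FlatBuffer starts with a 4-byte root offset, followed by an optional
    4-byte file identifier, so the identifier occupies bytes 4..7.
    """
    return len(header) >= 8 and header[4:8] == TFLITE_FILE_IDENTIFIER


def looks_like_tflite(path):
    """Hypothetical helper: read only the first 8 bytes of `path` and check them."""
    with open(path, "rb") as f:
        return has_tflite_identifier(f.read(8))
```

This only confirms that the file looks like a TFLite FlatBuffer. It does not validate the model itself, so the interpreter check in step 4 is still needed.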

Starter code

import tensorflow as tf
import numpy as np

# 1. Load a pre-trained Keras model (ImageNet weights are downloaded on first use)
print("Loading Keras MobileNetV2 model...")
keras_model = tf.keras.applications.MobileNetV2(input_shape=(224, 224, 3), include_top=True)

# 2. Convert the model using TensorFlow Lite Converter
print("Converting model to TensorFlow Lite format...")
converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
tflite_model = converter.convert()

# 3. Save the converted model to a .tflite file
model_path = 'mobilenet_v2.tflite'
with open(model_path, 'wb') as f:
    f.write(tflite_model)
print(f"Model saved to {model_path}")

# 4. Verify the model by running inference
print("\nVerifying the TFLite model...")
# Load the TFLite model
interpreter = tf.lite.Interpreter(model_path=model_path)
interpreter.allocate_tensors()

# Get input and output tensor details
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

print("Input details:", input_details[0]['shape'], input_details[0]['dtype'])
print("Output details:", output_details[0]['shape'], output_details[0]['dtype'])

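# Create a random dummy input that matches the model's expected shape and dtype.
# MobileNetV2 expects pixel values scaled to [-1, 1]; random values in that range
# only confirm that the model runs, not that its predictions are meaningful.
input_shape = tuple(input_details[0]['shape'])
input_data = np.random.uniform(-1.0, 1.0, size=input_shape).astype(input_details[0]['dtype'])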

# Set the input tensor and run inference
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()

# Retrieve the output tensor
output_data = interpreter.get_tensor(output_details[0]['index'])

print("\nInference executed successfully!")
print("Output shape:", output_data.shape)
print("Sample output (first 5 values):", output_data[0][:5])
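The random input in the starter code only confirms that the interpreter runs. A real image would need the scaling MobileNetV2 was trained with (pixel values mapped from [0, 255] to [-1, 1]), and the output vector has to be decoded into class indices. Below is a minimal NumPy sketch of both ends. The helper names `to_model_input` and `top_k_indices` are hypothetical, the image is assumed to be already resized to 224x224, and a synthetic image and score vector stand in for real data.

```python
import numpy as np


def to_model_input(image_uint8):
    """Scale an HxWx3 uint8 image to [-1, 1] float32 and add a batch axis."""
    scaled = image_uint8.astype(np.float32) / 127.5 - 1.0
    return np.expand_dims(scaled, axis=0)


def top_k_indices(scores, k=5):
    """Return the indices of the k largest scores, highest first."""
    order = np.argsort(scores)[::-1]
    return order[:k]


# Synthetic stand-in for a decoded, already-resized 224x224 RGB image.
image = np.zeros((224, 224, 3), dtype=np.uint8)
image[..., 0] = 255  # saturate the red channel so both extremes appear
batch = to_model_input(image)
print("Batch shape:", batch.shape, "range:", batch.min(), "to", batch.max())

# Synthetic stand-in for one row of model output.
scores = np.array([0.1, 0.7, 0.05, 0.15], dtype=np.float32)
print("Top-2 indices:", top_k_indices(scores, k=2))
```

In the starter code, `to_model_input(image)` would take the place of the random `input_data`, and `top_k_indices(output_data[0])` would list the highest-scoring class indices.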