Article
object-detection, yolo, yolov8, computer-vision, python, opencv, real-time, video-processing
Run Real-Time Object Detection with YOLOv8 and Python
Bootstrap a real-time object detection script using your webcam. This guide uses the Ultralytics YOLOv8 model to identify objects in video frames, draw bounding boxes, and print structured detection data to the console.
beginner · 15 min · 5 steps
The play
- Install Dependencies: Set up your Python environment by installing the core `ultralytics` library, which includes YOLOv8, and `opencv-python` for handling video streams from your webcam or video files.
- Load a Pre-trained Model: Instantiate a YOLO model. The `ultralytics` library automatically downloads the specified pre-trained weights the first time you use them. We'll use `yolov8n.pt`, the smallest (nano) YOLOv8 detection model, which is fast enough for real-time applications.
- Run Detection on a Single Image: Perform a test inference on a single image to verify your setup. The `predict` method runs the detection and returns a list of results, one per input image. We'll save the annotated image to a file to inspect the output.
- Process Detection Results: The main value of this setup is access to structured detection data. Iterate through the `boxes` attribute of each result to get the coordinates (`xyxy`), confidence score, and class ID of every detected object.
- Launch Real-Time Webcam Feed: Use OpenCV to capture frames from your webcam, pass each frame to the YOLO model for inference, and display the annotated video stream in real time. The starter code below provides a complete implementation of this.
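The first three steps can be sketched as a short check to run before moving on to the webcam loop. This is a minimal sketch, assuming the packages were installed with `pip install ultralytics opencv-python`. The helper names `annotated_path` and `detect_image` and the input file `test.jpg` are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def annotated_path(image_path):
    """Return a sibling path with '_annotated' added before the extension."""
    p = Path(image_path)
    return p.with_name(f"{p.stem}_annotated{p.suffix}")


def detect_image(image_path, weights="yolov8n.pt"):
    """Run one inference on an image file and save the annotated copy."""
    # Imported lazily so the path helper above works without the heavy deps.
    import cv2
    from ultralytics import YOLO

    model = YOLO(weights)                # weights are downloaded on first use
    results = model.predict(image_path)  # list with one result per input image
    annotated = results[0].plot()        # annotated image as a BGR NumPy array
    out_path = annotated_path(image_path)
    if not cv2.imwrite(str(out_path), annotated):
        raise IOError(f"Could not write {out_path}")
    return out_path


# Example call (placeholder file name; point it at any image on disk):
# detect_image("test.jpg")
```

If the call succeeds and the saved file shows labeled boxes, the installation and model download are working and the webcam step should only add the capture loop on top.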
Starter code
import json

import cv2
from ultralytics import YOLO

# Load the YOLOv8 model (weights are downloaded on first use)
model = YOLO('yolov8n.pt')

# Open the default webcam
cap = cv2.VideoCapture(0)

# Check that the webcam opened correctly
if not cap.isOpened():
    raise IOError("Cannot open webcam")

print("Webcam opened. Press 'q' to quit.")

while True:
    # Read a frame from the webcam
    ret, frame = cap.read()
    if not ret:
        break

    # Run YOLOv8 inference on the frame
    results = model(frame)

    # --- Process Detections for JSON Output ---
    detections = []
    for r in results:
        for box in r.boxes:
            class_id = int(box.cls.item())
            detection = {
                'class_id': class_id,
                'class_name': model.names[class_id],
                'confidence': float(box.conf.item()),
                'box': box.xyxy[0].tolist()  # [x1, y1, x2, y2] in pixels
            }
            detections.append(detection)

    # Print detections as a JSON string for the current frame
    if detections:
        print(json.dumps(detections, indent=2))

    # Visualize the results on the frame
    annotated_frame = results[0].plot()

    # Display the annotated frame
    cv2.imshow('YOLOv8 Real-Time Detection', annotated_frame)

    # Break the loop if 'q' is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the webcam and close the display window
cap.release()
cv2.destroyAllWindows()
print("Webcam feed closed.")
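Each dictionary printed by the starter code stores its box as `[x1, y1, x2, y2]` corner coordinates. A downstream consumer often wants only the confident detections, or a center-and-size form of the box. The sketch below shows one way to do both in plain Python. The function names `filter_by_confidence` and `xyxy_to_cxcywh`, the 0.5 threshold, and the sample values are illustrative assumptions, not part of the Ultralytics API or real model output.

```python
def filter_by_confidence(detections, min_conf=0.5):
    """Keep only detections whose confidence is at least min_conf."""
    return [d for d in detections if d["confidence"] >= min_conf]


def xyxy_to_cxcywh(box):
    """Convert [x1, y1, x2, y2] corners to [center_x, center_y, width, height]."""
    x1, y1, x2, y2 = box
    return [(x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1]


# Hand-written sample in the same shape as the starter code's output.
sample = [
    {"class_id": 0, "class_name": "person", "confidence": 0.91,
     "box": [100.0, 50.0, 300.0, 450.0]},
    {"class_id": 56, "class_name": "chair", "confidence": 0.32,
     "box": [10.0, 20.0, 50.0, 80.0]},
]

confident = filter_by_confidence(sample)
for det in confident:
    # prints: person [200.0, 250.0, 200.0, 400.0]
    print(det["class_name"], xyxy_to_cxcywh(det["box"]))
```

In the webcam loop, the same filter could be applied to `detections` just before the `json.dumps` call, so that low-confidence boxes never reach the console output.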