Exploring Ultralytics YOLO Models

Introduction

The evolution of computer vision has been marked by continuous advancements, and one of the latest breakthroughs is the family of Ultralytics YOLO models. These models have proven to be powerful tools for real-time object detection, classification, and segmentation. In this blog, we'll walk you through how to build a real-time object-tracking application using Streamlit, leveraging the power of YOLO, SAM, and other models from Ultralytics.

Overview

The goal of this project is to create a user-friendly interface where users can choose among tasks such as object detection, segmentation, and classification. The app can process input from a webcam, an image, or a video file, and it uses state-of-the-art models from Ultralytics.

Key Aspects   

  • Model Selection Based on Task - Choose models dynamically based on the task (detection, segmentation, etc.).    
  • Streamlit Integration - Use Streamlit to build an interactive user interface, making it easy to upload images/videos, configure models, and display results.
  • Real-Time Video/Image Processing - Process video and images in real-time, applying the chosen model for the selected task.    
  • Efficient Resource Management - Manage GPU resources and video capture devices effectively to ensure smooth performance.    

Code Breakdown

Model Selection with YOLO  

The application supports several tasks, including object detection, segmentation, classification, pose estimation, and oriented bounding box (OBB) detection. Based on the selected task, the app retrieves the appropriate model from the list of available Ultralytics models.

from ultralytics.utils.downloads import GITHUB_ASSETS_STEMS

def GetModels(task):
    if task == "Detect":
        available_models = [
            x.replace("yolo", "YOLO") for x in GITHUB_ASSETS_STEMS
            if not (x.endswith("-seg") or x.endswith("-cls") or x.endswith("-pose") or x.endswith("-obb"))
        ]
    elif task == "Segment":
        available_models = [
            x.replace("yolo", "YOLO") for x in GITHUB_ASSETS_STEMS
            if x.startswith("sam_") or x.endswith("-seg")  # stems carry no ".pt" extension
        ]
    ...

In this section of the code, the GetModels() function dynamically selects models based on the task. If the user selects "Detect", the function filters out the segmentation, classification, pose, and OBB variants and presents only detection models. Other tasks, such as segmentation or classification, filter for their appropriate models in the same way.
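As a rough sketch of how the filtered list might be consumed (the widget label, the spinner message, and the selected_model variable are illustrative rather than taken from the original code), the result can feed a sidebar dropdown, and the chosen stem can then be passed to the model loader:

from ultralytics import YOLO

selected_model = st.sidebar.selectbox("Model", GetModels("Detect"))
with st.spinner("Model is downloading..."):
    model = YOLO(f"{selected_model.lower()}.pt")  # SAM/FastSAM weights would be loaded with their own classes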

Interactive Streamlit Interface  

Streamlit offers an easy way to build web apps in Python, making it a natural choice for creating the interface for our object detection tool.

def Initialize(model=None):
    from ultralytics.utils.checks import check_requirements

    check_requirements("streamlit>=1.29.0")
    import streamlit as st
    from ultralytics import YOLO, SAM, FastSAM

    # Configure the Streamlit app
    st.set_page_config(page_title="Object Tracking", layout="wide")
    st.markdown("""<style>MainMenu {visibility: hidden;}</style>""", unsafe_allow_html=True)
    st.markdown("<h1 style='text-align:center;'>Object Tracking Application</h1>", unsafe_allow_html=True)

This function is the core of the application and starts by checking the necessary requirements. It uses Streamlit to create a minimalistic yet functional interface. The main title and layout are configured, and the sidebar allows users to select the input source (image, video, or webcam) and the task they want to perform (detect, segment, classify, etc.).
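The sidebar controls mentioned here don't appear in the snippet above, so the following is a minimal sketch of what they could look like (the widget labels and the source, task, vid_file, and file_name names are assumptions for illustration):

# Sidebar controls for input source and task (illustrative names)
st.sidebar.title("Configuration")
source = st.sidebar.selectbox("Source", ("webcam", "video", "image"))
task = st.sidebar.selectbox("Task", ("Detect", "Segment", "Classify", "Pose", "OBB"))

file_name = 0  # default: first webcam device
if source == "video":
    vid_file = st.sidebar.file_uploader("Upload Video File", type=["mp4", "mov", "avi", "mkv"])
    if vid_file is not None:
        # Persist the upload to disk so cv2.VideoCapture can open it later
        with open("uploaded_video.mp4", "wb") as out:
            out.write(vid_file.read())
        file_name = "uploaded_video.mp4"

This also shows one way the file_name variable used later for video capture could be populated.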

Real-Time Processing   

Once the user selects a task and model, the application processes the input (either an image or a video). This section of the code handles video capture and runs the selected model in real-time, applying the chosen task.

if model and st.sidebar.button("Start"):
    if source == "webcam" or source == "video":
        # For a webcam, file_name is typically the device index (e.g. 0); for a video it is the file path
        videocapture = cv2.VideoCapture(file_name)
        while videocapture.isOpened():
            success, frame = videocapture.read()
            if not success:
                break
            # Run inference with the user-selected thresholds and class filter
            results = model(frame, conf=conf, iou=iou, classes=selected_ind)
            annotated_frame = results[0].plot()
            # Display original and annotated frames
            org_frame.image(frame, channels="BGR")
            ann_frame.image(annotated_frame, channels="BGR")

Here, we capture video frames using OpenCV (cv2.VideoCapture), process them with the selected YOLO model, and display the original and annotated frames side-by-side in real-time.
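The conf, iou, selected_ind, org_frame, and ann_frame names in this loop come from setup code that isn't shown; a plausible sketch of that setup is below (the slider defaults, the class multiselect, and the column layout are assumptions):

# Detection thresholds and class filter (defaults are illustrative)
conf = float(st.sidebar.slider("Confidence Threshold", 0.0, 1.0, 0.25, 0.01))
iou = float(st.sidebar.slider("IoU Threshold", 0.0, 1.0, 0.45, 0.01))
class_names = list(model.names.values())  # model.names maps class index -> class name
selected_classes = st.sidebar.multiselect("Classes", class_names, default=class_names[:3])
selected_ind = [class_names.index(c) for c in selected_classes]

# Two side-by-side placeholders, updated on every frame
col1, col2 = st.columns(2)
org_frame = col1.empty()
ann_frame = col2.empty()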

Efficient Resource Management  

Since we're working with potentially large models and GPU resources, it's essential to release resources after each operation.

import cv2
import torch

def ClearAllResources(videocapture, st):
    if videocapture:
        videocapture.release()   # Release the webcam/video file handle
    torch.cuda.empty_cache()     # Clear GPU memory
    cv2.destroyAllWindows()      # Close any open OpenCV windows

The ClearAllResources() function ensures that video capture devices are released and GPU memory is cleared when they are no longer needed. This keeps the app from crashing due to excessive memory usage, especially when handling high-definition videos.
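How the cleanup gets triggered isn't shown above; one plausible pattern (the "Stop" button and its placement are assumptions, not part of the original code) is to create a stop control before the frame loop and release everything when it is pressed or when the video ends:

stop_button = st.button("Stop")  # create the control once, before the frame loop

while videocapture.isOpened():
    # ... per-frame inference and display, as shown earlier ...
    if stop_button:
        ClearAllResources(videocapture, st)
        st.stop()  # halt the Streamlit script immediately

# When the video ends naturally, release resources as well
ClearAllResources(videocapture, st)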

Key Features

  • Model Flexibility - The app supports multiple tasks (detection, segmentation, etc.) and automatically loads the appropriate models based on user selection.
  • Real-Time Performance - With GPU acceleration and efficient model inference, the app can process live video with minimal latency on capable hardware.
  • User-Friendly Interface - The intuitive Streamlit interface allows users to quickly select their input source, task, and model, making it accessible even for non-experts.
  • Customizable Thresholds - The app provides controls for confidence and IoU thresholds, allowing users to tweak detection sensitivity to fit their needs.

Conclusion

This blog provides a detailed look into how to build a Streamlit application for object detection, classification, and segmentation using Ultralytics YOLO models. The combination of Ultralytics' advanced models and Streamlit's easy-to-use interface makes this a powerful tool for anyone interested in computer vision. The code can be further expanded by integrating additional Ultralytics models or implementing advanced features like multi-class tracking and interactive feedback for model training. 

Arun Gopalakrishnan
Senior Module Lead
