Object Detection

Introduction

  • Definition: Object detection is a computer vision technique that allows us to identify and locate objects in an image or video.
  • Applications: Crowd counting, Self-driving cars, Video surveillance, Face detection, Anomaly detection
  • Scope: Detect objects in images and videos, 2-dimensional bounding boxes, Real-time
  • Tools: Detectron2, TF Object Detection API, OpenCV, TFHub, TorchVision
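To make the "2-dimensional bounding boxes" in the scope concrete, here is a minimal sketch of how a detection might be represented and how two boxes can be compared with intersection over union (IoU), the usual overlap measure in detection work. The `Box` type and `iou` function are illustrative names, not part of any tool listed above, and the corner-coordinate layout is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    label: str = ""
    score: float = 1.0

    def area(self) -> float:
        # Degenerate boxes are treated as having zero area.
        return max(0.0, self.x_max - self.x_min) * max(0.0, self.y_max - self.y_min)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes, in the range [0, 1]."""
    inter_w = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    inter_h = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # the boxes do not overlap
    inter = inter_w * inter_h
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0
```

A predicted box is typically counted as correct when its IoU with a labeled box exceeds a chosen threshold such as 0.5.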

Models

Faster R-CNN

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv, 2016.

SSD (Single Shot Detector)

SSD: Single Shot MultiBox Detector. ECCV, 2016.

YOLO (You Only Look Once)

YOLOv3: An Incremental Improvement. arXiv, 2018.

EfficientDet

EfficientDet: Scalable and Efficient Object Detection. CVPR, 2020.

The largest variant, EfficientDet-D7, achieved 55.1 AP on COCO test-dev with 77M parameters.

Process flow

Step 1: Collect Images

Capture images with a camera, scrape them from the internet, or use public datasets

Step 2: Create Labels

This step is required only if the object category is not covered by any pre-trained model and labels are not freely available on the web. Create the labels (bounding boxes) with an open-source tool such as Labelme or with a professional annotation tool.
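Labels usually need converting into whatever format the training framework expects. The sketch below assumes a Labelme JSON export with rectangle shapes and converts it to YOLO-style text lines (class index, then normalized center x, center y, width, height). The function name and the `class_names` list are hypothetical, and the target format is an assumption chosen for illustration.

```python
import json


def labelme_to_yolo(labelme_json, class_names):
    """Convert one Labelme annotation (JSON string) to YOLO-format lines."""
    data = json.loads(labelme_json)
    img_w, img_h = data["imageWidth"], data["imageHeight"]
    lines = []
    for shape in data["shapes"]:
        # Only rectangles map directly onto bounding boxes.
        if shape.get("shape_type") != "rectangle":
            continue
        (x1, y1), (x2, y2) = shape["points"]
        # The two corners may have been drawn in any order.
        x_min, x_max = sorted((x1, x2))
        y_min, y_max = sorted((y1, y2))
        class_id = class_names.index(shape["label"])
        cx = (x_min + x_max) / 2 / img_w
        cy = (y_min + y_max) / 2 / img_h
        w = (x_max - x_min) / img_w
        h = (y_max - y_min) / img_h
        lines.append(f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
    return lines
```

Shapes with a label missing from `class_names` raise a `ValueError`, which is a deliberate choice here so that labeling typos surface early.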

Step 3: Data Acquisition

Set up the database connection and fetch the data into the Python environment
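As a rough illustration of this step, the sketch below uses SQLite from the standard library in place of whatever database a project actually uses. The `annotations` table and its columns are made up for the example.

```python
import sqlite3
from collections import defaultdict


def fetch_annotations(conn):
    """Group bounding-box rows by image path."""
    rows = conn.execute(
        "SELECT image_path, label, x_min, y_min, x_max, y_max "
        "FROM annotations ORDER BY image_path"
    )
    by_image = defaultdict(list)
    for image_path, label, x_min, y_min, x_max, y_max in rows:
        by_image[image_path].append((label, x_min, y_min, x_max, y_max))
    return dict(by_image)


# Demo with an in-memory database and a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE annotations ("
    "image_path TEXT, label TEXT, x_min REAL, y_min REAL, x_max REAL, y_max REAL)"
)
conn.executemany(
    "INSERT INTO annotations VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("a.jpg", "cup", 10, 20, 110, 220),
        ("a.jpg", "bottle", 50, 40, 90, 200),
        ("b.jpg", "cup", 5, 5, 60, 80),
    ],
)
dataset = fetch_annotations(conn)
```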

Step 4: Data Exploration

Explore the data, validate it, and create a preprocessing strategy
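A first pass at exploration often counts boxes per class and flags obviously invalid annotations. The sketch below assumes annotations are stored as a dictionary from image name to `(label, x_min, y_min, x_max, y_max)` tuples, with image sizes in a second dictionary. Both structures and the function name are hypothetical.

```python
from collections import Counter


def summarize(annotations, image_sizes):
    """Return per-class box counts and a list of suspicious annotations."""
    class_counts = Counter()
    problems = []
    for image, boxes in annotations.items():
        width, height = image_sizes[image]
        for label, x_min, y_min, x_max, y_max in boxes:
            class_counts[label] += 1
            if x_max <= x_min or y_max <= y_min:
                problems.append((image, label, "zero or negative size"))
            elif x_min < 0 or y_min < 0 or x_max > width or y_max > height:
                problems.append((image, label, "outside image bounds"))
    return class_counts, problems
```

A heavily skewed class count or a long problem list is a useful input when deciding on the preprocessing strategy.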

Step 5: Data Preparation

Clean the data and make it ready for modeling
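One common cleaning operation is to clip boxes to the image and drop those that end up too small to be useful. The sketch below shows this under the assumption that boxes are `(label, x_min, y_min, x_max, y_max)` tuples in pixel coordinates. The function name and the `min_size` default are illustrative choices, not recommendations from the source.

```python
def clean_boxes(boxes, width, height, min_size=2.0):
    """Clip boxes to the image and discard boxes narrower or shorter than min_size."""
    cleaned = []
    for label, x_min, y_min, x_max, y_max in boxes:
        # Clip each coordinate to the valid image range.
        x_min, x_max = max(0.0, x_min), min(float(width), x_max)
        y_min, y_max = max(0.0, y_min), min(float(height), y_max)
        if x_max - x_min >= min_size and y_max - y_min >= min_size:
            cleaned.append((label, x_min, y_min, x_max, y_max))
    return cleaned
```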

Step 6: Model Building

Create the model architecture in Python and perform a sanity check

Step 7: Model Training

Start the training process and track the progress and experiments
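Experiment tracking can be as simple as appending one record per epoch to a log file. The sketch below is a minimal stand-in for a dedicated tracking tool: the JSON Lines layout, the function names, and the `val_loss` metric are all assumptions made for illustration.

```python
import json
import time
from pathlib import Path


def log_metrics(log_path, run_name, epoch, **metrics):
    """Append one JSON record with the metrics of a single epoch."""
    record = {"run": run_name, "epoch": epoch, "time": time.time(), **metrics}
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def best_epoch(log_path, run_name, metric="val_loss"):
    """Return the logged record with the lowest value of `metric` for a run."""
    text = Path(log_path).read_text(encoding="utf-8")
    records = [json.loads(line) for line in text.splitlines() if line]
    run_records = [r for r in records if r["run"] == run_name and metric in r]
    return min(run_records, key=lambda r: r[metric])
```

Calling `log_metrics("runs.jsonl", "baseline", epoch, val_loss=loss)` at the end of each epoch would be enough to compare runs later with `best_epoch`.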

Step 8: Model Validation

Validate the final set of models and select/assemble the final model
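Candidate models are usually compared on detection metrics. The sketch below computes precision and recall for one image and one class at a fixed IoU threshold, using greedy matching in order of descending score. This is a simplification: benchmark metrics such as COCO AP additionally average over IoU thresholds, classes, and images. Function names and the tuple layout are assumptions.

```python
def box_iou(a, b):
    """IoU of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def precision_recall(predictions, ground_truth, iou_threshold=0.5):
    """predictions: [(score, box)], ground_truth: [box]."""
    matched = set()
    true_positives = 0
    # Higher-scoring predictions get the first chance to claim a ground-truth box.
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = box_iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```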

Step 9: User Acceptance Testing (UAT)

Wrap the model inference engine in an API for client testing
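The sketch below shows one way the inference engine could be exposed over HTTP using only the standard library. The `detect` function is a placeholder that returns a fixed, made-up result; a real service would call the trained model there. The handler names, port, and response layout are assumptions, and a production service would more likely use a web framework.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def detect(image_bytes):
    """Placeholder for the real model call; returns a fixed, made-up detection."""
    return [{"label": "object", "score": 0.0, "box": [0, 0, 0, 0]}]


def handle_payload(image_bytes):
    """Validate the request body and build the (status, JSON payload) pair."""
    if not image_bytes:
        return 400, {"error": "empty request body"}
    return 200, {"detections": detect(image_bytes)}


class DetectionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The raw image is expected as the request body.
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_payload(self.rfile.read(length))
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the server; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), DetectionHandler).serve_forever()
```

Keeping the request logic in `handle_payload` separates it from the HTTP plumbing, so it can be tested without starting a server. Calling `serve()` starts the endpoint locally.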

Step 10: Deployment

Deploy the model to the cloud or to an edge device, as required

Step 11: Documentation

Prepare the documentation and transfer all assets to the client

Use Cases

Automatic License Plate Recognition

Recognizes vehicle license plate numbers using several methods, including the YOLOv4 object detector and Tesseract OCR. Check out the Notion page here.

Object Detection App

Available as a Streamlit app that detects common objects. Three models are available for this task: Caffe MobileNet-SSD, Darknet YOLOv3-tiny, and Darknet YOLOv3. In addition to common objects, the app also detects human faces and fire. Check out the Notion page here.

Logo Detector

A REST API that detects logos in images. The API receives two zip files: (1) a set of images in which to find the logo and (2) an image of the logo. The model was deployed on AWS Elastic Beanstalk. Check out the Notion page here.
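Handling the uploaded zip files is the first thing such an API has to do. The sketch below reads image files from a zip archive held in memory, without extracting to disk. The function name and the accepted extensions are assumptions, not details of the deployed service.

```python
import io
import zipfile

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def read_images_from_zip(zip_bytes):
    """Return {file name: raw image bytes} for every image in the archive."""
    images = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Skip anything that does not look like an image.
            if info.filename.lower().endswith(IMAGE_EXTENSIONS):
                images[info.filename] = archive.read(info)
    return images
```

The same function could be applied to both uploads: once for the images to search and once for the archive containing the logo.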

TF Object Detection API Experiments

The TensorFlow Object Detection API is an open-source framework built on top of TensorFlow that makes it easy to construct, train, and deploy object detection models. We ran inference on pre-trained models, few-shot training on a single class, few-shot training on multiple classes, and conversion to a TFLite model. Check out the Notion page here.

Pre-trained Inference Experiments

Inference on six pre-trained models: Inception-ResNet (TFHub), SSD-MobileNet (TFHub), PyTorch YOLOv3, PyTorch SSD, PyTorch Mask R-CNN, and EfficientDet. Check out the Notion pages here and here.

Object Detection App

A Gradio app built on the TorchVision Mask R-CNN model. Check out the Notion page here.

Real-time Object Detector in OpenCV

A model that detects common objects such as scissors, cups, and bottles using the MobileNet SSD model in the OpenCV toolkit. It takes input from the camera and detects objects in real time. Check out the Notion page here. Also available as a Streamlit app (the app is not real-time).
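When an SSD model is run through OpenCV's `dnn` module, the raw output is commonly a `(1, 1, N, 7)` array in which each row holds an image id, class id, confidence, and four box corners normalized to the range 0 to 1. The sketch below post-processes such rows in plain Python. The row layout is an assumption about the model used here, and the function name and `class_names` argument are hypothetical.

```python
def parse_ssd_detections(rows, frame_width, frame_height, class_names,
                         min_confidence=0.5):
    """Turn raw SSD rows into (label, confidence, pixel box) tuples."""
    results = []
    for _, class_id, confidence, x_min, y_min, x_max, y_max in rows:
        if confidence < min_confidence:
            continue  # drop weak detections
        class_id = int(class_id)
        # Fall back to the numeric id when the class list is too short.
        label = class_names[class_id] if 0 <= class_id < len(class_names) else str(class_id)
        # Scale normalized corners back to pixel coordinates.
        box = (
            int(x_min * frame_width),
            int(y_min * frame_height),
            int(x_max * frame_width),
            int(y_max * frame_height),
        )
        results.append((label, float(confidence), box))
    return results
```

With OpenCV, the rows would typically come from something like `net.forward()[0, 0]` after feeding a blob of the current camera frame to the network, and the function would be called once per frame.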

EfficientDet Fine-tuning

Fine-tune an EfficientDet model on new classes. Check out the Notion page here.

YOLOv4 Fine-tuning

Fine-tune a YOLOv4 model on new classes. Check out the Notion page here.

Detectron2 Fine-tuning

Fine-tune a Detectron2 Mask R-CNN (with PointRend) model on new classes. Check out the Notion page here.