The You Only Look Once (YOLO) architecture was developed to perform detection and classification in a single step. The image is divided into a fixed grid of uniform cells, and bounding boxes are predicted and classified within each cell. This architecture enables faster object detection and has been applied to streaming video.
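To make the grid idea concrete, here is a minimal sketch of how a pixel position maps to a grid cell. The function name `cell_for_point` and the 13 x 13 grid size are assumptions for illustration (13 x 13 is commonly used by Tiny YOLO with a 416 x 416 input); the grid size used by this network is not stated here.

```python
def cell_for_point(x, y, image_w, image_h, grid=13):
    """Return (row, col) of the grid cell that contains pixel (x, y).

    grid=13 is an assumed value, not taken from the network described here.
    """
    # Scale the pixel position to grid units, then clamp so that points on
    # the right or bottom edge still fall inside the last cell.
    col = min(int(x / image_w * grid), grid - 1)
    row = min(int(y / image_h * grid), grid - 1)
    return row, col


if __name__ == "__main__":
    # Center of a 773 x 512 image
    print(cell_for_point(386, 256, 773, 512))
```

In YOLO, the cell that contains the center of an object is the one responsible for predicting that object's bounding box.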
The network topology is shown below. The pink-colored layers have been quantized with 1 bit for weights and 3 bits for activations and are executed in the HW accelerator, while the other layers are executed in Python.
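As a rough illustration of what 1 bit for weights and 3 bits for activations means, here is a minimal sketch. The quantization scheme actually used by the accelerator is not described here, so this assumes sign binarization for weights and uniform quantization of activations clipped to [0, 1]. Both function names are hypothetical.

```python
def quantize_weight_1bit(w):
    """Binarize a weight to -1.0 or +1.0 by its sign (assumed scheme)."""
    # One bit can only distinguish two values; zero is mapped to +1.0.
    return 1.0 if w >= 0 else -1.0


def quantize_activation(a, bits=3):
    """Clip an activation to [0, 1] and round it to a uniform grid.

    With bits=3 there are 2**3 = 8 representable values: 0/7, 1/7, ..., 7/7.
    This uniform scheme is an assumption for illustration.
    """
    steps = 2 ** bits - 1
    clipped = min(max(a, 0.0), 1.0)
    return round(clipped * steps) / steps


if __name__ == "__main__":
    print([quantize_weight_1bit(w) for w in (-0.3, 0.0, 0.7)])
    print([quantize_activation(a) for a in (-1.0, 0.4, 2.0)])
```

The point of such coarse quantization is that multiplications reduce to very cheap operations in hardware, which is why these layers are the ones offloaded.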
The image processing is performed within Darknet by using Python bindings.
The neural network has been trained on the PASCAL VOC (Visual Object Classes) dataset and is able to identify 20 classes of objects in 4 categories:
- Person: person
- Animal: bird, cat, cow, dog, horse, sheep
- Vehicle: airplane, bicycle, boat, bus, car, motorbike, train
- Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
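The class list above can be written down as a small lookup table, which is handy when grouping detections by category. This is only a sketch: the names `VOC_CATEGORIES` and `category_of` are my own and are not part of Darknet or of the network's Python bindings.

```python
# The 20 PASCAL VOC classes, grouped into the 4 categories listed above.
VOC_CATEGORIES = {
    "person": ["person"],
    "animal": ["bird", "cat", "cow", "dog", "horse", "sheep"],
    "vehicle": ["airplane", "bicycle", "boat", "bus", "car", "motorbike", "train"],
    "indoor": ["bottle", "chair", "dining table", "potted plant", "sofa", "tv/monitor"],
}

# Reverse lookup: class label -> category name.
CATEGORY_BY_CLASS = {
    label: category
    for category, labels in VOC_CATEGORIES.items()
    for label in labels
}


def category_of(label):
    """Return the category of a class label, or None if it is not a VOC class."""
    return CATEGORY_BY_CLASS.get(label)


if __name__ == "__main__":
    print(len(CATEGORY_BY_CLASS))
    print(category_of("horse"))
    print(category_of("truck"))
```

A label outside the 20 classes, such as truck, has no entry, so the network can only report such an object as the closest class it knows.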
The steps for detection and classification are similar to those of the previous network, since this network also uses the multi-layer offload architecture.
Initialize the network
- Import libraries
- Instantiate classifier
- Perform other initializations in the Darknet framework
- Open image to be classified
- Execute the first convolutional layer in Python
- Compute HW Offload of the quantized layers
- Normalize using fully connected layers in Python
Code for classification:
Draw detection boxes using Darknet
The image postprocessing (drawing the bounding boxes) is performed in Darknet using Python bindings.
Code for image postprocessing:
Sample image (horses)
The first image that I am going to use is a provided sample image of horses (773 x 512 pixels).
class: cow probability: 84%
class: horse probability: 74%
class: horse probability: 68%
Object detection bounding boxes:
The example shows the issues that occur with multiple overlapping objects.
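A common way to handle overlapping detections is non-maximum suppression (NMS) based on intersection over union (IoU). The sketch below is a generic version of that technique, not the code Darknet runs; the function names, the `(label, probability, box)` tuple layout, and the 0.5 threshold are all assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are zero when the boxes do not intersect.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """Keep the most probable box among strongly overlapping boxes of one class.

    detections: list of (label, probability, box) tuples (assumed layout).
    iou_threshold=0.5 is an assumed value.
    """
    kept = []
    # Visit detections from most to least probable.
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        label, _, box = det
        # Drop the detection if a kept box of the same class overlaps it too much.
        if all(label != k[0] or iou(box, k[2]) < iou_threshold for k in kept):
            kept.append(det)
    return kept
```

The trade-off is visible in scenes like this one: a low threshold removes duplicate boxes for one object, but it can also remove a genuine second object of the same class that stands close to the first.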
Street camera images
The application that I would like to use neural networks for is object identification in video streams from surveillance cameras.
class: car probability: 30%
The car to the right is not detected.
class: car probability: 86%
class: car probability: 34%
Multiple bounding boxes for the same object
class: car probability: 79% — no separate class for truck
class: car probability: 96% — improved classification with larger image size (better resolution?)
class: car probability: 79%
class: car probability: 75%
class: person probability: 51%