Simple Motion Detection with Python and OpenCV — For Beginners

sokacoding
Feb 28, 2021

Hey folks! You got here because of the great title and your interest in a simple motion detector. If you want to follow along, you’ll need Python 3 and OpenCV (and numpy of course… you always need numpy). After buying a webcam I thought I’d try to create a motion detector. My goal was to do so without googling anything on the matter. I will now share my minimalistic program, which you can use as a base and expand with more functionality.
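
If you don’t have the libraries installed yet, the usual way is via pip (the OpenCV bindings are published on PyPI as opencv-python; this assumes a standard pip setup on your machine):

pip install opencv-python numpy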

Photo by Alex Knight on pexels.com

The Base Code

First we start off with a standard video feed capture with OpenCV, which looks like this (I am using a USB webcam):

import cv2

# Open the default webcam (device index 0)
cap = cv2.VideoCapture(0)

while True:
    # Grab a single frame from the video feed
    ret, frame = cap.read()
    cv2.imshow('frame', frame)

    # Quit when the 'q' key is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

We can now see the video feed from our webcam. The current frame is saved in the variable frame. As I wanted to keep it as simple as possible, my next thought was to use the grayscale information only. So we need to convert from RGB to grayscale and make the following changes to our code. (Annotation: OpenCV actually uses the BGR format; I am saying RGB because that term is more widely used.)

gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cv2.imshow('frame',gray)

After using the cvtColor function (convert color), the grayscale image is saved as a numpy array in the variable called gray, ready for our clever computations. We can print out the shapes of our variables frame and gray:

print(frame.shape)
print(gray.shape)

Output:
(480, 640, 3)
(480, 640)

A term worth knowing here is the HWC format, which stands for height, width and channels. The frame variable has this format: height = 480, width = 640, channels = 3 (RGB). Other libraries that read images might use a different ordering. The grayscale image has only one channel, so the third shape dimension is not needed, as you can see in the output.
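
To get a feel for these shapes, you can look at a single pixel (the coordinates below are arbitrary and the printed values are just examples): in the BGR frame one pixel consists of three channel values, in the grayscale image it is a single intensity.

print(frame[100, 200])   # e.g. [ 34  52  71] -> blue, green and red values of that pixel
print(gray[100, 200])    # e.g. 48            -> a single grayscale intensity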

The Algorithm

A video consists of frames glued together. The video feed of my USB webcam records 30 frames per second (fps). We want to detect motion. If there is motion of any kind taking place in a video, there must be some kind of change between two consecutive frames. And so I came up with the idea of comparing the mean values of all pixels of two consecutive frames. I do so by subtracting the two mean values. If you have a completely static video, the result of this subtraction is zero. In reality the result will fluctuate around zero, because we always have to deal with some noise. Assuming there is some kind of motion, the value will move further away from zero, in either the negative or the positive direction. Since we want to compare that value to some kind of threshold, it is better to use the absolute value of the result, because it can be a large negative or a large positive value.
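
To make the idea concrete before touching the webcam code, here is a tiny made-up example with two 2x2 “frames” (the numbers are arbitrary, chosen only for illustration): identical frames give a difference of zero, while a change in the picture pushes the mean difference clearly away from zero.

import numpy as np

# Two tiny fake grayscale "frames" with arbitrary example values
previous = np.array([[100, 102],
                     [ 98, 100]], dtype=np.uint8)
current  = np.array([[100, 150],
                     [ 98, 160]], dtype=np.uint8)  # something bright entered the scene

last_mean = np.mean(previous)                  # 100.0
print(np.abs(np.mean(previous) - last_mean))   # 0.0  -> no change between identical frames
print(np.abs(np.mean(current) - last_mean))    # 27.0 -> clear change, easy to threshold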

Let’s get to the coding. Before we enter the while-loop we initialize a variable called last_mean. Then, after we grab our frame and convert it to grayscale, we can do the math we just described using numpy.

  • We take the mean value of the current frame and subtract the mean value of the previous frame
  • When the program starts we don’t have a “previous” frame yet, which is why we initialize last_mean to 0
  • As we will compare the result of the subtraction to a threshold, it is beneficial to use the absolute value of the result

import cv2
import numpy as np

cap = cv2.VideoCapture(0)
last_mean = 0

while True:
    ret, frame = cap.read()
    cv2.imshow('frame', frame)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Absolute difference between the mean of the current frame and the previous one
    result = np.abs(np.mean(gray) - last_mean)
    print(result)
    last_mean = np.mean(gray)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Next you should print out the result variable. Observe the value while not moving, while raising your arm slowly and while dancing. This should give you a good feel for how to set a threshold that fits your needs and your camera. In my case I chose a threshold of 0.3.

print(result)
if result > 0.3:
    print("Motion detected!")
    print("Started recording.")

In the following GIF you can see our code in action. For a better demonstration I am printing out the absolute result values only every 30 frames. Since my webcam records at 30 fps, this means roughly once per second.
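
If you want the same sparse logging, a simple sketch is to count frames and only print when the counter is divisible by 30 (frame_count here is just an illustrative name, it is not used in the rest of the article):

import cv2
import numpy as np

cap = cv2.VideoCapture(0)
last_mean = 0
frame_count = 0  # frame counter used only for the sparse logging

while True:
    ret, frame = cap.read()
    cv2.imshow('frame', frame)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    result = np.abs(np.mean(gray) - last_mean)
    last_mean = np.mean(gray)

    frame_count += 1
    if frame_count % 30 == 0:  # at ~30 fps this prints roughly once per second
        print(result)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()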

Great, we can detect motion! Let us take it a step further and record the video for a specific amount of time when we do detect motion. If we want to specify the amount of recording time, it is helpful to keep track of the number of frames. We will use a variable called frame_rec_count that we increment for every frame we write to the video file. It is then possible to modify the break condition by adding (in pseudocode) “…stop if we reach 240 frames”, which corresponds to 8 seconds of recording when operating at 30 fps.

Recording frames and writing them to a video file is fairly easy with OpenCV; you can look it up in the OpenCV documentation. We introduce a bool variable called detected_motion and initialize it as False at the beginning of our script, right after the initialization of the last_mean variable. When our thresholding if-condition is met, we set detected_motion to True, and in another if-condition we write frames to the video file only while detected_motion is True. Our code looks like this now:

import cv2
import numpy as np

cap = cv2.VideoCapture(0)
last_mean = 0
detected_motion = False
frame_rec_count = 0

# Video writer for the recorded clip (XVID codec, 30 fps to match the webcam, 640x480)
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output.avi', fourcc, 30.0, (640, 480))

while True:
    ret, frame = cap.read()
    cv2.imshow('frame', frame)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    result = np.abs(np.mean(gray) - last_mean)
    print(result)
    last_mean = np.mean(gray)

    if result > 0.3:
        print("Motion detected!")
        print("Started recording.")
        detected_motion = True

    if detected_motion:
        out.write(frame)
        frame_rec_count = frame_rec_count + 1

    # Stop on 'q' or after 240 recorded frames (8 seconds at 30 fps)
    if (cv2.waitKey(1) & 0xFF == ord('q')) or frame_rec_count == 240:
        break

out.release()
cap.release()
cv2.destroyAllWindows()

This is our final code, with which we can detect motion and automatically record a video file for a certain amount of time. I ran into the problem that the very first webcam frame we read always leads to a motion detection (since last_mean starts at 0, the first difference equals the mean of the entire first frame). If you run into the same problem, you can initialize another variable called first_val and set it to True. Create another if-condition in the while-loop that checks this variable: inside it, just set the variable to False and skip the comparison; everything else goes into the else-branch:

if first_val:
    first_val = False  # skip the very first frame
else:
    # do the other stuff we discussed
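
To make this concrete, the while-loop could then look roughly like this (a sketch reusing the variables cap, out, last_mean, detected_motion and frame_rec_count from the final script above; adapt it to your own setup):

first_val = True  # goes right after the initialization of last_mean

while True:
    ret, frame = cap.read()
    cv2.imshow('frame', frame)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    if first_val:
        # There is no previous frame yet, so skip the comparison for the very first frame
        first_val = False
    else:
        result = np.abs(np.mean(gray) - last_mean)
        if result > 0.3:
            print("Motion detected!")
            print("Started recording.")
            detected_motion = True

    last_mean = np.mean(gray)

    if detected_motion:
        out.write(frame)
        frame_rec_count = frame_rec_count + 1

    if (cv2.waitKey(1) & 0xFF == ord('q')) or frame_rec_count == 240:
        break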

This is it! Feel free to play around with the code and add new functionalities.
