How to use OpenCV in Python
Learn how to use OpenCV in Python with our guide. Discover tips, real-world applications, and how to debug common errors.

OpenCV offers a powerful toolkit for computer vision in Python. It provides the functions you need to process images and videos for tasks like object detection and facial recognition.
In this article, you'll learn essential techniques and practical tips. We'll cover real-world applications and common debugging advice to help you build your computer vision projects with confidence.
Basic image loading and display with OpenCV
import cv2
import numpy as np
# Create a simple colored image
img = np.zeros((200, 200, 3), dtype=np.uint8)
img[:, :, 2] = 255 # Set red channel to maximum
cv2.imshow('Red Image', img)
print(f"Image shape: {img.shape}")
cv2.waitKey(0)
cv2.destroyAllWindows()

Output:
Image shape: (200, 200, 3)
OpenCV treats images as NumPy arrays, which allows for efficient numerical operations. The code creates a 200x200 pixel image with three color channels using np.zeros. The img.shape of (200, 200, 3) confirms its dimensions of height, width, and channels.
Key points to note are:
- The line img[:, :, 2] = 255 targets the third channel. OpenCV uses a BGR (Blue, Green, Red) color order by default, not RGB, which is why this specific index creates a red image.
- While cv2.imshow() displays the image, it's the cv2.waitKey(0) function that's crucial. It pauses the script and waits for a key press, keeping the image window visible.
Basic image operations
Once you have an image array, you can easily modify it by resizing with cv2.resize(), converting colors with cv2.cvtColor(), or drawing new elements.
Resizing images with cv2.resize()
import cv2
import numpy as np
# Create a sample image
img = np.zeros((400, 600, 3), dtype=np.uint8)
img[:, :, 0] = 255 # Fill with blue
# Resize to half and double the size
half = cv2.resize(img, (300, 200))
double = cv2.resize(img, (1200, 800))
print(f"Original: {img.shape}, Half: {half.shape}, Double: {double.shape}")

Output:
Original: (400, 600, 3), Half: (200, 300, 3), Double: (800, 1200, 3)
The cv2.resize() function is your go-to for changing an image's dimensions. This is a common preprocessing step, especially when you need to standardize image sizes before feeding them into a machine learning model.
- A crucial detail is that cv2.resize() expects the new size as a (width, height) tuple. This is the reverse of the NumPy array's (height, width) shape, so it's something to watch out for.
Converting between color spaces with cv2.cvtColor()
import cv2
import numpy as np
# Create a red square
img = np.zeros((200, 200, 3), dtype=np.uint8)
img[:, :, 2] = 255 # Red in BGR
# Convert to different color spaces
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
print(f"BGR red pixel: {img[0, 0]}")
print(f"Grayscale red value: {gray[0, 0]}")
print(f"HSV red value: {hsv[0, 0]}")

Output:
BGR red pixel: [0 0 255]
Grayscale red value: 76
HSV red value: [ 0 255 255]
The cv2.cvtColor() function is your tool for switching between different color representations. This isn't just a cosmetic change; different color spaces are suited for different computer vision tasks. Converting to grayscale, for example, reduces an image to its light intensity, which is often sufficient for feature detection and simplifies processing.
- The code demonstrates how a pure red BGR pixel, [0, 0, 255], translates into other formats. In grayscale, it becomes a single intensity value of 76.
- In HSV (Hue, Saturation, Value), that same red becomes [0, 255, 255]. The HSV space is particularly useful because it separates color (Hue) from lighting, which makes tasks like tracking a specific colored object more robust.
Drawing shapes with OpenCV functions
import cv2
import numpy as np
# Create a blank canvas
canvas = np.zeros((300, 300, 3), dtype=np.uint8)
# Draw shapes
cv2.line(canvas, (50, 50), (250, 250), (0, 255, 0), 2)
cv2.rectangle(canvas, (50, 50), (250, 150), (255, 0, 0), 3)
cv2.circle(canvas, (150, 150), 60, (0, 0, 255), -1)
print(f"Canvas shape: {canvas.shape}")
print(f"Pixel at circle center: {canvas[150, 150]}")

Output:
Canvas shape: (300, 300, 3)
Pixel at circle center: [0 0 255]
You can draw shapes directly onto an image array using OpenCV's built-in functions. The code demonstrates cv2.line, cv2.rectangle, and cv2.circle, all of which modify the original canvas array in place. These functions share a similar structure, requiring coordinates, a BGR color, and line thickness.
- A key detail is the thickness parameter. Using a positive number sets the outline's width, while passing -1, as with the circle, fills the shape completely. This is perfect for creating annotations or visual masks.
Advanced OpenCV techniques
With the fundamentals of image manipulation covered, you can now move into more advanced analysis, from applying image filters to processing entire video streams.
Applying image filters and edge detection
import cv2
import numpy as np
# Create a sample image
img = np.zeros((200, 200), dtype=np.uint8)
img[50:150, 50:150] = 255 # White square
# Apply filters
blur = cv2.GaussianBlur(img, (15, 15), 0)
edges = cv2.Canny(img, 100, 200)
print(f"Center pixel in original: {img[100, 100]}")
print(f"Center pixel in blurred: {blur[100, 100]}")
print(f"Edge pixels detected: {np.count_nonzero(edges)}")

Output:
Center pixel in original: 255
Center pixel in blurred: 255
Edge pixels detected: 392
Image filtering lets you enhance or modify images for analysis. The code demonstrates two common techniques: blurring and edge detection. Blurring, done here with cv2.GaussianBlur(), smooths an image and can help reduce noise before further processing.
- The cv2.Canny() function is a popular algorithm for finding edges. It identifies sharp changes in pixel intensity, which is how it outlines the white square in the example.
- The two threshold values, 100 and 200, help the algorithm determine what qualifies as a true edge, making it a powerful tool for feature extraction.
Detecting features with cv2.goodFeaturesToTrack()
import cv2
import numpy as np
# Create a checkerboard pattern
img = np.zeros((200, 200), dtype=np.uint8)
img[0:100, 0:100] = 255
img[100:200, 100:200] = 255
# Detect corners
corners = cv2.goodFeaturesToTrack(img, 10, 0.1, 10)
print(f"Number of corners detected: {len(corners)}")
print(f"First corner position: {corners[0].ravel()}")

Output:
Number of corners detected: 4
First corner position: [ 100. 100.]
The cv2.goodFeaturesToTrack() function is your tool for identifying prominent corners in an image. Corners are distinct points that don't change much with rotation or scaling, which is why they're so valuable for motion tracking and image alignment. The code demonstrates this by correctly finding the four corners of the checkerboard pattern.
- The function's parameters let you control the maximum number of corners to return, their quality, and the minimum distance between them.
Processing video with OpenCV
import cv2
import numpy as np
# Create a simulated video frame
frame = np.zeros((240, 320, 3), dtype=np.uint8)
frame[80:160, 120:200, 1] = 255 # Green rectangle
# Process the frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
_, thresh = cv2.threshold(blurred, 100, 255, cv2.THRESH_BINARY)
print(f"Frame shape: {frame.shape}")
print(f"Thresholded pixels: {np.count_nonzero(thresh)}")

Output:
Frame shape: (240, 320, 3)
Thresholded pixels: 6400
OpenCV processes video by handling each frame as a separate image. This code demonstrates a common pipeline for isolating an object. After converting the frame to grayscale and blurring it to reduce noise, it uses cv2.threshold() to create a binary image.
- This function simplifies the frame into just black and white pixels based on a brightness cutoff.
- This technique, called thresholding, is crucial for separating a subject from the background, making it easier to perform tasks like object tracking or contour detection.
Move faster with Replit
Replit is an AI-powered development platform that comes with all Python dependencies pre-installed, so you can skip setup and start coding instantly. Instead of just learning individual functions, you can use Agent 4 to build a complete, working application directly from a description.
Instead of piecing together techniques, describe the app you actually want to build and Agent 4 will take it from idea to working product:
- An object outliner that uses cv2.Canny() to detect edges in an uploaded image and draws a bounding box around the primary subject.
- A batch image resizer that processes a folder of images, standardizing their dimensions with cv2.resize() for a machine learning dataset.
- A color-based object tracker that uses cv2.cvtColor() and cv2.threshold() to isolate and follow a specific color in a video stream.
Simply describe your app, and Replit will write the code, test it, and fix issues automatically, all within your browser.
Common errors and challenges
Even experienced developers run into a few common snags, but thankfully, the fixes for OpenCV's most frequent errors are straightforward and easy to learn.
Fixing type errors when processing images with cv2.Canny()
A frequent error with cv2.Canny() arises from a data type mismatch. The function is designed to find edges in a single-channel, grayscale image, but it's often mistakenly fed a three-channel BGR image, which causes it to fail.
The fix is simple: you just need to convert the image first. Before calling cv2.Canny(), use cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) to create a grayscale version that the function can process correctly.
Resolving image display issues with cv2.waitKey()
If you call cv2.imshow() and the image window flashes on screen before immediately vanishing, you've likely forgotten a crucial function. This happens because your script finishes executing before the window has a chance to render and wait for user input.
To fix this, you must pair every cv2.imshow() call with cv2.waitKey(0). This function tells your program to pause and wait indefinitely for a key press, keeping the image window open until you're ready to close it.
Troubleshooting odd-sized kernel errors in cv2.GaussianBlur()
An error message mentioning kernel size when using cv2.GaussianBlur() points to a specific requirement for one of its arguments. The function needs a blurring kernel—a small matrix that defines the blur effect—with odd-numbered dimensions.
You'll get an error if you provide an even number like (10, 10). To solve this, ensure the ksize tuple you pass to the function contains only odd, positive integers, such as (5, 5) or (21, 21).
Fixing type errors when processing images with cv2.Canny()
It’s easy to make this mistake when you’re focused on your project. You have a color image and want to find its edges, so you pass it directly to cv2.Canny(). The code below shows the resulting error.
import cv2
import numpy as np
img = np.zeros((100, 100, 3), dtype=np.uint8)
img[25:75, 25:75] = [0, 0, 255] # Red square
# Try to apply edge detection directly to color image
edges = cv2.Canny(img, 100, 200)
cv2.imshow('Edges', edges)
The code creates a three-channel color image, img, and then passes it directly to cv2.Canny(). This mismatch between the image format and what the function expects is the source of the error. See the corrected code below.
import cv2
import numpy as np
img = np.zeros((100, 100, 3), dtype=np.uint8)
img[25:75, 25:75] = [0, 0, 255] # Red square
# Convert to grayscale first
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 100, 200)
cv2.imshow('Edges', edges)
The fix is to insert a conversion step before edge detection. By calling cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), you transform the three-channel color image into the single-channel grayscale format that cv2.Canny() expects. This error is common when processing images loaded from files, since they are almost always in color by default. Always check that your image format matches what the function requires.
Resolving image display issues with cv2.waitKey()
When an image window appears and then immediately closes, the culprit is usually a missing cv2.waitKey() call. Your script simply runs to completion without pausing for the display. The code below shows what happens when this function is omitted.
import cv2
import numpy as np
img1 = np.zeros((100, 100, 3), dtype=np.uint8)
img1[:, :, 0] = 255 # Blue image
cv2.imshow('First Image', img1)
img2 = np.zeros((100, 100, 3), dtype=np.uint8)
img2[:, :, 1] = 255 # Green image
cv2.imshow('Second Image', img2)
cv2.destroyAllWindows()
The script calls cv2.imshow() twice, but the program continues executing and reaches cv2.destroyAllWindows() almost instantly. This gives the image windows no time to remain on screen. The corrected code below shows how to handle this properly.
import cv2
import numpy as np
img1 = np.zeros((100, 100, 3), dtype=np.uint8)
img1[:, :, 0] = 255 # Blue image
cv2.imshow('First Image', img1)
cv2.waitKey(1000) # Wait for 1 second
img2 = np.zeros((100, 100, 3), dtype=np.uint8)
img2[:, :, 1] = 255 # Green image
cv2.imshow('Second Image', img2)
cv2.waitKey(0) # Wait until a key is pressed
cv2.destroyAllWindows()
The corrected code solves the problem by calling cv2.waitKey() after each cv2.imshow() call, which pauses the script and keeps the image window open. Passing a positive number like 1000 makes it wait for that many milliseconds, while 0 waits indefinitely for a key press.
This is essential for any application that displays images, especially when showing them in a sequence or processing video frames, as it gives you control over how long each one is visible.
Troubleshooting odd-sized kernel errors in cv2.GaussianBlur()
When using cv2.GaussianBlur(), you might hit an error related to kernel size. The function requires the blurring kernel’s dimensions to be odd numbers, like (5, 5). Providing an even-sized kernel will cause a crash. The code below demonstrates this common mistake.
import cv2
import numpy as np
img = np.zeros((100, 100), dtype=np.uint8)
img[40:60, 40:60] = 255
# Apply Gaussian blur with even-sized kernel
blurred = cv2.GaussianBlur(img, (4, 4), 0)
cv2.imshow('Blurred', blurred)
The error is triggered because cv2.GaussianBlur() is called with an even-sized kernel of (4, 4). See how the corrected code below adjusts this parameter to resolve the issue.
import cv2
import numpy as np
img = np.zeros((100, 100), dtype=np.uint8)
img[40:60, 40:60] = 255
# Apply Gaussian blur with odd-sized kernel
blurred = cv2.GaussianBlur(img, (5, 5), 0)
cv2.imshow('Blurred', blurred)
The fix is straightforward. The corrected code works because it provides an odd-sized kernel, (5, 5), which is a strict requirement for the cv2.GaussianBlur() function. The blurring algorithm needs a clear center point to calculate its effect, and only odd dimensions provide one. This error often appears when you're experimenting with different blur levels, so always ensure your kernel dimensions are odd, positive integers.
Real-world applications
These functions are the foundation for solving complex problems, from identifying faces in a crowd to digitizing text from a scanned document.
Detecting faces with cv2.CascadeClassifier()
The cv2.CascadeClassifier() function allows you to detect objects, including faces, by loading pre-trained models.
import cv2
import numpy as np
# Create a sample image with a face-like shape
img = np.zeros((300, 300, 3), dtype=np.uint8)
img[100:200, 100:200] = [200, 200, 200] # Face area
# Load face cascade and detect faces
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(gray, 1.1, 4)
print(f"Number of faces detected: {len(faces)}")
This code demonstrates how to use a Haar Cascade, a pre-trained model for object detection. It begins by loading OpenCV’s built-in frontal face model with cv2.CascadeClassifier().
- The detection runs on a grayscale image, so the code first converts the input using cv2.cvtColor().
- The detectMultiScale() method then scans the image and returns a list of rectangles for any faces it identifies.
- Parameters like 1.1 and 4 help fine-tune the detector's sensitivity to avoid false positives.
Detecting text in documents with cv2.findContours()
You can use cv2.findContours() to find the outlines of continuous shapes, which is a key step for isolating text blocks in a document.
import cv2
import numpy as np
# Create a sample document with text-like blocks
doc = np.ones((300, 300), dtype=np.uint8) * 255
cv2.rectangle(doc, (50, 50), (250, 70), 0, -1) # Line of "text"
cv2.rectangle(doc, (50, 100), (200, 120), 0, -1) # Second line
# Find text blocks
_, binary = cv2.threshold(doc, 128, 255, cv2.THRESH_BINARY_INV)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
print(f"Text blocks detected: {len(contours)}")
print(f"First text block area: {cv2.contourArea(contours[0])}")
This code isolates shapes, like blocks of text, by first preparing the image. It uses cv2.threshold() with the cv2.THRESH_BINARY_INV flag to invert the image, turning the black text areas white and the background black. This step is crucial because cv2.findContours() is designed to find white objects on a black background.
- The cv2.findContours() function then traces the outlines of these shapes.
- Using cv2.RETR_EXTERNAL tells the function to only grab the outermost contours, which is perfect for identifying separate text blocks without getting confused by letters inside them.
Get started with Replit
Now, turn these concepts into a real tool. Describe what you want to Replit Agent, like "an app that counts objects in an image" or "a face detector for a live video feed."
Replit Agent will write the code, test for errors, and deploy your application for you. Start building with Replit.
Describe what you want to build, and Replit Agent writes the code, handles the infrastructure, and ships it live. Go from idea to real product, all in your browser.
Create & deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.