How to use OpenCV in Python
Learn how to use OpenCV in Python with our guide. Discover tips, real-world applications, and how to debug common errors.

OpenCV offers a powerful toolkit for computer vision in Python. It provides the functions you need to process images and videos for tasks like object detection and facial recognition.
In this article, you'll learn essential techniques and practical tips. We'll cover real-world applications and common debugging advice to help you build your computer vision projects with confidence.
Basic image loading and display with OpenCV
import cv2
import numpy as np
# Create a simple colored image
img = np.zeros((200, 200, 3), dtype=np.uint8)
img[:, :, 2] = 255 # Set red channel to maximum
cv2.imshow('Red Image', img)
print(f"Image shape: {img.shape}")
cv2.waitKey(0)
cv2.destroyAllWindows()

Output:
Image shape: (200, 200, 3)
OpenCV treats images as NumPy arrays, which allows for efficient numerical operations. The code creates a 200x200 pixel image with three color channels using np.zeros. The img.shape of (200, 200, 3) confirms its dimensions of height, width, and channels.
Key points to note are:
- The line img[:, :, 2] = 255 targets the third channel. OpenCV uses a BGR (Blue, Green, Red) color order by default, not RGB, which is why this specific index creates a red image.
- While cv2.imshow() displays the image, it's the cv2.waitKey(0) function that's crucial. It pauses the script and waits for a key press, keeping the image window visible.
Basic image operations
Once you have an image array, you can easily modify it by resizing with cv2.resize(), converting colors with cv2.cvtColor(), or drawing new elements.
Resizing images with cv2.resize()
import cv2
import numpy as np
# Create a sample image
img = np.zeros((400, 600, 3), dtype=np.uint8)
img[:, :, 0] = 255 # Fill with blue
# Resize to half and double the size
half = cv2.resize(img, (300, 200))
double = cv2.resize(img, (1200, 800))
print(f"Original: {img.shape}, Half: {half.shape}, Double: {double.shape}")

Output:
Original: (400, 600, 3), Half: (200, 300, 3), Double: (800, 1200, 3)
The cv2.resize() function is your go-to for changing an image's dimensions. This is a common preprocessing step, especially when you need to standardize image sizes before feeding them into a machine learning model.
- A crucial detail is that cv2.resize() expects the new size as a (width, height) tuple. This is the reverse of the NumPy array's (height, width) shape, so it's something to watch out for.
Converting between color spaces with cv2.cvtColor()
import cv2
import numpy as np
# Create a red square
img = np.zeros((200, 200, 3), dtype=np.uint8)
img[:, :, 2] = 255 # Red in BGR
# Convert to different color spaces
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
print(f"BGR red pixel: {img[0, 0]}")
print(f"Grayscale red value: {gray[0, 0]}")
print(f"HSV red value: {hsv[0, 0]}")

Output:
BGR red pixel: [0 0 255]
Grayscale red value: 76
HSV red value: [ 0 255 255]
The cv2.cvtColor() function is your tool for switching between different color representations. This isn't just a cosmetic change; different color spaces are suited for different computer vision tasks. Converting to grayscale, for example, reduces an image to its light intensity, which is often sufficient for feature detection and simplifies processing.
- The code demonstrates how a pure red BGR pixel, [0, 0, 255], translates into other formats. In grayscale, it becomes a single intensity value of 76.
- In HSV (Hue, Saturation, Value), that same red becomes [0, 255, 255]. The HSV space is particularly useful because it separates color (Hue) from lighting, which makes tasks like tracking a specific colored object more robust.
Drawing shapes with OpenCV functions
import cv2
import numpy as np
# Create a blank canvas
canvas = np.zeros((300, 300, 3), dtype=np.uint8)
# Draw shapes
cv2.line(canvas, (50, 50), (250, 250), (0, 255, 0), 2)
cv2.rectangle(canvas, (50, 50), (250, 150), (255, 0, 0), 3)
cv2.circle(canvas, (150, 150), 60, (0, 0, 255), -1)
print(f"Canvas shape: {canvas.shape}")
print(f"Pixel at circle center: {canvas[150, 150]}")

Output:
Canvas shape: (300, 300, 3)
Pixel at circle center: [0 0 255]
You can draw shapes directly onto an image array using OpenCV's built-in functions. The code demonstrates cv2.line, cv2.rectangle, and cv2.circle, all of which modify the original canvas array in place. These functions share a similar structure, requiring coordinates, a BGR color, and line thickness.
- A key detail is the thickness parameter. Using a positive number sets the outline's width, while passing -1, as with the circle, fills the shape completely. This is perfect for creating annotations or visual masks.
Advanced OpenCV techniques
With the fundamentals of image manipulation covered, you can now move into more advanced analysis, from applying image filters to processing entire video streams.
Applying image filters and edge detection
import cv2
import numpy as np
# Create a sample image
img = np.zeros((200, 200), dtype=np.uint8)
img[50:150, 50:150] = 255 # White square
# Apply filters
blur = cv2.GaussianBlur(img, (15, 15), 0)
edges = cv2.Canny(img, 100, 200)
print(f"Center pixel in original: {img[100, 100]}")
print(f"Center pixel in blurred: {blur[100, 100]}")
print(f"Edge pixels detected: {np.count_nonzero(edges)}")

Output:
Center pixel in original: 255
Center pixel in blurred: 255
Edge pixels detected: 392
Image filtering lets you enhance or modify images for analysis. The code demonstrates two common techniques: blurring and edge detection. Blurring, done here with cv2.GaussianBlur(), smooths an image and can help reduce noise before further processing.
- The cv2.Canny() function is a popular algorithm for finding edges. It identifies sharp changes in pixel intensity, which is how it outlines the white square in the example.
- The two threshold values, 100 and 200, help the algorithm determine what qualifies as a true edge, making it a powerful tool for feature extraction.
Detecting features with cv2.goodFeaturesToTrack()
import cv2
import numpy as np
# Create a checkerboard pattern
img = np.zeros((200, 200), dtype=np.uint8)
img[0:100, 0:100] = 255
img[100:200, 100:200] = 255
# Detect corners
corners = cv2.goodFeaturesToTrack(img, 10, 0.1, 10)
print(f"Number of corners detected: {len(corners)}")
print(f"First corner position: {corners[0].ravel()}")

Output:
Number of corners detected: 4
First corner position: [ 100. 100.]
The cv2.goodFeaturesToTrack() function is your tool for identifying prominent corners in an image. Corners are distinct points that don't change much with rotation or scaling, which is why they're so valuable for motion tracking and image alignment. The code demonstrates this by correctly finding the four corners of the checkerboard pattern.
- The function's parameters let you control the maximum number of corners to return, their quality, and the minimum distance between them.
Processing video with OpenCV
import cv2
import numpy as np
# Create a simulated video frame
frame = np.zeros((240, 320, 3), dtype=np.uint8)
frame[80:160, 120:200, 1] = 255 # Green rectangle
# Process the frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
_, thresh = cv2.threshold(blurred, 100, 255, cv2.THRESH_BINARY)
print(f"Frame shape: {frame.shape}")
print(f"Thresholded pixels: {np.count_nonzero(thresh)}")

Output:
Frame shape: (240, 320, 3)
Thresholded pixels: 6400
OpenCV processes video by handling each frame as a separate image. This code demonstrates a common pipeline for isolating an object. After converting the frame to grayscale and blurring it to reduce noise, it uses cv2.threshold() to create a binary image.
- This function simplifies the frame into just black and white pixels based on a brightness cutoff.
- This technique, called thresholding, is crucial for separating a subject from the background, making it easier to perform tasks like object tracking or contour detection.
Move faster with Replit
Replit is an AI-powered development platform that comes with all Python dependencies pre-installed, so you can skip setup and start coding instantly. Instead of just learning individual functions, you can use Agent 4 to build a complete, working application directly from a description.
Instead of piecing together techniques, describe the app you actually want to build and Agent 4 will take it from idea to working product:
- An object outliner that uses cv2.Canny() to detect edges in an uploaded image and draws a bounding box around the primary subject.
- A batch image resizer that processes a folder of images, standardizing their dimensions with cv2.resize() for a machine learning dataset.
- A color-based object tracker that uses cv2.cvtColor() and cv2.threshold() to isolate and follow a specific color in a video stream.
Simply describe your app, and Replit will write the code, test it, and fix issues automatically, all within your browser.
Common errors and challenges
Even experienced developers run into a few common snags, but thankfully, the fixes for OpenCV's most frequent errors are straightforward and easy to learn.
Fixing type errors when processing images with cv2.Canny()
A frequent error with cv2.Canny() arises from a data type mismatch. The function is designed to find edges in a single-channel, grayscale image, but it's often mistakenly fed a three-channel BGR image, which causes it to fail.
The fix is simple: you just need to convert the image first. Before calling cv2.Canny(), use cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) to create a grayscale version that the function can process correctly.
Resolving image display issues with cv2.waitKey()
If you call cv2.imshow() and the image window flashes on screen before immediately vanishing, you've likely forgotten a crucial function. This happens because your script finishes executing before the window has a chance to render and wait for user input.
To fix this, you must pair every cv2.imshow() call with cv2.waitKey(0). This function tells your program to pause and wait indefinitely for a key press, keeping the image window open until you're ready to close it.
Troubleshooting odd-sized kernel errors in cv2.GaussianBlur()
An error message mentioning kernel size when using cv2.GaussianBlur() points to a specific requirement for one of its arguments. The function needs a blurring kernel—a small matrix that defines the blur effect—with odd-numbered dimensions.
You'll get an error if you provide an even number like (10, 10). To solve this, ensure the ksize tuple you pass to the function contains only odd, positive integers, such as (5, 5) or (21, 21).
Fixing type errors when processing images with cv2.Canny()
It’s easy to make this mistake when you’re focused on your project. You have a color image and want to find its edges, so you pass it directly to cv2.Canny(). The code below shows the resulting error.
import cv2
import numpy as np
img = np.zeros((100, 100, 3), dtype=np.uint8)
img[25:75, 25:75] = [0, 0, 255] # Red square
# Try to apply edge detection directly to color image
edges = cv2.Canny(img, 100, 200)
cv2.imshow('Edges', edges)
The code creates a three-channel color image, img, and then passes it directly to cv2.Canny(). This mismatch between the image format and what the function expects is the source of the error. See the corrected code below.
import cv2
import numpy as np
img = np.zeros((100, 100, 3), dtype=np.uint8)
img[25:75, 25:75] = [0, 0, 255] # Red square
# Convert to grayscale first
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 100, 200)
cv2.imshow('Edges', edges)
The fix is to insert a conversion step before edge detection. By calling cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), you transform the three-channel color image into the single-channel grayscale format that cv2.Canny() expects. This error is common when processing images loaded from files, since they are almost always in color by default. Always check that your image format matches what the function requires.
Resolving image display issues with cv2.waitKey()
When an image window appears and then immediately closes, the culprit is usually a missing cv2.waitKey() call. Your script simply runs to completion without pausing for the display. The code below shows what happens when this function is omitted.
import cv2
import numpy as np
img1 = np.zeros((100, 100, 3), dtype=np.uint8)
img1[:, :, 0] = 255 # Blue image
cv2.imshow('First Image', img1)
img2 = np.zeros((100, 100, 3), dtype=np.uint8)
img2[:, :, 1] = 255 # Green image
cv2.imshow('Second Image', img2)
cv2.destroyAllWindows()
The script calls cv2.imshow() twice, but the program continues executing and reaches cv2.destroyAllWindows() almost instantly. This gives the image windows no time to remain on screen. The corrected code below shows how to handle this properly.
import cv2
import numpy as np
img1 = np.zeros((100, 100, 3), dtype=np.uint8)
img1[:, :, 0] = 255 # Blue image
cv2.imshow('First Image', img1)
cv2.waitKey(1000) # Wait for 1 second
img2 = np.zeros((100, 100, 3), dtype=np.uint8)
img2[:, :, 1] = 255 # Green image
cv2.imshow('Second Image', img2)
cv2.waitKey(0) # Wait until a key is pressed
cv2.destroyAllWindows()
The corrected code solves the problem by calling cv2.waitKey() after each cv2.imshow() call, which pauses the script and keeps the image window open. Passing a positive number like 1000 makes it wait for that many milliseconds, while 0 waits indefinitely for a key press.
This is essential for any application that displays images, especially when showing them in a sequence or processing video frames, as it gives you control over how long each one is visible.
Troubleshooting odd-sized kernel errors in cv2.GaussianBlur()
When using cv2.GaussianBlur(), you might hit an error related to kernel size. The function requires the blurring kernel’s dimensions to be odd numbers, like (5, 5). Providing an even-sized kernel will cause a crash. The code below demonstrates this common mistake.
import cv2
import numpy as np
img = np.zeros((100, 100), dtype=np.uint8)
img[40:60, 40:60] = 255
# Apply Gaussian blur with even-sized kernel
blurred = cv2.GaussianBlur(img, (4, 4), 0)
cv2.imshow('Blurred', blurred)
The error is triggered because cv2.GaussianBlur() is called with an even-sized kernel of (4, 4). See how the corrected code below adjusts this parameter to resolve the issue.
import cv2
import numpy as np
img = np.zeros((100, 100), dtype=np.uint8)
img[40:60, 40:60] = 255
# Apply Gaussian blur with odd-sized kernel
blurred = cv2.GaussianBlur(img, (5, 5), 0)
cv2.imshow('Blurred', blurred)
The fix is straightforward. The corrected code works because it provides an odd-sized kernel, (5, 5), which is a strict requirement for the cv2.GaussianBlur() function. The blurring algorithm needs a clear center point to calculate its effect, and only odd dimensions provide one. This error often appears when you're experimenting with different blur levels, so always ensure your kernel dimensions are odd, positive integers.
Real-world applications
These functions are the foundation for solving complex problems, from identifying faces in a crowd to digitizing text from a scanned document.
Detecting faces with cv2.CascadeClassifier()
The cv2.CascadeClassifier() function allows you to detect objects, including faces, by loading pre-trained models.
import cv2
import numpy as np
# Create a sample image with a face-like shape
img = np.zeros((300, 300, 3), dtype=np.uint8)
img[100:200, 100:200] = [200, 200, 200] # Face area
# Load face cascade and detect faces
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(gray, 1.1, 4)
print(f"Number of faces detected: {len(faces)}")
This code demonstrates how to use a Haar Cascade, a pre-trained model for object detection. It begins by loading OpenCV’s built-in frontal face model with cv2.CascadeClassifier().
- The detection runs on a grayscale image, so the code first converts the input using cv2.cvtColor().
- The detectMultiScale() method then scans the image and returns a list of rectangles for any faces it identifies.
- Parameters like 1.1 and 4 help fine-tune the detector's sensitivity to avoid false positives.
Detecting text in documents with cv2.findContours()
You can use cv2.findContours() to find the outlines of continuous shapes, which is a key step for isolating text blocks in a document.
import cv2
import numpy as np
# Create a sample document with text-like blocks
doc = np.ones((300, 300), dtype=np.uint8) * 255
cv2.rectangle(doc, (50, 50), (250, 70), 0, -1) # Line of "text"
cv2.rectangle(doc, (50, 100), (200, 120), 0, -1) # Second line
# Find text blocks
_, binary = cv2.threshold(doc, 128, 255, cv2.THRESH_BINARY_INV)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
print(f"Text blocks detected: {len(contours)}")
print(f"First text block area: {cv2.contourArea(contours[0])}")
This code isolates shapes, like blocks of text, by first preparing the image. It uses cv2.threshold() with the cv2.THRESH_BINARY_INV flag to invert the image, turning the black text areas white and the background black. This step is crucial because cv2.findContours() is designed to find white objects on a black background.
- The cv2.findContours() function then traces the outlines of these shapes.
- Using cv2.RETR_EXTERNAL tells the function to only grab the outermost contours, which is perfect for identifying separate text blocks without getting confused by letters inside them.
Get started with Replit
Now, turn these concepts into a real tool. Describe what you want to Replit Agent, like "an app that counts objects in an image" or "a face detector for a live video feed."
Replit Agent will write the code, test for errors, and deploy your application for you. Start building with Replit.
Describe what you want to build, and Replit Agent writes the code, handles the infrastructure, and ships it live. Go from idea to real product, all in your browser.
Create & deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.