How to use OpenCV in Python

Unlock computer vision with Python. This guide shows you how to use OpenCV, covering tips, real-world applications, and debugging.

Published on: Tue, Mar 10, 2026
Updated on: Fri, Mar 13, 2026
The Replit Team

OpenCV is a powerful library for computer vision in Python. It lets you process images and videos, which opens up a world of creative and practical software applications.

In this article, we'll cover key techniques, real-world applications, and practical tips. You'll also find debugging advice to help you master computer vision with Python and OpenCV.

Basic image loading and display with OpenCV

import cv2
import numpy as np

# Create a simple colored image
img = np.zeros((200, 200, 3), dtype=np.uint8)
img[:, :, 2] = 255 # Set red channel to maximum
cv2.imshow('Red Image', img)
print(f"Image shape: {img.shape}")
cv2.waitKey(0)
cv2.destroyAllWindows()

Output:
Image shape: (200, 200, 3)

The code demonstrates how OpenCV treats images as NumPy arrays. The shape (200, 200, 3) defines the image's height, width, and color channels, which lets you manipulate pixels directly.

  • Color Order: A crucial detail is OpenCV's BGR (Blue, Green, Red) color order, not the more common RGB. That's why setting the channel at index 2 makes the image red.
  • Display Logic: The cv2.imshow() function displays the image, but you need cv2.waitKey(0) to pause the script and keep the window visible until you press a key.

Basic image operations

Once you have an image loaded as an array, you can easily resize it with cv2.resize(), convert its color space with cv2.cvtColor(), and draw shapes.

Resizing images with cv2.resize()

import cv2
import numpy as np

# Create a sample image
img = np.zeros((400, 600, 3), dtype=np.uint8)
img[:, :, 0] = 255 # Fill with blue

# Resize to half and double the size
half = cv2.resize(img, (300, 200))
double = cv2.resize(img, (1200, 800))
print(f"Original: {img.shape}, Half: {half.shape}, Double: {double.shape}")

Output:
Original: (400, 600, 3), Half: (200, 300, 3), Double: (800, 1200, 3)

The cv2.resize() function is your go-to for changing an image's dimensions. It's essential for tasks like preparing images for a machine learning model that expects a fixed input size.

  • Pay close attention to the target size, which you provide as a tuple in (width, height) format. This is the reverse of how NumPy arrays store dimensions (height, width).
  • The function returns a completely new image with the updated dimensions, leaving your original image untouched.

Converting between color spaces with cv2.cvtColor()

import cv2
import numpy as np

# Create a red square
img = np.zeros((200, 200, 3), dtype=np.uint8)
img[:, :, 2] = 255 # Red in BGR

# Convert to different color spaces
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
print(f"BGR red pixel: {img[0, 0]}")
print(f"Grayscale red value: {gray[0, 0]}")
print(f"HSV red value: {hsv[0, 0]}")

Output:
BGR red pixel: [0 0 255]
Grayscale red value: 76
HSV red value: [ 0 255 255]

The cv2.cvtColor() function is your tool for changing an image's color representation. While BGR is standard for display, other formats are often better suited for specific computer vision tasks. The code demonstrates converting a BGR image to both grayscale and HSV, two of the most common and useful transformations.

  • Grayscale (cv2.COLOR_BGR2GRAY): This conversion simplifies an image by removing color information, leaving only intensity values. It’s perfect for tasks like edge detection where color is just noise.
  • HSV (cv2.COLOR_BGR2HSV): This space separates color (Hue) from brightness (Value), which is incredibly useful for tracking objects based on their color, even in changing light conditions.

Drawing shapes with OpenCV functions

import cv2
import numpy as np

# Create a blank canvas
canvas = np.zeros((300, 300, 3), dtype=np.uint8)
# Draw shapes
cv2.line(canvas, (50, 50), (250, 250), (0, 255, 0), 2)
cv2.rectangle(canvas, (50, 50), (250, 150), (255, 0, 0), 3)
cv2.circle(canvas, (150, 150), 60, (0, 0, 255), -1)
print(f"Canvas shape: {canvas.shape}")
print(f"Pixel at circle center: {canvas[150, 150]}")

Output:
Canvas shape: (300, 300, 3)
Pixel at circle center: [0 0 255]

OpenCV lets you draw shapes directly onto an image, which is just a NumPy array. These functions modify the array in place, making them ideal for tasks like annotating images with bounding boxes or other visual markers.

  • Functions like cv2.line(), cv2.rectangle(), and cv2.circle() require coordinates, a BGR color tuple, and a thickness value.
  • The thickness parameter is especially useful. A positive integer sets the line width, while a value of -1 instructs the function to draw a filled shape.

Advanced OpenCV techniques

Building on those basic operations, you can now use more sophisticated methods to analyze an image's content, from applying filters to processing entire video streams.

Applying image filters and edge detection

import cv2
import numpy as np

# Create a sample image
img = np.zeros((200, 200), dtype=np.uint8)
img[50:150, 50:150] = 255 # White square

# Apply filters
blur = cv2.GaussianBlur(img, (15, 15), 0)
edges = cv2.Canny(img, 100, 200)
print(f"Center pixel in original: {img[100, 100]}")
print(f"Center pixel in blurred: {blur[100, 100]}")
print(f"Edge pixels detected: {np.count_nonzero(edges)}")

Output:
Center pixel in original: 255
Center pixel in blurred: 255
Edge pixels detected: 392

Image filters are essential for preprocessing images for analysis. The code demonstrates two key functions: cv2.GaussianBlur() for smoothing and cv2.Canny() for finding edges. These operations help you reduce noise and isolate important features within an image.

  • The cv2.GaussianBlur() function softens an image. It's a common first step to reduce unwanted noise before more complex analysis.
  • The cv2.Canny() algorithm is a powerful tool for identifying object boundaries by detecting sharp changes in pixel intensity, like the edges of the white square in the example.

Detecting features with cv2.goodFeaturesToTrack()

import cv2
import numpy as np

# Create a checkerboard pattern
img = np.zeros((200, 200), dtype=np.uint8)
img[0:100, 0:100] = 255
img[100:200, 100:200] = 255

# Detect corners
corners = cv2.goodFeaturesToTrack(img, 10, 0.1, 10)
print(f"Number of corners detected: {len(corners)}")
print(f"First corner position: {corners[0].ravel()}")

Output:
Number of corners detected: 4
First corner position: [ 100. 100.]

The cv2.goodFeaturesToTrack() function is your tool for finding the most prominent corners in an image. These corners are stable features, which makes them useful for tasks like tracking objects across video frames or aligning images.

  • You tell the function the maximum number of corners to find, a quality threshold to filter out weak ones, and the minimum distance between them.
  • In the example, it correctly identifies the four corners of the checkerboard pattern, returning an array of their coordinates.

Processing video with OpenCV

import cv2
import numpy as np

# Create a simulated video frame
frame = np.zeros((240, 320, 3), dtype=np.uint8)
frame[80:160, 120:200, 1] = 255 # Green rectangle

# Process the frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
_, thresh = cv2.threshold(blurred, 100, 255, cv2.THRESH_BINARY)
print(f"Frame shape: {frame.shape}")
print(f"Thresholded pixels: {np.count_nonzero(thresh)}")

Output:
Frame shape: (240, 320, 3)
Thresholded pixels: 6400

Processing video in OpenCV is the same as processing images, just applied sequentially to each frame. This example shows a common pipeline for preparing a single frame for analysis, which is a foundational skill for tasks like object tracking.

  • The process starts by converting the frame to grayscale and blurring it to reduce image noise.
  • Next, cv2.threshold() creates a binary image. Any pixel above a certain brightness becomes white, and all others become black. This is a simple yet powerful way to isolate objects of interest, like the rectangle in the frame.

Move faster with Replit

Replit is an AI-powered development platform that transforms natural language into working applications. Describe what you want to build, and Replit Agent creates it—complete with databases, APIs, and deployment.

For the computer vision techniques we've explored, Replit Agent can turn them into production tools:

  • Build a color isolation tool that lets users upload an image and extract all objects of a specific hue, perfect for design or analysis.
  • Create a simple image annotation utility that allows users to draw and save bounding boxes on images for creating custom datasets.
  • Deploy a real-time motion detection app that processes a video feed to highlight and log movement using frame differencing and thresholding.

Describe your app idea, and Replit Agent writes the code, tests it, and fixes issues automatically, all in your browser. Try Replit Agent to bring your computer vision projects to life.

Common errors and challenges

You'll inevitably encounter a few common errors when working with OpenCV, but most have straightforward solutions.

  • Fixing type errors when processing images with cv2.Canny()
  • Resolving image display issues with cv2.waitKey()
  • Troubleshooting odd-sized kernel errors in cv2.GaussianBlur()

Fixing type errors when processing images with cv2.Canny()

The cv2.Canny() function is designed for single-channel, 8-bit images, typically grayscale. Passing a three-channel color image directly raises an error in some OpenCV builds, and even in versions that accept it, the per-channel gradients can produce unexpected edges, because the algorithm is meant to operate on intensity values, not color. The code below shows this common mistake.

import cv2
import numpy as np

img = np.zeros((100, 100, 3), dtype=np.uint8)
img[25:75, 25:75] = [0, 0, 255] # Red square

# Try to apply edge detection directly to color image
edges = cv2.Canny(img, 100, 200)
cv2.imshow('Edges', edges)

This code is unreliable because cv2.Canny() is built for single-channel intensity data, but it received a three-channel color image. Depending on your OpenCV version, it either raises an error or computes gradients across channels in ways that are hard to predict. The corrected code below shows how to properly prepare the image before edge detection.

import cv2
import numpy as np

img = np.zeros((100, 100, 3), dtype=np.uint8)
img[25:75, 25:75] = [0, 0, 255] # Red square

# Convert to grayscale first
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 100, 200)
cv2.imshow('Edges', edges)

The fix is to convert the image to grayscale before passing it to cv2.Canny(). This function, like many others in OpenCV, requires a single-channel image because it operates on pixel intensity, not color information.

  • Always preprocess your image with cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) before attempting edge or feature detection to avoid this common error.

Resolving image display issues with cv2.waitKey()

A common frustration is having image windows appear and immediately vanish. This happens when you call cv2.imshow() without a follow-up cv2.waitKey(). The script doesn't pause to display the image, so it closes instantly. The code below shows this common mistake.

import cv2
import numpy as np

img1 = np.zeros((100, 100, 3), dtype=np.uint8)
img1[:, :, 0] = 255 # Blue image
cv2.imshow('First Image', img1)

img2 = np.zeros((100, 100, 3), dtype=np.uint8)
img2[:, :, 1] = 255 # Green image
cv2.imshow('Second Image', img2)
cv2.destroyAllWindows()

Because there’s no pause after the cv2.imshow() calls, the script races to cv2.destroyAllWindows() and closes everything instantly. You never get a chance to see the images. The corrected code below shows how to fix this.

import cv2
import numpy as np

img1 = np.zeros((100, 100, 3), dtype=np.uint8)
img1[:, :, 0] = 255 # Blue image
cv2.imshow('First Image', img1)
cv2.waitKey(1000) # Wait for 1 second

img2 = np.zeros((100, 100, 3), dtype=np.uint8)
img2[:, :, 1] = 255 # Green image
cv2.imshow('Second Image', img2)
cv2.waitKey(0) # Wait until a key is pressed
cv2.destroyAllWindows()

The fix is to call cv2.waitKey() after cv2.imshow(). This function pauses your script, giving the image window time to appear and stay open. Without it, the program finishes instantly, and you see nothing. It’s essential whenever you’re displaying images or video frames.

  • A positive argument like cv2.waitKey(1000) sets a delay in milliseconds.
  • An argument of 0 waits indefinitely for any key press.

Troubleshooting odd-sized kernel errors in cv2.GaussianBlur()

The cv2.GaussianBlur() function requires its kernel size—the area it analyzes—to be a tuple of odd numbers, like (5, 5). This ensures the filter has a clear center pixel. Using an even-sized kernel breaks this rule and triggers an error. The code below demonstrates this common mistake.

import cv2
import numpy as np

img = np.zeros((100, 100), dtype=np.uint8)
img[40:60, 40:60] = 255

# Apply Gaussian blur with even-sized kernel
blurred = cv2.GaussianBlur(img, (4, 4), 0)
cv2.imshow('Blurred', blurred)

The code fails because the kernel size (4, 4) is even, but cv2.GaussianBlur() requires odd dimensions to work correctly. The corrected code below shows the simple adjustment needed to fix this.

import cv2
import numpy as np

img = np.zeros((100, 100), dtype=np.uint8)
img[40:60, 40:60] = 255

# Apply Gaussian blur with odd-sized kernel
blurred = cv2.GaussianBlur(img, (5, 5), 0)
cv2.imshow('Blurred', blurred)

The fix is to provide an odd-numbered kernel size, like (5, 5), to cv2.GaussianBlur(). The function’s algorithm needs a clear center pixel to anchor the blurring calculation, and an even-sized kernel doesn't have one.

  • Keep this in mind for other convolution-based functions in OpenCV, as many share this requirement. It's a common source of errors when applying filters.

Real-world applications

With the fundamentals and common errors covered, you can now apply these techniques to practical tasks like face and text detection.

Detecting faces with cv2.CascadeClassifier()

The cv2.CascadeClassifier() function provides a straightforward way to find objects in an image, such as faces, by using pre-trained models called Haar cascades.

import cv2
import numpy as np

# Create a sample image with a face-like shape
img = np.zeros((300, 300, 3), dtype=np.uint8)
img[100:200, 100:200] = [200, 200, 200] # Face area

# Load face cascade and detect faces
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(gray, 1.1, 4)
print(f"Number of faces detected: {len(faces)}")

The code puts a pre-trained model to work for face detection. It begins by loading the haarcascade_frontalface_default.xml file, which contains the data needed to recognize faces.

  • The classifier requires a single-channel image, so you first convert the input to grayscale using cv2.cvtColor().
  • The core of the operation is detectMultiScale(), which scans the image and returns the coordinates of any detected faces.
  • Parameters like 1.1 (scale factor) and 4 (min neighbors) help you adjust the detection's sensitivity to avoid false positives.

Detecting text in documents with cv2.findContours()

The cv2.findContours() function is an effective way to isolate text blocks in a document by identifying the outlines of continuous shapes.

import cv2
import numpy as np

# Create a sample document with text-like blocks
doc = np.ones((300, 300), dtype=np.uint8) * 255
cv2.rectangle(doc, (50, 50), (250, 70), 0, -1) # Line of "text"
cv2.rectangle(doc, (50, 100), (200, 120), 0, -1) # Second line

# Find text blocks
_, binary = cv2.threshold(doc, 128, 255, cv2.THRESH_BINARY_INV)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
print(f"Text blocks detected: {len(contours)}")
print(f"First text block area: {cv2.contourArea(contours[0])}")

This code locates distinct regions in an image, like blocks of text. It first inverts the image using cv2.threshold, which turns the dark text areas into white shapes on a black background. This preparation is essential for the next step.

  • Next, cv2.findContours scans the image to trace the outlines of these white shapes.
  • Using cv2.RETR_EXTERNAL ensures it only finds the outermost boundaries, which cleanly separates each text block.
  • The function returns a list of contours, effectively isolating each region for further analysis.

Get started with Replit

Turn your knowledge into a real tool. Describe what you want to build, like “a web app that counts faces in an uploaded image” or “a utility that draws boxes around text in a document.”

Replit Agent writes the code, tests for errors, and deploys your app for you. Start building with Replit and bring your computer vision ideas to life.

Get started free

Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.
