How to count unique values in Python

Learn how to count unique values in Python. Explore different methods, tips, real-world applications, and common error debugging.

Published on: Tue, Mar 3, 2026
Updated on: Wed, Apr 1, 2026
The Replit Team

Counting unique values is a frequent need in Python for data analysis and validation. Python's built-in structures offer simple, powerful ways to accomplish this task efficiently.

In this article, we explore several techniques to get unique counts. We also cover practical tips, review real-world applications, and provide essential advice to debug and ensure your code produces accurate results.

Using set() to count unique values

numbers = [1, 2, 3, 1, 2, 4, 5, 4, 3, 2]
unique_count = len(set(numbers))
print(f"Number of unique values: {unique_count}")--OUTPUT--Number of unique values: 5

The set data structure is your most direct tool for this task because sets, by definition, only store unique items. When you convert a list to a set, Python automatically and efficiently filters out any repeated values, making it a highly readable solution.

The logic in len(set(numbers)) is a concise operation that unfolds in two simple steps:

  • First, set(numbers) creates a set from the list, which inherently discards all duplicate entries.
  • Then, len() is called on this new set to count the number of items it contains, giving you the final unique count.

Core techniques for counting unique values

While set() offers a quick count, other methods provide more detailed frequency information, including collections.Counter, dictionary comprehensions, and the pandas value_counts() method.

Using the collections.Counter class

from collections import Counter
items = ['apple', 'banana', 'apple', 'orange', 'banana', 'apple']
counter = Counter(items)
print(f"Unique items count: {len(counter)}")
print(f"Item frequencies: {dict(counter)}")--OUTPUT--Unique items count: 3
Item frequencies: {'apple': 3, 'banana': 2, 'orange': 1}

For more detailed insights, the collections.Counter class is your go-to tool. It’s a dictionary subclass designed for counting hashable objects, returning an object where items are keys and their frequencies are values. This gives you more than just a simple count.

  • The total number of unique items is found by calling len() on the Counter object.
  • You can access the full frequency map by converting the counter to a dictionary with dict(counter).
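Counter also exposes most_common(), which returns (item, count) pairs sorted from most to least frequent. A quick sketch using the same data:

```python
from collections import Counter

items = ['apple', 'banana', 'apple', 'orange', 'banana', 'apple']
counter = Counter(items)

# most_common(n) returns the n most frequent items as (item, count) pairs
top_two = counter.most_common(2)
print(top_two)  # [('apple', 3), ('banana', 2)]
```

This is handy when you care about the ranking, not just the counts, such as finding the top sellers in a sales log.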

Using a dictionary comprehension

data = ['dog', 'cat', 'bird', 'dog', 'fish', 'bird', 'dog']
freq_dict = {item: data.count(item) for item in set(data)}
print(f"Number of unique animals: {len(freq_dict)}")
print(f"Frequencies: {freq_dict}")--OUTPUT--Number of unique animals: 4
Frequencies: {'dog': 3, 'cat': 1, 'bird': 2, 'fish': 1}

A dictionary comprehension offers a concise, one-line approach to building a frequency map. This method combines iteration and dictionary creation into a single, readable expression that's very Pythonic.

  • First, set(data) extracts the unique items from your list, ensuring each item is processed only once.
  • The comprehension then iterates over these unique items, using data.count(item) to tally the occurrences in the original list and create a key-value pair.

While this is a very clear way to write the logic, it can be less efficient than Counter on large datasets because count() must scan the list repeatedly.

Using pandas value_counts()

import pandas as pd
grades = ['A', 'B', 'A', 'C', 'B', 'A', 'D', 'F', 'B']
series = pd.Series(grades)
value_counts = series.value_counts()
print(f"Unique grades count: {len(value_counts)}")
print(value_counts)

Output:

Unique grades count: 5
A 3
B 3
C 1
D 1
F 1
dtype: int64

For data analysis tasks, the pandas library is a powerhouse. The value_counts() method is specifically designed for frequency counting and is highly optimized for performance, especially on large datasets.

  • First, you convert your list into a pandas Series, which is a one-dimensional labeled array.
  • Calling value_counts() on the Series returns a new Series containing unique values as its index and their frequencies as its values.
  • The result is conveniently sorted, making it easy to see the most common items. The total unique count is just the length of this new Series.
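If you only need the number of distinct values, pandas also provides nunique(), which skips building the frequency table entirely. A quick sketch with the same grades:

```python
import pandas as pd

grades = ['A', 'B', 'A', 'C', 'B', 'A', 'D', 'F', 'B']
series = pd.Series(grades)

# nunique() returns the count of distinct values directly
print(series.nunique())  # 5
```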

Advanced techniques for counting unique elements

For situations demanding higher performance or a more functional approach, Python provides several powerful, specialized alternatives to the core methods.

Using numpy.unique with return_counts

import numpy as np
measurements = np.array([1.2, 2.3, 1.2, 4.5, 2.3, 6.7, 1.2])
unique_values, counts = np.unique(measurements, return_counts=True)
print(f"Unique values: {unique_values}")
print(f"Counts: {counts}")--OUTPUT--Unique values: [1.2 2.3 4.5 6.7]
Counts: [3 2 1 1]

For numerical data, NumPy offers a highly optimized solution. The numpy.unique function is designed for speed, especially with large arrays. It's a powerful tool when performance is critical.

  • Setting the return_counts parameter to True instructs the function to tally the occurrences of each unique value.
  • The function returns two separate NumPy arrays—one containing the sorted unique values and another holding their corresponding frequencies. This parallel structure makes it easy to map counts back to their values.
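Because the two returned arrays are parallel, they zip together into a plain dictionary when you want counts keyed by value. A quick sketch with the same data:

```python
import numpy as np

measurements = np.array([1.2, 2.3, 1.2, 4.5, 2.3, 6.7, 1.2])
unique_values, counts = np.unique(measurements, return_counts=True)

# Pair each unique value with its frequency in a plain dict
frequency_map = dict(zip(unique_values.tolist(), counts.tolist()))
print(frequency_map)  # {1.2: 3, 2.3: 2, 4.5: 1, 6.7: 1}
```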

Using sets with generator expressions

nested_data = [[1, 2], [3, 4], [1, 2], [5, 6], [3, 4], [7, 8]]
unique_tuples = len(set(tuple(item) for item in nested_data))
flattened_unique = len(set(num for sublist in nested_data for num in sublist))
print(f"Unique nested lists: {unique_tuples}")
print(f"Unique elements across all lists: {flattened_unique}")--OUTPUT--Unique nested lists: 4
Unique elements across all lists: 8

Generator expressions offer a memory-efficient way to handle complex data structures like nested lists. They create items one by one, feeding them directly into the set() without building an intermediate list in memory. This approach is particularly useful for large datasets where performance matters.

  • To count unique sublists, each list is converted into a tuple using tuple(item). Sets require hashable items—tuples are, but lists are not.
  • To count unique elements across all sublists, a nested generator expression like (num for sublist in nested_data for num in sublist) flattens the data, processing each number individually.

Using functional programming with reduce

from functools import reduce
data_stream = [10, 20, 30, 10, 40, 20, 50, 30, 60]
unique_count = len(reduce(lambda seen, x: seen | {x}, data_stream, set()))
print(f"Number of unique values using functional approach: {unique_count}")--OUTPUT--Number of unique values using functional approach: 6

The reduce function from the functools module offers a functional programming approach to this problem. It works by applying a function cumulatively to each item in a sequence, effectively "reducing" it to a single final value.

  • The process starts with an empty set() as an initial accumulator.
  • For each item in the list, the lambda function performs a set union using the | operator. This adds the item to the accumulator set while naturally discarding duplicates.
  • The final result is a set containing all unique items, and len() gives you the total count.
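For a flat list, the same count also comes from passing the sequence straight to set(), so the functional version is mainly useful as a pattern for more complex accumulations. A quick sketch comparing the two:

```python
from functools import reduce

data_stream = [10, 20, 30, 10, 40, 20, 50, 30, 60]

# Functional: fold each item into an accumulator set
functional = len(reduce(lambda seen, x: seen | {x}, data_stream, set()))

# Direct: let set() deduplicate in one call
direct = len(set(data_stream))

print(functional == direct)  # True
```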

Move faster with Replit

Replit is an AI-powered development platform where all Python dependencies come pre-installed, so you can skip setup and start coding instantly. This lets you move from learning individual functions like set() and collections.Counter to building complete applications faster. With Agent 4, you can describe what you want to build, and it will handle the code, databases, APIs, and deployment.

Instead of piecing techniques together, describe the app you want to build and Agent 4 will take it from idea to working product:

  • An inventory tracker that counts unique items from a sales log to monitor stock levels.
  • A data validation tool that scans a CSV file for duplicate customer IDs and reports the unique count.
  • A simple analytics dashboard that visualizes the most frequent user actions from a raw data stream.

Simply describe your app, and Replit will write the code, test it, and fix issues automatically, all within your browser.

Common errors and challenges

Even with straightforward methods, you can encounter tricky errors that require careful debugging and a solid understanding of Python's data structures.

Fixing TypeError when using set() with unhashable types

One of the most common issues is the TypeError: unhashable type. This error pops up when you try to add mutable (changeable) objects, like lists, into a set.

  • Sets require items to be "hashable," meaning they have a fixed value that never changes. Since lists can be modified, they can't be placed in a set.
  • To fix this, you can convert your mutable items into an immutable equivalent before adding them. For example, you can turn a list into a tuple, which is hashable and can be stored in a set.

Avoiding RuntimeError when modifying a set during iteration

You might also see a RuntimeError if you try to modify a set while iterating over it. Python prevents you from changing the size of a collection as you loop through it, as this can lead to unpredictable behavior.

  • This often happens if you try to remove items from a set inside a for loop that's iterating over that same set.
  • The best practice is to iterate over a copy of the set or create a new set to store the items you want to keep or discard, rather than modifying the original one in place.

Handling duplicate keys when using dictionaries for counting

When building a frequency map with a standard dictionary, you won't get an error for duplicate keys, but you can easily introduce logic flaws. Each key in a dictionary must be unique, so if you're not careful, you might overwrite counts or perform redundant calculations.

  • For instance, manually looping through a list and repeatedly calling .count() for each item is inefficient and error-prone.
  • This is precisely why tools like collections.Counter or building a dictionary from a set are recommended. They are designed to handle the logic of counting unique items correctly and efficiently from the start.

Fixing TypeError when using set() with unhashable types

A TypeError is a common roadblock when counting unique items. It happens because set() requires its elements to be "hashable," meaning their value can't change. Mutable objects like a list don't meet this requirement, causing an error.

When you try to add a list of lists directly into a set, Python raises this error. See what happens in the following code.

data = [[1, 2], [3, 4], [1, 2], [5, 6]]
unique_lists = set(data)
print(f"Number of unique lists: {len(unique_lists)}")

The set() function fails because it receives a list containing other lists. Since those inner lists are mutable, Python raises a TypeError. The fix is to convert them into a hashable format, as shown in the corrected example.

data = [[1, 2], [3, 4], [1, 2], [5, 6]]
unique_lists = set(tuple(item) for item in data)
print(f"Number of unique lists: {len(unique_lists)}")

The solution works because it converts each inner list into a tuple. Since tuples are immutable, they’re hashable and can be stored in a set(). This is done efficiently with a generator expression, (tuple(item) for item in data), which avoids creating an extra list in memory.

Keep an eye out for this TypeError whenever you work with nested data structures, especially when trying to count unique lists or dictionaries that contain mutable values.
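Dictionaries hit the same wall because they're mutable too. One workaround, assuming the dictionary values are themselves hashable, is to convert each dictionary into a frozenset of its key-value pairs before adding it to a set:

```python
records = [{'id': 1, 'name': 'a'}, {'id': 2, 'name': 'b'}, {'id': 1, 'name': 'a'}]

# frozenset(d.items()) is an immutable, hashable snapshot of a dict's pairs
unique_records = len(set(frozenset(d.items()) for d in records))
print(unique_records)  # 2
```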

Avoiding RuntimeError when modifying a set during iteration

Python throws a RuntimeError if you modify a set while iterating over it. This is a safeguard, as changing a collection's size mid-loop can cause the iterator to skip elements and produce unreliable results. The code below shows this in action.

numbers = {1, 2, 3, 4, 5}
for num in numbers:
    if num % 2 == 0:
        numbers.remove(num)
print(numbers)

The loop iterates over the numbers set while the if statement attempts to modify it with numbers.remove(num). This conflict causes the error. The corrected code below shows how to handle this safely.

numbers = {1, 2, 3, 4, 5}
numbers_to_remove = {num for num in numbers if num % 2 == 0}
result = numbers - numbers_to_remove
print(result)

The solution works by creating a separate set, numbers_to_remove, to hold the items slated for deletion. This avoids modifying the original numbers set while you're looping through it. It's done safely with a set comprehension.

Finally, the set difference operator (-) subtracts the unwanted elements, producing a new, clean result set. This is the standard and safest way to filter a set based on its own elements.

Handling duplicate keys when using dictionaries for counting

While a dictionary-based approach seems intuitive, it can be inefficient. A common mistake is to repeatedly call .count() inside a loop, forcing Python to scan the entire list for every single item. The code below demonstrates this redundant process.

data = ["apple", "banana", "apple", "orange"]
counts = {}
for item in data:
    counts[item] = data.count(item)
print(counts)

This loop is inefficient because it calls .count() for every item, even duplicates, forcing redundant work. The code below shows a more optimized approach that avoids this issue.

data = ["apple", "banana", "apple", "orange"]
counts = {}
for item in data:
    if item in counts:
        counts[item] += 1
    else:
        counts[item] = 1
print(counts)

The corrected code is far more efficient because it only passes through the list once. It checks if an item already exists as a key in the counts dictionary. If it does, the code simply increments the count using += 1. If not, it adds the new item and sets its count to 1. This single-pass method avoids the performance hit of repeatedly calling .count(), a common inefficiency when manually building frequency maps from large datasets.
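The same single-pass logic reads even more cleanly with collections.defaultdict, which starts every missing key at zero and removes the explicit membership check:

```python
from collections import defaultdict

data = ["apple", "banana", "apple", "orange"]

# defaultdict(int) initializes absent keys to 0, so we can increment directly
counts = defaultdict(int)
for item in data:
    counts[item] += 1

print(dict(counts))  # {'apple': 2, 'banana': 1, 'orange': 1}
```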

Real-world applications

With a firm grasp on these methods and their potential pitfalls, you can apply unique value counting to solve practical data challenges.

Extracting unique email domains from customer data

Imagine you have a large list of customer emails and want to analyze which email providers are most common. By iterating through the list, splitting each email at the @ symbol, and collecting the domains, you can create a new list of just the domain names.

Applying set() to this list of domains instantly filters it down to the unique providers. You can then use len() to get a precise count, giving you valuable data for market analysis or for spotting unusual or misspelled domains in your database.

Finding common elements between multiple datasets using set operations

Set operations are perfect for comparing lists. For example, if you have two lists of user IDs—one from users who opened a promotional email and another from users who visited your pricing page—you can find the overlap to identify highly engaged leads.

By converting both lists into sets, you can use the intersection operator (&) or the intersection() method to create a new set containing only the user IDs present in both. The length of this final set tells you exactly how many users performed both actions, offering a clear metric for campaign effectiveness.

Extracting unique email domains from customer data

You can efficiently analyze email domains by using a list comprehension to split() each email and then using set() to collect the unique results.

emails = ["[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]"]
domains = [email.split('@')[1] for email in emails]
unique_domains = set(domains)
print(f"Unique domains: {unique_domains}")
print(f"Number of unique domains: {len(unique_domains)}")

This code snippet efficiently isolates and counts unique email domains from a list. It’s a two-step process that combines data transformation with Python’s built-in data structures.

  • First, a list comprehension builds a new list called domains. It iterates through each email, using split('@')[1] to extract only the text that comes after the @ symbol.
  • Next, set(domains) converts that list of domains into a set. Because sets can only contain unique items, this conversion automatically removes all duplicates, giving you a clean collection of unique domains.
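Extending the idea, running collections.Counter over the domain list ranks the providers by frequency, which is the market-analysis view described above. A quick sketch with hypothetical sample addresses:

```python
from collections import Counter

# Hypothetical sample addresses for illustration
emails = ["ana@gmail.com", "ben@yahoo.com", "cara@gmail.com",
          "dan@outlook.com", "eve@gmail.com"]

# Extract the part after the @ symbol from each address
domains = [email.split('@')[1] for email in emails]

# Counter ranks the providers from most to least common
provider_counts = Counter(domains)
print(provider_counts.most_common())  # [('gmail.com', 3), ('yahoo.com', 1), ('outlook.com', 1)]
```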

Finding common elements between multiple datasets using set operations

You can use set operations to quickly compare multiple lists and find common elements, such as customers who appear across several datasets, with simple operators like intersection (&) and union (|).

customer_set_a = {101, 102, 103, 104, 105}
customer_set_b = {103, 104, 105, 106, 107}
customer_set_c = {105, 106, 107, 108, 109}

customers_in_all_sets = customer_set_a & customer_set_b & customer_set_c
customers_in_any_set = customer_set_a | customer_set_b | customer_set_c
customers_in_exactly_two_sets = ((customer_set_a & customer_set_b) |
                                 (customer_set_b & customer_set_c) |
                                 (customer_set_a & customer_set_c)) - customers_in_all_sets

print(f"Customers in all three sets: {customers_in_all_sets}")
print(f"Unique customers across all sets: {len(customers_in_any_set)}")
print(f"Customers in exactly two sets: {customers_in_exactly_two_sets}")

This code uses Python's sets to perform powerful data segmentation without loops. It starts with three distinct groups of customer IDs.

  • The first operation, customer_set_a & customer_set_b & customer_set_c, isolates the single customer ID that appears in every group.
  • The second operation calculates the total count of distinct customers by merging all three sets.
  • Finally, it identifies customers appearing in exactly two sets by first finding all IDs shared between any two groups and then removing the ID that was common to all three.

Get started with Replit

Turn your knowledge into a real tool. Tell Replit Agent to "build a dashboard that counts unique visitor IPs from a log file" or "create a script that finds duplicate entries in a spreadsheet."

Replit Agent writes the necessary code, tests for errors, and handles deployment. Start building with Replit.

Build your first app today

Describe what you want to build, and Replit Agent writes the code, handles the infrastructure, and ships it live. Go from idea to real product, all in your browser.

Get started for free

Create & deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.