How to count unique values in Python
Learn how to count unique values in Python. Discover different methods, tips, real-world applications, and how to debug common errors.

Finding the number of unique values in Python is a frequent data analysis task. This process helps you understand data distribution and identify distinct elements.
In this article, you'll explore several techniques to get this count, from simple set() conversions to more advanced library functions. You'll also find practical tips, real-world applications, and advice to debug common errors.
Using set() to count unique values
numbers = [1, 2, 3, 1, 2, 4, 5, 4, 3, 2]
unique_count = len(set(numbers))
print(f"Number of unique values: {unique_count}")

Output:
Number of unique values: 5
The magic here lies in the set() data structure. By definition, a set can only contain unique elements. When you pass the numbers list to set(), Python automatically discards any duplicates, creating a new collection with just the distinct values: 1, 2, 3, 4, and 5.
After the duplicates are gone, the len() function simply counts the number of items remaining in the set. This two-step process—converting to a set and then getting its length—is a highly efficient and Pythonic way to find the count of unique values in any iterable.
Core techniques for counting unique values
While set() is effective for a simple count, Python also provides more powerful tools like collections.Counter, dictionary comprehensions, and pandas' value_counts() for deeper insights.
Using the collections.Counter class
from collections import Counter
items = ['apple', 'banana', 'apple', 'orange', 'banana', 'apple']
counter = Counter(items)
print(f"Unique items count: {len(counter)}")
print(f"Item frequencies: {dict(counter)}")

Output:
Unique items count: 3
Item frequencies: {'apple': 3, 'banana': 2, 'orange': 1}
The collections.Counter class is a specialized dictionary that’s perfect for tallying hashable objects. It doesn’t just count unique items; it also tracks how many times each one appears. When you pass a list to Counter, it returns a dictionary-like object where keys are the unique elements and values are their frequencies.
- You can find the number of unique items by calling len() on the Counter object.
- To view the frequency of each item, simply convert the Counter to a dictionary with dict().
Using a dictionary comprehension
data = ['dog', 'cat', 'bird', 'dog', 'fish', 'bird', 'dog']
freq_dict = {item: data.count(item) for item in set(data)}
print(f"Number of unique animals: {len(freq_dict)}")
print(f"Frequencies: {freq_dict}")

Output:
Number of unique animals: 4
Frequencies: {'dog': 3, 'cat': 1, 'bird': 2, 'fish': 1}
A dictionary comprehension is a concise, one-line approach to building a frequency map. The expression first gets the unique elements by converting the list to a set with set(data). It then iterates over each unique item.
- For every item, it counts its appearances in the original list using data.count(item).
This process creates a dictionary where keys are the unique items and values are their counts. The total number of unique items is simply the length of this new dictionary, found with len().
Using pandas value_counts()
import pandas as pd
grades = ['A', 'B', 'A', 'C', 'B', 'A', 'D', 'F', 'B']
series = pd.Series(grades)
value_counts = series.value_counts()
print(f"Unique grades count: {len(value_counts)}")
print(value_counts)

Output:
Unique grades count: 5
A    3
B    3
C    1
D    1
F    1
dtype: int64
For serious data work, pandas is a powerhouse, and its value_counts() method is a prime example of its efficiency. The method is designed to work on a pandas Series—a one-dimensional array that you can create from a standard Python list.
- The value_counts() method automatically tallies the occurrences of each unique item.
- It returns a new Series where the unique items are the index and their counts are the values, conveniently sorted in descending order.
To get the total number of unique items, you just find the length of the resulting Series with len().
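If you only need the count and not the full tally, pandas also offers the Series.nunique() method, which returns the number of distinct values directly (note that it excludes NaN by default):

```python
import pandas as pd

grades = ['A', 'B', 'A', 'C', 'B', 'A', 'D', 'F', 'B']
series = pd.Series(grades)

# nunique() skips the intermediate value_counts() step
print(f"Unique grades count: {series.nunique()}")
```

Output:
Unique grades count: 5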
Advanced techniques for counting unique elements
Building on the core methods, you can also tap into more specialized techniques for better performance, memory management, or functional programming patterns.
Using numpy.unique with return_counts
import numpy as np
measurements = np.array([1.2, 2.3, 1.2, 4.5, 2.3, 6.7, 1.2])
unique_values, counts = np.unique(measurements, return_counts=True)
print(f"Unique values: {unique_values}")
print(f"Counts: {counts}")

Output:
Unique values: [1.2 2.3 4.5 6.7]
Counts: [3 2 1 1]
The NumPy library is a go-to for numerical operations, and its numpy.unique function is highly optimized for its arrays. It's a powerful tool when you need both the unique elements and their frequencies in one go. This approach is particularly efficient for large numerical datasets.
- By setting the return_counts parameter to True, the function returns two separate arrays.
- The first array holds the sorted unique values, while the second contains the corresponding count for each value.
Using sets with generator expressions
nested_data = [[1, 2], [3, 4], [1, 2], [5, 6], [3, 4], [7, 8]]
unique_tuples = len(set(tuple(item) for item in nested_data))
flattened_unique = len(set(num for sublist in nested_data for num in sublist))
print(f"Unique nested lists: {unique_tuples}")
print(f"Unique elements across all lists: {flattened_unique}")--OUTPUT--Unique nested lists: 4
Unique elements across all lists: 8
Generator expressions offer a memory-efficient way to handle data, as they produce items one at a time. This is perfect for working with sets, which can't contain mutable objects like lists. You must first convert the lists to an immutable type.
- To count unique nested lists, the generator (tuple(item) for item in nested_data) converts each list into a tuple. This allows set() to correctly identify and count the unique pairs.
- To count unique numbers across all lists, a nested generator flattens the data structure. The expression (num for sublist in nested_data for num in sublist) iterates through each number, letting set() find the distinct values.
Using functional programming with reduce
from functools import reduce
data_stream = [10, 20, 30, 10, 40, 20, 50, 30, 60]
unique_count = len(reduce(lambda seen, x: seen | {x}, data_stream, set()))
print(f"Number of unique values using functional approach: {unique_count}")

Output:
Number of unique values using functional approach: 6
The reduce function offers a concise, functional way to process a sequence. It cumulatively applies a function to the items, boiling them down to a single result.
- The process starts with an empty set() as an accumulator.
- For each number in the list, a lambda function performs a set union using the | operator, adding the number to the accumulator.
- Because sets inherently discard duplicates, the accumulator grows to contain only unique values.
Finally, len() counts the items in the resulting set to give you the total.
Move faster with Replit
Replit is an AI-powered development platform that transforms natural language into working applications. You can describe what you want to build, and Replit Agent creates it—complete with databases, APIs, and deployment.
For the unique value counting techniques we've explored, Replit Agent can turn them into production-ready tools.
- Build a real-time dashboard that tracks unique website visitors or product interactions using methods like pandas.value_counts().
- Create a data validation utility that identifies and reports the number of unique entries in a dataset, leveraging collections.Counter to also show frequencies.
- Deploy a log analyzer that processes streaming data to count unique error codes or user IDs, using memory-efficient techniques.
Describe your app idea, and Replit Agent writes the code, tests it, and deploys it automatically. Try Replit Agent and turn your concept into a working application.
Common errors and challenges
Even with powerful tools, you might run into a few common roadblocks when counting unique values in your data.
Fixing TypeError when using set() with unhashable types
You'll hit a TypeError: unhashable type if you try to add a mutable object, like a list or another dictionary, to a set. Sets require their elements to be "hashable," meaning they must be immutable and have a constant value over their lifetime. Since lists can be changed, Python can't create a reliable hash for them.
The fix is to convert mutable objects into an immutable equivalent before adding them to the set. For instance, you can turn a list into a tuple. Because tuples cannot be changed after creation, they are hashable and work perfectly with set().
Avoiding RuntimeError when modifying a set during iteration
A RuntimeError: Set changed size during iteration occurs if you try to add or remove elements from a set while looping over it. Python raises this error to prevent unpredictable behavior, as changing the collection's size can disrupt the loop's internal state.
To safely modify a set, you should iterate over a copy of it. You can create a shallow copy by using my_set.copy(). This allows you to loop through the original, unchanged version while safely making modifications to the actual set.
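A minimal sketch of the copy-and-iterate pattern, removing even numbers from a set:

```python
numbers = {1, 2, 3, 4, 5}

# iterate over a snapshot so the original set can be changed mid-loop
for num in numbers.copy():
    if num % 2 == 0:
        numbers.remove(num)

print(numbers)  # {1, 3, 5}
```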
Handling duplicate keys when using dictionaries for counting
When you manually build a frequency dictionary, it's easy to accidentally overwrite your counts. If you simply assign a value like counts[key] = 1 inside a loop, you'll reset the count for that key every time it appears instead of incrementing it.
To count correctly, you must first check if a key already exists before updating its value. A more Pythonic way is to use the get() method with a default value, like counts[key] = counts.get(key, 0) + 1. This one-liner fetches the current count or starts from zero if the key is new, preventing data loss.
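Here's a short sketch of the get() pattern on a small sample list:

```python
data = ["apple", "banana", "apple"]
counts = {}
for item in data:
    # fetch the current count, defaulting to 0 for keys seen for the first time
    counts[item] = counts.get(item, 0) + 1

print(counts)  # {'apple': 2, 'banana': 1}
```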
Fixing TypeError when using set() with unhashable types
This error commonly surfaces when you're working with nested data. Sets demand immutable elements, so they reject mutable objects like lists. Passing a list of lists directly to set() will trigger this specific TypeError. The code below demonstrates this.
data = [[1, 2], [3, 4], [1, 2], [5, 6]]
unique_lists = set(data)
print(f"Number of unique lists: {len(unique_lists)}")
The set() constructor iterates through data, but each element is a list. Because lists are mutable and therefore unhashable, Python raises a TypeError. The following example demonstrates the correct approach to solve this.
data = [[1, 2], [3, 4], [1, 2], [5, 6]]
unique_lists = set(tuple(item) for item in data)
print(f"Number of unique lists: {len(unique_lists)}")
The solution is to convert each nested list into a tuple. A generator expression, (tuple(item) for item in data), iterates through your data and transforms each list into an immutable tuple on the fly.
- Since tuples are hashable, set() can now process them without error.
This lets you correctly identify and count only the unique items. It's a common pattern when working with lists of lists or other complex data structures.
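The same idea extends to other unhashable types. Assuming each dictionary's values are themselves hashable, you can represent a dictionary as a tuple of sorted key-value pairs:

```python
records = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 1, "name": "a"}]

# sorting the items gives every equal dict the same hashable representation
unique_records = {tuple(sorted(d.items())) for d in records}
print(f"Unique records: {len(unique_records)}")  # Unique records: 2
```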
Avoiding RuntimeError when modifying a set during iteration
Modifying a set while you're looping over it is a recipe for a RuntimeError. Python stops the operation because changing the set's size during iteration can lead to skipped elements or other strange bugs. The code below shows exactly how this happens.
numbers = {1, 2, 3, 4, 5}
for num in numbers:
    if num % 2 == 0:
        numbers.remove(num)
print(numbers)
The for loop iterates directly over the numbers set. When the if condition finds an even number, numbers.remove(num) tries to alter the set while it's still being looped over, causing the error. See the correct implementation below.
numbers = {1, 2, 3, 4, 5}
numbers_to_remove = {num for num in numbers if num % 2 == 0}
result = numbers - numbers_to_remove
print(result)
The safe approach is to separate identification from removal. This avoids modifying the set you're iterating over, which prevents the error.
- First, a set comprehension builds a new set, numbers_to_remove, that holds only the items you want to delete.
- Then, you use the set difference operator (-) to subtract this new set from the original, producing the final result.
This pattern is crucial whenever you need to filter a collection based on its own elements.
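When you only need the filtered result, a set comprehension can build the kept elements directly, sidestepping mutation entirely:

```python
numbers = {1, 2, 3, 4, 5}

# build the surviving elements in one pass; the original set is untouched
odds = {num for num in numbers if num % 2 != 0}
print(odds)  # {1, 3, 5}
```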
Handling duplicate keys when using dictionaries for counting
A common pitfall when you're creating a frequency map is using an inefficient method that repeatedly recalculates counts. While the final numbers might look right, the approach is slow and redundant, especially with large datasets. The following code demonstrates this exact problem.
data = ["apple", "banana", "apple", "orange"]
counts = {}
for item in data:
    counts[item] = data.count(item)
print(counts)
The issue is that data.count(item) runs for every element, including duplicates. For "apple," it scans the list twice. This redundant work adds up quickly on large datasets. The following code demonstrates a more optimized approach.
data = ["apple", "banana", "apple", "orange"]
counts = {}
for item in data:
    if item in counts:
        counts[item] += 1
    else:
        counts[item] = 1
print(counts)
This efficient method builds the frequency map in one go. As you loop through the data, you check if an item is already a key in your counts dictionary.
- If the key exists, you simply increment its value using counts[item] += 1.
- If it's a new item, you initialize it with a count of one.
This single-pass technique is much faster for large datasets because it avoids repeatedly scanning the entire list for each item.
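As a variant, collections.defaultdict(int) removes the explicit membership check by initializing missing keys to zero automatically:

```python
from collections import defaultdict

data = ["apple", "banana", "apple", "orange"]
counts = defaultdict(int)
for item in data:
    counts[item] += 1  # a missing key starts at 0, so no if/else is needed

print(dict(counts))  # {'apple': 2, 'banana': 1, 'orange': 1}
```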
Real-world applications
These counting methods are fundamental for solving practical data problems, from cleaning customer lists to finding shared insights across datasets.
Extracting unique email domains from customer data
Analyzing a customer list to see which email providers are most common is a straightforward task: split each email to extract its domain, then use a set to find the unique entries.
emails = ["[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]"]
domains = [email.split('@')[1] for email in emails]
unique_domains = set(domains)
print(f"Unique domains: {unique_domains}")
print(f"Number of unique domains: {len(unique_domains)}")
This code efficiently isolates unique email domains from a list. It starts with a list comprehension that processes each email, using the split('@') method to grab the domain name after the "@" symbol. This gives you a new list containing just the domains, including duplicates. From there, the process is simple:
- You pass the new list of domains to set(), which automatically filters out any repeated entries.
- Finally, you call len() on the set to count the remaining unique items.
Finding common elements between multiple datasets using set operations
When comparing multiple datasets, set operations like intersection (&) and union (|) provide a fast and readable way to find shared or unique elements.
customer_set_a = {101, 102, 103, 104, 105}
customer_set_b = {103, 104, 105, 106, 107}
customer_set_c = {105, 106, 107, 108, 109}
customers_in_all_sets = customer_set_a & customer_set_b & customer_set_c
customers_in_any_set = customer_set_a | customer_set_b | customer_set_c
customers_in_exactly_two_sets = ((customer_set_a & customer_set_b) |
                                 (customer_set_b & customer_set_c) |
                                 (customer_set_a & customer_set_c)) - customers_in_all_sets
print(f"Customers in all three sets: {customers_in_all_sets}")
print(f"Unique customers across all sets: {len(customers_in_any_set)}")
print(f"Customers in exactly two sets: {customers_in_exactly_two_sets}")
This code leverages set logic to segment customer data across three groups. It efficiently calculates complex relationships without manual loops.
- The intersection operator (&) identifies customers present in all three sets.
- The union operator (|) creates a single set of all unique customers across the groups.
- The final line isolates customers in exactly two sets by first finding everyone in at least two sets, then subtracting those who are in all three using the difference operator (-).
Get started with Replit
Put these techniques into practice and build a real tool. Tell Replit Agent: “Build a tool that accepts a list of emails and shows the unique domains and their counts,” or “Create a log analyzer that counts unique error codes.”
Replit Agent writes the code, tests for errors, and deploys your app. Start building with Replit and bring your idea to life.
Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.