How to append a DataFrame in Python
Learn to append a DataFrame in Python. This guide covers various methods, tips, real-world applications, and common error debugging.

You can append dataframes in Python to combine datasets. This common operation adds new rows to an existing dataframe, a key step for data analysis and manipulation.
In this article, we'll cover several techniques to append dataframes. We'll also share practical tips, explore real-world applications, and offer advice to help you debug common issues you might encounter along the way.
Using pd.concat() for basic appending
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})
result = pd.concat([df1, df2])
print(result)

Output:

   A  B
0  1  3
1  2  4
0  5  7
1  6  8
The pd.concat() function is the most direct way to append DataFrames. It takes an iterable—in this case, a list [df1, df2]—and joins the DataFrames along an axis. By default, it stacks them vertically, which is exactly what you need for appending rows.
Take a look at the index in the output. You'll notice it repeats the sequence 0, 1. This happens because pd.concat() preserves the original index labels from each DataFrame. It's important to be aware of this default behavior, as duplicate indices can cause unexpected results in later data manipulations.
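If duplicate labels would be a bug in your pipeline, pd.concat() can catch them at the source. The verify_integrity argument (a standard pd.concat() parameter, not used elsewhere in this article) raises a ValueError instead of silently keeping overlapping labels:

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})

# verify_integrity=True makes pd.concat() raise a ValueError when
# the inputs share index labels, instead of silently keeping duplicates
try:
    result = pd.concat([df1, df2], verify_integrity=True)
except ValueError as err:
    print(f"Duplicate labels detected: {err}")
```

This is a useful guard in pipelines where you expect inputs to have non-overlapping indices and want to fail fast when they don't.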
Basic DataFrame appending techniques
Building on the basics of pd.concat(), you can also use the DataFrame.append() method, concatenate multiple DataFrames, or even append a single Series.
Using the DataFrame.append() method
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})
result = df1.append(df2)  # Deprecated; removed entirely in pandas 2.0
print(result)

Output:

   A  B
0  1  3
1  2  4
0  5  7
1  6  8
The DataFrame.append() method offers a more object-oriented approach. You call it directly on one DataFrame and pass the one you want to add. While the syntax is slightly different, it produces the exact same result as the basic pd.concat() function, right down to preserving the original index.
However, it's crucial to know that DataFrame.append() was deprecated in pandas 1.4 and removed entirely in pandas 2.0, so the code above only runs on older versions. For new projects and future compatibility, use pd.concat() instead.
Concatenating multiple DataFrames with pd.concat()
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})
df3 = pd.DataFrame({'A': [9, 10], 'B': [11, 12]})
result = pd.concat([df1, df2, df3])
print(result)

Output:

    A   B
0   1   3
1   2   4
0   5   7
1   6   8
0   9  11
1  10  12
The real power of pd.concat() is its flexibility. You aren't limited to just two DataFrames. You can easily scale the operation by passing a list containing any number of DataFrames you need to combine, like [df1, df2, df3].
- The function stacks them vertically in the order they appear in the list.
- This makes it an efficient tool for aggregating data from multiple sources at once.
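When you aggregate from multiple sources, it often helps to remember where each row came from. pd.concat() accepts a keys argument (a standard parameter, though not used in this article's examples) that labels each input with an outer MultiIndex level. A minimal sketch:

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})

# keys= labels each input, producing a MultiIndex whose outer
# level records which DataFrame every row came from
result = pd.concat([df1, df2], keys=['first', 'second'])
print(result)

# The outer label lets you recover one source's rows later
print(result.loc['second'])
```

The label names 'first' and 'second' are arbitrary placeholders; in practice you might use filenames or source identifiers.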
Appending a Series to a DataFrame
import pandas as pd
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
series = pd.Series({'A': 5, 'B': 6})
result = pd.concat([df, pd.DataFrame([series])])
print(result)

Output:

   A  B
0  1  3
1  2  4
0  5  6
Appending a single row, structured as a pandas Series, requires a small but important extra step. Passing the Series to pd.concat() directly won't give you a new row, because pandas treats a Series as a column rather than a row. Converting it into a single-row DataFrame first gives it the right two-dimensional structure for joining.
- The conversion is simple: wrap your Series in a list and pass it to the constructor, like pd.DataFrame([series]).
- After that, pd.concat() treats it just like any other DataFrame, stacking it neatly at the end of your original data.
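An equivalent conversion, if you prefer method chaining, is Series.to_frame().T: to_frame() turns the Series into a one-column DataFrame, and transposing flips it into the one-row shape pd.concat() expects. A small sketch:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
series = pd.Series({'A': 5, 'B': 6})

# to_frame() makes a one-column DataFrame; .T transposes it
# into the one-row shape that pd.concat() expects
row = series.to_frame().T
result = pd.concat([df, row], ignore_index=True)
print(result)
```

Both conversions produce the same result, so the choice is purely stylistic.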
Advanced DataFrame appending techniques
Beyond basic appending, you'll often need to manage mismatched columns, create a clean index with ignore_index=True, and optimize performance using list comprehensions.
Handling different column sets when appending
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'C': [7, 8]})
result = pd.concat([df1, df2], sort=False)
print(result)

Output:

   A    B    C
0  1  3.0  NaN
1  2  4.0  NaN
0  5  NaN  7.0
1  6  NaN  8.0
It's common for datasets to have different columns. When you use pd.concat(), it handles this by creating a union of all columns from the input DataFrames.
- For any column that doesn't exist in one of the original DataFrames, pandas fills the missing spots with NaN (Not a Number).
- The sort=False argument is included to maintain the column order as it appears in the original DataFrames, preventing an automatic alphabetical sort.
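If you'd rather avoid NaN values entirely, pd.concat() also supports join='inner' (a standard parameter, not shown in the article's examples), which keeps only the columns shared by every input:

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'C': [7, 8]})

# join='inner' keeps only the columns common to all inputs,
# so no NaN values are introduced
result = pd.concat([df1, df2], join='inner', ignore_index=True)
print(result)
```

The trade-off is that columns B and C are dropped entirely, so use this only when the non-shared columns aren't needed downstream.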
Resetting index with ignore_index=True
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})
result = pd.concat([df1, df2], ignore_index=True)
print(result)

Output:

   A  B
0  1  3
1  2  4
2  5  7
3  6  8
To avoid the duplicate index issue mentioned earlier, you can set the ignore_index=True argument. This tells pd.concat() to discard the original indices from each DataFrame. Instead, it generates a completely new, continuous index for the combined result, starting from 0.
- This ensures every row gets a unique label.
- It’s a simple way to create a clean, ready-to-use DataFrame, preventing potential errors in future operations.
Efficient appending with list comprehension
import pandas as pd
dataframes = [
    pd.DataFrame({'A': [i, i+1], 'B': [i+2, i+3]})
    for i in range(1, 6, 2)
]
result = pd.concat(dataframes, ignore_index=True)
print(result)

Output:

   A  B
0  1  3
1  2  4
2  3  5
3  4  6
4  5  7
5  6  8
For creating and combining multiple DataFrames, a list comprehension is a highly efficient Python feature. The code generates a list of DataFrames in one go, rather than appending them one by one in a traditional loop. This method is significantly more performant.
- It avoids the overhead of creating intermediate DataFrames with each append operation.
- You then pass the entire list to pd.concat() just once to get your final, combined DataFrame, making your code both cleaner and faster.
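The same advice applies when rows arrive one at a time, for example from an API response or a file parser: collect them as plain dicts in a list and build the DataFrame once at the end, rather than concatenating inside the loop. A minimal sketch of this pattern:

```python
import pandas as pd

# Collect rows as plain dicts; build the DataFrame once at the end.
# This avoids the quadratic cost of concatenating inside the loop.
rows = []
for i in range(1, 4):
    rows.append({'A': i, 'B': i * 2})

result = pd.DataFrame(rows)
print(result)
```

Each pd.concat() call copies all the data it receives, so calling it once per iteration makes the total work grow quadratically with the number of rows; collecting first keeps it linear.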
Move faster with Replit
Replit is an AI-powered development platform that transforms natural language into working applications. You can describe what you want to build, and Replit Agent creates it—complete with databases, APIs, and deployment.
The dataframe appending techniques from this article can be turned into production-ready tools. Replit Agent can take your concept and build a complete application that leverages functions like pd.concat() to manage and aggregate data.
- Build a log analyzer that combines daily log files from multiple sources into a single, unified DataFrame for troubleshooting.
- Create a financial dashboard that appends monthly sales reports to track year-over-year performance.
- Deploy a data ingestion service that continuously adds new records from a live data stream to a master dataset.
You can turn your own data manipulation ideas into fully functional applications. Try Replit Agent and watch it write, test, and deploy your code automatically.
Common errors and challenges
Appending dataframes can introduce a few common issues, but understanding them makes troubleshooting much easier.
Fixing index duplication with ignore_index=True
Duplicate index labels are a frequent source of trouble after appending. If you find yourself with a messy index, the quickest solution is to use the ignore_index=True argument within pd.concat(). This creates a fresh, continuous index for the new dataframe, preventing potential conflicts in later steps.
Handling missing values after concatenation
When you combine dataframes that don't share the exact same columns, pandas introduces NaN (Not a Number) values to fill the gaps. This isn't an error, but it's something you need to manage.
- You can handle these missing values by replacing them using the fillna() method.
- Alternatively, you can remove rows or columns containing NaN values with dropna().
Fixing TypeError when concatenating non-DataFrame objects
A TypeError often occurs if you pass objects that aren't dataframes or series to pd.concat(). The function expects an iterable—usually a list—of dataframes, like [df1, df2].
Forgetting the square brackets and trying to pass dataframes directly, as in pd.concat(df1, df2), is a common mistake that triggers this error. Passing a bare Series, on the other hand, won't raise a TypeError, but it produces a misaligned result: pandas stacks the Series values as a column rather than a row, which is why converting it to a one-row dataframe first matters.
Fixing index duplication with ignore_index=True
When you append DataFrames without resetting the index, you'll often end up with duplicate labels. This isn't just messy—it creates ambiguity. If you try to select data using a label that appears more than once, pandas may not behave as you expect.
This is especially true for label-based selection with .loc[]. The code below shows what happens when you try to access data from a DataFrame that has a duplicated index, which can lead to subtle bugs in your analysis.
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})
# Concatenating without handling indexes
result = pd.concat([df1, df2])
print(result)
print(f"Values at index 0:\n{result.loc[0]}") # Ambiguous - returns both rows labeled 0
Because both DataFrames start with index 0, the combined result has duplicate labels. Accessing result.loc[0] then returns a DataFrame containing both rows labeled 0, not a single row as you might expect. The code below demonstrates the fix.
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})
# Reset index to avoid duplication
result = pd.concat([df1, df2], ignore_index=True)
print(result)
print(f"Values at index 0: {result.loc[0]}")
The fix is simple: set the ignore_index=True argument in pd.concat(). This tells pandas to discard the original indices and generate a new, continuous one. As a result, each row gets a unique label starting from 0, and you'll avoid any ambiguity.
Now, when you call result.loc[0], it returns the single, correct row. It's a small change that ensures your selections are predictable and helps prevent subtle bugs in your data analysis.
Handling missing values after concatenation
When you combine DataFrames that don't share the same columns, pandas fills the gaps with NaN values. These aren't just placeholders; they can disrupt calculations like sums or averages, leading to unexpected results. The code below shows how these values appear.
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'C': [7, 8]})
# This creates NaN values and may cause issues in calculations
result = pd.concat([df1, df2])
print(result)
total_b = result['B'].sum()
print(f"Sum of column B: {total_b}") # Includes NaN values
The code combines DataFrames with different columns, causing pd.concat() to insert NaN values. When sum() is called on column B, the missing values are simply skipped, which can silently understate the total if you expected every row to contribute. See how to handle this below.
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'C': [7, 8]})
# Fill missing values with a default value
result = pd.concat([df1, df2]).fillna(0)
print(result)
total_b = result['B'].sum()
print(f"Sum of column B: {total_b}")
The fix is to chain the fillna(0) method directly after pd.concat(). This replaces every NaN value with 0, ensuring your calculations are accurate. Now, when you call sum() on column B, the result is correct because the missing values no longer skew the total.
This is a crucial step whenever you're merging datasets from different sources, as column mismatches are common.
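If substituting 0 would distort your data, say, when computing an average rather than a total, dropping the incomplete rows is a reasonable alternative. The subset argument of dropna() limits the check to the column that matters:

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'C': [7, 8]})

# Drop only the rows missing a value in column B, rather than
# substituting a default that could distort averages
result = pd.concat([df1, df2], ignore_index=True).dropna(subset=['B'])
print(result)
print(f"Sum of column B: {result['B'].sum()}")
```

Whether to fill or drop depends on what the missing values mean in your dataset; neither is universally correct.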
Fixing TypeError when concatenating non-DataFrame objects
A TypeError is a common roadblock when you try to pass an object that isn't a DataFrame or Series to pd.concat(). The function expects a list of compatible pandas objects, so mixing in other types, like a raw NumPy array, will fail.
The code below demonstrates what happens when you attempt to concatenate a DataFrame with a NumPy array directly, which triggers this error.
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
array = np.array([5, 6])
# Trying to concatenate DataFrame with numpy array directly
result = pd.concat([df, array])
print(result)
The pd.concat() function fails because it's trying to join a DataFrame with a raw NumPy array. It can't combine these incompatible data structures directly. The code below shows how to prepare the data correctly.
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
array = np.array([5, 6])
# Convert array to DataFrame first
array_df = pd.DataFrame([array], columns=['A', 'B'])
result = pd.concat([df, array_df])
print(result)
The fix is to make sure everything you pass to pd.concat() is a pandas object. You can solve the TypeError by first converting the NumPy array into a DataFrame with pd.DataFrame([array], columns=['A', 'B']). This step gives the raw data the proper structure and column labels it needs to join correctly. Keep an eye out for this error whenever you're mixing data from different libraries or adding single rows to your DataFrame.
Real-world applications
With the common pitfalls covered, you can now apply dataframe appending to solve practical, real-world data aggregation tasks.
- Combining monthly sales reports: A common business task is analyzing sales over time. You can use pd.concat() to stack monthly reports into a single DataFrame, creating a master dataset for tracking trends and generating year-to-date summaries.
- Building a comprehensive product catalog: If you manage data from multiple suppliers, appending their product lists helps create a unified catalog. pd.concat() combines them even if columns differ, letting you standardize the data into one cohesive view.
Combining monthly sales reports with pd.concat()
Aggregating sales data from different months into a single report is a common task where pd.concat() shines.
import pandas as pd
jan_sales = pd.DataFrame({
    'Product': ['Widget A', 'Widget B'],
    'Units': [100, 150],
    'Month': ['Jan', 'Jan']
})
feb_sales = pd.DataFrame({
    'Product': ['Widget A', 'Widget C'],
    'Units': [120, 90],
    'Month': ['Feb', 'Feb']
})
quarterly_report = pd.concat([jan_sales, feb_sales], ignore_index=True)
print(quarterly_report)
This example creates two distinct DataFrames: jan_sales and feb_sales. Each one holds sales data for a different month, structured with the same columns for product, units, and month.
- The pd.concat() function takes a list of these DataFrames and stacks them vertically into a single table.
- Setting ignore_index=True tells pandas to create a new, clean index for the combined quarterly_report.
This process results in a unified DataFrame ready for analysis, without the issue of repeated index labels from the original tables.
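Once the reports are combined, the kind of trend summary mentioned above is one groupby away. Building on the same jan_sales and feb_sales data, per-product totals across all appended months look like this:

```python
import pandas as pd

jan_sales = pd.DataFrame({
    'Product': ['Widget A', 'Widget B'],
    'Units': [100, 150],
    'Month': ['Jan', 'Jan']
})
feb_sales = pd.DataFrame({
    'Product': ['Widget A', 'Widget C'],
    'Units': [120, 90],
    'Month': ['Feb', 'Feb']
})
quarterly_report = pd.concat([jan_sales, feb_sales], ignore_index=True)

# Total units per product across every appended month
totals = quarterly_report.groupby('Product')['Units'].sum()
print(totals)
```

Because Widget A appears in both months, its January and February units are summed into a single figure, which is exactly the cross-month view that appending enables.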
Building a comprehensive product catalog from different suppliers
When combining product lists from different suppliers, you'll often need to standardize column names before you can append them into a single, unified catalog.
import pandas as pd
supplier_a = pd.DataFrame({
    'ProductID': ['A001', 'A002'],
    'Name': ['Premium Widget', 'Deluxe Gadget'],
    'Price': [19.99, 24.99]
})
supplier_b = pd.DataFrame({
    'Product_Code': ['B001', 'B002'],
    'Product_Name': ['Economy Widget', 'Basic Tool'],
    'Wholesale_Price': [12.99, 9.99]
})
supplier_b = supplier_b.rename(columns={
    'Product_Code': 'ProductID',
    'Product_Name': 'Name',
    'Wholesale_Price': 'Price'
})
complete_catalog = pd.concat([supplier_a, supplier_b], ignore_index=True)
print(complete_catalog)
This code shows how to merge product lists from different suppliers, even when their column names don't match. Notice that supplier_a and supplier_b use different labels for the same data, like ProductID versus Product_Code.
- To fix this, the rename() method standardizes the column names in supplier_b so they align with supplier_a.
- Once the columns are consistent, pd.concat() stacks the two DataFrames into a single, unified catalog.
- Using ignore_index=True creates a fresh index for the final result.
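One refinement worth considering, not shown in the example above, is recording each product's origin before combining, since that information is otherwise lost after the append. A sketch using assign(), with the supplier labels 'A' and 'B' as placeholders:

```python
import pandas as pd

supplier_a = pd.DataFrame({
    'ProductID': ['A001', 'A002'],
    'Name': ['Premium Widget', 'Deluxe Gadget'],
    'Price': [19.99, 24.99]
})
supplier_b = pd.DataFrame({
    'ProductID': ['B001', 'B002'],
    'Name': ['Economy Widget', 'Basic Tool'],
    'Price': [12.99, 9.99]
})

# assign() tags each catalog with its source before combining,
# so provenance survives the append
complete_catalog = pd.concat(
    [supplier_a.assign(Supplier='A'), supplier_b.assign(Supplier='B')],
    ignore_index=True
)
print(complete_catalog)
```

With the Supplier column in place, you can later filter or group the unified catalog by source, for example to compare pricing between suppliers.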
Get started with Replit
Turn what you've learned into a real tool. Tell Replit Agent to “build a tool that merges daily log files” or “create a dashboard that combines monthly sales CSVs to track yearly performance.”
Replit Agent will write the code, test for errors, and deploy your application automatically. Start building with Replit.
Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.