How to append a dataframe in Python
Learn to append DataFrames in Python with various methods. Get tips, see real-world applications, and learn how to debug common errors.

To combine datasets in Python, you often append a pandas DataFrame. This core operation adds new rows, which is essential for many data analysis tasks.
In this article, you'll explore techniques like pd.concat(). We'll also provide practical performance tips, review real-world applications, and offer debugging advice to help you confidently handle data combination tasks.
Using pd.concat() for basic appending
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})
result = pd.concat([df1, df2])
print(result)

Output:

   A  B
0  1  3
1  2  4
0  5  7
1  6  8
The pd.concat() function is the primary tool for this task. It takes an iterable—in this case, a list of DataFrames [df1, df2]—and joins them. By default, it stacks them vertically along axis=0, which is exactly what you need when appending rows.
Pay attention to the index in the output. The function preserves the original indices from each DataFrame, resulting in duplicate labels. This is a key behavior to be aware of, as you'll often want to reset the index for a clean, sequential series after combining your data.
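One way to clean up those duplicate labels is to chain .reset_index(drop=True) onto the result. A minimal sketch:

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})

# reset_index(drop=True) discards the duplicated labels and
# builds a fresh 0..n-1 index after concatenation
result = pd.concat([df1, df2]).reset_index(drop=True)
print(result.index.tolist())  # [0, 1, 2, 3]
```

The drop=True argument matters: without it, the old index is kept as a new column instead of being discarded.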
Basic DataFrame appending techniques
Beyond the basic pd.concat() function, you can also use the older DataFrame.append() method or combine multiple DataFrames and Series in a single operation.
Using the DataFrame.append() method
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})
result = df1.append(df2)  # only works in pandas < 2.0; use pd.concat instead
print(result)

Output:

   A  B
0  1  3
1  2  4
0  5  7
1  6  8
The DataFrame.append() method offers another way to add rows. Unlike the top-level pd.concat() function, you call append() directly on a DataFrame, such as df1.append(df2).
- It’s deprecated and removed: The method was deprecated in pandas 1.4 and removed entirely in pandas 2.0, so the example above only runs on older versions. The pandas team recommends pd.concat() for all new code.
- Identical result: On versions where it still exists, it produces the same output as the previous example, including the duplicate index labels from the original DataFrames.
For these reasons, sticking with pd.concat() is the best practice for writing future-proof code.
Concatenating multiple DataFrames with pd.concat()
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})
df3 = pd.DataFrame({'A': [9, 10], 'B': [11, 12]})
result = pd.concat([df1, df2, df3])
print(result)

Output:

    A   B
0   1   3
1   2   4
0   5   7
1   6   8
0   9  11
1  10  12
The real strength of pd.concat() is its ability to handle more than two DataFrames at once. You aren't limited to pairwise combinations.
- Simply pass a list containing all the DataFrames you want to join, like [df1, df2, df3].
- The function appends them in the order they appear in the list, creating one unified DataFrame.
This approach is much cleaner and more efficient than repeatedly calling an append method. As before, the original indices are preserved in the output.
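When combining several sources, you sometimes also want to remember which rows came from where. The keys parameter of pd.concat() supports this by building a MultiIndex. A small sketch with made-up labels:

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2]})
df2 = pd.DataFrame({'A': [3, 4]})

# keys= labels each input block, producing a MultiIndex
# that records which DataFrame each row came from
combined = pd.concat([df1, df2], keys=['first', 'second'])

# select only the rows that originated in df2
print(combined.loc['second'])
```

This is handy when you later need to trace a row back to its source file or table.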
Appending a Series to a DataFrame
import pandas as pd
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
series = pd.Series({'A': 5, 'B': 6})
result = pd.concat([df, pd.DataFrame([series])])
print(result)

Output:

   A  B
0  1  3
1  2  4
0  5  6
You can also append a Series, which is useful for adding a single row of data. Passing the Series to pd.concat() directly won't do what you want, though: concat treats a Series as a single column, not a row. You need to convert it into a one-row DataFrame first.
- The key is to wrap the Series in a list and pass it to the DataFrame constructor: pd.DataFrame([series]).
This simple step transforms your Series into a single-row DataFrame, making it compatible for concatenation.
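An equivalent idiom is series.to_frame().T: to_frame() turns the Series into a one-column DataFrame, and .T transposes it into a single row. A quick sketch:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
series = pd.Series({'A': 5, 'B': 6})

# to_frame() makes a one-column DataFrame; .T transposes it into one row
row = series.to_frame().T
result = pd.concat([df, row], ignore_index=True)
print(result)
```

Both spellings produce the same single-row DataFrame; pick whichever reads better to you.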
Advanced DataFrame appending techniques
With the fundamentals covered, you can now address common challenges like mismatched columns, duplicate indices, and optimizing the append process for greater efficiency.
Handling different column sets when appending
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'C': [7, 8]})
result = pd.concat([df1, df2], sort=False)
print(result)

Output:

   A    B    C
0  1  3.0  NaN
1  2  4.0  NaN
0  5  NaN  7.0
1  6  NaN  8.0
When you combine DataFrames that don't share the same columns, pd.concat() creates a new DataFrame containing all columns from both. It aligns the data by column name, not by position.
- For columns that exist in one DataFrame but not the other, pandas fills the missing spots with NaN (Not a Number).
- Using sort=False prevents pandas from alphabetically sorting the columns, which maintains the original column order. (In modern pandas this is already the default, so passing it explicitly mainly documents your intent.)
Resetting index with ignore_index=True
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})
result = pd.concat([df1, df2], ignore_index=True)
print(result)

Output:

   A  B
0  1  3
1  2  4
2  5  7
3  6  8
As you've seen, concatenating DataFrames can lead to a messy index with duplicate labels. The ignore_index=True parameter is the solution. When you set it, pd.concat() discards the original indices entirely.
- It creates a fresh, sequential index for the new DataFrame, starting from 0.
- This is the standard way to get a clean result, making your combined data much easier to work with.
Efficient appending with list comprehension
import pandas as pd
dataframes = [
pd.DataFrame({'A': [i, i+1], 'B': [i+2, i+3]})
for i in range(1, 6, 2)
]
result = pd.concat(dataframes, ignore_index=True)
print(result)

Output:

   A  B
0  1  3
1  2  4
2  3  5
3  4  6
4  5  7
5  6  8
When you need to combine many DataFrames, it's far more efficient to collect them in a list first and then perform a single concatenation. This example uses a Python list comprehension to quickly generate a list of DataFrames. Afterward, pd.concat() is called just once on the entire collection.
- This approach is much faster because it avoids the performance penalty of creating a new DataFrame in memory with every single append operation.
- It’s a standard pattern for performance-critical tasks that keeps your code clean and memory-friendly.
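The same pattern applies when the DataFrames come from a loop rather than a comprehension: append each piece to a plain Python list, then call pd.concat() once at the end. A sketch with synthetic chunks (in practice each chunk might come from a file or an API page):

```python
import pandas as pd

# Collect chunks in a list inside the loop; concatenate once afterward.
chunks = []
for i in range(3):
    chunks.append(pd.DataFrame({'value': [i * 10, i * 10 + 1]}))

# One pd.concat call instead of repeated appends
result = pd.concat(chunks, ignore_index=True)
print(len(result))  # 6
```

The anti-pattern to avoid is `result = pd.concat([result, chunk])` inside the loop, which copies the growing DataFrame on every iteration.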
Move faster with Replit
Replit is an AI-powered development platform that comes with all Python dependencies pre-installed, so you can skip setup and start coding instantly. This lets you move from learning individual techniques like pd.concat() to building complete applications with Agent 4, which can take an idea to a working product directly from a description.
Instead of piecing together techniques, you can describe the app you want to build and let Agent 4 construct it for you:
- A utility that consolidates daily log files from different sources into a single master file for analysis.
- A script that appends new user registration data to an existing customer DataFrame.
- A dashboard that combines monthly sales reports from various regional offices into one comprehensive annual summary.
Simply describe your app, and Replit will write the code, test it, and fix issues automatically, all within your browser.
Common errors and challenges
Appending DataFrames can introduce issues like duplicate indices, missing values, and type errors, but you can easily manage them with the right approach.
Fixing index duplication with ignore_index=True
One of the most common side effects of using pd.concat() is ending up with duplicate index labels. This happens because pandas preserves the original index from each DataFrame you combine. While this behavior is predictable, it can make selecting or slicing data difficult.
The fix is simple: set the ignore_index=True parameter. When you do this, pd.concat() discards the old indices and generates a new, clean index from 0. It’s a crucial step for creating a tidy, usable final DataFrame.
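If duplicate labels would indicate a bug in your pipeline rather than an expected side effect, you can also pass verify_integrity=True so pd.concat() fails loudly instead of producing them. A small sketch:

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2]})
df2 = pd.DataFrame({'A': [3, 4]})

# verify_integrity=True makes pd.concat raise a ValueError instead of
# silently producing duplicate index labels
caught = False
try:
    pd.concat([df1, df2], verify_integrity=True)
except ValueError as err:
    caught = True
    print(f"Caught: {err}")
```

This is a useful guard when the inputs are supposed to carry meaningful, non-overlapping indices (for example, unique record IDs).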
Handling missing values after concatenation
You'll often see NaN (Not a Number) values appear after concatenating DataFrames with different columns. This isn't an error—it's pandas' way of handling misaligned data by filling in the gaps. However, these missing values can cause problems for calculations or machine learning models.
- Use the .fillna() method to replace NaN with a specific value, like 0 for numerical columns or an empty string for text.
- Use the .dropna() method to remove rows or columns that contain any NaN values. Be careful with this, as you might lose important data.
Fixing TypeError when concatenating non-DataFrame objects
A TypeError usually means you've tried to concatenate an object pd.concat() doesn't accept. The function expects a list of pandas objects (DataFrames or Series); if you pass something else, such as a raw NumPy array or a plain list, it will fail.
This often happens when adding a single row. The solution is to ensure every item you're concatenating is a pandas object. If you have raw row data, you can quickly convert it by wrapping it in a list and passing it to the DataFrame constructor, like pd.DataFrame([my_row]), before you concatenate.
Fixing index duplication with ignore_index=True
When you use pd.concat() without resetting the index, you get duplicate labels. This isn't just messy; it creates ambiguity. Trying to select data using .loc[0], for example, becomes unreliable. The following code demonstrates this exact problem in action.
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})
# Concatenating without handling indexes
result = pd.concat([df1, df2])
print(result)
print(f"Values at index 0: {result.loc[0]}")  # Ambiguous - returns every row labeled 0
The result.loc[0] call is ambiguous because multiple rows share the index 0. Instead of a single row, pandas returns a DataFrame containing every row with that label, which can silently break code that expects exactly one match. The following code demonstrates the correct approach for a clean result.
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})
# Reset index to avoid duplication
result = pd.concat([df1, df2], ignore_index=True)
print(result)
print(f"Values at index 0: {result.loc[0]}")
By setting ignore_index=True, you instruct pd.concat() to discard the original indices and generate a new, sequential one. This ensures every row in the final DataFrame has a unique label, which resolves the ambiguity. As a result, a call like result.loc[0] becomes predictable and reliably returns a single row. You should use this approach whenever you need a clean, usable index after combining DataFrames.
Handling missing values after concatenation
When you concatenate DataFrames with mismatched columns, pandas fills the gaps with NaN values. While this prevents errors during the merge, these missing markers can cause unexpected results in later calculations, like when you try to sum a column. The following code demonstrates this problem.
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'C': [7, 8]})
# This creates NaN values and may cause issues in calculations
result = pd.concat([df1, df2])
print(result)
total_b = result['B'].sum()
print(f"Sum of column B: {total_b}") # Includes NaN values
The sum() function on column B skips the NaN values, but leaving them unaddressed can cause problems in other calculations. The code below shows a better way to handle these gaps before they become an issue.
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'C': [7, 8]})
# Fill missing values with a default value
result = pd.concat([df1, df2]).fillna(0)
print(result)
total_b = result['B'].sum()
print(f"Sum of column B: {total_b}")
The solution is to chain the .fillna(0) method directly after pd.concat(). This replaces all NaN values with 0, ensuring your columns are ready for mathematical operations like .sum(). You should always consider this step when your source DataFrames have different column sets, as it prevents unexpected behavior in your analysis and keeps your data clean and consistent.
Fixing TypeError when concatenating non-DataFrame objects
The pd.concat() function is strict—it only accepts a list of DataFrames. If you include another object type, like a raw NumPy array, Python will raise a TypeError. The code below demonstrates what happens when this rule is broken.
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
array = np.array([5, 6])
# Trying to concatenate a DataFrame with a NumPy array directly
result = pd.concat([df, array])  # Raises TypeError
print(result)  # never reached
The TypeError is triggered because the NumPy array isn't wrapped in a structure that pd.concat() accepts. The function requires every item in the list to be a pandas object (a DataFrame or a Series). The following code demonstrates the correct approach.
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
array = np.array([5, 6])
# Convert array to DataFrame first
array_df = pd.DataFrame([array], columns=['A', 'B'])
result = pd.concat([df, array_df])
print(result)
The solution is to convert the NumPy array into a DataFrame before concatenating. This ensures every item you pass to pd.concat() is the correct type. To do this:
- Wrap the array in a list and pass it to pd.DataFrame(), making sure to specify the columns so the new row aligns correctly with the existing structure.
This error is common when you mix data types from different libraries, so always check your inputs before combining them.
Real-world applications
With the troubleshooting covered, you can confidently apply DataFrame appending to practical scenarios like consolidating monthly reports or building a unified product catalog.
Combining monthly sales reports with pd.concat()
Consolidating data from different time periods, like merging monthly sales reports, is a perfect use case for pd.concat().
import pandas as pd
jan_sales = pd.DataFrame({
'Product': ['Widget A', 'Widget B'],
'Units': [100, 150],
'Month': ['Jan', 'Jan']
})
feb_sales = pd.DataFrame({
'Product': ['Widget A', 'Widget C'],
'Units': [120, 90],
'Month': ['Feb', 'Feb']
})
quarterly_report = pd.concat([jan_sales, feb_sales], ignore_index=True)
print(quarterly_report)
This example demonstrates how to stack two DataFrames, jan_sales and feb_sales, on top of each other. The pd.concat() function handles the combination, creating a unified quarterly_report.
- The key parameter here is ignore_index=True. It discards the original indices from each DataFrame.
- As a result, the final table gets a fresh, sequential index starting from 0, making the data easier to reference.
This technique is essential for cleanly merging datasets that share a similar structure.
Building a comprehensive product catalog from different suppliers
Building a unified product catalog from multiple suppliers often requires you to standardize inconsistent column names before you can append the data.
import pandas as pd
supplier_a = pd.DataFrame({
'ProductID': ['A001', 'A002'],
'Name': ['Premium Widget', 'Deluxe Gadget'],
'Price': [19.99, 24.99]
})
supplier_b = pd.DataFrame({
'Product_Code': ['B001', 'B002'],
'Product_Name': ['Economy Widget', 'Basic Tool'],
'Wholesale_Price': [12.99, 9.99]
})
supplier_b = supplier_b.rename(columns={
'Product_Code': 'ProductID',
'Product_Name': 'Name',
'Wholesale_Price': 'Price'
})
complete_catalog = pd.concat([supplier_a, supplier_b], ignore_index=True)
print(complete_catalog)
This example tackles a common data-wrangling problem: combining datasets with mismatched column names. Before appending, you must align the schemas. The .rename() method is used on supplier_b to make its column names—like Product_Code and Product_Name—match those in supplier_a.
- Once the columns are consistent, pd.concat() stacks the two DataFrames vertically.
- Using ignore_index=True creates a new, clean index for the combined catalog, making the final data much easier to work with.
Get started with Replit
Now, turn what you've learned into a real tool. Describe what you want to build to Replit Agent, like "a utility that merges multiple CSV log files" or "a script that appends new survey responses to an existing dataset".
Replit Agent writes the code, tests for errors, and helps you deploy your application. Start building with Replit.