When working with pandas DataFrames, you’ll often encounter columns that aren’t needed for your analysis. Removing these unnecessary columns keeps your data focused and easier to manage.
Learn how to delete one or more columns from a Pandas DataFrame in Python.
What is a Column in Pandas?
In pandas, a column is a labeled set of data within a DataFrame. Each column represents a specific type of information, such as a person’s name, their age, or a location. Think of it like a single category in a well-organized spreadsheet.
Why Drop a Column?
Dropping irrelevant columns streamlines your dataset. For example, during data pre-processing you might build a DataFrame by combining multiple CSV files and end up with columns you don’t need. Deleting the ones that play no role in your analysis makes the DataFrame cleaner and easier to work with.
Remove Columns From Pandas DataFrame
To remove a column from a pandas DataFrame, use the drop() method with axis=1 for columns. The syntax is df.drop("Column_Name", axis=1, inplace=True).
Delete a Single Column Using the drop() Method
The most common method is drop(), which deletes specified columns (or rows) from your DataFrame. By default, drop() returns a new DataFrame, leaving the original unchanged, unless you set inplace=True.
Syntax
DataFrame.drop(labels=None, axis=0, inplace=False)
- labels: Name or list of names of the columns to delete.
- axis=1: Indicates you’re dropping columns (axis=0 would drop rows).
- inplace: If True, modifies the original DataFrame.
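As a side note to the parameters above, drop() also accepts a columns keyword, which avoids having to remember the axis number. The following is a minimal sketch using the same kind of sample data as the rest of this article; the variable name without_age is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Title": ["Hassan", "Novwera", "Mahnoor"],
    "Age": [23, 36, 56],
    "Location": ["Lucerne", "Valencia", "Brno"],
})

# columns="Age" targets the column axis, so axis=1 is not needed here.
without_age = df.drop(columns="Age")

print(list(without_age.columns))
print(list(df.columns))  # the original DataFrame still has "Age"
```

Both spellings should give the same result, so which one to use is mostly a matter of readability.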
Example Code
Below is a simple example of removing a column named “Age” from a DataFrame.
- First, a DataFrame is created with the columns “Title”, “Age”, and “Location”.
- The drop() method removes the “Age” column by specifying its name and setting axis=1.
- The original DataFrame is not modified because inplace defaults to False.
import pandas as pd

# Creating a sample DataFrame
data = {"Title": ["Hassan", "Novwera", "Mahnoor"],
        "Age": [23, 36, 56],
        "Location": ["Lucerne", "Valencia", "Brno"]}
df = pd.DataFrame(data)

# Dropping the "Age" column
new_df = df.drop("Age", axis=1)
print(new_df)
Output
     Title  Location
0   Hassan   Lucerne
1  Novwera  Valencia
2  Mahnoor      Brno
Drop Multiple Columns
To drop multiple columns at once, pass a list of column names:
Syntax
DataFrame.drop(labels, axis=1, inplace=False)
Example Code
In this example, the “Age” and “City” columns are dropped from the DataFrame.
- A DataFrame is created with the columns “Name”, “Age”, and “City”.
- Both “Age” and “City” are deleted by passing a list of column names to the labels parameter of drop().
import pandas as pd

# Creating a sample DataFrame
data = {"Name": ["Alice", "Bob", "Charlie"],
        "Age": [23, 36, 56],
        "City": ["Lucerne", "Valencia", "Brno"]}
df = pd.DataFrame(data)

# Dropping the "Age" and "City" columns
new_df = df.drop(["Age", "City"], axis=1)
print(new_df)
Output
      Name
0    Alice
1      Bob
2  Charlie
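When passing a list of names, a label that does not exist in the DataFrame makes drop() raise a KeyError. The sketch below illustrates this and shows the errors="ignore" option, which skips missing labels. The “Country” label is a hypothetical name chosen only because it is absent from the sample data.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [23, 36, 56],
    "City": ["Lucerne", "Valencia", "Brno"],
})

# "Country" is a hypothetical label that is not a column of df.
try:
    df.drop(["Age", "Country"], axis=1)
    raised = False
except KeyError:
    raised = True

# errors="ignore" drops the labels that exist and silently skips the rest.
lenient_df = df.drop(["Age", "Country"], axis=1, errors="ignore")

print(raised)
print(list(lenient_df.columns))
```

Silently ignoring missing labels can hide typos in column names, so it is probably best reserved for cases where a column is genuinely optional.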
Remove Unnamed Columns
Unnamed columns in pandas often appear when a file, such as a CSV, has extra separators (like commas) or empty columns. Such columns usually carry no useful data and clutter the DataFrame. Removing them keeps your data clean and easy to work with.
Think of organizing an event with 20 chairs for 18 invited guests: you would remove the two empty chairs. In the same way, you remove unnamed columns that hold no data.
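To see where such a column can come from, here is a small sketch that reads CSV text whose first header cell is empty, then filters out every column whose name starts with “Unnamed”. The CSV content and the variable names are made up for illustration, and the filter assumes all column names are strings.

```python
import io

import pandas as pd

# Hypothetical CSV text: the first header cell is empty, which is what a file
# looks like when a DataFrame was saved together with its index.
csv_text = ",Title,Lifetime\n0,Novwera,21\n1,Mahad,56\n2,Mahnoor,38\n"

df = pd.read_csv(io.StringIO(csv_text))
print(list(df.columns))

# Boolean mask marking the columns whose name starts with "Unnamed".
unnamed = df.columns.str.startswith("Unnamed")

# Keep every column that is not flagged by the mask.
cleaned_df = df.loc[:, ~unnamed]
print(list(cleaned_df.columns))
```

This approach may be convenient when several unnamed columns are present, since it does not require listing each one by name.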
Syntax
DataFrame.drop(labels, axis=1, inplace=False)
Example Code
Here, we remove an unnamed column from a DataFrame:
- A DataFrame is created with an “Unnamed: 0” column, which often happens during file imports.
- The drop() method removes the “Unnamed: 0” column.
import pandas as pd

# Making a DataFrame example with an unnamed column
data = {"Unnamed: 0": [1, 2, 3],
        "Title": ["Novwera", "Mahad", "Mahnoor"],
        "Lifetime": [21, 56, 38]}
df = pd.DataFrame(data)

# Dropping the unnamed column
cleaned_df = df.drop(["Unnamed: 0"], axis=1)
print(cleaned_df)
Output
     Title  Lifetime
0  Novwera        21
1    Mahad        56
2  Mahnoor        38
Using inplace=True
The inplace=True parameter modifies the original DataFrame directly. This is handy when you don’t need to keep a separate updated DataFrame. Just remember that in-place changes can’t easily be undone once applied.
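The difference between the two modes can be sketched side by side. One detail worth noting is that a call with inplace=True returns None, so its result should not be assigned back to the variable. The names df_a, df_b, result_a, and result_b are illustrative only.

```python
import pandas as pd

data = {"Title": ["Novwera", "Mahad", "Mahnoor"], "Age": [59, 21, 56]}

# Without inplace: the result is a new DataFrame and the original keeps "Age".
df_a = pd.DataFrame(data)
result_a = df_a.drop("Age", axis=1)

# With inplace=True: the original loses "Age" and the call returns None.
df_b = pd.DataFrame(data)
result_b = df_b.drop("Age", axis=1, inplace=True)

print(list(df_a.columns), list(result_a.columns))
print(list(df_b.columns), result_b)
```

Writing df = df.drop(..., inplace=True) would therefore replace the DataFrame with None, which is a common source of confusion.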
Syntax
DataFrame.drop(labels, axis=1, inplace=True)
Example Code
In this example, we remove a column named “Age” directly from the DataFrame:
- A DataFrame is created with columns “Title”, “Age”, and “Location”.
- The drop() method is used with inplace=True to remove the “Age” column directly from the original DataFrame.
import pandas as pd

# Creating a sample DataFrame
data = {"Title": ["Novwera", "Mahad", "Mahnoor"],
        "Age": [59, 21, 56],
        "Location": ["Lucerne", "Valencia", "Brno"]}
df = pd.DataFrame(data)

# Dropping the "Age" column in place
df.drop(["Age"], axis=1, inplace=True)
print(df)
Output
     Title  Location
0  Novwera   Lucerne
1    Mahad  Valencia
2  Mahnoor      Brno
Working with Indices and Axes
In pandas, indices and axes are ways to locate and manipulate data in a DataFrame.
- Indices refer to row labels, which identify rows (labels are usually, though not necessarily, unique).
- Axes represent directions in the DataFrame:
  - Axis 0: Refers to rows (vertical direction).
  - Axis 1: Refers to columns (horizontal direction).
When you perform operations like removing columns or rows, specifying the correct axis ensures the operation targets the intended part of the DataFrame.
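As a rough illustration of how the axis changes what drop() targets, the sketch below removes a row with axis=0 and a column with axis=1 from the same small DataFrame. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Title": ["Novwera", "Mahad", "Mahnoor"],
    "Lifetime": [72, 64, 56],
})

# axis=0 (the default): the label is looked up in the row index.
without_first_row = df.drop(0, axis=0)

# axis=1: the label is looked up among the column names.
without_lifetime = df.drop("Lifetime", axis=1)

print(without_first_row.shape)  # one row fewer, same columns
print(without_lifetime.shape)   # same rows, one column fewer
```

Passing a column name with axis=0 (or omitting axis altogether) would make pandas search the row labels instead, which typically ends in a KeyError.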
Syntax
DataFrame.drop(labels, axis=1)
Example Code
This example demonstrates how to remove a column by explicitly specifying axis=1:
- A DataFrame is created with columns “Title”, “Lifetime”, and “Venue”.
- The drop() method is used to remove the “Venue” column by setting axis=1.
import pandas as pd

# Creating a sample DataFrame
data = {"Title": ["Novwera", "Mahad", "Mahnoor"],
        "Lifetime": [72, 64, 56],
        "Venue": ["Lucerne", "Valencia", "Brno"]}
df = pd.DataFrame(data)

# Dropping the "Venue" column using axis=1
new_df = df.drop(["Venue"], axis=1)
print(new_df)
Output
     Title  Lifetime
0  Novwera        72
1    Mahad        64
2  Mahnoor        56
When working with pandas DataFrames, you will also encounter NaN values in columns. You can check whether a value is NaN before performing any other operation on it.
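One possible way to run such a check is with pd.isna() for a single value and DataFrame.isna() for whole columns. This is only a sketch: the missing “Age” value below is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Title": ["Novwera", "Mahad", "Mahnoor"],
    "Age": [59, float("nan"), 56],  # hypothetical missing value
})

# pd.isna checks a single value.
second_age_missing = pd.isna(df.loc[1, "Age"])

# DataFrame.isna().sum() counts the missing values in each column.
missing_per_column = df.isna().sum()

print(second_age_missing)
print(missing_per_column)
```

A count like this can help decide whether a column is worth keeping before dropping it.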
Conclusion
Removing unnecessary columns, whether they’re unnamed or simply not needed, makes your DataFrame cleaner and easier to analyze. Using drop(), along with parameters like inplace=True and axis=1, you can control how you modify your data. Understanding indices and axes ensures you’re targeting the right parts of your DataFrame every time.
For more interesting articles, follow our Python tutorial series by SyntaxScenarios.