How to Get Column Names from a Pandas DataFrame?

When working with large datasets, the first step is to understand their structure. Knowing the column names helps you explore, clean, and handle the data more effectively. In this article, we will explore quick and easy methods to get column names from a Pandas DataFrame in Python.

Pandas is a powerful Python library that we often use for tasks like data analysis. A DataFrame in Pandas is similar to a table in Excel or a database: it consists of rows and columns, where each column has a name.

Different Ways to Get the Column Names from a Pandas DataFrame

Imagine a simple student dataset with three columns: Name, Age, and Grade. Viewed as a table, it has four rows: the first holds the column names, and the other three contain the students’ data. Our task is to get the column names from this dataset using different approaches in Python. Let’s get started!

1. Using df.columns Attribute

The easiest way to get column names in Pandas is by using the DataFrame’s .columns attribute. It stores the column names in an Index object, which is an immutable, list-like sequence rather than a regular Python list.

Example Code

Here, we first import the pandas library and then create a dictionary, data, with three keys: Name, Age, and Grade, representing our dataset. The pd.DataFrame(data) function converts this dictionary into a DataFrame called df. We then store df.columns in a variable called column_names and print it to the console.

import pandas as pd

# Creating a DataFrame
data = {
    "Name": ["George", "John", "Sarah"],
    "Age": [20, 22, 19],
    "Grade": ["A", "B", "A"]
}

df = pd.DataFrame(data)

# Getting column names
column_names = df.columns
print(column_names)

Output

Index(['Name', 'Age', 'Grade'], dtype='object')

In the output, dtype='object' describes how the column names themselves are stored (as generic Python objects, here strings), not the data types of the values inside the columns. If you need these column names in a regular list, you can convert them using the list() function. Here’s how:

# Getting column names as a list
column_list = list(df.columns)
print(column_list)

Output

['Name', 'Age', 'Grade']
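
As a side note, two other spellings should give the same list: the Index method .tolist(), and calling list() on the DataFrame itself, which iterates over its column names. The sketch below rebuilds the same df so it can run on its own. It also illustrates that an Index, unlike a list, does not allow item assignment. The variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["George", "John", "Sarah"],
    "Age": [20, 22, 19],
    "Grade": ["A", "B", "A"],
})

# Two equivalent ways to get a plain Python list of column names
names_via_tolist = df.columns.tolist()
names_via_list = list(df)  # iterating over a DataFrame yields its column names
print(names_via_tolist)
print(names_via_list)

# An Index is immutable: assigning to a single position raises TypeError
try:
    df.columns[0] = "Student"
    index_is_mutable = True
except TypeError:
    index_is_mutable = False
print(index_is_mutable)

# The list is an independent copy, so editing it does not rename anything in df
names_via_tolist[0] = "Student"
print(names_via_tolist)
print(list(df.columns))
```

If the goal is only to read or loop over the names, the Index is usually enough. A list is mainly useful when you need to modify the collection of names.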

2. Using .keys() Method

The .keys() method is another way to get column names. For a DataFrame, df.keys() returns the same Index of column names as df.columns, so the two can be used interchangeably.

Example Code

In this code, we use .keys() instead of .columns to get all the column names from our DataFrame. You can then convert the output into a list using list(df.keys()) just like we did earlier.

# Getting Column names
column_names = df.keys()
print(column_names)

Output

Index(['Name', 'Age', 'Grade'], dtype='object')
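
To check the claim that both approaches return the same names, here is a minimal sketch that compares the two results. It also shows the related dictionary-style membership test with the in operator. The df is rebuilt so the snippet is self-contained.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["George", "John", "Sarah"],
    "Age": [20, 22, 19],
    "Grade": ["A", "B", "A"],
})

# Index.equals() checks that both objects hold the same labels in the same order
same_labels = df.keys().equals(df.columns)
print(same_labels)

# Like a dictionary, a DataFrame supports "in" to test for a column name
has_age = "Age" in df
has_email = "Email" in df
print(has_age, has_email)
```

The .keys() spelling mostly helps when code is written to work with both dictionaries and DataFrames. Otherwise, .columns tends to be the more recognizable choice.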

3. Using .columns.values Attribute to Get a NumPy Array

You can use the .columns.values attribute to get the column names as a NumPy array, which is particularly useful when you want to pass them to NumPy functions that expect array input.

Example Code

In this code, we create a variable column_array, assign it the value of df.columns.values to retrieve the column names, and then print the resulting array.

# Getting column names
column_array = df.columns.values
print(column_array)

Output

['Name' 'Age' 'Grade']
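
The Pandas documentation suggests .to_numpy() as the preferred alternative to .values, so a hedged variant of the example above could look like the sketch below. It also applies a NumPy-style boolean mask to the result. The mask condition is only an illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["George", "John", "Sarah"],
    "Age": [20, 22, 19],
    "Grade": ["A", "B", "A"],
})

# to_numpy() returns the column names as a NumPy array
column_array = df.columns.to_numpy()
print(isinstance(column_array, np.ndarray))
print(column_array.shape)

# Element-wise comparison gives a boolean mask we can index with
mask = column_array != "Name"
non_name_columns = column_array[mask]
print(non_name_columns)
```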

4. Using sorted() Function to Get Sorted Column Names

Sometimes, you might need to get the column names in alphabetical order. In such cases, the built-in sorted() function in Python is very helpful. It accepts any iterable, including an Index, and returns a new sorted list without changing the original.

Example Code

In the given code, the df.columns attribute retrieves the column names, and sorted() sorts them in alphabetical order. The result is stored in the sorted_columns variable, which is then printed on the screen.

# Getting column names
sorted_columns = sorted(df.columns)
print(sorted_columns)

Output

['Age', 'Grade', 'Name']
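
Building on this, here is a short sketch of three common variations: descending order, reordering the DataFrame itself, and case-insensitive sorting. The second DataFrame, mixed, is a hypothetical example with mixed-case names and is not part of the student dataset.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["George", "John", "Sarah"],
    "Age": [20, 22, 19],
    "Grade": ["A", "B", "A"],
})

# reverse=True sorts in descending order
descending = sorted(df.columns, reverse=True)
print(descending)

# Selecting with the sorted list returns a DataFrame whose columns are reordered
df_sorted = df[sorted(df.columns)]
print(list(df_sorted.columns))

# Hypothetical DataFrame with mixed-case column names
mixed = pd.DataFrame(columns=["name", "age", "Grade"])
default_order = sorted(mixed.columns)                     # uppercase sorts first
case_insensitive = sorted(mixed.columns, key=str.lower)   # ignores letter case
print(default_order)
print(case_insensitive)
```

By default, sorted() compares strings by character code, so uppercase letters come before lowercase ones. Passing key=str.lower is one way to avoid that.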

5. Get Column Names as a String

Sometimes, you may want to join all the column names into one string, with each name separated by a comma. This is useful when you want to display or log the column names in a clear and easy-to-read format.

Example Code

Here, the .join() method combines the elements of df.columns into a single string. Each column name is separated by a comma and a space, and the resulting string is stored in column_string, providing us with an organized output.

# Getting column names as a string
column_string = ", ".join(df.columns)
print(column_string)

Output

Name, Age, Grade
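
One caveat: .join() only accepts strings, so it fails when column names are numbers. The sketch below uses a hypothetical DataFrame, yearly, whose columns are integers, and converts each name with str() before joining.

```python
import pandas as pd

# Hypothetical DataFrame whose column names are integers rather than strings
yearly = pd.DataFrame([[10, 12, 15]], columns=[2021, 2022, 2023])

# Joining non-string names directly raises TypeError
try:
    ", ".join(yearly.columns)
    direct_join_worked = True
except TypeError:
    direct_join_worked = False
print(direct_join_worked)

# Converting every name to a string first makes the join safe
column_string = ", ".join(map(str, yearly.columns))
print(column_string)
```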

6. Get Column Names by Index Position

If you need to access a specific column name based on its position in the DataFrame, you can use indexing to get it.

Example Code

In this code, df.columns[0] retrieves the first column name in the DataFrame (indexing starts at 0).

# Getting the first column name by index position
first_column = df.columns[0]
print(first_column)

Output

Name

You can also access the last column name with df.columns[-1], where -1 refers to the last position.
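
The same idea extends to negative indexes, slices, and the reverse lookup from a name to its position. Here is a minimal sketch using the same student DataFrame. Index.get_loc() is the Pandas method for finding where a label sits.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["George", "John", "Sarah"],
    "Age": [20, 22, 19],
    "Grade": ["A", "B", "A"],
})

# Negative indexes count from the end
last_column = df.columns[-1]
print(last_column)

# Slicing returns another Index holding several names
first_two = df.columns[:2]
print(list(first_two))

# get_loc() goes the other way: from a column name to its position
age_position = df.columns.get_loc("Age")
print(age_position)
```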

Your Cheat Sheet for Working with Column Names in Pandas

Here’s a quick recap of different ways to get column names in Pandas, along with their output format and whether they can be converted to a list:

Method | Output Format | Can Convert to a List?
1. df.columns | Index object | Yes (list(df.columns))
2. df.keys() | Index object | Yes (list(df.keys()))
3. df.columns.values | NumPy array | Yes (df.columns.values.tolist())
4. sorted(df.columns) | List (sorted) | Already a list
5. for col in df.columns | Prints each column name | No
6. [col for col in df.columns] | List | Already a list
7. ", ".join(df.columns) | String | No
8. df.columns[index] | String (column name) | No
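
The for loop and the list comprehension listed in the cheat sheet can be sketched as follows. The uppercase conversion and the length filter are illustrative choices only.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["George", "John", "Sarah"],
    "Age": [20, 22, 19],
    "Grade": ["A", "B", "A"],
})

# A for loop visits each column name one at a time
for col in df.columns:
    print(col)

# A list comprehension builds a new list, optionally transforming each name
upper_names = [col.upper() for col in df.columns]
print(upper_names)

# It can also filter, here keeping names longer than three characters
long_names = [col for col in df.columns if len(col) > 3]
print(long_names)
```

A comprehension is handy when you want to transform or filter the names in the same step. For a plain copy, list(df.columns) is shorter.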

If you need to remove unnecessary columns from your DataFrame, check out How to Drop a Column from a Pandas DataFrame for quick and efficient methods.

Congratulations! You’ve now learned several ways to get column names from a Pandas DataFrame, so you can choose the right method for your specific use case. If you want to dive deeper into working with Pandas, you might also find our Pandas DataFrame GroupBy guide helpful for more advanced data manipulation techniques.

For more beginner-friendly Python tutorials, don’t forget to explore our Python Series at Syntax Scenarios!
