Create a DataFrame

Summary: in this tutorial, you’ll learn how to create a DataFrame from various Python data types.

Create DataFrame from dictionaries

A DataFrame can be constructed from a dictionary: pandas uses the keys as column names and the lists as the column values.

import pandas as pd

countries = {'name' : ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola'],
'continent' : ['Asia', 'Europe', 'Africa', 'Europe', 'Africa']
}

dframe = pd.DataFrame(countries)
dframe

Running this code in JupyterLab displays the following output:

   name         continent
0  Afghanistan  Asia
1  Albania      Europe
2  Algeria      Africa
3  Andorra      Europe
4  Angola       Africa
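To confirm how pandas interpreted the dictionary, you can inspect the resulting frame. The sketch below rebuilds the same countries dictionary and checks the column names, the shape, and the default index; the variable names column_names, n_rows, n_cols, and row_labels are only illustrative.

```python
import pandas as pd

countries = {
    'name': ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola'],
    'continent': ['Asia', 'Europe', 'Africa', 'Europe', 'Africa'],
}
dframe = pd.DataFrame(countries)

# Dictionary keys become the column names, in insertion order.
column_names = list(dframe.columns)

# shape is (rows, columns): five list items per key, two keys.
n_rows, n_cols = dframe.shape

# Rows receive a default integer index that starts at 0.
row_labels = list(dframe.index)

print(column_names, n_rows, n_cols, row_labels)
```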

If you have a list of dictionaries with the same structure, you can also create a DataFrame from it. Each dictionary in the list becomes a row: its keys become the column names and its values fill that row.

countries = [{'name': 'Cambodia', 'code': 'KH'},
             {'name': 'Cameroon', 'code': 'CM'},
             {'name': 'Canada', 'code': 'CA'}
             ]
dframe = pd.DataFrame(countries)
dframe

Output:

   name      code
0  Cambodia  KH
1  Cameroon  CM
2  Canada    CA
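The dictionaries do not strictly have to share every key. As a small sketch (the record without a 'code' key is made up for illustration), pandas still builds the frame and marks the missing cell as NaN:

```python
import pandas as pd

countries = [
    {'name': 'Cambodia', 'code': 'KH'},
    {'name': 'Cameroon'},                # hypothetical record with no 'code' key
    {'name': 'Canada', 'code': 'CA'},
]
dframe = pd.DataFrame(countries)

# The column set is the union of all keys seen across the dictionaries.
column_names = list(dframe.columns)

# isna() flags the cells pandas had to fill in with NaN.
missing = dframe['code'].isna().tolist()

print(column_names, missing)
```

Checking isna() after loading is a cheap way to spot records that did not follow the expected structure.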

Create DataFrame from lists

Similarly, a DataFrame can also be created from lists.

Passing a single list to pd.DataFrame() creates a one-column DataFrame.

countries = ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola']
dframe = pd.DataFrame(countries)
dframe

Output:

   0
0  Afghanistan
1  Albania
2  Algeria
3  Andorra
4  Angola

However, if you pass two lists, the first one supplies the data and the second one becomes the index.

countries = ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola']
continents = ['Asia', 'Europe', 'Africa', 'Europe', 'Africa']
dframe = pd.DataFrame(countries, continents)
dframe

Output:

        0
Asia    Afghanistan
Europe  Albania
Africa  Algeria
Europe  Andorra
Africa  Angola
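Because positional arguments make it easy to mix up which list plays which role, it may be clearer to spell out the keywords. The sketch below assumes the same two lists and passes them as data and index explicitly; the column name 'country' is an illustrative choice, not part of the original example.

```python
import pandas as pd

countries = ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola']
continents = ['Asia', 'Europe', 'Africa', 'Europe', 'Africa']

# data fills the single column, index supplies the row labels.
dframe = pd.DataFrame(data=countries, index=continents, columns=['country'])

# Index labels may repeat, so one label can select several rows.
european = dframe.loc['Europe', 'country'].tolist()

print(european)
```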

Create multi-column DataFrame from lists

A multi-column DataFrame can also be created from lists; we just have to zip them first.

countries = ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola']
continents = ['Asia', 'Europe', 'Africa', 'Europe', 'Africa']
dframe = pd.DataFrame(list(zip(countries, continents)))
dframe

Output:

   0            1
0  Afghanistan  Asia
1  Albania      Europe
2  Algeria      Africa
3  Andorra      Europe
4  Angola       Africa

Please note that the lists we zip should have the same length. zip() stops at the shortest list, so any extra items in a longer list are dropped without an error.
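A minimal sketch of what happens when the lengths differ, using deliberately mismatched lists: zip() pairs items only up to the shortest list, so the resulting frame ends up with fewer rows than the longer list has items. An explicit length check before zipping is one simple guard.

```python
import pandas as pd

countries = ['Afghanistan', 'Albania', 'Algeria']
continents = ['Asia', 'Europe']          # one item short on purpose

# zip() pairs items only while every list still has one to give.
pairs = list(zip(countries, continents))
dframe = pd.DataFrame(pairs)
n_rows = len(dframe)                     # 'Algeria' never makes it in

# Comparing lengths first surfaces the mismatch instead of hiding it.
lengths_match = len(countries) == len(continents)

print(n_rows, lengths_match)
```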

Adding custom DataFrame column names

Column names can be customized by passing a list to the columns parameter.

countries = ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola']
continents = ['Asia', 'Europe', 'Africa', 'Europe', 'Africa']
dframe = pd.DataFrame(list(zip(countries, continents)),
                      columns=['Country', 'Continent'])
dframe

Output:

   Country      Continent
0  Afghanistan  Asia
1  Albania      Europe
2  Algeria      Africa
3  Andorra      Europe
4  Angola       Africa
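If the frame already exists with the default numeric labels, the names can also be set afterwards. The following sketch uses DataFrame.rename with a mapping from old to new labels; it is one possible approach, shown here with the same two lists.

```python
import pandas as pd

countries = ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola']
continents = ['Asia', 'Europe', 'Africa', 'Europe', 'Africa']

# Without a columns argument the columns are labeled 0 and 1.
dframe = pd.DataFrame(list(zip(countries, continents)))
default_labels = list(dframe.columns)

# rename() returns a new frame; the original keeps its numeric labels.
renamed = dframe.rename(columns={0: 'Country', 1: 'Continent'})
new_labels = list(renamed.columns)

print(default_labels, new_labels)
```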

Create empty DataFrame

We sometimes need to create an empty DataFrame and fill it later, for example inside a loop. This is how you can do it.

dframe = pd.DataFrame()
type(dframe)

#Output
pandas.core.frame.DataFrame

Or, if we know the column names in advance and want to pre-define them:

dframe = pd.DataFrame(columns=['name', 'country_code'])
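For the loop use case, one common pattern is to collect the rows in a plain Python list and build the DataFrame once at the end, since DataFrame.append was removed in pandas 2.0. The sketch below assumes a made-up raw_records list as the loop's input; only the column names come from the example above.

```python
import pandas as pd

column_names = ['name', 'country_code']

# An empty frame with pre-defined columns has no rows yet.
empty_frame = pd.DataFrame(columns=column_names)
is_empty = empty_frame.empty

# Hypothetical input for the loop; replace with your own source.
raw_records = [('Cambodia', 'KH'), ('Cameroon', 'CM'), ('Canada', 'CA')]

rows = []
for name, code in raw_records:
    # Collect each row as a dictionary keyed by column name.
    rows.append({'name': name, 'country_code': code})

# Build the frame once, after the loop has finished.
dframe = pd.DataFrame(rows, columns=column_names)

print(is_empty, dframe.shape)
```

Building the frame in one step avoids copying the data on every iteration, which is what growing a DataFrame row by row tends to do.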

Summary: you can create a DataFrame from Python lists or dictionaries.

Author: Thijmen I’m currently a SysAdmin located in the Netherlands. Every day I try to keep around a hundred users happy with their network connections and overall, tech-related issues. I also spend my spare time fiddling with web-based applications.
