Create a DataFrame

Summary: in this tutorial, you’ll learn how to create a DataFrame from various Python data types.

Create DataFrame from dictionaries

A DataFrame can be constructed from a dictionary. pandas treats each key as a column name and each list of values as that column's data.

    import pandas as pd

    countries = {
        'name': ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola'],
        'continent': ['Asia', 'Europe', 'Africa', 'Europe', 'Africa'],
    }

    dframe = pd.DataFrame(countries)
    dframe

Running this code in JupyterLab displays the following output:

              name continent
    0  Afghanistan      Asia
    1      Albania    Europe
    2      Algeria    Africa
    3      Andorra    Europe
    4       Angola    Africa

If you have a list of dictionaries with the same structure, you can also create a DataFrame from it. Each dictionary in the list becomes a row; its keys become the column names and its values fill that row.

    countries = [
        {'name': 'Cambodia', 'code': 'KH'},
        {'name': 'Cameroon', 'code': 'CM'},
        {'name': 'Canada', 'code': 'CA'},
    ]

    dframe = pd.DataFrame(countries)
    dframe

Output:

           name code
    0  Cambodia   KH
    1  Cameroon   CM
    2    Canada   CA
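When the dictionaries do not all share the same keys, pandas still builds the frame and fills the gaps with missing values. The sketch below illustrates this with a made-up record that lacks the `code` key; the names `records` and `partial` are illustrative only.

```python
import pandas as pd

# The second record deliberately lacks the 'code' key (illustrative data).
records = [
    {'name': 'Cambodia', 'code': 'KH'},
    {'name': 'Cameroon'},
]

partial = pd.DataFrame(records)

# The columns are the union of all keys; the missing cell is marked as NaN.
print(partial.columns.tolist())
print(partial['code'].isna().tolist())
```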

Create DataFrame from lists

Similarly, a DataFrame can also be created from lists.

Passing a single list to the DataFrame constructor creates a one-column DataFrame.

    countries = ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola']

    dframe = pd.DataFrame(countries)
    dframe

Output:

                 0
    0  Afghanistan
    1      Albania
    2      Algeria
    3      Andorra
    4       Angola

However, if you pass two lists, the second one is used as the index (the row labels), not as a second column.

    countries = ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola']
    continents = ['Asia', 'Europe', 'Africa', 'Europe', 'Africa']

    dframe = pd.DataFrame(countries, continents)
    dframe

Output:

                      0
    Asia    Afghanistan
    Europe      Albania
    Africa      Algeria
    Europe      Andorra
    Africa       Angola
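Since a second positional argument is easy to misread, it may help to pass the row labels through the `index` keyword instead. The following is a small sketch of that variant; the name `labelled` is chosen here for illustration.

```python
import pandas as pd

countries = ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola']
continents = ['Asia', 'Europe', 'Africa', 'Europe', 'Africa']

# Same result as pd.DataFrame(countries, continents), but the role of
# the second list is spelled out with the index keyword.
labelled = pd.DataFrame(countries, index=continents)

print(labelled.index.tolist())
print(labelled.shape)
```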

Create multi-column DataFrame from lists

A multi-column DataFrame can also be created from lists; we just have to zip them first.

    countries = ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola']
    continents = ['Asia', 'Europe', 'Africa', 'Europe', 'Africa']

    dframe = pd.DataFrame(list(zip(countries, continents)))
    dframe

Output:

                 0       1
    0  Afghanistan    Asia
    1      Albania  Europe
    2      Algeria  Africa
    3      Andorra  Europe
    4       Angola  Africa

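Please note that the lists we zip should have the same length. zip() stops at the shortest list, so any extra items in a longer list are silently dropped rather than raising an error. By contrast, a dictionary of lists with unequal lengths raises a ValueError.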
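To see how unequal lengths behave in practice, here is a short sketch with deliberately mismatched lists. The data and the names `truncated` and `dict_failed` are illustrative. Python's zip() stops at the shortest input, while the dictionary form of the constructor rejects unequal lengths.

```python
import pandas as pd

countries = ['Afghanistan', 'Albania', 'Algeria']
continents = ['Asia', 'Europe']  # one item short on purpose

# zip() pairs items until the shorter list runs out, so 'Algeria' is dropped.
truncated = pd.DataFrame(list(zip(countries, continents)))
print(len(truncated))

# A dictionary of unequal-length lists is rejected with a ValueError.
dict_failed = False
try:
    pd.DataFrame({'name': countries, 'continent': continents})
except ValueError as exc:
    dict_failed = True
    print('ValueError:', exc)
```

Checking the list lengths before zipping is one simple way to avoid losing rows unnoticed.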

Adding custom DataFrame column names

Column names can be customized by passing a list to the columns parameter.

    countries = ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola']
    continents = ['Asia', 'Europe', 'Africa', 'Europe', 'Africa']

    dframe = pd.DataFrame(list(zip(countries, continents)),
                          columns=['Country', 'Continent'])
    dframe

Output:

           Country Continent
    0  Afghanistan      Asia
    1      Albania    Europe
    2      Algeria    Africa
    3      Andorra    Europe
    4       Angola    Africa
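For comparison, the same named columns can be obtained by putting the two lists into a dictionary, as in the first section of this tutorial. The sketch below builds the frame both ways and checks that the contents match; `from_zip` and `from_dict` are names made up for this example.

```python
import pandas as pd

countries = ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola']
continents = ['Asia', 'Europe', 'Africa', 'Europe', 'Africa']

# Variant 1: zip the lists into rows and name the columns explicitly.
from_zip = pd.DataFrame(list(zip(countries, continents)),
                        columns=['Country', 'Continent'])

# Variant 2: let the dictionary keys supply the column names.
from_dict = pd.DataFrame({'Country': countries, 'Continent': continents})

print(from_zip.columns.tolist())
print(from_zip.values.tolist() == from_dict.values.tolist())
```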

Create empty DataFrame

We sometimes need to create a blank DataFrame and fill it later, for example inside a loop. This is how you can do it.

    dframe = pd.DataFrame()
    type(dframe)  # Output: pandas.core.frame.DataFrame
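To confirm that a freshly created frame really holds no data, you can inspect its `empty` and `shape` attributes. A minimal sketch, with `blank` as an illustrative name:

```python
import pandas as pd

blank = pd.DataFrame()

print(blank.empty)  # True: the frame has no rows and no columns
print(blank.shape)  # (0, 0)
```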

Or, if we know the column names in advance and want to pre-define them:

    dframe = pd.DataFrame(columns=['name', 'country_code'])
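One way to fill a frame from a loop, sketched here under the assumption that each iteration yields one row, is to collect the rows in a plain Python list and build the DataFrame once at the end, reusing the predefined column names. The `raw_rows` data and the other variable names are made up for illustration.

```python
import pandas as pd

column_names = ['name', 'country_code']

# Illustrative input; in practice each row might come from a file or an API.
raw_rows = [('Cambodia', 'KH'), ('Cameroon', 'CM'), ('Canada', 'CA')]

rows = []
for name, code in raw_rows:
    # Each iteration contributes one row as a dictionary.
    rows.append({'name': name, 'country_code': code})

# Build the frame once, after the loop has finished.
dframe = pd.DataFrame(rows, columns=column_names)
print(dframe.shape)
```

Building the frame in a single step keeps the loop body simple and avoids growing a DataFrame row by row.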

Summary: you can create a DataFrame from Python lists or dictionaries.
