pandas is a library that provides user-friendly tools for storing and working with data. If you do data analysis or machine learning in Python, you need to know how to work with pandas.
pandas is one of the projects sponsored by NumFOCUS, an organization that supports projects related to scientific computing.
pandas is fast, flexible, and expressive, and it is well suited to working with one-dimensional and two-dimensional data tables. It also integrates well with the outside world: it can read CSV files and Excel tables, and it can interface with the R language.
Section 1. Getting Started with Pandas
- What is pandas? – introduce you to pandas and its history.
- Install pandas – learn how to install pandas on your system.
- Install Jupyter Notebook/JupyterLab – the web-based interface that combines live code, equations, visualizations and narrative text in one document.
- Jupyter Notebook/JupyterLab Basics – learn the basics of JupyterLab.
- Jupyter Notebook basic usage – learn how to use the Jupyter Notebook interface and compose a simple document.
- JupyterLab basic usage – learn how to use JupyterLab – the next version of Jupyter Notebook.
- Google Colab introduction – learn more about Colab – the free Jupyter Notebook that runs online.
Section 2. Fundamentals of Pandas
- Pandas Axis and Labels – two core pandas terms.
- Pandas Series – the one-dimensional data structure that serves as the base of DataFrame
- Pandas DataFrame – learn about the main data structure of Pandas
- Pandas Index – the labels of Series and DataFrame.
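The terms in this section can be previewed with a minimal sketch. The fruit data and the variable names below are made up for illustration and are not taken from the tutorials themselves.

```python
import pandas as pd

# A Series is a one-dimensional labeled array.
prices = pd.Series(
    [1.5, 2.0, 3.25], index=["apple", "banana", "cherry"], name="price"
)

# A DataFrame is a two-dimensional table; each column is a Series.
fruits = pd.DataFrame(
    {"price": [1.5, 2.0, 3.25], "stock": [10, 0, 7]},
    index=["apple", "banana", "cherry"],
)

# The Index holds the labels of a Series or DataFrame.
print(prices.index.tolist())
print(fruits.columns.tolist())

# axis=0 runs down the rows (one result per column);
# axis=1 runs across the columns (one result per row).
column_totals = fruits.sum(axis=0)
row_totals = fruits.sum(axis=1)
print(column_totals)
print(row_totals)
```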
Section 3. Creating DataFrame
- Creating a DataFrame from scratch – learn how to create DataFrame from Python objects.
- Create a DataFrame from JSON – show you how to read a JSON file into a DataFrame.
- Create a DataFrame from CSV – explain how you can import a CSV into a DataFrame
- Create a DataFrame from Excel files – learn about how Pandas can get the data from XLS/XLSX files.
- Create a DataFrame from HTML tables – show you how to pull data from <table> directly into Pandas DataFrame.
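As a rough preview of this section, the sketch below builds the same small table three ways. The sample data is hypothetical, and in-memory text stands in for real files so the example runs without anything on disk. Excel and HTML sources are left out because they need optional parser packages to be installed.

```python
import io

import pandas as pd

# From scratch: a dict that maps column names to lists of values.
df_scratch = pd.DataFrame({"name": ["Ann", "Bob"], "age": [31, 45]})

# From CSV: read_csv() accepts a file path or any file-like object.
csv_text = "name,age\nAnn,31\nBob,45\n"
df_csv = pd.read_csv(io.StringIO(csv_text))

# From JSON: read_json() also accepts a file-like object.
# orient="records" means the JSON is a list of row objects.
json_text = '[{"name": "Ann", "age": 31}, {"name": "Bob", "age": 45}]'
df_json = pd.read_json(io.StringIO(json_text), orient="records")

print(df_scratch)
print(df_csv)
print(df_json)
```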
Section 4. Basic DataFrame Operations
- Indexing in Pandas – learn about the syntax as well as how indexing works.
- .loc vs .iloc – introduce the two important indexers of Pandas.
- Select DataFrame rows – learn about selecting data out of a DataFrame.
- Select DataFrame columns – similar to rows, columns in DataFrame can be selected for further analysis.
- Reindexing – learn how you can reindex a Series or a DataFrame.
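The operations listed in this section can be sketched on a tiny, made-up DataFrame. The city data and index labels are assumptions chosen only to show the difference between label-based and position-based access.

```python
import pandas as pd

df = pd.DataFrame(
    {"city": ["Oslo", "Lima", "Pune"], "temp": [4, 19, 31]},
    index=["a", "b", "c"],
)

# .loc selects by label, .iloc selects by integer position.
by_label = df.loc["b", "temp"]
by_position = df.iloc[1, 1]

# A boolean mask selects rows; a list of names selects columns.
hot_rows = df[df["temp"] > 10]
city_only = df[["city"]]

# reindex() conforms the rows to a new list of labels.
# The label "z" does not exist, so its row is filled with NaN.
reindexed = df.reindex(["c", "a", "z"])

print(by_label, by_position)
print(hot_rows)
print(reindexed)
```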
Section 5. Adding data
- Add DataFrame rows – learn how to use concat() and append() to add rows to a DataFrame.
- append()
- insert()
- concat()
- merge()
- join()
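A minimal sketch of these functions follows, using hypothetical people and scores tables. Note that DataFrame.append() was removed in pandas 2.0, so the sketch uses concat() to add rows; on older versions append() may still be available.

```python
import pandas as pd

people = pd.DataFrame({"id": [1, 2], "name": ["Ann", "Bob"]})
extra = pd.DataFrame({"id": [3], "name": ["Cy"]})

# concat() stacks the rows; ignore_index=True renumbers them 0..n-1.
people = pd.concat([people, extra], ignore_index=True)

# insert() adds a column in place at the given position.
people.insert(1, "active", [True, False, True])

# merge() combines two tables on a shared column, like a SQL join.
scores = pd.DataFrame({"id": [1, 3], "score": [90, 75]})
merged = people.merge(scores, on="id", how="left")

# join() combines two tables on their index (a left join by default).
joined = people.set_index("id").join(scores.set_index("id"))

print(merged)
print(joined)
```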
Section 6. Editing data
- where()
- melt()
- cut()
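The three functions above can be tried on a small table of marks. The names, subjects, and bin edges below are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Ann", "Bob"], "math": [35, 80], "art": [90, 55]})

# where() keeps values where the condition is True and replaces the rest.
marks = df[["math", "art"]]
at_least_50 = marks.where(marks >= 50, 50)

# melt() reshapes a wide table into a long one: one row per (name, subject).
long_form = df.melt(id_vars="name", var_name="subject", value_name="mark")

# cut() sorts numeric values into labeled bins.
# Bins are right-inclusive by default: (0, 50], (50, 75], (75, 100].
grades = pd.cut(
    long_form["mark"], bins=[0, 50, 75, 100], labels=["low", "mid", "high"]
)

print(at_least_50)
print(long_form)
print(grades)
```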
Section 7. Deleting data
- drop()
- del
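The difference between the two can be sketched as follows, on a throwaway DataFrame: drop() returns a new object by default, while del removes a column from the existing one.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})

# drop() returns a new DataFrame and leaves the original untouched.
without_row = df.drop(index=1)
without_col = df.drop(columns="b")

# del removes a column from the DataFrame in place.
del df["c"]

print(without_row)
print(without_col)
print(df)
```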
Section 8. Data Cleaning and Prepping
- Handling missing data – learn how Pandas treats empty values internally.
- dropna() – learn how to use dropna() to filter out empty rows and columns.
- fillna() – learn how to use fillna() to fill in custom values in place of NaN values.
- drop_duplicates() – the method that allows you to remove duplicate rows based on conditions.
- map() – show you how to transform the data using mappings
- replace() – a nifty function that replaces a value with another.
- Rename DataFrame column names and row indexes.
- String manipulation – show you how to apply string operations on whole arrays of data.
- Vectorized string functions – apply string and regex operations through the built-in array-optimized string methods.
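Most of the cleaning tools in this section fit into one short sketch. The messy sample data is made up, and the replacement values are arbitrary choices for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"name": [" ann", "Bob ", "Bob ", "cy"], "score": [1.0, 2.0, 2.0, np.nan]}
)

# dropna() removes rows that contain a missing value.
no_missing = df.dropna()

# fillna() replaces missing values with a value of your choice.
filled_score = df["score"].fillna(0.0)

# drop_duplicates() keeps the first occurrence of each repeated row.
unique_rows = df.drop_duplicates()

# map() translates values through a mapping; unmapped values become NaN.
level = df["score"].map({1.0: "low", 2.0: "high"})

# replace() swaps one value for another.
replaced = df["score"].replace(2.0, 20.0)

# rename() changes column names and row index labels.
renamed = df.rename(columns={"score": "points"}, index={0: "first"})

# The .str accessor applies string methods to a whole column at once.
clean_names = df["name"].str.strip().str.title()

print(unique_rows)
print(clean_names)
```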
Section 9. Group/Groupby
- Create a group
- Group operations
- Sorting groups
- Transform groups
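The four topics above can be sketched on a small, hypothetical sales table: create the group, aggregate it, sort the result, and use transform() to compute a per-row value from a group-level statistic.

```python
import pandas as pd

sales = pd.DataFrame(
    {"region": ["N", "S", "N", "S"], "amount": [10, 20, 30, 5]}
)

# Create a group: rows are bucketed by the values in "region".
grouped = sales.groupby("region")

# Group operation: one aggregated value per group.
totals = grouped["amount"].sum()

# Sorting groups: order the aggregated result by value.
ranked = totals.sort_values(ascending=False)

# Transform: returns a result aligned with the original rows,
# here used to compute each sale's share of its region total.
share = sales["amount"] / grouped["amount"].transform("sum")

print(totals)
print(ranked)
print(share)
```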
Section 10. Pivot and Pivot Table
- pivot()
- pivot_table()
- Multilevel columns
- Data for selected value
- Duplicate values
- Customize missing values
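A short sketch may help show why both functions exist. With the made-up order data below, the pair (Tue, tea) appears twice, so pivot() cannot reshape it, while pivot_table() aggregates the duplicates and can fill in missing combinations.

```python
import pandas as pd

orders = pd.DataFrame(
    {
        "day": ["Mon", "Tue", "Tue", "Tue"],
        "item": ["tea", "tea", "tea", "pie"],
        "qty": [3, 2, 4, 5],
    }
)

# pivot() needs unique (index, columns) pairs and raises on duplicates.
try:
    orders.pivot(index="day", columns="item", values="qty")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# After removing the duplicate pair, pivot() works; gaps become NaN.
wide = orders.drop_duplicates(["day", "item"]).pivot(
    index="day", columns="item", values="qty"
)

# pivot_table() aggregates duplicates; fill_value replaces missing cells.
table = orders.pivot_table(
    index="day", columns="item", values="qty", aggfunc="sum", fill_value=0
)

print(wide)
print(table)
```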
Section 11. Data Transformation with pandas
- Stack and unstack
- Melt
- Transpose
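As a quick illustration of the reshaping tools in this section, the sketch below uses a hypothetical two-by-two table of marks. Depending on the pandas version, stack() may print a FutureWarning about its implementation; the result for this data is the same.

```python
import pandas as pd

df = pd.DataFrame({"math": [35, 80], "art": [90, 55]}, index=["Ann", "Bob"])

# stack() moves the column labels into an inner index level,
# giving a Series indexed by (row label, column label).
stacked = df.stack()

# unstack() moves the inner index level back out to the columns.
unstacked = stacked.unstack()

# .T swaps rows and columns.
transposed = df.T

print(stacked)
print(unstacked)
print(transposed)
```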
Section 12. Working with date and time
- pandas DatetimeIndex
- Date ranges
- Custom holidays
- Date formats
- Timespan/Period
- Time zones
- Shift dates and times
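Several of these topics can be previewed in a few lines. The dates and values are arbitrary, and the time zone name assumes the usual time zone database is available on your system. Custom holidays are left out of this sketch.

```python
import pandas as pd

# date_range() builds a DatetimeIndex of evenly spaced timestamps.
days = pd.date_range("2024-01-01", periods=3, freq="D")
ts = pd.Series([10, 20, 30], index=days)

# Date formats: an explicit format string tells pandas how to parse text.
parsed = pd.to_datetime("31/12/2023", format="%d/%m/%Y")

# A Period represents a span of time, here the whole month of January.
month = pd.Period("2024-01", freq="M")

# Time zones: attach a zone with tz_localize(), change it with tz_convert().
utc = ts.tz_localize("UTC")
oslo = utc.tz_convert("Europe/Oslo")

# shift() moves the values down one row and leaves the index unchanged.
shifted = ts.shift(1)

print(ts)
print(oslo)
print(shifted)
```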