Understanding Pandas in Python DataFrames

A pandas DataFrame is a size-mutable, two-dimensional data structure made up of three components: rows and columns (both of which are labeled) and the data itself. Make sure you have a basic working knowledge of pandas before getting started with this tutorial.
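As a rough illustration of those three components, the sketch below builds a tiny DataFrame and pulls each part out separately. The names, ages and row labels are made-up example values, not data from this tutorial.

```python
import pandas as pd

# Hypothetical example data: two labeled rows and two labeled columns.
df = pd.DataFrame(
    {"name": ["Ana", "Ben"], "age": [31, 27]},
    index=["r1", "r2"],
)

row_labels = list(df.index)       # component 1: the row labels
column_labels = list(df.columns)  # component 2: the column labels
values = df.to_numpy()            # component 3: the data, as a NumPy array

print(row_labels)
print(column_labels)
print(values.shape)
```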

Uses of Pandas DataFrame

A DataFrame lets you organize information in a table, which is easier to read than a plain list. In the grid, each row corresponds to one observation (an instance’s measurements or values) and each column holds one variable. A column can contain text, numbers or logical (Boolean) data, and different columns can be of the same type, although they do not have to be.
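A small sketch can make the "one row per observation, one column per variable" idea concrete. The column names and values here are invented for illustration; the point is only that each column can carry its own type.

```python
import pandas as pd

# Hypothetical observations: one row per city, one column per variable.
df = pd.DataFrame(
    {
        "city": ["Oakland", "Fresno", "Richmond"],  # text
        "temp_c": [18.5, 31.0, 17.2],               # floating-point numbers
        "visits": [3, 1, 4],                        # integers
        "coastal": [True, False, True],             # logical (Boolean) data
    }
)

# dtypes reports the type pandas chose for each column.
print(df.dtypes)
```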

How to Create a Python Pandas DataFrame

To create a pandas DataFrame from an existing dataset, load it from a CSV file, an Excel file or a SQL database using functions such as read_csv(), read_excel() and read_sql().
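Loading might look roughly like the sketch below. To keep it self-contained, an in-memory string stands in for a CSV file on disk; the contents and the file names mentioned in the comments are hypothetical.

```python
import io

import pandas as pd

# Stand-in for a CSV file on disk (hypothetical contents).
csv_text = "name,score\nAna,90\nBen,85\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))
print(df)

# With real sources you would pass a path or a database connection instead,
# for example pd.read_csv("data.csv") or pd.read_excel("data.xlsx").
```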

Another way to create your grid is to pass one or multiple lists to the DataFrame constructor.
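A minimal sketch of the list-based approach, assuming made-up names and ages: a single list becomes one column, while a list of lists becomes one row per inner list.

```python
import pandas as pd

# One list becomes a single column.
names = ["Ana", "Ben", "Cara"]
df_one = pd.DataFrame(names, columns=["name"])

# A list of lists becomes one row per inner list.
rows = [["Ana", 31], ["Ben", 27], ["Cara", 45]]
df_many = pd.DataFrame(rows, columns=["name", "age"])

print(df_one)
print(df_many)
```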

You can also use a dictionary of ndarrays or lists. First, ensure that all of the arrays are the same length. If you pass an index, then the index’s length should be the same as the arrays’ length. If you do not pass an index, it will default to range(n), with n representing the arrays’ length.
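The dictionary approach could be sketched as follows, with and without an explicit index. The keys, values and index labels are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Each key becomes a column; every value must have the same length.
data = {
    "name": ["Ana", "Ben", "Cara"],  # a plain list
    "age": np.array([31, 27, 45]),   # an ndarray
}

# No index passed: the index defaults to range(n), here 0, 1, 2.
df_default = pd.DataFrame(data)

# Index passed: it must have the same length as the arrays.
df_labeled = pd.DataFrame(data, index=["a", "b", "c"])

print(df_default)
print(df_labeled)
```

If the arrays differ in length, the constructor raises a ValueError instead of building the table.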

Working with Columns and Rows

In a Python Pandas DataFrame, the data is organized tabularly using columns and rows, which allows you to perform basic operations like adding, deleting and selecting items.
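As one possible illustration of adding and deleting, the sketch below adds a derived column and then drops a column and a row. The column names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Ana", "Ben"], "age": [31, 27]})

# Add a column by assigning to a new column name.
df["age_next_year"] = df["age"] + 1

# Delete a column; drop() returns a new DataFrame and leaves df unchanged.
df_no_age = df.drop(columns=["age"])

# Delete a row by its index label.
df_no_first = df.drop(index=[0])

print(df_no_age)
print(df_no_first)
```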

To select rows from your DataFrame, pass a row label to the DataFrame.loc[] indexer or pass the row’s integer position to the iloc[] indexer.

Each call that takes a single label or position returns one row, so one loc[] call and one iloc[] call give you two rows back.
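A minimal sketch of row selection, assuming a small table with made-up row labels:

```python
import pandas as pd

df = pd.DataFrame(
    {"age": [31, 27, 45], "city": ["Oakland", "Fresno", "Richmond"]},
    index=["ana", "ben", "cara"],
)

# loc[] selects by row label.
row_by_label = df.loc["ben"]

# iloc[] selects by integer position (0-based), so 2 is the third row.
row_by_position = df.iloc[2]

print(row_by_label)
print(row_by_position)
```

Each result is a single row returned as a Series whose labels are the column names.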

Selecting Data and Indexing

Also referred to as Subset Selection, indexing simply means using .iloc and .loc indexers to select some or all of the DataFrame’s rows or columns.
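Subset selection with both indexers might be sketched like this, using the same kind of made-up table. Note one difference that is easy to miss: label slices in .loc include the end label, while position slices in .iloc exclude the end position.

```python
import pandas as pd

df = pd.DataFrame(
    {"age": [31, 27, 45], "city": ["Oakland", "Fresno", "Richmond"]},
    index=["ana", "ben", "cara"],
)

# .loc takes labels: rows "ana" through "ben" (inclusive), column "age".
subset_by_label = df.loc["ana":"ben", ["age"]]

# .iloc takes positions: rows 0 and 1, column 0 (end positions excluded).
subset_by_position = df.iloc[0:2, 0:1]

# A bare colon selects everything along that axis: all rows, one column.
all_rows_one_col = df.loc[:, ["city"]]

print(subset_by_label)
print(all_rows_one_col)
```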

To select one column, place the column’s name, as a string, between square brackets after the DataFrame’s name.
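For example, with hypothetical column names, selecting one column versus a list of columns could look like this:

```python
import pandas as pd

df = pd.DataFrame({"name": ["Ana", "Ben"], "age": [31, 27]})

# A single column name returns a Series.
ages = df["age"]

# A list of column names returns a DataFrame.
subset = df[["name", "age"]]

print(ages)
print(subset)
```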

Python Pandas and Missing Data

If you do not provide a value for an item, that piece of data is missing, which can present a major problem for later analysis. In pandas, missing data is represented by NA values such as NaN.

To avoid any issues caused by missing data, use isnull() and notnull() to look for missing or null values. Both return a Boolean result with the same shape as the data, marking each value as missing or present.
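A short sketch of both checks, assuming a made-up table with one missing value in each column:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"score": [90.0, np.nan, 85.0], "grade": ["A", None, "B"]}
)

# isnull() is True wherever a value is missing.
missing_mask = df.isnull()

# notnull() is the opposite: True wherever a value is present.
present_mask = df.notnull()

# Summing the Boolean mask counts the missing values in each column.
missing_per_column = df.isnull().sum()

print(missing_mask)
print(missing_per_column)
```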

Once you have revealed the null values, what do you do with them? Use the fillna(), interpolate() and replace() functions to replace null data values with new ones.
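As a hedged example of two of those functions, the sketch below fills a missing score with fillna() and with replace(). The replacement values 0.0 and -1.0 are arbitrary choices for illustration; the right fill value depends on your data.

```python
import numpy as np
import pandas as pd

scores = pd.Series([90.0, np.nan, 85.0])

# fillna() replaces every missing value with the value you supply.
filled = scores.fillna(0.0)

# replace() swaps one value for another and can also target NaN.
replaced = scores.replace(np.nan, -1.0)

print(filled)
print(replaced)
```

Both calls return a new Series, so the original scores still contain the missing value.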

In a practical pandas setting, the interpolate() function fills each missing value with an estimate based on its neighboring values, using linear interpolation by default.
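A minimal sketch of interpolate(), assuming a made-up series of readings with a gap in the middle:

```python
import numpy as np
import pandas as pd

readings = pd.Series([10.0, np.nan, np.nan, 40.0])

# The default method is linear: the gap is filled with evenly spaced
# values between the known neighbors, giving 10, 20, 30, 40.
interpolated = readings.interpolate()

print(interpolated)
```

Linear interpolation treats the rows as equally spaced, which suits evenly sampled data; other data may call for a different fill strategy.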

Closing Thoughts

There are many Python pandas functions available for working with columns and rows, creating a DataFrame or fixing null values. Tutorials like this one can serve as good practice as you prepare for data classes in San Francisco.

*Please note, these articles are for educational purposes and the topics covered may not be representative of the curriculum covered in our boot camp. Explore our curriculum to see what you’ll learn in our program.
