A Pandas dataframe is a two-dimensional data structure that stores data in rows and columns. It's very useful when you're analyzing data.

When your data records are stored in a dataframe, you may need to drop a specific list of rows, depending on what your model needs and what you want to learn from the analysis.

In this tutorial, you'll learn how to drop a list of rows from a Pandas dataframe.

To learn how to drop columns, see How to Drop Columns in Pandas.

How to Drop a Row or Column in a Pandas Dataframe

To drop a row or column in a dataframe, use the dataframe's drop() method. You can read more about drop() in the Pandas documentation.

Dataframe Axis

  • Rows are denoted using axis=0
  • Columns are denoted using axis=1
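As a small illustration of the two axes, here is a minimal sketch. The dataframe demo and its values are made up for this example and are not the sample data used later in the tutorial.

```python
import pandas as pd

# Hypothetical two-row, two-column dataframe used only for this illustration
demo = pd.DataFrame({"name": ["Keyboard", "Mouse"], "price": [500, 200]})

# axis=0: the label 0 is looked up in the row index
without_first_row = demo.drop(0, axis=0)

# axis=1: the label "price" is looked up in the column names
without_price = demo.drop("price", axis=1)

print(without_first_row)
print(without_price)
```

Neither call uses inplace=True, so demo itself still has both rows and both columns afterwards.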

Dataframe Labels

  • Rows are labelled using the index number starting with 0, by default.
  • Columns are labelled using names.
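The short sketch below shows these labels on a made-up dataframe (demo and its values are assumptions for illustration). It also shows that row labels are not renumbered after a drop, which matters in the later sections.

```python
import pandas as pd

# Hypothetical dataframe created without an explicit index
demo = pd.DataFrame({"name": ["Keyboard", "Mouse", "Monitor"], "price": [500, 200, 5000]})

row_labels = list(demo.index)        # default integer labels starting at 0
column_labels = list(demo.columns)   # column names taken from the dict keys

# Dropping the row labelled 1 leaves a gap: the remaining labels keep their values
labels_after_drop = list(demo.drop(1).index)

print(row_labels)
print(column_labels)
print(labels_after_drop)
```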

drop() Method Parameters

  • index - the list of row labels to be deleted. A list passed as the first positional argument is treated the same way when axis=0.
  • axis=0 - tells drop() to look up the labels in the rows of the dataframe. This is the default.
  • inplace=True - performs the drop operation on the same dataframe, rather than returning a new dataframe object.
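To make the effect of these parameters concrete, here is a minimal sketch on a made-up dataframe (demo and its values are assumptions for illustration). It compares a drop that returns a new dataframe with one that modifies the existing one.

```python
import pandas as pd

# Hypothetical dataframe used only for this illustration
demo = pd.DataFrame({"name": ["Keyboard", "Mouse", "Monitor"], "price": [500, 200, 5000]})

# Without inplace=True, drop() returns a new dataframe and leaves demo untouched
trimmed = demo.drop(index=[0, 2])
rows_in_demo_before = len(demo)

# With inplace=True, demo itself is modified
demo.drop(index=[1], inplace=True)

print(trimmed)
print(demo)
```

If you prefer not to modify a dataframe in place, reassigning the result (for example, demo = demo.drop(index=[1])) gives the same rows.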

Sample Pandas DataFrame

Our sample dataframe contains the columns product_name, Unit_Price, No_Of_Units, Available_Quantity, and Available_Since_Date. It also has rows with missing values, which are displayed as NaN or NaT.

import pandas as pd

data = {
    "product_name": ["Keyboard", "Mouse", "Monitor", "CPU", "CPU", "Speakers", pd.NaT],
    "Unit_Price": [500, 200, 5000.235, 10000.550, 10000.550, 250.50, None],
    "No_Of_Units": [5, 5, 10, 20, 20, 8, pd.NaT],
    "Available_Quantity": [5, 6, 10, "Not Available", "Not Available", pd.NaT, pd.NaT],
    "Available_Since_Date": ["11/5/2021", "4/23/2021", "08/21/2021", "09/18/2021", "09/18/2021", "01/05/2021", pd.NaT],
}

df = pd.DataFrame(data)

df

The dataframe will look like this:

   product_name  Unit_Price  No_Of_Units  Available_Quantity  Available_Since_Date
0  Keyboard      500.000     5            5                   11/5/2021
1  Mouse         200.000     5            6                   4/23/2021
2  Monitor       5000.235    10           10                  08/21/2021
3  CPU           10000.550   20           Not Available       09/18/2021
4  CPU           10000.550   20           Not Available       09/18/2021
5  Speakers      250.500     8            NaT                 01/05/2021
6  NaT           NaN         NaT          NaT                 NaT

And just like that we've created our sample dataframe.

After each drop operation, you'll display the dataframe by evaluating df on its own line. In a Jupyter notebook, this renders the dataframe as a regular HTML table.

To print the dataframe in other visual formats, see how to Pretty Print a Dataframe.

Next, you'll learn how to drop a list of rows in different use cases.

How to Drop a List of Rows by Index in Pandas

You can delete a list of rows from a Pandas dataframe by passing the list of index labels to the drop() method.

df.drop([5,6], axis=0, inplace=True)

df

In this code,

  • [5,6] is the list of index labels of the rows you want to delete
  • axis=0 denotes that rows should be deleted from the dataframe
  • inplace=True performs the drop operation in the same dataframe

After dropping the rows with the index labels 5 and 6, you'll have the below data in the dataframe:

   product_name  Unit_Price  No_Of_Units  Available_Quantity  Available_Since_Date
0  Keyboard      500.000     5            5                   11/5/2021
1  Mouse         200.000     5            6                   4/23/2021
2  Monitor       5000.235    10           10                  08/21/2021
3  CPU           10000.550   20           Not Available       09/18/2021
4  CPU           10000.550   20           Not Available       09/18/2021

This is how you can delete rows with a specific index.
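One thing to watch for is a label in the list that does not exist in the dataframe. The sketch below uses a made-up dataframe (demo is an assumption for illustration) to show the default behaviour and the errors="ignore" option of drop(), which this tutorial does not otherwise use.

```python
import pandas as pd

# Hypothetical dataframe with the default labels 0, 1, 2
demo = pd.DataFrame({"name": ["Keyboard", "Mouse", "Monitor"]})

# Label 7 does not exist, so the default errors="raise" raises a KeyError
try:
    demo.drop([1, 7])
    raised = False
except KeyError:
    raised = True

# errors="ignore" drops the labels that exist and skips the missing ones
kept = demo.drop([1, 7], errors="ignore")

print(raised)
print(kept)
```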

Next, you'll learn about dropping a range of indices.

How to Drop Rows by Index Range in Pandas

You can also drop a list of rows within a specific range.

A range is a set of values with a lower limit and an upper limit.

This may be useful in cases where you want to create a sample dataset excluding specific ranges of data.

You can select a range of row labels by slicing the dataframe's index attribute, df.index. Then you can pass the result to the drop() method to drop the rows as shown below.

df.drop(df.index[2:4], inplace=True)

df

Here's what this code is doing:

  • df.index[2:4] returns the index labels of the rows at positions 2 up to 4. The lower limit of the range is inclusive and the upper limit is exclusive. This means that the rows at positions 2 and 3 will be deleted and the row at position 4 will not be deleted.
  • inplace=True performs the drop operation in the same dataframe

After dropping the rows at positions 2 and 3, you'll have the below data in the dataframe:

   product_name  Unit_Price  No_Of_Units  Available_Quantity  Available_Since_Date
0  Keyboard      500.00      5            5                   11/5/2021
1  Mouse         200.00      5            6                   4/23/2021
4  CPU           10000.55    20           Not Available       09/18/2021

This is how you can drop a range of rows from the dataframe.
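Slicing df.index works on positions, while drop() works on labels. With the default index the two coincide, so the sketch below uses a made-up dataframe with labels starting at 10 (demo and its labels are assumptions for illustration) to make the difference visible.

```python
import pandas as pd

# Hypothetical dataframe whose row labels differ from the row positions
demo = pd.DataFrame(
    {"name": ["Keyboard", "Mouse", "Monitor", "CPU", "Speakers"]},
    index=[10, 11, 12, 13, 14],
)

# Positions 2 and 3 hold the labels 12 and 13
labels_to_drop = demo.index[2:4]

# drop() receives those labels and removes the matching rows
remaining = demo.drop(labels_to_drop)

print(list(labels_to_drop))
print(remaining)
```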

How to Drop All Rows after an Index in Pandas

You can drop all rows from a given position onwards by using iloc[].

iloc[] selects rows by their integer position rather than by their index label. You specify the start and end positions separated by a :, and the end position is exclusive. For example, you'd use 2:4 to select the rows at positions 2 and 3. If you want to select all the rows, you can just use : in iloc[].

This may be useful in cases where you want to split the dataset for training and testing purposes.

Use the below snippet to select the rows at positions 0 and 1. This results in dropping every row from position 2 onwards.

df = df.iloc[:2]

df

In this code, :2 selects the rows up to, but not including, position 2.

After dropping the rows from position 2 onwards, you'll have the below data in the dataframe:

   product_name  Unit_Price  No_Of_Units  Available_Quantity  Available_Since_Date
0  Keyboard      500.0       5            5                   11/5/2021
1  Mouse         200.0       5            6                   4/23/2021

This is how you can drop all rows from a specific position onwards.
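For the splitting use case mentioned above, a minimal sketch might look like the following. The dataframe demo and the cut-off split_at are made up for illustration, and a real training and testing split would usually shuffle the rows first.

```python
import pandas as pd

# Hypothetical dataframe used only for this illustration
demo = pd.DataFrame({"name": ["Keyboard", "Mouse", "Monitor", "CPU", "Speakers"]})

split_at = 3  # hypothetical cut-off position chosen for the example

head_part = demo.iloc[:split_at]   # rows at positions 0, 1 and 2
tail_part = demo.iloc[split_at:]   # rows at positions 3 and 4

print(head_part)
print(tail_part)
```

Because the end of a slice is exclusive and the start is inclusive, the two parts never overlap and together cover every row.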

Next, you'll learn how to drop rows with conditions.

How to Drop Rows with Multiple Conditions in Pandas

You can drop rows in the dataframe based on specific conditions.

For example, you can drop rows where the column value is greater than X and less than Y.

This may be useful in cases where you want to create a dataset that excludes rows with specific values.

To drop rows based on certain conditions, select the index of the rows which pass the specific condition and pass that index to the drop() method.

df.drop(df[(df['Unit_Price'] >400) & (df['Unit_Price'] < 600)].index, inplace=True)

df

In this code,

  • (df['Unit_Price'] >400) & (df['Unit_Price'] < 600) is the condition that selects the rows to drop.
  • df[...].index returns the index labels of the rows that satisfy the condition.
  • inplace=True performs the drop operation in the same dataframe rather than creating a new one.

After dropping the rows where Unit_Price is greater than 400 and less than 600, you'll have the below data in the dataframe:

   product_name  Unit_Price  No_Of_Units  Available_Quantity  Available_Since_Date
1  Mouse         200.0       5            6                   4/23/2021

This is how you can drop rows in the dataframe using certain conditions.
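An alternative that avoids drop() altogether is to keep the rows that do not match the condition, using a negated boolean mask. The sketch below is one possible way to do this on a made-up dataframe (demo and its values are assumptions for illustration), and it checks that both approaches give the same rows.

```python
import pandas as pd

# Hypothetical dataframe, including one row with a missing price
demo = pd.DataFrame(
    {
        "name": ["Keyboard", "Mouse", "Monitor", "Unknown"],
        "Unit_Price": [500.0, 200.0, 5000.235, None],
    }
)

# True for rows whose price is strictly between 400 and 600
in_range = (demo["Unit_Price"] > 400) & (demo["Unit_Price"] < 600)

# Keep every row that does NOT match the condition
filtered = demo[~in_range]

# The drop()-based approach from this section, without inplace=True
dropped = demo.drop(demo[in_range].index)

print(filtered)
```

Comparisons against a missing value evaluate to False, so the row with the missing price does not match the condition and is kept by both approaches.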

Conclusion

To summarize, in this article you've learnt what the drop() method is in a Pandas dataframe. You've also seen how dataframe rows and columns are labelled. And finally you've learnt how to drop rows by index label, by a range of positions, and by condition.

If you liked this article, feel free to share it.
