Pandas - Texxl

Find out dataframe index value under a certain condition

February 3, 2022 by admin

The simplest form is df.index[condition], for example: df.index[df['column1'] == True].tolist(). The trailing .tolist() converts the resulting Index into a plain Python list, which is useful when several rows match the condition.
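A minimal sketch of the lookup above, using a made-up dataframe (the index labels and values are assumptions chosen only for illustration):

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the snippet above.
df = pd.DataFrame(
    {"column1": [True, False, True]},
    index=["a", "b", "c"],
)

# The boolean mask keeps only the index labels of rows that satisfy the condition.
matching = df.index[df["column1"] == True].tolist()
print(matching)  # ['a', 'c']
```

For a column that already holds booleans, df.index[df["column1"]] should give the same result without the explicit comparison.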

Get actual value from pandas dataframe instead of object with index

February 2, 2022 by admin

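.item() will do the job:

    p1 = df[df["column2"] == df["column2"].max()]["column1"].item()
    print(p1)

This extracts the actual value from the pandas dataframe so you can store it in a variable for later use. Note that .item() only works when exactly one row matches; otherwise it raises a ValueError.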
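Here is a small runnable sketch of the same idea. The data is invented for illustration, and the idxmax variant at the end is an alternative not mentioned in the post, shown only as one way to cope with ties:

```python
import pandas as pd

# Hypothetical sample data with a single maximum in column2.
df = pd.DataFrame({"column1": ["x", "y", "z"], "column2": [10, 30, 20]})

# Filtering returns a one-element Series; .item() unwraps it to a plain value.
selected = df[df["column2"] == df["column2"].max()]["column1"]
p1 = selected.item()
print(p1)  # y

# Alternative (assumption, not from the post): idxmax returns the label of the
# first maximum, so .loc yields a scalar even when the maximum is tied.
p1_alt = df.loc[df["column2"].idxmax(), "column1"]
```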

How to calculate percentile (quantile) for each column in pandas dataframe

December 7, 2021 by admin

Here we calculate the 0.9 quantile (the 90th percentile) of each numeric column in our dataframe:

    q = 0.9
    for column in df:
        qr = df[column].quantile(q)
        print(f"{q*100}% of values are at or below {qr}")
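To make the quantile concrete, here is a sketch on invented numbers. It assumes every column is numeric and relies on the default linear interpolation of Series.quantile:

```python
import pandas as pd

# Hypothetical numeric data; both columns have five evenly spaced values.
df = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [10, 20, 30, 40, 50]})

q = 0.9
# Iterating over a dataframe yields its column names.
result = {column: df[column].quantile(q) for column in df}

for column, qr in result.items():
    print(f"{column}: {q*100}% of values are at or below {qr}")
```

With five values, the 0.9 quantile falls between the fourth and fifth value, so column a gives roughly 4.6 and column b roughly 46.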

AttributeError: Can only use .dt accessor with datetimelike values

November 30, 2021 by admin

Make sure you convert the column to datetime before calling .dt.strftime(): df['NewDateTime'] = pd.to_datetime(df.DateTime).dt.strftime("%d/%m/%Y %H:%M")
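The sketch below reproduces the error on a column of plain strings and then applies the fix. The sample timestamps are assumptions for illustration:

```python
import pandas as pd

# Hypothetical data: the timestamps are stored as strings, not datetimes.
df = pd.DataFrame({"DateTime": ["2021-11-30 14:05:00", "2021-12-01 09:30:00"]})

# Calling .dt on a non-datetime column triggers the AttributeError.
raised = False
try:
    df["DateTime"].dt.strftime("%d/%m/%Y %H:%M")
except AttributeError as err:
    raised = True
    print(err)

# Converting first makes the .dt accessor available.
df["NewDateTime"] = pd.to_datetime(df["DateTime"]).dt.strftime("%d/%m/%Y %H:%M")
print(df["NewDateTime"].tolist())
```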

Ternary operator on pandas dataframe

November 19, 2021 by admin

Unfortunately, you can't use a ternary operator such as a if x > y else b on whole pandas dataframe columns. You can use numpy.where instead: df['result'] = np.where(df['col1'] > df['col2'], 1, 0) There you go. It's also vectorized, so it is typically much faster than applying a function row by row.
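A short sketch of why the plain ternary fails and how numpy.where replaces it. The column values are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"col1": [5, 2, 7], "col2": [3, 4, 7]})

# A Python ternary needs a single True/False, but the comparison yields a
# whole Series, so pandas raises ValueError (ambiguous truth value).
ternary_failed = False
try:
    df["result"] = 1 if (df["col1"] > df["col2"]) else 0
except ValueError:
    ternary_failed = True

# np.where evaluates the condition element by element.
df["result"] = np.where(df["col1"] > df["col2"], 1, 0)
print(df["result"].tolist())  # [1, 0, 0]
```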

Specify new column names when concatenating pandas dataframes

November 19, 2021 by admin

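How do you concatenate dataframes and label the columns that came from each one? Pass one key per dataframe: df3 = pd.concat([df1, df2], keys=['x', 'y'], axis=1). With axis=1 the keys become the outer level of the column index; they do not rename the original columns.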
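The following sketch shows what the keys produce in practice. The dataframes are invented, and the flattening step at the end is an assumption about what a reader might want next, not something the post describes:

```python
import pandas as pd

# Hypothetical inputs that share the same column name.
df1 = pd.DataFrame({"value": [1, 2]})
df2 = pd.DataFrame({"value": [3, 4]})

# One key per dataframe; the keys become the outer column level.
combined = pd.concat([df1, df2], keys=["x", "y"], axis=1)
print(combined.columns.tolist())  # [('x', 'value'), ('y', 'value')]

# Optional: collapse the two levels into flat names such as "x_value".
flat = combined.copy()
flat.columns = [f"{key}_{name}" for key, name in combined.columns]
print(flat.columns.tolist())  # ['x_value', 'y_value']
```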

Pandas set options to display all columns and rows

November 19, 2021 by admin

import pandas as pd
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
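The options above stay in effect for the whole session. If the wider display is only needed for a single print, a sketch using pd.option_context (a variation not covered in the post) could look like this:

```python
import pandas as pd

# Hypothetical wide-ish frame, just to have something to print.
df = pd.DataFrame({f"col{i}": range(3) for i in range(30)})

before = pd.get_option("display.max_rows")

# The options apply only inside the with-block and are restored afterwards.
with pd.option_context(
    "display.max_rows", 500,
    "display.max_columns", 500,
    "display.width", 1000,
):
    inside = pd.get_option("display.max_rows")
    print(df)

after = pd.get_option("display.max_rows")
```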

How to get column index number by column name in pandas

November 18, 2021 by admin

print(df.columns.get_loc("column_name"))
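As a quick illustration, the sketch below looks up a column position and reuses it with iloc. The dataframe and the other column names are assumptions; the lookup returns a plain integer only when the column name is unique:

```python
import pandas as pd

# Hypothetical frame with three uniquely named columns.
df = pd.DataFrame({"first": [1], "column_name": [2], "last": [3]})

# Zero-based position of the column.
position = df.columns.get_loc("column_name")
print(position)  # 1

# The position can then be used for positional selection.
values = df.iloc[:, position].tolist()
```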

Return dataframe object from pandas groupby() instead of series or groupby object

November 15, 2021 by admin

Use double square brackets around 'Number', i.e.: df.groupby(['Name', 'Fruit'])[['Number']].agg('sum'). Selecting with a list of columns returns a dataframe, whereas single brackets return a series.
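To see the difference side by side, here is a sketch with invented rows; only the column names come from the snippet above:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "Name": ["Ann", "Ann", "Bob"],
    "Fruit": ["apple", "apple", "pear"],
    "Number": [1, 2, 5],
})

# Single brackets select one column, so the aggregate is a Series.
as_series = df.groupby(["Name", "Fruit"])["Number"].agg("sum")

# Double brackets select a list of columns, so the aggregate is a DataFrame.
as_frame = df.groupby(["Name", "Fruit"])[["Number"]].agg("sum")

print(type(as_series).__name__)  # Series
print(type(as_frame).__name__)   # DataFrame
```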

AttributeError: ‘Series’ object has no attribute ‘strftime’

November 3, 2021 by admin

This error appears when strftime is called directly on a series, as in dff["New Time"] = dff["Old Time"].strftime("%d/%m/%Y %H:%M"). Add .dt, the accessor that exposes datetime properties and methods for the values of a datetimelike series: dff["New Time"] = dff["Old Time"].dt.strftime("%d/%m/%Y %H:%M")
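A compact sketch of the failing call and the fix, assuming the column already holds datetimes (the timestamps are invented):

```python
import pandas as pd

# Hypothetical data: "Old Time" is already a datetime column.
dff = pd.DataFrame(
    {"Old Time": pd.to_datetime(["2021-11-03 08:15", "2021-11-03 17:40"])}
)

# A Series has no strftime method of its own.
raised = False
try:
    dff["Old Time"].strftime("%d/%m/%Y %H:%M")
except AttributeError as err:
    raised = True
    print(err)

# Going through the .dt accessor formats every value in the column.
dff["New Time"] = dff["Old Time"].dt.strftime("%d/%m/%Y %H:%M")
print(dff["New Time"].tolist())
```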