Find out dataframe index value under a certain condition
The simplest form is df.index[condition], for example: df.index[df['column1'] == True].tolist(). The .tolist() call converts the resulting Index into a plain Python list, which is handy when several rows match the condition.
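As a minimal sketch of the lookup described above, assuming a small made-up dataframe (the data and the index labels "a", "b", "c" are hypothetical, only the column name comes from the tip):

```python
import pandas as pd

# Hypothetical sample data; only the column name "column1" comes from the tip.
df = pd.DataFrame({"column1": [True, False, True]}, index=["a", "b", "c"])

# The boolean mask keeps the index labels of rows where the condition holds.
matching = df.index[df["column1"] == True].tolist()
print(matching)
```

If only one row can match, indexing the list with matching[0] gives the single label.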
.item() will do the job: p1 = df[df["column2"] == df.column2.max()].column1.item() followed by print(p1). This extracts the actual value from the pandas dataframe so you can store it in a variable for later use. Note that .item() raises a ValueError unless exactly one row matches.
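A short sketch of the same idea on made-up data (the values are assumptions chosen so that exactly one row holds the maximum):

```python
import pandas as pd

# Hypothetical data: column2 has a single, unambiguous maximum (7).
df = pd.DataFrame({"column1": ["x", "y", "z"], "column2": [3, 7, 5]})

# Filter to the row where column2 is at its maximum, then unwrap the
# one-element Series into a plain Python value.
p1 = df[df["column2"] == df["column2"].max()]["column1"].item()
print(p1)
```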
Here we calculate the 0.9 quantile (the 90th percentile) of each column in our dataframe:
q = 0.9
for column in df:
    qr = df[column].quantile(q)
    print(f"{q*100}% are lower than {qr}")
Here's a good example to understand quantiles.
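For a concrete run, the sketch below applies the loop to a small hypothetical dataframe and collects the results in a dictionary (the dictionary and the sample values are assumptions added for illustration). It relies on the default linear interpolation of Series.quantile.

```python
import pandas as pd

# Hypothetical numeric data.
df = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [10, 20, 30, 40, 50]})

q = 0.9
quantiles = {}
for column in df:  # iterating over a DataFrame yields its column names
    quantiles[column] = df[column].quantile(q)
    print(f"{q:.0%} of the values in {column} are at or below {quantiles[column]}")
```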
Make sure you convert the column to datetime before calling dt.strftime(): df['NewDateTime'] = pd.to_datetime(df.DateTime).dt.strftime("%d/%m/%Y %H:%M")
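A minimal sketch of the conversion, assuming the DateTime column starts out as ISO-formatted strings (the sample timestamps are made up):

```python
import pandas as pd

# Hypothetical input: timestamps stored as plain strings.
df = pd.DataFrame({"DateTime": ["2021-03-05 14:30:00", "2021-12-25 08:05:00"]})

# Parse the strings to datetime64 first, then format them back to text.
df["NewDateTime"] = pd.to_datetime(df["DateTime"]).dt.strftime("%d/%m/%Y %H:%M")
print(df)
```

The result column holds strings again, so keep the parsed datetime column around if you still need date arithmetic.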
Unfortunately, you can't use a ternary expression such as a if x > y else b on whole dataframe columns. Use numpy.where instead: df['result'] = np.where(df['col1'] > df['col2'], 1, 0). There you go. It's also much faster than evaluating the condition row by row.
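The element-wise choice can be sketched like this, assuming two numeric columns with made-up values:

```python
import numpy as np
import pandas as pd

# Hypothetical data; the column names follow the tip.
df = pd.DataFrame({"col1": [5, 2, 9], "col2": [3, 4, 9]})

# np.where evaluates the condition for every row at once:
# 1 where col1 > col2, otherwise 0.
df["result"] = np.where(df["col1"] > df["col2"], 1, 0)
print(df)
```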
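How to concatenate dataframes and give the columns new names? Here, look, passing one key per dataframe: result = pd.concat([df1, df2], keys=['x', 'y'], axis=1). With axis=1, each key becomes the top level of the resulting column MultiIndex.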
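A small sketch of side-by-side concatenation with keys, assuming two hypothetical frames that share a column name:

```python
import pandas as pd

# Hypothetical frames with the same column name "a".
df1 = pd.DataFrame({"a": [1, 2]})
df2 = pd.DataFrame({"a": [3, 4]})

# One key per frame; with axis=1 the keys label the column groups.
combined = pd.concat([df1, df2], keys=["x", "y"], axis=1)
print(combined.columns.tolist())
```

Each original column is then reachable through its key, for example combined[("y", "a")].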
import pandas as pd
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
print(df.columns.get_loc("column_name"))
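As an illustration, the sketch below looks up the zero-based position of a column in a hypothetical three-column dataframe (the other column names are made up):

```python
import pandas as pd

# Hypothetical frame; "column_name" is the second column.
df = pd.DataFrame({"first": [1], "column_name": [2], "last": [3]})

# get_loc returns the integer position of a uniquely named column.
position = df.columns.get_loc("column_name")
print(position)
```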
Use double square brackets around 'Number' so that the result is a dataframe rather than a series, i.e.: df.groupby(['Name', 'Fruit'])[['Number']].agg('sum')
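To see the difference the brackets make, here is a sketch on made-up data that runs the aggregation both ways (the names as_frame and as_series are introduced only for this illustration):

```python
import pandas as pd

# Hypothetical data using the column names from the tip.
df = pd.DataFrame({
    "Name": ["Ann", "Ann", "Bob"],
    "Fruit": ["apple", "apple", "pear"],
    "Number": [1, 2, 5],
})

# Double brackets select a list of columns, so the result stays a DataFrame.
as_frame = df.groupby(["Name", "Fruit"])[["Number"]].agg("sum")

# Single brackets select one column, so the result is a Series.
as_series = df.groupby(["Name", "Fruit"])["Number"].agg("sum")

print(type(as_frame).__name__, type(as_series).__name__)
```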
When getting this error, instead of dff["New Time"] = dff["Old Time"].strftime("%d/%m/%Y %H:%M") we should add ".dt", the accessor that exposes the datetime-like properties and methods of a series: dff["New Time"] = dff["Old Time"].dt.strftime("%d/%m/%Y %H:%M")
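The sketch below reproduces the failing call and the fix on a hypothetical datetime column (the sample timestamps and the raised flag are assumptions added for illustration):

```python
import pandas as pd

# Hypothetical frame whose "Old Time" column is already datetime64.
dff = pd.DataFrame(
    {"Old Time": pd.to_datetime(["2021-03-05 14:30", "2021-12-25 08:05"])}
)

# Calling strftime directly on the Series fails: the method lives on .dt.
try:
    dff["Old Time"].strftime("%d/%m/%Y %H:%M")
    raised = False
except AttributeError:
    raised = True

# Going through the .dt accessor formats every element.
dff["New Time"] = dff["Old Time"].dt.strftime("%d/%m/%Y %H:%M")
print(raised, dff["New Time"].tolist())
```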