Find out dataframe index value under a certain condition
The simple code would be df.index[condition], so something like this:
    df.index[df['column1'] == True].tolist()
.tolist() converts the result to a plain Python list, which is useful when multiple values match the condition.
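Here is a minimal sketch of the same idea on a tiny made-up dataframe. The data and the index labels 'a', 'b', 'c' are my own assumptions, only 'column1' comes from the snippet above.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'column1' is taken from the post.
df = pd.DataFrame({"column1": [True, False, True]}, index=["a", "b", "c"])

# The boolean mask keeps the index labels of the rows where the condition holds.
matching = df.index[df["column1"] == True].tolist()
print(matching)  # ['a', 'c']
```

If only one row can ever match, matching[0] gives you the single label.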
.item() will do the job:
    p1 = df[df["column2"] == df.column2.max()].column1.item()
    print(p1)
This way you can extract the actual value from a pandas dataframe and store it in a variable for later use. Note that .item() raises a ValueError unless exactly one row matches.
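To make this concrete, here is a small sketch with invented data (the values are assumptions, the column names follow the snippet above). It also shows idxmax() as a possible alternative that picks the first maximum if there are ties, which is my own addition and not part of the original tip.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"column1": ["x", "y", "z"], "column2": [3, 9, 5]})

# Exactly one row holds the maximum of column2, so .item() returns a plain scalar.
p1 = df[df["column2"] == df["column2"].max()]["column1"].item()
print(p1)  # y

# Alternative: idxmax() returns the label of the first maximum, so ties do not raise.
p2 = df.loc[df["column2"].idxmax(), "column1"]
print(p2)  # y
```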
Here we calculate the 0.9 quantile (the 90th percentile) of each column in our dataframe:
    q = 0.9
    for column in df:
        qr = df[column].quantile(q)
        print(f"{q*100}% are lower than {qr}")
Here's a good example to understand quantiles.
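A runnable sketch of the quantile loop, assuming a small numeric dataframe that I made up for the purpose. It also shows that DataFrame.quantile can compute all columns in one call.

```python
import pandas as pd

# Hypothetical numeric data; column names 'a' and 'b' are placeholders.
df = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [10, 20, 30, 40, 50]})
q = 0.9

# Column by column, as in the post.
per_column = {}
for column in df:
    per_column[column] = df[column].quantile(q)
    print(f"{q*100}% are lower than {per_column[column]}")

# All columns at once: returns a Series indexed by column name.
all_at_once = df.quantile(q)
print(all_at_once)
```

By default pandas interpolates linearly between neighbouring values, so the result does not have to be a value that actually occurs in the column.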
Unfortunately you can't use the ternary operator (a if x > y else b) on pandas dataframe logic. With that said, you can use numpy.where instead:
    df['result'] = np.where(df['col1'] > df['col2'], 1, 0)
There you go. It's also much faster than applying the condition row by row.
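A short sketch showing both sides, on invented data: the plain ternary fails because a whole Series has no single truth value, while numpy.where evaluates the condition element by element.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; 'col1' and 'col2' follow the names in the post.
df = pd.DataFrame({"col1": [5, 2, 7], "col2": [3, 4, 7]})

# The Python ternary needs one bool, but the comparison yields a Series.
ternary_failed = False
try:
    result = 1 if (df["col1"] > df["col2"]) else 0
except ValueError:
    ternary_failed = True

# numpy.where picks 1 or 0 for every row separately.
df["result"] = np.where(df["col1"] > df["col2"], 1, 0)
print(df["result"].tolist())  # [1, 0, 0]
```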
import pandas as pd
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
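If you only want the wider display for one printout, something like the following might be handy. It is a sketch using pd.get_option, pd.option_context and pd.reset_option, which are not in the snippet above.

```python
import pandas as pd

# Remember the current setting so we can compare after the reset.
default_rows = pd.get_option("display.max_rows")

# Global change, as in the post.
pd.set_option("display.max_rows", 500)
print(pd.get_option("display.max_rows"))  # 500

# Temporary change: the options revert when the with-block ends.
with pd.option_context("display.max_columns", 500, "display.width", 1000):
    width_inside = pd.get_option("display.width")

# Put the global option back to its default.
pd.reset_option("display.max_rows")
rows_after_reset = pd.get_option("display.max_rows")
```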
print(df.columns.get_loc("column_name"))
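As a quick sketch on a made-up dataframe (the columns 'a' and 'c' are placeholders of mine), get_loc returns the zero-based position of the column, which you can then feed to iloc:

```python
import pandas as pd

# Hypothetical dataframe; only 'column_name' comes from the snippet above.
df = pd.DataFrame({"a": [1], "column_name": [2], "c": [3]})

# Zero-based position of the column among df.columns.
pos = df.columns.get_loc("column_name")
print(pos)  # 1

# Use the position for positional selection.
values = df.iloc[:, pos].tolist()
print(values)  # [2]
```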
In my case the .csv file had date and time stored as strings in the following format: 30/07/2017 21:01:17 I find this to be the simplest way to convert this column to float:
    df['dateTime'] = df['dateTime'].str.replace('/', '')
    df['dateTime'] = df['dateTime'].str.replace(':', '')
    df['dateTime'] = df['dateTime'].str.replace(' ', '')
    df['dateTime'] = df['dateTime'].astype(float)
So we are removing any signs and …
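For comparison, here is a sketch of a different route, not the one from the post: parse the strings with pd.to_datetime and convert them to seconds since 1970-01-01. Unlike the stripped day-first digits, these floats keep the chronological order. The sample rows and the 'dateTime_float' column name are my own assumptions.

```python
import pandas as pd

# Hypothetical rows in the day/month/year format described above.
df = pd.DataFrame({"dateTime": ["30/07/2017 21:01:17", "01/08/2017 09:15:00"]})

# Parse with an explicit format so the day is not mistaken for the month.
parsed = pd.to_datetime(df["dateTime"], format="%d/%m/%Y %H:%M:%S")

# Seconds elapsed since 1970-01-01 as float64.
df["dateTime_float"] = (parsed - pd.Timestamp("1970-01-01")) / pd.Timedelta(seconds=1)
print(df)
```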
Hello, I'm back! 😎 Now the traditional method is this:
    df.loc[df['column_name'] == value]
for numeric values, and this for string values:
    df.loc[df['column_name'] == 'string']
While it always works with numeric values, for string values it sometimes doesn't: it returns an empty dataframe. I guess it's something to do with the encoding of the source where …
Hey guys. Being pretty average at Pandas, yesterday I stumbled upon a formatting challenge. I download some datasheets from the web for machine learning from time to time. This time I got some weird time and date formatting, which might've been fine for regular use in Excel but is unsuitable when it comes to neural networks: …