Want to select rows from a pandas DataFrame by column value in Python, and wondering whether to use == or isin? The following example shows the difference.
The == operator and isin are different. A comparison such as
df1[df1['column_x'] == 'some_value']
is fine if you’re just looking for a single value. The advantage of isin is that you can pass it several values at once. For example:
df1.loc[df1['column_x'].isin(['some_value', 'another_value'])]
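To make the two filters concrete, here is a minimal sketch on a small made-up DataFrame. The sample rows and the extra column n are assumptions added for illustration; only column_x and the two values come from the snippets above.

```python
import pandas as pd

# Hypothetical sample data; only column_x and its two values come from the text
df1 = pd.DataFrame({
    "column_x": ["some_value", "another_value", "other", "some_value"],
    "n": [1, 2, 3, 4],
})

# Single value: == builds a boolean mask that is True where the cell matches
single = df1[df1["column_x"] == "some_value"]

# Several values: isin is True where the cell equals any value in the list
multiple = df1.loc[df1["column_x"].isin(["some_value", "another_value"])]

print(single["n"].tolist())    # [1, 4]
print(multiple["n"].tolist())  # [1, 2, 4]
```

Passing a one-element list to isin selects the same rows as ==, so the choice only matters once you need to match more than one value.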
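Interestingly, in the timing below the first method (using ==) is about twice as slow as the second (using isin), at roughly 1.7 ms versus 0.85 ms per call. Results like this depend on the pandas version, the column's dtype, and the size of the data, so it is worth measuring on your own data: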
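```python
import timeit

import numpy as np
import pandas as pd

# 10,000 rows drawn at random from three string labels
df = pd.DataFrame({'x': np.random.choice(['a', 'b', 'c'], 10000)})

def method1(df=df):
    # Filter with a boolean mask built by ==
    return df[df['x'] == 'b']

def method2(df=df):
    # Filter with a boolean mask built by isin
    return df[df['x'].isin(['b'])]

# Average seconds per call over 1000 runs
print(timeit.timeit(method1, number=1000) / 1000)  # 0.001710233046906069 in the original run
print(timeit.timeit(method2, number=1000) / 1000)  # 0.0008507879299577325 in the original run
```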
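A speed comparison only means something if both methods return the same rows. The sketch below checks that for the single-value case. It rebuilds a similar DataFrame with a seeded generator, which is an assumption made for reproducibility and not part of the original benchmark.

```python
import numpy as np
import pandas as pd

# Seeded generator so the check is reproducible (the seed is an arbitrary choice)
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.choice(["a", "b", "c"], 10000)})

by_eq = df[df["x"] == "b"]
by_isin = df[df["x"].isin(["b"])]

# Both masks keep the same rows, with the same index, in the same order
same = by_eq.equals(by_isin)
print(same)  # True
```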