Wednesday, May 11, 2022

Using loc to slice and dice a dataset

Using loc in Python

Slicing and assignment in a pandas data frame can be done easily using .loc. Moreover, a single .loc call selects rows and columns in one indexing step rather than through chained indexing, which is generally faster and becomes important when dealing with huge datasets. In this blog, we will look at how to use .loc to slice and dice data.

In [1]:

import pandas as pd
import numpy as np

Step 1: Creating the data frame

In [50]:

df = pd.DataFrame([np.array(['A', 'B', 'C']),
                   np.array([1, 2, 3]),
                   np.array([4, 5, np.nan])]).T
df.columns = ['Label', 'Value1', 'Value2']
df

Out[50]:

	Label	Value1	Value2
0	A	1	4.0
1	B	2	5.0
2	C	3	NaN

In [52]:

df.dtypes

Out[52]:

Label     object
Value1    object
Value2    object
dtype: object
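
Every column comes out as object because the frame is built row by row from arrays of different types and then transposed, so no column gets a numeric dtype of its own. As a minimal sketch of an alternative, assuming a column-wise construction is acceptable here (the name df_alt is hypothetical and not part of the original notebook), a dict of columns lets pandas infer a dtype per column:

```python
import numpy as np
import pandas as pd

# Hypothetical alternative: build the same data column by column,
# so each column gets its own inferred dtype.
df_alt = pd.DataFrame({
    'Label': ['A', 'B', 'C'],
    'Value1': [1, 2, 3],
    'Value2': [4, 5, np.nan],
})
print(df_alt.dtypes)
```

With this construction the numeric columns should not need the conversion step that follows.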

In [54]:

# Converting Value1 and Value2 into numeric
df[['Value1','Value2']] = df[['Value1','Value2']].apply(pd.to_numeric,
                                                        errors="coerce")

In [56]:

df.dtypes

Out[56]:

Label      object
Value1      int64
Value2    float64
dtype: object
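
The errors="coerce" argument controls what happens to values that cannot be parsed as numbers. The sketch below uses a small made-up series (raw is a hypothetical name, and the 'x' entry is invented for illustration) to show that such values become NaN instead of raising an error:

```python
import pandas as pd

# Hypothetical input: two numeric strings and one value that cannot be parsed
raw = pd.Series(['1', '2', 'x'], dtype=object)

# errors='coerce' replaces unparseable entries with NaN
converted = pd.to_numeric(raw, errors='coerce')
print(converted)
```

Because NaN is a floating-point value, a column containing coerced entries ends up as float64, which is also why Value2 above is float64 while Value1 is int64.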

Using loc to filter on certain rows

In [34]:

df.loc[1:2]

Out[34]:

	Label	Value1	Value2
1	B	2	5.0
2	C	3	NaN
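
Note that df.loc[1:2] returned both rows 1 and 2. A .loc slice works on index labels and includes the end label, whereas .iloc works on integer positions and excludes the end position. The sketch below illustrates the difference on a small frame with string labels (demo and its index values are made up for illustration):

```python
import pandas as pd

# Hypothetical frame with string labels to make the label/position split visible
demo = pd.DataFrame({'Value1': [1, 2, 3]}, index=['x', 'y', 'z'])

by_label = demo.loc['y':'z']   # label slice: end label 'z' is included
by_position = demo.iloc[1:2]   # position slice: end position 2 is excluded

print(by_label)
print(by_position)
```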

Using loc to filter on certain columns

In [35]:

df.loc[:,['Value1','Value2']]

Out[35]:

	Value1	Value2
0	1	4.0
1	2	5.0
2	3	NaN

In [36]:

df.loc[:,['Value1']]

Out[36]:

	Value1
0	1
1	2
2	3
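
Passing a list with a single column name, as above, keeps the result as a data frame. Passing the bare column name returns a Series instead. A minimal sketch on a stand-in frame (demo is a hypothetical copy of the data used in this post):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame used in this post
demo = pd.DataFrame({'Value1': [1, 2, 3], 'Value2': [4.0, 5.0, np.nan]})

as_frame = demo.loc[:, ['Value1']]   # list of names -> DataFrame with one column
as_series = demo.loc[:, 'Value1']    # single name   -> Series

print(type(as_frame).__name__, as_frame.shape)
print(type(as_series).__name__, as_series.shape)
```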

Using loc to filter on certain rows and columns

In [38]:

df.loc[0:1 ,['Label','Value2']]

Out[38]:

	Label	Value2
0	A	4.0
1	B	5.0
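
The row selector does not have to be a label slice. A boolean mask can be used in the same position, which is often what is needed in practice. The sketch below, on a stand-in frame (demo and mask are hypothetical names), keeps only the rows where Value2 is present:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame used in this post
demo = pd.DataFrame({
    'Label': ['A', 'B', 'C'],
    'Value1': [1, 2, 3],
    'Value2': [4.0, 5.0, np.nan],
})

# Boolean mask for the rows, list of names for the columns
mask = demo['Value2'].notna()
subset = demo.loc[mask, ['Label', 'Value2']]
print(subset)
```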

Using loc to assign values to rows and columns

In [39]:

df

Out[39]:

	Label	Value1	Value2
0	A	1	4.0
1	B	2	5.0
2	C	3	NaN

In [60]:

df.loc[df['Value1'] > 1 , ['Value1','Value2']] = 100
df

Out[60]:

	Label	Value1	Value2
0	A	1	4.0
1	B	100	100.0
2	C	100	100.0
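
In the assignment above the mask is computed on Value1, so the missing Value2 in row 2 is overwritten as well. The same pattern can target a single column and a different condition. As a sketch on a stand-in frame (demo is a hypothetical name, and the fill value 0.0 is chosen only for illustration), this replaces just the missing entries of Value2:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame used in this post
demo = pd.DataFrame({
    'Label': ['A', 'B', 'C'],
    'Value1': [1, 2, 3],
    'Value2': [4.0, 5.0, np.nan],
})

# Select rows and column in one .loc call, then assign.
# A chained form such as demo[mask]['Value2'] = 0.0 may write to a
# temporary copy and leave demo unchanged.
demo.loc[demo['Value2'].isna(), 'Value2'] = 0.0
print(demo)
```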

Machine Learning Made Easy