Machine Learning Made Easy

Sunday, May 1, 2022

Convert column(s) into numeric in Pandas

Converting column(s) into Numeric Data Type in Pandas

Sometimes, when importing a dataset from a CSV or Excel file, a numeric column is read as a string. In that case, we need to convert those columns to a numeric data type. In this blog, we will look at how to do that.
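To see how this situation can arise, here is a minimal sketch. The CSV text and the "missing" entry are made up for illustration and are not part of the dataset used in this post; a single non-numeric entry is enough for pandas to stop treating the whole column as numbers.

```python
import io
import pandas as pd

# Hypothetical CSV content; the "missing" entry is an assumption for illustration.
csv_text = "Value1,Value2\n1,4\n2,5\n3,missing\n"
raw = pd.read_csv(io.StringIO(csv_text))

# Value1 is parsed as a numeric column. Value2 is not, because one of
# its entries cannot be parsed as a number.
print(raw.dtypes)
print(pd.api.types.is_numeric_dtype(raw["Value1"]))
print(pd.api.types.is_numeric_dtype(raw["Value2"]))
```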

Step 1: Importing Libraries

In [2]:

import pandas as pd
import numpy as np

Step 2: Creating the dataset

In [14]:

# Build both columns as strings; NumPy casts np.nan to the string "nan" here
df = pd.DataFrame([np.array(["1","2","3"]),
                np.array(["4","5",np.nan])]).T
df.columns=['Value1','Value2']
df

Out[14]:

	Value1	Value2
0	1	4
1	2	5
2	3	nan

Step 3: Checking data type of the columns

In [15]:

df.dtypes

Out[15]:

Value1    object
Value2    object
dtype: object

We can see that even though the Value1 and Value2 columns contain numbers, they are stored as strings. If we want to add these two columns, we first need to convert them to a numeric data type.
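A quick sketch of why the conversion matters: with string columns, the `+` operator joins the text instead of adding the numbers. The small DataFrame below is a made-up example, separate from the one above.

```python
import pandas as pd

# Hypothetical string-valued columns, for illustration only.
text_df = pd.DataFrame({"Value1": ["1", "2", "3"], "Value2": ["4", "5", "6"]})

# "+" on string columns concatenates the text rather than summing the values.
concatenated = text_df["Value1"] + text_df["Value2"]
print(concatenated.tolist())  # ['14', '25', '36']
```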

Step 4: Converting into Numeric data type

In [18]:

cols=['Value1','Value2']
# errors='coerce' turns any value that cannot be parsed into NaN instead of raising an error
df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')
df

Out[18]:

	Value1	Value2
0	1	4.0
1	2	5.0
2	3	NaN

In [19]:

df.dtypes

Out[19]:

Value1      int64
Value2    float64
dtype: object
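Value2 ends up as float64 because NaN is a floating-point value, so a column containing it cannot stay int64. The sketch below uses a made-up Series to show what the `errors` argument does, and one optional way to keep whole numbers next to missing values with the nullable `Int64` dtype. That last step is an extra suggestion and not part of the steps above.

```python
import pandas as pd

s = pd.Series(["1", "2", "abc"])  # hypothetical sample data

# Default behaviour (errors='raise'): an unparseable value raises ValueError.
try:
    pd.to_numeric(s)
    raised = False
except ValueError:
    raised = True

# errors='coerce' replaces unparseable values with NaN, so the result is float64.
coerced = pd.to_numeric(s, errors="coerce")

# Optional: the nullable integer dtype stores whole numbers alongside missing values.
nullable = coerced.astype("Int64")
print(raised, coerced.dtype, nullable.dtype)
```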

Let's add the two columns now

In [22]:

def func_1(a,b):
    return a + b

In [27]:

df['Value3']=df.apply(lambda x: func_1(x.Value1, x.Value2), axis=1)
df

Out[27]:

	Value1	Value2	Value3
0	1	4.0	5.0
1	2	5.0	7.0
2	3	NaN	NaN
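Once the columns are numeric, the same sum can also be written as a plain column operation, without a row-wise `apply`. This is a sketch of an alternative, not a replacement for the approach above. The DataFrame is rebuilt here so the snippet runs on its own, and the `Value3_filled` column name is made up for illustration.

```python
import numpy as np
import pandas as pd

# Same values as the converted DataFrame above, rebuilt so the snippet is self-contained.
df = pd.DataFrame({"Value1": [1, 2, 3], "Value2": [4.0, 5.0, np.nan]})

# Column-wise addition; NaN propagates just as in the row-wise version.
df["Value3"] = df["Value1"] + df["Value2"]

# Optional variant: treat a missing value as 0 instead of propagating NaN.
df["Value3_filled"] = df["Value1"].add(df["Value2"], fill_value=0)
print(df)
```

For larger datasets, column-wise operations like these are usually faster than calling a Python function on every row.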
