Q1: Convert a NumPy array to a DataFrame of a given shape
In [1]:
import numpy as np
import pandas as pd
In [17]:
arr1=np.random.rand(35)
arr1
Out[17]:
array([0.93396756, 0.86539523, 0.2219763 , 0.02386981, 0.52899722,
       0.48128182, 0.36097828, 0.32840324, 0.65400887, 0.75653592,
       0.47091215, 0.79927516, 0.71212087, 0.39202379, 0.18226057,
       0.27436685, 0.09453414, 0.26264447, 0.50901002, 0.76229193,
       0.76573485, 0.82551178, 0.16686845, 0.58544734, 0.55156004,
       0.72466356, 0.42176526, 0.4213822 , 0.99291738, 0.77693886,
       0.13278228, 0.73468001, 0.36669957, 0.5242279 , 0.69553251])
In [18]:
arr1.shape
Out[18]:
(35,)
The array is one-dimensional with 35 elements. It could represent daily temperature readings for 5 weeks. Let's say we want to convert it into a DataFrame with 7 rows and 5 columns, where each column holds the data collected in one week and each row represents a day of the week.
In [22]:
df=pd.DataFrame(arr1)
df.columns=['Temperature_Reading']
df.head()
Out[22]:
| | Temperature_Reading |
---|---|
0 | 0.933968 |
1 | 0.865395 |
2 | 0.221976 |
3 | 0.023870 |
4 | 0.528997 |
In [23]:
arr2=np.reshape(arr1,(7,5))
arr2
Out[23]:
array([[0.93396756, 0.86539523, 0.2219763 , 0.02386981, 0.52899722],
       [0.48128182, 0.36097828, 0.32840324, 0.65400887, 0.75653592],
       [0.47091215, 0.79927516, 0.71212087, 0.39202379, 0.18226057],
       [0.27436685, 0.09453414, 0.26264447, 0.50901002, 0.76229193],
       [0.76573485, 0.82551178, 0.16686845, 0.58544734, 0.55156004],
       [0.72466356, 0.42176526, 0.4213822 , 0.99291738, 0.77693886],
       [0.13278228, 0.73468001, 0.36669957, 0.5242279 , 0.69553251]])
In [26]:
df2=pd.DataFrame(arr2)
df2.columns=['Week1','Week2','Week3','Week4','Week5']
df2
Out[26]:
| | Week1 | Week2 | Week3 | Week4 | Week5 |
---|---|---|---|---|---|
0 | 0.933968 | 0.865395 | 0.221976 | 0.023870 | 0.528997 |
1 | 0.481282 | 0.360978 | 0.328403 | 0.654009 | 0.756536 |
2 | 0.470912 | 0.799275 | 0.712121 | 0.392024 | 0.182261 |
3 | 0.274367 | 0.094534 | 0.262644 | 0.509010 | 0.762292 |
4 | 0.765735 | 0.825512 | 0.166868 | 0.585447 | 0.551560 |
5 | 0.724664 | 0.421765 | 0.421382 | 0.992917 | 0.776939 |
6 | 0.132782 | 0.734680 | 0.366700 | 0.524228 | 0.695533 |
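One caveat about the layout above: `np.reshape` fills the new array row by row by default, so each row of the table holds five consecutive readings. If the 35 values are in chronological order and each column should hold the seven days of one week, a column-major fill is one way to get that layout. The sketch below is an illustration only; it uses `np.arange(35)` as a stand-in for the readings so the ordering is easy to see, and the names `readings`, `by_week` and `weeks` are hypothetical.

```python
import numpy as np
import pandas as pd

# Stand-in for 35 chronological daily readings (0 = first day, 34 = last day)
readings = np.arange(35)

# order="F" fills column by column, so each column gets 7 consecutive days
by_week = readings.reshape((7, 5), order="F")

weeks = pd.DataFrame(by_week, columns=[f"Week{i}" for i in range(1, 6)])
print(weeks)
```

The same result can be obtained with `readings.reshape(5, 7).T`, which first builds one row per week and then transposes.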
Important note: An array with dimensions M by N can be reshaped into an array with dimensions K by L as long as M × N = K × L, that is, as long as the total number of elements stays the same.
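The element-count rule can be checked with a small example. The sketch below uses a hypothetical 3 by 4 array; it also shows that passing `-1` for one dimension lets NumPy infer it, and that a shape with the wrong number of elements raises a `ValueError`.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)   # 3 * 4 = 12 elements

reshaped = arr.reshape(2, 6)        # allowed: 2 * 6 = 12
inferred = arr.reshape(-1, 2)       # -1 is inferred as 6, since 6 * 2 = 12

try:
    arr.reshape(5, 3)               # 5 * 3 = 15, does not match 12
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(reshaped.shape, inferred.shape, mismatch_raised)
```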
Q2: Find the positions of numbers that are multiples of 3 in a Series
In [51]:
s1=pd.Series(np.array([1,3,6,4,5,9,9]))
s1
Out[51]:
0    1
1    3
2    6
3    4
4    5
5    9
6    9
dtype: int32
In [54]:
l1=list(s1)
l1.index(9)
Out[54]:
5
The value 9 occurs at positions 5 and 6, but list.index returns only the position of the first match, so we get just 5. To collect every matching position we will use enumerate instead, which yields (index, value) pairs for the elements of the list.
In [55]:
list(enumerate(l1))
Out[55]:
[(0, 1), (1, 3), (2, 6), (3, 4), (4, 5), (5, 9), (6, 9)]
In [57]:
pos=[i for i,j in enumerate(l1) if j % 3 == 0]
pos
Out[57]:
[1, 2, 5, 6]
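The list comprehension works, but the same positions can also be found without converting the Series to a list. The sketch below is one possible vectorized alternative: it builds a boolean mask and asks NumPy for the positions where the mask is true. The names `s` and `positions` are hypothetical.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 3, 6, 4, 5, 9, 9])

# Boolean mask: True where the value is a multiple of 3
mask = (s % 3 == 0).to_numpy()

# flatnonzero returns the integer positions of the True entries
positions = np.flatnonzero(mask).tolist()
print(positions)  # [1, 2, 5, 6]
```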
Q3: Extract items at given positions from a Series
In [83]:
s1 = pd.Series(list('abcdefghijklmnopqrstuvwxyz'))
s1[:6]
Out[83]:
0    a
1    b
2    c
3    d
4    e
5    f
dtype: object
In [84]:
# Extract the elements of s1 at the positions listed below
pos = [0, 4, 8, 14, 20]
In [88]:
s1[s1.index.isin(pos)]
Out[88]:
0     a
4     e
8     i
14    o
20    u
dtype: object
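Note that `s1.index.isin(pos)` matches index labels, which coincide with positions here only because the Series has the default 0, 1, 2, ... index. A minimal sketch of a purely positional alternative, using `Series.iloc`, is shown below; the Series `shifted` with labels starting at 100 is a hypothetical example added to show the difference.

```python
import pandas as pd

letters = list('abcdefghijklmnopqrstuvwxyz')
pos = [0, 4, 8, 14, 20]

# Default index: labels and positions coincide
s = pd.Series(letters)
picked = s.iloc[pos]

# Non-default index: labels run from 100 to 125, positions still from 0 to 25
shifted = pd.Series(letters, index=range(100, 126))
picked_by_position = shifted.iloc[pos]              # still a, e, i, o, u
picked_by_label = shifted[shifted.index.isin(pos)]  # empty: no label matches

print(picked_by_position.tolist(), len(picked_by_label))
```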
Q4: Stack two Series vertically and horizontally
In [89]:
s1=pd.Series(range(5))
s1
Out[89]:
0    0
1    1
2    2
3    3
4    4
dtype: int64
In [90]:
s2=pd.Series(list('abcde'))
s2
Out[90]:
0    a
1    b
2    c
3    d
4    e
dtype: object
Stacking vertically
In [94]:
pd.concat([s1,s2])
Out[94]:
0    0
1    1
2    2
3    3
4    4
0    a
1    b
2    c
3    d
4    e
dtype: object
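In the stacked result the index labels 0 to 4 appear twice, because each Series keeps its own index. If a fresh 0 to 9 index is preferred, `pd.concat` accepts `ignore_index=True`. A minimal sketch (the name `stacked` is hypothetical):

```python
import pandas as pd

s1 = pd.Series(range(5))
s2 = pd.Series(list('abcde'))

# ignore_index=True discards the original labels and renumbers from 0
stacked = pd.concat([s1, s2], ignore_index=True)
print(stacked.index.tolist())  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```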
Stacking horizontally
In [96]:
pd.concat([s1,s2],axis=1)
Out[96]:
| | 0 | 1 |
---|---|---|
0 | 0 | a |
1 | 1 | b |
2 | 2 | c |
3 | 3 | d |
4 | 4 | e |
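The resulting columns are simply labelled 0 and 1. If descriptive column names are wanted, one option is the `keys` argument of `pd.concat`. In the sketch below the names `numbers` and `letters` are hypothetical labels chosen for illustration.

```python
import pandas as pd

s1 = pd.Series(range(5))
s2 = pd.Series(list('abcde'))

# With axis=1, the keys become the column labels of the resulting DataFrame
side_by_side = pd.concat([s1, s2], axis=1, keys=['numbers', 'letters'])
print(side_by_side)
```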
Q5: Get the positions in series A of the items that also appear in series B
In [99]:
sA = pd.Series([10, 9, 6, 5, 3, 1, 12, 8, 13])
sA
Out[99]:
0    10
1     9
2     6
3     5
4     3
5     1
6    12
7     8
8    13
dtype: int64
In [100]:
sB = pd.Series([1, 3, 10, 13])
sB
Out[100]:
0     1
1     3
2    10
3    13
dtype: int64
In [102]:
pos_logical=sA.isin(sB)
pos_logical
Out[102]:
0     True
1    False
2    False
3    False
4     True
5     True
6    False
7    False
8     True
dtype: bool
In [105]:
pos=list(sA.index[pos_logical])
pos
Out[105]:
[0, 4, 5, 8]
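Indexing `sA.index` with the mask returns index labels, which equal positions here only because `sA` has the default index. The sketch below shows two possible variations, offered as illustrations rather than as the notebook's method: `np.flatnonzero` gives integer positions regardless of the index, and `Index.get_loc` gives the position of each item of `sB` in the order in which `sB` lists them (this assumes the values of `sA` are unique, as they are here).

```python
import numpy as np
import pandas as pd

sA = pd.Series([10, 9, 6, 5, 3, 1, 12, 8, 13])
sB = pd.Series([1, 3, 10, 13])

# Positions in sA whose values appear in sB, in sA order
pos_sorted = np.flatnonzero(sA.isin(sB).to_numpy()).tolist()

# Position in sA of each item of sB, in sB order (values of sA must be unique)
lookup = pd.Index(sA)
pos_per_item = [int(lookup.get_loc(value)) for value in sB]

print(pos_sorted)    # [0, 4, 5, 8]
print(pos_per_item)  # [5, 4, 0, 8]
```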