Friday, March 4, 2022

RowSum and Column Sum in R

RowSum and ColSum in R


Introduction

There are certain situation where we are required to take row wise sum across multiple columns.For example:

  • Get the total weekly sum of shipment across days of the week
  • Estimate mean of daily blood pressure readings over days of the month

In this blog we will look at how dplyr can be used to generate such summaries

package.name<-c("dplyr","stringr")

for(i in package.name){

  if(!require(i,character.only = T)){

    install.packages(i)
  }
  library(i,character.only = T)

}

Step 1: Creating a data frame

We are creating a dummy data frame with details of daily Sales for 100 products

df<-data.frame(Product_Line=str_c("Prod_",1:100),
               Sales_Monday=rnorm(100,15,6),
               Sales_Tuesday=rnorm(100,18,3),
               Sales_Wednesday=rnorm(100,19,4),
               Sales_Thursday=rnorm(100,17,5),
               Sales_Friday=rnorm(100,16,5)
               )
head(df)
  Product_Line Sales_Monday Sales_Tuesday Sales_Wednesday Sales_Thursday
1       Prod_1     11.93038      17.39956        28.12787       18.03907
2       Prod_2     16.42163      19.16346        17.14527       16.07334
3       Prod_3     11.75046      20.38176        17.69459       17.15720
4       Prod_4     22.31537      17.57846        22.52997       19.05676
5       Prod_5     16.04482      19.36742        24.12514       13.11908
6       Prod_6     11.30839      14.56328        16.36527       22.69839
  Sales_Friday
1     13.58432
2     13.34327
3     13.06158
4     13.94151
5     19.54593
6     17.28198


Step 2: Taking rowsum to generate weekly Sales

total_weekly_sales<-df%>%
  select(-Product_Line)%>%
  rowSums(.)

summary.df<-cbind.data.frame(df%>%select(Product_Line),total_weekly_sales)
head(summary.df)
  Product_Line total_weekly_sales
1       Prod_1           89.08120
2       Prod_2           82.14696
3       Prod_3           80.04559
4       Prod_4           95.42207
5       Prod_5           92.20238
6       Prod_6           82.21732


Step 3: Taking rowsum to generate weekly Sales

We can use summarise to create column sum as shown below

total_daily_sales<-df%>%
  select(-Product_Line)%>%
  summarise(across(everything(),list(sum)))

total_daily_sales
  Sales_Monday_1 Sales_Tuesday_1 Sales_Wednesday_1 Sales_Thursday_1
1       1428.943        1760.138           1858.45         1736.515
  Sales_Friday_1
1        1580.97


We can also use apply to do the above calculation

total_daily_sales<-df%>%
  select(-Product_Line)%>%
  apply(2,sum)

total_daily_sales
   Sales_Monday   Sales_Tuesday Sales_Wednesday  Sales_Thursday    Sales_Friday 
       1428.943        1760.138        1858.450        1736.515        1580.970 

My Youtube Channel

Embed Shiny

Please wait...