
Friday, April 1, 2022

Combine multiple data frames from a List into a single data frame



Introduction

In some situations we have to store the results of intermediate steps in a list as data frames and then combine all of these data frames into a single consolidated data frame. Typical scenarios include frequency profiling or analyzing product performance at the end of each week. Let's look at a simple example of how we can use do.call and rbind.data.frame to do this.
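Before the full example, here is a minimal sketch of the idea on toy data. The list name weekly and its columns are made up for illustration and are not part of the starwars example; the only assumption is that every data frame in the list has the same column names.

```r
# Toy data: two small data frames with identical column names,
# standing in for intermediate results collected in a list.
weekly <- list(
  week1 = data.frame(product = c("A", "B"), units = c(10, 20)),
  week2 = data.frame(product = c("A", "B"), units = c(15, 5))
)

# do.call() calls rbind.data.frame() with the list elements as its arguments,
# so this is equivalent to rbind.data.frame(week1 = ..., week2 = ...)
combined <- do.call(rbind.data.frame, weekly)

nrow(combined)   # 4 rows: 2 from each data frame
```

The advantage of do.call here is that the list can hold any number of data frames without us having to spell out each one as a separate argument.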


# Packages used in this post
package.name<-c("dplyr","stringr")

for(i in package.name){

  # Install the package only if it cannot be loaded
  if(!require(i,character.only = TRUE)){

    install.packages(i)
  }
  library(i,character.only = TRUE)

}


Step 1: Creating the data frame

For this post, we will use the starwars dataset that ships with the dplyr package.

df<-dplyr::starwars
head(df)
# A tibble: 6 x 14
  name     height  mass hair_color  skin_color eye_color birth_year sex   gender
  <chr>     <int> <dbl> <chr>       <chr>      <chr>          <dbl> <chr> <chr> 
1 Luke Sk~    172    77 blond       fair       blue            19   male  mascu~
2 C-3PO       167    75 <NA>        gold       yellow         112   none  mascu~
3 R2-D2        96    32 <NA>        white, bl~ red             33   none  mascu~
4 Darth V~    202   136 none        white      yellow          41.9 male  mascu~
5 Leia Or~    150    49 brown       light      brown           19   fema~ femin~
6 Owen La~    178   120 brown, grey light      blue            52   male  mascu~
# ... with 5 more variables: homeworld <chr>, species <chr>, films <list>,
#   vehicles <list>, starships <list>


Step 2: Frequency Profile of eye_color and gender

nm<- c("gender","eye_color")
base.variable<-"Level"
l1<-list()


Iterating through variables in nm and storing the results in l1

l1<-lapply(nm, function(x){
  
  interim.df<-df%>%
    select(all_of(x))%>%        # all_of() avoids the tidyselect "external vector" note
    group_by(!!!syms(x))%>%     # group by the column whose name is stored in x
    summarise(Total_Count=n())%>%
    mutate(Feature=x)%>%
    rename(!!base.variable := !!paste0(x))%>%
    select(Feature,Level,Total_Count)
  
  interim.df
  
})
names(l1)<-nm

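If you are perplexed by the !!! operator in the code above: syms(x) converts the character vector x into a list of symbols, and !!! splices that list into group_by() as individual arguments. If you want to know more about it, then visit this Link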
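As a small illustration of what !!! and syms do, the sketch below groups the built-in mtcars dataset by columns whose names are held as strings and compares the result with the same grouping written with bare column names. The vector vars and the choice of mtcars are assumptions made only for this illustration.

```r
library(dplyr)

vars <- c("cyl", "gear")   # hypothetical column names held as strings

# syms() converts each string into a symbol; !!! splices the resulting list
# into group_by(), so the call behaves like group_by(cyl, gear)
by_string <- mtcars %>%
  group_by(!!!syms(vars)) %>%
  summarise(Total_Count = n(), .groups = "drop")

# The same summary written with bare column names
by_name <- mtcars %>%
  group_by(cyl, gear) %>%
  summarise(Total_Count = n(), .groups = "drop")

# Compare the two results
all.equal(as.data.frame(by_string), as.data.frame(by_name))
```

This is why the lapply call above can loop over plain strings such as "gender" and "eye_color" and still group by the matching columns.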


Step 3: Combining individual data frames in l1 into a single data frame

final.df<-do.call(rbind.data.frame,l1)
final.df
# A tibble: 18 x 3
   Feature   Level         Total_Count
 * <chr>     <chr>               <int>
 1 gender    feminine               17
 2 gender    masculine              66
 3 gender    <NA>                    4
 4 eye_color black                  10
 5 eye_color blue                   19
 6 eye_color blue-gray               1
 7 eye_color brown                  21
 8 eye_color dark                    1
 9 eye_color gold                    1
10 eye_color green, yellow           1
11 eye_color hazel                   3
12 eye_color orange                  8
13 eye_color pink                    1
14 eye_color red                     5
15 eye_color red, blue               1
16 eye_color unknown                 3
17 eye_color white                   1
18 eye_color yellow                 11
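As an aside, dplyr also provides bind_rows, which can do the same job. This is an alternative to the do.call approach used in this post, not the method shown above. The sketch uses a hand-made toy.list instead of l1 so that it runs on its own; the names toy.list and toy.combined are made up for illustration.

```r
library(dplyr)

# Toy stand-in for l1: a named list of frequency tables with identical columns
toy.list <- list(
  gender    = data.frame(Level = c("feminine", "masculine"), Total_Count = c(17L, 66L)),
  eye_color = data.frame(Level = c("blue", "brown"), Total_Count = c(19L, 21L))
)

# .id adds a column holding the list names, playing the role of Feature
toy.combined <- bind_rows(toy.list, .id = "Feature")
toy.combined
```

With .id, the Feature column comes from the names of the list, so it does not have to be created inside each data frame beforehand.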


Parting Comments

In this blog we looked at a very simple example of how we can use do.call and rbind.data.frame to combine multiple data frames in a list into a single consolidated data frame.
