Sunday, February 23, 2020

Blog 13: Read and Write huge files in R

Read and Write Huge Dataset


Introduction

In this blog my aim is to introduce a quick hack in R. We will look at how to read and write huge datasets from/to drive. The data.table library provides two very powerful functions, fread and fwrite, which import and export datasets with millions of records quite easily. We will use a dataset that ships with an R package and record the time taken while doing so.

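Installing the libraries: dplyr and tidyr

# require() loads the package if it is installed and returns FALSE otherwise
if (!require("dplyr")) {
  install.packages("dplyr")
  library(dplyr)  # load the package after a fresh install
}

if (!require("tidyr")) {
  install.packages("tidyr")
  library(tidyr)  # load the package after a fresh install
}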


Importing the dataset

For this exercise we will look at the Birthdays dataset from the mosaicData package, which records US births from 1969 to 1988. It has 372864 records with 7 columns.

# mosaicData library for importing the dataset

if (!require("mosaicData")) {
  install.packages("mosaicData")
  library(mosaicData)  # load the package after a fresh install
}

data(Birthdays)
df<-Birthdays
dim(df)
[1] 372864      7


Writing the dataset to drive

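if (!require("data.table")) {
  install.packages("data.table")
  library(data.table)  # load the package after a fresh install
}

# Record the start time, then write the data frame to a CSV file
t1 <- Sys.time()

data.table::fwrite(df, "dummy.csv", row.names = FALSE)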

Sys.time()-t1
Time difference of 0.02991986 secs

It took about 0.03 secs to write the dataset.
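A single Sys.time() difference can be noisy, so it may help to compare fwrite against base R's write.csv with system.time(). The snippet below is only a rough sketch: the synthetic data frame demo_df and the temporary file paths are assumptions made so that it runs on its own, and they are not part of the original exercise.

```r
library(data.table)

# Hypothetical stand-in data (not the Birthdays dataset): 200,000 rows, 3 columns
set.seed(1)
n <- 200000
demo_df <- data.frame(
  id     = seq_len(n),
  group  = sample(c("a", "b", "c"), n, replace = TRUE),
  births = rpois(n, lambda = 100)
)

# Temporary files so nothing is left behind in the working directory
fwrite_path <- tempfile(fileext = ".csv")
base_path   <- tempfile(fileext = ".csv")

# system.time() reports user, system and elapsed time for the expression
fwrite_time <- system.time(data.table::fwrite(demo_df, fwrite_path))
base_time   <- system.time(write.csv(demo_df, base_path, row.names = FALSE))

print(fwrite_time["elapsed"])
print(base_time["elapsed"])
```

The exact numbers depend on the machine and the disk, so treat them as indicative rather than as a benchmark.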


Reading the dataset from drive

t1<-Sys.time()

data.table::fread("dummy.csv")
        state year month day                 date wday births
     1:    AK 1969     1   1 1969-01-01T00:00:00Z  Wed     14
     2:    AL 1969     1   1 1969-01-01T00:00:00Z  Wed    174
     3:    AR 1969     1   1 1969-01-01T00:00:00Z  Wed     78
     4:    AZ 1969     1   1 1969-01-01T00:00:00Z  Wed     84
     5:    CA 1969     1   1 1969-01-01T00:00:00Z  Wed    824
    ---                                                      
372860:    VT 1988    12  31 1988-12-31T00:00:00Z  Sat     21
372861:    WA 1988    12  31 1988-12-31T00:00:00Z  Sat    157
372862:    WI 1988    12  31 1988-12-31T00:00:00Z  Sat    167
372863:    WV 1988    12  31 1988-12-31T00:00:00Z  Sat     45
372864:    WY 1988    12  31 1988-12-31T00:00:00Z  Sat     18
Sys.time()-t1
Time difference of 0.1196802 secs

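It took around 0.12 secs to read the dataset. Note that this figure also includes the time spent printing the table to the console.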
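Because the fread result above is printed rather than stored, the measured time mixes reading with console output. One way to separate the two is sketched here: the result is assigned inside system.time(), and base R's read.csv is timed alongside for comparison. The data frame demo_df and the temporary file are hypothetical stand-ins so that the snippet is self-contained.

```r
library(data.table)

# Hypothetical stand-in data, written to a temporary file first
set.seed(1)
n <- 200000
demo_df <- data.frame(
  id     = seq_len(n),
  group  = sample(c("a", "b", "c"), n, replace = TRUE),
  births = rpois(n, lambda = 100),
  stringsAsFactors = FALSE
)
csv_path <- tempfile(fileext = ".csv")
data.table::fwrite(demo_df, csv_path)

# Assigning the result keeps console printing out of the measured time
fread_time <- system.time(dt <- data.table::fread(csv_path))
base_time  <- system.time(base_df <- read.csv(csv_path))

print(fread_time["elapsed"])
print(base_time["elapsed"])

# fread returns a data.table, which is also a data.frame
print(class(dt))
print(dim(dt))
```

Since a data.table is also a data.frame, the object returned by fread can usually be passed to code that expects a plain data frame.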

Final Comments

We saw an example where we can read as well as write a file of decent size from/to the working directory.
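For files that are too large to load comfortably, fread also accepts the nrows and select arguments, which read only the first few rows or only chosen columns. The following is a minimal sketch under assumed data: the temporary file and its columns are made up for illustration and merely resemble the Birthdays layout.

```r
library(data.table)

# Hypothetical stand-in file with 100,000 rows and 4 columns
set.seed(42)
n <- 100000
csv_path <- tempfile(fileext = ".csv")
data.table::fwrite(
  data.frame(
    state  = sample(c("AK", "AL", "AR"), n, replace = TRUE),
    year   = sample(1969:1988, n, replace = TRUE),
    month  = sample(1:12, n, replace = TRUE),
    births = rpois(n, lambda = 100)
  ),
  csv_path
)

# Peek at the first 5 rows without loading the whole file
preview <- data.table::fread(csv_path, nrows = 5)

# Load only the columns that are needed
subset_dt <- data.table::fread(csv_path, select = c("state", "births"))

print(preview)
print(dim(subset_dt))
```

Previewing a handful of rows first is a cheap way to check column names and types before committing to a full read.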


Wednesday, February 5, 2020

Book Review: Coalition Years

THE COALITION YEARS by Pranab Mukherjee


The book by Pranab Mukherjee is an ode to his political career. Through the course of the book, Pranab Mukherjee comes across as an honest man who takes on whatever role the party expects of him. A true Congressman, Pranab Da carries out his duties with arduous sincerity in the face of coalition politics. The book can be seen in the light of the following points:

• The book starts with the fag end of Atal Ji's government and marks how Pranab Da reinvigorated the Congress party amid the possibility of an NDA-2. Having worked his way up from the grassroots level to the party ranks, Pranab Da garnered the respect of the partymen and knew how to connect with people at a personal level. His proximity to 10 Janpath gave gravity to his stature within the party. An incident during the pre-poll campaign in 2004 deserves special mention: he visits a village in Left-dominated West Bengal and interacts with a boy. Pranab Da assures him that the construction of a bund will be the first thing done once Congress comes to power.
• His stint as Defence Minister saw an increase in the defence allocation. During his tenure, he signed pacts with the US as well as Russia on major defence cooperation. For the most part, he focussed on the modernisation of warfare and the transfer of defence technologies with a 'Make in India' thrust in mind.
• The controversial nuclear deal materialised while he was the Minister of External Affairs. To bring the US onto the same page, he connected with Condoleezza Rice at a personal level and made sure that the sanctions put on India by the NSG were softened. This marked the beginning of a landmark era in India-US relations.
• Of all the hats he had donned, the one he had the most experience with was the Finance Ministry. The introduction of the retrospective tax act, his expansionary policies during 2010 and 2011, and his aggressive stance on implementing GST and UID made him a man with an independent stance in an otherwise subservient Congress.
• One aspect that, according to me, made him a class apart was his ability to maintain cordial relationships with all and sundry of the political class. I would like to mention two instances that describe his ability to connect with fellow politicians:
○ During the run for president, Bal Thackeray rebuffed his party's stance and supported Pranab Da. As a gesture of gratitude, Pranab Da went to meet Thackeray at his residence, where the latter quipped, "It is but obvious that a Maratha lion supports Bengal Tiger."
○ The much debated GST bill was eventually passed by both the houses in 2017. Since Pranab Da had been a staunch supporter of the bill during his tenure as the Finance Minister and shared a cordial relationship with the PM, Narendra Modi gave him a congratulatory call to apprise him of the development.

In the end, I would like to say that Pranab Mukherjee is, in the truest sense, a son of the soil. He is definitely the PM India never had.



Web Scraping Tutorial 4- Getting the busy information data from Popular time page from Google

Popular Times In this blog we will try to scrape the ...