Friday, October 26, 2012

NSCB Sexy Stats Version 2

This is a revised version of my previous post about the NSCB article. Following a suggestion from Tal Galili, below are the new pie charts and the R code that produces them by scraping the data directly from the webpage, using the XML package for the scraping and the RColorBrewer package for the colours.

Unemployment by Age Group

Unemployment by Gender

Unemployment by Civil Status

Unemployment by Educational Level




##################################################################################

library(XML)
library(RColorBrewer)
 

# Read the second HTML table on the NSCB page into a data frame
url<-"http://www.nscb.gov.ph/sexystats/2012/Filipinoversion/SS20121022_joblessness_filver.asp"
unemployment<-readHTMLTable(url, header=T, which=2,stringsAsFactors=F)
# Rows for each breakdown; columns 1, 3 and 5 hold the category, 2006 and 2009 values
agegroup<-unemployment[3:8,c(1,3,5)]
gender<-unemployment[12:13,c(1,3,5)]
civil<-unemployment[17:20,c(1,3,5)]
education<-unemployment[25:31,c(1,3,5)]


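#Copy the tab-separated table below to the clipboard (without the leading "# ")
#before running the read.table() line
# Education    Y2006    Y2009
# Elementary     42.5    37.9
# High School    47.7    52.2
# College    9.7    10


educ<-read.table("clipboard", header=T, sep="\t")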

colnames(agegroup)<-c("Age.Group","Y2006","Y2009")
colnames(gender)<-c("Gender","Y2006","Y2009")
colnames(civil)<-c("Civil.Status","Y2006","Y2009")
colnames(education)<-c("Education","Y2006","Y2009")

agegroup$Age.Group[6]<-"65 & Up"
agegroup$Y2006<-as.numeric(agegroup$Y2006)
agegroup$Y2009<-as.numeric(agegroup$Y2009)

gender$Gender[2]<-"Female"
gender$Gender[1]<-"Male"
gender$Y2006<-as.numeric(gender$Y2006)
gender$Y2009<-as.numeric(gender$Y2009)

civil$Y2006<-as.numeric(civil$Y2006)
civil$Y2009<-as.numeric(civil$Y2009)
cs<-c("Single","Married","Widowed", "Divorced")

win.graph(w=14.3,h=7)
par(mfrow=c(1,2), oma=c(1,0,1,1) , mar=c(1,1,0,1))

#Chart 1
pie(agegroup$Y2006,label=agegroup$Age.Group, col=brewer.pal(6,"Set1"), border="white")
par(new=TRUE) 
pie(c(1), labels=NA, border='white', radius=0.4)
text(0,0,labels="Percent\nUnemployment\nby Age Group\nYear 2006", cex=1.5, font=2)

pie(agegroup$Y2009,label=agegroup$Age.Group, col=brewer.pal(6,"Set1"), border="white")
par(new=TRUE) 
pie(c(1), labels=NA, border='white', radius=0.4)
text(0,0,labels="Percent\nUnemployment\nby Age Group\nYear 2009", cex=1.5, font=2)
text(0.5,-1, "Data Source: NSCB\nCreated by: ARSalvacion", adj=c(0,0), cex=0.7)

#Chart 2
pie(gender$Y2006,label=gender$Gender, col=brewer.pal(3,"Set1")[1:2], border="white", cex=1.5)  # brewer.pal() needs n >= 3
par(new=TRUE) 
pie(c(1), labels=NA, border='white', radius=0.4)
text(0,0,labels="Percent\nUnemployment\nby Gender\nYear 2006", cex=1.5, font=2)

pie(gender$Y2009,label=gender$Gender, col=brewer.pal(3,"Set1")[1:2], border="white", cex=1.5)  # brewer.pal() needs n >= 3
par(new=TRUE) 
pie(c(1), labels=NA, border='white', radius=0.4)
text(0,0,labels="Percent\nUnemployment\nby Gender\nYear 2009", cex=1.5, font=2)
text(0.5,-1, "Data Source: NSCB\nCreated by: ARSalvacion", adj=c(0,0), cex=0.7)

#Chart 3
pie(civil$Y2006,label=cs, col=brewer.pal(4,"Dark2"), border="white")
par(new=TRUE) 
pie(c(1), labels=NA, border='white', radius=0.4)
text(0,0,labels="Percent\nUnemployment\nby Civil Status\nYear 2006", cex=1.5, font=2)

pie(civil$Y2009,label=cs, col=brewer.pal(4,"Dark2"), border="white")
par(new=TRUE) 
pie(c(1), labels=NA, border='white', radius=0.4)
text(0,0,labels="Percent\nUnemployment\nby Civil Status\nYear 2009", cex=1.5, font=2)
text(0.5,-1, "Data Source: NSCB\nCreated by: ARSalvacion", adj=c(0,0), cex=0.7)

#Chart 4
pie(educ$Y2006,label=educ$Education, col=brewer.pal(3,"Dark2"), border="white", cex=1.5)
par(new=TRUE) 
pie(c(1), labels=NA, border='white', radius=0.4)
text(0,0,labels="Percent\nUnemployment by\nEducational Level\nYear 2006", cex=1.5, font=2)

pie(educ$Y2009,label=educ$Education, col=brewer.pal(3,"Dark2"), border="white", cex=1.5)
par(new=TRUE) 
pie(c(1), labels=NA, border='white', radius=0.4)
text(0,0,labels="Percent\nUnemployment by\nEducational Level\nYear 2009", cex=1.5, font=2)
text(0.5,-1, "Data Source: NSCB\nCreated by: ARSalvacion", adj=c(0,0), cex=0.7)


##################################################################################
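
The four chart sections above repeat the same three calls: draw the pie, overlay a small white disc with par(new=TRUE) to make the hole, then write the title at the centre. As a minimal sketch only, that pattern could be wrapped in a helper. The name donut_pie and its arguments are hypothetical and not part of the original script, the educ values are typed in by hand instead of being read from the clipboard, and the hex colours are written out so the sketch runs without RColorBrewer (brewer.pal(3,"Dark2") can be substituted).

```r
# Hypothetical helper (not in the original script): draws one donut-style pie.
donut_pie <- function(values, labels, palette, title,
                      hole = 0.4, label_cex = 1, title_cex = 1.5) {
  # Outer pie, with white borders separating the slices
  pie(values, labels = labels, col = palette, border = "white", cex = label_cex)
  # Draw on top of the same panel instead of advancing to the next one
  par(new = TRUE)
  # A smaller, unlabelled white disc forms the hole
  pie(1, labels = NA, col = "white", border = "white", radius = hole)
  # Bold title centred inside the hole
  text(0, 0, labels = title, cex = title_cex, font = 2)
  invisible(NULL)
}

# Education data entered directly, so no clipboard is needed
educ <- data.frame(
  Education = c("Elementary", "High School", "College"),
  Y2006 = c(42.5, 47.7, 9.7),
  Y2009 = c(37.9, 52.2, 10),
  stringsAsFactors = FALSE
)
cols <- c("#1B9E77", "#D95F02", "#7570B3")

if (interactive()) {
  # dev.new() is used here in place of the Windows-only win.graph()
  dev.new(width = 14.3, height = 7)
  par(mfrow = c(1, 2), oma = c(1, 0, 1, 1), mar = c(1, 1, 0, 1))
  donut_pie(educ$Y2006, educ$Education, cols,
            "Percent\nUnemployment by\nEducational Level\nYear 2006")
  donut_pie(educ$Y2009, educ$Education, cols,
            "Percent\nUnemployment by\nEducational Level\nYear 2009")
}
```

The same helper could be called with agegroup, gender and civil in turn. The label and title sizes are arguments because the text may not fit the plot area on every graphics device.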


4 comments:

  1. 1) Cool, thanks.
    2) My name is Tal, not Tai :D
    (And if you want to link back to anything, you can just link to my own blog:
    http://www.r-statistics.com/
    )

    With regards :)
    Tal

  2. I just tried to reproduce the beautiful pie charts, and it took me some twiddling.

    1) educ<-read.table("clipboard", header=T, sep="\t") didn't work correctly. It's always a bit awkward to make clipboard use portable. The following snippet recreates the educ variable using portable code:

    Education <- c("Elementary", "High School", "College")
    Y2006 <- c(42.5, 47.7, 9.7)
    Y2009 <- c(37.9, 52.2, 10)
    educ <- data.frame(Education, Y2006, Y2009)
    rm(Education, Y2006, Y2009)

    2) Talking about portability: You're on a Windows system, it seems, while I - fortunately - am not. Your use of win.graph(w=14.3,h=7) isn't portable. dev.new(width = 14.3, height = 7) should work on all platforms.

    3) I had to use "cex = 1" for the text calls, and "cex = 0.5" for the Data Source entry to keep the text within the visible plot area.

    Apart from portability aspects: Nice work!

    Cheers
    harald

    Replies
    1. Hi Harald,

      Thanks for the comment. I will incorporate your suggestions in my next post!

      Cheers

      ARSalvacion
