|
| 1 | +--- |
| 2 | +title: "Choropleth Map for PM2.5 Emissions Data" |
| 3 | +author: "BillSeliger" |
| 4 | +date: "Saturday, January 31, 2015" |
| 5 | +output: html_document |
| 6 | +--- |
| 7 | +I have a strong interest in mapping geospatial data and wanted to work further with the data from Exploratory Data Analysis Project 2, so I wrote this additional code, which draws a choropleth (a map in which each region is shaded according to a value) of the total PM2.5 Emissions for each FIPS county. For more information on FIPS county codes, see <http://en.wikipedia.org/wiki/FIPS_county_code> |
| 8 | + |
| 9 | +The original Course Project is here - <https://class.coursera.org/exdata-010/human_grading> |
| 10 | + |
| 11 | +This is almost entirely drawn from the code of others found on the internet - many different sources contributed. |
| 12 | + |
| 13 | +First, I check whether the data is already loaded in R; if it is not, I set the working directory and read it in. Because the data is in RDS format, I use the readRDS function: |
| 14 | + |
| 15 | + |
| 16 | +```r |
| 17 | + if(!exists("NEI")) { |
| 18 | + print("Reading NEI file") |
| 19 | + setwd("C:/Users/rr046302/Documents/Bill's Stuff/Coursera/Exploratory Data Analysis/Project 2") |
| 20 | + NEI <- readRDS("summarySCC_PM25.rds") |
| 21 | + setwd("C:/Users/rr046302/Documents/Bill's Stuff/Coursera/Reproducible Research/DataScienceSpecialization.github.io") |
| 22 | + } |
| 23 | +``` |
| 24 | + |
| 25 | +Next I load several R packages with require, which attaches each package if it is not already loaded: |
| 26 | + |
| 27 | + |
| 28 | +```r |
| 29 | + require(maps) |
| 30 | + require(ggmap) |
| 31 | + require(doBy) |
| 32 | +``` |
| 33 | + |
| 34 | +Next I set the colors for the color bands. I chose a set of magenta shades to signal that Emissions measures a deleterious substance in the environment: |
| 35 | + |
| 36 | + |
| 37 | +```r |
| 38 | +colors = c("#F1EEF6", "#D4B9DA", "#C994C7", "#DF65B0", "#DD1C77", |
| 39 | + "#980043") |
| 40 | +``` |
| 41 | + |
| 42 | +Using the doBy package I summarize Emissions by fips and year, using the FUNction sum: |
| 43 | + |
| 44 | +```r |
| 45 | + fipssum <- summaryBy(Emissions ~fips + year, data = NEI, keep.names = TRUE, FUN = sum) |
| 46 | +``` |
| 47 | + |
| 48 | +Using the cut function with quantile-based breaks, I cut the Emissions sums into 5 quintile buckets. Note that I do this before filtering on the year, because I want all 4 years to share the same color buckets so that comparisons across years are valid: |
| 49 | + |
| 50 | +```r |
| 51 | + fipssum$colorBuckets <- cut(fipssum$Emissions, breaks=c(quantile(fipssum$Emissions, probs = seq(0, 1, by = 0.2))), |
| 52 | + labels=c(1,2,3,4,5), include.lowest=TRUE) |
| 53 | +``` |
| 54 | + |
| 55 | +We'll start with the data for 1999 (the GitHub repo linked at the end has the following code set up as a function, and you can fork it): |
| 56 | + |
| 57 | +```r |
| 58 | +NEIyear <- 1999 |
| 59 | +``` |
| 60 | + |
| 61 | +And filter on year: |
| 62 | + |
| 63 | +```r |
| 64 | + year <- subset (fipssum, year == NEIyear) |
| 65 | +``` |
| 66 | + |
| 67 | +I store the colorBuckets column in its own object, which is used to set the fill colors when I draw the plot: |
| 68 | + |
| 69 | +```r |
| 70 | + colorsmatched <- year$colorBuckets |
| 71 | +``` |
| 72 | + |
| 73 | +The title is built from NEIyear (in the function version, this is the year the user passes in the function call): |
| 74 | + |
| 75 | +```r |
| 76 | + title <- paste("NEI PM2.5 Emissions by county", NEIyear) |
| 77 | +``` |
| 78 | + |
| 79 | +The penultimate step is to create the plot: |
| 80 | + |
| 81 | +```r |
| 82 | + dev.new() # portable alternative to windows(), which exists only on Windows |
| 83 | + map("county") |
| 84 | +``` |
| 85 | + |
| 86 | + |
| 87 | + |
| 88 | +```r |
| 89 | + map("county", col = colors[colorsmatched], fill = TRUE, resolution = 0, |
| 90 | + lty = 0, projection = "polyconic") |
| 91 | + title(title) |
| 92 | + leg.txt <- c("bottom", "2nd", "3rd", "4th", "5th quintile") |
| 93 | + legend("topright", leg.txt, horiz = TRUE, fill = colors) |
| 94 | +``` |
| 95 | + |
| 96 | + |
| 97 | + |
| 98 | +We'll run it again for 2002 data: |
| 99 | + |
| 100 | +```r |
| 101 | +NEIyear <- 2002 |
| 102 | + year <- subset (fipssum, year == NEIyear) |
| 103 | + colorsmatched <- year$colorBuckets |
| 104 | + title <- paste("NEI PM2.5 Emissions by county", NEIyear) |
| 105 | + map("county", col = colors[colorsmatched], fill = TRUE, resolution = 0, |
| 106 | + lty = 0, projection = "polyconic") |
| 107 | + title(title) |
| 108 | + leg.txt <- c("bottom", "2nd", "3rd", "4th", "5th quintile") |
| 109 | + legend("topright", leg.txt, horiz = TRUE, fill = colors) |
| 110 | +``` |
| 111 | + |
| 112 | + |
| 113 | + |
| 114 | +And again for 2005 data: |
| 115 | + |
| 116 | +```r |
| 117 | +NEIyear <- 2005 |
| 118 | + year <- subset (fipssum, year == NEIyear) |
| 119 | + colorsmatched <- year$colorBuckets |
| 120 | + title <- paste("NEI PM2.5 Emissions by county", NEIyear) |
| 121 | + map("county", col = colors[colorsmatched], fill = TRUE, resolution = 0, |
| 122 | + lty = 0, projection = "polyconic") |
| 123 | + title(title) |
| 124 | + leg.txt <- c("bottom", "2nd", "3rd", "4th", "5th quintile") |
| 125 | + legend("topright", leg.txt, horiz = TRUE, fill = colors) |
| 126 | +``` |
| 127 | + |
| 128 | + |
| 129 | + |
| 130 | +And finally for 2008 data: |
| 131 | + |
| 132 | +```r |
| 133 | +NEIyear <- 2008 |
| 134 | + year <- subset (fipssum, year == NEIyear) |
| 135 | + colorsmatched <- year$colorBuckets |
| 136 | + title <- paste("NEI PM2.5 Emissions by county", NEIyear) |
| 137 | + map("county", col = colors[colorsmatched], fill = TRUE, resolution = 0, |
| 138 | + lty = 0, projection = "polyconic") |
| 139 | + title(title) |
| 140 | + leg.txt <- c("bottom", "2nd", "3rd", "4th", "5th quintile") |
| 141 | + legend("topright", leg.txt, horiz = TRUE, fill = colors) |
| 142 | +``` |
| 143 | + |
| 144 | + |
| 145 | + |
| 146 | +Lastly, because this code was originally written as a function, the dataset lives in the function's environment. I check whether NEI exists in the global environment and, if it does not, assign it there: |
| 147 | + |
| 148 | +```r |
| 149 | +if(!exists("NEI", envir = globalenv())) { |
| 150 | + print("Assigning NEI to Global Environment") |
| 151 | + assign("NEI", NEI, envir=globalenv()) |
| 152 | +} |
| 153 | +``` |
| 154 | + |
| 155 | +This code is available in a GitHub repo as a function. Find the repo here: <https://github.com/BillSeliger/ExData_Plotting2> |
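
The summarise-and-bucket steps in the walkthrough can be tried without the NEI file. The following is a minimal sketch, not the author's code: it uses base R's `aggregate` in place of `doBy::summaryBy`, and the `toy` data frame and its values are made up for illustration.

```r
# Toy stand-in for NEI: made-up values, not real emissions data.
toy <- data.frame(
  fips      = rep(c("01001", "01003", "01005", "01007", "01009"), times = 2),
  year      = rep(c(1999, 2002), each = 5),
  Emissions = c(10, 20, 30, 40, 50, 5, 15, 25, 35, 45),
  stringsAsFactors = FALSE
)

# Total emissions per county and year (base R stand-in for doBy::summaryBy).
fips_sum <- aggregate(Emissions ~ fips + year, data = toy, FUN = sum)

# Quintile breaks computed over all years, so every year shares the same buckets.
shared_breaks <- quantile(fips_sum$Emissions, probs = seq(0, 1, by = 0.2))
fips_sum$bucket <- cut(fips_sum$Emissions, breaks = shared_breaks,
                       labels = FALSE, include.lowest = TRUE)

# Split into one data frame per year; each keeps the shared bucket codes.
by_year <- split(fips_sum, fips_sum$year)
```

Because the breaks come from the pooled data, a county with the same total in two different years lands in the same bucket in both, which is what makes the four maps comparable.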
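
One point worth checking in the plotting step: with `fill = TRUE`, `map()` applies the `col` vector to polygons in the order of its own map database, so the color vector has to follow that order rather than the row order of the data. Below is a minimal sketch of aligning the two with `match`, assuming the polygon order is available as a vector of integer FIPS codes (in the maps package, the `county.fips` dataset pairs polygon names with such codes). The names `polygon_fips`, `county_data`, and `palette5` are hypothetical, and their values are made up.

```r
# Hypothetical polygon order, standing in for the order in which
# map("county") draws counties. Made-up codes for illustration only.
polygon_fips <- c(1005L, 1001L, 1009L, 1003L)

# Per-county buckets in data order; fips stored as text with leading zeros.
county_data <- data.frame(
  fips   = c("01001", "01003", "01005"),
  bucket = c(1L, 3L, 5L),
  stringsAsFactors = FALSE
)

# First five colors of the palette used in the walkthrough.
palette5 <- c("#F1EEF6", "#D4B9DA", "#C994C7", "#DF65B0", "#DD1C77")

# For each polygon, find the row of county_data with the same code.
# Polygons with no data (1009 here) get NA and are left unfilled.
row_for_polygon <- match(polygon_fips, as.integer(county_data$fips))
polygon_colors  <- palette5[county_data$bucket[row_for_polygon]]
```

Whether the fips values in NEI all convert cleanly with `as.integer` is an assumption that should be checked against the real data before relying on this approach.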