This repo provides my R code solution to the course project of Coursera's "Getting and Cleaning Data". The project is based on a smartphone dataset from the UCI (University of California, Irvine) Machine Learning Repository that measures human activity using data collected from sensors built into the smartphone itself (a Samsung Galaxy S II). The dataset is referred to below as the UCI HAR (UCI Human Activity Recognition) data, or just the data.
The R code expects the data to be unzipped in your working directory: it looks for the "./UCI HAR Dataset" directory there. If the directory is not found, the code retrieves the zip file and unzips it, creating the UCI HAR directory and file structure.
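The check described above can be sketched roughly as follows. This is a minimal illustration, not the code in run_analysis.R: the helper name `ensure_dataset`, its arguments, and the local zip file name are hypothetical, and the download URL is left for the caller to supply. It also assumes the archive contains a top-level "UCI HAR Dataset" folder.

```r
# Hypothetical helper: fetch and unzip the data only when it is missing.
# zip_url has no default on purpose; pass the URL of the course archive.
ensure_dataset <- function(zip_url,
                           data_dir = "./UCI HAR Dataset",
                           zip_file = "UCI_HAR_Dataset.zip") {
  if (dir.exists(data_dir)) {
    # Data already unzipped in the working directory: nothing to do.
    return(invisible(FALSE))
  }
  # Binary mode keeps the zip intact on Windows.
  download.file(zip_url, destfile = zip_file, mode = "wb")
  # Assumption: the archive unpacks into a "UCI HAR Dataset" folder.
  unzip(zip_file, exdir = dirname(data_dir))
  invisible(TRUE)
}
```

The function returns `TRUE` when it had to download the archive and `FALSE` when the directory was already present, so a second run does no network work.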
This repo contains one R script named run_analysis.R, in line with the project's requirements. The run_analysis.R code does the following:
- Retrieves and unzips the UCI HAR Dataset, unless it has already been unzipped.
- Reads the UCI HAR Datasets.
- Creates a "results" directory ("./results") where intermediate or final results can be written.
- Consolidates all measurement data into one single data.frame.
- Inserts explanatory measurement headers and corresponding variable labels (numeric as well as descriptive).
- Extracts a subset of measurements.
- Tidies the data set (see also Hadley Wickham's "Tidy Data" article).
- Writes several files to the "results" directory along the way.
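The consolidation, labelling and subsetting steps listed above can be sketched on toy data as follows. Everything here is illustrative: the values are made up, the object names are hypothetical, and selecting the `mean()` and `std()` columns is an assumption about which subset is extracted, not a statement of what run_analysis.R does.

```r
# Toy stand-ins for the dataset's feature names and activity labels.
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-max()-X")
activity_labels <- data.frame(id = 1:2, activity = c("WALKING", "SITTING"))

# Toy train/test tables: subject, numeric activity id, three measurements.
train <- data.frame(subject = c(1, 1), activity_id = c(1, 2),
                    matrix(c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6), nrow = 2))
test  <- data.frame(subject = c(2, 2), activity_id = c(2, 1),
                    matrix(c(0.7, 0.8, 0.9, 1.0, 1.1, 1.2), nrow = 2))

# Consolidate all measurement data into one data.frame.
all_data <- rbind(train, test)

# Insert explanatory headers for the measurement columns.
names(all_data) <- c("subject", "activity_id", features)

# Add the descriptive label next to the numeric activity id.
all_data$activity <- activity_labels$activity[
  match(all_data$activity_id, activity_labels$id)]

# Extract a subset of measurements (assumption: mean() and std() columns).
keep <- grep("mean\\(\\)|std\\(\\)", names(all_data), value = TRUE)
subset_data <- all_data[, c("subject", "activity_id", "activity", keep)]
```

Keeping both `activity_id` and `activity` mirrors the "numeric as well as descriptive" labels mentioned in the list.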
Note: the requested final (tidy) file "tidy_mean.txt" can be found in the "./results" directory that was created by this code in the working directory.
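As a rough idea of how a file like "tidy_mean.txt" could be produced, the sketch below averages each measurement per subject and activity and writes the result. That the file holds such averages is an assumption based on its name. The toy data and column names are hypothetical, and the sketch writes to a temporary directory so that it does not overwrite real results.

```r
# Toy input: one row per observation (values are made up).
subset_data <- data.frame(
  subject  = c(1, 1, 1, 2),
  activity = c("WALKING", "WALKING", "SITTING", "SITTING"),
  mean_x   = c(0.2, 0.4, 0.6, 0.8),
  std_x    = c(1.0, 3.0, 5.0, 7.0)
)

# Average every measurement column for each subject/activity pair.
tidy_mean <- aggregate(. ~ subject + activity, data = subset_data, FUN = mean)

# Use "./results" instead to match the layout described in this README.
results_dir <- file.path(tempdir(), "results")
if (!dir.exists(results_dir)) dir.create(results_dir)

out_file <- file.path(results_dir, "tidy_mean.txt")
write.table(tidy_mean, out_file, row.names = FALSE)
```

The written file can be read back with `read.table(out_file, header = TRUE)`.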
My codebook.md explains the R code's functionality in more detail.