diff --git a/.gitignore b/.gitignore
index 9263a1ed..d17e5544 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,2 +1,4 @@
 _site
 .DS_Store
+.Rhistory
+.Rproj.user
diff --git a/README.md b/README.md
index 9c326a21..83b0568d 100644
--- a/README.md
+++ b/README.md
@@ -4,7 +4,7 @@ Since the beginning of the Data Science Specialization we've noticed the unbelie
 
 ## Contributing
 
-If you've created a web page, video, sideshow, or any other kind of media you think should be shared through this directory you should:
+If you've created a web page, video, slideshow, or any other kind of media you think should be shared through this directory you should:
 
 1. Fork this repository.
 2. Add a link to your content on the appropriate course page.
diff --git a/about.md b/about.md
index 8aad3a89..37ecc9da 100644
--- a/about.md
+++ b/about.md
@@ -19,6 +19,10 @@ The [Data Science Specialization](https://www.coursera.org/specialization/jhudat
 - [Kevin Markham](http://www.dataschool.io/)
 - Derek Franks
 - David Hood
+- [Leonard Greski](https://github.com/lgreski)
 - Michael Sachs
 - Allan Inocêncio de Souza Costa
-- [stepds](https://github.com/stepds)
\ No newline at end of file
+- [stepds](https://github.com/stepds)
+- Bastiaan Quast
+- [Xing Su](http://sux13.github.io/DataScienceSpCourseNotes/)
+- [Edmund julian Ofilada](https://github.com/DocOfi)
diff --git a/capstone.md b/capstone.md
new file mode 100644
index 00000000..6285e422
--- /dev/null
+++ b/capstone.md
@@ -0,0 +1,14 @@
+---
+title: "Capstone"
+permalink: /capstone/
+layout: page
+---
+## Reference Material
+
+- [Speech and Language Processing, 3rd Edition](https://web.stanford.edu/~jurafsky/slp3/) Working version of the Jurafsky et al. book on natural language processing, whose content on n-grams is helpful for the capstone.
+
+## Course Project
+
+- [n-gram Computations and Computer Capacity](http://bit.ly/2couvxh) Explains the amount of memory required to convert the text files for the course project into n-grams, using the quanteda package.
+- [Capstone Strategy](http://bit.ly/2rGcgc6) Describes a general strategy to get through the Capstone: use the simplest approaches possible.
+- [Choosing a Text Analysis Package](http://bit.ly/2qagsPa) Reviews pros and cons of various R packages used for natural language processing, in the context of requirements for the Capstone project.
diff --git a/curated.md b/curated.md
index 0999830e..613c5f4e 100644
--- a/curated.md
+++ b/curated.md
@@ -1,6 +1,85 @@
 ---
 layout: page
-title: Curated Knowledge
+title: Curated Pages
 permalink: /curated/
 ---
 
+### Analytics
+
+- [Huge Trello Board Collection of Data Science Resources](https://trello.com/b/rbpEfMld/data-science)
+- [Diving Into Data Science Flipboard](https://flipboard.com/@thiakx/diving-into-data-science-5823ectuy)
+- [OLAP Operation in R](http://architects.dzone.com/articles/olap-operation-r)
+- [Journal of Statistical Software: Tidy data](http://www.jstatsoft.org/v59/i10/paper)
+- [Verzani: simpleR – Using R for Introductory Statistics](http://cran.r-project.org/doc/contrib/Verzani-SimpleR.pdf)
+- [Data Visualization packages](http://www.datavis.ca/R/)
+- [Visualization hints: plotting numeric data by groups](http://www.r-bloggers.com/visualization-series-insight-from-cleveland-and-tufte-on-plotting-numeric-data-by-groups/)
+- [Matrix rotation for image and contour plots in R](http://blog.snap.uaf.edu/2012/06/08/matrix-rotation-for-image-and-contour-plots-in-r/)
+- [Fig Data: 11 Tips on How to Handle Big Data in R (and 1 Bad Pun)](http://theodi.org/blog/fig-data-11-tips-how-handle-big-data-r-and-1-bad-pun)
+- [Data from 538](https://github.com/fivethirtyeight/data)
+
+### Command Line
+
+- [explainshell.com - match command-line arguments to their help text](http://explainshell.com/)
+- [The Command Line Crash Course - Quick course in using the command line](http://cli.learncodethehardway.org/book/)
+- [Mastering the command line, in one page](https://github.com/jlevy/the-art-of-command-line/blob/master/README.md)
+
+### R
+
+- [Try R](http://tryr.codeschool.com/)
+- [The R Book by Michael J. Crawley](https://archive.org/details/TheRBook/)
+- [Univ. of Calif. Riverside R Programming](http://manuals.bioinformatics.ucr.edu/home/programming-in-r#TOC-R-Basics)
+- [G. Sanchez - Strings in R](http://gastonsanchez.com/Handling_and_Processing_Strings_in_R.pdf)
+- [The Lubridate Package](http://www.jstatsoft.org/v40/i03/paper)
+- [Google Developers R Programming Video Lectures](http://www.r-bloggers.com/google-developers-r-programming-video-lectures/)
+- [awesome R](https://github.com/qinwf/awesome-R) - A curated list of awesome R frameworks, packages and software.
+- [awesome machine learning](https://github.com/josephmisiti/awesome-machine-learning#r) - A curated list of awesome Machine Learning frameworks, libraries and software.
+- [Google's R Style Guide](https://google-styleguide.googlecode.com/svn/trunk/Rguide.xml)
+- [Tufte-style HTML in rmarkdown](http://sachsmc.github.io/tufterhandout/)
+- [Creating an R Package](http://hilaryparker.com/2014/04/29/writing-an-r-package-from-scratch/)
+- [R Packages (Hadley online book)](http://r-pkgs.had.co.nz/) - How to write your own R packages.
+- [Beautiful ggplot2 Cheatsheet](http://zevross.com/blog/2014/08/04/beautiful-plotting-in-r-a-ggplot2-cheatsheet-3/)
+- [Intro to Graphics](http://bcb.dfci.harvard.edu/~aedin/courses/Bioconductor/2.Plotting.pdf)
+- [data.table cheat sheet](https://s3.amazonaws.com/assets.datacamp.com/img/blog/data+table+cheat+sheet.pdf)
+- [Exploratory Data Analysis with data.table](http://varianceexplained.org/RData/lessons/lesson4/)
+- [Fast summary statistics in R with data.table](http://blog.yhathq.com/posts/fast-summary-statistics-with-data-dot-table.html)
+- [R online in r-fiddle.org](http://www.r-fiddle.org/)
+
+### Probability and Statistics
+
+- [Probability and Statistics Cookbook](http://matthias.vallentin.net/probability-and-statistics-cookbook/)
+
+### GitHub
+
+- [Official Git Tutorial](http://git-scm.com/docs/gittutorial)
+- [Git - Simple Guide](http://rogerdudler.github.io/git-guide/)
+- [Git Immersion - A guided tour through the fundamentals of Git](http://gitimmersion.com/)
+- [GitHub - Dealing with Multiple Accounts](http://hmkcode.com/git-tutorial/how-to-deal-with-multiple-github-accounts-on-one-computer/)
+- [Try Git](https://try.github.io/levels/1/challenges/1)
+- [Learn Git Branching: Interactive Game](http://pcottle.github.com/learnGitBranching/)
+- [Atlassian Git Tutorials - Branches](https://www.atlassian.com/git/tutorials/using-branches/)
+
+### Reproducible Research
+- [Markdown live demo](http://markdown-here.com/livedemo.html)
+- [Boosting Slides by Ron Meir](https://github.com/Aratinga/Misc/blob/master/BoostingTutorial.pdf)
+- [Reproducible Research website](http://reproducibleresearch.net/)
+
+### Machine Learning
+- [UC Irvine Machine Learning Data Repository](http://archive.ics.uci.edu/ml/)
+
+### Textbooks
+- [OpenIntro textbook](https://www.openintro.org/stat/textbook.php)
+- [Statlect - The digital textbook on probability and statistics](http://www.statlect.com/)
+- [An Introduction to Statistical Learning with Applications in R](http://www-bcf.usc.edu/~gareth/ISL/) [[PDF, 4th printing]](http://www-bcf.usc.edu/~gareth/ISL/ISLR%20Fourth%20Printing.pdf)
+- [The Elements of Statistical Learning: Data Mining, Inference, and Prediction](http://statweb.stanford.edu/~tibs/ElemStatLearn/) [[PDF, 10th ed]](http://statweb.stanford.edu/~tibs/ElemStatLearn/printings/ESLII_print10.pdf)
+
+### Further Reading
+
+- [Data Elixir - Free weekly newsletter of the best data-related resources and inspirations from around the web.](http://dataelixir.com/?referred=true)
+- [Linkedin - Top 10 Big Data and Analytics References](https://www.linkedin.com/pulse/article/20140810194033-111366377-top-10-big-data-and-analytics-references)
+- [Linkedin - Let's Get Nerdy: Data Analytics for Business Leaders Explained](https://www.linkedin.com/pulse/article/20140918162814-111366377-let-s-get-nerdy-data-analytics-for-business-leaders-explained)
+- [Data Science Central : a great repository of news and resources for data science practitioners.](http://www.datasciencecentral.com)
+- [Data Science Ontology - A visualized overview of Data Science concepts and tools](http://datascienceontology.com/)
+
+### Data Science Groups, Meetups, and Networking
+
+- [LinkedIn Data Science Specialisation Group](https://www.linkedin.com/groups/Coursera-Specialization-Data-Science-7495000?home=&gid=7495000&trk=anet_ug_hm&goback=%2Egmp_7495000)
diff --git a/ddp.md b/ddp.md
index bc7941f4..0af67104 100644
--- a/ddp.md
+++ b/ddp.md
@@ -5,3 +5,21 @@ permalink: /ddp/
 ---
 
 - [Slidify to Github walkthrough](http://rpubs.com/thoughtfulbloke/25103)
+- [ggvis and rmarkdown slides with interactive plots](http://qua.st/ggvis-shiny-html5-slides)
+
+## Shiny
+- Choropleth of PBS WARN Distribution of Wireless Emergency Alerts
+ - [Code for Shiny App](https://github.com/amsilvr/shiny_choropleth)
+ - [App running on shinyapps.io](https://silverman.shinyapps.io/warn_wea/)
+- [Shiny app to simulate 401K growth with interactive plots](http://www.mephistosoftware.com/shiny/401k_simulator/)
+- [Shiny Video Tutorials Playlist on Youtube](http://www.youtube.com/playlist?list=PL6wLL_RojB5xNOhe2OTSd-DPkMLVY9DfB)
+- [Tutorial on writing Shiny simulation apps](https://github.com/homerhanumat/shinyTutorials)
+- [Dockerize a Shiny App](http://www.rmining.net/2015/04/30/dockerizing-a-shiny-app/)
+- [Git pushing Shiny Apps with Docker/Dokku](http://www.rmining.net/2015/05/11/git-pushing-shiny-apps-with-docker-dokku/)
+- [Share your Shiny Apps with Docker and Kitematic](http://www.rmining.net/2015/08/10/share-your-shiny-apps-with-docker-and-kitematic/)
+- [Shinyapps.io: Configuring Application Timeout](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/dataProd-shinyTimeoutConfig.md)
+- [Plotting Natural Disasters](http://www.rpubs.com/DocOfi/367052)
+
+## Comprehensive Notes
+
+- Complete notes for [Developing Data Products](http://sux13.github.io/DataScienceSpCourseNotes/)
diff --git a/eda.md b/eda.md
index 9133e18d..1f56ac70 100644
--- a/eda.md
+++ b/eda.md
@@ -4,3 +4,13 @@ title: Exploratory Data Analysis
 permalink: /eda/
 ---
 
+- [Creating a Kite Graph](http://rpubs.com/thoughtfulbloke/kitegraph)
+- [Analyzing Top/Green500 Supercomputer Technology Trends](http://github.com/ww44ss/Exascalar-Analysis-)
+- [Emissions Choropleth Maps](https://github.com/BillSeliger/ExData_Plotting2)
+- [Data Analysis using Twitter API and Python](http://blog.impiyush.com/2015/03/data-analysis-using-twitter-api-and.html)
+- [Exploratory Data Analysis using Flexdashboard](http://rpubs.com/DocOfi/350830)
+- [Plotting using Metricsgraphics](http://www.rpubs.com/DocOfi/352947)
+
+## Comprehensive Notes
+
+- Complete notes for [Exploratory Data Analysis](http://sux13.github.io/DataScienceSpCourseNotes/)
diff --git a/getclean.md b/getclean.md
index a3e98f60..deeccc56 100644
--- a/getclean.md
+++ b/getclean.md
@@ -6,3 +6,22 @@ permalink: /getclean/
 
 - [Subsetting example walkthrough](http://rpubs.com/thoughtfulbloke/subset)
 - [Apples to Oranges Data Organisation Challenge](https://github.com/thoughtfulbloke/faoexample)
+- [dplyr introductory tutorial](https://www.youtube.com/watch?v=jWjqLW-u3hc) and [R Markdown document](http://rpubs.com/justmarkham/dplyr-tutorial): A 39-minute video tutorial that covers the five basic dplyr "verbs" and a dozen other dplyr functions. dplyr is an [update](http://blog.rstudio.org/2014/01/17/introducing-dplyr/) to the plyr package, useful for subsetting, sorting, summarizing, and merging data using a more intuitive syntax than plyr or base R.
+- [dplyr "going deeper" tutorial](https://www.youtube.com/watch?v=2mh1PqfsXVI) and [R Markdown document](http://rpubs.com/justmarkham/dplyr-tutorial-part-2): A 37-minute video tutorial that covers the new functionality in dplyr versions 0.3 and 0.4.
+- [Downloading files general advice](http://rpubs.com/thoughtfulbloke/downloadtips)
+- [Codebook sample](https://gist.github.com/kirstenfrank/218c36a1938055d0f4e4)
+- [Second Codebook sample](https://gist.github.com/kirstenfrank/699abe3e16fd1dc36e5d)
+- [Query string (and other fields-within-fields) unrolling](http://rpubs.com/schnee/32988)
+- [Pre-processing Excel files before loading them into R](https://github.com/alkashef/cleaningexceldata)
+- [Codebook template that can be used in the Getting and Cleaning Data project](https://gist.github.com/JorisSchut/dbc1fc0402f28cad9b41)
+- ["Real world" example - reading American Community Survey 2000 PUMS Data:](https://github.com/lgreski/acsexample) Demonstrates how to extract records of a given type from a data file containing multiple record types, and how to use an Excel-based code book to specify arguments for reading a fixed-width file.
+- [18 Months of CTA advice](https://thoughtfulbloke.wordpress.com/2015/08/31/hello-world)
+- [Common Problems: Quiz 1 - Missing Java Runtime](http://bit.ly/2jjtyXM) Explains how to solve the problem of a missing Java Runtime for the question that requires students to process a Microsoft Excel spreadsheet.
+- [Strategy for Reading Files & APIs / Quiz 2](http://bit.ly/2e4L5oF)
+- [Common Problems: Quiz 2 - sqldf() driver fails to connect](http://bit.ly/2kD2KTY)
+- [Tutorial: Downloading Files](http://bit.ly/2iP2suj) Illustrates various ways of downloading files, including binary and text files.
+- [Creating dataframes from xml data](https://www.dropbox.com/s/7bbzzp4bwsmfl5y/CreatingDataframesfrom%20XmlFiles.odt?dl=0)
+
+## Comprehensive Notes
+
+- Complete notes for [Getting and Cleaning Data](http://sux13.github.io/DataScienceSpCourseNotes/)
diff --git a/index.md b/index.md
index 26c7f5a7..761f3e41 100644
--- a/index.md
+++ b/index.md
@@ -4,7 +4,7 @@ layout: page
 
 ## Table of Contents
 
-This is site is meant to serve as a directory for the amazing content the
+This site is meant to serve as a directory for the amazing content the
 community has created around the Data Science Specialization. If you are
 interested in contributing [click here](https://github.com/DataScienceSpecialization/DataScienceSpecialization.github.io#contributing).
 
@@ -17,6 +17,7 @@ interested in contributing [click here](https://github.com/DataScienceSpecializa
 7. [Regression Models](/regmod/)
 8. [Practical Machine Learning](/pml/)
 9. [Developing Data Products](/ddp/)
+10. [Capstone](/capstone/)
 
 - [Other Resources](/other/)
-- [Curated Knowledge](/curated/)
+- [Curated Pages](/curated/)
diff --git a/other.md b/other.md
index 701275c9..ddb49135 100644
--- a/other.md
+++ b/other.md
@@ -4,7 +4,30 @@ title: Other Resources
 permalink: /other/
 ---
 
-## Troubleshooting
+## Configuring R and RStudio (Linux)
 
 - [Installing xlsx and XML packages on Debian Wheezy](http://allanino.me/blog/programming/installing-some-r-packages/)
+- [Rscript to customize R environment](http://bit.ly/r-customize-script) - Installs packages used in the specialization.
+- [Installing Some Basic R Packages in Ubuntu; Ibrahim El Merehbi](http://elmerehbi.wordpress.com/2014/09/09/installing-some-basic-r-packages-in-ubuntu)
+- [Using Projects in RStudio](https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects)
+- [Using Version Control with RStudio](https://support.rstudio.com/hc/en-us/articles/200532077-Version-Control-with-Git-and-SVN)
+- [Using R behind HTTP/HTTPS Proxy](https://support.rstudio.com/hc/en-us/articles/200488488-Configuring-R-to-Use-an-HTTP-or-HTTPS-Proxy)
+
+### Ignoring R & RStudio files
+- [gitignore template for R](https://github.com/github/gitignore/blob/master/R.gitignore) (source:[gitignore](https://github.com/github/gitignore))
+- [Github Help - Using Git / Ignoring files](https://help.github.com/articles/ignoring-files/)
+
+## Troubleshooting
 - [Windows batch file to work around RStudio startup issues](https://github.com/stepds/contrib-DataScienceSpecialization/blob/master/README.md)
+
+## Pre-built virtual machines for R development.
+- [Here's a pre-built lightweight Linux machine with R and RStudio already installed](https://github.com/queirozfcom/r-box). You just need to install [vagrant](https://www.vagrantup.com/downloads.html), download (or clone) the github repository and you'll get a clean ubuntu machine with the tools you'll need for the assignments.
+
+- [Data Science Toolbox](http://datasciencetoolbox.org/) - A virtual environment that allows you to start doing data science in a matter of minutes.
+
+- [Virtual machine with RStudio server and github setup](https://github.com/tboloo/vagrant-rstudio) - A VirtualBox, Vagrant & chef-solo managed virtual machine which provides RStudio server with git & github setup
+
+## Deploying and sharing Shiny Apps with Docker
+- [Dockerize a Shiny App](http://www.rmining.net/2015/04/30/dockerizing-a-shiny-app/)
+- [Git pushing Shiny Apps with Docker/Dokku](http://www.rmining.net/2015/05/11/git-pushing-shiny-apps-with-docker-dokku/)
+- [Share your Shiny Apps with Docker and Kitematic](http://www.rmining.net/2015/08/10/share-your-shiny-apps-with-docker-and-kitematic/)
diff --git a/pml.md b/pml.md
index c0407c5d..1054002d 100644
--- a/pml.md
+++ b/pml.md
@@ -7,3 +7,31 @@ permalink: /pml/
 ## Model Evaluation
 
 - [Simple Guide to Confusion Matrix Terminology (sensitivity, specificity, etc.)](http://www.dataschool.io/simple-guide-to-confusion-matrix-terminology/)
+- ROC curves and Area Under the Curve explained: [video tutorial](http://youtu.be/OAl6eAyP-yo), [companion blog post](http://www.dataschool.io/roc-curves-and-auc-explained/) (with video transcript and screenshots)
+
+## Supplementary Videos
+
+- [What is machine learning, and how does it work?](https://www.youtube.com/watch?v=elojMnjn4kk): A high-level overview of machine learning in a 10-minute video
+- [Video lectures from "An Introduction to Statistical Learning"](http://www.dataschool.io/15-hours-of-expert-machine-learning-videos/): Videos for Chapters 4, 5, 6, 8, and 10 can help to deepen your understanding of the topics presented in this course.
+
+## Machine Learning Competitions
+
+- [Participating in Kaggle's Allstate Purchase Prediction Challenge](http://www.dataschool.io/kaggle-allstate-purchase-prediction-challenge/): Description of what it's like to compete in a Kaggle competition, including links to a project paper, R code, presentation slides, and a presentation video.
+
+## Choosing a Machine Learning Model
+
+- [Comparing Supervised Learning Algorithms](http://www.dataschool.io/comparing-supervised-learning-algorithms/): Comparing 8 common supervised learning algorithms (for regression and classification) on 13 different dimensions.
+
+## Content Related to the Lectures
+
+- Complete notes for [Practical Machine Learning](http://sux13.github.io/DataScienceSpCourseNotes/)
+- [Week 4: Combining Predictors -- Math Explained](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/pml-combiningPredictorsBinomial.md)
+
+## Configuring Github Pages with RStudio for PML Project
+
+- Step by step instructions to [Configure Github Pages with RStudio](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/pml-ghPagesSetup.md) to support the PML course project.
+
+## Improving Runtime Performance of Caret
+
+- Step by step instructions to [implement parallel processing in caret::train()](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/pml-randomForestPerformance.md) on a random forest model, along with runtime performance analysis for a variety of laptops, ranging from an Intel Atom-based tablet to a quad-core i7 processor.
+
diff --git a/regmod.md b/regmod.md
index c72eefd1..1445c83d 100644
--- a/regmod.md
+++ b/regmod.md
@@ -4,3 +4,10 @@ title: Regression Models
 permalink: /regmod/
 ---
 
+## Supplementary Videos
+
+- [Video lectures from "An Introduction to Statistical Learning"](http://www.dataschool.io/15-hours-of-expert-machine-learning-videos/): Videos for Chapter 3 can help to deepen your understanding of regression.
+
+## Comprehensive Notes
+
+- Complete notes for [Regression Models](http://sux13.github.io/DataScienceSpCourseNotes/)
diff --git a/repres.md b/repres.md
index 09979e38..cba776f9 100644
--- a/repres.md
+++ b/repres.md
@@ -6,4 +6,11 @@ permalink: /repres/
 
 - [Turning a RPubs document into a Github website walkthrough](https://github.com/thoughtfulbloke/appleorange)
 - [Introduction to knitr with rmarkdown](https://sachsmc.github.io/knit-git-markr-guide/knitr/knit.html)
+- [Trends and severity of Data Breaches](http://rpubs.com/ww44ss/29389)
+- [Benefit-cost analysis of a park user fee](https://rstudio-pubs-static.s3.amazonaws.com/72135_dc45211d976842c2a9a8c8b5f2472ff0.html)
+- [Data Lake Integrity](http://rpubs.com/rshane/81297)
+- [ProjectTemplate in RStudio with Git](http://padamson.github.io/r/rstudio/projecttemplate/git/2016/01/17/projecttemplate-in-rstudio-with-git.html)
 
+## Comprehensive Notes
+
+- Complete notes for [Reproducible Research](http://sux13.github.io/DataScienceSpCourseNotes/)
diff --git a/rprog.md b/rprog.md
index 37f9dbf0..47df54d1 100644
--- a/rprog.md
+++ b/rprog.md
@@ -1,14 +1,57 @@
 ---
-layout: page
-title: R Programming
+title: "R Programming"
 permalink: /rprog/
+layout: page
 ---
 
+## Getting Started
+- [Resources for R Programming](http://bit.ly/2dhZ8Dy)
+- [References for R Programming](http://bit.ly/2b8AxhF)
+- [Data Science Specialization Value Proposition](http://bit.ly/2j3EcCn)
+- [R Onboarding for SAS Users](http://bit.ly/2dr7yum)
+
 ## Programming Assignments
 
-- [Tutorial for those struggling with Programming Assignment 1](https://github.com/derekfranks/practice_assignment)
+- [Strategy for Coding the Programming Assignments](http://bit.ly/2ddFh9A)
+- [Tutorial for those struggling with Programming Assignment 1](https://github.com/derekfranks/practice_assignment)
+- [Breaking Down pollutantmean](http://bit.ly/2cHyiCl)
+- [Assignment 1: A More Elegant Solution](http://bit.ly/2kwBBlK)
+- [A SAS Version of pollutantmean?](http://bit.ly/2d3DR4e)
+- [Tutorial for those struggling with Programming Assignment 2](https://github.com/DanieleP/PA2-clarifying_instructions)
+- [Tutorial for those struggling with Programming Assignment 3](https://github.com/DanieleP/PA3-tutorial)
+- [PA1-test: `testthat`, Unit Tests for Programming Assignment 1](https://github.com/cbryant1000/pa1test)
+- [PA3-test: `testthat`, Unit Tests for Programming Assignment 3](https://github.com/cbryant1000/pa3test)
+- [Alternative submit script for Programming Assignment 1 that makes submitting more convenient by allowing selection of multiple parts plus prompting if user wants to submit another part before exiting](https://github.com/rchampoux/coursera/blob/master/rprog-scripts-submitscript1.R)
+- [Grading the SHA-1 Hash Code](http://bit.ly/2iUWoB6)
+- [Assignment 2: Demystifying makeVector](http://bit.ly/2bTXXfq)
+- [Assignment 2: makeCacheMatrix as an Object](http://bit.ly/2byUe4e)
 
 ## R Language
 
-- [Some notes on the R Language](http://lopezrj.github.io)
\ No newline at end of file
+- [Some notes on the R Language](http://lopezrj.github.io)
+- [A Data Frame is Also a List](http://bit.ly/2fmMRAp)
+- [S Objects, R Objects, and Lexical Scoping](http://bit.ly/2dtOSXi)
+- [Common R Mistakes: Overwriting Functions with Data Objects](http://bit.ly/2i3gmoA)
+- [Forms of the Extract Operator](http://bit.ly/2bzLYTL)
+- [Functions to Sort Data Frames](http://bit.ly/2dxItzw)
+- [Creative Use of R: Downloading Course Lectures](http://bit.ly/2bGlI7R) Article illustrating how to use R to automate the download of lectures from *Data Science Specialization* courses, such as *R Programming*. Techniques used in this article are helpful to make research reproducible, as required for courses like *Getting and Cleaning Data* and *Reproducible Research*.
+- [Lexical Scoping and Statistical Computing](http://bit.ly/2cmqAPy) Article by Robert Gentleman and Ross Ihaka at the University of Auckland describing how lexical scoping works, and why it is valuable in statistical computing.
+- [Data Science Job Report 2017: R Passes SAS, But Python Leaves Them Both Behind](http://bit.ly/2oCHulX) Bob Muenchen's take on the job market for various data science languages.
+
+
+
+## R language cheatsheet
+
+- [R cheatsheet covering all lectures](https://github.com/startupjing/Tech_Notes/blob/master/R/R_language.md)
+
+## R and Commercial Statistics Packages
+
+- [R Onboarding for SAS Users](http://bit.ly/2dr7yum) Provides an overview and links to a variety of resources to help people with SAS experience make the transition to R
+- [Commercial Statistics Packages: An Historical Perspective](http://bit.ly/2fPj2qN)
+- [Why is R More Difficult than SAS?](http://bit.ly/2erxk3A)
+- [Thinking in R versus Thinking in SAS](http://bit.ly/2cH3u8x)
+
+## Comprehensive Notes
+
+- Complete notes for [R Programming](http://sux13.github.io/DataScienceSpCourseNotes/)
diff --git a/statinf.md b/statinf.md
index fb8017ab..19592a27 100644
--- a/statinf.md
+++ b/statinf.md
@@ -4,3 +4,19 @@ title: Statistical Inference
 permalink: /statinf/
 ---
 
+- [Why degrees of freedom decrease for sample variance](https://github.com/Manu58/bias/blob/master/bias.pdf)
+- [CONCEPTS: Calculating Area for a Point on the Normal Curve](http://bit.ly/2hw5AMF) Reviews the mathematics that explain why one cannot calculate the exact probability for a specific value within a distribution for a continuous variable, and illustrates how to calculate a quantile for a point on the curve.
+- [Analysis of exponential distribution of births data set from the CDC](https://gist.github.com/ProgramErgoSum/5316008387746fcd84de)
+- [Exponential Distribution / Central Limit Theorem - Assignment Checklist](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/statinf-expDistChecklist.md)
+- [ToothGrowth Analysis - Assignment Checklist](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/ToothGrowthChecklist.md)
+- [Exploratory Data Analysis in ToothGrowth Assignment](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/edaInToothGrowthAnalysis.md), explaining the exploratory data analysis requirement for students who have not taken the *Exploratory Data Analysis* course prior to taking *Statistical Inference*.
+- [Using MathJax with Discussion Forums, R Markdown, and Github Pages](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/mathjaxWithGithubMarkdown.md)
+- [Kable Tables with Data Frames](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/kableDataFrameTable.md) illustrates how to display a custom table in a `knitr()` document by creating a data frame to contain the information to be rendered with `kable()`.
+- [Interactive Confidence Interval Visualization](https://github.com/amcadie/interactive_CI)
+- [Installing MiKTeX on Windows 10 / Generate a PDF from knitr](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/statinf-generatePDF.md)
+- [Power calculations: optimal sample size](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/statinf-optimalSampleSize.md)
+- [Permutation Tests Explained](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/statinf-permutationTests.md)
+
+## Comprehensive Notes
+
+- Complete notes for [Statistical Inference](http://sux13.github.io/DataScienceSpCourseNotes/)
diff --git a/toolbox.md b/toolbox.md
index 67cb050f..3c2dfc68 100644
--- a/toolbox.md
+++ b/toolbox.md
@@ -6,9 +6,21 @@ permalink: /toolbox/
 
 ## Command Line
 
+- [Working with files in Bash](http://edgarsh.es/ins/working-with-files-in-bash/)
+
 ## Git/GitHub
 
 - [Git & GitHub Video Playlist](https://www.youtube.com/playlist?list=PL5-da3qGB5IBLMp7LtN8Nc3Efd4hJq0kD) (also available for [download](https://drive.google.com/folderview?id=0BxRfg0msVmAoRlZFQjJ3T3VTOUE&usp=sharing) as mp4 files)
 - [A Beginner's Quick Reference Guide for Git Commands](http://www.dataschool.io/git-quick-reference-for-beginners/)
 - [Understanding the Relationship Between Git and GitHub](http://www.dataschool.io/github-is-just-dropbox-for-git/)
 - [Simple Guide to GitHub Forks](http://www.dataschool.io/simple-guide-to-forks-in-github-and-git/)
+- [Github Repo Tutorial: How to fork a repo, download it to your local drive, and commit changes](https://www.youtube.com/watch?v=MY94AIplcaU)
+- [Configuring RStudio to work with Git / Github - Mac OSX](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/configureRStudioGitOSXVersion.md)
+- [Configuring RStudio to work with Git / Github - Windows](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/configureRStudioGitWindowsVersion.md)
+
+## Comprehensive Notes
+
+- Complete notes for [The Data Scientist's Toolbox](http://sux13.github.io/DataScienceSpCourseNotes/)
+
+## Miscellaneous
+- [Using Editor Modes in Coursera Discussion Forum Posts](https://github.com/lgreski/datasciencectacontent/blob/master/markdown/usingMarkdownInForumPosts.md)