Scott Davis Transcript
Open Source Data Science Masters
I'm going to have some time for independent study this year, so I plan to work through as much of this curriculum as possible. I work in the real estate industry, where we have a great deal of data that isn't used for meaningful analysis, and the tools, though readily available, haven't caught up with the needs of real estate users. That's what I'm interested in working on. I use a lot of GIS and R, so my curriculum is tailored to follow [R](https://www.r-project.org/)/[Python](https://www.python.org) and [QGIS](https://www.qgis.org). I'm a bit of an open-source nut, so I like learning much better this way. I'm looking for people to connect with, and possibly to work on projects with.
Want to collaborate? Get in touch:
* [linkedin](http://www.linkedin.com/in/scottcdavis);
* [twitter](http://www.twitter.com/scottdavisCRE); or
* [email](mailto:scott@tisonadevelopment.com)
Open Source Curriculum
Base Introduction
Data Science Introductions
- [ ] Intro to Data Science by UW / Coursera, online course
- [ ] Data Science Specialization by Johns Hopkins / Coursera
- [X] [The Data Scientist's Toolbox](https://www.coursera.org/account/accomplishments/certificate/UY4EBM46HL)
- [X] [R Programming](https://www.coursera.org/account/accomplishments/records/Va5vuEvGKyr7UyHEL)
- [X] [Getting and Cleaning Data](https://www.coursera.org/account/accomplishments/records/ENSGmvNfx24sANRW)
- [X] [Exploratory Data Analysis](https://www.coursera.org/account/accomplishments/records/2PPsRu2Us3sUehBQ)
- [X] [Reproducible Research]
- [ ] [Statistical Inference] (in progress)
- [ ] [Regression Models] (in progress)
- [X] [Practical Machine Learning]
- [ ] [Developing Data Products]
- [ ] [Data Science Capstone]
- [ ] [Data Science by Harvard](http://cs109.github.io/2015/) (online course)
- [ ] [Data Science with Open Source Tools](http://shop.oreilly.com/product/9780596802363.do)
- [50 Years of Data Science](http://pages.cs.wisc.edu/~anhai/courses/784-fall15/50YearsDataScience.pdf)
- [ ] [Data Smart](http://www.amazon.com/Data-Smart-Science-Transform-Information/dp/111866146X/ref=sr_1_1?s=books&ie=UTF8&qid=1458768727&sr=1-1&keywords=datasmart) - written for Excel, but the examples also work in LibreOffice, and much of business analytics is still done in Excel.
Mathematics/Statistics
- [ ] [Statistics for Spatial Data, Revised Edition](http://www.wiley.com/WileyCDA/WileyTitle/productCd-1119114616.html)
- [ ] [Statistics for Spatio-Temporal Data](http://www.wiley.com/WileyCDA/WileyTitle/productCd-EHEP002348.html)
- [ ] [Linear Algebra](http://www.amazon.com/Linear-Algebra-Dover-Books-Mathematics/dp/048663518X)
- [ ] Problem-Solving Heuristics: [How to Solve It](http://www.amazon.com/How-Solve-It-Mathematical-Princeton/dp/069111966X)
Computing
R:
- [ ] [R in Action](https://www.manning.com/books/r-in-action-second-edition?a_bid=5c2b1e1d&a_aid=RiA2ed)
- [ ] [R Cookbook](http://shop.oreilly.com/product/9780596809164.do)
- [ ] [Forecasting: Principles and Practice](http://otexts.com/fpp/)
R Libraries/Task Views
* [ProjectTemplate](http://projecttemplate.net/index.html)
* Spatial Data [CRAN Task View: Analysis of Spatial Data](https://cran.r-project.org/web/views/Spatial.html)
* Spatio-Temporal Data [CRAN Task View: Handling and Analyzing Spatio-Temporal Data](https://cran.r-project.org/web/views/SpatioTemporal.html)
* Optimization [CRAN Task View: Optimization and Mathematical Programming](https://cran.r-project.org/web/views/Optimization.html)
* Finance [CRAN Task View: Empirical Finance](https://cran.r-project.org/web/views/Finance.html)
Python:
- [ ] [Dive Into Python](http://www.diveintopython.net/)
- [ ] [Google's Python Class](https://code.google.com/edu/languages/google-python-class/)
- [ ] [Python for Data Analysis](http://shop.oreilly.com/product/0636920023784.do)
- [ ] [Web Scraping with Python](https://www.packtpub.com/big-data-and-business-intelligence/web-scraping-python)
QGIS:
- [X] [QGIS Tutorials and Tips](http://www.qgistutorials.com/en/)
- [X] [Mastering QGIS](https://www.packtpub.com/application-development/mastering-qgis)
- [ ] [Building Mapping Applications with QGIS](https://www.packtpub.com/application-development/building-mapping-applications-qgis)
- [ ] [GIS Tutorial Workbook 1](https://esripress.esri.com/display/index.cfm?fuseaction=display&websiteID=232&moduleID=1) This is for ArcView, but you can work the examples in QGIS too.
- [ ] [GIS Tutorial Workbook 2: Spatial Analysis](https://esripress.esri.com/display/index.cfm?fuseaction=display&websiteID=230&moduleID=0) This is for ArcView, but you can work the examples in QGIS too.
- [ ] [QGIS Map Design](https://locatepress.com/qmd) I've just thumbed through this, but it's beautiful and belongs on any list of GIS books.
MySQL:
- [ ] [Learn MySQL in One Video](https://www.youtube.com/watch?v=yPu6qV5byu4)
- [ ] [MySQL Workbench Starter](code.google.com/edu/languages/google-python-class/)
Octave:
- [ ] [GNU Octave Beginners Guide](https://www.packtpub.com/big-data-and-business-intelligence/gnu-octave-beginners-guide)
PostGIS/PostgreSQL:
- [ ] [PostGIS Essentials](https://www.packtpub.com/big-data-and-business-intelligence/postgis-essentials)
- [ ] [PostgreSQL Tutorial](http://www.postgresqltutorial.com/)
- [ ] [PostgreSQL: Up and Running: A Practical Introduction to the Advanced Open Source Database](http://shop.oreilly.com/product/0636920032144.do)
Algorithms
- [ ] [Algorithms Design & Analysis](http://openclassroom.stanford.edu/MainFolder/CoursePage.php?course=IntroToAlgorithms) (Stanford OpenClassroom)
Distributed Computing Paradigms
- [ ] Intro to Hadoop and MapReduce by Cloudera and Udacity
*Note: I might swap the above course with an EdX course on Apache Spark and distributed computing*
Data Mining
- [ ] Mining Massive Data Sets, by Stanford and Coursera
- [ ] [Clean Data](https://www.packtpub.com/big-data-and-business-intelligence/clean-data)
Machine Learning/Predictive Analytics - Foundational/Theoretical/Practical
- [ ] Machine Learning, by Andrew Ng, Stanford / Coursera (NB: this class requires a lot of higher-level math)
- [ ] [An Introduction to Statistical Learning with Applications in R](http://www.r-bloggers.com/in-depth-introduction-to-machine-learning-in-15-hours-of-expert-videos/) (by the authors of The Elements of Statistical Learning at Stanford.)
- [ ] [Machine Learning with R](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-r-second-edition)
- [ ] [Building a Recommendation System in R](https://www.packtpub.com/big-data-and-business-intelligence/building-recommendation-system-r)
- [ ] [Mastering Predictive Analytics in R](https://www.packtpub.com/application-development/mastering-predictive-analytics-r)
- [ ] [Bootstrapping Machine Learning](http://www.louisdorard.com/machine-learning-book/)
- [ ] [Applied Predictive Modeling](http://www.amazon.com/gp/product/1461468485?psc=1&redirect=true&ref_=oh_aui_detailpage_o08_s00)
Analysis
- [ ] [Practical Data Science Cookbook](http://www.diveintopython.net/)
- [ ] [R Data Analysis Cookbook](code.google.com/edu/languages/google-python-class/)
Spatial Analysis
- [ ] [An Introduction to R for Spatial Analysis and Mapping](https://us.sagepub.com/en-us/nam/an-introduction-to-r-for-spatial-analysis-and-mapping/book241031)
- [ ] [Applied Spatial Data Analysis with R](http://www.springer.com/us/book/9781461476177)
Land Use/Transport/Gravity Modeling
- [ ] [Integrated Land Use and Transport Modelling: Decision Chains and Hierarchies](http://www.amazon.com/gp/product/0521022177?psc=1&redirect=true&ref_=oh_aui_detailpage_o03_s00)
- [ ] [Gravity and Spatial Interaction Models (Scientific Geography Series)](http://www.amazon.com/gp/product/0803925441?psc=1&redirect=true&ref_=oh_aui_detailpage_o06_s00)
- [ ] [TRANUS Model](http://www.tranus.com/tranus-english)
- [ ] [Urban Sim](https://pypi.python.org/pypi/urbansim)
- [ ] [Huff-tools Package in R](http://rstudio-pubs-static.s3.amazonaws.com/42357_1e6fcc5bcfec439096eb86a106ebf22e.html)
Data Design/Data Viz
- [ ] [Beautiful Evidence](http://www.edwardtufte.com/tufte/books_be)
- [ ] [Semiology of Graphics](http://www.amazon.com/Semiology-Graphics-Diagrams-Networks-Maps/dp/1589482611)
- [ ] [Visual Complexity: Mapping Patterns of Information](http://www.visualcomplexity.com/vc/book/)
- [ ] [The Visual Display of Quantitative Information](http://www.edwardtufte.com/tufte/books_vdqi)
- [ ] [Design for Information](http://isabelmeirelles.com/book-design-for-information/)
- [ ] [Design Elements: A Graphical Style Manual](http://www.amazon.com/Design-Elements-Graphic-Style-Manual/dp/1592532616)
- [ ] [Storytelling with Data](http://www.amazon.com/gp/product/1119002257?psc=1&redirect=true&ref_=oh_aui_detailpage_o09_s00)
- [ ] [Mastering Python Data Visualization](https://www.packtpub.com/big-data-and-business-intelligence/mastering-python-data-visualization)
- [ ] [The Grammar of Graphics](https://www.packtpub.com/big-data-and-business-intelligence/mastering-python-data-visualization)
- [ ] [R Graphics Cookbook](http://shop.oreilly.com/product/9780596809164.do)
Relevant prior studies
- [X] MS in Community and Regional Planning, UT-Austin
- [X] BA in Liberal Arts, concentration in geography, UT-Austin
Open Source Data Science Masters Capstone Project
I'm interested in using data science approaches to bring better intelligence to real estate decisions, specifically by evaluating population growth, transactions, and location decisions. I'd also like to evaluate statistical learning techniques for making better pricing decisions. Finally, I'd like to develop a model to optimize real estate portfolios.
If you'd like to pair up for the capstone, [let me know](http://www.twitter.com/scottdavisCRE).