All of my Kaggle code in one place... mostly... if I ever get it all uploaded.
For the most part, these are simple snippets of reusable techniques gleaned from all sorts of Kaggle competitions. Examples of things that are in here, or should be but aren't built yet:
- Automatic Numeric Feature Creation from Dates, Factors, and Character variables
- Golden feature ideas such as:
  - The difference between highly correlated features (https://www.kaggle.com/yangnanhai/homesite-quote-conversion/keras-around-0-9633/comments)
  - Counting the number of out-of-band, NA, or zero fields in a record
- Techniques to generate features such as: PCA, t-SNE, SOM, etc.
- Scaling methods
- Duplicate feature identification
- Sample models of various kinds (from decision trees to neural networks) and output types (logit, regression, classification)
- Feature selection notes
- Meta-model generation for automatic mining
- Approaches to difficult problems like image analysis and full text analysis
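
To make the "golden feature" ideas in the list above a bit more concrete, here is a minimal sketch in Python with pandas. It is only an illustration, not code from this collection: the function names, the generated column names (`na_count`, `zero_count`, `<a>_minus_<b>`), and the 0.95 correlation threshold are all assumptions made for the example.

```python
import pandas as pd


def add_row_counts(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with per-record counts of NA and zero fields.

    The column names "na_count" and "zero_count" are hypothetical.
    """
    out = df.copy()
    out["na_count"] = df.isna().sum(axis=1)
    # NaN == 0 evaluates to False, so missing values are not counted as zeros.
    out["zero_count"] = (df == 0).sum(axis=1)
    return out


def add_correlated_diffs(df: pd.DataFrame, threshold: float = 0.95) -> pd.DataFrame:
    """Return a copy of df with a difference column for each highly correlated pair.

    A pair of numeric columns counts as "highly correlated" when the absolute
    Pearson correlation is at least `threshold` (0.95 is an assumed default).
    """
    out = df.copy()
    numeric = df.select_dtypes(include="number")
    corr = numeric.corr().abs()
    cols = list(corr.columns)
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            if corr.loc[a, b] >= threshold:
                out[f"{a}_minus_{b}"] = df[a] - df[b]
    return out
```

Run `add_correlated_diffs` before `add_row_counts` if you use both, so the count columns do not take part in the correlation search. The pairwise loop is quadratic in the number of numeric columns, which is fine for a few hundred features but worth rethinking for very wide tables.
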
Again, this is a snippet collection, not fully working code!