diff --git a/README.md b/README.md index bb8de8f..d7716f2 100644 --- a/README.md +++ b/README.md @@ -1,377 +1,159 @@ -# 机器学习资源 Machine learning Resources - -**致力于分享最新最全面的机器学习资料,欢迎你成为贡献者!** - -*快速开始学习:* - -- 周志华的[《机器学习》](https://pan.baidu.com/s/1hscnaQC)作为通读教材,不用深入,从宏观上了解机器学习 - - 《机器学习》西瓜书公式推导解析:https://datawhalechina.github.io/pumpkin-book/ - -- 最新的[《神经网络与深度学习》](https://mp.weixin.qq.com/s?__biz=MzIwOTc2MTUyMg==&mid=2247488439&idx=1&sn=df51b67ac2a42fe1a8417a7e4d308b8b&chksm=976fb62aa0183f3c8cfbfcf2c1613aa3a168f782bc5b439aa2a5db9574a33f678a081a1d24a5&mpshare=1&scene=1&srcid=0409hgaWjfxz2LzGtniTpAKh&key=12a4c5f4665589b6914fa6a60a7fe4bd6a4fc4855ac8967b945678646a60c26482467697a46b85e85c7a6a7d564aac41d6c0312307a7f95ba299d3b3cf8433f9a159f999d9484534452672dbdd9fd270&ascene=1&uin=NjMzMjQzMTYw&devicetype=Windows+10&version=62060739&lang=zh_CN&pass_ticket=CIhr0hAvTnkZIvwFNRQ2%2BWhir8OVCkCt9tarvfIPS5SWtyyQKMLGOBt%2BItSffrll) - -- 李航的[《统计学习方法》](https://pan.baidu.com/s/1dF2b4jf)作为经典的深入案例,仔细研究几个算法的来龙去脉 | [书中的代码实现](https://github.com/WenDesi/lihang_book_algorithm) - -- 使用Python语言,根据[《机器学习实战》](https://pan.baidu.com/s/1gfzV7PL)快速上手写程序 - -- 来自国立台湾大学李宏毅老师的机器学习和深度学习中文课程,强烈推荐:[课程](http://speech.ee.ntu.edu.tw/~tlkagk/courses.html) - -- 《迁移学习导论》助你快速入门迁移学习! 
[书的主页](http://jd92.wang/tlbook) - - 迁移学习统一代码库:[Domain adaptation](https://github.com/jindongwang/transferlearning/tree/master/code/DeepDA) | [Domain generalization](https://github.com/jindongwang/transferlearning/tree/master/code/DeepDG) | [更多代码](https://github.com/jindongwang/transferlearning) - -- 最后,你可能想真正实战一下。那么,请到著名的机器学习竞赛平台Kaggle上做一下这些基础入门的[题目](https://www.kaggle.com/competitions?sortBy=deadline&group=all&page=1&pageSize=20&segment=gettingStarted)吧!(Kaggle上对于每个问题你都可以看到别人的代码,方便你更加快速地学习)  [Kaggle介绍及入门解读](https://zhuanlan.zhihu.com/p/25686876) [可以用来练手的数据集](https://www.kaggle.com/annavictoria/ml-friendly-public-datasets/notebook) - -其他有用的资料: - -- 想看别人怎么写代码?[机器学习经典教材《PRML》所有代码实现](https://github.com/ctgk/PRML) - -- [机器学习算法Python实现](https://github.com/lawlite19/MachineLearning_Python) - -- [吴恩达新书:Machine Learning Yearning中文版](https://pan.baidu.com/s/10kosKx6rDguS4tPejY-fRw) - -- 另外,对于一些基础的数学知识,你看[深度学习(花书)中文版](https://github.com/exacity/deeplearningbook-chinese)就够了。这本书同时也是**深度学习**经典之书。 - -- 来自南京大学周志华小组的博士生写的一本小而精的[解析卷积神经网络—深度学习实践手册](http://lamda.nju.edu.cn/weixs/book/CNN_book.html) - -- - - - -[一个简洁明了的时间序列处理(分窗、特征提取、分类)库:Seglearn](https://dmbee.github.io/seglearn/index.html) - -[计算机视觉这一年:这是最全的一份CV技术报告](https://zhuanlan.zhihu.com/p/31430602) - -[深度学习(花书)中文版](https://github.com/exacity/deeplearningbook-chinese) - -**[深度学习最值得看的论文](http://www.dlworld.cn/YeJieDongTai/4385.html)** - -**[最全面的深度学习自学资源集锦](http://dataunion.org/29975.html)** - -**[Machine learning surveys](https://github.com/metrofun/machine-learning-surveys/)** - -**[快速入门TensorFlow](https://github.com/aymericdamien/TensorFlow-Examples)** - -[自然语言处理数据集](http://abunchofdata.com/datasets-for-natural-language-processing/) -  -[Learning Machine Learning? 
Six articles you don’t want to miss](http://www.ibmbigdatahub.com/blog/learning-machine-learning-six-articles-you-don-t-want-miss) - -[Getting started with machine learning documented by github](https://github.com/collections/machine-learning) - -- - - - - -## 研究领域资源细分 - -- ### [深度学习 Deep learning](https://github.com/ChristosChristofidis/awesome-deep-learning) - -- ### [强化学习 Reinforcement learning](https://github.com/aikorea/awesome-rl) - -- ### [迁移学习 Transfer learning](https://github.com/jindongwang/transferlearning) - -- ### [分布式学习系统 Distributed learning system](https://github.com/theanalyst/awesome-distributed-systems) - -- ### [计算机视觉/机器视觉 Computer vision / machine vision](https://github.com/jbhuang0604/awesome-computer-vision) - -- ### [自然语言处理 Natural language procesing](https://github.com/Nativeatom/NaturalLanguageProcessing) - -- ### [生物信息学 Bioinfomatics](https://github.com/danielecook/Awesome-Bioinformatics) - -- ### [行为识别 Activity recognition](https://github.com/jindongwang/activityrecognition) - -- ### [多智能体 Multi-Agent](http://ddl.escience.cn/f/ILKI) - -- - - - -## 开始学习:预备知识 Prerequisite - -- [学习知识与路线图](https://metacademy.org/) - -- [MIT线性代数课堂笔记(中文)](https://github.com/zlotus/notes-linear-algebra) - -- [概率与统计 The Probability and Statistics Cookbook](http://statistics.zone/) - -- Python - - - [Learn X in Y minutes](https://learnxinyminutes.com/docs/python/) - - - [Python机器学习互动教程](https://www.springboard.com/learning-paths/machine-learning-python/) - -- Markdown - - - [Mastering Markdown](https://guides.github.com/features/mastering-markdown/) - Markdown is a easy-to-use writing tool on the GitHu. 
- -- R - - - [R Tutorial](http://www.cyclismo.org/tutorial/R/) - -- Python和Matlab的一些cheat sheet:http://ddl.escience.cn/f/IDkq 包含: - - - Numpy、Scipy、Pandas科学计算库 - - - Matlab科学计算 - - - Matplotlib画图 - -- 深度学习框架 - - - Python - - [TensorFlow](https://www.tensorflow.org/) - - [Scikit-learn](http://scikit-learn.org/) - - [PyTorch](http://pytorch.org/) - - [Keras](https://keras.io/) - - [MXNet](http://mxnet.io/)|[相关资源大列表](https://github.com/chinakook/Awesome-MXNet) - - [Caffe](http://caffe.berkeleyvision.org/) - - [Caffe2](https://caffe2.ai/) - - - Java - - [Deeplearning4j](https://deeplearning4j.org/) - - - Matlab - - [Neural Network Toolbox](https://cn.mathworks.com/help/nnet/index.html) - - [Deep Learning Toolbox](https://cn.mathworks.com/matlabcentral/fileexchange/38310-deep-learning-toolbox) - -- - - - - -## 文档 notes - -- [综述文章汇总](https://github.com/allmachinelearning/MachineLearning/blob/master/notes/survey_readme.md) - -- [近200篇机器学习资料汇总!](https://zhuanlan.zhihu.com/p/26136757) - -- [机器学习入门资料](https://github.com/allmachinelearning/MachineLearning/blob/master/notes/MLMaterials.md) - -- [MIT.Introduction to Machine Learning](http://ddl.escience.cn/f/Iwtu) - -- [东京大学同学做的人机交互报告](https://github.com/allmachinelearning/MachineLearning/blob/master/notes/FieldResearchinChina927-104.pdf) - -- [人机交互简介](https://github.com/jindongwang/HCI) - -- [人机交互与创业论坛](https://github.com/allmachinelearning/MachineLearning/blob/master/notes/%E4%BA%BA%E6%9C%BA%E4%BA%A4%E4%BA%92%E4%B8%8E%E5%88%9B%E4%B8%9A%E8%AE%BA%E5%9D%9B.md) - -- [职场机器学习入门](https://github.com/allmachinelearning/MachineLearning/blob/master/notes/%E8%81%8C%E5%9C%BA-%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E5%85%A5%E9%97%A8.md) - -- [机器学习的发展历程及启示](http://mt.sohu.com/20170326/n484898474.shtml), (@Prof. 
Zhihua Zhang/@张志华教授) - -- [常用的距离和相似度度量](https://github.com/allmachinelearning/MachineLearning/blob/master/notes/distance%20and%20similarity.md) - -- - - - - -## 课程与讲座 Course and talk - -### 机器学习 Machine Learning -  -[台湾大学应用深度学习课程](https://www.csie.ntu.edu.tw/~yvchen/f106-adl/index.html) - -- [神经网络,机器学习,算法,人工智能等 30 门免费课程详细清单](http://www.datasciencecentral.com/profiles/blogs/neural-networks-for-machine-learning) -  -- [斯坦福机器学习入门课程](https://www.coursera.org/learn/machine-learning),讲师为Andrew Ng,适合数学基础一般的人,适合入门,但是学完会发现只是懂个大概,也就相当于什么都不懂。省略了很多机器学习的细节 - -- [Neural Networks for Machine Learning](https://www.coursera.org/learn/neural-networks), Coursera上的著名课程,由Geoffrey Hinton教授主讲。 - -- [Stanford CS 229](http://cs229.stanford.edu/materials.html), Andrew Ng机器学习课无阉割版,Notes比较详细,可以对照学习[CS229课程讲义的中文翻译](https://github.com/Kivy-CN/Stanford-CS-229-CN)。 - -- [CMU 10-702 Statistical Machine Learning](http://www.stat.cmu.edu/~larry/=sml/), 讲师是Larry Wasserman,应该是统计系开的机器学习,非常数学化,第一节课就提到了RKHS(Reproducing Kernel Hilbert Space),建议数学出身的同学看或者是学过实变函数泛函分析的人看一看 - -- [CMU 10-715 Advanced Introduction to Machine Learning](https://www.cs.cmu.edu/~epxing/Class/10715/),同样是CMU phd级别的课,节奏快难度高 - -- [机器学习基石](https://www.coursera.org/course/ntumlone)(适合入门)。国立台湾大学[林轩田](https://www.coursera.org/instructor/htlin) - -- [机器学习技法](https://www.coursera.org/course/ntumltwo)(适合提高)。国立台湾大学[林轩田](https://www.coursera.org/instructor/htlin) - -- [Machine Learning for Data Analysis](https://www.coursera.org/learn/machine-learning-data-analysis), Coursera上Wesleyan大学的Data Analysis and Interpretation专项课程第四课。 - -- Max Planck Institute for Intelligent Systems Tübingen[德国马普所智能系统研究所2013的机器学习暑期学校视频](https://www.youtube.com/playlist?list=PLqJm7Rc5-EXFv6RXaPZzzlzo93Hl0v91E),仔细翻这个频道还可以找到2015的暑期学校视频 - -- 知乎Live:[我们一起开始机器学习吧](https://www.zhihu.com/lives/792423196996546560),[机器学习入门之特征工程](https://www.zhihu.com/lives/819543866939174912) - -### 深度学习 Machine Learning - -- 斯坦福大学Feifei Li教授的[CS231n系列深度学习课程](http://cs231n.stanford.edu/)。Feifei 
Li目前是Google的科学家,深度学习与图像识别方面的大牛。这门课的笔记可以看[这里](https://zhuanlan.zhihu.com/p/21930884)。 - -- [CS224n: Natural Language Processing](http://cs224n.stanford.edu). Course instructors: Chris Manning, Richard Socher. - -### 强化学习 Machine Learning - -- [CS 294 Deep Reinforcement Learning, Fall 2017](http://rll.berkeley.edu/deeprlcourse/). Course instructors: Sergey Levine, John Schulman, Chelsea Finn. - -- [UCL Course on RL](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html) - -- [CS234: Reinforcement Learning](http://web.stanford.edu/class/cs234/index.html). 暂无视频 - -- - - - - -## 相关书籍 reference book - -- [Hands on Machine Learning with Scikit-learn and Tensorflow](https://my.pcloud.com/publink/show?code=XZ9ev77Zk2l6xcMtfIhHm7mRKAYhISb6sl3k) - -- 入门读物 [The Elements of Statistical Learning(英文第二版),The Elements of Statistical Learning.pdf](http://ddl.escience.cn/ff/emZH) - -- [机器学习](https://book.douban.com/subject/26708119/), (@Prof. Zhihua Zhou/周志华教授) - -- [统计学习方法](https://book.douban.com/subject/10590856/), (@Dr. Hang Li/李航博士) - -- [一些Kindle读物](http://ddl.escience.cn/f/IwWE): - - - 利用Python进行数据分析 - - - 跟老齐学Python:从入门到精通 - - - Python与数据挖掘 (大数据技术丛书) - 张良均 - - - Python学习手册 - - - Python性能分析与优化 - - - Python数据挖掘入门与实践 - - - Python数据分析与挖掘实战(大数据技术丛书) - 张良均 - - - Python科学计算(第2版) - - - Python计算机视觉编程 [美] Jan Erik Solem - - - python核心编程(第三版) - - - Python核心编程(第二版) - - - Python高手之路 - [法] 朱利安·丹乔(Julien Danjou) - - - Python编程快速上手 让繁琐工作自动化 - - - Python编程:从入门到实践 - - - Python3 CookBook中文版 - - - 终极算法机器学习和人工智能如何重塑世界 - [美 ]佩德罗·多明戈斯 - - - 机器学习系统设计 (图灵程序设计丛书) - [美]Willi Richert & Luis Pedro Coelho - - - 机器学习实践指南:案例应用解析(第2版) (大数据技术丛书) - 麦好 - - - 机器学习实践 测试驱动的开发方法 (图灵程序设计丛书) - [美] 柯克(Matthew Kirk) - - - 机器学习:实用案例解析 - - -- [数学](https://mega.nz/#F!WVAlGL6B!mqIjYoTjiQnO4jBGVLRIWA -): - - - Algebra - Michael Artin - - - Algebra - Serge Lang - - - Basic Topology - M.A. 
Armstrong - - - Convex Optimization by Stephen Boyd & Lieven Vandenberghe - - - Functional Analysis by Walter Rudin - - - Functional Analysis, Sobolev Spaces and Partial Differential Equations by Haim Brezis - - - Graph Theory - J.A. Bondy, U.S.R. Murty - - - Graph Theory - Reinhard Diestel - - - Inside Interesting Integrals - Paul J. Nahin - - - Linear Algebra and Its Applications - Gilbert Strang - - - Linear and Nonlinear Functional Analysis with Applications - Philippe G. Ciarlet - - - Mathematical Analysis I - Vladimir A. Zorich - - - Mathematical Analysis II - Vladimir A. Zorich - - - Mathematics for Computer Science - Eric Lehman, F Thomson Leighton, Albert R Meyer - - - Matrix Cookbook, The - Kaare Brandt Petersen, Michael Syskind Pedersen - - - Measures, Integrals and Martingales - René L. Schilling - - - Principles of Mathematical Analysis - Walter Rudin - - - Probabilistic Graphical Models: Principles and Techniques - Daphne Koller, Nir Friedman - - - Probability: Theory and Examples - Rick Durrett - - - Real and Complex Analysis - Walter Rudin - - - Thomas' Calculus - George B. 
Thomas - - - 普林斯顿微积分读本 - Adrian Banner - - -- [Packt每日限免电子书精选](http://ddl.escience.cn/f/IS4a): - - - Learning Data Mining with Python - - - Matplotlib for python developers - - - Machine Learing with Spark - - - Mastering R for Quantitative Finance - - - Mastering matplotlib - - - Neural Network Programming with Java - - - Python Machine Learning - - - R Data Visualization Cookbook - - - R Deep Learning Essentials - - - R Graphs Cookbook second edition - - - D3.js By Example - - - Data Analysis With R - - - Java Deep Learning Essentials - - - Learning Bayesian Models with R - - - Learning Pandas - - - Python Parallel Programming Cookbook - - - Machine Learning with R - ---- - - -## 其他 Miscellaneous - -- [机器学习日报](http://forum.ai100.com.cn/):每天更新学术和工业界最新的研究成果 - -- [机器之心](https://www.jiqizhixin.com/) - -- [集智社区](https://jizhi.im/index) - -- - - - - -## 如何加入 How to contribute - -如果你对本项目感兴趣,非常欢迎你加入! - -- 正常参与:请直接fork、pull都可以 -- 如果要上传文件:请**不要**直接上传到项目中,否则会造成git版本库过大。正确的方法是上传它的**超链接**。如果你要上传的文件本身就在网络中(如paper都会有链接),直接上传即可;如果是自己想分享的一些文件、数据等,鉴于国内网盘的情况,请按照如下方式上传: - - (墙内)目前没有找到比较好的方式,只能通过链接,或者自己网盘的链接来做。 - - (墙外)首先在[UPLOAD](https://my.pcloud.com/#page=puplink&code=4e9Z0Vwpmfzvx0y2OqTTTMzkrRUz8q9V)直接上传(**不**需要注册账号);上传成功后,在[DOWNLOAD](https://my.pcloud.com/publink/show?code=kZWtboZbDDVguCHGV49QkmlLliNPJRMHrFX)里找到你刚上传的文件,共享链接即可。 - - - -## 如何开始项目协同合作 - -[快速了解github协同工作](http://hucaihua.cn/2016/12/02/github_cooperation/) - -[及时更新fork项目](https://jinlong.github.io/2015/10/12/syncing-a-fork/) - - -#### [贡献者 Contributors](https://github.com/allmachinelearning/MachineLearning/blob/master/contributors.md) - - - - - +# 机器学习资源 Machine learning + +**致力于分享最新最全面的机器学习资料,欢迎你成为贡献者!** + + +- - - + +## 预备知识 Prerequisite + +- Python + - [Learn X in Y minutes](https://learnxinyminutes.com/docs/python/) + - [Python机器学习互动教程](https://www.springboard.com/learning-paths/machine-learning-python/) + +- Markdown + - [Mastering Markdown](https://guides.github.com/features/mastering-markdown/) - Markdown is a 
easy-to-use writing tool on the GitHu. + +- R + - [R Tutorial](http://www.cyclismo.org/tutorial/R/) + +- Python和Matlab的一些cheat sheet:http://ddl.escience.cn/f/IDkq 包含: + - Numpy、Scipy、Pandas科学计算库 + - Scikit-learn机器学习库、Keras深度学习库 + - Matlab科学计算 + - Matplotlib画图 + +- - - + + +## 理论 Theory + +- ### 深度学习 Deep learning + +- ### [强化学习 Reinforcement learning](https://github.com/allmachinelearning/ReinforcementLearning) + +- ### [迁移学习 Transfer learning](https://jindongwang.github.io/transferlearning/) + +- ### [分布式学习系统 Distributed learning system](https://github.com/allmachinelearning/Deep-Learning-System-Design) + + +- - - + + +## 应用 Applications + +- ### 计算机视觉/机器视觉 Computer vision / machine vision + +- ### [自然语言处理 Natural language procesing](https://github.com/allmachinelearning/NaturalLanguageProcessing) + +- ### 语音识别 Speech recognition + +- ### 生物信息学 Bioinfomatics + +- ### 医疗 Medical + +- ### [行为识别 Activity recognition](https://github.com/jindongwang/activityrecognition) + +- ### [人工智能(多智能体) Artificial Intelligence(Multi-Agent)](http://ddl.escience.cn/f/ILKI) + + +- - - + +## 文档 notes + +- [综述文章汇总](https://github.com/jindongwang/MachineLearning/tree/master/papers/survey) + +- [近200篇机器学习资料汇总!](https://zhuanlan.zhihu.com/p/26136757) + +- [机器学习入门资料](https://github.com/allmachinelearning/MachineLearning/blob/master/notes/MLMaterials.md) + +- [MIT.Introduction to Machine Learning](http://ddl.escience.cn/f/Iwtu) + +- [东京大学同学做的人机交互报告](https://github.com/allmachinelearning/MachineLearning/blob/master/notes/FieldResearchinChina927-104.pdf) + +- [人机交互简介](https://github.com/jindongwang/HCI) + +- [人机交互与创业论坛](https://github.com/allmachinelearning/MachineLearning/blob/master/notes/%E4%BA%BA%E6%9C%BA%E4%BA%A4%E4%BA%92%E4%B8%8E%E5%88%9B%E4%B8%9A%E8%AE%BA%E5%9D%9B.md) + +- [职场机器学习入门](https://github.com/allmachinelearning/MachineLearning/blob/master/notes/%E8%81%8C%E5%9C%BA-%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E5%85%A5%E9%97%A8.md) + +- 
[机器学习的发展历程及启示](http://mt.sohu.com/20170326/n484898474.shtml), (@Prof. Zhihua Zhang/@张志华教授) + + +- - - + +## 课程与讲座 Course and talk + +- [斯坦福机器学习入门课程](https://www.coursera.org/learn/machine-learning),讲师为Andrew Ng,适合数学基础一般的人,适合入门,但是学完会发现只是懂个大概,也就相当于什么都不懂。省略了很多机器学习的细节 +- [Stanford CS 229](http://cs229.stanford.edu/materials.html), Andrew Ng机器学习课无阉割版,Notes比较详细 +- [CMU 10-702 Statistical Machine Learning](http://www.stat.cmu.edu/~larry/=sml/), 讲师是Larry Wasserman,应该是统计系开的机器学习,非常数学化,第一节课就提到了RKHS(Reproducing Kernel Hilbert Space),建议数学出身的同学看或者是学过实变函数泛函分析的人看一看 +- [CMU 10-715 Advanced Introduction to Machine Learning](https://www.cs.cmu.edu/~epxing/Class/10715/),同样是CMU phd级别的课,节奏快难度高 +- Coursera上国立台湾大学[林轩田](https://www.coursera.org/instructor/htlin)开的两门课:[机器学习基石](https://www.coursera.org/course/ntumlone)(适合入门),[机器学习技法](https://www.coursera.org/course/ntumltwo)(适合提高)。 +- [Machine Learning for Data Analysis](https://www.coursera.org/learn/machine-learning-data-analysis), Coursera上Wesleyan大学的Data Analysis and Interpretation专项课程第四课。 +- [Neural Networks for Machine Learning](https://www.coursera.org/learn/neural-networks), Coursera上的著名课程,由Geoffrey Hinton教授主讲。 +- 斯坦福大学Feifei Li教授的[CS231n系列深度学习课程](http://cs231n.stanford.edu/)。Feifei Li目前是Google的科学家,深度学习与图像识别方面的大牛。这门课的笔记可以看[这里](https://zhuanlan.zhihu.com/p/21930884)。 +- Max Planck Institute for Intelligent Systems Tübingen[德国马普所智能系统研究所2013的机器学习暑期学校视频](https://www.youtube.com/playlist?list=PLqJm7Rc5-EXFv6RXaPZzzlzo93Hl0v91E),仔细翻这个频道还可以找到2015的暑期学校视频 +- 知乎Live:[我们一起开始机器学习吧](https://www.zhihu.com/lives/792423196996546560),[机器学习入门之特征工程](https://www.zhihu.com/lives/819543866939174912) + +- - - + + + + + +## 相关书籍 reference book + + + +- 入门读物 [The Elements of Statistical Learning(英文第二版),The Elements of Statistical Learning.pdf](http://ddl.escience.cn/ff/emZH) + +- [机器学习](https://book.douban.com/subject/26708119/), (@Prof. Zhihua Zhou/周志华教授) + +- [统计学习方法](https://book.douban.com/subject/10590856/), (@Dr. 
Hang Li/李航博士) + +- [一些Kindle读物](http://ddl.escience.cn/f/IwWE): + + - 利用Python进行数据分析.azw3 + - 跟老齐学Python:从入门到精通.azw3 + - Python与数据挖掘 (大数据技术丛书) - 张良均.azw3 + - Python学习手册.azw3 + - Python性能分析与优化.mobi + - Python数据挖掘入门与实践_7242.azw3 + - Python数据分析与挖掘实战(大数据技术丛书) - 张良均.azw3 + - Python科学计算(第2版).azw3 + - Python计算机视觉编程 [美] Jan Erik Solem.azw3 + - python核心编程(第三版).azw3 + - Python核心编程(第二版).azw3 + - Python高手之路 - [法] 朱利安·丹乔(Julien Danjou).azw3 + - Python编程快速上手 让繁琐工作自动化.azw3 + - Python编程:从入门到实践.azw3 + - Python3 CookBook中文版.mobi + - 终极算法机器学习和人工智能如何重塑世界 - [美 ]佩德罗·多明戈斯.azw3.azw3 + - 机器学习系统设计 (图灵程序设计丛书) - [美]Willi Richert & Luis Pedro Coelho.azw3.azw3 + - 机器学习实践指南:案例应用解析(第2版) (大数据技术丛书) - 麦好.azw3 + - 机器学习实践 测试驱动的开发方法 (图灵程序设计丛书) - [美] 柯克(Matthew Kirk).a.azw3 + - 机器学习:实用案例解析 (O'Reilly精品图书系 + + +--- + +## 其他 Miscellaneous + +- [机器学习日报](http://forum.ai100.com.cn/):每天更新学术和工业界最新的研究成果 + +- - - + +## 如何加入 How to contribute + +- 直接pull requests +- 或者到[这里](https://github.com/allmachinelearning/MachineLearning/issues/1)留下你的Github账号我们把你加入贡献者列表 +- PDF等大文件上传方法:登录 http://ddl.escience.cn 用户名:allmachinelearning@163.com,密码:machine123。登录后,在‘个人空间’中上传,然后将文件(夹)链接共享。 +- 之后请在贡献者页面加入自己的信息 + +## 如何开始项目协同合作 +[快速了解github协同工作](http://hucaihua.cn/2016/12/02/github_cooperation/) + +[及时更新fork项目](https://jinlong.github.io/2015/10/12/syncing-a-fork/) + +#### [贡献者 Contributors](https://github.com/allmachinelearning/MachineLearning/blob/master/contributors.md) + diff --git a/contributors.md b/contributors.md index 280fd6b..835d2f5 100644 --- a/contributors.md +++ b/contributors.md @@ -6,9 +6,6 @@ | [Xiandong QI](https://xiandong79.github.io) | 香港科技大学 | | [Youjie Xia](https://youjiexia.github.io) | 上海交通大学 | | [Jiapeng Zhang](https://www.zhihu.com/people/jiapengzhang) | 三本大学生 | -| [Zhigang He](https://github.com/Hochikong) | 暨南大学 | -| [Wenhan Wu](https://github.com/wwh2259253) | 三本菜鸡 | -| [Nativeatom](https://github.com/Nativeatom)| 南京大学 | @@ -30,7 +27,6 @@ - - + diff --git a/notes/MLMaterials.md b/notes/MLMaterials.md 
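index 77b20d1..1f00176 100644 --- a/notes/MLMaterials.md +++ b/notes/MLMaterials.md @@ -24,7 +24,7 @@ #### 1.3. Tools * Third-party libraries There are many open-source machine learning libraries ready to use, and GitHub is a good place to get the code. The better-known ones include: -    * [libsvm](https://github.com/cjlin1/libsvm), by Chih-Jen Lin (林智仁); the standard SVM library. + * [libsvm](https://github.com/cjlin1/libsvm), by Chih-Jen Lin (林智仁); the standard SVM library. * [scikit-learn](http://scikit-learn.org), a well-known Python package for data processing that ships with almost all popular machine learning algorithms; combined with Python's concise syntax, it is very convenient to use. * [pandas](http://www.cnblogs.com/chaosimple/p/4153083.html), a Python package that is particularly good at handling tabular data; I have only tried it briefly. * [pylearn2](https://github.com/lisa-lab/pylearn2), which I have not used myself, but it ranks high on GitHub and should be good. diff --git a/notes/distance and similarity.md b/notes/distance and similarity.md deleted file mode 100644 index 84d94ab..0000000 --- a/notes/distance and similarity.md +++ /dev/null @@ -1,167 +0,0 @@ -# Distance Metrics - -- - - - -[toc] - -## Common distance and similarity measures - -- - - - -### Euclidean distance - -Defined on two vectors (two points): the Euclidean distance between points $\mathbf{x}$ and $\mathbf{y}$ is: - -$$ -d_{Euclidean}=\sqrt{(\mathbf{x}-\mathbf{y})^\top (\mathbf{x}-\mathbf{y})} -$$ - -### Minkowski distance - -The order-$p$ distance between two vectors (points): - -$$ -d_{Minkowski}=\left(\sum_{i}|x_i-y_i|^p\right)^{1/p} -$$ - -With $p=1$ it is the Manhattan distance; with $p=2$ it is the Euclidean distance. - -### Mahalanobis distance - -Defined on two vectors (two points) drawn from the same distribution. The Mahalanobis distance between points $\mathbf{x}$ and $\mathbf{y}$ is: - -$$ -d_{Mahalanobis}=\sqrt{(\mathbf{x}-\mathbf{y})^\top \Sigma^{-1} (\mathbf{x}-\mathbf{y})} -$$ - -where $\Sigma$ is the covariance matrix of that distribution. - -When $\Sigma=\mathbf{I}$, the Mahalanobis distance reduces to the Euclidean distance. - -### Mutual information - -Defined on two random variables $X,Y$ with $x \in X,y \in Y$. Their mutual information is: - -$$ -I(X;Y)=\sum_{x \in X} \sum_{y \in Y} p(x,y) \log \frac{p(x,y)}{p(x)p(y)} -$$ - -### Cosine similarity - -Measures how closely two vectors are aligned (the cosine of the angle between them). The cosine similarity of vectors $\mathbf{x},\mathbf{y}$ is: - -$$ -\cos (\mathbf{x},\mathbf{y}) = \frac{\mathbf{x} \cdot \mathbf{y}}{|\mathbf{x}|\cdot |\mathbf{y}|} -$$ - -Interpretation: the inner product of the vectors divided by the product of their norms. - -### Pearson correlation coefficient - -Measures the correlation between two random variables. The Pearson correlation coefficient of random variables $X,Y$ is: - -$$ -\rho_{X,Y}=\frac{Cov(X,Y)}{\sigma_X \sigma_Y} -$$ - -Interpretation: the covariance divided by the product of the standard deviations. - -Range: [-1,1]; the larger the absolute value, the stronger the (positive/negative) correlation. - -### Jaccard coefficient - -For two sets $X,Y$, measures their similarity using set operations: - -$$ -J=\frac{|X \cap Y|}{|X \cup Y|} -$$ - -Interpretation: the size of the intersection of the two sets divided by the size of their union. - 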
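-Extension: Jaccard distance = $1-J$. - -- - - - -## Distance measures between probability distributions - -- - - - -### KL divergence - -Kullback–Leibler divergence, also called relative entropy, measures the distance between two probability distributions $P(x),Q(x)$: - -$$ -D_{KL}(P||Q)=\sum_{x} P(x) \log \frac{P(x)}{Q(x)} -$$ - -It is asymmetric: $D_{KL}(P||Q) \ne D_{KL}(Q||P)$. - -### JS divergence - -Jensen–Shannon divergence is built on the KL divergence and is a symmetric measure: - -$$ -JSD(P||Q)= \frac{1}{2} D_{KL}(P||M) + \frac{1}{2} D_{KL}(Q||M) -$$ - -where $M=\frac{1}{2}(P+Q)$. - -### MMD - -Maximum mean discrepancy measures the distance between two distributions in a reproducing kernel Hilbert space; it is a kernel method. The distance between two samples is: - -$$ -MMD(X,Y)=\left \Vert \frac{1}{n_1}\sum_{i=1}^{n_1}\phi(\mathbf{x}_i)- \frac{1}{n_2}\sum_{j=1}^{n_2}\phi(\mathbf{y}_j) \right \Vert^2_\mathcal{H} -$$ - -where $\phi(\cdot)$ is the feature map that sends the original variables into a high-dimensional space. - -Interpretation: the squared distance between the means of the two sets of data in the high-dimensional space. - -### Principal angle - -This also maps the two distributions into a high-dimensional space (the Grassmann manifold), where each set of data can be viewed as a single point. The principal angle measure sums the sines of the angles between corresponding dimensions of the two sets. For two matrices $\mathbf{X},\mathbf{Y}$, first orthogonalize both matrices, then compute: - -$$ -PA(\mathbf{X},\mathbf{Y})=\sum_{i=1}^{\min(m,n)} \sin \theta_i -$$ - -where $m,n$ are the dimensions of the two matrices, $\theta_i$ is the angle between the $i$-th dimensions of the two matrices, and $\Theta=\{\theta_1,\theta_2,\cdots,\theta_t\}$, with $t=\min(m,n)$, are the angles obtained from the SVD: - -$$ -\mathbf{X}^\top\mathbf{Y}=\mathbf{U} (\cos \Theta) \mathbf{V}^\top -$$ - -### HSIC - -The Hilbert-Schmidt Independence Criterion is used to test whether two sets of data are independent: - -$$ -HSIC(X,Y) = trace(HXHY) -$$ - -where $X,Y$ are the kernel matrices of the two sets of data and $H=\mathbf{I}-\frac{1}{n}\mathbf{1}\mathbf{1}^\top$ is the centering matrix. - -### Earth Mover’s Distance - -Earth mover's distance measures the distance between two distributions and is also called the Wasserstein distance. From the optimal transport viewpoint, it is the minimum cost needed to transform distribution $X$ into distribution $Y$: - -It is a flow problem on a bipartite graph: the minimum cost is that of a minimum-cost flow, which can be solved with the Hungarian algorithm. - -$$ -emd(X,Y)=\min{\frac{\sum_{i,j}f_{ij}d(\textbf{x}_i,\textbf{y}_j)}{\sum_{j}w_{yj}}} -$$ - -The constraints are - -$$ -s.t. \sum_{i}f_{ij}=w_{yj}, \sum_{j}f_{ij}=w_{xi}. 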
-$$ - -#### References - -[1] http://blog.csdn.net/pipisorry/article/details/45651315 - -[2] http://chaofan.io/archives/earth-movers-distance-%E6%8E%A8%E5%9C%9F%E6%9C%BA%E8%B7%9D%E7%A6%BB - - diff --git a/notes/survey_readme.md b/notes/survey_readme.md deleted file mode 100644 index bf17e93..0000000 --- a/notes/survey_readme.md +++ /dev/null @@ -1,20 +0,0 @@ -### 综述文章汇总 - -这里汇总了一些我下载过或看过的综述survey文章。 - -- 迁移学习的综述请看这里:[迁移学习综述](https://github.com/jindongwang/transferlearning#2迁移学习的综述文章) -- [多标签学习A Review on Multi-Label Learning Algorithms_Zhang_Zhou_2014.pdf](https://mega.nz/#!ZWhijBoS!Ql7HCV_fX3e2SyrjOgmjDGtGgawGLBIuGPXvdPqXAOw) -- [多视角学习A survey of multi-view machine learning_Sun_2013.pdf](https://mega.nz/#!IeB10Q5C!xH4iCXl0BAt3EAwx9Qbtu_AUTktGNaOvnAhe1K42FcE) -- [随机森林用于数据挖掘Mining data with random forests - A survey and results of new tests](https://mega.nz/#!MCQnhazB!Iqx8gKpLNseoRo-qvkXj4G5LiZKNeFW9DNMjOhAus0M) -- [多任务学习Multi-task learning survey](https://mega.nz/#!BLg0QCaD!Zwybj-5UW_8x4wDWsYvv6A0825co1lJ3CSCj2-jM1go) -- [半监督学习Semi-supervised learning literature survey_Zhu_2005.pdf](https://mega.nz/#!gKYVFTrI!sLkVspn3uVwVHWVhv3XUObmFBIVRdlhbHuqQXzuht_4) -- [稀疏子空间聚类Sparse Subspace Clustering_Elhamifar_Vidal_2013.pdf](https://mega.nz/#!0eAC2ajD!xWZhO9Pvh7qJwpHKkyYLnqKbLye9coSX0fd6WuyiIs4) -- [聚类算法Survey of Clustering Algorithms_Xu_WunschII_2005.pdf](https://mega.nz/#!dKJAjAqJ!BwiVi3KGDaGXIWGlIiOo9cenHcTmtRyAxNW6WgKFQgE) -- [特征选择综述] ---------1.A Survey on semi-supervised feature selection methods ---------2.Feature selection in machine learning A new perspective ---------3.Feature Selection: A Data Perspective -- [深度学习各种综述](https://mega.nz/#F!NaxA0ADS!QIxYDA6A760jfPbFbElCYA) - - 最著名的综述:深度学习三巨头在2015年Nature上的[Deep learning](https://mega.nz/#!ZL4VFTiK!hcpVDDd9MtsFlBZHp-KaETk0bOAdcBaq_ioci75NrK8) - - 比较完整的综述:[Deep learning in neural networks - an overview](https://mega.nz/#!lLJj3Rwb!t6yO7hDDYZHYj1UDFX17gAjQkZ77mmXhQKa0aayDhJg) - - 此外,还有一些中英文的综述,都在上面目录里。 diff --git 
a/papers/survey/A Review on Multi-Label Learning Algorithms_Zhang_Zhou_2014.pdf b/papers/survey/A Review on Multi-Label Learning Algorithms_Zhang_Zhou_2014.pdf new file mode 100644 index 0000000..097be6e Binary files /dev/null and b/papers/survey/A Review on Multi-Label Learning Algorithms_Zhang_Zhou_2014.pdf differ diff --git a/papers/survey/A survey of multi-view machine learning_Sun_2013.pdf b/papers/survey/A survey of multi-view machine learning_Sun_2013.pdf new file mode 100644 index 0000000..c6eb746 Binary files /dev/null and b/papers/survey/A survey of multi-view machine learning_Sun_2013.pdf differ diff --git a/papers/survey/Mining data with random forests - A survey and results of new tests_Verikas et al_2011.pdf b/papers/survey/Mining data with random forests - A survey and results of new tests_Verikas et al_2011.pdf new file mode 100644 index 0000000..6dd8c3a Binary files /dev/null and b/papers/survey/Mining data with random forests - A survey and results of new tests_Verikas et al_2011.pdf differ diff --git a/papers/survey/Multi-task learning survey_.pdf b/papers/survey/Multi-task learning survey_.pdf new file mode 100644 index 0000000..c50b011 Binary files /dev/null and b/papers/survey/Multi-task learning survey_.pdf differ diff --git a/papers/survey/Semi-supervised learning literature survey_Zhu_2005.pdf b/papers/survey/Semi-supervised learning literature survey_Zhu_2005.pdf new file mode 100644 index 0000000..d894114 Binary files /dev/null and b/papers/survey/Semi-supervised learning literature survey_Zhu_2005.pdf differ diff --git a/papers/survey/Sparse Subspace Clustering_Elhamifar_Vidal_2013.pdf b/papers/survey/Sparse Subspace Clustering_Elhamifar_Vidal_2013.pdf new file mode 100644 index 0000000..378c63a Binary files /dev/null and b/papers/survey/Sparse Subspace Clustering_Elhamifar_Vidal_2013.pdf differ diff --git a/papers/survey/Survey of Clustering Algorithms_Xu_WunschII_2005.pdf b/papers/survey/Survey of Clustering 
Algorithms_Xu_WunschII_2005.pdf new file mode 100644 index 0000000..ff61ecb Binary files /dev/null and b/papers/survey/Survey of Clustering Algorithms_Xu_WunschII_2005.pdf differ diff --git a/papers/survey/deep learning/A tutorial survey of architectures, algorithms, and applications for deep_Deng_2014.pdf b/papers/survey/deep learning/A tutorial survey of architectures, algorithms, and applications for deep_Deng_2014.pdf new file mode 100644 index 0000000..6c07048 Binary files /dev/null and b/papers/survey/deep learning/A tutorial survey of architectures, algorithms, and applications for deep_Deng_2014.pdf differ diff --git a/papers/survey/deep learning/Deep Learning Neural Networks on Mobile Platforms_Plieninger et al_2016.pdf b/papers/survey/deep learning/Deep Learning Neural Networks on Mobile Platforms_Plieninger et al_2016.pdf new file mode 100644 index 0000000..651fb79 Binary files /dev/null and b/papers/survey/deep learning/Deep Learning Neural Networks on Mobile Platforms_Plieninger et al_2016.pdf differ diff --git a/papers/survey/deep learning/Deep learning in neural networks - an overview. AMiner_.pdf b/papers/survey/deep learning/Deep learning in neural networks - an overview. AMiner_.pdf new file mode 100644 index 0000000..6246864 Binary files /dev/null and b/papers/survey/deep learning/Deep learning in neural networks - an overview. 
AMiner_.pdf differ diff --git a/papers/survey/deep learning/LeCun et al_Deep learning_2015.pdf b/papers/survey/deep learning/LeCun et al_Deep learning_2015.pdf new file mode 100644 index 0000000..04144b6 Binary files /dev/null and b/papers/survey/deep learning/LeCun et al_Deep learning_2015.pdf differ diff --git a/papers/survey/deep learning/Overview of deep learning_Du et al_2016.pdf b/papers/survey/deep learning/Overview of deep learning_Du et al_2016.pdf new file mode 100644 index 0000000..53ad227 Binary files /dev/null and b/papers/survey/deep learning/Overview of deep learning_Du et al_2016.pdf differ diff --git "a/papers/survey/deep learning/\346\267\261\345\272\246\345\255\246\344\271\240\347\232\204\347\240\224\347\251\266\344\270\216\345\217\221\345\261\225_.pdf" "b/papers/survey/deep learning/\346\267\261\345\272\246\345\255\246\344\271\240\347\232\204\347\240\224\347\251\266\344\270\216\345\217\221\345\261\225_.pdf" new file mode 100644 index 0000000..aee4c90 Binary files /dev/null and "b/papers/survey/deep learning/\346\267\261\345\272\246\345\255\246\344\271\240\347\232\204\347\240\224\347\251\266\344\270\216\345\217\221\345\261\225_.pdf" differ diff --git "a/papers/survey/deep learning/\346\267\261\345\272\246\345\255\246\344\271\240\347\240\224\347\251\266\347\273\274\350\277\260_.pdf" "b/papers/survey/deep learning/\346\267\261\345\272\246\345\255\246\344\271\240\347\240\224\347\251\266\347\273\274\350\277\260_.pdf" new file mode 100644 index 0000000..90fcf82 Binary files /dev/null and "b/papers/survey/deep learning/\346\267\261\345\272\246\345\255\246\344\271\240\347\240\224\347\251\266\347\273\274\350\277\260_.pdf" differ diff --git "a/papers/survey/deep learning/\346\267\261\345\272\246\345\255\246\344\271\240\347\240\224\347\251\266\347\273\274\350\277\260__2.pdf" "b/papers/survey/deep learning/\346\267\261\345\272\246\345\255\246\344\271\240\347\240\224\347\251\266\347\273\274\350\277\260__2.pdf" new file mode 100644 index 0000000..0bb364f 
Binary files /dev/null and "b/papers/survey/deep learning/\346\267\261\345\272\246\345\255\246\344\271\240\347\240\224\347\251\266\347\273\274\350\277\260__2.pdf" differ diff --git a/papers/survey/readme.md b/papers/survey/readme.md new file mode 100644 index 0000000..fcfc7e6 --- /dev/null +++ b/papers/survey/readme.md @@ -0,0 +1,16 @@ +### 综述文章汇总 + +这里汇总了一些我下载过或看过的综述survey文章。 + +- 迁移学习的综述请看这里:[迁移学习综述](https://github.com/jindongwang/transferlearning#2迁移学习的综述文章) +- [多标签学习A Review on Multi-Label Learning Algorithms_Zhang_Zhou_2014.pdf](https://github.com/jindongwang/MachineLearning/blob/master/papers/survey/A%20Review%20on%20Multi-Label%20Learning%20Algorithms_Zhang_Zhou_2014.pdf) +- [多视角学习A survey of multi-view machine learning_Sun_2013.pdf](https://github.com/jindongwang/MachineLearning/blob/master/papers/survey/A%20survey%20of%20multi-view%20machine%20learning_Sun_2013.pdf) +- [随机森林用于数据挖掘Mining data with random forests - A survey and results of new tests](https://github.com/jindongwang/MachineLearning/blob/master/papers/survey/Mining%20data%20with%20random%20forests%20-%20A%20survey%20and%20results%20of%20new%20tests_Verikas%20et%20al_2011.pdf) +- [多任务学习Multi-task learning survey](https://github.com/jindongwang/MachineLearning/blob/master/papers/survey/Multi-task%20learning%20survey_.pdf) +- [半监督学习Semi-supervised learning literature survey_Zhu_2005.pdf](https://github.com/jindongwang/MachineLearning/blob/master/papers/survey/Semi-supervised%20learning%20literature%20survey_Zhu_2005.pdf) +- [稀疏子空间聚类Sparse Subspace Clustering_Elhamifar_Vidal_2013.pdf](https://github.com/jindongwang/MachineLearning/blob/master/papers/survey/Sparse%20Subspace%20Clustering_Elhamifar_Vidal_2013.pdf) +- [聚类算法Survey of Clustering Algorithms_Xu_WunschII_2005.pdf](https://github.com/jindongwang/MachineLearning/blob/master/papers/survey/Survey%20of%20Clustering%20Algorithms_Xu_WunschII_2005.pdf) +- 
[深度学习各种综述](https://github.com/jindongwang/MachineLearning/tree/master/papers/survey/deep%20learning) + - 最著名的综述:深度学习三巨头在2015年Nature上的[Deep learning](https://github.com/jindongwang/MachineLearning/blob/master/papers/survey/deep%20learning/LeCun%20et%20al_Deep%20learning_2015.pdf) + - 比较完整的综述:[Deep learning in neural networks - an overview](https://github.com/jindongwang/MachineLearning/blob/master/papers/survey/deep%20learning/Deep%20learning%20in%20neural%20networks%20-%20an%20overview.%20AMiner_.pdf) + - 此外,还有一些中英文的综述,都在上面目录里。 \ No newline at end of file