This post was last edited by 马猴烧酒 on 2018-9-7 14:34.
Machine Learning

Machine learning is a multi-disciplinary field that has risen over the past twenty-odd years, drawing on probability theory, statistics, approximation theory, convex analysis, computational complexity theory, and more. Machine learning theory is mainly about designing and analyzing algorithms that let computers "learn" automatically. A machine learning algorithm is one that automatically extracts regularities from data and uses those regularities to make predictions on unseen data. Because learning algorithms lean heavily on statistical theory, machine learning is especially closely tied to statistical inference and is also known as statistical learning theory. On the algorithm-design side, machine learning theory focuses on learning algorithms that are implementable and effective in practice.

Below is an attempt to map out the scope of machine learning from the micro level to the macro level: a concrete algorithm, finer subdivisions of the field, practical application scenarios, and relationships with neighboring fields.

Figure 1: An example of machine learning — the NLTK supervised learning workflow diagram
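The definition above — extract regularities from data, then use them to predict on unseen data — can be made concrete with a minimal sketch (plain NumPy, purely illustrative): "learning" is fitting a line to noisy points by least squares, and "prediction" is applying the fitted rule to a new input.

```python
import numpy as np

# Toy training data: y = 2x + 1 plus a little noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=50)
y = 2.0 * X + 1.0 + rng.normal(0, 0.1, size=50)

# "Learning" here is ordinary least squares: fit slope w and intercept b.
A = np.column_stack([X, np.ones_like(X)])
w, b = np.linalg.lstsq(A, y, rcond=None)[0]

# "Prediction" applies the learned rule to an input the model never saw.
x_new = 42.0
y_pred = w * x_new + b
print(round(w, 1), round(b, 1))  # close to 2.0 and 1.0
```

The same fit/predict split reappears, in fancier forms, throughout every course listed below.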
The material falls roughly into three categories: first impressions for getting started, practical notes, and guided readings from experts.
Tom Mitchell's and Andrew Ng's courses are both well suited to beginners.

2011 Tom Mitchell (CMU), Machine Learning
- Decision Trees
- Probability and Estimation
- Naive Bayes
- Logistic Regression
- Linear Regression
- Practical Issues: Feature selection, Overfitting ...
- Graphical models: Bayes networks, EM, Mixture of Gaussians clustering ...
- Computational Learning Theory: PAC Learning, Mistake bounds ...
- Semi-Supervised Learning
- Hidden Markov Models
- Neural Networks
- Learning Representations: PCA, Deep belief networks, ICA, CCA ...
- Kernel Methods and SVM
- Active Learning
- Reinforcement Learning

(The above is an excerpt of the course's lecture topics.)
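As a taste of one topic from the list above, Naive Bayes, here is a minimal 1-D Gaussian classifier sketch; the data, means, and equal-prior assumption are made up for illustration, not taken from the course.

```python
import numpy as np

# Two 1-D Gaussian classes; classify by Bayes' rule with class-conditional
# Gaussian likelihoods (the "naive" independence assumption is trivial in 1-D).
rng = np.random.default_rng(1)
x0 = rng.normal(0.0, 1.0, 100)   # samples from class 0
x1 = rng.normal(4.0, 1.0, 100)   # samples from class 1

def fit(x):
    """Estimate the class-conditional mean and standard deviation."""
    return x.mean(), x.std()

def log_lik(x, mu, sigma):
    """Log of the Gaussian density, dropping constants shared by both classes."""
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma)

m0, s0 = fit(x0)
m1, s1 = fit(x1)

def predict(x):
    # Equal priors assumed, so comparing likelihoods suffices.
    return int(log_lik(x, m1, s1) > log_lik(x, m0, s0))

print(predict(0.2), predict(3.8))  # 0 1
```

With real feature vectors the "naive" step is multiplying (or summing the logs of) per-feature likelihoods as if the features were independent.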
2014 Andrew Ng (Stanford), Machine Learning (original English-language videos). This course is designed specifically for self-study; it is free and offers a certificate of completion. "The lectures explain things in an accessible way, so you don't need to worry too much about the mathematics. The assignments are also very beginner-friendly: the program skeletons are all provided, along with assignment guides, and you just fill in the parts you are asked to complete." (See 白马's getting-started guide.) "I recommend enrolling, following the lectures, and doing the homework and the final exam — if you only watch and never do, you learn nothing." (See reyoung's advice.)
- Introduction (Week 1)
- Linear Regression with One Variable (Week 1)
- Linear Algebra Review (Week 1, Optional)
- Linear Regression with Multiple Variables (Week 2)
- Octave Tutorial (Week 2)
- Logistic Regression (Week 3)
- Regularization (Week 3)
- Neural Networks: Representation (Week 4)
- Neural Networks: Learning (Week 5)
- Advice for Applying Machine Learning (Week 6)
- Machine Learning System Design (Week 6)
- Support Vector Machines (Week 7)
- Clustering (Week 8)
- Dimensionality Reduction (Week 8)
- Anomaly Detection (Week 9)
- Recommender Systems (Week 9)
- Large Scale Machine Learning (Week 10)
- Application Example: Photo OCR
- Conclusion
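The early weeks of the syllabus above center on fitting linear regression by gradient descent. A minimal sketch of that idea, batch gradient descent on a one-variable problem with made-up data, looks like this:

```python
import numpy as np

# Batch gradient descent for one-variable linear regression,
# minimizing the mean squared error cost J(w, b).
X = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])   # exactly y = 2x + 1

w, b, alpha = 0.0, 0.0, 0.05         # alpha is the learning rate
for _ in range(5000):
    err = (w * X + b) - y            # prediction error on the whole batch
    w -= alpha * (err * X).mean()    # step along -dJ/dw
    b -= alpha * err.mean()          # step along -dJ/db

print(round(w, 2), round(b, 2))  # 2.0 1.0
```

The course's programming assignments implement essentially this loop in Octave, then generalize it to multiple variables in vectorized form.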
2013 Yaser Abu-Mostafa (Caltech), Learning from Data
- The Learning Problem
- Is Learning Feasible?
- The Linear Model I
- Error and Noise
- Training versus Testing
- Theory of Generalization
- The VC Dimension
- Bias-Variance Tradeoff
- The Linear Model II
- Neural Networks
- Overfitting
- Regularization
- Validation
- Support Vector Machines
- Kernel Methods
- Radial Basis Functions
- Three Learning Principles
- Epilogue
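Two recurring themes in the lectures above, overfitting and regularization, can be illustrated in a few lines: fit a high-degree polynomial to a handful of noisy points by plain least squares, then with an L2 (ridge) penalty. The degree, penalty strength, and data below are arbitrary choices for illustration.

```python
import numpy as np

# 10 noisy samples of a sine wave, fit with a degree-9 polynomial:
# unregularized least squares chases the noise with huge coefficients,
# while a small L2 penalty (ridge regression) keeps the weights tame.
rng = np.random.default_rng(2)
x = np.linspace(0, 1, 10)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, 10)
Phi = np.vander(x, 10)                        # degree-9 polynomial features

# Plain least squares (interpolates the noise).
w_ols = np.linalg.lstsq(Phi, y, rcond=None)[0]

# Ridge: minimize ||Phi w - y||^2 + lam * ||w||^2.
lam = 1e-3
w_ridge = np.linalg.solve(Phi.T @ Phi + lam * np.eye(10), Phi.T @ y)

# The penalty shrinks the weight vector, the signature of regularization.
print(np.linalg.norm(w_ols) > np.linalg.norm(w_ridge))  # True
```

Plotting both fits between the sample points makes the difference vivid: the unregularized curve oscillates wildly while the ridge fit stays close to the sine wave.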
2014 林軒田 (National Taiwan University), Machine Learning Foundations

When Can Machines Learn?
- The Learning Problem
- Learning to Answer Yes/No (binary classification)
- Types of Learning
- Feasibility of Learning

Why Can Machines Learn?
- Training versus Testing
- Theory of Generalization
- The VC Dimension
- Noise and Error

How Can Machines Learn?
- Linear Regression
- Linear 'Soft' Classification
- Linear Classification beyond Yes/No
- Nonlinear Transformation

How Can Machines Learn Better?
- Hazard of Overfitting
- Preventing Overfitting I: Regularization
- Preventing Overfitting II: Validation
- Three Learning Principles
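The "Validation" topic that the courses above cover can be sketched as a simple holdout procedure: fit candidate models on one split of the data, score them on the other, and keep the one with the lowest validation error. The true function and candidate polynomial degrees below are made up for illustration.

```python
import numpy as np

# Holdout validation for model selection: choose a polynomial degree
# by its error on data the fit never saw.
rng = np.random.default_rng(3)
x = rng.uniform(-1, 1, 60)
y = x ** 3 - x + rng.normal(0, 0.05, 60)     # true function is cubic

x_tr, y_tr = x[:40], y[:40]                  # training split
x_va, y_va = x[40:], y[40:]                  # validation split

def val_error(degree):
    coeffs = np.polyfit(x_tr, y_tr, degree)  # fit on the training split only
    pred = np.polyval(coeffs, x_va)          # score on the held-out split
    return ((pred - y_va) ** 2).mean()

errors = {d: val_error(d) for d in (1, 3, 9)}
best = min(errors, key=errors.get)           # degree 1 underfits badly
```

Degree 1 cannot represent the cubic and loses on the validation split; the holdout comparison rules it out automatically, which is exactly the point of validation.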
2008 Andrew Ng, CS229 Machine Learning
1. Motivation and applications of machine learning
2. Supervised learning applications; gradient descent
3. Underfitting and overfitting
4. Newton's method
5. Generative learning algorithms
6. Naive Bayes
7. Optimal margin classifiers
8. Sequential minimal optimization (SMO)
9. Empirical risk minimization
10. Feature selection
11. Bayesian statistics and regularization
12. The K-means algorithm
13. Gaussian mixture models
14. Principal component analysis
15. Singular value decomposition
16. Markov decision processes
17. Discretization and the curse of dimensionality
18. Linear quadratic regulation (LQR)
19. Differential dynamic programming
20. Policy search

2012 余凯 (Baidu) and 张潼 (Rutgers), open course on machine learning
1. Introduction to ML and review of linear algebra, probability, statistics (Kai)
2. Linear model (Tong)
3. Overfitting and regularization (Tong)
4. Linear classification (Kai)
5. Basis expansion and kernel methods (Kai)
6. Model selection and evaluation (Kai)
7. Model combination (Tong)
8. Boosting and bagging (Tong)
9. Overview of learning theory (Tong)
10. Optimization in machine learning (Tong)
11. Online learning (Tong)
12. Sparsity models (Tong)
13. Introduction to graphical models (Kai)
14. Structured learning (Kai)
15. Feature learning and deep learning (Kai)
16. Transfer learning and semi-supervised learning (Kai)
17. Matrix factorization and recommendations (Kai)
18. Learning on images (Kai)
19. Learning on the web (Tong)

Some good material that you may not understand before getting started; come back once you have made some progress and it will click.
- Machine learning focuses on learning known properties from training data in order to make predictions.
- Data mining emphasizes discovering previously unknown properties in the data.
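As an example of one algorithm named in both syllabi above, here is a bare-bones K-means (Lloyd's algorithm) sketch on synthetic 1-D data; the initial centers and cluster locations are arbitrary choices for illustration.

```python
import numpy as np

# K-means with K=2 on two well-separated 1-D clusters.
rng = np.random.default_rng(4)
X = np.concatenate([rng.normal(0, 0.5, 50), rng.normal(10, 0.5, 50)])

centers = np.array([1.0, 9.0])           # arbitrary initial guesses
for _ in range(10):
    # Assignment step: each point joins its nearest center.
    labels = np.abs(X[:, None] - centers[None, :]).argmin(axis=1)
    # Update step: each center moves to the mean of its assigned points.
    centers = np.array([X[labels == k].mean() for k in range(2)])

print(centers)  # near 0 and 10
```

Each iteration can only decrease the within-cluster squared distance, so the loop converges; with a bad initialization it can still land in a poor local optimum, which is why K-means is usually restarted several times in practice.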
- If there are up to 3 variables, it is statistics.
- If the problem is NP-complete, it is machine learning.
- If the problem is PSPACE-complete, it is AI.
- If you don't know what is PSPACE-complete, it is data mining.
A good book: Dr. 李航's Statistical Learning Methods (《统计学习方法》) was also recommended earlier; here is its Douban link.
Thanks to contributor: tang_Kaka_back (Sina Weibo)
Author: 张松阳
Link: https://www.zhihu.com/question/20691338/answer/53910077
Source: Zhihu (知乎)
Copyright belongs to the author. For commercial reprints, please contact the author for authorization; for non-commercial reprints, please credit the source.