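A roundup of deep learning tools and software, worth bookmarking
Beyond the mainstream frameworks, we have also compiled the following list of software.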
C
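General-Purpose Machine Learning
Recommender - A C library for product recommendations that uses collaborative filtering.
Computer Vision
CCV - C-based/Cached/Core Computer Vision Library, a modern computer vision library.
VLFeat - An open-source library of computer vision algorithms, with a Matlab toolbox.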
C++
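Computer Vision
OpenCV - The most widely used vision library. It has C++, C, Python, and Java interfaces and supports Windows, Linux, Android, and Mac OS.
DLib - DLib has C++ and Python interfaces for face recognition and object detection.
EBLearn - An object-oriented C++ library that implements various machine learning models.
VIGRA - A cross-platform computer vision and machine learning library that handles data of arbitrary dimensionality, with Python bindings.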
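General-Purpose Machine Learning
MLPack - A scalable C++ machine learning library.
DLib - Designed to be easy to embed in other systems.
encog-cpp
shark
Vowpal Wabbit (VW) - A fast out-of-core learning system.
sofia-ml - A suite of fast incremental algorithms.
Shogun - The Shogun Machine Learning Toolbox.
Caffe - A deep learning framework with a clean structure, readable code, and good speed.
CXXNET - A minimal framework whose core code is under 1,000 lines.
XGBoost - A gradient boosting library optimized for parallel computation.
CUDA - This is a fast C++/CUDA implementation of convolutional [DEEP LEARNING]
Stan - A probabilistic programming language implementing full Bayesian statistical inference with Hamiltonian Monte Carlo sampling.
BanditLib - A simple multi-armed bandit library.
Timbl - Implements several memory-based learning algorithms, of which IB1-IG (a k-nearest-neighbor classifier) and IGTree (a decision tree) are widely used in NLP.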
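Timbl's memory-based learners are built around nearest-neighbor classification. As a rough illustration only, the Python sketch below classifies a symbolic feature vector by majority vote among its k nearest stored examples, using a plain overlap (mismatch count) distance. It is not Timbl's API, and it leaves out the information-gain feature weighting that IB1-IG adds. The names overlap_distance and knn_classify and the toy part-of-speech data are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[str], str]  # (symbolic features, class label)


def overlap_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Count the feature positions at which two examples differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def knn_classify(train: List[Example], query: Sequence[str], k: int = 1) -> str:
    """Return the majority label among the k stored examples closest to query."""
    # sorted() is stable, so examples at equal distance keep their stored order.
    neighbours = sorted(train, key=lambda ex: overlap_distance(ex[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (previous word, suffix) -> part-of-speech tag.
TRAIN: List[Example] = [
    (("the", "ing"), "NOUN"),
    (("is", "ing"), "VERB"),
    (("was", "ing"), "VERB"),
    (("a", "ed"), "ADJ"),
]

print(knn_classify(TRAIN, ("is", "ing"), k=3))
```

Every training example is kept in memory and compared at query time, which is what "memory-based" refers to. A weighted variant would multiply each mismatch by a per-feature weight instead of counting it as 1.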
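Natural Language Processing
MIT Information Extraction Toolkit - C, C++, and Python tools for named entity recognition and relation extraction.
CRF++ - An open-source implementation of conditional random fields (CRFs), usable for word segmentation, part-of-speech tagging, and similar tasks.
CRFsuite - An implementation of conditional random fields, usable for part-of-speech tagging and similar tasks.
BLLIP Parser - Also known as the Charniak-Johnson parser.
colibri-core - A set of C++ libraries, command-line tools, and Python bindings that efficiently implement n-grams and skipgrams.
ucto - A multilingual tokenizer that supports Unicode-aware regular expressions and the FoLiA format.
libfolia - C++ library for the FoLiA format.
MeTA - MeTA: ModErn Text Analysis, for mining data from very large volumes of text.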
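The colibri-core entry above mentions n-grams and skipgrams. As a minimal sketch of what those terms can mean, the Python below extracts contiguous n-grams and, under one common definition that is assumed here, "skip-bigrams": ordered token pairs with at most max_skip tokens skipped between them. colibri-core's own notion of a skipgram (a pattern with gaps) and its API may differ; the function names ngrams and skip_bigrams are hypothetical.

```python
from typing import List, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return every contiguous window of n tokens, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def skip_bigrams(tokens: List[str], max_skip: int) -> List[Tuple[str, str]]:
    """Return ordered token pairs with at most max_skip tokens between them."""
    pairs = []
    for i, left in enumerate(tokens):
        # j ranges over the positions up to max_skip + 1 steps to the right of i.
        for j in range(i + 1, min(i + max_skip + 2, len(tokens))):
            pairs.append((left, tokens[j]))
    return pairs


tokens = ["to", "be", "or", "not"]
print(ngrams(tokens, 2))
print(skip_bigrams(tokens, 1))
```

With max_skip set to 0 the skip-bigrams reduce to ordinary bigrams, which is a quick sanity check on the definition.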
Machine Translation
EGYPT (GIZA++)
Moses
pharaoh
SRILM
NiuTrans
jane
SAMT
Speech Recognition
Kaldi - A C++ toolkit released under the Apache License v2.0, intended for speech recognition research.
Sequence Analysis
ToPS - An object-oriented framework that facilitates the integration of probabilistic models for sequences over a user-defined alphabet.
Java
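Natural Language Processing
Cortical.io - Retina: an API that performs complex NLP operations (disambiguation, classification, streaming text filtering, and so on), which it aims to do as quickly and intuitively as the brain.
CoreNLP - Stanford CoreNLP provides a set of natural language analysis tools that take raw English text as input and give the base forms of words.
Stanford Parser - A parser is a program that works out the grammatical structure of sentences.
Stanford POS Tagger - A part-of-speech tagger.
Stanford Named Entity Recognizer - Stanford NER is a Java implementation of a named entity recognizer.
Stanford Word Segmenter - Tokenization of raw text is a standard preprocessing step for many NLP tasks.
Tregex, Tsurgeon and Semgrex - Tregex is a utility for matching patterns in trees, based on tree relationships and regular-expression matches on nodes (the name is short for "tree regular expressions").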
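Stanford Phrasal: A Phrase-Based Translation System - Stanford Phrasal is a state-of-the-art statistical phrase-based machine translation system, written in Java.
Stanford English Tokenizer - A tokenizer divides text into a sequence of tokens, which roughly correspond to "words".
Stanford TokensRegex - A framework for defining regular expressions over sequences of tokens.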
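Stanford Temporal Tagger - SUTime is a library for recognizing and normalizing time expressions.
Stanford SPIED - Starting from a seed set, iteratively applies patterns to learn entities from unlabeled text.
Stanford Topic Modeling Toolbox - A topic modeling tool that social scientists use to analyze datasets.
Twitter Text Java - A Java implementation of Twitter's text processing library.
MALLET - A Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications.
OpenNLP - A machine-learning-based toolkit for natural language processing.
LingPipe - A computational linguistics toolkit.
ClearTK - A framework for developing statistical natural language processing components, built on top of Apache UIMA.
Apache cTAKES - The Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) is an open-source system for extracting information from electronic medical records and clinical text.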
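General-Purpose Machine Learning
aerosolve - A machine learning library designed from the ground up by Airbnb, with an emphasis on ease of use.
Datumbox - A framework for rapid development of machine learning and statistical applications.
ELKI - Data mining software (unsupervised learning: clustering, outlier detection, and so on).
Encog - An advanced neural network and machine learning framework. Encog contains classes to create a wide variety of networks, as well as support classes to normalize and process data for these neural networks. Encog trains using multithreaded resilient propagation and can also use a GPU to further reduce processing time. A GUI-based workbench is also provided.
H2O - A machine learning engine that supports distributed systems such as Hadoop and Spark as well as personal computers, with APIs callable from R, Python, Scala, and REST/JSON.
htm.java - A general-purpose machine learning library using Numenta’s Cortical Learning Algorithm.
java-deeplearning - A distributed deep learning platform for Java, Clojure, and Scala.
JAVA-ML - A general-purpose machine learning library for Java with a common interface for all algorithms.
JSAT - Provides many machine learning algorithms for classification, regression, and clustering.
Mahout - Distributed machine learning tools.
Meka - An open-source implementation of methods for multi-label classification and evaluation, built as an extension to Weka.
MLlib in Apache Spark - Spark's distributed machine learning library.
Neuroph - A lightweight Java neural network framework.
ORYX - A Lambda Architecture framework that uses Apache Spark and Apache Kafka for real-time, large-scale machine learning.
RankLib - A library of learning-to-rank algorithms.
Stanford Classifier - A classifier is a machine learning tool that takes data items and places them into one of k classes.
SmileMiner - Statistical Machine Intelligence & Learning Engine.
SystemML - A flexible, scalable machine learning language.
WalnutiQ - An object-oriented model of the human brain.
Weka - A collection of machine learning algorithms for data mining tasks.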
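Speech Recognition
CMU Sphinx - An open-source toolkit for speech recognition, with a speech recognition library written entirely in Java.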
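Data Analysis / Data Visualization
Hadoop - Hadoop/HDFS.
Spark - A fast, general-purpose engine for large-scale data processing.
Impala - Real-time queries for Hadoop.
DataMelt - Mathematics software covering numeric computation, statistics, symbolic computation, data analysis, and data visualization.
Dr. Michael Thomas Flanagan’s Java Scientific Library
Deep Learning
Deeplearning4j - Scalable, industry-focused deep learning that makes use of parallel GPUs.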
Python
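Computer Vision
Scikit-Image - A collection of image processing algorithms for Python.
SimpleCV - An open-source computer vision framework that gives access to several high-performance computer vision libraries, such as OpenCV. It runs on Mac, Windows, and Ubuntu Linux.
Vigranumpy - Python bindings for the VIGRA C++ computer vision library.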
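Natural Language Processing
NLTK - A leading platform for building Python programs that work with human language data.
Pattern - A web mining module for Python, with tools for natural language processing, machine learning, and more.
Quepy - Transforms natural language questions into queries in a database query language.
TextBlob - Provides a consistent API for common natural language processing (NLP) tasks. It is built on NLTK and Pattern and works well with both.
YAlign - A sentence aligner for extracting parallel sentences from comparable corpora.
jieba - A Chinese word segmentation tool.
SnowNLP - A library for processing Chinese text.
loso - A Chinese word segmentation tool.
genius - A Chinese word segmentation tool based on conditional random fields.
KoNLPy - Natural language processing for Korean.
nut - A natural language understanding toolkit.
Rosetta - Text processing tools and wrappers (e.g. Vowpal Wabbit).
BLLIP Parser - Python bindings for the BLLIP Natural Language Parser (also known as the Charniak-Johnson parser).
PyNLPl - A Python library for natural language processing. It also contains tools for parsing common NLP formats such as FoLiA, as well as ARPA language models, Moses phrase tables, and GIZA++ alignments.
python-ucto - Python bindings for ucto (a Unicode-aware, rule-based tokenizer).
python-frog - Python bindings for Frog, which provides part-of-speech tagging, lemmatisation, dependency parsing, and NER for Dutch.
python-zpar - Python bindings for ZPar (a statistical part-of-speech tagger, constituency parser, and dependency parser for English).
colibri-core - Python bindings for a C++ library that efficiently extracts n-grams and skipgrams.
spaCy - Industrial-strength NLP with Python and Cython.
PyStanfordDependencies - A Python interface for converting Penn Treebank trees to Stanford Dependencies.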
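General-Purpose Machine Learning
machine learning - Builds a support vector machine API compatible with both a web interface and a programmatic interface. The corresponding datasets are stored in a SQL database, and the models generated for prediction are stored in a NoSQL database.
XGBoost - Python bindings for the eXtreme Gradient Boosting (Tree) library.
Featureforge - A set of tools for creating and testing machine learning features, with a scikit-learn-compatible API.
scikit-learn - A Python module for machine learning built on SciPy.
metric-learn - A Python module for metric learning.
SimpleAI - Implements many of the artificial intelligence algorithms described in the book "Artificial Intelligence: A Modern Approach". It focuses on providing an easy-to-use, well-documented, and tested library.
astroML - A machine learning and data mining library for astronomy.
graphlab-create - A library built on a disk-backed DataFrame that implements a variety of machine learning models (regression, clustering, recommender systems, graph analytics, and so on).
BigML - A library for communicating with external servers.
pattern - A web mining module.
NuPIC - Numenta Platform for Intelligent Computing.
Pylearn2 - A machine learning library based on Theano.
keras - A neural network library based on Theano.
hebel - A GPU-accelerated deep learning library for Python.
Chainer - A flexible neural network framework.
gensim - An easy-to-use topic modeling tool.
topik - A topic modeling toolkit.
PyBrain - Another Python Machine Learning Library.
Crab - A flexible, fast recommender engine.
python-recsys - A Python library for implementing a recommender system.
Restricted Boltzmann Machines - An implementation of restricted Boltzmann machines.
CoverTree - Python implementation of cover trees, near-drop-in replacement for scipy.spatial.kdtree.
nilearn - A machine learning library for neuroimaging.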
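Shogun - The Shogun Machine Learning Toolbox.
Pyevolve - A genetic algorithm framework.
Caffe - A deep learning framework with a clean structure, readable code, and good speed.
breze - A Theano-based library for deep neural networks.
pyhsmm - Approximate unsupervised inference in Bayesian hidden Markov models and explicit-duration hidden semi-Markov models, focusing on the Bayesian nonparametric extensions, the HDP-HMM and HDP-HSMM, mostly with weak-limit approximations.
mrjob - Lets Python programs run on Hadoop.
SKLL - A simplified interface to scikit-learn that makes it easy to run experiments.
neurolab - https://github.com/zueve/neurolab
Spearmint - A package for Bayesian optimization, using the method described in the paper: Practical Bayesian Optimization of Machine Learning Algorithms. Jasper Snoek, Hugo Larochelle and Ryan P. Adams. Advances in Neural Information Processing Systems, 2012.
Pebl - A Python environment for Bayesian learning.
Theano - An array-oriented optimizing math compiler that generates GPU code through metaprogramming.
TensorFlow - An open-source software library for numerical computation using data flow graphs.
yahmm - Hidden Markov models, implemented in Cython.
python-timbl - Wraps the full TiMBL C++ programming interface. TiMBL is an elaborate k-nearest-neighbor machine learning toolkit.
deap - An evolutionary algorithm framework.
pydeep - Deep learning in Python.
mlxtend - A library of useful tools for data science and machine learning tasks.
neon - A high-performance deep learning framework.
Optunity - Dedicated to automating hyperparameter optimization, with a simple, lightweight API that makes it a drop-in replacement for grid search.
Annoy - Approximate nearest neighbours implementation.
skflow - A simplified interface for TensorFlow, similar to scikit-learn.
TPOT - Automatically creates and optimizes machine learning pipelines using genetic programming. Think of it as a data science assistant that automates much of the tedious work in machine learning.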
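Several entries in this list, Optunity in particular, describe themselves relative to grid search. For readers unfamiliar with that baseline, here is a minimal pure-Python sketch of exhaustive grid search over a hyperparameter grid. It does not use Optunity or any other listed library. The names grid_search and toy_objective and the parameters c and gamma are hypothetical, and the objective is only a stand-in for a real cross-validated score.

```python
import itertools
from typing import Callable, Dict, List, Tuple


def grid_search(
    objective: Callable[..., float],
    grid: Dict[str, List[float]],
) -> Tuple[Dict[str, float], float]:
    """Evaluate objective on every combination in grid; return the best one."""
    names = sorted(grid)
    best_params: Dict[str, float] = {}
    best_score = float("-inf")
    # itertools.product enumerates the full Cartesian product of candidate values.
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = objective(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(c: float, gamma: float) -> float:
    """Stand-in score (higher is better) that peaks at c=10, gamma=0.1."""
    return -((c - 10.0) ** 2) - (gamma - 0.1) ** 2


best, score = grid_search(
    toy_objective,
    {"c": [1.0, 10.0, 100.0], "gamma": [0.01, 0.1, 1.0]},
)
print(best, score)
```

The cost grows multiplicatively with the number of values per parameter (nine evaluations here), which is the main motivation for the smarter search strategies that hyperparameter optimization libraries offer.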
Data Analysis / Data Visualization
SciPy - A Python-based ecosystem of open-source software for mathematics, science, and engineering.
NumPy - A fundamental package for scientific computing with Python.
Numba - Python JIT (just-in-time) compiler to LLVM aimed at scientific Python, by the developers of Cython and NumPy.
NetworkX - A high-productivity software for complex networks.
Pandas - A library providing high-performance, easy-to-use data structures and data analysis tools.
Open Mining - Business Intelligence (BI) in Python (Pandas web interface).
PyMC - Markov Chain Monte Carlo sampling toolkit.
zipline - A Pythonic algorithmic trading library.
PyDy - Short for Python Dynamics, used to assist with workflow in the modeling of dynamic motion, based around NumPy, SciPy, IPython, and matplotlib.
SymPy - A Python library for symbolic mathematics.
statsmodels - Statistical modeling and econometrics in Python.
astropy - A community Python library for astronomy.
matplotlib - A Python 2D plotting library.
bokeh - Interactive web plotting for Python.
plotly - Collaborative web plotting for Python and matplotlib.
vincent - A Python to Vega translator.
d3py - A plotting library for Python, based on D3.js.
ggplot - Same API as ggplot2 for R.
ggfortify - Unified interface to ggplot2 popular R packages.
Kartograph.py - Rendering beautiful SVG maps in Python.
pygal - A Python SVG charts creator.
PyQtGraph - A pure-Python graphics and GUI library built on PyQt4/PySide and NumPy.
pycascading
Petrel - Tools for writing, submitting, debugging, and monitoring Storm topologies in pure Python.
Blaze - NumPy and Pandas interface to big data.
emcee - The Python ensemble sampling toolkit for affine-invariant MCMC.
windML - A Python framework for wind energy analysis and prediction.
vispy - GPU-based high-performance interactive OpenGL 2D/3D data visualization library.
cerebro2 - A web-based visualization and debugging platform for NuPIC.
NuPIC Studio - An all-in-one NuPIC Hierarchical Temporal Memory visualization and debugging super-tool!
SparklingPandas - Pandas on PySpark (POPS).
Seaborn - A Python visualization library based on matplotlib.
bqplot - An API for plotting in Jupyter (IPython).
Common Lisp
General-Purpose Machine Learning
mgl - Neural networks (Boltzmann machines, feed-forward and recurrent nets), Gaussian processes.
mgl-gpr - Evolutionary algorithms.
cl-libsvm - Wrapper for the libsvm support vector machine library.
Clojure
Natural Language Processing
Clojure-openNLP - Natural language processing in Clojure (opennlp).
Inflections-clj - Rails-like inflection library for Clojure and ClojureScript.
General-Purpose Machine Learning
Touchstone - Clojure A/B testing library.
Clojush - The Push programming language and the PushGP genetic programming system implemented in Clojure.
Infer - Inference and machine learning in Clojure.
Clj-ML - A machine learning library for Clojure built on top of Weka and friends.
Encog - Clojure wrapper for Encog (v3), a machine learning framework that specializes in neural nets.
Fungp - A genetic programming library for Clojure.
Statistiker - Basic machine learning algorithms in Clojure.
clortex - General machine learning library using Numenta’s Cortical Learning Algorithm.
comportex - Functionally composable machine learning library using Numenta’s Cortical Learning Algorithm.
Data Analysis / Data Visualization
Incanter - Incanter is a Clojure-based, R-like platform for statistical computing and graphics.
PigPen - Map-Reduce for Clojure.
Envision - Clojure data visualisation library, based on Statistiker and D3.
Matlab
Computer Vision
Contourlets - MATLAB source code that implements the contourlet transform and its utility functions.
Shearlets - MATLAB code for the shearlet transform.
Curvelets - The curvelet transform is a higher-dimensional generalization of the wavelet transform designed to represent images at different scales and different angles.
Bandlets - MATLAB code for the bandlet transform.
mexopencv - Collection and development kit of MATLAB MEX functions for the OpenCV library.
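Natural Language Processing
NLP - An NLP library for Matlab.
General-Purpose Machine Learning
t-Distributed Stochastic Neighbor Embedding - t-SNE is an award-winning technique for dimensionality reduction that is particularly well suited to visualizing high-dimensional data.
Spider - The Spider aims to be a complete object-oriented environment for machine learning in Matlab.
LibSVM - A well-known library for support vector machines.
LibLinear - A Library for Large Linear Classification.
Caffe - A deep learning framework with a clean structure, readable code, and good speed.
Pattern Recognition Toolbox - A complete object-oriented environment for machine learning in Matlab.
Optunity - A library dedicated to automated hyperparameter optimization with a simple, lightweight API to facilitate drop-in replacement of grid search. Optunity is written in Python but interfaces seamlessly with MATLAB.
Data Analysis / Data Visualization
matlab_gbl - MatlabBGL is a Matlab package for working with graphs.
gamic - Efficient pure-Matlab implementations of graph algorithms to complement MatlabBGL’s mex functions.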
.NET
Computer Vision
OpenCVDotNet - A wrapper for the OpenCV project to be used with .NET applications.
Emgu CV - Cross-platform wrapper of OpenCV which can be compiled in Mono to be run on Windows, Linux, Mac OS X, iOS, and Android.
AForge.NET - Open-source C# framework for developers and researchers in the fields of Computer Vision and Artificial Intelligence. Development has now shifted to GitHub.
Accord.NET - Together with AForge.NET, this library can provide image processing and computer vision algorithms to Windows, Windows RT and Windows Phone. Some components are also available for Java and Android.
Natural Language Processing
Stanford.NLP for .NET - A full port of Stanford NLP packages to .NET, also available precompiled as a NuGet package.
General-Purpose Machine Learning
Accord-Framework - A complete framework for machine learning, computer vision, computer audition, signal processing, and statistical applications.
Accord.MachineLearning - Support Vector Machines, Decision Trees, Naive Bayesian models, K-means, Gaussian Mixture models and general algorithms such as RANSAC, cross-validation and grid search for machine learning applications. This package is part of the Accord.NET Framework.
DiffSharp - An automatic differentiation (AD) library providing exact and efficient derivatives (gradients, Hessians, Jacobians, directional derivatives, and matrix-free Hessian- and Jacobian-vector products) for machine learning and optimization applications. Operations can be nested to any level, meaning that you can compute exact higher-order derivatives and differentiate functions that are internally making use of differentiation, for applications such as hyperparameter optimization.
Vulpes - Deep belief and deep learning implementation written in F# that leverages CUDA GPU execution with Alea.cuBase.
Encog - An advanced neural network and machine learning framework. Encog contains classes to create a wide variety of networks, as well as support classes to normalize and process data for these neural networks. Encog trains using multithreaded resilient propagation. Encog can also make use of a GPU to further speed processing time. A GUI-based workbench is also provided to help model and train neural networks.
Neural Network Designer - DBMS management system and designer for neural networks. The designer application is developed using WPF, and is a user interface which allows you to design your neural network, query the network, and create and configure chat bots that are capable of asking questions and learning from your feedback. The chat bots can even scrape the internet for information to return in their output as well as to use for learning.
Data Analysis / Data Visualization
numl - numl is a machine learning library intended to ease the use of standard modeling techniques for both prediction and clustering.
Math.NET Numerics - Numerical foundation of the Math.NET project, aiming to provide methods and algorithms for numerical computations in science, engineering and everyday use. Supports .Net 4.0, .Net 3.5 and Mono on Windows, Linux and Mac; Silverlight 5, WindowsPhone/SL 8, WindowsPhone 8.1 and Windows 8 with PCL Portable Profiles 47 and 344; Android/iOS with Xamarin.
Sho - Sho is an interactive environment for data analysis and scientific computing that lets you seamlessly connect scripts (in IronPython) with compiled code (in .NET) to enable fast and flexible prototyping. The environment includes powerful and efficient libraries for linear algebra as well as data visualization that can be used from any .NET language, as well as a feature-rich interactive shell for rapid development.
Ruby
Natural Language Processing
Treat - Text REtrieval and Annotation Toolkit, definitely the most comprehensive toolkit I’ve encountered so far for Ruby.
Ruby Linguistics - Linguistics is a framework for building linguistic utilities for Ruby objects in any language. It includes a generic language-independent front end, a module for mapping language codes into language names, and a module which contains various English-language utilities.
Stemmer - Expose libstemmer_c to Ruby.
Ruby Wordnet - This library is a Ruby interface to WordNet.
Raspel - raspell is an interface binding for Ruby.
UEA Stemmer - Ruby port of UEALite Stemmer, a conservative stemmer for search and indexing.
Twitter-text-rb - A library that does auto-linking and extraction of usernames, lists and hashtags in tweets.
General-Purpose Machine Learning
Ruby Machine Learning - Some machine learning algorithms, implemented in Ruby.
Machine Learning Ruby
jRuby Mahout - JRuby Mahout is a gem that unleashes the power of Apache Mahout in the world of JRuby.
CardMagic-Classifier - A general classifier module to allow Bayesian and other types of classifications.
Data Analysis / Data Visualization
rsruby - Ruby-R bridge.
data-visualization-ruby - Source code and supporting content for my Ruby Manor presentation on Data Visualisation with Ruby.
ruby-plot - gnuplot wrapper for Ruby, especially for plotting ROC curves into SVG files.
plot-rb - A plotting library in Ruby built on top of Vega and D3.
scruffy - A beautiful graphing toolkit for Ruby.
SciRuby
Glean - A data management tool for humans.
Bioruby
Arel
Misc
Big Data For Chimps
Listof - Community-based data collection, packed in a gem. Get a list of pretty much anything (stop words, countries, non-words) in txt, JSON or hash. Demo/Search for a list.
R
General-Purpose Machine Learning
ahaz - ahaz: Regularization for semiparametric additive hazards regression
arules - arules: Mining Association Rules and Frequent Itemsets
bigrf - bigrf: Big Random Forests: Classification and Regression Forests for Large Data Sets
bigRR - bigRR: Generalized Ridge Regression (with special advantage for p >> n cases)
bmrm - bmrm: Bundle Methods for Regularized Risk Minimization Package
Boruta - Boruta: A wrapper algorithm for all-relevant feature selection
bst - bst: Gradient Boosting
C50 - C50: C5.0 Decision Trees and Rule-Based Models
caret - Classification and Regression Training: Unified interface to >150 ML algorithms in R.
caretEnsemble - caretEnsemble: Framework for fitting multiple caret models as well as creating ensembles of such models.
Clever Algorithms For Machine Learning
CORElearn - CORElearn: Classification, regression, feature evaluation and ordinal evaluation
CoxBoost - CoxBoost: Cox models by likelihood based boosting for a single survival endpoint or competing risks
Cubist - Cubist: Rule- and Instance-Based Regression Modeling
e1071 - e1071: Misc Functions of the Department of Statistics (e1071), TU Wien
earth - earth: Multivariate Adaptive Regression Spline Models
elasticnet - elasticnet: Elastic-Net for Sparse Estimation and Sparse PCA
ElemStatLearn - ElemStatLearn: Data sets, functions and examples from the book: "The Elements of Statistical Learning, Data Mining, Inference, and Prediction" by Trevor Hastie, Robert Tibshirani and Jerome Friedman
evtree - evtree: Evolutionary Learning of Globally Optimal Trees
fpc - fpc: Flexible procedures for clustering
frbs - frbs: Fuzzy Rule-based Systems for Classification and Regression Tasks
GAMBoost - GAMBoost: Generalized linear and additive models by likelihood based boosting
gamboostLSS - gamboostLSS: Boosting Methods for GAMLSS
gbm - gbm: Generalized Boosted Regression Models
glmnet - glmnet: Lasso and elastic-net regularized generalized linear models
glmpath - glmpath: L1 Regularization Path for Generalized Linear Models and Cox Proportional Hazards Model
GMMBoost - GMMBoost: Likelihood-based Boosting for Generalized mixed models
grplasso - grplasso: Fitting user specified models with Group Lasso penalty
grpreg - grpreg: Regularization paths for regression models with grouped covariates
h2o - A framework for fast, parallel, and distributed machine learning algorithms at scale: Deep learning, Random forests, GBM, KMeans, PCA, GLM
hda - hda: Heteroscedastic Discriminant Analysis
Introduction to Statistical Learning
ipred - ipred: Improved Predictors
kernlab - kernlab: Kernel-based Machine Learning Lab
klaR - klaR: Classification and visualization
lars - lars: Least Angle Regression, Lasso and Forward Stagewise
lasso2 - lasso2: L1 constrained estimation aka ‘lasso’
LiblineaR - LiblineaR: Linear Predictive Models Based On The Liblinear C/C++ Library
LogicReg - LogicReg: Logic Regression
Machine Learning For Hackers
maptree - maptree: Mapping, pruning, and graphing tree models
mboost - mboost: Model-Based Boosting
medley - medley: Blending regression models, using a greedy stepwise approach
mlr - mlr: Machine Learning in R
mvpart - mvpart: Multivariate partitioning
ncvreg - ncvreg: Regularization paths for SCAD- and MCP-penalized regression models
nnet - nnet: Feed-forward Neural Networks and Multinomial Log-Linear Models
oblique.tree - oblique.tree: Oblique Trees for Classification Data
pamr - pamr: Pam: prediction analysis for microarrays
party - party: A Laboratory for Recursive Partytioning
partykit - partykit: A Toolkit for Recursive Partytioning
penalized - penalized: L1 (lasso and fused lasso) and L2 (ridge) penalized estimation in GLMs and in the Cox model
penalizedLDA - penalizedLDA: Penalized classification using Fisher’s linear discriminant
penalizedSVM - penalizedSVM: Feature Selection SVM using penalty functions
quantregForest - quantregForest: Quantile Regression Forests
randomForest - randomForest: Breiman and Cutler’s random forests for classification and regression
randomForestSRC - randomForestSRC: Random Forests for Survival, Regression and Classification (RF-SRC)
rattle - rattle: Graphical user interface for data mining in R
rda - rda: Shrunken Centroids Regularized Discriminant Analysis
rdetools - rdetools: Relevant Dimension Estimation (RDE) in Feature Spaces
REEMtree - REEMtree: Regression Trees with Random Effects for Longitudinal (Panel) Data
relaxo - relaxo: Relaxed Lasso
rgenoud - rgenoud: R version of GENetic Optimization Using Derivatives
rgp - rgp: R genetic programming framework
Rmalschains - Rmalschains: Continuous Optimization using Memetic Algorithms with Local Search Chains (MA-LS-Chains) in R
rminer - rminer: Simpler use of data mining methods (e.g. NN and SVM) in classification and regression
ROCR - ROCR: Visualizing the performance of scoring classifiers
RoughSets - RoughSets: Data Analysis Using Rough Set and Fuzzy Rough Set Theories
rpart - rpart: Recursive Partitioning and Regression Trees
RPMM - RPMM: Recursively Partitioned Mixture Model
RSNNS - RSNNS: Neural Networks in R using the Stuttgart Neural Network Simulator (SNNS)
RWeka - RWeka: R/Weka interface
RXshrink - RXshrink: Maximum Likelihood Shrinkage via Generalized Ridge or Least Angle Regression
sda - sda: Shrinkage Discriminant Analysis and CAT Score Variable Selection
SDDA - SDDA: Stepwise Diagonal Discriminant Analysis
SuperLearner and subsemble - Multi-algorithm ensemble learning packages.
svmpath - svmpath: the SVM Path algorithm
tgp - tgp: Bayesian treed Gaussian process models
tree - tree: Classification and regression trees
varSelRF - varSelRF: Variable selection using random forests
XGBoost.R - R binding for eXtreme Gradient Boosting (Tree) Library
Optunity - A library dedicated to automated hyperparameter optimization with a simple, lightweight API to facilitate drop-in replacement of grid search. Optunity is written in Python but interfaces seamlessly to R.
Data Analysis / Data Visualization
ggplot2 - A data visualization package based on the grammar of graphics.
Scala
Natural Language Processing
ScalaNLP - ScalaNLP is a suite of machine learning and numerical computing libraries.
Breeze - Breeze is a numerical processing library for Scala.
Chalk - Chalk is a natural language processing library.
FACTORIE - FACTORIE is a toolkit for deployable probabilistic modeling, implemented as a software library in Scala. It provides its users with a succinct language for creating relational factor graphs, estimating parameters and performing inference.
Data Analysis / Data Visualization
MLlib in Apache Spark - Distributed machine learning library in Spark.
Scalding - A Scala API for Cascading.
Summing Bird - Streaming MapReduce with Scalding and Storm.
Algebird - Abstract Algebra for Scala.
xerial - Data management utilities for Scala.
simmer - Reduce your data. A Unix filter for Algebird-powered aggregation.
PredictionIO - PredictionIO, a machine learning server for software developers and data engineers.
BIDMat - CPU and GPU-accelerated matrix library intended to support large-scale exploratory data analysis.
Wolfe - Declarative Machine Learning.
General-Purpose Machine Learning
Conjecture - Scalable Machine Learning in Scalding.
brushfire - Distributed decision tree ensemble learning in Scala.
ganitha - Scalding-powered machine learning.
adam - A genomics processing engine and specialized file format built using Apache Avro, Apache Spark and Parquet. Apache 2 licensed.
bioscala - Bioinformatics for the Scala programming language.
BIDMach - CPU and GPU-accelerated Machine Learning Library.
Figaro - A Scala library for constructing probabilistic models.
H2O Sparkling Water - H2O and Spark interoperability.
