Parallel Feature Selection Based on MapReduce

Zhanquan Sun

This paper proposes a parallel feature selection method based on the MapReduce model. A large-scale dataset is partitioned into sub-datasets, and feature selection is performed on each sub-dataset at its own computational node. In the Reduce job, the feature variables selected on the nodes are combined into one feature vector. Because each node processes only its own partition, the method is scalable. Its efficiency is illustrated through an example analysis.
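The partition, select, and combine steps described in the abstract can be sketched as follows. This is only an illustrative sketch, not the paper's implementation: the abstract does not state which selection criterion is used, so ranking features by variance, keeping the top `k` per partition, and combining by set union are all assumptions, as are the function names `partition`, `map_select`, `reduce_combine`, and `parallel_feature_selection`. The map calls run sequentially here; in a real MapReduce deployment each would run on a separate node.

```python
from statistics import pvariance


def partition(rows, n_parts):
    """Split the dataset into n_parts sub-datasets (round-robin by row)."""
    return [rows[i::n_parts] for i in range(n_parts)]


def map_select(sub_rows, k):
    """Map step (assumed criterion): keep the k highest-variance feature indices."""
    n_features = len(sub_rows[0])
    scores = [pvariance([row[j] for row in sub_rows]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def reduce_combine(selected_sets):
    """Reduce step (assumed rule): union of the per-partition selections."""
    combined = set()
    for selected in selected_sets:
        combined |= selected
    return sorted(combined)


def parallel_feature_selection(rows, n_parts=2, k=2):
    """Run the map step on every sub-dataset, then combine in the reduce step."""
    parts = partition(rows, n_parts)
    return reduce_combine(map_select(part, k) for part in parts)


# Hypothetical toy data: 4 samples, 4 features.
rows = [
    [1.0, 10.0, 5.0, 0.1],
    [1.0, -10.0, 3.0, 0.2],
    [1.0, 20.0, 7.0, 0.1],
    [1.0, -20.0, 1.0, 0.2],
]
selected = parallel_feature_selection(rows, n_parts=2, k=2)
print(selected)
```

Because every `map_select` call touches only its own sub-dataset, the calls are independent of one another, which is the property that lets the work be spread across nodes.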

Time: 2024-08-23 16:55:48
