
Record Details

Hands-on machine learning with R (checked out 9 times)

Material Type
Book
Personal Authors
Boehmke, Brad. Greenwell, Brandon M.
Title / Statement of Responsibility
Hands-on machine learning with R / Brad Boehmke, Brandon Greenwell.
Publication
Boca Raton : CRC Press, c2020.
Physical Description
xxiv, 459 p. : ill. (some col.) ; 24 cm.
Series
Chapman & Hall/CRC the R series
ISBN
9781138495685 (hardback) 9780367418298 (paperback) 9780367816377 (pdf)
Summary
"This book is designed to introduce the concept of advanced business analytic approaches and would be the first to cover the gamut of how to use the R programming language to apply descriptive, predictive, and prescriptive analytic methodologies for problem solving"--
Bibliography Note
Includes bibliographical references (p. 443-456) and index.
Subject Headings
Machine learning. R (Computer program language).
000 00000cam u2200205 a 4500
001 000046010953
005 20200103114022
008 191231s2020 flua b 001 0 eng d
010 ▼a 2019029574
020 ▼a 9781138495685 (hardback)
020 ▼a 9780367418298 (paperback)
020 ▼a 9780367816377 (pdf)
035 ▼a (KERIS)REF000019151370
040 ▼a DLC ▼b eng ▼e rda ▼c DLC ▼d 211009
050 0 0 ▼a Q325.5 ▼b .B59 2019
082 0 0 ▼a 006.3/1 ▼2 23
084 ▼a 006.31 ▼2 DDCK
090 ▼a 006.31 ▼b B671h
100 1 ▼a Boehmke, Brad.
245 1 0 ▼a Hands-on machine learning with R / ▼c Brad Boehmke, Brandon Greenwell.
260 ▼a Boca Raton : ▼b CRC Press, ▼c c2020.
300 ▼a xxiv, 459 p. : ▼b ill. (some col.) ; ▼c 24 cm.
490 1 ▼a Chapman & Hall/CRC the R series
504 ▼a Includes bibliographical references (p. 443-456) and index.
520 ▼a "This book is designed to introduce the concept of advanced business analytic approaches and would the first to cover the gamut of how to use the R programming language to apply descriptive, predictive, and prescriptive analytic methodologies for problem solving"-- ▼c Provided by publisher.
650 0 ▼a Machine learning.
650 0 ▼a R (Computer program language).
700 1 ▼a Greenwell, Brandon M.
830 0 ▼a Chapman & Hall/CRC the R series.
945 ▼a KLPA

Holdings Information

No. 1 | Location: Main Library / Stacks, 6th floor | Call No.: 006.31 B671h | Accession No.: 111821127 (checked out 9 times) | Status: Available for loan

Contents Information

Book Description

Hands-on Machine Learning with R provides a practical and applied approach to learning and developing intuition into today's most popular machine learning methods. This book serves as a practitioner's guide to the machine learning process and is meant to help the reader learn to apply the machine learning stack within R, which includes using various R packages such as glmnet, h2o, ranger, xgboost, keras, and others to effectively model and gain insight from their data. The book favors a hands-on approach, providing an intuitive understanding of machine learning concepts through concrete examples and just a little bit of theory. 

Throughout this book, the reader will be exposed to the entire machine learning process, including feature engineering, resampling, hyperparameter tuning, model evaluation, and interpretation. The reader will be exposed to powerful algorithms such as regularized regression, random forests, gradient boosting machines, deep learning, generalized low rank models, and more. By favoring a hands-on approach and using real-world data, the reader will gain an intuitive understanding of the architectures and engines that drive these algorithms and packages, understand when and how to tune the various hyperparameters, and be able to interpret model results. By the end of this book, the reader should have a firm grasp of R's machine learning stack and be able to implement a systematic approach for producing high-quality modeling results.

Features:

  • Offers a practical and applied introduction to the most popular machine learning methods.
  • Takes readers through the entire modeling process: from data prep to hyperparameter tuning, model evaluation, and interpretation.
  • Introduces readers to a wide variety of packages that make up R's machine learning stack.
  • Uses a hands-on approach and real world data.
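As a minimal illustration (not drawn from the book itself), the supervised workflow the description outlines — stratified data splitting, model fitting, and hold-out evaluation — can be sketched with two of the packages the description names as part of R's machine learning stack; this assumes the rsample and ranger packages are installed, and uses the built-in iris data in place of the book's datasets:

```r
library(rsample)  # data splitting utilities (assumed installed)
library(ranger)   # fast random forest engine (assumed installed)

set.seed(123)  # reproducible split and fit

# Stratified 70/30 train/test split on the response
split <- initial_split(iris, prop = 0.7, strata = "Species")
train <- training(split)
test  <- testing(split)

# Fit a random forest with 500 trees on the training set
fit <- ranger(Species ~ ., data = train, num.trees = 500)

# Evaluate on the hold-out set: proportion of correct predictions
pred <- predict(fit, data = test)
mean(pred$predictions == test$Species)
```

The book applies the same split/fit/tune/evaluate pattern across the engines it covers (glmnet, h2o, xgboost, keras, and others).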

Brad Boehmke is a data scientist at 84.51 where he wears both software developer and machine learning engineer hats. He is an Adjunct Professor at the University of Cincinnati, author of Data Wrangling with R, and creator of multiple public and private enterprise R packages.

Brandon Greenwell is a data scientist at 84.51 where he works on a diverse team to enable, empower, and encourage others to successfully apply machine learning to solve real business problems. He's part of the Adjunct Graduate Faculty at Wright State University, an Adjunct Instructor at the University of Cincinnati, and the author of several R packages available on CRAN.


Information provided by: Aladin

Table of Contents

FUNDAMENTALS

Introduction to Machine Learning
  Supervised learning
  Regression problems
  Classification problems
  Unsupervised learning
  Roadmap
  The data sets

Modeling Process
  Prerequisites
  Data splitting
  Simple random sampling
  Stratified sampling
  Class imbalances
  Creating models in R
  Many formula interfaces
  Many engines
  Resampling methods
  k-fold cross validation
  Bootstrapping
  Alternatives
  Bias variance trade-off
  Bias
  Variance
  Hyperparameter tuning
  Model evaluation
  Regression models
  Classification models
  Putting the processes together

Feature & Target Engineering
  Prerequisites
  Target engineering
  Dealing with missingness
  Visualizing missing values
  Imputation
  Feature filtering
  Numeric feature engineering
  Skewness
  Standardization
  Categorical feature engineering
  Lumping
  One-hot & dummy encoding
  Label encoding
  Alternatives
  Dimension reduction
  Proper implementation
  Sequential steps
  Data leakage
  Putting the process together

SUPERVISED LEARNING

Linear Regression
  Prerequisites
  Simple linear regression
  Estimation
  Inference
  Multiple linear regression
  Assessing model accuracy
  Model concerns
  Principal component regression
  Partial least squares
  Feature interpretation
  Final thoughts

Logistic Regression
  Prerequisites
  Why logistic regression
  Simple logistic regression
  Multiple logistic regression
  Assessing model accuracy
  Model concerns
  Feature interpretation
  Final thoughts

Regularized Regression
  Prerequisites
  Why regularize?
  Ridge penalty
  Lasso penalty
  Elastic nets
  Implementation
  Tuning
  Feature interpretation
  Attrition data
  Final thoughts

Multivariate Adaptive Regression Splines
  Prerequisites
  The basic idea
  Multivariate regression splines
  Fitting a basic MARS model
  Tuning
  Feature interpretation
  Attrition data
  Final thoughts

K-Nearest Neighbors
  Prerequisites
  Measuring similarity
  Distance measures
  Pre-processing
  Choosing k
  MNIST example
  Final thoughts

Decision Trees
  Prerequisites
  Structure
  Partitioning
  How deep?
  Early stopping
  Pruning
  Ames housing example
  Feature interpretation
  Final thoughts

Bagging
  Prerequisites
  Why and when bagging works
  Implementation
  Easily parallelize
  Feature interpretation
  Final thoughts

Random Forests
  Prerequisites
  Extending bagging
  Out-of-the-box performance
  Hyperparameters
  Number of trees
  mtry
  Tree complexity
  Sampling scheme
  Split rule
  Tuning strategies
  Feature interpretation
  Final thoughts

Gradient Boosting
  Prerequisites
  How boosting works
  A sequential ensemble approach
  Gradient descent
  Basic GBM
  Hyperparameters
  Implementation
  General tuning strategy
  Stochastic GBMs
  Stochastic hyperparameters
  Implementation
  XGBoost
  XGBoost hyperparameters
  Tuning strategy
  Feature interpretation
  Final thoughts

Deep Learning
  Prerequisites
  Why deep learning
  Feedforward DNNs
  Network architecture
  Layers and nodes
  Activation
  Backpropagation
  Model training
  Model tuning
  Model capacity
  Batch normalization
  Regularization
  Adjust learning rate
  Grid Search
  Final thoughts

Support Vector Machines
  Prerequisites
  Optimal separating hyperplanes
  The hard margin classifier
  The soft margin classifier
  The support vector machine
  More than two classes
  Support vector regression
  Job attrition example
  Class weights
  Class probabilities
  Feature interpretation
  Final thoughts

Stacked Models
  Prerequisites
  The Idea
  Common ensemble methods
  Super learner algorithm
  Available packages
  Stacking existing models
  Stacking a grid search
  Automated machine learning
  Final thoughts

Interpretable Machine Learning
  Prerequisites
  The idea
  Global interpretation
  Local interpretation
  Model-specific vs. model-agnostic
  Permutation-based feature importance
  Concept
  Implementation
  Partial dependence
  Concept
  Implementation
  Alternative uses
  Individual conditional expectation
  Concept
  Implementation
  Feature interactions
  Concept
  Implementation
  Alternatives
  Local interpretable model-agnostic explanations
  Concept
  Implementation
  Tuning
  Alternative uses
  Shapley values
  Concept
  Implementation
  XGBoost and built-in Shapley values
  Localized step-wise procedure
  Concept
  Implementation
  Final thoughts

DIMENSION REDUCTION

Principal Components Analysis
  Prerequisites
  The idea
  Finding principal components
  Performing PCA in R
  Selecting the number of principal components
  Eigenvalue criterion
  Proportion of variance explained criterion
  Scree plot criterion
  Final thoughts

Generalized Low Rank Models
  Prerequisites
  The idea
  Finding the lower ranks
  Alternating minimization
  Loss functions
  Regularization
  Selecting k
  Fitting GLRMs in R
  Basic GLRM model
  Tuning to optimize for unseen data
  Final thoughts

Autoencoders
  Prerequisites
  Undercomplete autoencoders
  Comparing PCA to an autoencoder
  Stacked autoencoders
  Visualizing the reconstruction
  Sparse autoencoders
  Denoising autoencoders
  Anomaly detection
  Final thoughts

CLUSTERING

K-means Clustering
  Prerequisites
  Distance measures
  Defining clusters
  k-means algorithm
  Clustering digits
  How many clusters?
  Clustering with mixed data
  Alternative partitioning methods
  Final thoughts

Hierarchical Clustering
  Prerequisites
  Hierarchical clustering algorithms
  Hierarchical clustering in R
  Agglomerative hierarchical clustering
  Divisive hierarchical clustering
  Determining optimal clusters
  Working with dendrograms
  Final thoughts

Model-based Clustering
  Prerequisites
  Measuring probability and uncertainty
  Covariance types
  Model selection
  My basket example
  Final thoughts

New Arrivals in Related Fields

Dyer-Witheford, Nick (2026)
양성봉 (2025)