--- title: "Basic Workflow" description: > This vignette describes the basic workflow of SHAPforxgboost. bibliography: "biblio.bib" link-citations: true output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Basic Workflow} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", warning = FALSE, message = FALSE, fig.width = 5, fig.height = 4 ) ``` ## Introduction This vignette shows the basic workflow of using `SHAPforxgboost` for interpretation of models trained with `XGBoost`, a hightly efficient gradient boosting implementation [@chen2016]. ```{r setup} library("ggplot2") library("SHAPforxgboost") library("xgboost") set.seed(9375) ``` ## Training the model Let's train a small model to predict the first column in the iris data set, namely `Sepal.Length`. ```{r} head(iris) X <- data.matrix(iris[, -1]) dtrain <- xgb.DMatrix(X, label = iris[[1]]) fit <- xgb.train( params = list( objective = "reg:squarederror", learning_rate = 0.1 ), data = dtrain, nrounds = 50 ) ``` ## SHAP analysis Now, we can prepare the SHAP values and analyze the results. All this in just very few lines of code! ```{r} # Crunch SHAP values shap <- shap.prep(fit, X_train = X) # SHAP importance plot shap.plot.summary(shap) # Alternatively, mean absolute SHAP values shap.plot.summary(shap, kind = "bar") # Dependence plots in decreasing order of importance # (colored by strongest interacting variable) for (x in shap.importance(shap, names_only = TRUE)) { p <- shap.plot.dependence( shap, x = x, color_feature = "auto", smooth = FALSE, jitter_width = 0.01, alpha = 0.4 ) + ggtitle(x) print(p) } ``` *Note: `print` is required only in the context of using `ggplot` in `rmarkdown` and for loop.* This is just a teaser: `SHAPforxgboost` can do much more! Check out the README for much more information. ## References