# caret

`caret` is an R package that supports the data-processing work around machine learning problems. The name stands for Classification And REgression Training. Building a model for a real dataset involves tasks beyond the learning algorithm itself, such as cleaning the data, dealing with incomplete observations, validating the model on a test set, and comparing different models. `caret` helps with these tasks, independent of the learning algorithm used.

## Preprocessing

Pre-processing in `caret` is done through the `preProcess()` function. Given a matrix or data frame `x`, `preProcess()` estimates the requested transformations from the training data; the same fitted transformations can then be applied to the test data with `predict()`.
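As a minimal sketch of what "estimated on the training data, applied to the test data" means, the example below fits centering and scaling on a fixed subset of `mtcars` and then transforms held-out rows. The 24/8 row split and the variable names (`train`, `test`, `pp`) are assumptions made for illustration only.

```r
library(caret)

# Illustrative fixed split (an assumption, chosen so the example is deterministic)
train <- mtcars[1:24, ]
test  <- mtcars[25:32, ]

# Estimate per-column means and standard deviations from the training rows only
pp <- preProcess(train, method = c("center", "scale"))

# The fitted object stores the training statistics
train_mean_mpg <- pp$mean[["mpg"]]
train_sd_mpg   <- pp$std[["mpg"]]

# Transforming the test rows reuses the training statistics
test_transf <- predict(pp, test)

# Manual equivalent for one column, using the training mean and sd
manual_mpg <- (test$mpg - mean(train$mpg)) / sd(train$mpg)
```

Because the test rows are standardized with the training statistics, their transformed columns will generally not have mean 0 and standard deviation 1. This is intentional: no information from the test set goes into the fitted transformation.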

The heart of the `preProcess()` function is the `method` argument. Regardless of the order in which they are listed in `method`, the operations are applied in this order:

- Zero-variance filter
- Near-zero variance filter
- Box-Cox/Yeo-Johnson/exponential transformation
- Centering
- Scaling
- Range
- Imputation
- PCA
- ICA
- Spatial Sign
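One consequence of this fixed ordering is that the order of the entries in `method` should not change the result. The sketch below checks this on a few `mtcars` columns. The column selection and the names `pp_a`, `pp_b`, and `same_result` are illustrative assumptions.

```r
library(caret)

# A small all-numeric subset of mtcars (illustrative choice)
x <- mtcars[, c("mpg", "disp", "hp", "wt")]

# The same three operations, listed in two different orders
pp_a <- preProcess(x, method = c("center", "scale", "spatialSign"))
pp_b <- preProcess(x, method = c("spatialSign", "scale", "center"))

out_a <- predict(pp_a, x)
out_b <- predict(pp_b, x)

# Compare the two transformed data frames
same_result <- isTRUE(all.equal(out_a, out_b))
```

If `same_result` is `TRUE`, both calls centered and scaled the data before applying the spatial sign, whatever order was written in `method`.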

Below, we take the mtcars data set and perform centering, scaling, and a spatial sign transform.

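```
library(caret)

set.seed(1)  # make the random partition reproducible
# Hold out about 20% of the rows, sampled within groups of mpg values
auto_index <- createDataPartition(mtcars$mpg, p = .8,
                                  list = FALSE,
                                  times = 1)
mt_train <- mtcars[auto_index, ]
mt_test <- mtcars[-auto_index, ]
# Estimate the transformations on the training set only
process_mtcars <- preProcess(mt_train, method = c("center", "scale", "spatialSign"))
# Apply the fitted transformations to both sets
mtcars_train_transf <- predict(process_mtcars, mt_train)
mtcars_test_transf <- predict(process_mtcars, mt_test)
```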
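To see what the spatial sign step does, it can be applied on its own. The spatial sign transform projects each observation onto the unit sphere by dividing the row by its Euclidean norm. The sketch below uses base R's `scale()` as a stand-in for the centering and scaling steps, an assumption made to keep the example self-contained, and then calls caret's `spatialSign()` directly.

```r
library(caret)

# Center and scale every column of mtcars with base R (stand-in for preProcess)
scaled <- scale(mtcars)

# Project each row onto the unit sphere
signed <- spatialSign(scaled)

# Euclidean norm of every transformed row
row_norms <- sqrt(rowSums(signed^2))
```

Every entry of `row_norms` should equal 1 up to floating-point error. This is why the transform is often used to limit the influence of outlying observations.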