Bin continuous data using the winsorized method.

rbin_winsorize(data = NULL, response = NULL, predictor = NULL,
  bins = 10, winsor_rate = 0.05, min_val = NULL, max_val = NULL,
  include_na = TRUE)

# S3 method for rbin_winsorize
plot(x, ...)

Arguments

data

A data.frame or tibble.

response

Response variable.

predictor

Predictor variable.

bins

Number of bins.

winsor_rate

Winsorization rate; a value from 0.0 to 0.5.

min_val

The lower border; all values below it are replaced by this value. Defaults to the 5 percent quantile of predictor.

max_val

The upper border; all values above it are replaced by this value. Defaults to the 95 percent quantile of predictor.

include_na

Logical; if TRUE, a separate bin is created for missing values.

x

An object of class rbin_winsorize.

...

Further arguments passed to or from other methods.
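
The replacement rule described for min_val and max_val can be illustrated in base R. This is only a minimal sketch and not the package's implementation: winsorize_sketch is a hypothetical helper, and the assumption that the default borders are the winsor_rate and 1 - winsor_rate quantiles is inferred from the documented defaults (5 and 95 percent with winsor_rate = 0.05).

```r
# Hypothetical helper, not part of rbin: clamp x to [min_val, max_val].
winsorize_sketch <- function(x, winsor_rate = 0.05,
                             min_val = NULL, max_val = NULL) {
  stopifnot(is.numeric(x), winsor_rate >= 0, winsor_rate <= 0.5)
  # Assumed defaults: the winsor_rate and (1 - winsor_rate) quantiles.
  if (is.null(min_val)) {
    min_val <- unname(quantile(x, probs = winsor_rate, na.rm = TRUE))
  }
  if (is.null(max_val)) {
    max_val <- unname(quantile(x, probs = 1 - winsor_rate, na.rm = TRUE))
  }
  # Values below min_val become min_val; values above max_val become max_val.
  # Missing values pass through unchanged.
  pmin(pmax(x, min_val), max_val)
}

x <- c(1, 20, 25, 30, 35, 40, 45, 50, 55, 200)
winsorize_sketch(x, min_val = 20, max_val = 55)
#> [1] 20 20 25 30 35 40 45 50 55 55
```

Only the two extreme values change here; everything already inside the borders is left alone, which is what distinguishes winsorizing from trimming.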

Value

A tibble.

Examples

bins <- rbin_winsorize(mbank, y, age, 10, winsor_rate = 0.05)
bins
#> Binning Summary
#> ------------------------------
#> Method               Winsorize
#> Response             y
#> Predictor            age
#> Bins                 10
#> Count                4521
#> Goods                517
#> Bads                 4004
#> Entropy              0.51
#> Information Value    0.1
#>
#>
#> # A tibble: 10 x 7
#>    cut_point bin_count  good   bad    woe       iv entropy
#>    <chr>         <int> <int> <int>  <dbl>    <dbl>   <dbl>
#>  1 < 30.2          723   112   611 -0.350 0.0224     0.622
#>  2 < 33.4          567    55   512  0.184 0.00395    0.459
#>  3 < 36.6          573    58   515  0.137 0.00225    0.473
#>  4 < 39.8          497    44   453  0.285 0.00798    0.432
#>  5 < 43            396    37   359  0.225 0.00408    0.448
#>  6 < 46.2          461    43   418  0.227 0.00482    0.447
#>  7 < 49.4          281    22   259  0.419 0.00927    0.396
#>  8 < 52.6          309    32   277  0.111 0.000811   0.480
#>  9 < 55.8          244    25   219  0.123 0.000781   0.477
#> 10 >= 55.8         470    89   381 -0.593 0.0456     0.700

# plot
plot(bins)
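
To help read the woe and iv columns, the following base R sketch recomputes them from the good and bad counts printed in the example output. The formulas are an assumption inferred from the printed values (weight of evidence as the log of the bad share over the good share, with that sign convention), not taken from the package source.

```r
# Counts copied from the tibble printed in the example output.
good <- c(112, 55, 58, 44, 37, 43, 22, 32, 25, 89)
bad  <- c(611, 512, 515, 453, 359, 418, 259, 277, 219, 381)

# Share of all goods and of all bads that falls in each bin.
good_dist <- good / sum(good)
bad_dist  <- bad / sum(bad)

# Assumed convention: woe = log(bad share / good share).
woe <- log(bad_dist / good_dist)
# Each bin's contribution to the information value.
iv <- (bad_dist - good_dist) * woe

round(woe[1], 3)
#> [1] -0.35
round(sum(iv), 2)
#> [1] 0.1
```

Under this assumption the per-bin iv values sum to the Information Value reported in the summary, and the counts sum to the Goods and Bads totals (517 and 4004).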