These are made-up data for exploring the performance of boosting. The variables have been renamed and the class labels altered to remove any historical association with another dataset that we try to avoid using.
For both datasets, the column names are x, y, and class.
Format
goodboost: An object of class tbl_df (inherits from tbl, data.frame) with 506 rows and 3 columns.
badboost: An object of class tbl_df (inherits from tbl, data.frame) with 150 rows and 3 columns.
Details
goodboost is a case where stump-based boosting works well.
badboost is a case where stump-based boosting works poorly.
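To make "stump-based boosting" concrete, here is a minimal base-R sketch of discrete AdaBoost with one-split trees (stumps). It is an illustration under assumptions, not the method used by this package: the function names fit_stump, predict_stump and adaboost_stumps are hypothetical, and toy is a made-up stand-in with the same column layout (x, y, class) as the two datasets.

```r
# Find the weighted-error-minimising stump on the numeric columns x and y.
# `label` is coded +1 / -1 and `w` holds observation weights summing to 1.
fit_stump <- function(data, label, w) {
  best <- list(err = Inf)
  for (var in c("x", "y")) {
    for (cut in unique(data[[var]])) {
      for (sign in c(1, -1)) {
        pred <- ifelse(data[[var]] <= cut, sign, -sign)
        err <- sum(w[pred != label])
        if (err < best$err) {
          best <- list(var = var, cut = cut, sign = sign, err = err)
        }
      }
    }
  }
  best
}

# Predict +1 / -1 from a fitted stump.
predict_stump <- function(stump, data) {
  ifelse(data[[stump$var]] <= stump$cut, stump$sign, -stump$sign)
}

# Discrete AdaBoost with stumps; returns the training misclassification rate.
# Assumes `class` is a two-level factor (first level is coded +1).
adaboost_stumps <- function(data, rounds = 20) {
  label <- ifelse(data$class == levels(data$class)[1], 1, -1)
  w <- rep(1 / nrow(data), nrow(data))
  score <- numeric(nrow(data))
  for (m in seq_len(rounds)) {
    stump <- fit_stump(data, label, w)
    err <- min(max(stump$err, 1e-10), 1 - 1e-10)  # guard against log(0)
    alpha <- 0.5 * log((1 - err) / err)           # weight of this stump
    pred <- predict_stump(stump, data)
    score <- score + alpha * pred
    w <- w * exp(-alpha * label * pred)           # upweight misclassified rows
    w <- w / sum(w)
  }
  mean(sign(score) != label)
}

# Hypothetical stand-in data: the class depends on x alone, so one
# axis-aligned split separates it and the training error is 0.
set.seed(1)
toy <- data.frame(x = runif(200), y = runif(200))
toy$class <- factor(ifelse(toy$x > 0.5, "A", "B"))
adaboost_stumps(toy, rounds = 5)
```

With the package loaded, the same call could be applied to each dataset, for example adaboost_stumps(goodboost) and adaboost_stumps(badboost), to compare training error. Only training error is reported here; a held-out split would be needed to judge generalisation.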
Examples
goodboost
#> # A tibble: 506 × 3
#> x y class
#> <dbl> <dbl> <fct>
#> 1 1.34 0.624 A
#> 2 1.41 1.15 A
#> 3 1.60 1.08 A
#> 4 1.70 1.24 A
#> 5 1.78 1.32 A
#> 6 1.86 1.42 A
#> 7 1.60 1.30 A
#> 8 1.58 1.51 A
#> 9 1.64 1.55 A
#> 10 1.69 1.61 A
#> # ℹ 496 more rows
badboost
#> # A tibble: 150 × 3
#> x y class
#> <dbl> <dbl> <fct>
#> 1 1.16 4.92 A
#> 2 1.65 1.05 B
#> 3 4.34 4.90 B
#> 4 2.11 4.65 B
#> 5 2.19 4.15 B
#> 6 4.57 3.40 B
#> 7 2.69 1.78 A
#> 8 0.102 1.04 B
#> 9 0.970 3.26 B
#> 10 1.33 3.46 B
#> # ℹ 140 more rows