Made-up data for exploring the performance of boosting. The variables have been renamed and the class labels altered to remove any historical association with another dataset that we try to avoid using.

For both datasets, column names are x, y, and class.

Usage

goodboost

badboost

Format

goodboost: An object of class tbl_df (inherits from tbl, data.frame) with 506 rows and 3 columns.

badboost: An object of class tbl_df (inherits from tbl, data.frame) with 150 rows and 3 columns.

Details

goodboost is a case where stump-based boosting works well.

badboost is a case where stump-based boosting works poorly.
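
As a rough illustration of what "stump-based" means here, the sketch below fits a single decision stump (a classification tree with one split) and reports its training misclassification rate. It assumes the rpart package is installed; `stump_error` is a hypothetical helper written for this illustration and is not part of this package. Boosting would combine many such stumps, which this sketch does not attempt.

```r
library(rpart)

# Hypothetical helper (not part of the package): fit one decision stump
# on a data frame with columns x, y, and class, then return the
# proportion of rows the stump misclassifies on that same data.
stump_error <- function(data) {
  stump <- rpart(
    class ~ x + y,
    data = data,
    method = "class",
    # maxdepth = 1 allows only a single split below the root node
    control = rpart.control(maxdepth = 1)
  )
  predicted <- predict(stump, newdata = data, type = "class")
  mean(as.character(predicted) != as.character(data$class))
}
```

With the datasets loaded, `stump_error(goodboost)` and `stump_error(badboost)` give the training error of a single stump on each dataset, which can serve as a baseline before trying a boosted ensemble.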

Examples

goodboost
#> # A tibble: 506 × 3
#>        x     y class
#>    <dbl> <dbl> <fct>
#>  1  1.34 0.624 A    
#>  2  1.41 1.15  A    
#>  3  1.60 1.08  A    
#>  4  1.70 1.24  A    
#>  5  1.78 1.32  A    
#>  6  1.86 1.42  A    
#>  7  1.60 1.30  A    
#>  8  1.58 1.51  A    
#>  9  1.64 1.55  A    
#> 10  1.69 1.61  A    
#> # ℹ 496 more rows

badboost
#> # A tibble: 150 × 3
#>        x     y class
#>    <dbl> <dbl> <fct>
#>  1 1.16   4.92 A    
#>  2 1.65   1.05 B    
#>  3 4.34   4.90 B    
#>  4 2.11   4.65 B    
#>  5 2.19   4.15 B    
#>  6 4.57   3.40 B    
#>  7 2.69   1.78 A    
#>  8 0.102  1.04 B    
#>  9 0.970  3.26 B    
#> 10 1.33   3.46 B    
#> # ℹ 140 more rows