This repository was archived by the owner on Jun 22, 2022. It is now read-only.
Home
Jakub edited this page Jun 25, 2018 · 24 revisions
- Preprocessing:
- drop constant columns
- drop duplicate columns
- drop columns where zero over % of time
- Feature Extraction:
- as is
- Model:
- lightGBM raw 1.39 CV 1.43 Public LB
- zero treated as missing
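
The three preprocessing steps above can be sketched with pandas as follows. This is a minimal illustration, not the code used for the reported scores: the function name `drop_uninformative_columns` and the `max_zero_fraction` threshold are hypothetical, since the notes do not state the percentage that was used.

```python
import pandas as pd


def drop_uninformative_columns(df, max_zero_fraction=0.99):
    """Drop constant, duplicate, and mostly-zero columns from a numeric frame.

    max_zero_fraction is a hypothetical threshold; the notes do not state one.
    """
    # 1. Constant columns: a single distinct value carries no signal.
    df = df.loc[:, df.nunique(dropna=False) > 1]

    # 2. Duplicate columns: keep the first of each group of identical columns.
    df = df.loc[:, ~df.T.duplicated()]

    # 3. Sparse columns: drop those that are zero in more than the given share of rows.
    zero_fraction = (df == 0).mean(axis=0)
    return df.loc[:, zero_fraction <= max_zero_fraction]


# Toy frame standing in for the training data.
train = pd.DataFrame({
    "const": [5, 5, 5, 5],
    "a": [1, 0, 3, 0],
    "a_copy": [1, 0, 3, 0],
    "mostly_zero": [0, 0, 0, 7],
    "b": [2, 4, 0, 8],
})
cleaned = drop_uninformative_columns(train, max_zero_fraction=0.5)
print(list(cleaned.columns))
```

Transposing the frame to find duplicate columns is simple but memory-hungry on wide data; hashing each column would be a lighter alternative.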
- Preprocessing:
- Feature Extraction:
- is_missing dummy table
- Model:
- lightGBM is_missing 1.6 CV 1.77 Public LB
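
One plausible reading of the is_missing dummy table, assuming that "missing" means a zero cell (as in the "zero treated as missing" note), is a 0/1 frame with the same shape as the raw data. The function name and the `_is_missing` suffix below are hypothetical.

```python
import pandas as pd


def is_missing_table(df):
    """Return a 0/1 frame marking the cells that are zero (treated as missing)."""
    indicators = (df == 0).astype(int)
    # Hypothetical naming choice so the columns can sit next to the raw ones.
    return indicators.add_suffix("_is_missing")


# Toy frame standing in for the raw features.
raw = pd.DataFrame({"f1": [0.0, 2.5, 0.0], "f2": [1.0, 0.0, 3.0]})
binarized = is_missing_table(raw)
print(binarized)
```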
- Preprocessing:
- Feature Extraction:
- row aggregation features mean/max/min/std/count_non_zero/fraction_non_zero
- Model:
- lightGBM agg 1.37 CV
- lightGBM raw + agg 1.44 CV
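
The row aggregations listed above could be computed roughly like this. The sketch assumes the statistics are taken over all columns, zeros included, and uses the population standard deviation; the notes do not say which variant was used, and the function name `row_aggregations` is hypothetical.

```python
import pandas as pd


def row_aggregations(df):
    """Compute per-row summary features over all columns (zeros included)."""
    values = df.to_numpy(dtype=float)
    non_zero = values != 0
    return pd.DataFrame({
        "mean": values.mean(axis=1),
        "max": values.max(axis=1),
        "min": values.min(axis=1),
        "std": values.std(axis=1),  # population std (ddof=0); an assumption
        "count_non_zero": non_zero.sum(axis=1),
        "fraction_non_zero": non_zero.mean(axis=1),
    }, index=df.index)


# Toy frame standing in for the raw features.
raw = pd.DataFrame({
    "f1": [0.0, 2.0],
    "f2": [4.0, 2.0],
    "f3": [0.0, 2.0],
    "f4": [0.0, 2.0],
})
agg = row_aggregations(raw)

# "raw + agg" variant: aggregations appended to the original columns.
raw_plus_agg = pd.concat([raw, agg], axis=1)
print(raw_plus_agg.shape)
```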
- Preprocessing:
- Feature Extraction:
- truncated svd projection
  truncated_svd__n_components: 50
  truncated_svd__n_iter: 10
- pca projection
  pca__n_components: 100
- fast ica projection
  fast_ica__n_components: 15
- factor analysis
  factor_analysis__n_components: 50
- gaussian random projection
  gaussian_random_projection__n_components: 100
  gaussian_projection__eps: 0.1
  Note: as it turns out, the eps parameter doesn't matter (tried 0.01, 0.1, 1.0), with exact same results
- sparse random projection
  sparse_random_projection__n_components: 50
- Model:
- lightGBM truncated svd 1.56 CV
- lightGBM pca 1.55 CV
- lightGBM fast ica 1.57 CV
- lightGBM factor analysis 1.51 CV
- lightGBM gaussian random projection 1.63 CV
- lightGBM sparse random projection 1.47 CV
- lightGBM projections CV
- lightGBM raw + projections CV
- lightGBM raw + projections + aggregations CV
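
A sketch of the six projections, assuming they map onto the scikit-learn estimators of the same names (the notes do not name the library). The component counts and `n_iter` come from the notes; the synthetic matrix `X`, the `random_state` values, and the variable names are illustrative only. In scikit-learn, `eps` only takes effect when `n_components="auto"`, which would be consistent with the observation that changing it had no effect at a fixed number of components.

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis, FastICA, TruncatedSVD
from sklearn.random_projection import GaussianRandomProjection, SparseRandomProjection

rng = np.random.RandomState(0)
# Synthetic, mostly-zero stand-in for the training matrix (rows x features).
X = rng.rand(300, 200) * (rng.rand(300, 200) > 0.7)

projections = {
    "truncated_svd": TruncatedSVD(n_components=50, n_iter=10, random_state=0),
    "pca": PCA(n_components=100, random_state=0),
    "fast_ica": FastICA(n_components=15, random_state=0),
    "factor_analysis": FactorAnalysis(n_components=50, random_state=0),
    "gaussian_random_projection": GaussianRandomProjection(
        n_components=100, eps=0.1, random_state=0
    ),
    "sparse_random_projection": SparseRandomProjection(n_components=50, random_state=0),
}

# Fit each projection and keep the projected features by name.
projected = {name: model.fit_transform(X) for name, model in projections.items()}

# "raw + projections" variant: projected features appended to the raw columns.
raw_plus_projections = np.hstack([X] + list(projected.values()))

# With n_components fixed, two different eps values give the same projection
# (the eps values here are illustrative).
a = GaussianRandomProjection(n_components=100, eps=0.01, random_state=0).fit_transform(X)
b = GaussianRandomProjection(n_components=100, eps=0.5, random_state=0).fit_transform(X)
eps_has_no_effect = np.array_equal(a, b)
```

FastICA may emit a convergence warning on data like this; raising `max_iter` is the usual remedy.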
Check our GitHub organization https://github.com/neptune-ml for more cool stuff 😃
Kamil & Kuba, core contributors
- honey bee 🐝 LightGBM and 5fold CV
- beetle 🪲 LightGBM on binarized dataset
- dromedary camel 🐪 LightGBM with row aggregations
- whale 🐳 LightGBM on dimension reduced dataset
- water buffalo 🐃 Exploring various dimension reduction techniques
- blowfish 🐡 bucketing row aggregations