
On the behavior of lbjava when dealing with real-valued features


There has been speculation that lbjava fails when using real-valued features.

Thanks to @Slash0BZ (Ben Zhou), we ran a comprehensive set of experiments on a few example problems, varying the number of real-valued features, the learning algorithm, and the number of training iterations. The tables below report classification accuracy (%).
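For context, a real-valued feature in an LBJ specification is simply a classifier declared with the `real` return type. The snippet below is a minimal sketch, loosely modeled on the `Post` class from the LBJava 20 newsgroups tutorial; `AverageWordLength` is a hypothetical feature name for illustration, not necessarily one of the features used in these experiments.

```java
// Sketch of a real-valued LBJ feature (hypothetical; not the exact
// feature definition used in the experiments below).
// Post, bodySize(), lineSize(), and getBodyWord() are the accessors
// from the LBJava 20 newsgroups tutorial.
real AverageWordLength(Post post) <- {
  double total = 0;
  int words = 0;
  for (int i = 0; i < post.bodySize(); ++i)
    for (int j = 0; j < post.lineSize(i); ++j) {
      total += post.getBodyWord(i, j).length();
      ++words;
    }
  return words == 0 ? 0 : total / words;
}
```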

NewsGroup (single real-valued feature)

| Condition \ Algorithm | SparseAveragedPerceptron | SparseWinnow | PassiveAggressive | SparseConfidenceWeighted | BinaryMIRA |
| --- | --- | --- | --- | --- | --- |
| 1 round w/o real features | 48.916 | 92.597 | 19.038 | – | 33.739 |
| 1 round w/ real features | 47.753 | 92.491 | 23.268 | – | 32.364 |
| 10 rounds w/o real features | 82.390 | 91.539 | 24.802 | – | 76.891 |
| 10 rounds w/ real features | 82.126 | 91.529 | 12.427 | – | 75.939 |
| 50 rounds w/o real features | 84.823 | 91.592 | 14.120 | – | 77.208 |
| 50 rounds w/ real features | 85.299 | 91.433 | 19.566 | – | 76.891 |
| 100 rounds w/o real features | 85.828 | 91.433 | 12.956 | – | 76.574 |
| 100 rounds w/ real features | 84.770 | 91.486 | 15.442 | – | 61.026 |

NewsGroup (as many random Gaussian real-valued features as discrete features)

| Condition \ Algorithm | SparseAveragedPerceptron | SparseWinnow | PassiveAggressive | BinaryMIRA |
| --- | --- | --- | --- | --- |
| 1 round w/o real features | 51.454 | 92.597 | 12.057 | 33.739 |
| 1 round w/ real features | 17.980 | 6.081 | 14.913 | 14.225 |
| 10 rounds w/o real features | 82.813 | 91.539 | 22.369 | 76.891 |
| 10 rounds w/ real features | 52.829 | 42.517 | – | 45.743 |
| 50 rounds w/o real features | 84.294 | 91.592 | 21.100 | 77.208 |
| 50 rounds w/ real features | 75.727 | 67.054 | – | 75.198 |
| 100 rounds w/o real features | 85.506 | 91.433 | 17.768 | 76.574 |
| 100 rounds w/ real features | 77.631 | 74.828 | – | 74.194 |

Badges (single real-valued feature)

| Condition \ Algorithm | SparsePerceptron | SparseWinnow | NaiveBayes |
| --- | --- | --- | --- |
| 1 round w/o real features | 100.0 | 95.745 | 100.0 |
| 1 round w/ real features | 100.0 | 95.745 | 100.0 |
| 10 rounds w/o real features | 100.0 | 100.0 | 100.0 |
| 10 rounds w/ real features | 100.0 | 100.0 | 100.0 |
| 50 rounds w/o real features | 100.0 | 100.0 | 100.0 |
| 50 rounds w/ real features | 100.0 | 100.0 | 100.0 |
| 100 rounds w/o real features | 100.0 | 100.0 | 100.0 |
| 100 rounds w/ real features | 100.0 | 100.0 | 100.0 |

Badges (as many constant real-valued features as discrete features)

| Condition \ Algorithm | SparsePerceptron | SparseWinnow | NaiveBayes |
| --- | --- | --- | --- |
| 1 round w/o real features | 100.0 | 95.745 | 100.0 |
| 1 round w/ real features | 74.468 | 100.0 | 100.0 |
| 10 rounds w/o real features | 100.0 | 100.0 | 100.0 |
| 10 rounds w/ real features | 78.723 | 100.0 | 100.0 |
| 50 rounds w/o real features | 100.0 | 100.0 | 100.0 |
| 50 rounds w/ real features | 100.0 | 100.0 | 100.0 |
| 100 rounds w/o real features | 100.0 | 100.0 | 100.0 |
| 100 rounds w/ real features | 100.0 | 100.0 | 100.0 |

Badges (as many random Gaussian real-valued features as discrete features)

| Condition \ Algorithm | SparsePerceptron | SparseWinnow | NaiveBayes |
| --- | --- | --- | --- |
| 1 round w/o real features | 100.0 | 95.745 | 100.0 |
| 1 round w/ real features | 55.319 | 56.383 | 100.0 |
| 10 rounds w/o real features | 100.0 | 100.0 | 100.0 |
| 10 rounds w/ real features | 62.766 | 100.0 | 100.0 |
| 50 rounds w/o real features | 100.0 | 100.0 | 100.0 |
| 50 rounds w/ real features | 74.468 | 87.234 | 100.0 |
| 100 rounds w/o real features | 100.0 | 100.0 | 100.0 |
| 100 rounds w/ real features | 86.170 | 100.0 | 100.0 |

The conclusion is that as more real-valued features are added, more training iterations are needed to train the system to comparable accuracy.
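If that holds for your task, the practical knob is the round count in the learner's `from` clause. Below is a minimal sketch, again reusing names from the 20 newsgroups tutorial (`NewsgroupLabel`, `BagOfWords`, `NewsgroupReader`) together with the hypothetical `AverageWordLength` feature above; the exact specification used to produce these tables may differ.

```java
// Sketch: combining a discrete feature generator with a real-valued
// feature and training for more rounds. All names other than the LBJ
// keywords are tutorial names or hypothetical stand-ins.
discrete NewsgroupClassifier(Post post) <-
learn NewsgroupLabel
  using BagOfWords, AverageWordLength  // discrete generator + real feature
  with SparseAveragedPerceptron
  from new NewsgroupReader("data/20news.train.shuffled") 100 rounds
end
```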
