|
69 | 69 | # - Training time (sec)
70 | 70 | #
71 | 71 | # * - Linear method (one-vs-rest)
72 |    | -# - 0.5171960144875225
73 |    | -# - 4.327306747436523
   | 72 | +# - 0.52
   | 73 | +# - 4.33
74 | 74 | #
75 | 75 | # * - Deep learning method (BERT)
76 |    | -# - 0.564618763137536
77 |    | -# - 5412.955321788788
   | 76 | +# - 0.56
   | 77 | +# - 5412.96
78 | 78 | #
79 | 79 | # Step 2. Training:
80 | 80 | # -----------------
|
|
120 | 120 | # - Macro-F1
121 | 121 | #
122 | 122 | # * - One-vs-rest
123 |     | -# - 0.5171960144875225
    | 123 | +# - 0.52
124 | 124 | #
125 | 125 | # * - Thresholding
126 |     | -# - 0.5643407144065415
    | 126 | +# - 0.56
127 | 127 | #
128 | 128 | # * - Cost-sensitive
129 |     | -# - 0.5704056980791481
    | 129 | +# - 0.57
130 | 130 | #
131 | 131 | # From the comparison, one can see that these techniques improve upon the naive method.
132 | 132 | #
|
|
139 | 139 | # Training models directly in this case may result in high runtime and space consumption.
140 | 140 | # A solution to reduce these costs is to utilize tree-based models.
141 | 141 | # Here we provide an example comparing a linear one-vs-rest model and a tree model on the EUR-Lex-57k dataset, which has a larger label space.
142 |     | -# We start by training a tree model following another detailed `tutorial <../auto_examples/plot_linear_tree_tutorial.html>`__.
    | 142 | +# We start by training a tree model following the `linear tree tutorial <../auto_examples/plot_linear_tree_tutorial.html>`__.
143 | 143 |
144 | 144 | datasets_eurlex = linear.load_dataset("txt", "data/eurlex57k/train.txt", "data/eurlex57k/test.txt")
145 | 145 | preprocessor_eurlex = linear.Preprocessor()
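The intuition behind the tree model's efficiency can be sketched with some back-of-the-envelope arithmetic. The branching factor, tree shape, and label count below are illustrative assumptions for this sketch, not LibMultiLabel's actual implementation or the real EUR-Lex-57k statistics:

```python
import math

def ovr_num_problems(num_labels):
    # flat one-vs-rest: one binary problem per label, each trained on all samples
    return num_labels

def tree_num_problems_per_sample(num_labels, branch=100):
    # balanced label tree with branching factor `branch` (assumed shape):
    # a sample passes through one node per level and only touches the
    # `branch` classifiers at that node, instead of all labels
    depth = math.ceil(math.log(num_labels, branch))
    return depth * branch

# with an illustrative 4000-label space, each sample participates in
# 4000 binary problems under flat one-vs-rest, but only depth * branch
# = 2 * 100 = 200 under the assumed tree
flat = ovr_num_problems(4000)
tree = tree_num_problems_per_sample(4000)
```

Under these assumptions the per-sample training work drops by roughly a factor of 20, which is the kind of gap the timing comparison below reflects.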
|
|
168 | 168 | #
169 | 169 | # It is clear that the tree model significantly improves efficiency.
170 | 170 | # As for deep learning, a similar improvement in efficiency can be observed.
171 |     | -# Details for the tree-based deep learning model can be found in this `tutorial <../tutorials/AttentionXML.html>`__.
    | 171 | +# Details for the tree-based deep learning model can be found in the `deep learning tree tutorial <../tutorials/AttentionXML.html>`__.
172 | 172 | #
173 | 173 | # Step 3. Evaluation: Pick Suitable Metrics
174 | 174 | # -----------------------------------------
|
|
203 | 203 | # -----------------------------
204 | 204 | # Models with suboptimal hyperparameters may lead to poor performance :cite:p:`JJL21a`.
205 | 205 | # Users can incorporate hyperparameter tuning into the training process.
206 |     | -# Because this functionality is more complex and cannot be adequately demonstrated within a code snippet, please refer to these two tutorials for more details about hyperparameter tuning (`linear <../auto_examples/plot_gridsearch_tutorial.html>`_
207 |     | -# and `deep learning <../tutorials/Parameter_Selection_for_Neural_Networks.html>`_).
    | 206 | +# Because this functionality is more complex and cannot be adequately demonstrated within a code snippet, please refer to these two tutorials for more details about hyperparameter tuning (`linear <../auto_examples/plot_linear_gridsearch_tutorial.html>`_ and `deep learning <../tutorials/Parameter_Selection_for_Neural_Networks.html>`_).
208 | 207 | # Another thing to consider is that hyperparameter search can be time-consuming, especially in the case of deep learning.
209 | 208 | # Users need to conduct this step with consideration of the available resources and time.
210 | 209 | #
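The core loop behind both referenced tuning tutorials can be sketched in library-agnostic form. The `train_eval` callback and the toy scoring function here are hypothetical stand-ins, not LibMultiLabel's API:

```python
from itertools import product

def grid_search(train_eval, grid):
    # exhaustive search: try every combination of hyperparameter values,
    # keep the configuration with the best validation score
    best_score, best_params = float("-inf"), None
    for values in product(*grid.values()):
        params = dict(zip(grid, values))
        score = train_eval(params)  # train a model, return validation score
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score

# toy scoring function standing in for "train and evaluate":
# it prefers C=1.0 and bigram features
toy_score = lambda p: -abs(p["C"] - 1.0) + (0.1 if p["ngram"] == 2 else 0.0)
best, score = grid_search(toy_score, {"C": [0.1, 1.0, 10.0], "ngram": [1, 2]})
```

Each call to `train_eval` is a full training run, which is why the grid size must be chosen with the available time and resources in mind, particularly for deep learning.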
|
|
214 | 213 | # To use as much information as possible, for linear methods, after determining the best hyperparameters, all available data are generally trained under these optimal hyperparameters to obtain the final model.
215 | 214 | # We refer to this as the "retrain" strategy.
216 | 215 | #
217 |     | -# For linear methods, the `tutorial <../auto_examples/plot_gridsearch_tutorial.html>`__ for hyperparameter search already handles retraining by default.
    | 216 | +# For linear methods, the `tutorial <../auto_examples/plot_linear_gridsearch_tutorial.html>`_ for hyperparameter search already handles retraining by default.
218 | 217 | # As for deep learning, since this additional step is not common in practice, we include it in the last section of this `tutorial <../tutorials/Parameter_Selection_for_Neural_Networks.html>`__.
219 | 218 | #
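The "retrain" strategy described above reduces to one extra fitting step. The `retrain` helper and the toy `fit` function below are hypothetical illustrations, not part of LibMultiLabel:

```python
def retrain(train_split, val_split, best_params, fit):
    # after hyperparameter search, fit the final model on ALL labeled data
    # (training plus validation splits) instead of the training split alone
    return fit(train_split + val_split, best_params)

# toy `fit`: the "model" is just the data size paired with the hyperparameters,
# enough to show that the final fit sees the combined data
model = retrain([1, 2, 3], [4, 5], {"C": 1.0}, lambda data, p: (len(data), p))
```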
|
220 | 219 | # Step 6. Prediction
|
|