Using multiple classifier visualizers #829

amueller · 2019-05-02T02:10:19Z

I feel like I'm still missing something wrt the interface.
It looks like PrecisionRecallCurve and ROCAUC fit the model before plotting, and the convenience function does so as well.
How would I plot both of them for a given classifier?

Thanks!

bbengfort · 2019-05-03T13:46:27Z

This is still one of our primary pain points that we're trying to resolve (e.g. #297, #623, #498, visual pipelines, model reports, etc.) and we haven't quite figured it out yet, but this is absolutely on our radar to handle soon! Many of our visualizers do require access to the training data in order to make visualization decisions and we haven't had the opportunity to dive in depth into the check_is_fitted logic to see how that would work for visualizers.

There are a couple of workarounds that I use in practice, though they tend to be specific to particular visualizers. For example, with PrecisionRecallCurve and ROCAUC, the PR curve needs access to fit and ROCAUC does not, so for ROCAUC you could simply bypass fit and go directly to score, e.g.:

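import matplotlib.pyplot as plt
from yellowbrick.classifier import PrecisionRecallCurve, ROCAUC

# Assumes clf is an already fitted classifier and the train/test splits exist
_, axes = plt.subplots(ncols=2)

prcurve = PrecisionRecallCurve(clf, ax=axes[0])
rocauc = ROCAUC(clf, ax=axes[1])

# The PR curve needs fit; ROCAUC skips fit and goes straight to score
prcurve.fit(X_train, y_train)
prcurve.score(X_test, y_test)
rocauc.score(X_test, y_test)

prcurve.finalize()
rocauc.finalize()
plt.show()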

Of course, this requires some knowledge of what fit does in each visualizer, so it's neither ideal nor a long-term solution. Another hack that I use fairly often is to wrap an already fitted model so that fit becomes a no-op (and yes, this is not a long-term solution either):

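class FittedEstimator(object):
    # Wraps an already fitted model so that calling fit does not refit it

    def __init__(self, model):
        self.model = model

    def fit(self, X, y):
        # No-op: the wrapped model is already fitted
        return self

    def predict(self, X):
        return self.model.predict(X)

    def score(self, X, y):
        return self.model.score(X, y)

    def get_params(self, deep=True):
        # Accept deep, matching the scikit-learn get_params signature
        return self.model.get_params(deep=deep)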

But potentially they may be helpful in the interim!
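As a rough illustration of the wrapper idea, here is a minimal sketch of a variant that forwards everything except fit to the wrapped model, so that attributes such as predict_proba or classes_ stay reachable. The name PassthroughFittedEstimator is hypothetical and not part of Yellowbrick, and whether a given visualizer accepts such a wrapper has not been verified here.

```python
class PassthroughFittedEstimator:
    """Hypothetical wrapper: fit is a no-op, everything else is
    forwarded to the wrapped, already fitted model."""

    def __init__(self, model):
        self.model = model

    def fit(self, X, y=None, **kwargs):
        # Intentionally do nothing: the wrapped model is already fitted
        return self

    def __getattr__(self, name):
        # __getattr__ only runs when normal lookup fails, so fit and
        # model resolve normally and all other names are delegated.
        if name == "model":
            # Guard against infinite recursion if model was never set
            raise AttributeError(name)
        return getattr(self.model, name)
```

Delegating through __getattr__ avoids listing each method by hand, at the cost of hiding which parts of the estimator interface are actually being used.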

Our ideal solutions are:

Visualizers check if the model is fitted before fitting it using check_is_fitted
Visual pipelines compose multiple visualizers for one model evaluation

However, if you think that this is critical, we could add an is_fitted=True flag onto the ModelVisualizers and quick methods as a temporary solution.
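For the first of the ideal solutions listed above, a minimal sketch of "check before fitting" might look like the following. It relies on scikit-learn's check_is_fitted and NotFittedError, and assumes a scikit-learn version in which check_is_fitted can be called without an explicit attribute list. The helper names is_fitted and fit_if_needed are hypothetical, not Yellowbrick API.

```python
from sklearn.exceptions import NotFittedError
from sklearn.utils.validation import check_is_fitted


def is_fitted(estimator):
    """Return True if scikit-learn considers the estimator fitted."""
    try:
        check_is_fitted(estimator)
    except NotFittedError:
        return False
    return True


def fit_if_needed(estimator, X, y):
    """Fit the estimator only when it has not been fitted yet (hypothetical helper)."""
    if not is_fitted(estimator):
        estimator.fit(X, y)
    return estimator
```

A visualizer's fit could call something like fit_if_needed on its wrapped model, which would leave an already fitted classifier untouched.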

amueller · 2019-05-03T17:13:46Z

Thanks for your input. Can you say why the PrecisionRecallCurve needs access to fit? I guess the current behavior is just not very coherent with my mental model of plotting. I don't see why you decided to include fitting into the plotting process in the first place. I'm trying to see if I can reuse some of yellowbrick for dabl but it seems easier to rewrite everything given the limitations of the API.

amueller · 2019-05-03T17:16:36Z

The interface I would use for any plotting function would be fitted_estimator, X_val, y_val. The main use-case I care about is not so much computing multiple metrics (though that's an important one), it's just plotting anything on a fitted estimator. I don't see why you would decide on the visualization before the fitting, or why you would decide to use a single visualizer.
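To make the proposed (fitted_estimator, X_val, y_val) signature concrete, here is a minimal sketch of a plotting function in that style, using only scikit-learn metrics and matplotlib. The function name plot_binary_roc is hypothetical, and the sketch assumes a binary classifier that exposes predict_proba; it is not how Yellowbrick or dabl implement this.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import roc_auc_score, roc_curve


def plot_binary_roc(fitted_estimator, X_val, y_val, ax=None):
    """Plot a ROC curve for an already fitted binary classifier."""
    if ax is None:
        _, ax = plt.subplots()
    # Assumption: the second predict_proba column is the positive class
    scores = fitted_estimator.predict_proba(X_val)[:, 1]
    fpr, tpr, _ = roc_curve(y_val, scores)
    auc = roc_auc_score(y_val, scores)
    ax.plot(fpr, tpr, label=f"ROC (AUC = {auc:.2f})")
    # Diagonal reference line for a random classifier
    ax.plot([0, 1], [0, 1], linestyle="--", color="gray")
    ax.set_xlabel("False positive rate")
    ax.set_ylabel("True positive rate")
    ax.legend(loc="lower right")
    return ax
```

Because the function never calls fit, several such plots can be drawn for the same fitted classifier by passing different axes.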

amueller · 2019-05-03T18:02:09Z

I think this can be closed as a duplicate of #297. Thanks!

bbengfort added the "type: question" (more information is required) label on May 3, 2019

amueller closed this as completed May 3, 2019
