Deployment
==========

When deploying your dashboard it is better not to use the built-in Flask
development server, but a more robust production server such as ``gunicorn`` or ``waitress``.
`gunicorn <https://gunicorn.org/>`_ is probably a bit more fully featured and
faster, but only works on unix/linux/osx, whereas
`waitress <https://docs.pylonsproject.org/projects/waitress/en/stable/>`_ also works
on Windows and has very minimal dependencies.

Install with either ``pip install gunicorn`` or ``pip install waitress``.

Storing explainer and running default dashboard with gunicorn
=============================================================

Before you start a dashboard with gunicorn you need to store both the explainer
instance and a configuration for the dashboard::

    from explainerdashboard import ClassifierExplainer, ExplainerDashboard

    explainer = ClassifierExplainer(model, X, y)
    db = ExplainerDashboard(explainer, title="Cool Title", shap_interaction=False)
    db.to_yaml("dashboard.yaml", explainerfile="explainer.joblib", dump_explainer=True)

Now you can re-load your dashboard and expose a flask server as ``app`` in ``dashboard.py``::

    from explainerdashboard import ExplainerDashboard

    db = ExplainerDashboard.from_config("dashboard.yaml")
    app = db.flask_server()


.. highlight:: bash

If you named the file above ``dashboard.py``, you can now start the gunicorn server::

    $ gunicorn dashboard:app

If you want to run the server with, for example, three workers, binding to
port ``8050``, you launch gunicorn with::

    $ gunicorn -w 3 -b localhost:8050 dashboard:app

If you now point your browser to ``http://localhost:8050`` you should see your dashboard.
The next step is finding a nice url in your organization's domain and forwarding it
to your dashboard server.

With waitress you would call::
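
    # assuming the same dashboard.py module as in the gunicorn example above
    $ waitress-serve --port=8050 dashboard:app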

You need to pass the Flask ``server`` instance and the ``url_base_pathname`` to the
``ExplainerDashboard``, and the dashboard itself can then be found
under ``db.app.index``::

    from flask import Flask

    app = Flask(__name__)

    [...]

    db = ExplainerDashboard(explainer, server=app, url_base_pathname="/dashboard/")

    @app.route('/dashboard')
    def return_dashboard():
        return db.app.index()

.. highlight:: bash

Now you can start the dashboard by::
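
    # illustrative: assumes the flask app above is saved as dashboard.py,
    # with the same flags as in the earlier gunicorn example
    $ gunicorn -b localhost:8050 dashboard:app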

Deploying to heroku
===================

If you would like to deploy to `heroku <www.heroku.com>`_, this is normally
the simplest option for dash apps (see the
`dash instructions here <https://dash.plotly.com/deployment>`_). The demonstration
dashboard is also hosted on heroku at `titanicexplainer.herokuapp.com <http://titanicexplainer.herokuapp.com>`_.

In order to deploy to heroku there are a few things to keep in mind. First of
all you need to add ``explainerdashboard`` and ``gunicorn`` to
``requirements.txt`` (pinning the versions is recommended to force a new build of
your environment whenever you upgrade)::
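
    # illustrative pins - use the versions you actually built with
    explainerdashboard==0.4.7
    gunicorn==23.0.0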

Then specify the python version that you used to build
your explainer in ``runtime.txt``::

    python-3.8.6

(Supported versions as of this writing are ``python-3.9.0``, ``python-3.8.6``,
``python-3.7.9`` and ``python-3.6.12``, but check the
`heroku documentation <https://devcenter.heroku.com/articles/python-support#supported-runtimes>`_
for the latest.)

And you need to tell heroku how to start your server in ``Procfile``::

    web: gunicorn dashboard:app
Graphviz buildpack
------------------

If you want to visualize individual trees inside your ``RandomForest`` or ``xgboost``
model using the ``dtreeviz`` package, you will
need to make sure that ``graphviz`` is installed on your ``heroku`` dyno by
adding the following buildpack (as well as the ``python`` buildpack):
``https://github.com/weibeld/heroku-buildpack-graphviz.git``

(You can add buildpacks through the "settings" page of your heroku project.)
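
Alternatively, with the `heroku CLI <https://devcenter.heroku.com/articles/heroku-cli>`_
you can add the buildpacks from the command line (a sketch; run it from your
project's directory)::

    $ heroku buildpacks:add heroku/python
    $ heroku buildpacks:add https://github.com/weibeld/heroku-buildpack-graphviz.git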

E.g. **generate_dashboard.py**::

    from sklearn.ensemble import RandomForestClassifier

    from explainerdashboard import ClassifierExplainer, ExplainerDashboard
    from explainerdashboard.datasets import titanic_survive, feature_descriptions

    X_train, y_train, X_test, y_test = titanic_survive()
    model = RandomForestClassifier(n_estimators=50, max_depth=5).fit(X_train, y_train)

    explainer = ClassifierExplainer(model, X_test, y_test,
                    cats=["Sex", 'Deck', 'Embarked'],
                    labels=['Not Survived', 'Survived'],
                    descriptions=feature_descriptions)

    # For sklearn/imblearn pipeline models you can alternatively use:
    # explainer = ClassifierExplainer(
    #     pipeline_model, X_test, y_test,
    #     strip_pipeline_prefix=True,
    #     auto_detect_pipeline_cats=True)

    db = ExplainerDashboard(explainer)
    db.to_yaml("dashboard.yaml", explainerfile="explainer.joblib", dump_explainer=True)

Reducing memory usage
=====================

If you deploy the dashboard with a large dataset with a large number of rows (``n``)
and a large number of columns (``m``),
it can use up quite a bit of memory: the dataset itself, shap values,
shap interaction values and any other calculated properties are all kept in
memory in order to make the dashboard responsive. You can check the (approximate)
memory usage with ``explainer.memory_usage()``. In order to reduce the memory
footprint there are a number of things you can do:

1. Not including the shap interaction tab.
   Shap interaction values have shape ``n*m*m``, so they can take a substantial amount
   of memory, especially if you have a significant number of columns ``m``.
2. Setting a lower precision.
   By default shap values are stored as ``'float64'``,
   but you can store them as ``'float32'`` instead and save half the space:
   ``ClassifierExplainer(model, X_test, y_test, precision='float32')``. You
   can also set a lower precision on your ``X_test`` dataset yourself of course.
3. Dropping non-positive class shap values.
   For multi class classifiers, by default ``ClassifierExplainer`` calculates
   shap values for all classes. If you are only interested in a single class
   you can drop the other shap values with ``explainer.keep_shap_pos_label_only(pos_label)``.
4. Storing row data externally and loading it on the fly.
   You can for example only store a subset of ``10,000`` rows in
   the ``explainer`` itself (enough to generate representative importance and dependence plots),
   and store the rest of your millions of rows of input data in an external file
   or database from which they are loaded one by one with the following functions:

   - with ``explainer.set_X_row_func()`` you can set a function that takes
     an ``index`` as argument and returns a single row dataframe with model
     compatible input data for that index. This function can include a query
     to a database or a file read.
   - with ``explainer.set_y_func()`` you can set a function that takes
     an ``index`` as argument and returns the observed outcome ``y`` for
     that index.
   - with ``explainer.set_index_list_func()`` you can set a function
     that returns a list of available indexes that can be queried.

   If the number of indexes is too long to fit in a dropdown you can pass
   ``index_dropdown=False`` which turns the dropdowns into free text fields.
   Instead of an ``index_list_func`` you can also set an
   ``explainer.set_index_check_func(func)`` which should return a bool indicating
   whether the ``index`` exists or not.

   Important: these functions can be called multiple times by multiple independent
   components, so it is probably best to implement some kind of caching functionality.
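
The external row-loading setup in point 4 can be sketched as follows (a minimal
sketch: the toy dataframe stands in for your external file or database, and the
``set_..._func`` registrations are shown as comments since they require a live
explainer)::

    import pandas as pd

    # toy stand-in for an external file or database with millions of rows
    df = pd.DataFrame({"Age": [22, 38], "Fare": [7.25, 71.28], "Survived": [0, 1]},
                      index=["Braund", "Cumings"])

    def X_row_func(index):
        # return a single-row dataframe with model-compatible input data
        return df.loc[[index], ["Age", "Fare"]]

    def y_func(index):
        # return the observed outcome for this index
        return df.loc[index, "Survived"]

    def index_list_func():
        # return the list of available indexes
        return list(df.index)

    # these would then be registered on the explainer:
    # explainer.set_X_row_func(X_row_func)
    # explainer.set_y_func(y_func)
    # explainer.set_index_list_func(index_list_func)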

Setting logins and password
===========================

``ExplainerDashboard`` supports `dash basic auth functionality <https://dash.plotly.com/authentication>`_.
``ExplainerHub`` uses ``flask_simple_login`` for its user authentication.

You can simply add a list of logins to the ``ExplainerDashboard`` to force a login
and prevent random users from accessing the details of your model dashboard::

    ExplainerDashboard(explainer, logins=[['login1', 'password1'], ['login2', 'password2']]).run()

:ref:`ExplainerHub<ExplainerHub>` has somewhat more intricate user management
using ``FlaskLogin``, but the basic syntax is the same. See the
:ref:`ExplainerHub documentation<ExplainerHub>` for more details::

    hub = ExplainerHub([db1, db2], logins=[['login1', 'password1'], ['login2', 'password2']])

Make sure not to check these login/password pairs into version control though;
store them somewhere safe! ``ExplainerHub`` stores passwords in a hashed
format by default.
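
One way to keep these login/password pairs out of your source code (a minimal
sketch; the environment variable names are illustrative) is to read them from
environment variables in your deployment environment::

    import os

    # DASHBOARD_USER / DASHBOARD_PASSWORD are hypothetical variable names:
    # set them in your deployment environment instead of hardcoding credentials
    logins = [[os.environ.get("DASHBOARD_USER", "admin"),
               os.environ.get("DASHBOARD_PASSWORD", "change-me")]]

    # then e.g.: ExplainerDashboard(explainer, logins=logins).run()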


Automatically restart gunicorn server upon changes
==================================================

We can use the ``explainerdashboard`` CLI tools to automatically rebuild our
explainer whenever there is a change to the underlying
model, dataset or explainer configuration. And we can use ``kill -HUP gunicorn.pid``
to force gunicorn to restart and reload whenever a new ``explainer.joblib``
is generated or the dashboard configuration ``dashboard.yaml`` changes. These two
processes together ensure that the dashboard automatically updates whenever there
are underlying changes.

First we store the explainer config in ``explainer.yaml`` and the dashboard
config in ``dashboard.yaml``. We also indicate which modelfiles and datafiles the
explainer depends on, and which columns in the datafile should be used as
a target and which as index::

    explainer = ClassifierExplainer(model, X, y, labels=['Not Survived', 'Survived'])
    explainer.dump("explainer.joblib")
    explainer.to_yaml("explainer.yaml",
                modelfile="model.pkl",
                datafile="data.csv",
                index_col="Name",
                target_col="Survived")  # target/index columns as described above

.. highlight:: bash

Now we would like to rebuild the ``explainer.joblib`` file whenever there is a
change to ``model.pkl``, ``data.csv`` or ``explainer.yaml`` by running
``explainerdashboard build``, and restart the ``gunicorn`` server whenever
there is a change in ``explainer.joblib`` or ``dashboard.yaml`` by killing
the gunicorn server with ``kill -HUP pid``. To do that we need to install
the python package ``watchdog`` (``pip install watchdog[watchmedo]``). This
package can keep track of file changes and execute shell scripts whenever files change.

So we can start the gunicorn server and the two watchdog filechange trackers
from a shell script ``start_server.sh``::
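
    # Illustrative sketch: the exact watchmedo patterns and commands here are
    # assumptions; adjust filenames and the bind address to your setup.
    trap "kill 0" EXIT  # kill all background processes when this script exits

    gunicorn --pid gunicorn.pid -b localhost:8050 dashboard:app &
    watchmedo shell-command -p "./model.pkl;./data.csv;./explainer.yaml" \
        -c "explainerdashboard build explainer.yaml" &
    watchmedo shell-command -p "./explainer.joblib;./dashboard.yaml" \
        -c 'kill -HUP $(cat gunicorn.pid)' &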

    wait # wait till user hits ctrl-c to exit and kill all three processes

Now we can simply run ``chmod +x start_server.sh`` and ``./start_server.sh`` to
get our server up and running.

Whenever we now make a change to either one of the source files
(``model.pkl``, ``data.csv`` or ``explainer.yaml``), this produces a fresh
``explainer.joblib``. And whenever there is a change to either ``explainer.joblib``
or ``dashboard.yaml``, gunicorn restarts and rebuilds the dashboard.

So you can keep an explainerdashboard running without interruption: simply drop
an updated ``model.pkl`` or a fresh dataset ``data.csv`` into the directory and
the dashboard will automatically update.