The package is built on the concept of similarity maps. Similarity maps are concise representations of potential associations between fields in two datasets. Entities from two datasets can be matched using one or more pairs of fields (one from each dataset). Each field pair can have one or more ways to compute the similarity between the values of the fields.
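
To make the idea concrete, the sketch below shows one way a similarity map could be written down. It is a hypothetical illustration rather than a verbatim excerpt of the package’s API: the field names, the `~` notation for pairing differently named columns, and the similarity labels are assumptions, and the exact arguments accepted by `SimilarityMap` may differ.

```python
# Hypothetical sketch of a similarity map: each entry associates a field from
# the Left dataset with a field from the Right dataset (written as
# "left_column~right_column", or a single name shared by both datasets) and
# lists one or more similarity measures for the pair. All names below are
# illustrative placeholders.
from neer_match.similarity_map import SimilarityMap

instructions = {
    "title": ["jaro_winkler", "levenshtein"],  # same column name in both datasets
    "developer~dev": ["jaro_winkler"],         # differently named columns
    "year": ["euclidean", "discrete"],         # numeric field with two measures
}
smap = SimilarityMap(instructions)
```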

Similarity maps are used to automate the construction of entity matching models and to facilitate the reasoning capabilities of the package. More details on the concept of similarity maps and an early implementation of the package’s functionality (without neural-symbolic components) are given by [@karapanagiotis2023].

The training loops for both deep and symbolic learning models are implemented in [tensorflow](https://www.tensorflow.org) [@tensorflow2015]. The pure deep learning model inherits from the [keras](https://keras.io) model class [@keras2015]. The neural-symbolic model is implemented using the logic tensor network ([LTN](https://pypi.org/project/ltn/)) framework [@badreddine2022]. Pure neural-symbolic and hybrid models do not inherit directly from the [keras](https://keras.io) model class, but they emulate its behavior by providing custom `compile`, `fit`, `evaluate`, and `predict` methods, so that all model classes in `neermatch` have a uniform calling interface.

## Auxiliary Features

In addition, the package offers explainability functionality customized for the needs of matching problems. The default explainability behavior is built on the information provided by the similarity map. From a global explainability perspective, the package can be used to calculate partial matching dependencies and accumulated local effects on similarities. From a local explainability perspective, the package can be used to calculate local interpretable model-agnostic matching explanations and Shapley matching values.

To train the model you need to provide three datasets. Two datasets should contain the records representing the entities to be matched. By convention, the first dataset is called the Left dataset and the second the Right dataset in the package’s documentation. The third dataset should contain the ground truth labels for the matching entities. The ground truth dataset should have two columns, one for the index of the entity in the Left dataset and one for the index of the entity in the Right dataset.
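
As a toy illustration of this layout, the snippet below builds three small pandas data frames, which is one natural way to hold such data. It is a hedged sketch: the record fields are made up, and the column names used for the indices in the ground truth frame (`left` and `right`) are assumptions rather than names prescribed by the package.

```python
import pandas as pd

# Toy Left and Right datasets with the records to be matched.
left = pd.DataFrame({
    "title": ["Super Mario Bros.", "The Legend of Zelda"],
    "year": [1985, 1986],
})
right = pd.DataFrame({
    "title": ["super mario bros", "zelda, the legend of"],
    "year": [1985, 1986],
})

# Ground truth labels: each row pairs the index of a record in the Left
# dataset with the index of its match in the Right dataset.
# The column names "left" and "right" are illustrative assumptions.
matches = pd.DataFrame({
    "left": [0, 1],
    "right": [0, 1],
})
```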

```{python}
#| label: data-setup
#| include: false

import os
import sys
sys.path.append("../../../")
import test

def prepare_data():
    return test.left, test.right, test.matches
```

```{python}
#| label: usage

from neer_match.similarity_map import SimilarityMap
from neer_match.matching_model import NSMatchingModel
import tensorflow as tf

# 0) replace this with your own data preprocessing function
left, right, matches = prepare_data()

# 1) customize according to the fields in your data
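# ... (the similarity map definition, model construction, and the
#      compile/fit calls are omitted here) ...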
#>>> Training finished at Epoch 99 with DL loss 0.9324 and Sat 0.9020
```
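
After training, the same uniform calling interface can be used for evaluation and prediction. The calls below are a hedged sketch, assuming `model` is the fitted matching model from the example above: every model class provides `evaluate` and `predict` methods, but the argument lists shown here are assumptions that mirror `fit` and may differ from the actual signatures.

```python
# Hedged sketch: the arguments are assumed to mirror fit(left, right, matches).
metrics = model.evaluate(left, right, matches)
print(metrics)

# Predicted match scores for record pairs from the two datasets (assumed signature).
scores = model.predict(left, right)
```
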
# Installation
You can obtain the sources for the development version of `neermatch` from its github [repository](https://github.com/pi-kappa-devel/py-neer-match).
## Reproducing Documentation from Source
Make sure to build and install the package with the latest modifications before building the documentation. The documentation website uses [sphinx](https://www.sphinx-doc.org/). To build the documentation, from `<project-root>/docs`, execute
```bash
make html
```

The package is distributed under the [MIT license](LICENSE.txt).