Commit 62a4e04

README and License files for documentation generated during build.
1 parent cffa3c5 · commit 62a4e04

19 files changed: +233 -123 lines

.github/workflows/deploy-docs.yaml

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ jobs:
           wget https://github.com/quarto-dev/quarto-cli/releases/download/v1.6.33/quarto-1.6.33-linux-amd64.deb
           sudo dpkg -i quarto-1.6.33-linux-amd64.deb
           python -m pip install --upgrade pip
-          python -m pip install build itables jupyter myst-parser setuptools sphinx sphinx-autodoc-typehints
+          python -m pip install build jupyter myst-parser setuptools sphinx sphinx-autodoc-typehints sphinx-book-theme
 
       - name: Build Sphinx docs
         run: |
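Aside (not part of the committed diff): the workflow now installs `sphinx-book-theme` and drops `itables`. The theme configuration itself is not shown in this commit; a minimal sketch of how such a theme is typically enabled in `docs/source/conf.py` follows. The option names are standard Sphinx settings, and the extension list is only an assumption based on the packages installed above.

```python
# Hypothetical excerpt of docs/source/conf.py (not part of this commit): once
# sphinx-book-theme is installed, it only needs to be declared as the HTML theme.
project = "py-neer-match"
extensions = [
    "myst_parser",                # Markdown/MyST sources, installed in the workflow
    "sphinx.ext.autodoc",         # pull API documentation from docstrings
    "sphinx_autodoc_typehints",   # render type hints in the API documentation
]
html_theme = "sphinx_book_theme"  # the theme added to the pip install line above
html_static_path = ["_static"]
```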

docs/Makefile

Lines changed: 7 additions & 4 deletions
@@ -18,8 +18,11 @@ help:
 # "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
 %: Makefile
 	cp "$(SOURCEDIR)/../../LICENSE.txt" "$(SOURCEDIR)/LICENSE.md"
-	cp "$(SOURCEDIR)/../../README.md" "$(SOURCEDIR)/README.md"
-	sed -i 's/LICENSE.txt/LICENSE.md/g' "$(SOURCEDIR)/README.md"
-	sed -i 's/<a.*hex-logo.png.*<\/a>//g' "$(SOURCEDIR)/README.md"
+	quarto render "$(SOURCEDIR)/_static/README.qmd" -t gfm --output-dir ../
+	cp "$(SOURCEDIR)/README.md" ../README.md
+	sed -i 's/\[MIT license\](LICENSE.txt)/<a href="LICENSE.html">MIT license<\/a>/g' "$(SOURCEDIR)/README.md"
+	sed -i 's/docs\/source\///g' "$(SOURCEDIR)/README.md"
 	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
-	quarto render "$(SOURCEDIR)/_static/examples/dl-matching.qmd" --output-dir ../../../build/html/
+	quarto render "$(SOURCEDIR)/_static/examples/dl-matching.qmd" -t gfm
+	mv "$(SOURCEDIR)/_static/examples/dl-matching.md" "$(SOURCEDIR)/dl-matching.md"
+	sed -i 's/dl-matching_files/_static\/examples\/dl-matching_files/g' "$(SOURCEDIR)/dl-matching.md"
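Aside (not part of the committed diff): the two `sed` calls above rewrite the rendered README so its links resolve inside the Sphinx site. The same rewrites are shown below in Python for illustration; the input string is a made-up example modeled on the README's license sentence, not a line taken from the repository.

```python
import re

# Illustration of what the Makefile's sed substitutions do to one README line.
line = "The package is distributed under the [MIT license](LICENSE.txt), see docs/source/_static/img/hex-logo.png."

# [MIT license](LICENSE.txt) -> an HTML link to the Sphinx-rendered LICENSE page.
line = re.sub(r"\[MIT license\]\(LICENSE\.txt\)", '<a href="LICENSE.html">MIT license</a>', line)

# docs/source/ prefixes are dropped because the rendered README sits next to the sources.
line = re.sub(r"docs/source/", "", line)

print(line)
# The package is distributed under the <a href="LICENSE.html">MIT license</a>, see _static/img/hex-logo.png.
```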
Lines changed: 34 additions & 60 deletions
@@ -1,4 +1,12 @@
-# Neer Match <a href="https://py-neer-match.pikappa.eu"><img src="docs/source/_static/img/hex-logo.png" align="right" height="139" alt="neermatch website" /></a>
+---
+title: "Neer Match"
+self-contained: true
+resource-path:
+  - "../../../"
+bibliography: bibliography.bib
+---
+
+<a href="https://py-neer-match.pikappa.eu" style="float:right;margin-left:10px;"><img src="docs/source/_static/img/hex-logo.png" align="right" height="139" alt="neermatch website" /></a>
 
 <!-- badges: start -->
 ![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)
@@ -14,9 +22,9 @@ The package has also an `R` implementation available at [r-neer-match](https://g
 
 The package is built on the concept of similarity maps. Similarity maps are concise representations of potential associations between fields in two datasets. Entities from two datasets can be matched using one or more pairs of fields (one from each dataset). Each field pair can have one or more ways to compute the similarity between the values of the fields.
 
-Similarity maps are used to automate the construction of entity matching models and to facilitate the reasoning capabilities of the package. More details on the concept of similarity maps and an early implementation of the package’s functionality (without neural-symbolic components) are given by (Karapanagiotis and Liebald 2023).
+Similarity maps are used to automate the construction of entity matching models and to facilitate the reasoning capabilities of the package. More details on the concept of similarity maps and an early implementation of the package’s functionality (without neural-symbolic components) are given by [@karapanagiotis2023].
 
-The training loops for both deep and symbolic learning models are implemented in [tensorflow](https://www.tensorflow.org) (see Abadi et al. 2015). The pure deep learning model inherits from the [keras](https://keras.io) model class (Chollet et al. 2015). The neural-symbolic model is implemented using the logic tensor network ([LTN](https://pypi.org/project/ltn/)) framework (Badreddine et al. 2022). Pure neural-symbolic and hybrid models do not inherit directly from the (Chollet et al. 2015) model class, but they emulate the behavior by providing custom `compile`, `fit`, `evaluate`, and `predict` methods, so that all model classes in `neermatch` have a uniform calling interface.
+The training loops for both deep and symbolic learning models are implemented in [tensorflow](https://www.tensorflow.org) [@tensorflow2015]. The pure deep learning model inherits from the [keras](https://keras.io) model class [@keras2015]. The neural-symbolic model is implemented using the logic tensor network ([LTN](https://pypi.org/project/ltn/)) framework [@badreddine2022]. Pure neural-symbolic and hybrid models do not inherit directly from the [keras](https://keras.io) model class, but they emulate the behavior by providing custom `compile`, `fit`, `evaluate`, and `predict` methods, so that all model classes in `neermatch` have a uniform calling interface.
 
 ## Auxiliary Features
 In addition, the package offers explainability functionality customized for the needs of matching problems. The default explainability behavior is built on the information provided by the similarity map. From a global explainability aspect, the package can be used to calculate partial matching dependencies and accumulated local effects on similarities. From a local explainability aspect, the package can be used to calculate local interpretable model-agnostic matching explanations and Shapley matching values.
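Aside (not part of the committed diff): the similarity-map idea described in this hunk can be pictured as plain data, with field pairs on one side and similarity measures on the other. The sketch below is purely conceptual; the field and measure names are invented for illustration and this is not the package's actual constructor input.

```python
# Conceptual sketch of a similarity map: each key names a field pair (a shared
# column name, or a left~right pair of names), each value lists the similarity
# measures used to compare that pair. All names here are illustrative assumptions.
similarity_instructions = {
    "title": ["jaro_winkler"],            # same column name in both datasets
    "developer~dev": ["levenshtein"],     # 'developer' on the Left, 'dev' on the Right
    "year": ["euclidean", "discrete"],    # one field pair, two similarity measures
}

for pair, measures in similarity_instructions.items():
    print(f"{pair}: compared with {', '.join(measures)}")
```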
@@ -31,13 +39,28 @@ Implementing matching models using `neermatch` is a three-step process:
 
 To train the model you need to provide three datasets. Two datasets should contain records representing the entities to be matched. By convention, the first dataset is called Left and the second dataset is called Right dataset in the package’s documentation. The third dataset should contain the ground truth labels for the matching entities. The ground truth dataset should have two columns, one for the index of the entity in the Left dataset and one for the index of the entity in the Right dataset.
 
-``` python
+```{python}
+#| label: data-setup
+#| include: false
+
+import os
+import sys
+sys.path.append("../../../")
+import test
+
+def prepare_data():
+    return test.left, test.right, test.matches
+```
+
+```{python}
+#| label: usage
+
 from neer_match.similarity_map import SimilarityMap
 from neer_match.matching_model import NSMatchingModel
 import tensorflow as tf
 
 # 0) replace this with your own data preprocessing function
-from neer_match.examples import games
+left, right, matches = prepare_data()
 
 # 1) customize according to the fields in your data
 smap = SimilarityMap(
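Aside (not part of the committed diff): the ground-truth format described at the top of this hunk, one row per matching pair with an index into each dataset, might look like the following toy frames. The column names `left` and `right` are assumptions for illustration, not names required by the package.

```python
import pandas as pd

# Toy Left/Right datasets plus a ground-truth frame of index pairs, in the shape
# described above. Values and column names are illustrative only.
left = pd.DataFrame({"title": ["Super Mario 64", "Doom II"], "year": [1996, 1994]})
right = pd.DataFrame({"title": ["super mario 64", "doom 2"], "year": [1996, 1994]})
matches = pd.DataFrame({"left": [0, 1], "right": [0, 1]})  # row 0 of left matches row 0 of right, etc.
```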
@@ -55,25 +78,10 @@ model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01))
 
 # 3) train
 model.fit(
-    games.left,
-    games.right,
-    games.matches,
-    epochs=100,
-    batch_size=16,
-    log_mod_n=10,
+    left, right, matches,
+    epochs=10, batch_size=16,
+    log_mod_n=1,
 )
-#>>> | Epoch | BCE | Recall | Precision | F1 | Sat |
-#>>> | 0 | 5.2150 | 1.0000 | 0.3333 | 0.5000 | 0.7245 |
-#>>> | 10 | 6.9364 | 0.0000 | nan | nan | 0.7806 |
-#>>> | 20 | 9.4707 | 0.0000 | nan | nan | 0.7853 |
-#>>> | 30 | 8.9746 | 0.0000 | nan | nan | 0.7857 |
-#>>> | 40 | 1.9495 | 0.0000 | nan | nan | 0.8339 |
-#>>> | 50 | 0.7654 | 1.0000 | 0.8919 | 0.9429 | 0.8853 |
-#>>> | 60 | 0.3452 | 1.0000 | 0.9429 | 0.9706 | 0.9083 |
-#>>> | 70 | 1.2782 | 1.0000 | 0.8462 | 0.9167 | 0.8718 |
-#>>> | 80 | 0.6670 | 1.0000 | 0.9167 | 0.9565 | 0.9039 |
-#>>> | 90 | 0.8415 | 1.0000 | 0.9167 | 0.9565 | 0.9002 |
-#>>> Training finished at Epoch 99 with DL loss 0.9324 and Sat 0.9020
 ```
 
 # Installation
@@ -82,12 +90,12 @@ model.fit(
 
 You can obtain the sources for the development version of `neermatch` from its github [repository](https://github.com/pi-kappa-devel/py-neer-match).
 
-``` bash
+```
 git clone https://github.com/pi-kappa-devel/py-neer-match
 ```
 
 To build and install the package locally, from the project's root path, execute
-```bash
+```
 python -m build
 python -m pip install dist/$(basename `ls -Art dist | tail -n 1` -py3-none-any.whl).tar.gz
 ```
@@ -99,7 +107,7 @@ Online documentation is available for the [release](https://py-neer-match.pikapp
 ## Reproducing Documentation from Source
 
 Make sure to build and install the package with the latest modifications before building the documentation. The documentation website is using [sphinx](https://www.sphinx-doc.org/). To build the documentation, from `<project-root>/docs`, execute
-```bash
+```
 make html
 ```
 
@@ -126,38 +134,4 @@ The package is distributed under the [MIT license](LICENSE.txt).
 
 # References
 
-<div id="refs" class="references csl-bib-body hanging-indent"
-entry-spacing="0">
-
-<div id="ref-tensorflow2015" class="csl-entry">
-
-Abadi, Martín, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen,
-Craig Citro, Greg S. Corrado, et al. 2015. “TensorFlow: Large-Scale
-Machine Learning on Heterogeneous Systems.”
-<https://www.tensorflow.org/>.
-
-</div>
-
-<div id="ref-badreddine2022logic" class="csl-entry">
-
-Badreddine, Samy, Artur d’Avila Garcez, Luciano Serafini, and Michael
-Spranger. 2022. “Logic Tensor Networks.” *Artificial Intelligence* 303:
-103649. <https://doi.org/10.1016/j.artint.2021.103649>.
-
-</div>
-
-<div id="ref-keras2015" class="csl-entry">
-
-Chollet, François et al. 2015. “Keras.” <https://keras.io>.
-
-</div>
-
-<div id="ref-karapanagiotis2023" class="csl-entry">
-
-Karapanagiotis, Pantelis, and Marius Liebald. 2023. “Entity Matching
-with Similarity Encoding: A Supervised Learning Recommendation Framework
-for Linking (Big) Data.” <http://dx.doi.org/10.2139/ssrn.4541376>.
-
-</div>
 
-</div>

docs/source/_static/bibliography.bib

Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
+@article{badreddine2022,
+  title     = {Logic Tensor Networks},
+  journal   = {Artificial Intelligence},
+  volume    = 303,
+  pages     = 103649,
+  year      = 2022,
+  issn      = {0004-3702},
+  doi       = {10.1016/j.artint.2021.103649},
+  author    = {Samy Badreddine and Artur {d'Avila Garcez} and
+               Luciano Serafini and Michael Spranger},
+  keywords  = {Neurosymbolic AI, Deep learning and reasoning,
+               Many-valued logics}
+}
+
+@misc{karapanagiotis2023,
+  title     = {Entity Matching with Similarity Encoding: A
+               Supervised Learning Recommendation Framework for
+               Linking (Big) Data},
+  author    = {Pantelis Karapanagiotis and Marius Liebald},
+  year      = 2023,
+  url       = {http://dx.doi.org/10.2139/ssrn.4541376},
+  note      = {SAFE Working Paper No. 398},
+}
+
+@misc{keras2015,
+  title     = {Keras},
+  author    = {Chollet, Fran\c{c}ois and others},
+  year      = 2015,
+  url       = {https://keras.io},
+}
+
+@misc{neermatch2024,
+  title     = {{NEural-symbolic Entity Reasoning and Matching
+               (Python Neer Match)}},
+  author    = {Pantelis Karapanagiotis and Marius Liebald},
+  year      = 2024,
+  url       = {https://github.com/pi-kappa-devel/py-neer-match},
+}
+
+@misc{pkgdown2024,
+  title     = {pkgdown: Make Static HTML Documentation for a
+               Package},
+  author    = {Hadley Wickham and Jay Hesselberth and Maëlle Salmon
+               and Olivier Roy and Salim Brüggemann},
+  year      = 2024,
+  note      = {R package version 2.1.1,
+               https://github.com/r-lib/pkgdown},
+  url       = {https://pkgdown.r-lib.org/},
+}
+
+@misc{rapidfuzz2021,
+  author    = {Max Bachmann},
+  title     = {maxbachmann/RapidFuzz: Release 1.8.0},
+  month     = oct,
+  year      = 2021,
+  publisher = {Zenodo},
+  version   = {v1.8.0},
+  doi       = {10.5281/zenodo.5584996},
+  url       = {https://doi.org/10.5281/zenodo.5584996}
+}
+
+@misc{roxygen22024,
+  title     = {roxygen2: In-Line Documentation for R},
+  author    = {Hadley Wickham and Peter Danenberg and Gábor Csárdi
+               and Manuel Eugster},
+  year      = 2024,
+  note      = {R package version 7.3.2,
+               https://github.com/r-lib/roxygen2},
+  url       = {https://roxygen2.r-lib.org/},
+}
+
+@misc{tensorflow2015,
+  title     = {{TensorFlow}: Large-Scale Machine Learning on
+               Heterogeneous Systems},
+  url       = {https://www.tensorflow.org/},
+  note      = {Software available from tensorflow.org},
+  author    = {Mart\'{i}n~Abadi and Ashish~Agarwal and Paul~Barham
+               and Eugene~Brevdo and Zhifeng~Chen and Craig~Citro
+               and Greg~S.~Corrado and Andy~Davis and Jeffrey~Dean
+               and Matthieu~Devin and Sanjay~Ghemawat and
+               Ian~Goodfellow and Andrew~Harp and Geoffrey~Irving
+               and Michael~Isard and Yangqing Jia and
+               Rafal~Jozefowicz and Lukasz~Kaiser and
+               Manjunath~Kudlur and Josh~Levenberg and
+               Dandelion~Man\'{e} and Rajat~Monga and Sherry~Moore
+               and Derek~Murray and Chris~Olah and Mike~Schuster
+               and Jonathon~Shlens and Benoit~Steiner and
+               Ilya~Sutskever and Kunal~Talwar and Paul~Tucker and
+               Vincent~Vanhoucke and Vijay~Vasudevan and
+               Fernanda~Vi\'{e}gas and Oriol~Vinyals and
+               Pete~Warden and Martin~Wattenberg and Martin~Wicke
+               and Yuan~Yu and Xiaoqiang~Zheng},
+  year      = 2015,
+}

docs/source/_static/css/extra.css

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+#main-content .caption {
+    display: none;
+}
+
+#main-content p {
+    text-align: justify;
+}
+
+#sidebar > li {
+    display: none;
+}
