
MILR

Code for the paper "Boosting Document-Level Relation Extraction by Mining and Injecting Logical Rules", accepted to the EMNLP 2022 main conference.

For simplicity, we only supply the code with the strong backbone ATLOP, tested on the DWIE dataset; MILR with other backbones and datasets is similar. This code is adapted from the repository of ATLOP. Thanks for their excellent work.

In addition, we provide the predictions used for analysis in our paper, as well as the mined rules whose confidence exceeds the threshold minC.

Requirements

  • Python (tested on 3.6)

  • apex==0.9.10dev

  • cvxpy==1.1.18

  • dill==0.3.4

  • gurobipy==9.5.1 (Note that installation via pip may not work. Please request an evaluation license or a free academic license of Gurobi. More instructions can be found in link.)

  • matplotlib==3.3.1

  • numpy==1.19.2

  • opt_einsum==3.3.0

  • pandas==1.1.3

  • scipy==1.2.0

  • tqdm==4.50.0

  • transformers==3.4.0

  • ujson==4.0.2

  • wandb==0.10.32

  • torch==1.6.0

    We also provide the exported enviroment.yaml and requirements.txt.

Dataset

The training and development sets of the DocRED dataset can be downloaded at link, and the test set used in MILR can be downloaded at link. The DWIE dataset can be obtained by following the instructions in LogiRE. We also uploaded the processed datasets to the EMNLP 2022 START Conference Manager.

The expected structure of files is:

```
ATLOP+MILR
 |-- dataset_dwie
 |    |-- train_annotated.json        
 |    |-- dev.json
 |    |-- test.json
 |    |-- meta
 |    |    |-- ner2id.json        
 |    |    |-- rel2id.json
 |    |    |-- vec.npy
 |    |    |-- word2id.json
 |-- dataset_docred
 |    |-- train_annotated.json        
 |    |-- dev.json
 |    |-- test.json
 |    |-- rel_info.json
 |    |-- meta
 |    |    |-- ner2id.json        
 |    |    |-- rel2id.json
 |    |    |-- vec.npy
 |    |    |-- word2id.json
 |    |    |-- char_vec.npy
 |    |    |-- char2id.json
```
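Before training, it can save time to check that the layout above is in place. A minimal sketch (the helper below is ours, not part of the repository) that reports missing files for the DWIE layout:

```python
from pathlib import Path

# Expected files under dataset_dwie/, mirroring the tree above.
EXPECTED_DWIE = [
    "train_annotated.json", "dev.json", "test.json",
    "meta/ner2id.json", "meta/rel2id.json",
    "meta/vec.npy", "meta/word2id.json",
]

def missing_files(root, expected):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]

missing = missing_files("dataset_dwie", EXPECTED_DWIE)
if missing:
    print("Missing dataset files:", missing)
```

The same check can be run for `dataset_docred` with the corresponding file list.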

Pre-Trained Language Model

Download BERT-base-uncased at link and put the downloaded files into ./PLM/bert-base-uncased. The expected structure of files is:

```
ATLOP+MILR
 |-- PLM
 |    |-- bert-base-uncased
 |    |    |-- config.json        
 |    |    |-- pytorch_model.bin
 |    |    |-- vocab.txt
```

Mined Rules

We supply the mined rules on DWIE and DocRED in ./mined_rules. The structure of files is:

```
ATLOP+MILR
 |-- mined_rules
 |    |-- rule_docred.txt
 |    |-- rule_dwie.txt
```

Examples are as follows:

```
['in1', 'in0'] -> in0 : 1.0
['anti_based_in2', 'based_in0'] -> in0 : 1.0
```

The first line means in0(h,t) ← in1(h,z) ⋀ in0(z,t), with confidence 1.0. The second means in0(h,t) ← based_in2(z,h) ⋀ based_in0(z,t), with confidence 1.0; the anti_ prefix swaps the arguments of the body relation.
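Each line thus encodes a two-atom Horn rule together with its confidence. A minimal sketch (these helpers are ours, not part of the repository) that parses such a line and forward-applies the rule to a set of known triples; the entity names are made up for illustration:

```python
import ast

def parse_rule(line):
    """Parse a mined-rule line like "['in1', 'in0'] -> in0 : 1.0"."""
    lhs, rhs = line.split("->")
    body = ast.literal_eval(lhs.strip())   # list of body relation names
    head, conf = rhs.split(":")
    return body, head.strip(), float(conf)

def apply_rule(body, head, triples):
    """Forward-chain a rule head(h,t) <- body[0](h,z) AND body[1](z,t).

    A body relation prefixed with 'anti_' matches with its arguments swapped,
    as in the second example rule above.
    """
    def match(rel, facts):
        if rel.startswith("anti_"):
            return {(t, h) for r, h, t in facts if r == rel[len("anti_"):]}
        return {(h, t) for r, h, t in facts if r == rel}
    first = match(body[0], triples)
    second = match(body[1], triples)
    return {(head, h, t) for h, z1 in first for z2, t in second if z1 == z2}

body, head, conf = parse_rule("['in1', 'in0'] -> in0 : 1.0")
triples = {("in1", "Ghent", "East Flanders"), ("in0", "East Flanders", "Belgium")}
new = apply_rule(body, head, triples)  # infers in0(Ghent, Belgium)
```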

Predictions Produced by Trained Models

We supply predictions produced by ATLOP, ATLOP+LogiRE, and ATLOP+MILR. The structure of files is:

```
ATLOP+MILR
 |-- results_for_dwie
 |    |-- result_ATLOP_dev.json
 |    |-- result_ATLOP_test.json
 |    |-- result_LogiRE_test.json
 |    |-- result_MILR_dev.json
 |    |-- result_MILR_test.json
 |-- results_for_docred
 |    |-- result_ATLOP_test.json
 |    |-- result_MILR_test.json
```
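For analysis, such prediction files can be loaded and compared pairwise. A small sketch, assuming the official DocRED-style submission format (a JSON list of records with "title", "h_idx", "t_idx", and "r" fields; the helper names are ours):

```python
import json

def load_predictions(path):
    """Load a predictions file into a set of (title, h_idx, t_idx, r) tuples.

    Assumes the official DocRED-style format: a JSON list of dicts with
    "title", "h_idx", "t_idx", and "r" keys.
    """
    with open(path) as f:
        return {(p["title"], p["h_idx"], p["t_idx"], p["r"]) for p in json.load(f)}

def f1_score(pred, gold):
    """Micro F1 over exact (title, h_idx, t_idx, r) matches."""
    tp = len(pred & gold)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```

For example, `f1_score(load_predictions("results_for_dwie/result_MILR_test.json"), gold)` scores one file against a gold set loaded the same way.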

Trained Models

We supply trained ATLOP and ATLOP+MILR models on the DWIE dataset at link and link, respectively. Please download the trained models and put them into ./trained_model/. The expected structure of files is:

```
ATLOP+MILR
 |-- trained_model
 |    |-- model_ATLOP_DWIE.pth
 |    |-- model_MILR_DWIE.pth
```

Log Samples

We also provide log samples in ./logs/. These samples involve the training and inference of ATLOP+MILR and the inference of ATLOP.

Training and Evaluation of ATLOP+MILR

>> sh scripts/MILR_train_DWIE.sh  # for training; skip this step if the trained model has been downloaded

The classification loss, consistency regularization loss, total loss, and evaluation results on the dev set are synced to the wandb dashboard.
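The exact form of the consistency regularization loss is defined in the paper, not in this README. As a purely illustrative sketch, a mined rule head(h,t) ← b1(h,z) ⋀ b2(z,t) with confidence c can be softened into a hinge penalty that fires when the predicted head probability falls below the confidence-weighted conjunction of the body probabilities (product t-norm here; this is our simplification, not necessarily the paper's formulation):

```python
def rule_penalty(p_head, p_body, confidence):
    """Illustrative hinge-style consistency penalty (not the paper's exact loss).

    The rule is treated as violated when the head probability is lower than
    the confidence-weighted product t-norm of its body-atom probabilities.
    """
    conjunction = 1.0
    for p in p_body:
        conjunction *= p
    return max(0.0, confidence * conjunction - p_head)

# Strongly predicted body atoms with a weakly predicted head incur a penalty:
penalty = rule_penalty(p_head=0.2, p_body=[0.9, 0.95], confidence=1.0)
```

Summing such penalties over mined rules and adding the total to the classification loss is one way a rule-consistency term can be combined with the base objective.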

>> sh scripts/MILR_evaluate_DWIE.sh  # for inference

The program will generate a test file ./results_for_dwie/result_MILR.json in the official evaluation format. In addition, a log of the evaluation results will be written to ./logs/MILR_DWIE_evaluation.out.

Attention: wandb may fail to sync to the cloud. If this happens, run wandb offline in the terminal. More instructions can be found at link.

Evaluation of ATLOP

>> sh scripts/ATLOP_evaluate_DWIE.sh  # for inference 

The program will generate a test file ./results_for_dwie/result_ATLOP.json in the official evaluation format. In addition, a log of the evaluation results will be written to ./logs/ATLOP_DWIE_evaluation.out.
