The scripts for running the experiments are implemented in Python (3.6). You need to have OpenEA and Morph-KGC installed.
The file requirements.txt lists the Python packages needed to run the experiments.
You can install them with:
pip install -r requirements.txt
Next, we illustrate how to run an experiment, which requires one dataset, two ontologies and two mapping files. The concrete example is eCommerce LLM-LLM.
All the required scripts are available in the scripts folder. The task carried out by each script is explained below:
Note: the names of the files have to be added.
- RDF data generation (csv2rdf-morph-kgc-rdflib.py): Generates the RDF data using Morph-KGC with the previously defined mapping rules.
- Data preparation (processingRDF.py): Splits the RDF data into the files required by OpenEA for the alignments, that is, files with triples representing the attributes (attr), the relations (rel) and the aligned entities (ent).
- Training sets (randomPairs.py): Randomly splits the set of aligned entities into three files, containing the links for training the model, the links for validating the model, and the links for testing.
- Entity alignment (main_from_args.py): Runs the entity alignment experiment with a graph alignment method.
- Evaluation (countingAlignment.py): Generates the metrics for evaluating the results.
- RDF data generation (csv2rdf-morph-kgc-rdflib.py):
  - Indicate the YML file (in config.ini)
  - Indicate the output RDF nt file (in csv2rdf-morph-kgc-rdflib.py)
  - Generate the RDF nt file for each ontology-CSV pair:
    nohup python3 csv2rdf-morph-kgc-rdflib.py &
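As an illustration of the "indicate the YML file" step, the sketch below builds a minimal config.ini with Python's standard configparser module. The section name `DataSource1` and the key `mappings` are assumed from Morph-KGC's documented configuration layout, and the mapping path is a hypothetical placeholder; check both against the config.ini shipped with the scripts.

```python
import configparser
import io


def build_config(mapping_path):
    """Return the text of a minimal Morph-KGC-style config.ini."""
    config = configparser.ConfigParser()
    # Assumption: Morph-KGC reads the mapping file from the "mappings"
    # key of a data source section.
    config["DataSource1"] = {"mappings": mapping_path}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Hypothetical mapping path, used only for illustration.
config_text = build_config("mappings/ontology1.yml")
print(config_text)
```

Writing `config_text` to config.ini once per ontology-CSV pair would give one configuration per RDF file to generate.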
- Data preparation (processingRDF.py):
  - Indicate the two RDF nt files and the CSV file
  - Indicate the entities to be aligned
  - Indicate the object properties for each knowledge graph
  - Indicate the output directory and files
    nohup python3 processingRDF.py &
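The attribute/relation split performed in this step can be sketched as follows. This is a simplified illustration, not the logic of processingRDF.py: it assumes that a triple whose object is a literal is an attribute triple and that a triple whose object is an IRI is a relation triple, and it uses a small regular expression instead of a full N-Triples parser. The example IRIs are hypothetical.

```python
import re

# Simplified pattern for one N-Triples line: <subject> <predicate> object .
TRIPLE_RE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(.+?)\s*\.\s*$')


def split_triples(nt_lines):
    """Split N-Triples lines into (attribute_triples, relation_triples)."""
    attr_triples, rel_triples = [], []
    for line in nt_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE_RE.match(line)
        if match is None:
            continue  # e.g. blank-node subjects, not handled in this sketch
        subj, pred, obj = match.groups()
        if obj.startswith("<") and obj.endswith(">"):
            rel_triples.append((subj, pred, obj[1:-1]))  # IRI object
        elif obj.startswith('"'):
            attr_triples.append((subj, pred, obj))  # literal object
    return attr_triples, rel_triples


# Hypothetical example data.
lines = [
    '<http://ex.org/p1> <http://ex.org/name> "Phone" .',
    '<http://ex.org/p1> <http://ex.org/soldBy> <http://ex.org/s1> .',
]
attr_triples, rel_triples = split_triples(lines)
print(len(attr_triples), len(rel_triples))
```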
- Training sets (randomPairs.py):
  - Indicate the input directory and the "ent_links" file
  - Indicate the output directories
  - Indicate the ratios for training, testing and validation
  - Create the corresponding split folders and run the code:
    nohup python3 randomPairs.py &
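A random split of the aligned entity pairs into training, validation and test links might look like the sketch below. The function name, the ratio parameters and the fixed seed are illustrative assumptions and are not taken from randomPairs.py; the entity names are hypothetical.

```python
import random


def split_links(links, train_ratio, valid_ratio, seed=0):
    """Shuffle the aligned pairs and cut them into train/valid/test lists.

    Whatever is left after the training and validation shares goes to test.
    """
    shuffled = list(links)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(len(shuffled) * train_ratio)
    n_valid = int(len(shuffled) * valid_ratio)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


# Hypothetical aligned pairs (entity in KG1, entity in KG2).
ent_links = [("kg1/e%d" % i, "kg2/e%d" % i) for i in range(10)]
train, valid, test = split_links(ent_links, train_ratio=0.4, valid_ratio=0.1)
print(len(train), len(valid), len(test))
```

Each of the three lists would then be written to its own file in the split folder that the entity alignment step reads.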
- Entity alignment (main_from_args.py):
  - Edit the arg file: "training_data", "output", "dataset_division". Then run:
    cd ~/OpenEA/run
    nohup python main_from_args.py <arg.json> <training_data> <dataset_division> &
    Example:
    nohup python main_from_args.py ./args/attre_args_15K.json LLM-LLM/Input 451_1fold/1/ &
  - The statistics are written to nohup.out
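Editing the arg file can also be scripted. The sketch below updates the three keys named above in a dictionary loaded from the JSON arg file; the paths and the extra `learning_rate` key are hypothetical placeholders, and the real arg files contain many more settings.

```python
import json


def update_args(args, training_data, output, dataset_division):
    """Return a copy of the arg dictionary with the three keys replaced."""
    updated = dict(args)
    updated["training_data"] = training_data
    updated["output"] = output
    updated["dataset_division"] = dataset_division
    return updated


# Stand-in for json.load() on the arg file; values are placeholders.
base_args = {
    "training_data": "",
    "output": "",
    "dataset_division": "",
    "learning_rate": 0.01,  # hypothetical extra setting, left untouched
}
new_args = update_args(
    base_args,
    training_data="/path/to/datasets/",  # hypothetical path
    output="/path/to/output/",           # hypothetical path
    dataset_division="451_1fold/1/",
)
print(json.dumps(new_args, indent=4))
```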
- Evaluation:
  - Entity count per class (countingAlignment.py)
    - Indicate the inputs: dataset, approach, test_links result, alignment_results_12 result, rel_triples1 result, kg1_ent_ids result and kg2_ent_ids result
    - Indicate the outputs: count_aligned_ent file and ent_match file
    - Generate the count file by class and the entity matching:
      nohup python3 countingAlignment.py &
  - Merge results to combine (merge_entity_alignments)
    - Indicate the inputs: the path of the results for each method (main_path_results), and the methods and pairwise alignments to combine (methods and pairs)
    - Indicate the output: the path to save the results (path_save)
    - Generate the merged file:
      Rscript merge_entity_alignments.R
  - Merged entity count per class (count_merge_entity_alignments)
    - Indicate the inputs: the previously merged file (input_file), the original file with relations (rel_triples1) and the test file (test_links)
    - Indicate the output (file with the count): output_path
    - Generate the count:
      Rscript count_merge_entity_alignments.R
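The per-class entity count in the evaluation step can be illustrated with the sketch below. It assumes that a predicted pair counts as aligned when it also appears in the test links, and that the class of each KG1 entity is available as a dictionary; how countingAlignment.py actually derives the classes from rel_triples1 is not shown here. All names and data are hypothetical.

```python
from collections import Counter


def count_aligned_per_class(predicted_pairs, test_links, entity_class):
    """Count correctly aligned pairs, grouped by the class of the KG1 entity."""
    gold = set(test_links)
    counts = Counter()
    for pair in predicted_pairs:
        if pair in gold:
            # Entities without a known class are grouped under "unknown".
            counts[entity_class.get(pair[0], "unknown")] += 1
    return counts


# Hypothetical example data.
test_links = [("a1", "b1"), ("a2", "b2"), ("a3", "b3")]
predicted_pairs = [("a1", "b1"), ("a2", "b9"), ("a3", "b3")]
entity_class = {"a1": "Product", "a2": "Product", "a3": "Seller"}

counts = count_aligned_per_class(predicted_pairs, test_links, entity_class)
print(dict(counts))
```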