Dev federated query performance improvement merge request #1463
base: main
inverted index: class to files
load inverted index from a saved file
Multilevel inverted indexing
load saved multilevel inverted index from file
print_key_value_stats(self) is defined
We tested against two Blazegraph namespaces
added progress bar
It creates a class-to-endpoint index over Blazegraph endpoints (namespaces)
minor modification
It can analyse a SPARQL query to extract the classes and properties it uses, so that our proposed inverted index can decide which endpoints the query should run against
It can run a SPARQL query and retrieve the results in JSON format
It creates an inverted index from property to endpoints
It can 1) build 3 indices (concept -> endpoint, property -> endpoint, concept -> property -> endpoint), 2) update the indices when a new triple set is added, 3) save the indices (individually and collectively) to a dir, 4) load the indices from a dir (individually and collectively)
It can analyse a SPARQL query and find the endpoints needed to run it
It can run a SPARQL query against an endpoint
Experimented with earlier; not necessary now.
It can build an inverted index from a number of namespaces, save the inverted index in a local directory, update the inverted index for a new set of triples inserted into the KGs, and load the inverted index into memory.
It can load the inverted index from a local directory, analyse the user query to find classes/properties, find the relevant endpoints, and finally run the user query against those endpoints using FedX.
Added maven dependencies for the federated query processing
Java code is documented, and elimination of some stop_classes and stop_properties has been implemented so that unimportant classes/properties do not have any negative impact.
Some stop_classes and stop_properties are defined so that unimportant classes/properties do not have any negative impact.
Java code is documented
A timer has been added to main() so that we can measure the elapsed time
compacted to produce overhead on 100 and 1000 iterations for a query
code documentation
dev-federated-query-performance-improvement: modified main to perform various experiments.
…een duplicated to change as required for integrating with RemoteStoreClient
…e and extend index
…om INSERT sparql is done for updating index
…erformance-improvement
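For readers skimming the commits above, here is a minimal Python sketch of the three indices they describe (concept -> endpoint, property -> endpoint, concept -> property -> endpoint), with update, save, load, and endpoint lookup. It is not the code from this branch. The class name `EndpointIndex`, its method names, the JSON file layout, and the contents of the stop lists are all assumptions made for illustration. Returning the union of matching endpoints is also an assumption, and class membership is only resolved within a single batch of triples.

```python
import json
from collections import defaultdict
from pathlib import Path

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical stop lists: terms too common to help choose an endpoint.
STOP_CLASSES = {"http://www.w3.org/2002/07/owl#Thing"}
STOP_PROPERTIES = {"http://www.w3.org/2000/01/rdf-schema#label"}


class EndpointIndex:
    """Illustrative inverted indices from classes/properties to endpoints."""

    def __init__(self):
        self.class_to_endpoints = defaultdict(set)
        self.property_to_endpoints = defaultdict(set)
        # concept -> property -> endpoints
        self.class_property_to_endpoints = defaultdict(lambda: defaultdict(set))

    def add_triples(self, endpoint, triples):
        """Update all three indices with (s, p, o) triples stored at `endpoint`."""
        triples = list(triples)
        subject_classes = defaultdict(set)
        # First pass: record the classes of each subject, skipping stop classes.
        for s, p, o in triples:
            if p == RDF_TYPE and o not in STOP_CLASSES:
                subject_classes[s].add(o)
                self.class_to_endpoints[o].add(endpoint)
        # Second pass: index properties, skipping rdf:type and stop properties.
        for s, p, o in triples:
            if p == RDF_TYPE or p in STOP_PROPERTIES:
                continue
            self.property_to_endpoints[p].add(endpoint)
            for cls in subject_classes.get(s, ()):
                self.class_property_to_endpoints[cls][p].add(endpoint)

    def save(self, directory):
        """Write each index to its own JSON file in `directory`."""
        directory = Path(directory)
        directory.mkdir(parents=True, exist_ok=True)
        plain = {
            "class_to_endpoints":
                {c: sorted(e) for c, e in self.class_to_endpoints.items()},
            "property_to_endpoints":
                {p: sorted(e) for p, e in self.property_to_endpoints.items()},
            "class_property_to_endpoints": {
                c: {p: sorted(e) for p, e in props.items()}
                for c, props in self.class_property_to_endpoints.items()
            },
        }
        for name, data in plain.items():
            path = directory / f"{name}.json"
            path.write_text(json.dumps(data, indent=2), encoding="utf-8")

    @classmethod
    def load(cls, directory):
        """Rebuild an index from the files written by save()."""
        directory = Path(directory)

        def read(name):
            path = directory / f"{name}.json"
            return json.loads(path.read_text(encoding="utf-8"))

        index = cls()
        for c, eps in read("class_to_endpoints").items():
            index.class_to_endpoints[c].update(eps)
        for p, eps in read("property_to_endpoints").items():
            index.property_to_endpoints[p].update(eps)
        for c, props in read("class_property_to_endpoints").items():
            for p, eps in props.items():
                index.class_property_to_endpoints[c][p].update(eps)
        return index

    def endpoints_for(self, classes=(), properties=()):
        """Return every endpoint that holds any of the given classes or properties."""
        found = set()
        for c in classes:
            found |= self.class_to_endpoints.get(c, set())
        for p in properties:
            found |= self.property_to_endpoints.get(p, set())
        return found
```

A caller would invoke `add_triples` once per namespace, passing the endpoint URL and its triples, then pass the classes and properties extracted from a user query to `endpoints_for` to choose the federation members. Extracting those terms from the SPARQL text is outside this sketch.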
JPS Base Lib does not build on this branch due to the change from Java 1.8 to Java 17.
From the build report:
Error: Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.12.1:compile (default-compile) on project jps-base-lib: Fatal error compiling: error: release version 17 not supported -> [Help 1]
I will move the code here to our new JPS repo and keep this branch as a WIP in the new location
Implementation of the Inverted index for query federation
Implementation of TIFF-CSV and CSV-TIFF conversion