Replies: 2 comments
-
I started looking at this ages ago:
#411
I revisited it in 2016 but apparently didn't push; I'll see if I can find my changes.
It's still far from working, though!
- Gunnar
On 19 June 2017 at 10:08, Robert Jäschke wrote:
It would be great if rdflib could support processing large files which do
not fit into main memory. For instance, providing an iterator over the
statements is often sufficient, if the statements are "atomic", that is,
provide all information about an entity. This can be the case for datasets
generated from relational databases.
As an example: I would like to extract data from a large Turtle dataset
<http://datendienst.dnb.de/cgi-bin/mabit.pl?userID=opendata&pass=opendata&cmd=login>
and convert it into JSON. For that purpose (and given the structure of the
dataset) it would be sufficient to have an iterator over the statements in
the file.
I had a look at the source code for parsing N3
<https://github.com/RDFLib/rdflib/blob/master/rdflib/plugins/parsers/notation3.py>
but could not find an apparent method for that use case. I suppose that's
related to what W3C's RDF stream processing community group
<https://www.w3.org/community/rsp/> is aiming for but on a much simpler
scale.
Tips on how to accomplish this (with or without rdflib) would also be
great.
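For what it's worth, here is a minimal sketch of the "without rdflib" route. N-Triples is one statement per line, so a large Turtle file can be converted once to N-Triples (e.g. with an external tool such as Raptor's `rapper`) and then iterated with constant memory. The `iter_ntriples` helper and its regex below are illustrative only; they handle the common cases and are not a full N-Triples parser.

```python
import re

# Matches one N-Triples statement. Simplified: covers IRIs, blank
# nodes, and plain/typed/language-tagged literals without embedded
# spaces in the tail; it is not a complete N-Triples grammar.
TRIPLE = re.compile(
    r'^(<[^>]*>|_:\S+)\s+'                              # subject: IRI or blank node
    r'(<[^>]*>)\s+'                                     # predicate: IRI
    r'(<[^>]*>|_:\S+|"(?:[^"\\]|\\.)*"\S*)\s*\.\s*$'    # object
)

def iter_ntriples(lines):
    """Yield (subject, predicate, object) strings one line at a time,
    skipping blank lines and comments, without building a graph."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        m = TRIPLE.match(line)
        if m:
            yield m.group(1), m.group(2), m.group(3)
```

Usage would be something like `for s, p, o in iter_ntriples(open("data.nt")): ...`, which never holds more than one line in memory, so the JSON conversion can be done statement by statement.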
-
Thank you Gunnar, that would be great. At the moment I am trying to use …