-A collection of Jupyter notebooks with examples of querying different PID providers like [ORCID](https://orcid.org/), [ROR](https://ror.readme.io/), [Crossref](https://www.crossref.org/) and PID graphs like the [FREYA PID Graph](https://blog.datacite.org/powering-the-pid-graph/) and [OpenAlex](https://openalex.org/about) for connected objects.
+A collection of Jupyter notebooks with examples of querying different PID providers like [ORCID](https://orcid.org/), [ROR](https://ror.readme.io/), [Crossref](https://www.crossref.org/) and PID graphs like the [FREYA PID Graph](https://blog.datacite.org/powering-the-pid-graph/), [OpenAlex](https://openalex.org/about) and [OpenAIRE](https://www.openaire.eu/) for connected objects.
 
 Currently included connections:
 * organization-organization
@@ -17,7 +17,11 @@ Currently included connections:
 * person-works
   * input: ORCID
   * output: list of works authored/created by the person, each identified by its DOI
-  * data sources: Crossref, FREYA PID Graph, OpenAlex, ORCID
+  * data sources: Crossref, FREYA PID Graph, OpenAlex, ORCID, OpenAIRE
+* work-projects
+  * input: DOI
+  * output: list of projects the work was produced in, each identified by its OpenAIRE project ID
+  * data sources: OpenAIRE
 
 
 Please navigate into the respective folder to see the list of available notebooks.
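For orientation, the person-works connection with OpenAIRE as data source could be sketched roughly as follows. This is a minimal sketch and not the notebook's actual code: the base URL `https://api.openaire.eu/search/publications` and the query parameters `orcid`, `format`, `page` and `size` are assumptions about the OpenAIRE HTTP API, the helper names `fetch_page`, `results_of` and `collect_publications` are hypothetical, and the standard library's `urllib` stands in for the `requests` and `benedict` packages the notebook imports. Only the `response.results.result` path is taken from the notebook.

```python
import json
import urllib.parse
import urllib.request

# Assumption: base URL of the OpenAIRE HTTP API publications endpoint.
OPENAIRE_PUBLICATIONS_URL = "https://api.openaire.eu/search/publications"


def fetch_page(orcid, page, size):
    """Fetch one result page as parsed JSON (performs a network call).

    The parameter names used here are assumptions about the API.
    """
    query = urllib.parse.urlencode(
        {"orcid": orcid, "format": "json", "page": page, "size": size}
    )
    url = f"{OPENAIRE_PUBLICATIONS_URL}?{query}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def results_of(page_doc):
    """Return the records found under response.results.result as a list."""
    response = page_doc.get("response") or {}
    results = (response.get("results") or {}).get("result") or []
    # Defensive: treat a single record (dict) like a one-element list.
    return results if isinstance(results, list) else [results]


def collect_publications(orcid, fetch=fetch_page, size=50, max_pages=100):
    """Request page after page until a page holds fewer than `size` records."""
    publications = []
    for page in range(1, max_pages + 1):
        batch = results_of(fetch(orcid, page, size))
        publications.extend(batch)
        if len(batch) < size:
            break
    return publications
```

Calling `collect_publications("0000-0003-2499-7741")` would issue live requests. Passing a different `fetch` callable makes the paging logic testable without network access, and `max_pages` is only a safety cap against an endless loop.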
"### Query OpenAIRE for publications authored by a person\n",
+    "This notebook queries the [OpenAIRE HTTP API](https://graph.openaire.eu/develop/api.html) via its `/publications` endpoint for publications authored by a person. It takes an ORCID iD as input and filters for publications in which the `orcid` field of one of the creators matches that ORCID iD. From the resulting list of publications we output all DOIs.\n",
+    "\n",
+    "*Note:\n",
+    "The API has separate endpoints for the different kinds of research output (publications, research data, software metadata and other research products). To get a full picture of a person's research output, you would have to query all of these endpoints and take the union of their results.*"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Prerequisites:\n",
+    "import requests  # dependency for making HTTP calls\n",
+    "from benedict import benedict  # dependency for dealing with json"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "collapsed": true,
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   },
+   "source": [
+    "The input for this notebook is an ORCID iD, e.g. '`0000-0003-2499-7741`'."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# input parameter\n",
+    "example_orcid_id = \"0000-0003-2499-7741\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We use it to query the OpenAIRE HTTP API for publications whose metadata list the ORCID iD in the `orcid` field of one of their creators. Since the API uses pagination, we need to loop through all pages to get the complete result set."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# OpenAIRE endpoint to query for publications\n",
+    "From the resulting list of publications we extract and print out each title and DOI.\n",
+    "\n",
+    "*Note: publications that do not have a DOI assigned will not be printed.*"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Number of publications found: 6\n",
+      "\n",
+      "10.15488/11463, Roadmap to FAIR Research Information in Open Infrastructures\n",
+      "10.1515/bd.2006.40.4.466, Informationsvermittlung: Personalisiertes Lernen in der Bibliothek: das Düsseldorfer Online-Tutorial (DOT) Informationskompetenz\n",
+      "10.1080/00048623.2006.10755322, Teaching Information Literacy with the Lerninformationssystem\n",
+      "10.3389/frma.2021.694307, Enhancing Knowledge Graph Extraction and Validation From Scholarly Publications Using Bibliographic Metadata\n",
+      "10.3897/rio.7.e66264, OPTIMETA – Strengthening the Open Access publishing system through open citations and spatiotemporal metadata\n",
+      "10.1016/j.procs.2019.01.074, The Research Core Dataset (KDSF) in the Linked Data context\n"
+     ]
+    }
+   ],
+   "source": [
+    "# from the result pages, extract the data about each publication\n",
+    "def extract_publications_from_page(page):\n",
+    "    return [pub for pub in benedict.from_json(page).get('response.results.result') or []]\n",
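The cell's source is cut off after `extract_publications_from_page`, so the step that turns each record into a `DOI, title` line is not visible here. A minimal sketch of how that step might look is given below. The record layout it relies on (`metadata` → `oaf:entity` → `oaf:result`, with `pid` entries carrying `@classid` and `$`, and `title` entries carrying `$`) is an assumption about the OpenAIRE JSON format and is not confirmed by the diff. The helper names `as_list`, `doi_and_title` and `format_publications` are hypothetical, and the sample records are made up for illustration (the first reuses a DOI and title from the output above).

```python
def as_list(value):
    """Normalise a value that may be missing, a single item, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def doi_and_title(record):
    """Return (doi, title) for a record, or None if it has no DOI.

    Assumed layout: record["metadata"]["oaf:entity"]["oaf:result"].
    """
    metadata = record.get("metadata") or {}
    result = (metadata.get("oaf:entity") or {}).get("oaf:result") or {}
    dois = [
        pid["$"]
        for pid in as_list(result.get("pid"))
        if isinstance(pid, dict) and pid.get("@classid") == "doi" and pid.get("$")
    ]
    titles = [
        title["$"]
        for title in as_list(result.get("title"))
        if isinstance(title, dict) and title.get("$")
    ]
    if not dois:
        return None  # publications without a DOI are skipped
    return dois[0], (titles[0] if titles else "")


def format_publications(records):
    """Build one 'DOI, title' line per record that has a DOI."""
    pairs = (doi_and_title(record) for record in records)
    return [f"{doi}, {title}" for doi, title in (p for p in pairs if p)]


# Hypothetical sample records in the assumed layout.
sample_records = [
    {"metadata": {"oaf:entity": {"oaf:result": {
        "pid": {"@classid": "doi", "$": "10.15488/11463"},
        "title": {"$": "Roadmap to FAIR Research Information in Open Infrastructures"},
    }}}},
    {"metadata": {"oaf:entity": {"oaf:result": {
        "pid": [{"@classid": "pmid", "$": "0000000"}],
        "title": [{"$": "A record without a DOI"}],
    }}}},
]

for line in format_publications(sample_records):
    print(line)
```

Normalising `pid` and `title` through `as_list` is a defensive choice, on the assumption that JSON converted from XML may deliver a single element as an object and several elements as an array.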