Search Query: A Python package designed to load, lint, translate, save, improve, and automate academic literature search queries.

CoLRev-Environment/search-query

Search Query is a Python package designed to load, lint, translate, save, improve, and automate academic literature search queries. It is extensible and currently supports PubMed, EBSCOHost, and Web of Science. The package can be used programmatically, through the command line, or as a pre-commit hook. It has zero dependencies and integrates into a variety of environments. The parsers and linters have been tested on peer-reviewed searchRxiv queries.

Installation

To install search-query, run:

pip install search-query

Quickstart

Creating a query programmatically is simple:

from search_query import OrQuery, AndQuery

# Typical building-blocks approach
digital_synonyms = OrQuery(["digital", "virtual", "online"], field="abstract")
work_synonyms = OrQuery(["work", "labor", "service"], field="abstract")
query = AndQuery([digital_synonyms, work_synonyms])
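
To make the building-blocks idea concrete, the following standalone sketch models a nested Boolean query as a small tree and renders it to a string. It is only an illustration under stated assumptions: `BooleanGroup` and `render` are hypothetical names, not part of search_query, and search fields are left out for brevity.

```python
class BooleanGroup:
    """Hypothetical stand-in for an OR/AND node (not a search_query class)."""

    def __init__(self, operator, children):
        self.operator = operator  # "OR" or "AND"
        self.children = children  # plain strings (terms) or nested BooleanGroup objects

    def render(self):
        # Render nested groups recursively and quote plain terms.
        parts = [
            child.render() if isinstance(child, BooleanGroup) else f'"{child}"'
            for child in self.children
        ]
        return "(" + f" {self.operator} ".join(parts) + ")"


# Same structure as the quickstart: two OR blocks combined with AND.
digital = BooleanGroup("OR", ["digital", "virtual", "online"])
work = BooleanGroup("OR", ["work", "labor", "service"])
combined = BooleanGroup("AND", [digital, work])

rendered = combined.render()
print(rendered)
# Output:
# (("digital" OR "virtual" OR "online") AND ("work" OR "labor" OR "service"))
```

Keeping each synonym block as its own object means a block can be reused or extended without touching the rest of the query.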

We can also parse a query from a string or a JSON search file (see the overview of platform identifiers):

from search_query.parser import parse

query_string = '("digital health"[Title/Abstract]) AND ("privacy"[Title/Abstract])'
query = parse(query_string, platform="pubmed")

A useful feature of parsers is the built-in linter functionality, which helps us to validate the query by identifying syntactical errors:

from search_query.parser import parse

query_string = '("digital health"[Title/Abstract]) AND ("privacy"[Title/Abstract]'
query = parse(query_string, platform="pubmed")
# Output:
# ❌ Fatal: unbalanced-parentheses (PARSE_0002)
#   - Unbalanced opening parenthesis
#   Query: ("digital health"[Title/Abstract]) AND ("privacy"[Title/Abstract]
#                                                ^^^
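
For readers curious about what a check like unbalanced-parentheses involves, here is a minimal standalone sketch in plain Python. It is not the package's linter: the function name `find_unbalanced_parentheses` is hypothetical, and the sketch assumes that parentheses inside double-quoted phrases should be ignored.

```python
def find_unbalanced_parentheses(query):
    """Return the positions of parentheses that have no matching partner."""
    open_positions = []  # stack of indices of "(" still waiting for a ")"
    unmatched_close = []  # indices of ")" that had no preceding "("
    in_quotes = False

    for index, char in enumerate(query):
        if char == '"':
            in_quotes = not in_quotes
        elif in_quotes:
            continue  # assumption: parentheses inside quoted phrases are literal text
        elif char == "(":
            open_positions.append(index)
        elif char == ")":
            if open_positions:
                open_positions.pop()
            else:
                unmatched_close.append(index)

    return sorted(unmatched_close + open_positions)


query_string = '("digital health"[Title/Abstract]) AND ("privacy"[Title/Abstract]'
positions = find_unbalanced_parentheses(query_string)
print(positions)
```

An empty list means the parentheses are balanced. Any returned index can be used to point at the offending character, much like the marker in the linter output above.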

Once we have created a query object, we can translate it for different databases. Note how the syntax is translated and how the search for Title/Abstract is split into two elements:

from search_query.parser import parse

query_string = '("digital health"[Title/Abstract]) AND ("privacy"[Title/Abstract])'
pubmed_query = parse(query_string, platform="pubmed")
wos_query = pubmed_query.translate(target_syntax="wos")
print(wos_query.to_string())
# Output:
# (AB="digital health" OR TI="digital health") AND (AB="privacy" OR TI="privacy")
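
The split of Title/Abstract into two elements can be pictured as a field expansion. The standalone sketch below reproduces the output above for this one case. It is an illustration under stated assumptions, not the package's translator: `FIELD_EXPANSION` and `expand_term` are hypothetical names, and the mapping covers only the single field shown in the example.

```python
# Hypothetical mapping for illustration: one PubMed field tag expands
# into two Web of Science field prefixes.
FIELD_EXPANSION = {"[Title/Abstract]": ["AB=", "TI="]}


def expand_term(term, pubmed_field):
    """Render one term for each target prefix and join the results with OR."""
    prefixes = FIELD_EXPANSION[pubmed_field]
    parts = [f'{prefix}"{term}"' for prefix in prefixes]
    if len(parts) == 1:
        return parts[0]
    return "(" + " OR ".join(parts) + ")"


terms = ["digital health", "privacy"]
translated = " AND ".join(expand_term(term, "[Title/Abstract]") for term in terms)
print(translated)
# Output:
# (AB="digital health" OR TI="digital health") AND (AB="privacy" OR TI="privacy")
```

The expansion is needed because the target syntax in the example has no single field that covers both title and abstract, so one source element becomes an OR of two.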

For a more detailed overview of the package’s functionality, see the documentation.

Demo

A Jupyter Notebook demo is available on Binder.

Encounter a problem?

If you find a bug or run into any issues while using the package, please open an issue or contact one of the developers.

How to cite

Eckhardt, P., Ernst, K., Fleischmann, T., Geßler, A., Schnickmann, K., Thurner, L., and Wagner, G. "search-query: An Open-Source Python Library for Academic Search Queries".

The package was developed as part of the following Bachelor's theses:

  • Fleischmann, T. (2025). Advances in literature search queries: Validation and translation of search strings for EBSCOHost. Otto-Friedrich-University of Bamberg.
  • Geßler, A. (2025). Design of an Emulator for API-based Academic Literature Searches. Otto-Friedrich-University of Bamberg.
  • Schnickmann, K. (2025). Validating and Parsing Academic Search Queries: A Design Science Approach. Otto-Friedrich-University of Bamberg.
  • Eckhardt, P. (2025). Advances in literature searches: Evaluation, analysis, and improvement of Web of Science queries. Otto-Friedrich-University of Bamberg.
  • Ernst, K. (2024). Towards more efficient literature search: Design of an open source query translator. Otto-Friedrich-University of Bamberg.

Not what you are looking for?

This Python package was developed for integration into other literature management tools. If that isn't your use case, you may want to look at these related tools:

License

This project is distributed under the MIT License.
