Add python keyword search quickstart #2106
Merged
183e888
Add new python keyword search quickstart
lcawl 657c146
Merge branch 'main' into lcawl/gs-python
lcawl 43e4944
Update solutions/search/get-started/keyword-search-python.md
lcawl 6b4a507
Update solutions/search/get-started/keyword-search-python.md
lcawl ee2b737
Update solutions/search/get-started/keyword-search-python.md
lcawl d828108
Update solutions/search/get-started/keyword-search-python.md
lcawl e679259
Update solutions/search/get-started/keyword-search-python.md
lcawl 3ebeca2
Update solutions/search/get-started/keyword-search-python.md
lcawl d801c53
Merge branch 'main' into lcawl/gs-python
lcawl c547fab
Address feedback about introduction
lcawl 55f0cb0
Update solutions/search/get-started/keyword-search-python.md
lcawl 1761a00
Update solutions/search/get-started/keyword-search-python.md
lcawl 2160992
Update solutions/search/get-started/keyword-search-python.md
lcawl 9014c1f
Update solutions/search/get-started/keyword-search-python.md
lcawl 26261e5
Add project creation step and python prereqs
lcawl d43d8af
More edits
lcawl 56d3454
Improve query and next steps
lcawl 1a19a32
Update solutions/search/get-started/keyword-search-python.md
lcawl a5bf9fb
Update solutions/search/get-started/keyword-search-python.md
lcawl 88aacb9
Merge branch 'main' into lcawl/gs-python
lcawl 77795d4
Fix typo
lcawl 3469afd
Add stepper component
lcawl 3c85f7f
Merge branch 'main' into lcawl/gs-python
lcawl 0559ddc
Merge branch 'main' into lcawl/gs-python
lcawl
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,177 @@ | ||
--- | ||
description: An introduction to building an Elasticsearch query in Python. | ||
applies_to: | ||
serverless: all | ||
products: | ||
- id: elasticsearch | ||
--- | ||
# Build your first search query with Python | ||
|
||
{{es}} provides a range of search techniques, starting with BM25, the industry standard for textual search. | ||
It provides official clients for multiple programming languages, including Python, Rust, Java, JavaScript, and others. | ||
These clients offer full API support for indexing, searching, and cluster management. | ||
They are optimized for performance and kept up to date with {{es}} releases, ensuring compatibility and security. | ||
|
||
|
||
In this quickstart guide, you will index three documents and query them using Python.
|
||
By the end of this guide, you’ll have learned how to connect a backend application to {{es}} to answer your queries. | ||
|
||
|
||
|
||
## Prerequisites | ||
|
||
- If you're using [{{es-serverless}}](/solutions/search/serverless-elasticsearch-get-started.md), create a general purpose project. To add the sample data, you must have a `developer` or `admin` predefined role or an equivalent custom role. | ||
<!-- | ||
|
||
If you're using [{{ech}}](/deploy-manage/deploy/elastic-cloud/cloud-hosted.md) or [running {{es}} locally](/solutions/search/run-elasticsearch-locally.md), start {{es}} and {{kib}}. To add the sample data, log in with a user that has the `superuser` built-in role. | ||
--> | ||
|
||
To learn about role-based access control, check out [](/deploy-manage/users-roles/cluster-or-deployment-auth/user-roles.md). | ||
|
||
|
||
## Create an index | ||
|
||
An index is a collection of documents uniquely identified by a name or an alias. | ||
Go to **{{es}} > Home**, select keyword search, and follow the guided index workflow. | ||
|
||
<!-- | ||
Click **Create a TBD index**. | ||
- If you're using {{es-serverless}}... | ||
- If you're using {{ech}} or running {{es}} locally, go to **{{es}} > Home** and click **Create API index**. Select the semantic search workflow. | ||
--> | ||
|
||
You've created your first index! | ||
Next, create an API key so your application can talk to {{es}}. | ||
<!-- | ||
TBD: Describe how to create the key | ||
--> | ||
:::{tip} | ||
For an introduction to the concept of indices, check out [](/manage-data/data-store/index-basics.md). | ||
|
||
::: | ||
|
||
## Install an {{es}} client | ||
|
||
|
||
Select your preferred language in the keyword search workflow. This example uses Python.
|
||
|
||
 | ||
|
||
In your terminal, install the {{es}} client using `pip`: | ||
|
||
```sh
pip install elasticsearch
```
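To confirm that the install worked, a small check like the following can report the installed client version. This is a sketch that relies only on the standard library; the helper name `installed_version` is illustrative and not part of the client.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("elasticsearch")
    print(found or "elasticsearch is not installed; run: pip install elasticsearch")
```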
|
||
Copy your API key from the top right corner and add it to the client’s configuration alongside the project URL. | ||
|
||
```py
from elasticsearch import Elasticsearch

client = Elasticsearch(
    "https://my-project-bff300.es.us-east-1.aws.elastic.cloud:443",
    api_key="YOUR-API-KEY"
)

index_name = "my-index"
```
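Hard-coding the API key is fine for a first test, but it is easy to leak into version control. One common alternative is to read the connection settings from environment variables. The sketch below assumes two variable names, `ES_URL` and `ES_API_KEY`; these are illustrative choices for this example, not names the client looks up on its own.

```python
import os


def load_settings(env=None):
    """Read the project URL and API key from environment variables.

    ES_URL and ES_API_KEY are hypothetical names chosen for this example.
    """
    env = os.environ if env is None else env
    missing = [name for name in ("ES_URL", "ES_API_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {"hosts": env["ES_URL"], "api_key": env["ES_API_KEY"]}


def make_client(env=None):
    """Build the client; the import is deferred so load_settings works without it."""
    from elasticsearch import Elasticsearch

    settings = load_settings(env)
    return Elasticsearch(settings["hosts"], api_key=settings["api_key"])
```

Calling `make_client().info()` afterwards is a quick way to confirm that the URL and key are accepted.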
|
||
## Create field mappings | ||
|
||
Next, define the mappings for your index. This example defines a single field of type `text`, named `text`.
|
||
|
||
```py
mappings = {
    "properties": {
        "text": {
            "type": "text"
        }
    }
}

mapping_response = client.indices.put_mapping(index=index_name, body=mappings)
print(mapping_response)
```
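If the script runs more than once, or the index was not created through the guided workflow, it can help to check for the index first. The following is a minimal sketch, assuming a client object that exposes `indices.exists`, `indices.create`, and `indices.put_mapping` as the official Python client does. The function name `ensure_text_mapping` is made up for this example.

```python
def ensure_text_mapping(client, index_name, field="text"):
    """Create the index with a text field, or add the field to an existing index.

    Returns "created" or "updated" so the caller can log what happened.
    """
    properties = {field: {"type": "text"}}
    if client.indices.exists(index=index_name):
        # The index is already there: add (or confirm) the field mapping.
        client.indices.put_mapping(index=index_name, properties=properties)
        return "updated"
    # No index yet: create it with the mapping in one call.
    client.indices.create(index=index_name, mappings={"properties": properties})
    return "created"
```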
|
||
## Add documents | ||
|
||
Next, use a bulk request to index three documents in {{es}}. | ||
Bulk requests are the preferred method for indexing large volumes of data, from hundreds to billions of documents. | ||
|
||
```py
from elasticsearch import helpers

docs = [
    {
        "text": "Yellowstone National Park is one of the largest national parks in the United States. It ranges from Wyoming to Montana and Idaho, and contains an area of 2,219,791 acres across three different states. It's most famous for hosting the geyser Old Faithful and is centered on the Yellowstone Caldera, the largest supervolcano on the American continent. Yellowstone is host to hundreds of species of animals, many of which are endangered or threatened. Most notably, it contains free-ranging herds of bison and elk, alongside bears, cougars and wolves. The national park receives over 4.5 million visitors annually and is a UNESCO World Heritage Site."
    },
    {
        "text": "Yosemite National Park is a United States National Park, covering over 750,000 acres of land in California. A UNESCO World Heritage Site, the park is best known for its granite cliffs, waterfalls and giant sequoia trees. Yosemite hosts over four million visitors in most years, with a peak of five million visitors in 2016. The park is home to a diverse range of wildlife, including mule deer, black bears, and the endangered Sierra Nevada bighorn sheep. The park has 1,200 square miles of wilderness, and is a popular destination for rock climbers, with over 3,000 feet of vertical granite to climb. Its most famous and cliff is the El Capitan, a 3,000 feet monolith along its tallest face."
    },
    {
        "text": "Rocky Mountain National Park is one of the most popular national parks in the United States. It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra. The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site."
    }
]

# helpers.bulk returns a tuple: (number of successful actions, list of errors)
bulk_response = helpers.bulk(client, docs, index=index_name)
print(bulk_response)
```
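`helpers.bulk` also accepts any iterable of actions, which is how larger data sets are usually streamed. As a sketch, the generator below wraps each document in an action with an explicit `_index` and `_id`. The ID scheme (`park-1`, `park-2`, and so on) and the function names are assumptions made for this example; without an `_id`, {{es}} generates one.

```python
def to_actions(docs, index_name, id_prefix="park"):
    """Yield one bulk action per document, with a predictable _id."""
    for number, doc in enumerate(docs, start=1):
        yield {
            "_index": index_name,
            "_id": f"{id_prefix}-{number}",
            "_source": doc,
        }


def index_docs(client, docs, index_name):
    """Send the actions with helpers.bulk and return (success_count, errors)."""
    from elasticsearch import helpers  # deferred so to_actions has no dependency

    return helpers.bulk(client, to_actions(docs, index_name))
```

With fixed IDs, running the script again overwrites the same three documents instead of adding duplicates.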
|
||
## Explore the data | ||
|
||
You should be able to see the documents in {{es}}! | ||
|
||
 | ||
<!-- | ||
To familiarize yourself with this data set, open [Discover](/explore-analyze/discover.md) from the navigation menu or the global search field. | ||
--> | ||
|
||
## Test keyword search | ||
|
||
Create a new script (for example, `search.py`) that defines the following ES|QL query and runs it as a search request:
|
||
|
||
```esql | ||
FROM my-index | ||
| WHERE MATCH(text, "yosemite") | ||
| LIMIT 5 | ||
``` | ||
|
||
Add this query inside `client.esql.query`: | ||
|
||
```py
from elasticsearch import Elasticsearch

client = Elasticsearch(
    "https://my-project-bff307.es.us-east-1.aws.elastic.cloud:443",
    api_key="YOUR-API-KEY"
)

# Run the search query
response = client.esql.query(
    query="""
        FROM my-index
        | WHERE MATCH(text, "yosemite")
        | LIMIT 5
    """,
    format="csv"
)

print(response)
```
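When the search term comes from a user rather than being typed into the script, quotes in the input can break the query string. The helper below is an illustrative sketch, not part of the client: it escapes backslashes and double quotes before placing the term in the `MATCH` call. The function name and its parameters are assumptions for this example.

```python
def build_match_query(index, field, term, limit=5):
    """Build the ES|QL keyword query used above for an arbitrary search term."""
    # Escape backslashes first, then double quotes, so the term stays one string literal.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"FROM {index}\n"
        f'| WHERE MATCH({field}, "{escaped}")\n'
        f"| LIMIT {int(limit)}"
    )
```

The returned string can be passed as the `query` argument of `client.esql.query`.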
|
||
## Analyze the results | ||
|
||
Check your result: | ||
|
||
```txt
"Yosemite National Park is a United States National Park, covering over 750,000 acres of land in California. A UNESCO World Heritage Site, the park is best known for its granite cliffs, waterfalls and giant sequoia trees. Yosemite hosts over four million visitors in most years, with a peak of five million visitors in 2016. The park is home to a diverse range of wildlife, including mule deer, black bears, and the endangered Sierra Nevada bighorn sheep. The park has 1,200 square miles of wilderness, and is a popular destination for rock climbers, with over 3,000 feet of vertical granite to climb. Its most famous and cliff is the El Capitan, a 3,000 feet monolith along its tallest face."
```

You are now ready to use the client to query {{es}} from any Python backend, such as Flask or Django. To explore further, check out the Elasticsearch Python client documentation.
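Because the search request asks for `format="csv"`, the result is CSV text, typically with a header row naming each column. A sketch of turning that text into Python dictionaries with the standard `csv` module follows. It assumes the CSV is already available as a string (for example, taken from the response body), and the name `rows_from_csv` is hypothetical. The sample string is shortened for illustration.

```python
import csv
import io


def rows_from_csv(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Shortened stand-in for the CSV text returned by the query.
sample = 'text\r\n"Yosemite National Park is a United States National Park, covering over 750,000 acres."\r\n'
for row in rows_from_csv(sample):
    print(row["text"][:40])
```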
|
||
<!-- | ||
When you finish your tests and no longer need the sample data set, delete the index: | ||
|
||
```console | ||
DELETE /my-index
``` | ||
--> | ||
|
||
## Next steps | ||
|
||
Thanks for taking the time to learn how to build an application on top of {{es}}. | ||
|
||
|
||
For a deeper dive, check out the following resources: | ||
|
||
|
||
- [Getting started with the Python client](elasticsearch-py://reference/getting-started.md) | ||
- [Python notebooks](https://github.com/elastic/elasticsearch-labs/tree/main/notebooks/README.md)
- [](/manage-data/ingest/ingesting-data-from-applications/ingest-data-with-python-on-elasticsearch-service.md) |