- Elasticsearch
- Logstash
- Beats
- Kibana
Elasticsearch is a highly scalable open-source full-text search and analytics engine. It allows you to store, search, and analyze large volumes of data quickly and in near real time. It is generally used as the underlying engine that powers applications with complex search features and requirements.
Installation
Installing Elasticsearch on an Ubuntu machine:
$ wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.3.0.deb
$ dpkg -i elasticsearch-6.3.0.deb
Configuration
After installation, edit the main configuration file,
$ vim /etc/elasticsearch/elasticsearch.yml
"""
cluster.name: <name>
node.name: <node-name>
network.host: <ip of machine>
"""
Next, raise one kernel-level virtual memory setting (it is lost on reboot unless it is also added to /etc/sysctl.conf),
$ sysctl -w vm.max_map_count=262144
Now, restart Elasticsearch
$ service elasticsearch restart
By default, Elasticsearch listens for HTTP requests on port 9200 (replace localhost with the network.host address if you changed it)
$ curl http://localhost:9200
Cluster
- Collection of nodes
- Identified by a unique name (cluster.name)
Node
Master Node
- Lightweight node
- Responsible for cluster management
- Ensures the cluster is stable
- It is not recommended to send index or search requests to this node
Data Node
- Responsible for storing the actual data
- Participates in index processing
Client Node
- Acts as a load balancer for processing requests
- Used to perform scatter/gather-based operations such as search
- Neither stores data nor participates in cluster management
- Frees data nodes to focus on the heavy work of searching
Document
- JSON document stored in Elasticsearch
- Equivalent to a row in a relational database
Index
- Collection of documents that have similar characteristics
- Equivalent to a database instance in the relational database world
- Has mappings, which define its types (an index created in 6.x can contain only one type)
- Logical namespace that maps to one or more primary shards
- Can have zero or more replicas
- Supports time series indexing
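Time series indexing usually means writing to one index per time period, so that old data can be dropped by deleting whole indices. A minimal sketch of the naming scheme, assuming a hypothetical "logs" prefix and the YYYY.MM.DD suffix that Logstash uses by default:

```python
from datetime import date, timedelta


def daily_index_name(prefix: str, day: date) -> str:
    """Return an index name such as 'logs-2018.06.25' for the given day."""
    return f"{prefix}-{day:%Y.%m.%d}"


def indices_for_range(prefix: str, start: date, end: date) -> list:
    """List the daily index names covering start..end (inclusive)."""
    days = (end - start).days
    return [daily_index_name(prefix, start + timedelta(days=i)) for i in range(days + 1)]


print(daily_index_name("logs", date(2018, 6, 25)))
print(indices_for_range("logs", date(2018, 6, 29), date(2018, 7, 1)))
```

A search over a date range then only needs to target the indices returned by indices_for_range.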
Type
- Equivalent to a table in a relational database
- Each type has a list of fields
- Mapping defines how each field is analyzed
ID
- Unique identifier of a document
- The combination of index, type, and id must be unique so that a document can be identified deterministically
- The user can specify the id; if none is provided, Elasticsearch generates one automatically
Shards
- Logical unit to store data
- Each document is stored in a single primary shard
- By default, each index has 5 primary shards
- The number of primary shards cannot be changed after index creation
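The fixed shard count follows from how documents are routed: the target shard is derived from a hash of the routing value (the document id by default) modulo the number of primary shards. A rough sketch of the idea; Elasticsearch uses a Murmur3 hash internally, and zlib.crc32 here is only a stand-in, so the shard numbers will not match a real cluster. The document ids are made up for illustration:

```python
import zlib


def pick_shard(routing: str, num_primary_shards: int) -> int:
    """Map a routing value to a shard number (stand-in hash, not Murmur3)."""
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


# Hypothetical document ids, routed once with 5 shards and once with 6.
doc_ids = ["123456", "109487", "58559", "1924"]
with_5 = {doc_id: pick_shard(doc_id, 5) for doc_id in doc_ids}
with_6 = {doc_id: pick_shard(doc_id, 6) for doc_id in doc_ids}
print(with_5)
print(with_6)
```

Changing the divisor generally sends the same id to a different shard, so documents indexed under the old count could no longer be located by id.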
Replicas
- Each index can have 0 or more replicas
- Help with failover and read performance
- A replica shard is never allocated on the same node as its primary shard
- The number of replicas can be changed after index creation
Mapping
- Equivalent to schema in relational database.
- Each index has mappings, which define each type within the index
- Can be defined explicitly; otherwise it is generated automatically
Analysis
- Process of converting full text to terms
- Text is broken into terms according to the analyzer in use
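To make the analysis step concrete, here is a toy analyzer that lowercases the text and splits it on anything that is not a letter or digit. It is only a sketch: the real standard analyzer uses Unicode text segmentation, and the function name simple_analyze is made up for this example.

```python
import re


def simple_analyze(text: str) -> list:
    """Lowercase the text and split it into terms on non-alphanumeric characters."""
    return [term for term in re.split(r"[^0-9a-z]+", text.lower()) if term]


print(simple_analyze("Interstellar By Christopher Nolan"))
print(simple_analyze("SCI-FI"))
```

The terms produced at index time are what a match query is compared against, which is why a search for "interstellar" finds a title stored as "Interstellar".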
Snapshot and Restore
- Used for index backup and restore
- Supports full or incremental index backup
Create the movies index with an explicit mapping
$ curl -XPUT 172.28.128.3:9200/movies -H 'Content-Type: application/json' -d '
{
"mappings" : {
"movie" : {
"_all" : { "enabled" : false },
"properties" : {
"year" : {
"type" : "date"
}
}
}
}
}'
To get the mappings
$ curl -XGET '172.28.128.3:9200/movies/_mapping/movie?pretty'
Import one movie via curl
$ curl -XPUT 172.28.128.3:9200/movies/movie/123456 -H 'Content-Type: application/json' -d '
{
"genre" : "Sci-Fi",
"title" : "Interstellar",
"year" : 2014
}'
To import many movies at once,
$ wget http://media.sundog-soft.com/es/movies.json
$ curl 'http://172.28.128.3:9200/_bulk?pretty' -H 'Content-Type: application/json' --data-binary @movies.json
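The _bulk endpoint expects newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end of the body. A small sketch that builds such a body; the helper name to_bulk_body and the sample documents are assumptions for illustration, not the contents of movies.json:

```python
import json


def to_bulk_body(index: str, doc_type: str, docs: dict) -> str:
    """Build a _bulk request body: an action line plus a source line per document."""
    lines = []
    for doc_id, source in docs.items():
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical sample data.
movies = {
    "123456": {"title": "Interstellar", "year": 2014},
    "135569": {"title": "Star Trek Beyond", "year": 2016},
}
body = to_bulk_body("movies", "movie", movies)
print(body, end="")
```

Saving the result to a file and sending it with --data-binary keeps the newlines intact, which plain -d would strip.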
To update the movie,
$ curl -XGET '172.28.128.3:9200/movies/movie/123456?pretty'
$ curl -XPOST 172.28.128.3:9200/movies/movie/123456/_update -H 'Content-Type: application/json' -d '
{
"doc" : {
"title" : "Interstellar By Christopher Nolan"
}
}'
To delete the movie,
$ curl -XDELETE 172.28.128.3:9200/movies/movie/123456
Use the retry_on_conflict parameter on update requests (for example, _update?retry_on_conflict=5) to resolve concurrency conflicts.
Elasticsearch Queries
- Match Query
$ curl -s -XGET http://172.28.128.3:9200/movies/_search -H 'Content-Type: application/json' -d '
{
"query" : {
"match" : {
"title" : "interstellar"
}
}
}' | jq '.'
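- Term (exact match on the unanalyzed value; genre.keyword is the keyword sub-field created by dynamic mapping)
$ curl -s -XGET http://172.28.128.3:9200/movies/_search -H 'Content-Type: application/json' -d '
{
"query" : {
"term" : {
"genre.keyword" : "Sci-Fi"
}
}
}' | jq '.'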
- Range
$ curl -s -XGET http://172.28.128.3:9200/movies/_search -H 'Content-Type: application/json' -d '
{
"query" : {
"range" : {
"year" : { "gte" : 1990, "lt": 2015 }
}
}
}' | jq '.'
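- Match_phrase
- slop: the number of positions the terms of the phrase may be moved apart and still count as a match
- Example: with a slop of 1, the query "quick fox" matches "quick brown fox"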
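A match_phrase request body with slop can be sketched as follows. The helper name match_phrase_query is made up for this example, and the field and phrase are placeholders; the printed JSON can be passed to curl with -d.

```python
import json


def match_phrase_query(field: str, phrase: str, slop: int = 0) -> dict:
    """Build a match_phrase request body; slop=0 requires the exact phrase."""
    return {"query": {"match_phrase": {field: {"query": phrase, "slop": slop}}}}


phrase_body = match_phrase_query("title", "quick fox", slop=1)
print(json.dumps(phrase_body, indent=2))
```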
- Pagination
$ curl '<url>/movies/movie/_search?size=2&from=2&pretty'
$ curl -XGET http://172.28.128.3:9200/movies/movie/_search -H 'Content-Type: application/json' -d '
{
"size" : 2,
"from" : 2,
"query" : { "match" : { "genre" : "SCI-FI" } }
}' | jq '.'
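The from and size values are easy to get off by one when they are derived from a page number. A small helper, assuming 1-based page numbers (the name page_params is hypothetical):

```python
def page_params(page: int, page_size: int) -> dict:
    """Translate a 1-based page number into the from/size pair used by _search."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {"from": (page - 1) * page_size, "size": page_size}


print(page_params(2, 2))  # the second page of two results, as in the request above
```

Deep pages are expensive, and by default from + size may not exceed 10,000 (the index.max_result_window setting).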
- Sorting
$ curl -XGET http://172.28.128.3:9200/movies/movie/_search -H 'Content-Type: application/json' -d '
{
"query": {
"match" : {
"genre" : "SCI-FI"
}
},
"sort": { "year" : { "order" : "desc" } }
}' | jq '.'
- Filter Query
- must (AND)
- must_not (NOT)
- should (OR)
- filter (like must, but does not affect the relevance score)
Note: the bool query clause is named "filter"; "filters" is rejected with "query does not support [filters]".
$ curl 'http://172.28.128.3:9200/movies/_search?pretty' -H 'Content-Type: application/json' -d '
{
"query" : {
"bool" : {
"must" : { "match" : { "genre" : "Sci-Fi" }},
"must_not" : { "match" : { "title" : "trek" }},
"filter" : { "range" : { "year" : { "gte" : 2010, "lt" : 2015 }}}
}
}
}'
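The bool clauses listed above can be combined in one request body; filter clauses restrict the results without contributing to the relevance score. A sketch that assembles such a body and leaves out unused clauses (bool_query is a made-up helper name):

```python
import json


def bool_query(must=None, must_not=None, should=None, filter_=None) -> dict:
    """Assemble a bool query body, omitting any clause that was not supplied."""
    clauses = {"must": must, "must_not": must_not, "should": should, "filter": filter_}
    return {"query": {"bool": {name: value for name, value in clauses.items() if value}}}


bool_body = bool_query(
    must=[{"match": {"genre": "Sci-Fi"}}],
    must_not=[{"match": {"title": "trek"}}],
    filter_=[{"range": {"year": {"gte": 2010, "lt": 2015}}}],
)
print(json.dumps(bool_body, indent=2))
```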
- Fuzziness
$ curl 'http://172.28.128.3:9200/movies/_search?pretty' -H 'Content-Type: application/json' -d '
{
"query" : {
"fuzzy" : {
"title" : { "value" : "intrstellar", "fuzziness" : 2 }
}
}
}'
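The fuzziness value is a maximum edit distance between the query term and an indexed term. The sketch below computes the plain Levenshtein distance to show why "intrstellar" is within reach of "interstellar". It is a simplification: by default Elasticsearch also counts a swap of two adjacent characters as a single edit, which this version does not.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


print(edit_distance("intrstellar", "interstellar"))  # 1 (one missing "e")
```

A distance of 1 is within the fuzziness of 2 used in the request above, so the misspelled term still matches.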
- Prefix and wildcard matching
- Ngram
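As a rough illustration of what n-gram analysis produces, the sketch below splits a term into fixed-size n-grams and into edge n-grams (prefixes), the variant commonly used for search-as-you-type. The function names are made up for this example; in Elasticsearch the splitting is done by an n-gram tokenizer or token filter configured in the index settings.

```python
def ngrams(term: str, n: int) -> list:
    """Return every contiguous substring of length n."""
    return [term[i:i + n] for i in range(len(term) - n + 1)]


def edge_ngrams(term: str, min_n: int, max_n: int) -> list:
    """Return the prefixes of the term with lengths min_n..max_n."""
    return [term[:n] for n in range(min_n, min(max_n, len(term)) + 1)]


print(ngrams("star", 2))          # ['st', 'ta', 'ar']
print(edge_ngrams("star", 1, 3))  # ['s', 'st', 'sta']
```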