Why Elasticsearch?
● You can query your data any way you want.
● It lets you analyze billions of records.
Document
A document is the basic unit of information.
It is expressed in JSON.
Each document has multiple fields.
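As a minimal sketch of this idea, a document can be modelled in Python as a dictionary and serialized to JSON with the standard library. The field names are borrowed from the employee example used later in this deck; nothing here talks to a server.

```python
import json

# A document is a set of named fields; a plain dict stands in for it here.
document = {
    "first_name": "Kiran",
    "last_name": "Kumar",
    "age": 22,
    "interests": ["sports", "music"],
}

# Serialize to the JSON text that would be sent to Elasticsearch.
as_json = json.dumps(document, sort_keys=True)
print(as_json)

# Parsing the JSON text gives back the same fields.
round_trip = json.loads(as_json)
print(sorted(round_trip.keys()))
```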
Type
A type is defined for documents that have a
common set of fields.
It is a logical partition of an index.
Each type has multiple documents.
Index
An index is a collection of documents having
similar characteristics.
Each index has multiple types.
Node
A node is a single instance of the Elasticsearch
server that stores data.
Each node has multiple indices.
Cluster
A cluster is a collection of one or more nodes that
work together.
This distributed nature makes it easy to handle
data that is too large for a single node.
Shard
Elasticsearch allows you to subdivide your
index into multiple pieces which are called
shards.
Each shard is a fully-functional and
independent “index” which can be hosted on
any node within the cluster.
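To illustrate how documents could be spread over shards, the sketch below picks a shard as a hash of the document id modulo the number of primary shards. Assumptions: `zlib.crc32` is only a stand-in for whatever hash Elasticsearch uses internally, and `pick_shard` is a hypothetical helper, not part of any client library.

```python
import zlib


def pick_shard(doc_id, number_of_shards):
    """Map a document id to a shard number in the range [0, number_of_shards)."""
    # crc32 is a stand-in hash chosen for this sketch only.
    return zlib.crc32(str(doc_id).encode("utf-8")) % number_of_shards


number_of_shards = 5
for doc_id in (1, 2, 3):
    print(doc_id, "-> shard", pick_shard(doc_id, number_of_shards))
```

Because the shard is derived from the id alone, the same id always lands on the same shard, which is what lets a later lookup go straight to the right place.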
Replica
Elasticsearch provides replicas.
A replica is an additional copy of a shard
and can serve queries just like the original
shard.
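The relationship between primary shards and replicas comes down to simple arithmetic. The sketch below assumes the index settings `number_of_shards` and `number_of_replicas`; `total_shard_copies` is a hypothetical helper written for this example.

```python
def total_shard_copies(number_of_shards, number_of_replicas):
    """Count primaries plus replicas: every primary has `number_of_replicas` extra copies."""
    return number_of_shards * (1 + number_of_replicas)


# 5 primary shards with 1 replica each give 10 shard copies in total.
print(total_shard_copies(5, 1))
```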
API
The Elasticsearch API is a RESTful HTTP API (GET, PUT, POST, DELETE) that uses
JSON as the data exchange format.
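A rough sketch of what such a request looks like, built with Python's standard library and deliberately not sent. Assumptions: the server address `http://localhost:9200`, the `/playlist/kpop/1` path, and the `title` field are illustrative, and `build_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_request(method, path, body=None, base_url="http://localhost:9200"):
    """Build (but do not send) an HTTP request with an optional JSON body."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(base_url + path, data=data, headers=headers, method=method)


req = build_request("PUT", "/playlist/kpop/1", {"title": "example song"})
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would actually send it to a running server.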
Data Storage Mechanism in Elasticsearch
The act of storing data in Elasticsearch is called indexing. An Elasticsearch cluster can contain
multiple indices, which in turn contain multiple types. These types hold multiple documents, and
each document has multiple fields.
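The nesting described above can be pictured as nested dictionaries. This is only a mental model under that assumption, not how Elasticsearch stores data on disk; the index and type names are borrowed from the Python example later in the deck.

```python
# cluster -> index -> type -> document id -> fields
cluster = {
    "megacorp": {  # index
        "employee": {  # type
            "1": {"first_name": "Kiran", "age": 22},  # document and its fields
            "2": {"first_name": "Jane", "age": 32},
        }
    }
}


def count_documents(cluster):
    """Count documents across all indices and types."""
    return sum(len(docs) for types in cluster.values() for docs in types.values())


print(count_documents(cluster))
```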
Document API
1. SINGLE DOCUMENT API
● Index API ( PUT /playlist/kpop/1), in general PUT /index/type/id
● Get API ( GET /playlist/kpop/1)
● Update API ( POST /playlist/kpop/1/_update); a PUT to /playlist/kpop/1 replaces the whole document
● Delete API ( DELETE /playlist/kpop/1)
2. MULTI-DOCUMENT API
● Multi Get API
● Bulk API
● Delete By Query API
● Update By Query API
● Reindex API
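As an illustration of the multi-document style, the sketch below builds the newline-delimited JSON body that a Bulk API request carries: one action line followed by one source line per document. Assumptions: the `playlist` index, `kpop` type, and `title` field are illustrative, and `bulk_index_body` is a hypothetical helper; the body is only built, not sent.

```python
import json


def bulk_index_body(index, doc_type, docs):
    """Build a newline-delimited JSON body: one action line and one source line per document."""
    lines = []
    for doc_id, source in docs.items():
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The body must end with a newline.
    return "\n".join(lines) + "\n"


body = bulk_index_body(
    "playlist",
    "kpop",
    {"1": {"title": "first song"}, "2": {"title": "second song"}},
)
print(body)
```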
Search API
A URI search accepts various parameters, passed as part of the Uniform Resource
Identifier (URI) of the search request.
Parameter    Description
lenient      Set to true to ignore format-based errors (for example, text supplied
             for a numeric field)
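A small sketch of how such a search URI could be assembled. It uses the `q` query-string parameter together with `lenient`; the `megacorp` index and the `last_name:Smith` query are illustrative, and `uri_search_path` is a hypothetical helper.

```python
from urllib.parse import urlencode


def uri_search_path(index, query, lenient=False):
    """Build the path and query string for a URI search on one index."""
    params = {"q": query}
    if lenient:
        params["lenient"] = "true"
    return "/{}/_search?{}".format(index, urlencode(params))


print(uri_search_path("megacorp", "last_name:Smith", lenient=True))
```

`urlencode` percent-encodes reserved characters such as the colon, so the query survives being placed in a URL.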
Installation on Windows
Step2: Go to https://www.elastic.co/downloads
Step3: Click Download to get the zip file
Step4: Once the file is downloaded, unzip it to extract the contents
Step5: Go to elasticsearch-x.y.z>bin
Step6: Inside the bin folder, find the elasticsearch.bat file and double-click it to start the Elasticsearch
server
Step8: Open a browser and go to localhost:9200 to check whether the server is running
Step9: If the browser shows a JSON response from the server, everything is fine.
Step10: Finally, add the Sense (beta) plugin, which acts as a developer's
interface to Elasticsearch
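The browser check can also be scripted. The sketch below assumes the server listens on `http://localhost:9200` and answers with JSON; `server_info` is a hypothetical helper that returns `None` instead of raising when nothing is reachable.

```python
import http.client
import json
import urllib.request


def server_info(url="http://localhost:9200", timeout=2):
    """Return the parsed JSON from the server root, or None if it cannot be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # OSError covers refused connections and timeouts,
        # ValueError covers replies that are not valid JSON.
        return None


info = server_info()
print("running" if info is not None else "not reachable")
```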
In Python Code
Defining a document
e1={
"first_name":"Kiran",
"last_name":"Kumar",
"age": 22,
"about": "Love to play volleyball",
"interests": ['sports','music'],
}
print(e1)
{'interests': ['sports', 'music'], 'about': 'Love to play
volleyball', 'first_name': 'Kiran', 'last_name': 'Kumar', 'age':
22}
Inserting a document
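from elasticsearch import Elasticsearch
# Assumes a server on the default localhost:9200
es = Elasticsearch()
res = es.index(index='megacorp', doc_type='employee', id=1, body=e1)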
e2={
"first_name" : "Jane",
"last_name" : "Smith",
"age" : 32,
"about" : "I like to collect rock albums",
"interests": [ "music" ]
}
e3={
"first_name" : "Douglas",
"last_name" : "Fir",
"age" : 35,
"about": "I like to build cabinets",
"interests": [ "forestry" ]
}
# 'created' is False when a document with that id already existed and was overwritten
res = es.index(index='megacorp', doc_type='employee', id=2, body=e2)
print(res['created'])
res = es.index(index='megacorp', doc_type='employee', id=3, body=e3)
print(res['created'])
False
True
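One plausible reading of the `False` in the output above is that a document with that id had already been indexed, so the call overwrote it rather than creating it. The toy class below mimics that behaviour in memory; `ToyIndex` is a hypothetical stand-in written for this sketch, not the Elasticsearch client.

```python
class ToyIndex:
    """In-memory stand-in that mimics the 'created' flag of an index call."""

    def __init__(self):
        self._docs = {}

    def index(self, doc_id, body):
        # A document is "created" only the first time its id is seen.
        created = doc_id not in self._docs
        version = 1 if created else self._docs[doc_id]["_version"] + 1
        self._docs[doc_id] = {"_source": body, "_version": version}
        return {"_id": doc_id, "_version": version, "created": created}


toy = ToyIndex()
first = toy.index(2, {"first_name": "Jane"})
second = toy.index(2, {"first_name": "Jane", "age": 32})
print(first["created"], second["created"])  # True False
```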
Retrieving a document
res = es.get(index='megacorp', doc_type='employee', id=3)
print(res)
{u'_type': u'employee', u'_source': {u'interests': [u'forestry'],
u'age': 35, u'about': u'I like to build cabinets', u'last_name':
u'Fir', u'first_name': u'Douglas'}, u'_index': u'megacorp',
u'_version': 1, u'found': True, u'_id': u'3'}
print(res['_source'])
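The response is an ordinary dictionary, so the stored fields can be read from `_source` once `found` has been checked. The sketch below reuses the response printed above as a literal, so it runs without a server; `full_name` is a hypothetical helper.

```python
# Response copied from the es.get output shown above.
res = {
    "_index": "megacorp",
    "_type": "employee",
    "_id": "3",
    "_version": 1,
    "found": True,
    "_source": {
        "first_name": "Douglas",
        "last_name": "Fir",
        "age": 35,
        "about": "I like to build cabinets",
        "interests": ["forestry"],
    },
}


def full_name(response):
    """Return 'first last' from a get response, or None if the document was not found."""
    if not response.get("found"):
        return None
    source = response["_source"]
    return "{} {}".format(source["first_name"], source["last_name"])


print(full_name(res))  # Douglas Fir
```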
Deleting a document
res = es.delete(index='megacorp', doc_type='employee', id=3)
print(res['result'])
deleted
Installation on Ubuntu
$ java -version
$ gedit ~/.bashrc
Scroll down to the end of the file and append the lines below:
JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64"
export JAVA_HOME
PATH=$PATH:$JAVA_HOME/bin
export PATH
Save the file.
$ source ~/.bashrc
$ echo $JAVA_HOME
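A quick way to sanity-check the result from Python, as a sketch. It assumes `JAVA_HOME` is meant to point at the JDK root directory, so that the Java launcher sits at `bin/java` beneath it; `java_home_looks_valid` is a hypothetical helper.

```python
import os


def java_home_looks_valid(environ=os.environ):
    """Check that JAVA_HOME is set and contains an executable bin/java."""
    java_home = environ.get("JAVA_HOME")
    if not java_home:
        return False
    java_binary = os.path.join(java_home, "bin", "java")
    return os.path.isfile(java_binary) and os.access(java_binary, os.X_OK)


print(java_home_looks_valid())
```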
Install the apt-transport-https package, which apt needs to download packages from the Elasticsearch
repository over HTTPS on Ubuntu:
$ sudo apt-get install apt-transport-https
By default, the Elasticsearch service does not log information to the systemd journal.
To list journal entries for the elasticsearch service starting from a given time:
$ sudo journalctl --unit elasticsearch --since "2016-10-30 18:17:16"
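If the same query needs to be issued from a script, the command can be assembled as an argument list. This is a sketch only; `journalctl_command` is a hypothetical helper and the command is printed rather than executed.

```python
import shlex


def journalctl_command(unit, since=None):
    """Build the argument list for listing a unit's journal entries."""
    command = ["sudo", "journalctl", "--unit", unit]
    if since is not None:
        command += ["--since", since]
    return command


command = journalctl_command("elasticsearch", since="2016-10-30 18:17:16")
# shlex.quote keeps the timestamp, which contains a space, as one shell argument.
print(" ".join(shlex.quote(part) for part in command))
```

Passing `command` to `subprocess.run` would execute it without going through a shell.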