Elastic Stack
Before we Get Started
Please clone the git repo: [Link]
Once cloned, please run `docker-compose up -d` inside the docker-elk directory.
If you have less than 8 GiB of RAM, please use Elastic Cloud instead. Sign up with a
throwaway email address, e.g. [Link] or a similar service.
If I self-host ELK, I’ll add the link here: [Link]
Also, please download the log file here.
What is the Elastic Stack?
● The products that are developed by Elastic NV.
● Frequently used together as there is a strong synergy between them.
● Data Ingestion
○ Logstash
○ Beats
● Search, Store and Analyze
○ Elasticsearch - Heart of the Elastic Stack
● Visualisation
○ Kibana
● Additional functionality
○ X-Pack
What is Elasticsearch?
● It is an analytics and full-text search engine.
● It is written in Java, and built on top of Apache Lucene.
● Often used for building search functionality into your applications.
● This can also include autocompletion, correcting typos, highlighting matches,
adjusting relevance, etc.
● Query and analyze structured data.
● Can be used to provide BI functionality.
● Can be used for Application Performance Management (APM), e.g. tracking CPU/memory usage.
● Events sent to Elasticsearch can be aggregated, much like GROUP BY queries in an RDBMS.
● Can forecast future values and detect anomalies using ML (X-Pack needed).
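To make the aggregation comparison concrete, here is a minimal Python sketch. It shows the request body of a `terms` aggregation next to the equivalent local group-and-count. The `level` field and the sample events are hypothetical and only serve as illustration.

```python
from collections import Counter

# Hypothetical events; in practice these would be documents in an index.
events = [{"level": "INFO"}, {"level": "ERROR"}, {"level": "INFO"}]

# Request body for a terms aggregation on the (assumed) "level" field.
# "size": 0 asks Elasticsearch to return only the aggregation, not the hits.
terms_agg_body = {
    "size": 0,
    "aggs": {"by_level": {"terms": {"field": "level"}}},
}

def count_by(docs, field):
    """Local equivalent of SELECT field, COUNT(*) ... GROUP BY field."""
    return dict(Counter(doc[field] for doc in docs))

print(count_by(events, "level"))
```

The body would be sent with a search request against an index; the local function only illustrates what the aggregation computes.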
How does Elasticsearch work?
● Data is stored as Documents.
○ Documents are comparable to rows in an RDBMS.
● A Document has Fields.
○ Fields are comparable to columns in an RDBMS.
● Documents are stored in Indices.
○ Elasticsearch Indices are logical partitions of documents.
○ An Index can be compared to a Database.
● Sample Document:
{
"name" : "ABC", // name is a field name, ABC is a field value.
"subject" : "SPE",
"email" : "abc@[Link]"
}
● You can query Elasticsearch using REST.
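As a rough illustration of the points above, a document can be modelled as a Python dict and addressed over REST by index and ID. The index name `students`, the ID `1` and the base URL are assumptions made for this sketch.

```python
import json

# A document is a JSON object: keys are field names, values are field values.
document = {
    "name": "ABC",
    "subject": "SPE",
}

def document_url(base_url, index, doc_id):
    """Build the REST URL of a single document: <base>/<index>/_doc/<id>."""
    return f"{base_url.rstrip('/')}/{index}/_doc/{doc_id}"

body = json.dumps(document)   # what would be sent as the request body
print(sorted(document))       # the field names of this document
print(document_url("http://localhost:9200", "students", "1"))
```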
What is Logstash?
● It is a data processing pipeline.
● Used to process logs from applications and pass them along to Elasticsearch.
● Data received by Logstash is treated as an event, which is then passed along to a
destination such as Elasticsearch.
● Logstash supports multiple inputs and outputs. Events needn't always go to
Elasticsearch; they can also be sent to a Kafka queue, an e-mail message or an
HTTP endpoint.
● A Logstash Pipeline consists of 3 stages: input, filter and output. Each stage uses
plugins, called Input Plugins, Filter Plugins and Output Plugins (Stashes)
respectively.
● The pipeline is defined in Logstash's own configuration format, which looks similar to JSON.
Sample Logstash Pipeline
input { file { path => "/var/log/apache/[Link]" start_position => "beginning" } }
filter {
  grok {
    match => { "message" => ['%{TIMESTAMP_ISO8601:time} %{LOGLEVEL:logLevel} %{GREEDYDATA:logMessage}'] }
  }
  # Other steps or conditional steps
}
output { elasticsearch { hosts => ["localhost:9200"] } }
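The input, filter and output stages of the pipeline above can be mimicked in a few lines of Python. This is only a conceptual sketch, not how Logstash is implemented: the regular expression is a simplified stand-in for the grok pattern, and the output stage just collects events in a list instead of sending them to Elasticsearch.

```python
import re

# Simplified stand-in for: %{TIMESTAMP_ISO8601:time} %{LOGLEVEL:logLevel} %{GREEDYDATA:logMessage}
LINE_RE = re.compile(r"(?P<time>\S+) (?P<logLevel>[A-Z]+) (?P<logMessage>.*)")

def input_stage(lines):
    """Input: wrap every raw line in an event with a 'message' field."""
    for line in lines:
        yield {"message": line.rstrip("\n")}

def filter_stage(events):
    """Filter: extract named fields; tag events that do not match."""
    for event in events:
        match = LINE_RE.match(event["message"])
        if match:
            event.update(match.groupdict())
        else:
            # Mirrors the tag that Logstash's grok filter adds on failure.
            event.setdefault("tags", []).append("_grokparsefailure")
        yield event

def output_stage(events):
    """Output: collect events (a real output would ship them elsewhere)."""
    return list(events)

def run_pipeline(lines):
    return output_stage(filter_stage(input_stage(lines)))

for event in run_pipeline(["2023-11-15T13:12:00 INFO started", "garbage"]):
    print(event)
```

Chaining generators keeps each stage independent, which is roughly the idea behind swapping plugins in and out of a stage.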
Grok Patterns
Sample grok pattern:
%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:[Link]} %{IP:ipaddress} .*?
"%{WORD:http_method} %{PATH:path} HTTP/%{NUMBER:http_version}"
%{NUMBER:http_status:int} %{NUMBER:response_size:int}
"%{GREEDYDATA:message}"
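● Essentially regular expressions with named, reusable building blocks.
● Syntax: %{PATTERN:identifier}, optionally with a type conversion as in
%{NUMBER:http_status:int}.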
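To show why grok is essentially regex, here is a small Python sketch that expands `%{PATTERN:identifier}` tokens into named capture groups. The entries in `PATTERNS` are simplified assumptions, not the real grok definitions, and identifiers are limited to word characters.

```python
import re

# Simplified stand-ins for a few grok patterns (the real ones are stricter).
PATTERNS = {
    "WORD": r"\w+",
    "NUMBER": r"-?\d+(?:\.\d+)?",
    "LOGLEVEL": r"[A-Za-z]+",
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "GREEDYDATA": r".*",
}
CASTS = {"int": int, "float": float}
TOKEN_RE = re.compile(r"%\{(\w+):(\w+)(?::(\w+))?\}")

def compile_grok(pattern):
    """Turn a grok-style pattern into a compiled regex plus per-field casts."""
    casts = {}

    def expand(token):
        name, identifier, cast = token.groups()
        if cast:
            casts[identifier] = CASTS[cast]
        return f"(?P<{identifier}>{PATTERNS[name]})"

    return re.compile(TOKEN_RE.sub(expand, pattern)), casts

def grok_match(pattern, line):
    """Return the extracted fields as a dict, or None if the line does not match."""
    regex, casts = compile_grok(pattern)
    match = regex.match(line)
    if match is None:
        return None
    return {key: casts.get(key, str)(value) for key, value in match.groupdict().items()}

pattern = '%{IP:ipaddress} "%{WORD:http_method}" %{NUMBER:http_status:int}'
print(grok_match(pattern, '10.0.0.1 "GET" 200'))
```

Text between tokens is treated as plain regex, which matches how literal characters behave in a grok pattern.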
What is Kibana?
● It is a visualisation and analytics platform which fetches data from Elasticsearch
and lets you easily create visualisations from it.
● Has multiple built-in visualisations.
● Can track real time data across a map. Can also build dashboards.
● Not a BI tool replacement.
● Provides the interface for managing Authentication and Authorization for Elasticsearch and Logstash.
● Uses the same REST API to communicate with Elasticsearch.
● Can be thought of as a dashboard for Elasticsearch.
What is X-Pack?
● Adds additional functionality to Elasticsearch.
● Can provide LDAP, AD and other authentication providers.
● Can monitor the ELK stack and set up alerting.
● Can generate reports from Kibana.
● Machine Learning for Forecasting or Anomaly Detection.
What is Beats?
● Collection of Data shippers.
● Types of Beats:
○ Filebeat - collects log files and supports common log file types like SQL, nginx, etc.
○ Metricbeat - collects system metrics like CPU, RAM usage, etc.
○ Packetbeat - collects network data
○ Winlogbeat - collects Windows Event Logs
○ Auditbeat - collects Audit data from Linux
○ Heartbeat - monitors service uptime
Other Tools
● Graph - Identify relationships between your data
○ Eg: 10 people use Google vs 10 people who use Stackoverflow.
○ ‘Uncommonly common’ relationships.
● Elasticsearch SQL
○ Lets you query Elasticsearch with SQL, which most people already use.
Important References
[Link]
Hands-on!
Run ELK -> Upload a log file through Kibana -> Apply a Grok filter -> Show that the
data is properly ingested by Logstash into Elasticsearch.
● Upload Data
● Add Grok Filter
● Confirm Data is ingested properly
● Save the index
● Index is created!
● Logs are properly shipped to Elasticsearch, now you can create any visualisation
you like.
Interacting with Elasticsearch - List Indexes
● Use the REST API.
● Can be done either via cURL or Kibana
● In Kibana, open the hamburger menu > Management > Dev Tools
● We need to send a GET request to _cat/indices (cat: Compact Aligned Text)
● _cat is the API and indices is the command; ?v adds column headers to the output.
GET /_cat/indices?v
Interacting with Elasticsearch - List all Indexes
● To include hidden indices as well, send a GET request to _cat/indices with
expand_wildcards=all
GET /_cat/indices?v=true&expand_wildcards=all
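Outside Dev Tools, the same requests can be built from any HTTP client. The sketch below only constructs the URL; the base URL `http://localhost:9200` is an assumption, and a cluster with security enabled would also need credentials.

```python
from urllib.parse import urlencode

def cat_url(base_url, command, **params):
    """Build a _cat API URL such as <base>/_cat/indices?v=true."""
    url = f"{base_url.rstrip('/')}/_cat/{command}"
    query = urlencode(params)
    return f"{url}?{query}" if query else url

# Equivalent of: GET /_cat/indices?v=true&expand_wildcards=all
print(cat_url("http://localhost:9200", "indices", v="true", expand_wildcards="all"))
```

Passing the resulting URL to `urllib.request.urlopen` would send the request, assuming a cluster is reachable at that address.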
Cluster Health and List Nodes
● We need to send a GET request to _cluster/health and _cat/nodes
GET /_cluster/health
GET /_cat/nodes?v
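The `status` field of the `_cluster/health` response is the part most worth reading. A minimal sketch of interpreting it follows; the sample response is abbreviated and its values are made up for illustration.

```python
import json

# Meaning of the three cluster health statuses.
STATUS_MEANING = {
    "green": "all primary and replica shards are allocated",
    "yellow": "all primaries are allocated, but some replicas are not",
    "red": "at least one primary shard is unallocated",
}

def summarise_health(response_text):
    """Turn a _cluster/health JSON response into a one-line summary."""
    health = json.loads(response_text)
    status = health["status"]
    return f"{health['cluster_name']}: {status} ({STATUS_MEANING[status]})"

# Abbreviated, made-up response; a real one contains more fields.
sample = '{"cluster_name": "docker-cluster", "status": "yellow", "number_of_nodes": 1}'
print(summarise_health(sample))
```

A single-node setup commonly reports yellow, because replica shards cannot be placed on the same node as their primaries.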
Create an Index with documents
● We need to send a POST request to /<new-index>/_doc with the document as the JSON
request body. The index is created automatically if it does not exist yet.
POST index-with-docs/_doc/
{
"@timestamp": "2023-11-15T13:12:00",
"message": "GET /search HTTP/1.1 200 1070000",
"user": {
"id": "aman"
}
}
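The same request can be prepared with Python's standard library. This sketch only builds the request object; the base URL is an assumption, and nothing is sent until the request is passed to `urllib.request.urlopen`.

```python
import json
from urllib.request import Request

def index_request(base_url, index, document):
    """Build (but do not send) a POST /<index>/_doc/ request."""
    return Request(
        url=f"{base_url.rstrip('/')}/{index}/_doc/",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

doc = {
    "@timestamp": "2023-11-15T13:12:00",
    "message": "GET /search HTTP/1.1 200 1070000",
    "user": {"id": "aman"},
}
request = index_request("http://localhost:9200", "index-with-docs", doc)
print(request.get_method(), request.full_url)
```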
Show all Documents in an Index
● We need to send a GET request to /<index> and use the _search API.
GET /index-with-docs/_search
{
  "query": {
    "match_all": {}
  }
}
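The documents come back nested under `hits.hits`, each with its original body in `_source`. A small sketch of building the query and unpacking a response follows; the sample response is trimmed and its `_id` value is invented.

```python
import json

def match_all_body(size=10):
    """JSON body for a match_all search; size caps the number of hits returned."""
    return json.dumps({"query": {"match_all": {}}, "size": size})

def sources(response_text):
    """Extract the original documents from a _search response."""
    response = json.loads(response_text)
    return [hit["_source"] for hit in response["hits"]["hits"]]

# Trimmed, made-up response; a real one carries more metadata.
sample_response = json.dumps({
    "hits": {
        "hits": [
            {"_index": "index-with-docs", "_id": "example-id",
             "_source": {"user": {"id": "aman"}}},
        ]
    }
})
print(match_all_body())
print(sources(sample_response))
```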
Thank You