Let’s talk about Dockerized Elastic Stack…
Elastic Stack is an open source solution that reliably and securely takes data from any source, in any format, and lets you search, analyze, and visualize it in real time. It is a collection of open source products: Elasticsearch, Logstash, Kibana, and a more recently added fourth product called Beats. Elastic Stack can be deployed on premises or consumed as Software as a Service.
A brief overview of the Elastic Stack components:
Elasticsearch is a RESTful, distributed, highly scalable, JSON-based full-text search and analytics engine built on top of Apache Lucene and released under the Apache license. It is Java-based and designed for horizontal scalability, reliability, and easy management. It lets you store, search, and analyze large volumes of data quickly and in near real time, and it is generally used as the underlying engine that powers applications with complex search features and requirements.
Logstash is an open source, server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your favorite “stash” (here, Elasticsearch). It is a dynamic data collection pipeline with an extensible plugin ecosystem and strong Elasticsearch synergy. The product was originally optimized for log data but has since expanded its scope to take data from all sources.
Data is often scattered or siloed across many systems in many formats. Logstash supports a variety of inputs that pull in events from a multitude of common sources, all at the same time. You can ingest from your logs, metrics, web applications, data stores, and various AWS services, all in continuous, streaming fashion.
As data travels from source to store, Logstash filters parse each event, identify named fields to build structure, and transform them to converge on a common format for easier, accelerated analysis.
Logstash dynamically transforms and prepares your data regardless of format or complexity:
- Derive structure from unstructured data with grok
- Decipher geo coordinates from IP addresses
- Anonymize PII data, exclude sensitive fields completely
- Ease overall processing independent of the data source, format, or schema.
Logstash has a pluggable framework featuring over 200 plugins. Mix, match, and orchestrate different inputs, filters, and outputs to work in pipeline harmony.
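To make the input, filter, and output stages concrete, here is a minimal sketch that writes an illustrative pipeline file. The file name, the scratch directory, and the choice of fields are assumptions made for this example and are not taken from the swarm-elk repository. The `clientip` and `auth` field names are what the `COMBINEDAPACHELOG` grok pattern produces in its legacy (non-ECS) mode; newer Logstash releases with ECS compatibility enabled name these fields differently.

```shell
#!/bin/sh
# Sketch only: writes an illustrative Logstash pipeline to a scratch directory.
# PIPELINE_DIR and example.conf are hypothetical names chosen for this example.
PIPELINE_DIR="${PIPELINE_DIR:-$(mktemp -d)}"
PIPELINE_FILE="$PIPELINE_DIR/example.conf"

# The quoted 'EOF' keeps the shell from expanding %{...} and $ inside the file.
cat > "$PIPELINE_FILE" <<'EOF'
input {
  stdin { }
}

filter {
  # Derive structure from an unstructured access-log line.
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  # Look up geo coordinates for the client IP extracted by grok.
  geoip {
    source => "clientip"
  }
  # Exclude a field that should never reach the index.
  mutate {
    remove_field => [ "auth" ]
  }
}

output {
  stdout { codec => rubydebug }
}
EOF

echo "Wrote $PIPELINE_FILE"
```

With a local Logstash installation, a file like this can be tried with `bin/logstash -f example.conf`, pasting an access-log line on standard input and reading the structured event on standard output.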
Lastly, Kibana lets you visualize your Elasticsearch data and navigate the Elastic Stack. It gives you the freedom to select the way you give shape to your data. And you don’t always have to know what you’re looking for. With its interactive visualizations, start with one question and see where it leads you.
Kibana developer tools offer powerful ways to help developers interact with the Elastic Stack. With Console, you can bypass using curl from the terminal and tinker with your Elasticsearch data directly. The Search Profiler lets you easily see where time is spent during search requests. And authoring complex grok patterns in your Logstash configuration becomes a breeze with the Grok Debugger.
In the next 5 minutes, we are going to test drive the ELK stack on the Play with Docker (PWD) playground.
Let’s get started –
Open up https://play-with-docker.com
Click on the icon next to Instances to open up the ready-made templates for Docker Swarm Mode:
Choose the first template (as highlighted in the above figure) to select 3 Managers and 2 Workers. It will bring up a Docker 17.06 Swarm Mode cluster in about 10 seconds.
Run the command below to list the cluster nodes:
$ docker node ls
On each node that will run Elasticsearch, raise the kernel's vm.max_map_count limit (Elasticsearch requires at least 262144) and persist the setting:
$ sysctl -w vm.max_map_count=262144
$ echo 'vm.max_map_count=262144' >> /etc/sysctl.conf
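Before deploying, it can help to confirm that the setting took effect. The following is a small sketch, not part of the original walkthrough; the function names are made up for illustration, and it assumes a Linux host that exposes the value under /proc.

```shell
#!/bin/sh
# Minimum value used in the sysctl step above.
REQUIRED_MAX_MAP_COUNT=262144

# Returns 0 when the given value meets the requirement, 1 otherwise.
max_map_count_ok() {
  [ "$1" -ge "$REQUIRED_MAX_MAP_COUNT" ]
}

# Reads the live value from procfs; prints 0 when it is not available.
current_max_map_count() {
  if [ -r /proc/sys/vm/max_map_count ]; then
    cat /proc/sys/vm/max_map_count
  else
    echo 0
  fi
}

current="$(current_max_map_count)"
if max_map_count_ok "$current"; then
  echo "vm.max_map_count=$current is sufficient"
else
  echo "vm.max_map_count=$current is below $REQUIRED_MAX_MAP_COUNT"
fi
```

Running the check on every node that may host an Elasticsearch task avoids chasing container restarts later.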
Clone the GitHub repository:
$ git clone https://github.com/ajeetraina/docker101
Run the below command to bring up the visualiser tool as shown below:
Soon you will notice port 8080 displayed at the top of the page; clicking it opens the visualiser tool.
It’s time to clone the ELK stack repository and deploy the stack across the Docker 17.06 Swarm Mode cluster. Run the deploy command from the directory that contains docker-compose.yml:
$ git clone https://github.com/ajeetraina/swarm-elk
$ docker stack deploy -c docker-compose.yml myself
[Credits to Andrew Hromis for building this docker-compose file. I leveraged his project repository to bring up the ELK stack on the first try]
You will soon see the below list of containers appearing on the nodes:
Run the command below to see the list of services running across the cluster:
$ docker service ls
Click on port 5601 displayed on the top of the PWD page:
Please note: Kibana needs data in Elasticsearch to work with. The .kibana index holds Kibana's own data, and if that is the only index you have, there is no data for Kibana to visualise. Before you can use Kibana you therefore need to index some data into Elasticsearch, for example with Logstash or directly through the REST interface using curl.
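As a rough sketch of the curl route, the helper functions below build a tiny JSON document and post it to the Elasticsearch index API. Everything specific here is an assumption for illustration: the `ES_URL` default, the `demo-logs` index name, the document fields, and the function names. The type segment of the path is a variable because Elasticsearch 7.x and later use `_doc`, while older 5.x-era releases expect a custom type name instead.

```shell
#!/bin/sh
# Assumptions: Elasticsearch is reachable at ES_URL, and the index name and
# document fields are invented for this example.
ES_URL="${ES_URL:-http://localhost:9200}"
ES_INDEX="${ES_INDEX:-demo-logs}"
# Override ES_TYPE (for example ES_TYPE=logs) on older Elasticsearch releases.
ES_TYPE="${ES_TYPE:-_doc}"

# Prints a small JSON document. The message is inserted verbatim, so it must
# not contain double quotes or backslashes.
make_doc() {
  printf '{"message": "%s", "source": "manual-test"}\n' "$1"
}

# Sends one document to the index API and prints the JSON response.
index_doc() {
  make_doc "$1" | curl -s -X POST "$ES_URL/$ES_INDEX/$ES_TYPE" \
    -H 'Content-Type: application/json' \
    --data-binary @-
}
```

Calling `index_doc "hello from curl"` against a running cluster should create the index if it does not already exist, after which an index pattern matching `demo-logs` can be defined in Kibana.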
Soon you will see the below Kibana page:
Enabling High Availability for Elastic Stack through scaling
Let us scale out the number of Elasticsearch replicas:
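A minimal sketch of the scaling step, assuming the stack name `myself` from the deploy command and a service called `elasticsearch` in the compose file. The service name is an assumption; `docker service ls` shows the real one, which Swarm prefixes with the stack name. The `run` and `scale_service` helpers and the `DRY_RUN` switch are hypothetical names introduced only for this example.

```shell
#!/bin/sh
# Assumption: the compose file names the service "elasticsearch", which makes
# the full service name myself_elasticsearch for a stack deployed as "myself".
SERVICE="${SERVICE:-myself_elasticsearch}"
REPLICAS="${REPLICAS:-3}"

# Prints the command when DRY_RUN is set, otherwise executes it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Asks Swarm to run the given number of tasks for the given service.
scale_service() {
  run docker service scale "$1=$2"
}
```

Run `scale_service "$SERVICE" "$REPLICAS"` on a manager node, or set `DRY_RUN=1` first to print the command without executing it. The new tasks then show up in the output of `docker service ls`.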
Pushing data into Logstash:
Let us push NGINX web server logs into Logstash and see whether Kibana picks them up (replace 10.0.173.7 with the address at which Logstash is reachable in your cluster):
$ docker run -d --name nginx-with-syslog --log-driver=syslog --log-opt syslog-address=udp://10.0.173.7:12201 -p 80:80 nginx:alpine
Now if you open up the Kibana UI, you should be able to see the NGINX logs being displayed:
We can also push logs to Logstash using the GELF log driver:
$ docker run --rm -it --log-driver=gelf --log-opt gelf-address=udp://10.0.173.7:12201 alpine ping 126.96.36.199
Open up Kibana and now you will see the below GREEN status:
Did you find this blog helpful? Feel free to share your experience. Get in touch @ajeetsraina.
If you are looking to contribute or discuss, join me on the Docker Community Slack channel.