Real time analytics with Elasticsearch and Kibana3

Last month, I attended a great talk at Devoxx France (see http://www.devoxx.com/display/FR13/Accueil) on “Migrating an application from SQL to NoSQL”. The talk title was pretty well chosen, but it was mainly a presentation of the features of two products: Couchbase and Elasticsearch.

Beyond the relevance of the speakers and the products, an Elasticsearch extension called Kibana3 was briefly introduced and, although marked as an alpha release, it totally astonished me! Kibana3 is an extension designed for real-time analytics of data stored in Elasticsearch. It allows full customization of dashboards and is so easy to use that it could almost be put into the hands of business people…

A few weeks later I found some time for a test run and, as things went well, I thought it would be useful to write a kind of “How to” or “Quickstart” for Kibana3. Here it is.

The setup

Install and run Elasticsearch

Download Elasticsearch from http://www.elasticsearch.org (while rechecking everything for this post, I chose the 0.90.0 release, which wasn’t out when I first tested this… so everything should also run fine on the 0.20.6 release I had picked previously). Just extract the archive into a target directory and simply run the following:

laurent@ponyo:~/dev/elasticsearch-0.90.0$ bin/elasticsearch -f
[2013-04-30 00:13:14,312][INFO ][node                     ] [Dominic Fortune] {0.90.0}[4013]: initializing ...
[2013-04-30 00:13:14,321][INFO ][plugins                  ] [Dominic Fortune] loaded [], sites []
[2013-04-30 00:13:17,045][INFO ][node                     ] [Dominic Fortune] {0.90.0}[4013]: initialized
[2013-04-30 00:13:17,046][INFO ][node                     ] [Dominic Fortune] {0.90.0}[4013]: starting ...
[2013-04-30 00:13:17,225][INFO ][transport                ] [Dominic Fortune] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.1.80:9300]}
[2013-04-30 00:13:20,306][INFO ][cluster.service          ] [Dominic Fortune] new_master [Dominic Fortune][evQbXTeASNmADq4h-Q847A][inet[/192.168.1.80:9300]], reason: zen-disco-join (elected_as_master)
[2013-04-30 00:13:20,353][INFO ][discovery                ] [Dominic Fortune] elasticsearch/evQbXTeASNmADq4h-Q847A
[2013-04-30 00:13:20,376][INFO ][http                     ] [Dominic Fortune] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.1.80:9200]}
[2013-04-30 00:13:20,376][INFO ][node                     ] [Dominic Fortune] {0.90.0}[4013]: started
[2013-04-30 00:13:20,489][INFO ][gateway                  ] [Dominic Fortune] recovered [0] indices into cluster_state

Congratulations! You are now running an Elasticsearch cluster with one node! That is basically all you need for a basic setup, because every interaction with the node, from administration to the client APIs, is done through REST APIs over HTTP. That means a simple curl command does the job.
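
For instance, you can check that the node answers by hitting the root endpoint (it simply returns the node name and version):

curl -X GET 'http://localhost:9200/'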

Anyway, before going further, we’d like to add an administration console to our cluster (because having a GUI doesn’t hurt after all) and we need to feed our node with data. For that, we are going to install 2 plugins: elasticsearch-head (a web administration frontend) and elasticsearch-river-twitter (a river that streams tweets from the Twitter public stream into an index).

Plugins are simply installed using the bin/plugin command as follows.

For elasticsearch-head:

laurent@ponyo:~/dev/elasticsearch-0.90.0$ bin/plugin -install mobz/elasticsearch-head
-> Installing mobz/elasticsearch-head...
Trying https://github.com/mobz/elasticsearch-head/zipball/master... (assuming site plugin)
Downloading ............DONE
Identified as a _site plugin, moving to _site structure ...
Installed head

For elasticsearch-river-twitter:

laurent@ponyo:~/dev/elasticsearch-0.90.0$ bin/plugin -install elasticsearch/elasticsearch-river-twitter/1.2.0
-> Installing elasticsearch/elasticsearch-river-twitter/1.2.0...
Trying http://download.elasticsearch.org/elasticsearch/elasticsearch-river-twitter/elasticsearch-river-twitter-1.2.0.zip...
Downloading ...............................................................................................................................................................................................................................................DONE
Installed river-twitter

Now just restart your node (kill the running elasticsearch process and launch a new one) and point your browser to http://localhost:9200/_plugin/head/; you should now have access to the web frontend.
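
You can also check the node from the command line; for instance the cluster health API gives a quick overview of the cluster state:

curl 'http://localhost:9200/_cluster/health?pretty'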

Install and run Kibana3

As said in the introduction, Kibana3 is an Elasticsearch extension hosted by Elasticsearch itself and dedicated to analytics, providing the means to dynamically build any dashboard on top of an ES index (the data store). The best way to retrieve it is to clone the GitHub repository like this:

laurent@ponyo:~/dev/github$ git clone https://github.com/elasticsearch/kibana3.git
Cloning into 'kibana3'...
remote: Counting objects: 2148, done.
remote: Compressing objects: 100% (892/892), done.
remote: Total 2148 (delta 1305), reused 2060 (delta 1226)
Receiving objects: 100% (2148/2148), 11.47 MiB | 273 KiB/s, done.
Resolving deltas: 100% (1305/1305), done.

As the Kibana3 documentation states, it’s ‘just’ a bunch of static HTML and JavaScript resources that can be put onto any reachable web server. For testing convenience, Kibana3 embeds a little Node.js server that can be run if you’re lazy like me:

laurent@ponyo:~/dev/github/kibana3$ node scripts/server.js 
Http Server running at http://localhost:8000/
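
If Node.js is not available on your machine, any static web server pointed at the kibana3 directory should do the job; for instance, with Python 2 installed, the following serves the current directory on port 8000:

laurent@ponyo:~/dev/github/kibana3$ python -m SimpleHTTPServer 8000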

You can now open http://localhost:8000/index.html in your web browser and should see a default dashboard appear with a bunch of red panels announcing errors… We’re going to fix that in the next section.

The dashboard creation

Before actually creating a dashboard, we need data! Remember, we have installed the Twitter river plugin: we are going to connect to the Twitter public stream to retrieve that data. In order to complete the following step, you need a valid Twitter account.

The following command creates a Twitter connection tracking some trendy keywords 😉 Just substitute the placeholders with your Twitter account name and password and that’s it.

laurent@ponyo:~$ curl -X PUT 'localhost:9200/_river/twitter-river/_meta' -d '{
  "type" : "twitter",
  "twitter" : {
    "user" : "<twitter_user>",
    "password" : "<twitter_password>",
    "filter" : {
      "tracks" : "java,nosql,node.js,elasticsearch,eclipse,couchdb,hadoop,mongodb"
    }
  },
  "index" : {
    "index" : "tweets",
    "type" : "status",
    "bulk_size" : 5
  }
}'
{"ok":true,"_index":"_river","_type":"twitter-river","_id":"_meta","_version":1}

By browsing to http://localhost:9200/_plugin/head/, you should see the number of documents in the “tweets” index growing fast.
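
You can also check the document count from the command line (the tweets index name comes from the river configuration above):

curl 'http://localhost:9200/tweets/_count?pretty'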

Let’s now go back to the default Kibana3 dashboard in your web browser. We are going to change some parameters to make it a decent dashboard. The first thing to change is the “Timepicker” widget, which is used to define the data store the dashboard is based on.

May 14th update

For the lazy ones ;-) who just want to see the result without building the dashboard, I’ve posted the JSON export as a Gist: https://gist.github.com/lbroudoux/5579650. It’s easily importable into Kibana.

Edit this widget’s settings and change the time field (for tweets coming from the river, that is the created_at field) as follows:

[Screenshot: Timepicker settings, time field]

and then the index pattern (here, the tweets index fed by the river) as follows:

[Screenshot: Timepicker settings, index pattern]
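
If you are unsure about the available fields or which one holds the timestamp, the index mapping can be inspected with:

curl 'http://localhost:9200/tweets/_mapping?pretty'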

You should already have a decent dashboard, as below (I’ve also changed the dashboard title and the time resolution in order to see many green bars on the histogram).

[Screenshot: first dashboard result]

You can experiment with “Zoom In” and “Zoom Out” on the histogram and see their effect on the timepicker widget. You can also draw a rectangular zone on the histogram in order to zoom in on that time period. Typing keywords into the Query input field also has dynamic effects on the searched records and the histogram.
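
The query field accepts the Lucene query string syntax, so assuming the tweet text is indexed under a text field (which should be the case with the Twitter river), a query like the following restricts the whole dashboard to tweets mentioning elasticsearch:

text:elasticsearch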

When moving down the page, you see a table widget that still has errors. Its goal is to display excerpts of the found records. Edit this widget’s parameters as follows to configure it to correctly display your tweets:

[Screenshot: table widget settings]

You can see that we reference here the different fields found in a Twitter message coming from the public stream (such information on available fields can be found through the head web frontend when browsing indexes and looking at stored documents).
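
Alternatively, you can retrieve a single stored tweet from the command line to look at its structure (index and type names come from the river configuration above):

curl 'http://localhost:9200/tweets/status/_search?size=1&pretty'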

Note that we can also modify the layout of widgets by editing the row parameters. For example, we’re swapping the table and fields widgets to suit our preferences. The fields widget is indeed very convenient for adding new fields to the table view. The screenshot below shows the result obtained after such a swap.

[Screenshot: dashboard after swapping the table and fields widgets]

The last thing I’ll show you here is the addition of a new Kibana3 widget to your dashboard. We are now going to display a map showing the location of our Twitter users in the “Events” row. Open this row’s settings editor and select “map” in the new panel dropdown list. Then you’ll have to tell it which field is used to get this information; in the case of tweets the field is “place.country_code”. The setting is shown below:

[Screenshot: map panel settings]

Don’t forget to click on the “Create Panel” button before closing the editor! The map now displays in your row. Finally, after having evenly distributed the widgets on the row, you may reach the following result:

[Screenshot: final dashboard with the map panel]

The map widget is also clickable and can be used to drill down into the data previously selected using the query filter and/or the timepicker filter. Quite impressive!
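
Under the hood, this kind of panel essentially boils down to an aggregation on the configured field. As a rough equivalent, here is a minimal sketch of an ES 0.90 terms facet query on place.country_code (the facet name countries is arbitrary):

curl 'http://localhost:9200/tweets/_search?pretty' -d '{
  "size" : 0,
  "facets" : {
    "countries" : {
      "terms" : { "field" : "place.country_code" }
    }
  }
}'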

Conclusion

If I have succeeded in my demonstration, you have seen that using Kibana3 can be really easy once you understand the basic customization steps. Kibana3 looks like a very promising tool in this new area of big data, data scientists and data miners that has appeared in recent years.

Some features might still be missing (like complete integration with Elasticsearch index or document type catalogs, security around data access or dashboard sharing, etc.) to ensure deployment in the enterprise world. However, the first signs are already there, with the ability to store Kibana3 dashboards into Elasticsearch itself and the recent posts on how to secure an Elasticsearch cluster (see http://dev.david.pilato.fr/?p=241 for French readers).
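
As an illustration of that dashboard storage, a dashboard saved to Elasticsearch from the Kibana3 UI ends up as a regular document; assuming the default kibana-int index is used by Kibana3 for saved dashboards (adjust if your configuration differs), you can list them with:

curl 'http://localhost:9200/kibana-int/_search?pretty'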

I think that Kibana3 being hosted under the Elasticsearch umbrella is a good guarantee of seeing this extension developed and enhanced in the near future. In my humble opinion, this can be a big advantage on Elasticsearch’s business card.

18 thoughts on “Real time analytics with Elasticsearch and Kibana3”

  1. This works well. I hope it will be continued and become a stable product.
    One question: is it possible to do calculations on an index, e.g. sum or avg?
    If I have a table for orders with a units column and want to find out the total units per day.

  2. Hi, I just followed your tutorial but I am getting this error:

    Oops! FacetPhaseExecutionException[Facet [chart0]: (key) field [created_at] not found]

    Can you help me see where I am making a mistake??

    Thanks..:-)

  3. Hey guys, it’s a great post.

    I am facing an Error on the Kibana3-UI:
    Oops! Could not match index pattern to any ElasticSearch indices

    Although I followed exactly the same steps as described. Could anyone please provide some assistance?

    Thanks

    1. Hi Nizar, have you tried using the provided Gist to check if everything runs right? If yes, I should take some time to run this tutorial against a fresh Kibana version because many things seem to have changed…

      1. Hey, thanks for responding.

        If you mean the Gist from the May 14th update, yes I did, but unfortunately with the same result.

        Now I got my Twitter account blacklisted, because I tried a couple of times.

        If you could check it, would be great.

        Many thanks
        Nizar

    1. Hey David,

      yes, I noticed that and installed the new river.
      The problem I had afterwards is that I could not register (for an app) at Twitter. The registration form kept throwing an error that the callback_url is invalid, and it’s true, I did not enter anything there since the field is not mandatory.

      My goal actually is to get Kibana running with ElasticSearch. Therefore I created an index with a timestamp field as required, but I am still getting the following error on the Kibana UI:
      Oops! Could not match index pattern to any ElasticSearch indices

      I was also getting this error with the running ES-Twitter-River!!

      Many thanks for your assistance
      Nizar

  4. Hi,
    good tutorial.
    I am trying to adapt this for OmniOS, but I get no command “node” or folder
    “scripts/server.js”

    Thanks!

  5. Thanks for this post. The latest version of Kibana uses [tweets] instead of “tweets” as the index pattern in the timepicker options box; it removes the “could not match index pattern…” error.

  6. Thanks for this extraordinary stuff. It helps a lot for beginners like me. I followed all the steps you mentioned. I
    just want to know: in elasticsearch-head I can view my database and table result set and am able to search it. Can I see the same in Kibana 3? If so, let me know how. And is it also possible to plot the graph according to database fields?
    Note:
    1. OS = Windows XP
    2. DB = MySQL
