
Docs/specs/ZBXNEXT-844


Cassandra support for historical data

ZBXNEXT-844

Status: Draft


Description

Summary

Support of Cassandra for storing historical data.

Wikipedia page on Apache Cassandra

DataStax Cassandra 0.8 Documentation

Cassandra API with Examples

Useful information about composite type in Cassandra 0.8: https://issues.apache.org/jira/browse/CASSANDRA-2231

Specification

The functionality should be based on Cassandra 0.8.1.

Only numeric historical data (stored in the history and history_uint tables in the traditional SQL storage) will be stored in Cassandra, with millisecond timestamp resolution. Zabbix will not generate or update trends; even long-term graphs will be generated from historical (not trend!) data.
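Since long-term graphs would be drawn from raw history rather than trends, the values would need to be aggregated at read time. The following is a minimal sketch of such an aggregation, not part of the specification; the function name downsample, the bucket width parameter and the (timestamp in milliseconds, value) input shape are all assumptions made for illustration.

```python
from collections import defaultdict


def downsample(points, bucket_ms):
    """Group (timestamp_ms, value) pairs into fixed-width time buckets.

    Hypothetical helper: returns a list of
    (bucket_start_ms, min, avg, max, count) tuples sorted by time,
    which is roughly what a trend row would otherwise have provided.
    """
    if bucket_ms <= 0:
        raise ValueError("bucket_ms must be positive")

    buckets = defaultdict(list)
    for ts_ms, value in points:
        # Align each timestamp to the start of its bucket.
        buckets[ts_ms - ts_ms % bucket_ms].append(value)

    result = []
    for start in sorted(buckets):
        values = buckets[start]
        avg = sum(values) / len(values)
        result.append((start, min(values), avg, max(values), len(values)))
    return result
```

For a graph spanning a year, the bucket width could be chosen so that the number of buckets roughly matches the graph width in pixels; that choice is left open here.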

Non-numeric data (history_log, history_str and history_text) will remain in the traditional SQL storage. All history*sync tables will also remain in the SQL database for DM (distributed monitoring) setups.

The housekeeper will not remove any data from Cassandra storage, even for deleted items. Item-level housekeeper settings will remain available but will have no effect.

Scope of changes

Cassandra storage is to be implemented for Zabbix Server only. It won't affect Zabbix Proxy.

Upgrade and migration issues are out of scope for this task.

Internal items zabbix[history], zabbix[history_uint], etc. will not be supported with Cassandra.

Implementation details

The server side should use https://github.com/eyealike/libcassandra, which is considered to be the most feature-rich (load balancing and node failover support) and stable Cassandra library.

The PHP frontend will use phpcassa, which has more contributors and appears to be updated more frequently than the alternatives.

Questions for specification

  • Different Cassandra versions require different Thrift-generated bindings. Who will generate them: we or the clients?
  • What will be the schema design for Cassandra? Existing queries in both server and GUI should be analyzed to come up with schema.
  • A consideration: when we convert our Zabbix database to a distributed monitoring setup, we change the IDs of all entities, including items. That means that if our Zabbix installation already has historical information in Cassandra, we would have to change item keys there as well.
  • Zabbix 2.0 supports nanoseconds in history tables, Zabbix 1.8 does not. Do we implement support for nanoseconds in 1.8 straight away? If not, we should take care of upgrade patches for Cassandra.
  • Which PHP API will we use to access Cassandra? Options:
  1. https://github.com/mjpearson/Pandra
  2. https://github.com/kallaspriit/Cassandra-PHP-Client-Library
  3. https://github.com/thobbs/phpcassa
  4. http://code.google.com/p/simpletools-php/wiki/SimpleCassie
  5. raw Thrift API in PHP: http://svn.apache.org/viewvc/thrift/trunk/lib/php/
  • How to configure Cassandra for PHP? Will it be zabbix.conf.php and installation wizard?
  • Are any additional frontend checks needed for Cassandra (is it up? does it have the correct structure?)
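As an illustration of the simplest form an "is it up?" check could take, the sketch below only tests whether a TCP connection to a Cassandra node can be opened. It is an assumption-laden example, not a proposed design: the function name cassandra_reachable and the timeout value are hypothetical, and a successful connection says nothing about whether the keyspace has the correct structure.

```python
import socket


def cassandra_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: this only proves that something is listening
    on the port, not that it is a healthy Cassandra node.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

A structural check would additionally have to query the keyspace through the chosen client library, which depends on the open questions above.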

To be considered for future development, but out of scope currently

We may decide to support trend data stored in Cassandra, and to keep history_log, history_str and history_text in Cassandra as well.

Details

Two new Cassandra-related parameters are added to zabbix_server.conf:

 # CassandraHost - list of Cassandra server IP addresses with ports
 CassandraHost=127.0.0.1:9160
 # CassandraKeyspace - name of keyspace
 CassandraKeyspace=KeySpace1
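The specification does not say how several servers would be listed in CassandraHost. The sketch below assumes a comma-separated list of host:port entries and falls back to 9160, the default Cassandra Thrift port, when an entry has no port. The function name parse_cassandra_hosts and the separator are assumptions made for illustration, and IPv6 literals are not handled.

```python
def parse_cassandra_hosts(value, default_port=9160):
    """Split a CassandraHost value into a list of (host, port) tuples.

    Hypothetical helper: assumes entries are separated by commas and
    written as "host:port" or just "host".
    """
    hosts = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas and empty entries
        host, sep, port = entry.rpartition(":")
        if not sep:
            # No colon found: the whole entry is the host name.
            host, port = entry, default_port
        if not host:
            raise ValueError("missing host in entry: %r" % entry)
        port = int(port)
        if not 0 < port < 65536:
            raise ValueError("port out of range in entry: %r" % entry)
        hosts.append((host, port))
    return hosts
```

With the example value above, parse_cassandra_hosts("127.0.0.1:9160") yields a single entry for the local node.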

Three new Cassandra-related parameters are added to zabbix.conf.php:

 $DB['USE_CASSANDRA'] = true; // enable/disable Cassandra support
 $DB['CASSANDRA_IP'] = array('127.0.0.1:9160'); // array of Cassandra server IP addresses with ports
 $DB['CASSANDRA_KEYSPACE'] = 'KeySpace1'; // name of keyspace

API changes

Database changes