Docs/specs/ZBXNEXT-322

In-memory cache for historical data

ZBXNEXT-322

Status: v1.1

Owner: Alexei, Sasha

Summary

Zabbix Server should use an in-memory cache for historical data to make the evaluation of trigger expressions, calculated/aggregate items and some macros much faster. The cache should eliminate most of the SQL statements used to retrieve data from the history tables.

Specification

Server changes

  • zabbix_server.conf will accept a new optional parameter, ValueCacheSize
    • It specifies the amount of memory used for the cache buffer
    • The acceptable range for this parameter is 128KB to 64GB, or 0. The default value is 8MB.
    • If set to 0, the value cache will not be created or used
    • The sample zabbix_server.conf must be updated to include the new parameter with a detailed description
  • Trigger functions, calculated items and macros will use the value cache instead of making direct SQL calls to the database
    • There should be a single entry point like get_history(itemid, t_from, t_to/now) or get_history(itemid, t_from, period/now)
  • History syncers will update the cache when processing new historical values
    • Preferably only for items whose historical data is actually requested
  • The h_* fields used to cache the last and previous values should be removed from the DB_ITEM structure, and the values must be retrieved from the value cache when necessary.
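
As an illustration of the ValueCacheSize limits listed above, here is a minimal Python sketch of the range check. The function name and the K/M/G suffix syntax are assumptions made for this example; the specification only defines the limits and the default.

```python
import re

KB = 1024
MIN_SIZE = 128 * KB          # lower bound from the spec (128KB)
MAX_SIZE = 64 * KB ** 3      # upper bound from the spec (64GB)
DEFAULT_SIZE = 8 * KB ** 2   # default from the spec (8MB)

# Assumption: a size may carry a K, M or G suffix; the spec does not define the syntax.
_SUFFIXES = {"": 1, "K": KB, "M": KB ** 2, "G": KB ** 3}


def parse_value_cache_size(raw=None):
    """Return the cache size in bytes; 0 means the value cache is disabled."""
    if raw is None:
        # The parameter is optional, so fall back to the default.
        return DEFAULT_SIZE
    match = re.fullmatch(r"(\d+)([KMG]?)", raw.strip())
    if match is None:
        raise ValueError(f"invalid ValueCacheSize: {raw!r}")
    size = int(match.group(1)) * _SUFFIXES[match.group(2)]
    # 0 is explicitly allowed; any other value must fall inside the range.
    if size != 0 and not MIN_SIZE <= size <= MAX_SIZE:
        raise ValueError(f"ValueCacheSize out of range: {raw!r}")
    return size
```

With this sketch, parse_value_cache_size("0") disables the cache, while a value such as "64K" is rejected because it is below the lower bound.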

Cache implementation

  • If the requested historical values are not present in the cache, the missing values are requested from the database and the cache is updated accordingly. If the cache cannot be updated (insufficient resources), the data retrieved from the database is returned to the caller.
  • The history data is stored as a sorted list of timestamp (zbx_timespec_t) and value (history_value_t) pairs for every item.
  • Two storage modes are used to adapt the cache to request periods: lastvalue storage mode and history storage mode. The storage mode is changed automatically if the current mode can't cache the history request.
    • The lastvalue storage mode is used to cache the last and previous values. It has low resource usage, as the values are stored directly in the cache item data structure.
    • The history storage mode is used to cache all values from the largest request range to the current time. When a request range exceeds the range of cached data, the missing values are fetched from the database and the cache is updated. The history is stored in chunks of item values. When a chunk of history data moves outside the largest request range, it is dropped from the cache.
  • When the cache runs out of resources (shared memory) to store data, it first drops all items not accessed during the last day. If there is still not enough memory, the cache enters low memory mode:
    • a warning (LOG_LEVEL_WARNING) message "Value cache is fully used: please increase ValueCacheSize configuration parameter" is written to the debug log once per N minutes.
    • items with worst hits/values ratio might be dropped from cache to free space
    • only items with low predicted value count are added to cache
    • item storage mode can't be changed. If the current storage mode can't handle the request, the historical data is read directly from the database.
  • Housekeeper and configuration cache syncer won't update the value cache
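
The history storage mode described above can be sketched roughly as follows. This is a simplified Python illustration, not the server implementation: the class and method names are hypothetical, timestamps are plain integer seconds instead of zbx_timespec_t, and single values are dropped instead of whole chunks.

```python
import bisect


class ItemHistory:
    """Cached history of one item: (timestamp, value) pairs sorted by time."""

    def __init__(self):
        self.values = []         # sorted (timestamp, value) pairs
        self.cached_from = None  # oldest timestamp covered by the cache
        self.max_range = 0       # largest request range seen so far, in seconds

    def get(self, t_from, now, read_db):
        """Return values with t_from <= timestamp <= now, filling gaps via read_db."""
        self.max_range = max(self.max_range, now - t_from)
        if self.cached_from is None or t_from < self.cached_from:
            # Fetch only the part of the range that the cache does not cover yet.
            upper = now if self.cached_from is None else self.cached_from - 1
            self.values = sorted(read_db(t_from, upper) + self.values)
            self.cached_from = t_from
        start = bisect.bisect_left(self.values, (t_from,))
        return [pair for pair in self.values[start:] if pair[0] <= now]

    def add(self, timestamp, value):
        """New value from a history syncer; ignored until the item is in use."""
        if self.cached_from is not None:
            bisect.insort(self.values, (timestamp, value))

    def trim(self, now):
        """Drop values that moved outside the largest request range."""
        if self.cached_from is None:
            return
        limit = now - self.max_range
        del self.values[:bisect.bisect_left(self.values, (limit,))]
        self.cached_from = max(self.cached_from, limit)
```

Here read_db stands for any callable that returns a list of (timestamp, value) pairs for an inclusive time range. A second request for a range that is already covered is served without calling it again.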

High level pseudo-algorithm:

                                   _______________________________
                                   \                              \
                                    ) Get ITEMID history for RANGE )
                                   /______________________________/
                                                   |
                                                   v
                                          .-----------------.
                                         / Is ITEMID cached? \
                      .-----------------(  (Yes/No)           )-------------------------.
                      |                  \                   /                          |
                      |                   '-----------------'                           |
                      |                                                                 v
                      v                                               .-----------------------------------.
         .------------------------.                                  / Is cache working in low memory mode \
        / Can current storage      \                             .--(  and lastvalue storage mode can't be  )--.
    .--(  mode handle the request?  )---------.                  |   \ used to cache the request? (Yes/No) /   |
    |   \ (Yes/No)                 /          |                  |    '-----------------------------------'    |
    |    '------------------------'           v                  |                                             |
    |                              .--------------------.        |                      .----------------------'
    |                             / Is cache working in  \       |                      v
    |                       .----(  normal mode?          )------.            .-------------------.
    |                       |     \ (Yes/No)             /       |           ( Read VALUES from DB )
    |                       |      '--------------------'        |            '-------------------'
    |                       v                                    |                      |
    |         .---------------------------.                      |                      |
    |        ( Change item's storage mode. )                     |                      v
    |         '---------------------------'                      |      .-------------------------------.
    |                       |                                    |     / Add ITEMID with VALUES to cache \
    '---------------.----------                        .---------|----(  (SUCCEED/FAIL)                   )--.
                    |                                  |         |     \                                 /   |
                    v                                  |         |      '-------------------------------'    |
        .----------------------.                       |         |                                           |
       / Does the cache contain \                      |         |                                           |
  .---(  RANGE values? (Yes/No)  )--.                  |         |                                           |
  |    \                        /   |                  |         .-------------------------------------------'
  |     '----------------------'    |                  |         |
  |                                 v                  |         |
  |                   .---------------------------.    |         |
  |                  ( Read missing values from DB )   |         |
  |                   '---------------------------'    |         |
  |                                 |                  |         |
  |                 .---------------|------------------'         |
  |                 |               v                            |
  |                 |       .--------------.                     |
  |                 |      / Add values     \                    |
  '-----------------.-----(  to cache        )----.              |
                    |      \ (SUCCEED/FAIL) /     |              |
                    |       '--------------'      |              |
                    |                             v              |
                    |                 .----------------------.   |
                    |                ( Drop ITEMID from cache )--.
                    |                 '----------------------'   |
                    v                                            v
        .----------------------.                       .-------------------.
       ( Read VALUES from cache )----------------.----( Read VALUES from DB )
        '----------------------'                 |     '-------------------'
                                                 |
                                                 |
                                                 |
                                                 v
                                 ______________________________
                                 \                             \
                                  ) Return VALUES               )
                                 /_____________________________/
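
One possible reading of the flowchart above, written as a small Python decision function. The function name, the argument names and the step labels are made up for this sketch. Each argument is the answer to one decision box, and the result is the list of steps taken. Where adding a new item to the cache fails, the sketch follows the text of the specification and returns the values already read from the database.

```python
def plan_request(cached, mode_fits=True, low_memory=False,
                 lastvalue_fits=True, range_cached=True, add_ok=True):
    """Return the steps taken for one history request, in order."""
    from_db = ["read values from DB", "return DB values"]

    if not cached:
        # Low memory mode: only requests that fit lastvalue mode are cached.
        if low_memory and not lastvalue_fits:
            return from_db
        steps = ["read values from DB", "add item with values to cache"]
        if add_ok:
            return steps + ["return cached values"]
        # Adding failed: return the values that were just read from the DB.
        return steps + ["return DB values"]

    steps = []
    if not mode_fits:
        # The storage mode can only be changed while in normal mode.
        if low_memory:
            return from_db
        steps.append("change storage mode")

    if not range_cached:
        steps += ["read missing values from DB", "add values to cache"]
        if not add_ok:
            # The item cannot hold the requested range, so it is dropped.
            return steps + ["drop item from cache"] + from_db

    return steps + ["return cached values"]
```

For example, plan_request(cached=True) yields only "return cached values", which is the path the cache is meant to make the common case.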

Internal monitoring

  • New set of parameters will be used to monitor health of the value cache
    • availability: zabbix[vcache,buffer,free/pfree/total/used/pused]
    • effectiveness: zabbix[vcache,cache,hits/requests/misses]
      • hits - number of cache hits (history taken from the cache)
      • requests - total number of requests
      • misses - number of cache misses (history taken from the database)
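
How the monitoring values listed above relate to each other can be sketched as below. The class is hypothetical, and treating requests as the sum of hits and misses is an assumption that follows from the definitions above rather than something the specification states. The percentage values assume a non-zero buffer, i.e. an enabled cache.

```python
from dataclasses import dataclass


@dataclass
class VCacheStats:
    total: int       # buffer size in bytes (must be non-zero)
    used: int = 0    # bytes currently used
    hits: int = 0    # requests served from the cache
    misses: int = 0  # requests served from the database

    def record(self, from_cache):
        """Count one history request."""
        if from_cache:
            self.hits += 1
        else:
            self.misses += 1

    def value(self, group, mode):
        """Look up zabbix[vcache,<group>,<mode>]; unknown keys raise KeyError."""
        free = self.total - self.used
        table = {
            ("buffer", "total"): self.total,
            ("buffer", "used"): self.used,
            ("buffer", "free"): free,
            ("buffer", "pused"): 100.0 * self.used / self.total,
            ("buffer", "pfree"): 100.0 * free / self.total,
            ("cache", "hits"): self.hits,
            ("cache", "misses"): self.misses,
            # Assumption: every request is either a hit or a miss.
            ("cache", "requests"): self.hits + self.misses,
        }
        return table[(group, mode)]
```

A trigger on the buffer values could then warn when pfree gets low, and a graph of hits against misses shows how effective the cache is.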

Front-end and API changes

  • No changes will be introduced by this functionality.
  • Both front-end and API will use direct SQL statements for graphs, latest data and evaluation of complex macros.

Database changes

  • Existing template should be updated to contain items and triggers related to health of the value cache
    • screen, triggers, graphs, items should be updated

Documentation

  • What's new
  • New parameter 'ValueCacheSize' should be described

Test cases

  • It must be verified that the enabled cache:
    • significantly (by up to 100%) reduces the number of SQL statements related to the evaluation of trigger expressions
    • leads to much better performance compared to a disabled cache

ChangeLog

  • v1.1 Removed support of zabbix[vcache,cache,phits/pmisses]

Open issues

  • think about a better name for the "value cache". Must be related to history somehow.