Docs/specs/ZBXNEXT-1633

Monitoring of vSphere, also framework for monitoring of virtual environments

ZBXNEXT-1633

Status: Draft v1.4

Owner: Alexei, Sasha

Summary

Zabbix will support virtual machine monitoring. Zabbix will use LLD rules to automatically discover VMs and create hosts to monitor them based on host prototypes.

Specification

Host discovery

  • The following options can be configured for a host prototype: host name, visible name, status, linked templates and host inventory mode.
    • The host name must contain a discovery macro and must be unique in an LLD rule.
  • Other options will be inherited from the host that the host prototype belongs to.
  • Host prototypes can be created for both hosts and templates.
  • Discovered hosts cannot be directly updated.
  • If a host prototype is deleted, all of the discovered hosts are deleted too.
  • Host prototypes on discovered hosts will not be supported.
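
The host name rule above (a discovery macro is required, and names must be unique within one LLD rule) can be sketched as a small check. This is only an illustration: the function name is hypothetical, and the character set accepted inside `{#...}` is an assumption, not something this specification defines.

```python
import re

# LLD macros have the form {#NAME}; the allowed character set is assumed here.
LLD_MACRO = re.compile(r"\{#[A-Z0-9_.]+\}")


def validate_host_prototype_names(names):
    """Return error strings for the host prototype names of a single LLD rule."""
    errors = []
    seen = set()
    for name in names:
        if not LLD_MACRO.search(name):
            errors.append(f'host name "{name}" must contain a discovery macro')
        if name in seen:
            errors.append(f'host name "{name}" is not unique in the LLD rule')
        seen.add(name)
    return errors
```

An empty list means every name passed both checks.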

VMWare data gathering

The VMWare data are gathered by dedicated processes called vmware collectors. A collector downloads data from a VMWare service, partially parses it and stores it in the shared memory. The shared memory is not locked while data are being downloaded, which is the most time-consuming step. Only when all data from a given VMWare service have been downloaded and prepared is the shared memory locked and the data replaced.

Data from one VMWare service are downloaded by a single collector; the work cannot be shared between multiple collectors. However, multiple collectors can download data from multiple VMWare services simultaneously.

The shared memory is allocated only if at least one vmware collector process is configured to start.
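
The locking pattern described above (download and prepare without the lock, then lock only to swap in the result) might look roughly like the following. It is a sketch, not the server implementation: the class, the function names and the use of a Python dictionary in place of shared memory are all assumptions made for illustration.

```python
import threading


class VMwareCache:
    """Stand-in for the shared memory segment: service URL -> prepared data."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def replace(self, url, prepared):
        # The lock is held only for the swap, never during the download.
        with self._lock:
            self._data[url] = prepared

    def get(self, url):
        with self._lock:
            return self._data.get(url)


def collect(cache, url, download, parse):
    """One collector pass for one service: work unlocked, then swap under lock."""
    raw = download(url)       # slow step, the cache is not locked
    prepared = parse(raw)     # still unlocked
    cache.replace(url, prepared)
```

Readers of the cache keep seeing the previous complete data set until the swap happens, so a slow download does not block the simple checks.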

Simple checks

The VMWare data in the shared memory are accessed through the following simple checks:

  • vmware.cluster.discovery[url] - discovery of VMWare clusters
  • vmware.cluster.status[url, name] - VMWare cluster status
  • vmware.eventlog[url] - VMWare event log
  • vmware.hv.discovery[url] - discovery of VMWare hypervisors, url - the VMWare service URL
  • vmware.hv.cluster.name[url,uuid] - VMWare hypervisor cluster name, url - the VMWare service URL, uuid - the VMWare hypervisor uuid
  • vmware.hv.cpu.usage[url,uuid] - VMWare hypervisor processor usage in Hz, url - the VMWare service URL, uuid - the VMWare hypervisor uuid
  • vmware.hv.full.name[url,uuid] - VMWare hypervisor name, url - the VMWare service URL, uuid - the VMWare hypervisor uuid
  • vmware.hv.hw.cpu.num[url,uuid] - number of processor cores on VMWare hypervisor, url - the VMWare service URL, uuid - the VMWare hypervisor uuid
  • vmware.hv.hw.cpu.freq[url,uuid] - VMWare hypervisor processor frequency, url - the VMWare service URL, uuid - the VMWare hypervisor uuid
  • vmware.hv.hw.cpu.model[url,uuid] - VMWare hypervisor processor model, url - the VMWare service URL, uuid - the VMWare hypervisor uuid
  • vmware.hv.hw.cpu.threads[url,uuid] - number of processor threads on VMWare hypervisor, url - the VMWare service URL, uuid - the VMWare hypervisor uuid
  • vmware.hv.hw.memory[url,uuid] - VMWare hypervisor total memory size, url - the VMWare service URL, uuid - the VMWare hypervisor uuid
  • vmware.hv.hw.model[url,uuid] - VMWare hypervisor model, url - the VMWare service URL, uuid - the VMWare hypervisor uuid
  • vmware.hv.hw.uuid[url,uuid] - VMWare hypervisor bios uuid, url - the VMWare service URL, uuid - the VMWare hypervisor uuid
  • vmware.hv.hw.vendor[url,uuid] - VMWare hypervisor vendor name, url - the VMWare service URL, uuid - the VMWare hypervisor uuid
  • vmware.hv.memory.size.ballooned[url,uuid] - VMWare hypervisor ballooned memory size, url - the VMWare service URL, uuid - the VMWare hypervisor uuid
  • vmware.hv.memory.used[url,uuid] - VMWare hypervisor used memory size, url - the VMWare service URL, uuid - the VMWare hypervisor uuid
  • vmware.hv.status[url,uuid] - VMWare hypervisor status, url - the VMWare service URL, uuid - the VMWare hypervisor uuid
  • vmware.hv.uptime[url,uuid] - VMWare hypervisor uptime, url - the VMWare service URL, uuid - the VMWare hypervisor uuid
  • vmware.hv.version[url,uuid] - VMWare hypervisor version, url - the VMWare service URL, uuid - the VMWare hypervisor uuid
  • vmware.hv.vm.num[url,uuid] - number of virtual machines on VMWare hypervisor, url - the VMWare service URL, uuid - the VMWare hypervisor uuid
  • vmware.vm.discovery[url] - discovery of VMWare virtual machines, url - the VMWare service URL
  • vmware.vm.cluster.name[url,uuid] - VMWare virtual machine name, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.cpu.num[url,uuid] - number of processors on VMWare virtual machine, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.cpu.usage[url,uuid] - VMWare virtual machine processor usage in Hz, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.hv.name[url,uuid] - VMWare virtual machine hypervisor name, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.memory.size[url,uuid] - VMWare virtual machine total memory size, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.memory.size.ballooned[url,uuid] - VMWare virtual machine ballooned memory size, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.memory.size.compressed[url,uuid] - VMWare virtual machine compressed memory size, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.memory.size.swapped[url,uuid] - VMWare virtual machine swapped memory size, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.memory.size.usage.guest[url,uuid] - VMWare virtual machine guest memory usage, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.memory.size.usage.host[url,uuid] - VMWare virtual machine host memory usage, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.memory.size.private[url,uuid] - VMWare virtual machine private memory size, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.memory.size.shared[url,uuid] - VMWare virtual machine shared memory size, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.net.if.discovery[url,uuid] - discovery of VMWare virtual machine network interfaces, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.net.if.in[url,uuid,instance,mode] - VMWare virtual machine network interface input statistics, url - the VMWare service URL, uuid - the VMWare virtual machine uuid, instance - the network interface instance, mode - bps/pps - bytes/packets per second
  • vmware.vm.net.if.out[url,uuid,instance,mode] - VMWare virtual machine network interface output statistics, url - the VMWare service URL, uuid - the VMWare virtual machine uuid, instance - the network interface instance, mode - bps/pps - bytes/packets per second
  • vmware.vm.powerstate[url,uuid] - VMWare virtual machine power state, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.storage.committed[url,uuid] - VMWare virtual machine committed storage space, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.storage.unshared[url,uuid] - VMWare virtual machine unshared storage space, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.storage.uncommitted[url,uuid] - VMWare virtual machine uncommitted storage space, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.uptime[url,uuid] - VMWare virtual machine uptime, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.vfs.dev.discovery[url,uuid] - discovery of VMWare virtual machine disk devices, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.vfs.dev.read[url,uuid,instance,mode] - VMWare virtual machine disk device read statistics, url - the VMWare service URL, uuid - the VMWare virtual machine uuid, instance - the disk device instance, mode - bps/ops - bytes/operations per second
  • vmware.vm.vfs.dev.write[url,uuid,instance,mode] - VMWare virtual machine disk device write statistics, url - the VMWare service URL, uuid - the VMWare virtual machine uuid, instance - the disk device instance, mode - bps/ops - bytes/operations per second
  • vmware.vm.vfs.fs.discovery[url,uuid] - discovery of VMWare virtual machine file systems, url - the VMWare service URL, uuid - the VMWare virtual machine uuid
  • vmware.vm.vfs.fs.size[url,uuid,fsname,mode] - VMWare virtual machine file system statistics, url - the VMWare service URL, uuid - the VMWare virtual machine uuid, fsname - the file system name, mode - total/free/used/pfree/pused
  • vmware.hv.datastore.read[url,uuid,datastore,mode] - VMWare hypervisor datastore read statistics, url - the VMWare service URL, uuid - the VMWare hypervisor uuid, datastore - the datastore name, mode - latency
  • vmware.hv.datastore.write[url,uuid,datastore,mode] - VMWare hypervisor datastore write statistics, url - the VMWare service URL, uuid - the VMWare hypervisor uuid, datastore - the datastore name, mode - latency
  • vmware.hv.network.in[url,uuid,mode] - VMWare hypervisor network input statistics, url - the VMWare service URL, uuid - the VMWare hypervisor uuid, mode - bps
  • vmware.hv.network.out[url,uuid,mode] - VMWare hypervisor network output statistics, url - the VMWare service URL, uuid - the VMWare hypervisor uuid, mode - bps
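
To make the key layout concrete, the helper below assembles a key string from a check name and its parameters. It is a simplified sketch: the function name is hypothetical, the URL and identifiers in the usage note are placeholders, and the quoting rule (wrap a parameter in double quotes when it contains a comma or a closing bracket) is a reduced version of the general item key syntax.

```python
def item_key(name, *params):
    """Build 'name[p1,p2,...]', quoting parameters that would break the key."""

    def quote(param):
        param = str(param)
        needs_quotes = (
            param.startswith(('"', "[", " ")) or "," in param or "]" in param
        )
        if needs_quotes:
            # Inner double quotes are backslash-escaped inside a quoted parameter.
            return '"' + param.replace('"', '\\"') + '"'
        return param

    return f"{name}[{','.join(quote(p) for p in params)}]"
```

For example, `item_key("vmware.vm.net.if.in", "https://vcenter.example.com/sdk", "UUID", "4000", "bps")` yields a key with four unquoted parameters, while a datastore name containing a comma would be quoted.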

Configuration file changes

The following configuration items must be added to allow customization of vmware collectors:

  • StartVMWareCollectors - the number of pre-forked vmware collector processes, 0-250, default 0.
  • VMWareFrequency - the time in seconds between polling VMWare service for new data, 10-86400, default 60.
  • VMWareCacheSize - the size of shared memory allocated to store downloaded VMWare data, 256K-2G, default 8M.
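
The ranges and defaults listed above can be expressed as a small validation routine. This is a sketch under assumptions: the function names are hypothetical, the K/M/G suffixes are taken to be powers of 1024, and the cache size option is written with the VMWare spelling.

```python
SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert '256K', '8M', '2G' or a plain integer string to a number."""
    text = text.strip().upper()
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)


# option name -> (minimum, maximum, default), as listed in the specification
LIMITS = {
    "StartVMWareCollectors": (0, 250, 0),
    "VMWareFrequency": (10, 86400, 60),
    "VMWareCacheSize": (parse_size("256K"), parse_size("2G"), parse_size("8M")),
}


def resolve_option(name, raw=None):
    """Return the effective value of an option, or raise ValueError if out of range."""
    low, high, default = LIMITS[name]
    if raw is None:
        return default
    value = parse_size(raw)
    if not low <= value <= high:
        raise ValueError(f"{name}={raw} is outside the range {low}-{high}")
    return value
```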

Internal items

A new internal item must be added to allow VMWare cache monitoring:

  • zabbix[vmware,buffer,<mode>] - where mode is one of the following: total, free, pfree, used or pused
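
How the five modes relate to each other can be shown with a short calculation. The sketch assumes that pfree and pused are percentages of the total cache size, by analogy with other Zabbix internal cache items; the function name and arguments are hypothetical.

```python
def vmware_buffer(mode, total, used):
    """Return the value of one buffer mode given total and used bytes."""
    if total <= 0:
        raise ValueError("total must be positive")
    free = total - used
    values = {
        "total": total,
        "free": free,
        "used": used,
        "pfree": 100.0 * free / total,  # assumed: percentage of free space
        "pused": 100.0 * used / total,  # assumed: percentage of used space
    }
    return values[mode]
```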

Self monitoring

Zabbix self monitoring must be updated to recognize the vmware collector processes.

Additional libraries

  • Zabbix will use libxml2 for parsing of XML SOAP responses
  • The libcurl library will be used for HTTP and HTTPS communication

Server-side changes

New hosts will be created with host groups as defined in host group prototypes. Zabbix server will not create any new host groups; it will link to existing host groups only. If a referenced host group does not exist, the server will not add the new host and will report an error message for the discovery rule.

Frontend changes

  • The "User name" and "Password" must be displayed for simple check items, item prototypes and LLD rules; they should be optional.
Host prototype configuration
  • The host prototype configuration section will be added after graph prototypes.
  • A link to host prototype configuration will be added to the LLD rule navigation bar after graph prototypes.
  • Editing a host prototype on a host:
    • Only the host prototype configuration options will be editable; all fields inherited from the parent host must be disabled.
    • The host inventory fields will not be displayed.
  • Editing a host prototype on a template:
    • Only the host prototype configuration options will be displayed.
    • The "IPMI" tab will not be displayed.
    • The host inventory fields will not be displayed.
  • Editing an inherited host prototype:
    • Only the status parameter can be overridden
  • Host prototypes can be cloned.
  • Host prototypes will be copied when full cloning a host.
  • Host prototypes will be exported and imported.
    • Importing a group prototype that references a missing host group should respect the "Add missing group" checkbox, that is, create a new host group if it's checked and fail otherwise.
  • Validation must be performed when linking templates to host prototypes.
  • Host prototype will support configuration of host group prototypes:
    • Host group prototypes will be configured in a separate "Groups" tab.
    • The first block will look similar to macro configuration and will allow creating custom host groups.
    • The second block will use the Chosen plugin to select existing groups.
    • Discovered groups cannot be used in group prototypes.
Discovered host configuration
  • Discovered host list:
    • Discovered hosts will be prefixed with the name of the LLD rule in gold color similar to items.
    • If a host is no longer discovered, an orange icon will appear in the "Availability" column with a message saying "The host is not discovered anymore and will be deleted in %1$s (on %2$s at %3$s)."
  • Discovered hosts can be cloned, full cloned and deleted.
  • Mass configuration checkbox will be enabled for discovered hosts: they can be mass enabled, disabled and deleted.
  • Discovered hosts cannot be exported or mass updated; when trying to export or mass update discovered hosts they will be silently skipped.
  • Discovered host configuration form:
    • All fields except for "Status" and host inventory fields will be readonly.
    • A link to the LLD rule will be displayed on top of the form.
  • When deleting a template linked to a discovered host, all objects must be cleared.
  • In host group, proxy and template configuration forms:
    • In the left column of the tweenbox the corresponding discovered hosts must appear as read-only items.
    • In the right column discovered hosts must not appear at all.
  • Host prototype configuration section must not be available for discovered hosts.
Discovered host group configuration
  • It will be possible to manually add a host to a discovered group.
  • Discovered host group list:
    • Discovered host groups will be prefixed with the name of the LLD rule in gold color similar to items.
    • If a host group is no longer discovered, a "Status" column will appear containing an orange icon with a message saying "The host group is not discovered anymore and will be deleted in %1$s (on %2$s at %3$s)."
  • Discovered host group configuration form:
    • Discovered host groups cannot be renamed.
Administration / general / other
  • Discovered host groups cannot be used for network discovery discovered hosts.

API changes

discoveryrule
  • discoveryrule:
    • the username and password properties will be supported for simple check LLD rules
  • discoveryrule.get:
    • new selectHostPrototypes parameter will return LLD rule's host prototypes; it will support the count value. It must return an empty array for LLD rules on discovered hosts, even if the prototypes exist in the database.
hostprototype
  • A new hostprototype API will allow managing host prototypes.
  • It will implement the following methods: get, create, update, delete, isreadable and iswritable.
  • hostprototype.get will support the following parameters:
    • discoveryids - string/array, return only host prototypes that belong to the given LLD rules.
    • inherited - boolean, return only inherited host prototypes.
    • selectDiscoveryRule - query, return the LLD rule that the host prototype belongs to.
    • selectParentHost - query, return the host that the host prototype belongs to.
    • selectTemplates - query, return templates linked to the host prototype.
    • selectInventory - query, return host prototype inventory.
    • selectGroupPrototypes - query, return host group prototypes.
  • hostprototype.create will support the following additional parameters:
    • ruleid - string, ID of the LLD rule that the host prototype belongs to.
    • templates - array, templates to link to the host prototype.
    • groupPrototypes - array, host group prototypes to create for the host prototype.
    • inventory - object, host prototype inventory.
  • hostprototype.update will support the following additional parameters:
    • templates - array, templates to replace the ones linked to the host prototype.
    • groupPrototypes - array, host group prototypes to replace the existing group prototypes.
    • inventory - object, host prototype inventory.
  • The host prototype inventory object will use only the inventory_mode field.
  • When trying to create a host prototype on a discovered host an error must be triggered: Cannot create a host prototype on a discovered host.
  • hostprototype.get must not return host prototypes from discovered hosts.
  • hostprototype.delete
    • Deleting a host prototype will delete all discovered hosts without checking permissions.
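
A hostprototype.create call using the parameters listed above might be assembled as follows. This is a sketch only: the JSON-RPC 2.0 envelope with an "auth" field, the "host", "templateid" and "groupid" property names, and the function itself are assumptions for illustration, and every ID passed in is a placeholder.

```python
import json


def hostprototype_create_request(auth, ruleid, host, groupid, templateid,
                                 inventory_mode, request_id=1):
    """Build the request body for creating one host prototype."""
    return {
        "jsonrpc": "2.0",
        "method": "hostprototype.create",
        "params": {
            "host": host,                                 # must contain an LLD macro
            "ruleid": ruleid,                             # LLD rule the prototype belongs to
            "templates": [{"templateid": templateid}],    # templates to link
            "groupPrototypes": [{"groupid": groupid}],    # based on an existing host group
            "inventory": {"inventory_mode": inventory_mode},
        },
        "auth": auth,
        "id": request_id,
    }


if __name__ == "__main__":
    # Placeholder values throughout; prints the JSON that would be sent.
    body = hostprototype_create_request("token", "23", "{#VM.NAME}", "2", "10001", 0)
    print(json.dumps(body, indent=2))
```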
group prototype
  • A host prototype must contain at least one group prototype based on an existing host group.
  • A group prototype must contain either a name with an LLD macro or an ID of an existing host group.
    • If a group prototype is created with a name, it cannot be linked to an existing group and vice versa.
  • When deleting a host prototype or a group prototype, the discovered host groups must be deleted.
    • Permissions are ignored when deleting discovered host groups.
    • If a discovered group cannot be deleted, an error must be displayed.
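
The group prototype rules above can be summarised as a validation sketch. The function name and the dictionary layout ("name", "groupid") are assumptions chosen to mirror the group_prototype table, and the LLD macro pattern is simplified.

```python
import re

# Simplified LLD macro pattern; the allowed character set is assumed.
LLD_MACRO = re.compile(r"\{#[A-Z0-9_.]+\}")


def validate_group_prototypes(group_prototypes):
    """Return error strings for the group prototypes of one host prototype."""
    errors = []
    has_existing_group = False
    for gp in group_prototypes:
        name = gp.get("name")
        groupid = gp.get("groupid")
        if name and groupid:
            errors.append("a group prototype cannot have both a name and a group ID")
        elif groupid:
            has_existing_group = True
        elif name:
            if not LLD_MACRO.search(name):
                errors.append(f'group prototype name "{name}" must contain an LLD macro')
        else:
            errors.append("a group prototype needs either a name or a group ID")
    if not has_existing_group:
        errors.append("at least one group prototype must be based on an existing host group")
    return errors
```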
host
  • host:
    • additional flags property will be added. Integer, supported values: 0 - normal hosts; 4 - discovered hosts.
  • host.get:
    • host.get will return both normal and discovered hosts.
    • new selectDiscoveryRule parameter will return the LLD rule that created the discovered host.
    • new selectHostDiscovery parameter will return the host_discovery object.
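
Because host.get returns both normal and discovered hosts, a client can separate them by the flags property. The sketch below assumes each host is a dictionary and that flags may arrive either as an integer or as a numeric string; the constant and function names are hypothetical.

```python
HOST_FLAG_NORMAL = 0
HOST_FLAG_DISCOVERED = 4


def is_discovered(host):
    """True if the host object carries the 'discovered host' flag value."""
    return int(host.get("flags", HOST_FLAG_NORMAL)) == HOST_FLAG_DISCOVERED


def discovered_hosts(hosts):
    """Keep only the discovered hosts from a host.get result list."""
    return [host for host in hosts if is_discovered(host)]
```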
hostinterface
  • When updating host interfaces on a discovered host an error will appear: Cannot update interface for discovered host "%1$s".
hostgroup
  • hostgroup:
    • additional flags property will be added. Integer, supported values: 0 - normal host group; 4 - discovered host group.
  • hostgroup.get:
    • new selectDiscoveryRule parameter will return the LLD rule that created the discovered host group.
    • new selectHostDiscovery parameter will return the group_discovery object.
  • Host groups that are used in group prototype cannot be deleted.
  • When updating host groups on a discovered host an error will appear: Cannot update groups for discovered host "%1$s".
  • When renaming a discovered host group an error will appear: Cannot update name for discovered hostgroup "%1$s".
item
  • item:
    • the username and password properties will be supported for simple check items
itemprototype
  • itemprototype:
    • the username and password properties will be supported for simple check item prototypes
proxy
  • When updating proxy on a discovered host an error will appear: Cannot update proxy on discovered host "%1$s".
template
  • When updating templates on a discovered host an error will appear: Cannot update templates on discovered host "%1$s".

Translation strings

  • New or updated string

Database changes

  • New fields for table 'hosts'
 FIELD         |flags          |t_integer      |'0'    |NOT NULL       |ZBX_SYNC
 FIELD         |templateid     |t_id           |       |NULL           |ZBX_SYNC               |3|hosts     |hostid
  • New field for table 'groups'
 FIELD         |flags          |t_integer      |'0'    |NOT NULL       |ZBX_SYNC
  • New table 'host_discovery'
 TABLE|host_discovery|hostid|ZBX_SYNC,ZBX_DATA
 FIELD		|hostid		|t_id		|	|NOT NULL	|0			|1|hosts
 FIELD		|parent_hostid	|t_id		|	|NULL		|ZBX_SYNC		|2|hosts	|hostid		|RESTRICT
 FIELD		|parent_itemid	|t_id		|	|NULL		|ZBX_SYNC		|3|items	|itemid		|RESTRICT
 FIELD		|host		|t_varchar(64)	|''	|NOT NULL	|ZBX_NODATA
 FIELD		|lastcheck	|t_integer	|'0'	|NOT NULL	|ZBX_NODATA
 FIELD		|ts_delete	|t_time		|'0'	|NOT NULL	|ZBX_NODATA
  • New table 'interface_discovery'
 TABLE|interface_discovery|interfaceid|ZBX_SYNC
 FIELD		|interfaceid	|t_id		|	|NOT NULL	|ZBX_SYNC		|1|interface
 FIELD		|parent_interfaceid|t_id	|	|NOT NULL	|ZBX_SYNC		|2|interface	|interfaceid
  • New table 'group_prototype'
 TABLE|group_prototype|group_prototypeid|ZBX_SYNC,ZBX_DATA
 FIELD		|group_prototypeid|t_id		|	|NOT NULL	|0
 FIELD		|hostid		|t_id		|	|NOT NULL	|ZBX_SYNC		|1|hosts
 FIELD		|name		|t_varchar(64)	|''	|NOT NULL	|ZBX_SYNC
 FIELD		|groupid	|t_id		|	|NULL		|ZBX_SYNC		|2|groups	|		|RESTRICT
 FIELD		|templateid	|t_id		|	|NULL		|ZBX_SYNC		|3|group_prototype|group_prototypeid
 INDEX		|1		|hostid
  • New table 'group_discovery'
 TABLE|group_discovery|groupid|ZBX_SYNC,ZBX_DATA
 FIELD		|groupid	|t_id		|	|NOT NULL	|ZBX_SYNC		|1|groups
 FIELD		|parent_group_prototypeid|t_id	|	|NOT NULL	|ZBX_SYNC		|2|group_prototype|group_prototypeid|RESTRICT
 FIELD		|name		|t_varchar(64)	|''	|NOT NULL	|ZBX_NODATA
 FIELD		|lastcheck	|t_integer	|'0'	|NOT NULL	|ZBX_NODATA
 FIELD		|ts_delete	|t_time		|'0'	|NOT NULL	|ZBX_NODATA

Database object relations

ZBXNEXT-1633.host-prototypes.png

Documentation

  • whatsnew
  • Document new configuration option --with-libxml2
  • new virtual machine monitoring section
  • update all sections/screenshots where "host prototypes" could appear

Test cases

Open questions

  • We should somehow mark discovered host groups in the group list.

ChangeLog

  • Draft 1.0
    • New tables 'group_prototype' and 'group_discovery'
    • Added description of configuration of host group prototypes
    • Added server side changes
    • Added API changes
  • 1.1
    • Host groups that are used in group prototype cannot be deleted
    • A host prototype must contain at least one group prototype based on an existing host group
  • 1.2
    • Discovered host groups will be prefixed with the name of the LLD rule in gold color similar to items
    • Discovered groups cannot be used in group prototypes
    • Discovered host groups cannot be used for discovered hosts
  • 1.3
    • Importing a group prototype that references a missing host group should respect the "Add missing group" checkbox
  • 1.4
    • The "User name" and "Password" must be displayed for simple check items, item prototypes and LLD rules