A prototype of a monitoring system
Go to file
R. Tyler Croy 4afb5d56d1 Set up a simple test harness/infrastructure for AUnit 2014-06-14 21:26:14 -07:00
contrib Import AUnit 2013 into the tree 2014-06-14 21:26:13 -07:00
src Switch to using Alog inside of blick-statsd for now 2014-06-14 21:26:13 -07:00
tests Set up a simple test harness/infrastructure for AUnit 2014-06-14 21:26:14 -07:00
.gitignore Start working on a blick-statsd agent for side-channeling statsd metrics 2014-06-14 21:25:52 -07:00
.gitmodules Add dbus and zeromq-ada dependencies into the tree 2014-05-18 17:17:00 -07:00
DEPENDS.md Start working on a blick-statsd agent for side-channeling statsd metrics 2014-06-14 21:25:52 -07:00
Makefile Set up a simple test harness/infrastructure for AUnit 2014-06-14 21:26:14 -07:00
NOTES.md Start working on a blick-statsd agent for side-channeling statsd metrics 2014-06-14 21:25:52 -07:00
README.md Add notes with thoughts on agent design/structure 2014-05-19 16:41:42 -07:00
blick.gpr Set up a simple test harness/infrastructure for AUnit 2014-06-14 21:26:14 -07:00

README.md

Blick

Blick is currently a concept, purely an experimental approach to a monitoring system. While conceptually similar to Sensu, the idea behind Blick is to use ZeroMQ to create event-driven agents which stream events directly to a server.

Contributing

The project is still in its infancy, but you can chat with us in the #blick channel on the Freenode IRC network.

Design

                              +--------> +--------------+       +--------------+
+---------------+             |          | Blick Server |       | Carbon Cache |
|  Blick Agent  |             |          +--------------+       +--------------+
+---------------+---> +-------+-----+        ^     ^                     ^
                      | Blick Relay |        |     |                     |
+---------------+---> +-------------+        |     +---+--------------+--+
|  Blick Agent  |                            |         | Blick Statsd |
+---------------+                            |         +--------------+
                                             |                   ^
                                   +---------+-----+             |
                                   |  Blick Agent  |       +-----+-------+
                                   +---------------+       | Application |
                                                           +-------------+

Server

The Server is the main destionation for all Blick events. The Server is responsible for aggregating event data, presenting data, and issuing alerts based on that data.

The server should also receive some node information from Agents which can be used to pull a node classification from a Puppet master or Chef server. Ideally, the Server would be able to present automatic checks and alerts based on what is presented in the node's classification. For example, if a service { 'httpd': ensure => running, } is defined in the node's [Puppet] resource graph then the Server should automatically alert if the httpd process is not running.

Agent

The Agent's sole responsibility is to publish events via a ZeroMQ socket to the Server, or a Relay.

The Agent should be primarily event driven, allowing multiple sources of events, e.g.:

  • System-level: Provided by dbus or Kernel uevents (TBD)
  • Process-level: Provided by systemd to dbus

For non-evented data (/proc related events, non-standard process events) a polling loop should exist in the Agent, but this should not be the default mechanism for event acquisition.

It's currently unclear where application/process-level data, such as JMX, should be gathered from. This might make sense to live in the Agent, or the Relay.

Agent Design

                      +--------+                                                              
                      | ZeroMQ |                                                              
                      +--------+                                                              
                       ^                                                                      
                       |                                                                      
                       |                                                                      
+---------+     +------+----+           +-----------------+                                   
|Main loop|     | Publisher |<----------+ Process Monitor |                                   
+---------+     +-----------+           |   (listener)    |<--------- systemd/dbus            
                     ^ ^ ^ ^            +-----------------+                                   
                     | | | |                                                                  
                     | | | |           +--------------------+                                 
                     | | | +-----------+ Filesystem Monitor |<------- inotify/kqueue          
                     | | |             |    (listener)      |                                 
    +------------+   | | |             +--------------------+                                 
    | Heartbeat  +---+ | |                                                                    
    | (observer) |     | |                +---------------+                                   
    +------------+     | |                | MySQL Slow    |                                   
                       | +----------------+ Query Monitor |<--------- inotify/kqueue file-read
  +--------------+     |                  |  (listener)   |                                   
  | Disk Monitor +-----+                  +---------------+                                   
  |  (observer)  |                                                                            
  +--------------+                                                                            
Listeners

Listeners are evented entities within the agent, in order for a monitor to act as a listener it must receive events from some external source on the system being monitored.

Unless otherwise required, all monitors should be listeners by default.

Observers

Observers are all polling/interval based monitors that the agent will run in a separate thread.

Statsd

The intended purpose of the Statsd daemon is to provide application-based monitoring and alerting. Blick should not replace Graphite but by using the Blick Statsd server as the destination of Graphite events, Blick can get a side-channel of these events and provide alerts based on their values.

Relay

The Relay is more of a planned addition to help Blick scale. The Relay sitting at the top of a rack, siphoning events from Agents as well as SNMP providers into the Server would provide a more scalable means of data aggregation.