A prototype of a monitoring system
R. Tyler Croy a4cd46aa37 Add the yard rake task for generating internal API documentation 2014-07-06 16:29:00 -07:00
config Initial Ruby commit 2014-07-05 13:57:58 -07:00
lib Properly "emit" a heartbeat 2014-07-05 22:00:03 -07:00
proto Add some basic protobuf message types for comunicating from the agent to collector 2014-07-05 15:06:51 -07:00
scripts Add some zeromq gems and a helper script for running JRuby 2014-07-05 15:31:46 -07:00
spec Flesh out the scheduler to run observers as TimerTasks 2014-07-05 21:43:01 -07:00
tasks Add the yard rake task for generating internal API documentation 2014-07-06 16:29:00 -07:00
.gitignore Add the yard rake task for generating internal API documentation 2014-07-06 16:29:00 -07:00
.rspec Add the outline of an observer by way of the heartbeat observer 2014-07-05 16:17:03 -07:00
Gemfile Add the yard rake task for generating internal API documentation 2014-07-06 16:29:00 -07:00
Gemfile.lock Add the yard rake task for generating internal API documentation 2014-07-06 16:29:00 -07:00
README.md Initial Ruby commit 2014-07-05 13:57:58 -07:00
Rakefile Add support for compiling protobufs and running tests 2014-07-05 14:55:17 -07:00

README.md

Blick

Blick is currently a concept, purely an experimental approach to a monitoring system. While conceptually similar to Sensu, the idea behind Blick is to use ZeroMQ to create event-driven agents which stream events directly to a server.

Contributing

The project is still in its infancy, but you can chat with us in the #blick channel on the Freenode IRC network.

Design

                              +--------> +--------------+       +--------------+
+---------------+             |          | Blick Server |       | Carbon Cache |
|  Blick Agent  |             |          +--------------+       +--------------+
+---------------+---> +-------+-----+        ^     ^                     ^
                      | Blick Relay |        |     |                     |
+---------------+---> +-------------+        |     +---+--------------+--+
|  Blick Agent  |                            |         | Blick Statsd |
+---------------+                            |         +--------------+
                                             |                   ^
                                   +---------+-----+             |
                                   |  Blick Agent  |       +-----+-------+
                                   +---------------+       | Application |
                                                           +-------------+

Server

The Server is the main destination for all Blick events. The Server is responsible for aggregating event data, presenting data, and issuing alerts based on that data.

The Server should also receive some node information from Agents which can be used to pull a node classification from a Puppet master or Chef server. Ideally, the Server would be able to present automatic checks and alerts based on what is presented in the node's classification. For example, if a service { 'httpd': ensure => running, } is defined in the node's [Puppet] resource graph, then the Server should automatically alert if the httpd process is not running.
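
As a rough sketch of how such automatic checks might be derived, assume the Server has already fetched the node's classification and flattened it into a list of resource hashes. The `derive_checks` and `alerts_for` helpers and the hash layout (`:type`, `:title`, `:ensure`) are hypothetical; they are not part of Blick, Puppet, or Chef.

```ruby
# Hypothetical: a node's resources as plain hashes, however they were fetched.
RESOURCES = [
  { type: 'service', title: 'httpd', ensure: 'running' },
  { type: 'package', title: 'vim',   ensure: 'installed' },
  { type: 'service', title: 'cups',  ensure: 'stopped' }
].freeze

# Return one process check per service that is expected to be running.
def derive_checks(resources)
  resources
    .select { |r| r[:type] == 'service' && r[:ensure] == 'running' }
    .map { |r| { check: :process_running, process: r[:title] } }
end

# Compare the derived checks against the process names an Agent reported.
def alerts_for(checks, running_processes)
  checks
    .reject { |c| running_processes.include?(c[:process]) }
    .map { |c| "#{c[:process]} is not running" }
end
```

With the resources above, only httpd yields a check, so `alerts_for(derive_checks(RESOURCES), ['sshd'])` would report that httpd is not running.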

Agent

The Agent's sole responsibility is to publish events via a ZeroMQ socket to the Server or a Relay.

The Agent should be primarily event driven, allowing multiple sources of events, e.g.:

  • System-level: Provided by dbus or Kernel uevents (TBD)
  • Process-level: Provided by systemd to dbus

For non-evented data (/proc-related events, non-standard process events) a polling loop should exist in the Agent, but this should not be the default mechanism for event acquisition.

It's currently unclear where application/process-level data, such as JMX, should be gathered from. It might make sense for this to live in either the Agent or the Relay.

Agent Design

                      +--------+                                                              
                      | ZeroMQ |                                                              
                      +--------+                                                              
                       ^                                                                      
                       |                                                                      
                       |                                                                      
+---------+     +------+----+           +-----------------+                                   
|Main loop|     | Publisher |<----------+ Process Monitor |                                   
+---------+     +-----------+           |   (listener)    |<--------- systemd/dbus            
                     ^ ^ ^ ^            +-----------------+                                   
                     | | | |                                                                  
                     | | | |           +--------------------+                                 
                     | | | +-----------+ Filesystem Monitor |<------- inotify/kqueue          
                     | | |             |    (listener)      |                                 
    +------------+   | | |             +--------------------+                                 
    | Heartbeat  +---+ | |                                                                    
    | (observer) |     | |                +---------------+                                   
    +------------+     | |                | MySQL Slow    |                                   
                       | +----------------+ Query Monitor |<--------- inotify/kqueue file-read
  +--------------+     |                  |  (listener)   |                                   
  | Disk Monitor +-----+                  +---------------+                                   
  |  (observer)  |                                                                            
  +--------------+                                                                            

Listeners

Listeners are evented entities within the agent. For a monitor to act as a listener, it must receive events from some external source on the system being monitored.

Unless otherwise required, all monitors should be listeners by default.
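
A minimal sketch of the listener idea, assuming a callback-style event source. Both classes are hypothetical: `QueuePublisher` stands in for the Publisher in the diagram (a real one would write to a ZeroMQ socket), and `on_event` is an invented hook that some systemd/dbus binding would call.

```ruby
require 'thread'

# Hypothetical stand-in for the Publisher; it only queues events in memory.
class QueuePublisher
  attr_reader :events

  def initialize
    @events = Queue.new
  end

  def publish(event)
    @events << event
  end
end

# A listener never polls: it only reacts when its source calls back.
class ProcessListener
  def initialize(publisher)
    @publisher = publisher
  end

  # Called by the external source (e.g. a systemd/dbus binding) on each change.
  def on_event(unit, state)
    @publisher.publish(monitor: 'process', unit: unit, state: state)
  end
end
```

The listener holds no timer or loop of its own, which is what distinguishes it from an observer.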

Observers

Observers are polling or interval-based monitors that the agent runs in a separate thread.
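
The commit history mentions running observers as TimerTasks under JRuby. The sketch below swaps in a plain Ruby thread instead so it also runs outside the JVM. The `IntervalObserver` class is hypothetical and is not the scheduler in this repository.

```ruby
# Hypothetical observer: runs a block every `interval` seconds in its own thread.
class IntervalObserver
  def initialize(interval, &check)
    @interval = interval
    @check = check
    @running = false
  end

  # Start polling in a background thread; returns self so calls can be chained.
  def start
    @running = true
    @thread = Thread.new do
      while @running
        @check.call
        sleep @interval
      end
    end
    self
  end

  # Ask the loop to finish and wait for the thread to exit.
  def stop
    @running = false
    @thread.join if @thread
  end
end
```

A heartbeat observer could then be written as `IntervalObserver.new(10) { publisher.publish(monitor: 'heartbeat') }.start`, where `publisher` is whatever object the agent uses to emit events.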

Statsd

The intended purpose of the Statsd daemon is to provide application-based monitoring and alerting. Blick should not replace Graphite, but by using the Blick Statsd server as the destination of Graphite events, Blick can get a side-channel of these events and provide alerts based on their values.
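
To illustrate the side-channel idea, here is a sketch that parses a statsd-style line (`name:value|type`) and checks it against a threshold. The `THRESHOLDS` table, the metric name, and both helper functions are assumptions made for illustration; forwarding the metric on to Carbon is left out.

```ruby
# Parse one statsd line of the form "name:value|type" (any sample rate is ignored).
def parse_statsd(line)
  name, rest = line.strip.split(':', 2)
  return nil if name.nil? || name.empty? || rest.nil?

  value, type = rest.split('|')
  return nil if value.nil? || type.nil?

  { name: name, value: Float(value), type: type }
rescue ArgumentError
  nil # the value was not numeric
end

# Hypothetical thresholds: alert when a metric exceeds its limit.
THRESHOLDS = { 'api.response_time' => 500.0 }.freeze

# Return an alert message, or nil when the metric is within limits or untracked.
def alert_for(metric)
  limit = THRESHOLDS[metric[:name]]
  return nil unless limit && metric[:value] > limit

  "#{metric[:name]} is #{metric[:value]} (limit #{limit})"
end
```

Under these assumptions, a line such as `api.response_time:750|ms` would produce an alert, while `api.response_time:120|ms` would pass through silently.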

Relay

The Relay is more of a planned addition to help Blick scale. A Relay sitting at the top of a rack, siphoning events from Agents as well as SNMP providers into the Server, would provide a more scalable means of data aggregation.