Some tools for working with Graphite
R. Tyler Croy febb5c9275 Flesh out some readme for the project and some of the uses 2011-04-05 14:21:18 -07:00
monitors Add a simple monitor script to report the length of user-specified lists in redis 2011-04-04 10:07:39 -07:00
.gitignore
README.markdown Flesh out some readme for the project and some of the uses 2011-04-05 14:21:18 -07:00
aggclient.py Add a very basic aggregator client written in python 2011-03-29 11:04:27 -07:00
aggregated.py For all aggregate timing events, record an upper and lower bound, and a count 2011-03-25 16:48:40 -07:00
runmonitors.py Add support for running a specific (non-auto) monitor in runmonitors.py 2011-04-04 10:07:06 -07:00

README.markdown

graphite-tools

This repository contains a number of handy tools for using Graphite in a production environment, particularly with Python applications.

aggregated.py

This daemon is modeled after Etsy's statsd and listens for UDP packets with either timing information or count information.
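As a rough sketch of what a client for such a daemon might send, the snippet below builds statsd-style payloads and fires them over UDP. The wire format (`name:value|c` for counts, `name:value|ms` for timings), the port 8125, and the function names are all assumptions borrowed from statsd; check aggregated.py and aggclient.py for the format the daemon actually expects.

```python
import socket


def format_count(name, value=1):
    # ASSUMPTION: statsd-style counter payload, e.g. "app.method.cache.hit:1|c"
    return "%s:%d|c" % (name, value)


def format_timing(name, millis):
    # ASSUMPTION: statsd-style timer payload with the value in milliseconds
    return "%s:%d|ms" % (name, int(round(millis)))


def send(payload, host="127.0.0.1", port=8125):
    # ASSUMPTION: 8125 is statsd's default port, not necessarily aggregated.py's.
    # UDP is fire-and-forget, so a stopped daemon does not block the caller.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload.encode("ascii"), (host, port))
    finally:
        sock.close()
```

Because UDP needs no connection, instrumentation like this adds very little overhead to the application being measured.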

For timing information, aggregated.py will tabulate the following metrics to be sent to the Carbon daemon every minute:

  • avg - average for the minute
  • upper - highest value seen in the minute
  • lower - lowest value seen in the minute
  • count - total events with this label seen in the minute
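The four timing metrics above can be illustrated with a minimal sketch. The function name `summarize_timings` and the dictionary layout are hypothetical and not taken from aggregated.py; the sketch only shows how one minute of values for a label collapses into avg, upper, lower and count.

```python
def summarize_timings(label, values):
    """Collapse one minute of timing values into the four derived metrics."""
    if not values:
        # Nothing was recorded for this label during the minute
        return {}
    return {
        label + ".avg": sum(values) / float(len(values)),
        label + ".upper": max(values),
        label + ".lower": min(values),
        label + ".count": len(values),
    }
```

For example, `summarize_timings('app.method.cache.miss', [120, 80, 100])` yields an avg of 100.0, an upper of 120, a lower of 80 and a count of 3.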

For count information, aggregated.py will send the sum of all the events to the Carbon daemon every minute.
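A sketch of the count path, under stated assumptions: `sum_counts` and `carbon_lines` are hypothetical helper names, not functions from aggregated.py. The output uses Carbon's plaintext protocol, in which each line has the form `<metric path> <value> <timestamp>`.

```python
import time
from collections import defaultdict


def sum_counts(events):
    """Sum (label, value) count events received during one minute."""
    totals = defaultdict(int)
    for label, value in events:
        totals[label] += value
    return dict(totals)


def carbon_lines(metrics, timestamp=None):
    """Render metrics as Carbon plaintext lines: '<path> <value> <timestamp>'."""
    if timestamp is None:
        timestamp = int(time.time())
    return ["%s %s %d\n" % (name, value, timestamp)
            for name, value in sorted(metrics.items())]
```

The resulting lines could then be written to a TCP connection to Carbon's plaintext listener, which uses port 2003 by default.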

The two kinds of event are useful together, for instance when profiling cache hits and misses:

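import time

def method():  # placeholder name for the function being profiled
    result = cache.get('mykey')
    if result:
        # Cache hit: count it and return the cached value
        graphite.incr('app.method.cache.hit')
        return result

    # Cache miss: time the slow path and report the duration
    start = time.time()
    result = slow_uncached_method()
    graphite.timing('app.method.cache.miss', (time.time() - start))
    return result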

The above example would give you the following metrics in Graphite:

  • app.method.cache.hit
  • app.method.cache.miss.avg
  • app.method.cache.miss.count
  • app.method.cache.miss.upper
  • app.method.cache.miss.lower

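With these five metrics you can compare total cache hits (app.method.cache.hit) against total cache misses (app.method.cache.miss.count) and plot the average cache miss time (app.method.cache.miss.avg).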
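As a small worked example, a hit ratio can be derived from the hit counter and the miss count for the same minute. The function name `cache_hit_ratio` is hypothetical; its inputs are assumed to be the values of app.method.cache.hit and app.method.cache.miss.count.

```python
def cache_hit_ratio(hits, misses):
    """Return the fraction of lookups served from cache, or None if idle."""
    total = hits + misses
    if total == 0:
        # No lookups in this interval, so the ratio is undefined
        return None
    return hits / float(total)
```

With 75 hits and 25 misses in a minute, `cache_hit_ratio(75, 25)` gives 0.75.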

monitors

These system monitors are still in development, but are meant to be run from something like Jenkins on a timer to report information about machines or services running in the production cluster.
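The repository's commit log mentions a monitor that reports the length of user-specified lists in Redis. The sketch below shows roughly what such a monitor might compute; it is not the actual script. The function name, the `monitors.redis` metric prefix and the output format (Carbon plaintext lines) are assumptions, and `client` is assumed to be any object with an `llen(name)` method, such as a redis-py client.

```python
import time


def list_length_metrics(client, list_names, prefix="monitors.redis", now=None):
    """Build one Carbon plaintext line per Redis list, reporting its length."""
    if now is None:
        now = int(time.time())
    lines = []
    for name in list_names:
        # llen returns the number of items in the list (0 if it does not exist)
        length = client.llen(name)
        lines.append("%s.%s.length %d %d\n" % (prefix, name, length, now))
    return lines
```

Passing the client in as an argument keeps the function easy to exercise with a stub, which is convenient when the job runs from a scheduler such as Jenkins.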