superdocc

2018-11-02 00:16:44 +00:00 · 2018-11-02 00:16:44 +00:00 · b904fb795a
parent ea812eb9a0
commit b904fb795a
9 changed files with 269 additions and 205 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -2,3 +2,5 @@ target/
 **/*.rs.bk
 Cargo.lock
 src/prometheus_proto.rs
 .idea
 cmake-*
--- a/HANDBOOK.md
+++ b/HANDBOOK.md
@@ -0,0 +1,262 @@
 # The dipstick handbook
 This handbook's purpose is to get you started instrumenting your apps with Dipstick
 and give an idea of what's possible.
 # Background
 Dipstick was born of the desire to build a metrics library that would let users select from,
 switch between and combine multiple backends.
 Such a design has multiple benefits:
 - simplified instrumentation
 - flexible configuration
 - easier metrics testing
 Because Dipstick is written in Rust, performance, safety and ergonomics are also prime concerns.
 ## API Overview
 Dipstick's API is split between _input_ and _output_ layers.
 The input layer provides named metrics such as counters and timers to be used by the application.
 The output layer controls how metric values will be recorded and emitted by the configured backend(s).
 Input and output layers are decoupled, making code instrumentation independent of output configuration.
 Intermediates can also be added between input and output to provide extra features or different performance characteristics.
 Although this handbook covers input before output, implementation can certainly be performed the other way around.
 For more details, consult the [docs](https://docs.rs/dipstick/).
 ## Metrics Input
 A metrics library's first job is to help a program collect measurements about its operations.
 Dipstick provides a restricted but robust set of _four_ instrument types, taking a stance against 
 an application's functional code having to pick what statistics should be tracked for each defined metric.
 This helps to enforce contracts with downstream metrics systems and keeps code free of configuration elements.
 #### Counter
 Counts the number of elements processed, e.g. the number of bytes received.
 #### Marker 
 A monotonic counter, e.g. to record the processing of individual events.
 Default aggregated statistics for markers are not the same as those for counters.
 A value-less metric also makes for a safer API, preventing values other than 1 from being passed.
 #### Timer
 Measure an operation's duration.
 Usable through the `time!` macro, the closure form or explicit calls to `start()` and `stop()`.
 While a timer's internal precision is in nanoseconds, its accuracy depends on the platform's OS and hardware.
 A timer's default output unit is milliseconds, but it can be scaled up or down.
 ```rust,skt-run
 let app_metrics = metric_scope(to_stdout());
 let timer =  app_metrics.timer("my_timer");
 time!(timer, {/* slow code here */} );
 timer.time(|| {/* slow code here */} );
 let start = timer.start();
 /* slow code here */
 timer.stop(start);
 timer.interval_us(123_456);
 ```
 #### Gauge
 An instant observation of a resource's value.
 Observation of gauges is neither automatic nor tied to the output of metrics;
 it must be scheduled independently or called explicitly from the code.
 ### Names
 Each metric must be given a name upon creation.
 Names are opaque to the application and are used only to identify the metrics upon output.
 Names may be prepended with a namespace by each configured backend.
 Aggregated statistics may also append identifiers to the metric's name.
 Names should exclude characters that can interfere with namespaces, separators and output protocols.
 A good convention is to stick with lowercase alphanumeric identifiers of less than 12 characters.
 ```rust,skt-run
 let app_metrics = metric_scope(to_stdout());
 let db_metrics = app_metrics.add_prefix("database");
 let _db_timer = db_metrics.timer("db_timer");
 let _db_counter = db_metrics.counter("db_counter");
 ```
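The naming convention above can be checked mechanically. The following is a minimal sketch using only the standard library. `is_conventional_name` is a hypothetical helper and not part of Dipstick's API. Accepting underscores is an assumption, made because the handbook's own examples (`db_timer`, `counter_a`) use them.

```rust
/// Hypothetical helper (not part of Dipstick): returns true when `name`
/// follows the suggested convention, i.e. it is non-empty, shorter than
/// 12 characters, and made only of lowercase ASCII letters and digits.
/// Underscores are also accepted, which is an assumption of this sketch.
fn is_conventional_name(name: &str) -> bool {
    !name.is_empty()
        && name.len() < 12
        && name
            .chars()
            .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_')
}
```

Such a check could run in a unit test over all statically defined metric names, so that a name containing a separator character is caught before it reaches a backend.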
 ### Labels
 Some backends (such as Prometheus) allow "tagging" the metrics with labels to provide additional context,
 such as the URL or HTTP method requested from a web server.
 Dipstick offers the thread-local ThreadLabel and global AppLabel context maps to transparently carry 
 metadata to the backends configured to use it.
 Notes about labels:
 - Using labels may incur a significant runtime cost because 
  of the additional implicit parameter that has to be carried around. 
 - The runtime cost of labels may be even higher if async queuing is used,
  since the current context has to be persisted across threads.
 - While internally supported, single metric labels are not yet part of the input API. 
  If this is important to you, consider using dynamically defined metrics or open a GitHub issue!
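To illustrate the idea of thread-local and application-wide context maps, here is a minimal standard-library sketch. It does not use Dipstick's `ThreadLabel` or `AppLabel` types. All names below are hypothetical, and the lookup order (thread-local first, then global) is an assumption of this sketch.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

// Hypothetical stand-in for an application-wide label map.
static APP_LABELS: OnceLock<Mutex<HashMap<String, String>>> = OnceLock::new();

thread_local! {
    // Hypothetical stand-in for a per-thread label map.
    static THREAD_LABELS: RefCell<HashMap<String, String>> = RefCell::new(HashMap::new());
}

fn app_labels() -> &'static Mutex<HashMap<String, String>> {
    APP_LABELS.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Set a label that is visible from every thread.
fn set_app_label(key: &str, value: &str) {
    app_labels()
        .lock()
        .unwrap()
        .insert(key.to_string(), value.to_string());
}

/// Set a label that is visible only from the calling thread.
fn set_thread_label(key: &str, value: &str) {
    THREAD_LABELS.with(|labels| {
        labels.borrow_mut().insert(key.to_string(), value.to_string());
    });
}

/// Look a label up, preferring the thread-local value (an assumed precedence).
fn lookup_label(key: &str) -> Option<String> {
    THREAD_LABELS
        .with(|labels| labels.borrow().get(key).cloned())
        .or_else(|| app_labels().lock().unwrap().get(key).cloned())
}
```

The sketch also hints at where the runtime cost comes from: every lookup touches a thread-local and possibly a lock, and a thread-local map cannot follow a value that is handed to another thread unless it is copied along with it.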
 ### Static vs dynamic metrics
 Metric inputs are usually set up statically upon application startup.
 ```rust,skt-plain
 #[macro_use] 
 extern crate dipstick;
 use dipstick::*;
 metrics!("my_app" => {
    COUNTER_A: Counter = "counter_a";
 });
 fn main() {
    route_aggregate_metrics(to_stdout());
    COUNTER_A.count(11);
 }
 ```
 The static metric definition macro is just a `lazy_static!` wrapper.
 #### Dynamic metrics
 If necessary, metrics can also be defined "dynamically", with a possibly new name for every value. 
 This is more flexible but has a higher runtime cost, which may be alleviated with caching.
 ```rust,skt-run
 let user_name = "john_day";
 let app_metrics = to_log().with_cache(512);
 app_metrics.gauge(format!("gauge_for_user_{}", user_name)).value(44);
 ```
 ## Metrics Output
 A metrics library's second job is to help a program emit metric values that can be used in further systems.
 Dipstick provides an assortment of drivers for network or local metrics output.
 Multiple outputs can be used at a time, each with its own configuration. 
 ### Types
 These output types are provided. Some are extensible, and you may write your own if you need to.
 #### Stream
 Write values to any Write trait implementer, including files, stderr and stdout.
 #### Log
 Write values to the log using the log crate.
 #### Map
 Insert metric values in a map.  
 #### Statsd
 Send metrics to a remote host over UDP using the statsd format. 
 #### Graphite
 Send metrics to a remote host over TCP using the graphite format. 
 #### TODO Prometheus
 Send metrics to a remote host over TCP using the Prometheus JSON or ProtoBuf format.
 ### Attributes
 Attributes change an output's behavior.
 #### Prefixes
 Outputs can be given prefixes.
 Prefixes are prepended to the metric names emitted by this output.
 With network outputs, a typical use of prefixes is to identify the network host,
 environment and application that metrics originate from.
 #### Formatting
 Stream and Log outputs have configurable formatting that enables usage of custom templates.
 Other outputs, such as Graphite, have a fixed format because they're intended to be processed by a downstream system.
 #### Buffering
 Most outputs provide optional buffering, which can be used to optimize throughput at the expense of higher latency.
 If enabled, buffering is usually a best-effort affair, to safely limit the amount of memory that is used by the metrics.
 #### Sampling
 Some outputs such as statsd also have the ability to sample metrics.
 If enabled, sampling is done using pcg32, a fast random algorithm with reasonable entropy.
 ```rust,skt-fail
 let _app_metrics = to_statsd("server:8125")?.with_sampling_rate(0.01);
 ```
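For readers curious about how rate-based sampling with PCG32 can work, here is a minimal standard-library sketch. The `Pcg32` struct follows the published PCG32 (XSH-RR) algorithm. `Sampler`, its fields and the fixed stream constant are hypothetical and are not taken from Dipstick's source.

```rust
/// Minimal PCG32 (XSH-RR variant) generator, written out for illustration.
struct Pcg32 {
    state: u64,
    inc: u64,
}

impl Pcg32 {
    /// Seed the generator; `stream` selects one of the independent sequences.
    fn new(seed: u64, stream: u64) -> Self {
        let mut rng = Pcg32 { state: 0, inc: (stream << 1) | 1 };
        rng.next_u32();
        rng.state = rng.state.wrapping_add(seed);
        rng.next_u32();
        rng
    }

    /// Advance the state and return the next 32-bit output.
    fn next_u32(&mut self) -> u32 {
        let old = self.state;
        self.state = old
            .wrapping_mul(6_364_136_223_846_793_005)
            .wrapping_add(self.inc);
        let xorshifted = (((old >> 18) ^ old) >> 27) as u32;
        let rot = (old >> 59) as u32;
        xorshifted.rotate_right(rot)
    }
}

/// Hypothetical sampler: keeps roughly `rate` of the values offered to it.
struct Sampler {
    rng: Pcg32,
    threshold: u64,
}

impl Sampler {
    fn new(rate: f64, seed: u64) -> Self {
        let rate = rate.clamp(0.0, 1.0);
        // Scale the rate to the 2^32 possible generator outputs.
        let threshold = (rate * 4_294_967_296.0) as u64;
        Sampler { rng: Pcg32::new(seed, 54), threshold }
    }

    /// Returns true when the current value should be sent.
    fn should_sample(&mut self) -> bool {
        u64::from(self.rng.next_u32()) < self.threshold
    }
}
```

With a rate of 0.01, about one value in a hundred passes the check. A statsd-style backend can then include the rate in the emitted line so that the receiving side can scale counts back up.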
 ## Intermediates
 ### Proxy
 Because the input's actual _implementation_ depends on the output configuration,
 it is necessary to create an output channel before defining any metrics.
 This is often not possible because metrics configuration could be dynamic (e.g. loaded from a file),
 which might happen after the static initialization phase in which metrics are defined.
 To get around this catch-22, Dipstick provides a Proxy which acts as an intermediate output,
 allowing redirection to the effective output after it has been set up.
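The redirection idea can be sketched with a shared, swappable target. This is not Dipstick's Proxy implementation. `Output`, `MemoryOutput` and this `Proxy` struct are hypothetical, and dropping values while no target is set is a choice made only for this sketch.

```rust
use std::sync::{Arc, Mutex, RwLock};

/// Hypothetical output interface: anything that can record a named value.
trait Output: Send + Sync {
    fn write(&self, name: &str, value: i64);
}

/// Output that collects lines in memory so the example can be inspected.
#[derive(Default)]
struct MemoryOutput {
    lines: Mutex<Vec<String>>,
}

impl Output for MemoryOutput {
    fn write(&self, name: &str, value: i64) {
        self.lines.lock().unwrap().push(format!("{} {}", name, value));
    }
}

/// Proxy that can be handed out before the real output exists.
#[derive(Clone, Default)]
struct Proxy {
    target: Arc<RwLock<Option<Arc<dyn Output>>>>,
}

impl Proxy {
    /// Redirect this proxy, and all of its clones, to `output`.
    fn set_target(&self, output: Arc<dyn Output>) {
        *self.target.write().unwrap() = Some(output);
    }

    /// Forward a value to the current target.
    /// In this sketch, values are silently dropped while no target is set.
    fn write(&self, name: &str, value: i64) {
        if let Some(output) = self.target.read().unwrap().as_ref() {
            output.write(name, value);
        }
    }
}
```

Because every clone shares the same `RwLock`, metrics defined during static initialization can hold a clone, and a later call to `set_target` (for example after reading a configuration file) takes effect for all of them.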
 ### Bucket
 Another intermediate output is the Bucket, which can be used to aggregate metric values. 
 Bucket-aggregated values can be used to infer statistics, which are then flushed out.
 Bucket aggregation is performed locklessly and is very fast.
 Count, Sum, Min, Max and Mean are tracked where they make sense, depending on the metric type.
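As an illustration of lock-free aggregation, the sketch below tracks count, sum, min and max with standard-library atomics and derives the mean on read. It is not Dipstick's Bucket code. `AtomicScores` and `Stats` are hypothetical names.

```rust
use std::sync::atomic::{AtomicI64, AtomicU64, Ordering};

/// Hypothetical lock-free score keeper for a single metric.
struct AtomicScores {
    count: AtomicU64,
    sum: AtomicI64,
    min: AtomicI64,
    max: AtomicI64,
}

/// Statistics read back from the scores.
struct Stats {
    count: u64,
    sum: i64,
    min: i64,
    max: i64,
    mean: f64,
}

impl AtomicScores {
    fn new() -> Self {
        AtomicScores {
            count: AtomicU64::new(0),
            sum: AtomicI64::new(0),
            min: AtomicI64::new(i64::MAX),
            max: AtomicI64::new(i64::MIN),
        }
    }

    /// Record one value using only atomic read-modify-write operations.
    fn update(&self, value: i64) {
        self.count.fetch_add(1, Ordering::Relaxed);
        self.sum.fetch_add(value, Ordering::Relaxed);
        self.min.fetch_min(value, Ordering::Relaxed);
        self.max.fetch_max(value, Ordering::Relaxed);
    }

    /// Read the statistics; returns None when nothing was recorded.
    /// The fields are read one by one, so a snapshot taken while writers
    /// are still active may mix values from slightly different moments.
    fn snapshot(&self) -> Option<Stats> {
        let count = self.count.load(Ordering::Relaxed);
        if count == 0 {
            return None;
        }
        let sum = self.sum.load(Ordering::Relaxed);
        Some(Stats {
            count,
            sum,
            min: self.min.load(Ordering::Relaxed),
            max: self.max.load(Ordering::Relaxed),
            mean: sum as f64 / count as f64,
        })
    }
}
```

No thread ever waits on another in `update`, which is what makes this style of aggregation cheap on the recording path.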
 #### Preset bucket statistics
 Published statistics can be selected with presets such as `all_stats`,
 `summary` and `average`.
 #### Custom bucket statistics
 For more control over published statistics, provide your own strategy:
 ```rust,skt-run
 metrics(aggregate());
 set_default_aggregate_fn(|_kind, name, score|
    match score {
        ScoreType::Count(count) => 
            Some((Kind::Counter, vec![name, ".per_thousand"], count / 1000)),
        _ => None
    });
 ```
 #### Scheduled publication
 Aggregate metrics and schedule their periodic publication in the background:
 ```rust,skt-run
 use std::time::Duration;
 let app_metrics = metric_scope(aggregate());
 route_aggregate_metrics(to_stdout());
 app_metrics.flush_every(Duration::from_secs(3));
 ```
 ### Multi
 Like Constructicons, multiple metrics outputs can assemble, creating a unified facade that transparently dispatches 
 input metrics to each constituent output. 
 ```rust,skt-fail,no_run
 let _app_metrics = metric_scope((
        to_stdout(), 
        to_statsd("localhost:8125")?.with_namespace(&["my", "app"])
    ));
 ```
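The dispatching facade can be pictured as an output that owns other outputs. The sketch below uses a `Vec` of boxed trait objects rather than the tuple form shown above. `Output`, `TaggedOutput` and this `Multi` struct are hypothetical and only illustrate the fan-out.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical output interface, standing in for a real backend.
trait Output {
    fn write(&self, name: &str, value: i64);
}

/// Output that prefixes each line with a tag and stores it in a shared sink.
struct TaggedOutput {
    tag: &'static str,
    sink: Arc<Mutex<Vec<String>>>,
}

impl Output for TaggedOutput {
    fn write(&self, name: &str, value: i64) {
        let line = format!("{}: {} {}", self.tag, name, value);
        self.sink.lock().unwrap().push(line);
    }
}

/// Facade that forwards every value to each of its constituent outputs.
struct Multi {
    outputs: Vec<Box<dyn Output>>,
}

impl Output for Multi {
    fn write(&self, name: &str, value: i64) {
        // Dispatch in declaration order.
        for output in &self.outputs {
            output.write(name, value);
        }
    }
}
```

Since `Multi` implements the same trait as its parts, calling code cannot tell whether it is writing to one backend or to several.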
 ### Queue
 Metrics can be recorded asynchronously:
 ```rust,skt-run
 let _app_metrics = metric_scope(to_stdout().queue(64));
 ```
 The async queue uses a Rust channel and a standalone thread.
 If the queue ever fills up under heavy load, the behavior reverts to blocking (rather than dropping metrics).
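The blocking behavior described above matches what a bounded standard-library channel does. The following sketch shows the pattern in isolation. `spawn_queue` is a hypothetical helper and not Dipstick's queue implementation.

```rust
use std::sync::mpsc::{sync_channel, SyncSender};
use std::thread::{self, JoinHandle};

/// Hypothetical helper: spawns a consumer thread and returns a bounded
/// sender together with the handle of that thread.
fn spawn_queue(capacity: usize) -> (SyncSender<(String, i64)>, JoinHandle<Vec<String>>) {
    let (sender, receiver) = sync_channel::<(String, i64)>(capacity);
    let worker = thread::spawn(move || {
        let mut written = Vec::new();
        // The loop ends once every sender has been dropped.
        for (name, value) in receiver {
            // A real output would write to a socket or a file here.
            written.push(format!("{} {}", name, value));
        }
        written
    });
    (sender, worker)
}
```

`SyncSender::send` blocks the caller while the channel holds `capacity` pending items, so no value is lost under load. A design that prefers dropping values over blocking could use `try_send` instead and ignore the `Full` error.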
 ## Facilities
--- a/README.md
+++ b/README.md
@@ -11,7 +11,7 @@ minimal impact on applications and a choice of output to downstream systems.
 Dipstick is a toolkit to help all sorts of application collect and send out metrics.
 As such, it needs a bit of set up to suit one's needs.
-Skimming through the [handbook](https://github.com/fralalonde/dipstick/tree/master/handbook)
+Skimming through the [handbook](https://github.com/fralalonde/dipstick/tree/master/HANDBOOK.md)
 should help you get an idea of the possible configurations.
 In short, dipstick-enabled apps _can_:
@@ -31,7 +31,8 @@ For convenience, dipstick builds on stable Rust with minimal, feature-gated depe
 ### Non-goals
-For performance reasons, dipstick will not
+Dipstick's focus is on metrics collection (input) and forwarding (output).
+Although it will happily track aggregated statistics, for the sake of simplicity and performance Dipstick will not
 - plot graphs
 - send alerts
 - track histograms
@@ -77,4 +78,3 @@ dipstick = "0.7.0"
 ## License
 Dipstick is licensed under the terms of the Apache 2.0 and MIT license.
--- a/handbook/01_basics.md
+++ b/handbook/01_basics.md
@@ -1,36 +0,0 @@
 # The dipstick handbook
 This handbook's purpose is to get you started instrumenting your apps with dipstick
 and give an idea of what's possible.
 For more details, consult the [docs](https://docs.rs/dipstick/).
 ## Overview
 To achieve it's flexibility, Dipstick decouples the metrics _inputs_ from the metric _outputs_.
 For example, incrementing a counter in the application may not result in immediate output to a file or to the network.
 Conversely, it is also possible that an app will output metrics data even though no values were recorded.
 While this makes things generally simpler, it requires the programmer to decide beforehand how metrics will be handled.
 ## Static metrics
 For speed and easier maintenance, metrics are usually defined statically:
 ```rust,skt-plain
 #[macro_use] 
 extern crate dipstick;
 use dipstick::*;
 metrics!("my_app" => {
    COUNTER_A: Counter = "counter_a";
 });
 fn main() {
    route_aggregate_metrics(to_stdout());
    COUNTER_A.count(11);
 }
 ```
 (Metric definition macros are just `lazy_static!` wrappers.)
--- a/handbook/02_inputs.md
+++ b/handbook/02_inputs.md
@@ -1,78 +0,0 @@
 # Input
 Metrics input are the measurement instruments that are called from application code.
 The inputs are high-level components that are assumed to be callable
 from all contexts, regardless of threading, security, etc.
 Each metric input has a name and a kind.
 A metric's name is a short alphanumeric identifier.
 A metric's kind can be one of four kinds:
 - Counter
 - Marker
 - Timer
 - Gauge
 The actual flow of measured values varies depending on how the metrics backend has been configured.
 Skip to the output section for more details on backend configuration.
 ## Counters and Markers
 ## Timers
 ## Gauges
 ## namespace
 Related metrics can share a namespace:
 ```rust,skt-run
 let app_metrics = metric_scope(to_stdout());
 let db_metrics = app_metrics.add_prefix("database");
 let _db_timer = db_metrics.timer("db_timer");
 let _db_counter = db_metrics.counter("db_counter");
 ```
 ## proxy
 ## counter
 ## marker
 ## timer
 Timers can be used multiple ways:
 ```rust,skt-run
 let app_metrics = metric_scope(to_stdout());
 let timer =  app_metrics.timer("my_timer");
 time!(timer, {/* slow code here */} );
 timer.time(|| {/* slow code here */} );
 let start = timer.start();
 /* slow code here */
 timer.stop(start);
 timer.interval_us(123_456);
 ```
 ## gauge
 ## ad-hoc metrics
 Where necessary, metrics can also be defined _ad-hoc_ (or "inline"):
 ```rust,skt-run
 let user_name = "john_day";
 let app_metrics = metric_scope(to_log()).with_cache(512);
 app_metrics.gauge(format!("gauge_for_user_{}", user_name)).value(44);
 ```
 ## ad-hoc metrics cache 
 Defining a cache is optional but will speed up re-definition of common ad-hoc metrics.
 ## local vs global scopes
--- a/handbook/03_outputs.md
+++ b/handbook/03_outputs.md
@@ -1,38 +0,0 @@
 # outputs
 ## statsd
 ## graphite
 ## text
 ## logging
 ## prometheus
 ## combination
 Send metrics to multiple outputs:
 ```rust,skt-fail,no_run
 let _app_metrics = metric_scope((
        to_stdout(), 
        to_statsd("localhost:8125")?.with_namespace(&["my", "app"])
    ));
 ```
 ## buffering
 ## sampling
 Apply statistical sampling to metrics:
 ```rust,skt-fail
 let _app_metrics = to_statsd("server:8125")?.with_sampling_rate(0.01);
 ```
 A fast random algorithm (PCG32) is used to pick samples.
 Outputs can use sample rate to expand or format published data.
--- a/handbook/04_aggregation.md
+++ b/handbook/04_aggregation.md
@@ -1,36 +0,0 @@
 # aggregation
 ## bucket
 Aggregation is performed locklessly and is very fast.
 Count, sum, min, max and average are tracked where they make sense.
 ## schedule
 Aggregate metrics and schedule to be periodical publication in the background:
 ```rust,skt-run
 use std::time::Duration;
 let app_metrics = metric_scope(aggregate());
 route_aggregate_metrics(to_stdout());
 app_metrics.flush_every(Duration::from_secs(3));
 ```
 ## preset statistics
 Published statistics can be selected with presets such as `all_stats` (see previous example),
 `summary`, `average`.
 ## custom statistics
 For more control over published statistics, provide your own strategy:
 ```rust,skt-run
 metrics(aggregate());
 set_default_aggregate_fn(|_kind, name, score|
    match score {
        ScoreType::Count(count) => 
            Some((Kind::Counter, vec![name, ".per_thousand"], count / 1000)),
        _ => None
    });
 ```
--- a/handbook/05_concurrency.md
+++ b/handbook/05_concurrency.md
@@ -1,12 +0,0 @@
 # concurrency concerns
 ## locking
 ## queueing
 Metrics can be recorded asynchronously:
 ```rust,skt-run
 let _app_metrics = metric_scope(to_stdout().with_async_queue(64));
 ```
 The async queue uses a Rust channel and a standalone thread.
 The current behavior is to block when full.
--- a/src/output/format.rs
+++ b/src/output/format.rs
@@ -15,7 +15,7 @@ pub enum LineOp {
    /// Print metric value as text.
    ValueAsText,
    /// Print metric value, divided by the given scale, as text.
-    ScaledValueAsText(MetricValue),
+    ScaledValueAsText(f64),
     /// Print the newline character.
    NewLine,
 }
@@ -51,7 +51,7 @@ impl LineTemplate {
                Literal(src) => output.write_all(src.as_ref())?,
                ValueAsText => output.write_all(format!("{}", value).as_ref())?,
                ScaledValueAsText(scale) => {
-                    let scaled = value / scale;
+                    let scaled = value as f64 / scale;
                    output.write_all(format!("{}", scaled).as_ref())?
                },
                NewLine => writeln!(output)?,