superdocc

This commit is contained in:
Francis Lalonde 2018-11-02 00:16:44 +00:00
parent ea812eb9a0
commit b904fb795a
9 changed files with 269 additions and 205 deletions

.gitignore vendored

@ -2,3 +2,5 @@ target/
**/*.rs.bk
Cargo.lock
src/prometheus_proto.rs
.idea
cmake-*

HANDBOOK.md Executable file

@ -0,0 +1,262 @@
# The dipstick handbook
This handbook's purpose is to get you started instrumenting your apps with Dipstick
and to give you an idea of what's possible.
# Background
Dipstick was born of the desire to build a metrics library that allows selecting from,
switching between and combining multiple backends.
Such a design has multiple benefits:
- simplified instrumentation
- flexible configuration
- easier metrics testing
Because of its Rust nature, performance, safety and ergonomics are also prime concerns.
## API Overview
Dipstick's API is split between _input_ and _output_ layers.
The input layer provides named metrics such as counters and timers to be used by the application.
The output layer controls how metric values will be recorded and emitted by the configured backend(s).
Input and output layers are decoupled, making code instrumentation independent of output configuration.
Intermediates can also be added between input and output for features or performance characteristics.
Although this handbook covers input before output, implementation may certainly proceed the other way around.
For more details, consult the [docs](https://docs.rs/dipstick/).
## Metrics Input
A metrics library's first job is to help a program collect measurements about its operations.
Dipstick provides a restricted but robust set of _four_ instrument types, taking a stance against
having an application's functional code pick which statistics should be tracked for each defined metric.
This helps to enforce contracts with downstream metrics systems and keeps code free of configuration elements.
#### Counter
Counts the number of elements processed, e.g. the number of bytes received.
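In the style of the other examples in this handbook (and assuming the same `metric_scope` setup), a counter might be used like this; the metric name is illustrative:

```rust,skt-run
let app_metrics = metric_scope(to_stdout());
let counter = app_metrics.counter("bytes_received");
// record how many bytes were handled by this pass
counter.count(2048);
```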
#### Marker
A monotonic counter, e.g. to record the processing of individual events.
Default aggregated statistics for markers are not the same as those for counters.
Being value-less also makes for a safer API, preventing values other than 1 from being recorded.
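A sketch of marker usage, assuming a `marker` constructor and `mark()` method analogous to the other instruments shown in this handbook:

```rust,skt-run
let app_metrics = metric_scope(to_stdout());
let marker = app_metrics.marker("requests_handled");
// each call records a single occurrence; no value can be passed
marker.mark();
```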
#### Timer
Measure an operation's duration.
Usable either through the `time!` macro, the closure form or explicit calls to `start()` and `stop()`.
While timers' internal precision is in nanoseconds, their accuracy depends on the platform's OS and hardware.
Timers' default output format is milliseconds, but this can be scaled up or down.
```rust,skt-run
let app_metrics = metric_scope(to_stdout());
let timer = app_metrics.timer("my_timer");
time!(timer, {/* slow code here */} );
timer.time(|| {/* slow code here */} );
let start = timer.start();
/* slow code here */
timer.stop(start);
timer.interval_us(123_456);
```
#### Gauge
An instant observation of a resource's value.
Observation of gauges is neither automatic nor tied to the output of metrics;
it must be scheduled independently or triggered explicitly from the code.
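Following the pattern of the other examples, an explicit gauge observation might look like this; the name and value are illustrative:

```rust,skt-run
let app_metrics = metric_scope(to_stdout());
let gauge = app_metrics.gauge("queue_depth");
// observations are explicit, e.g. on a schedule or at a checkpoint
gauge.value(42);
```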
### Names
Each metric must be given a name upon creation.
Names are opaque to the application and are used only to identify the metrics upon output.
Names may be prepended with a namespace by each configured backend.
Aggregated statistics may also append identifiers to the metric's name.
Names should exclude characters that can interfere with namespaces, separators and output protocols.
A good convention is to stick with lowercase alphanumeric identifiers of less than 12 characters.
```rust,skt-run
let app_metrics = metric_scope(to_stdout());
let db_metrics = app_metrics.add_prefix("database");
let _db_timer = db_metrics.timer("db_timer");
let _db_counter = db_metrics.counter("db_counter");
```
### Labels
Some backends (such as Prometheus) allow "tagging" the metrics with labels to provide additional context,
such as the URL or HTTP method requested from a web server.
Dipstick offers the thread-local `ThreadLabel` and global `AppLabel` context maps to transparently carry
metadata to the backends configured to use it.
Notes about labels:
- Using labels may incur a significant runtime cost because
of the additional implicit parameter that has to be carried around.
- Labels runtime costs may be even higher if async queuing is used
since current context has to be persisted across threads.
- While internally supported, single metric labels are not yet part of the input API.
If this is important to you, consider using dynamically defined metrics or open a GitHub issue!
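A minimal sketch of the context maps described above; the `set` entry points and key/value names are assumptions, not confirmed API:

```rust,skt-run
// set an app-wide label, visible to all threads
AppLabel::set("environment", "staging");
// set a label only for the current thread
ThreadLabel::set("http_method", "GET");
// metrics recorded from this point may carry these labels,
// if the configured backend supports them
```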
### Static vs dynamic metrics
Metric inputs are usually set up statically upon application startup.
```rust,skt-plain
#[macro_use]
extern crate dipstick;
use dipstick::*;
metrics!("my_app" => {
COUNTER_A: Counter = "counter_a";
});
fn main() {
route_aggregate_metrics(to_stdout());
COUNTER_A.count(11);
}
```
The static metric definition macro is just a `lazy_static!` wrapper.
### Dynamic metrics
If necessary, metrics can also be defined "dynamically", with a possibly new name for every value.
This is more flexible but has a higher runtime cost, which may be alleviated with caching.
```rust,skt-run
let user_name = "john_day";
let app_metrics = to_log().with_cache(512);
app_metrics.gauge(format!("gauge_for_user_{}", user_name)).value(44);
```
## Metrics Output
A metrics library's second job is to help a program emit metric values that can be used in further systems.
Dipstick provides an assortment of drivers for network or local metrics output.
Multiple outputs can be used at a time, each with its own configuration.
### Types
The following output types are provided; some are extensible, and you may write your own if you need to.
#### Stream
Write values to any Write trait implementer, including files, stderr and stdout.
#### Log
Write values to the log using the log crate.
#### Map
Insert metric values in a map.
#### Statsd
Send metrics to a remote host over UDP using the statsd format.
#### Graphite
Send metrics to a remote host over TCP using the graphite format.
#### TODO Prometheus
Send metrics to a remote host over TCP using the Prometheus JSON or ProtoBuf format.
### Attributes
Attributes change an output's behavior.
#### Prefixes
Outputs can be given prefixes.
Prefixes are prepended to the metric names emitted by the output.
With network outputs, a typical use of prefixes is to identify the network host,
environment and application that metrics originate from.
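Reusing the `add_prefix` call shown earlier, host and application prefixes might be applied like this; the names are illustrative:

```rust,skt-run
let app_metrics = metric_scope(to_stdout());
// prepend host and application identifiers to every emitted metric name
let host_metrics = app_metrics.add_prefix("host_1").add_prefix("my_app");
let _counter = host_metrics.counter("requests");
```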
#### Formatting
Stream and Log outputs have configurable formatting that enables usage of custom templates.
Other outputs, such as Graphite, have a fixed format because they're intended to be processed by a downstream system.
#### Buffering
Most outputs provide optional buffering, which can be used to optimize throughput at the expense of higher latency.
If enabled, buffering is usually a best-effort affair, to safely limit the amount of memory used by the metrics.
#### Sampling
Some outputs such as statsd also have the ability to sample metrics.
If enabled, sampling is done using pcg32, a fast random algorithm with reasonable entropy.
```rust,skt-fail
let _app_metrics = to_statsd("server:8125")?.with_sampling_rate(0.01);
```
## Intermediates
### Proxy
Because the input's actual _implementation_ depends on the output configuration,
it is necessary to create an output channel before defining any metrics.
This is often not possible because metrics configuration could be dynamic (e.g. loaded from a file),
which might happen after the static initialization phase in which metrics are defined.
To get around this catch-22, Dipstick provides a Proxy which acts as intermediate output,
allowing redirection to the effective output after it has been set up.
### Bucket
Another intermediate output is the Bucket, which can be used to aggregate metric values.
Bucket-aggregated values can be used to infer statistics which will be flushed out to the configured output.
Bucket aggregation is performed locklessly and is very fast.
Count, Sum, Min, Max and Mean are tracked where they make sense, depending on the metric type.
#### Preset bucket statistics
Published statistics can be selected with presets such as `all_stats`,
`summary` or `average`.
#### Custom bucket statistics
For more control over published statistics, provide your own strategy:
```rust,skt-run
metrics(aggregate());
set_default_aggregate_fn(|_kind, name, score|
match score {
ScoreType::Count(count) =>
Some((Kind::Counter, vec![name, ".per_thousand"], count / 1000)),
_ => None
});
```
#### Scheduled publication
Aggregate metrics and schedule periodic publication in the background:
```rust,skt-run
use std::time::Duration;
let app_metrics = metric_scope(aggregate());
route_aggregate_metrics(to_stdout());
app_metrics.flush_every(Duration::from_secs(3));
```
### Multi
Like Constructicons, multiple metrics outputs can assemble into a unified facade that transparently dispatches
input metrics to each constituent output.
```rust,skt-fail,no_run
let _app_metrics = metric_scope((
to_stdout(),
to_statsd("localhost:8125")?.with_namespace(&["my", "app"])
));
```
### Queue
Metrics can be recorded asynchronously:
```rust,skt-run
let _app_metrics = metric_scope(to_stdout().queue(64));
```
The async queue uses a Rust channel and a standalone thread.
If the queue ever fills up under heavy load, the behavior reverts to blocking (rather than dropping metrics).
## Facilities


@ -11,7 +11,7 @@ minimal impact on applications and a choice of output to downstream systems.
Dipstick is a toolkit to help all sorts of application collect and send out metrics.
As such, it needs a bit of set up to suit one's needs.
Skimming through the [handbook](https://github.com/fralalonde/dipstick/tree/master/handbook)
Skimming through the [handbook](https://github.com/fralalonde/dipstick/tree/master/HANDBOOK.md)
should help you get an idea of the possible configurations.
In short, dipstick-enabled apps _can_:
@ -31,7 +31,8 @@ For convenience, dipstick builds on stable Rust with minimal, feature-gated depe
### Non-goals
For performance reasons, dipstick will not
Dipstick's focus is on metrics collection (input) and forwarding (output).
Although it will happily track aggregated statistics, for the sake of simplicity and performance Dipstick will not
- plot graphs
- send alerts
- track histograms
@ -77,4 +78,3 @@ dipstick = "0.7.0"
## License
Dipstick is licensed under the terms of the Apache 2.0 and MIT license.


@ -1,36 +0,0 @@
# The dipstick handbook
This handbook's purpose is to get you started instrumenting your apps with dipstick
and give an idea of what's possible.
For more details, consult the [docs](https://docs.rs/dipstick/).
## Overview
To achieve it's flexibility, Dipstick decouples the metrics _inputs_ from the metric _outputs_.
For example, incrementing a counter in the application may not result in immediate output to a file or to the network.
Conversely, it is also possible that an app will output metrics data even though no values were recorded.
While this makes things generally simpler, it requires the programmer to decide beforehand how metrics will be handled.
## Static metrics
For speed and easier maintenance, metrics are usually defined statically:
```rust,skt-plain
#[macro_use]
extern crate dipstick;
use dipstick::*;
metrics!("my_app" => {
COUNTER_A: Counter = "counter_a";
});
fn main() {
route_aggregate_metrics(to_stdout());
COUNTER_A.count(11);
}
```
(Metric definition macros are just `lazy_static!` wrappers.)


@ -1,78 +0,0 @@
# Input
Metrics input are the measurement instruments that are called from application code.
The inputs are high-level components that are assumed to be callable
from all contexts, regardless of threading, security, etc.
Each metric input has a name and a kind.
A metric's name is a short alphanumeric identifier.
A metric's kind can be one of four kinds:
- Counter
- Marker
- Timer
- Gauge
The actual flow of measured values varies depending on how the metrics backend has been configured.
Skip to the output section for more details on backend configuration.
## Counters and Markers
## Timers
## Gauges
## namespace
Related metrics can share a namespace:
```rust,skt-run
let app_metrics = metric_scope(to_stdout());
let db_metrics = app_metrics.add_prefix("database");
let _db_timer = db_metrics.timer("db_timer");
let _db_counter = db_metrics.counter("db_counter");
```
## proxy
## counter
## marker
## timer
Timers can be used multiple ways:
```rust,skt-run
let app_metrics = metric_scope(to_stdout());
let timer = app_metrics.timer("my_timer");
time!(timer, {/* slow code here */} );
timer.time(|| {/* slow code here */} );
let start = timer.start();
/* slow code here */
timer.stop(start);
timer.interval_us(123_456);
```
## gauge
## ad-hoc metrics
Where necessary, metrics can also be defined _ad-hoc_ (or "inline"):
```rust,skt-run
let user_name = "john_day";
let app_metrics = metric_scope(to_log()).with_cache(512);
app_metrics.gauge(format!("gauge_for_user_{}", user_name)).value(44);
```
## ad-hoc metrics cache
Defining a cache is optional but will speed up re-definition of common ad-hoc metrics.
## local vs global scopes


@ -1,38 +0,0 @@
# outputs
## statsd
## graphite
## text
## logging
## prometheus
## combination
Send metrics to multiple outputs:
```rust,skt-fail,no_run
let _app_metrics = metric_scope((
to_stdout(),
to_statsd("localhost:8125")?.with_namespace(&["my", "app"])
));
```
## buffering
## sampling
Apply statistical sampling to metrics:
```rust,skt-fail
let _app_metrics = to_statsd("server:8125")?.with_sampling_rate(0.01);
```
A fast random algorithm (PCG32) is used to pick samples.
Outputs can use sample rate to expand or format published data.


@ -1,36 +0,0 @@
# aggregation
## bucket
Aggregation is performed locklessly and is very fast.
Count, sum, min, max and average are tracked where they make sense.
## schedule
Aggregate metrics and schedule to be periodical publication in the background:
```rust,skt-run
use std::time::Duration;
let app_metrics = metric_scope(aggregate());
route_aggregate_metrics(to_stdout());
app_metrics.flush_every(Duration::from_secs(3));
```
## preset statistics
Published statistics can be selected with presets such as `all_stats` (see previous example),
`summary`, `average`.
## custom statistics
For more control over published statistics, provide your own strategy:
```rust,skt-run
metrics(aggregate());
set_default_aggregate_fn(|_kind, name, score|
match score {
ScoreType::Count(count) =>
Some((Kind::Counter, vec![name, ".per_thousand"], count / 1000)),
_ => None
});
```


@ -1,12 +0,0 @@
# concurrency concerns
## locking
## queueing
Metrics can be recorded asynchronously:
```rust,skt-run
let _app_metrics = metric_scope(to_stdout().with_async_queue(64));
```
The async queue uses a Rust channel and a standalone thread.
The current behavior is to block when full.


@ -15,7 +15,7 @@ pub enum LineOp {
/// Print metric value as text.
ValueAsText,
/// Print metric value, divided by the given scale, as text.
ScaledValueAsText(MetricValue),
ScaledValueAsText(f64),
/// Print the newline character.
NewLine,
}
@ -51,7 +51,7 @@ impl LineTemplate {
Literal(src) => output.write_all(src.as_ref())?,
ValueAsText => output.write_all(format!("{}", value).as_ref())?,
ScaledValueAsText(scale) => {
let scaled = value / scale;
let scaled = value as f64 / scale;
output.write_all(format!("{}", scaled).as_ref())?
},
NewLine => writeln!(output)?,