802 lines
30 KiB
Elixir
802 lines
30 KiB
Elixir
|
defmodule Telemetry.Metrics do
|
||
|
@moduledoc """
|
||
|
Common interface for defining metrics based on
|
||
|
[`:telemetry`](https://github.com/beam-telemetry/telemetry) events.
|
||
|
|
||
|
Metrics are aggregations of Telemetry events with specific names, providing
|
||
|
a view of the system's behaviour over time.
|
||
|
|
||
|
To give a more concrete example, imagine that somewhere in your code there is
|
||
|
a function which sends an HTTP request, measures the time it took to get a
|
||
|
response, and emits an event with the information:
|
||
|
|
||
|
:telemetry.execute([:http, :request, :stop], %{duration: duration})
|
||
|
|
||
|
You could define a counter metric, which counts how many HTTP requests were
|
||
|
completed:
|
||
|
|
||
|
Telemetry.Metrics.counter("http.request.stop.duration")
|
||
|
|
||
|
or you could use a summary metric to see statistics about the request duration:
|
||
|
|
||
|
Telemetry.Metrics.summary("http.request.stop.duration")
|
||
|
|
||
|
This documentation is going to cover all the available metrics and how to use
|
||
|
them, as well as options, and how to integrate those metrics with reporters.
|
||
|
|
||
|
## Metrics
|
||
|
|
||
|
There are five metric types provided by `Telemetry.Metrics`:
|
||
|
|
||
|
* `counter/2` which counts the total number of emitted events
|
||
|
* `sum/2` which keeps track of the sum of selected measurement
|
||
|
* `last_value/2` holding the value of the selected measurement from
|
||
|
the most recent event
|
||
|
* `summary/2` calculating statistics of the selected measurement,
|
||
|
like maximum, mean, percentiles etc.
|
||
|
* `distribution/2` which builds a histogram of selected measurement
|
||
|
|
||
|
The first argument to all metric functions is the metric name. Metric
|
||
|
name can be provided as a string (e.g. `"http.request.stop.duration"`) or a
|
||
|
list of atoms (`[:http, :request, :stop, :duration]`). The metric name is
|
||
|
automatically used to infer the telemetry event and measurement. For example,
|
||
|
In the `"http.request.stop.duration"` example, the source event name is
|
||
|
`[:http, :request, :stop]` and metric values are drawn from `:duration`
|
||
|
measurement. Like this:
|
||
|
|
||
|
[:http , :request, :stop] :duration
|
||
|
<----- event name ------> <-- measurement -->
|
||
|
|
||
|
You can also explicitly specify the event name and measurement
|
||
|
if you prefer.
|
||
|
|
||
|
The second argument is a list of options. Below is the description of the
|
||
|
options common to all metric types:
|
||
|
|
||
|
* `:event_name` - the source event name. Can be represented either as a
|
||
|
string (e.g. `"http.request"`) or a list of atoms (`[:http, :request]`).
|
||
|
By default the event name is all but the last segment of the metric name.
|
||
|
* `:measurement` - the event measurement used as a source of a metric values.
|
||
|
By default it is the last segment of the metric name. It can be:
|
||
|
* an arbitrary term
|
||
|
* a key in the event's measurements map
|
||
|
* an unary function accepting the whole measurements map and returning the
|
||
|
actual value to be used.
|
||
|
* a binary function accepting the measurements and metadata maps and
|
||
|
returning the actual value to be used.
|
||
|
* `:tags` - a subset of metadata keys by which aggregations will be broken down.
|
||
|
Defaults to an empty list.
|
||
|
* `:tag_values` - a function that receives the metadata and returns a map with
|
||
|
the tags as keys and their respective values. Defaults to returning the
|
||
|
metadata itself.
|
||
|
* `:keep` - a predicate function that evaluates the metadata to conditionally
|
||
|
record a given event. `:keep` and `:drop` cannot be combined. Defaults to `nil`.
|
||
|
* `:drop` - a predicate function that evaluates the metadata to conditionally
|
||
|
skip recording a given event. `:keep` and `:drop` cannot be combined. Defaults to `nil`.
|
||
|
* `:description` - human-readable description of the metric. Might be used by
|
||
|
reporters for documentation purposes. Defaults to `nil`.
|
||
|
* `:unit` - an atom describing the unit of selected measurement, typically in
|
||
|
singular, such as `:millisecond`, `:byte`, `:kilobyte`, etc. It may also be
|
||
|
a tuple indicating that a measurement should be converted from one unit to
|
||
|
another before a metric is updated. Currently, only time and byte unit
|
||
|
conversions are supported. We discuss those in detail in the "Converting Units"
|
||
|
section.
|
||
|
* `:reporter_options` - a keyword list of reporter-specific options for the metric.
|
||
|
|
||
|
## Breaking down metric values by tags
|
||
|
|
||
|
Sometimes it's not enough to have a global overview of all HTTP requests received
|
||
|
or all DB queries made. It's often more helpful to break down this data, for example,
|
||
|
we might want to have separate metric values for each unique database table and
|
||
|
operation name (`select`, `insert` etc.) to see how these particular queries behave.
|
||
|
|
||
|
This is where tagging comes into play. All metric definitions accept a `:tags` option:
|
||
|
|
||
|
counter("db.query.duration", tags: [:table, :operation])
|
||
|
|
||
|
The above definition means that we want to keep track of the number of queries, but
|
||
|
we want a separate counter for each unique pair of table and operation. Tag values are
|
||
|
fetched from event metadata - this means that in this example, `[:db, :query]` events
|
||
|
need to include `:table` and `:operation` keys in their payload:
|
||
|
|
||
|
:telemetry.execute([:db, :query], %{duration: 198}, %{table: "users", operation: "insert"})
|
||
|
:telemetry.execute([:db, :query], %{duration: 112}, %{table: "users", operation: "select"})
|
||
|
:telemetry.execute([:db, :query], %{duration: 201}, %{table: "sessions", operation: "insert"})
|
||
|
:telemetry.execute([:db, :query], %{duration: 212}, %{table: "sessions", operation: "insert"})
|
||
|
|
||
|
The result of aggregating the events above looks like this:
|
||
|
|
||
|
| table | operation | count |
|
||
|
| -------- | --------- | ----- |
|
||
|
| users | insert | 1 |
|
||
|
| users | select | 1 |
|
||
|
| sessions | insert | 2 |
|
||
|
|
||
|
The approach where we create a separate metric for some unique set of properties
|
||
|
is called a multi-dimensional data model.
|
||
|
|
||
|
### Transforming event metadata for tagging
|
||
|
|
||
|
Finally, sometimes there is a need to modify event metadata before it's used for
|
||
|
tagging. Each metric definition accepts a function in `:tag_values` option which
|
||
|
transforms the metadata into desired shape. Note that this function is called for
|
||
|
each event, so it's important to keep it fast if the rate of events is high.
|
||
|
|
||
|
## Converting Units
|
||
|
|
||
|
It might happen that the unit of measurement we're tracking is not the desirable unit
|
||
|
for the metric values, e.g. events are emitted by a 3rd-party library we do not control,
|
||
|
or a reporter we're using requires specific unit of measurement.
|
||
|
|
||
|
For these scenarios, each metric definition accepts a `:unit` option in a form of a tuple:
|
||
|
|
||
|
summary("http.request.stop.duration", unit: {from_unit, to_unit})
|
||
|
|
||
|
This means that the measurement will be converted from `from_unit` to `to_unit` before
|
||
|
being used for updating the metric. Currently, only time and byte conversions are
|
||
|
supported.
|
||
|
|
||
|
### Time Conversions
|
||
|
|
||
|
Most time measurements in the Erlang VM are done in the `:native` unit, which we need
|
||
|
to convert to the desired precision. The supported time units are: `:second`, `:millisecond`,
|
||
|
`:microsecond`, `:nanosecond` and `:native`.
|
||
|
|
||
|
For example, to convert HTTP request duration from `:native` time unit to milliseconds
|
||
|
you'd write:
|
||
|
|
||
|
summary("http.request.stop.duration", unit: {:native, :millisecond})
|
||
|
|
||
|
### Byte Conversions
|
||
|
|
||
|
Some metrics, like VM memory's usage are reported in bytes. You might want to convert this
|
||
|
to megabytes, for example. The supported byte units are: `:byte`, `:kilobyte` and `:megabyte`.
|
||
|
|
||
|
In order to convert a metric value from bytes to megabytes, you can write the following:
|
||
|
|
||
|
last_value("vm.memory.total", unit: {:byte, :megabyte})
|
||
|
|
||
|
## VM metrics
|
||
|
|
||
|
`Telemetry.Metrics` doesn't have a special treatment for the VM metrics - they need
|
||
|
to be based on the events like all other metrics.
|
||
|
|
||
|
`:telemetry_poller` package (http://hexdocs.pm/telemetry_poller) exposes a bunch of
|
||
|
VM-related metrics and also provides custom periodic measurements. You can add
|
||
|
telemetry poller as a dependency:
|
||
|
|
||
|
{:telemetry_poller, "~> 0.4"}
|
||
|
|
||
|
By simply adding `:telemetry_poller` as a dependency, two events will become available:
|
||
|
|
||
|
* `[:vm, :memory]` - contains the total memory, as well as the memory used for
|
||
|
binaries, processes, etc. See `erlang:memory/0` for all keys;
|
||
|
* `[:vm, :total_run_queue_lengths]` - returns the run queue lengths for CPU and
|
||
|
IO schedulers. It contains the `total`, `cpu` and `io` measurements;
|
||
|
|
||
|
You can consume those events with `Telemetry.Metrics` with the following sample metrics:
|
||
|
|
||
|
last_value("vm.memory.total", unit: :byte)
|
||
|
last_value("vm.total_run_queue_lengths.total")
|
||
|
last_value("vm.total_run_queue_lengths.cpu")
|
||
|
last_value("vm.total_run_queue_lengths.io")
|
||
|
|
||
|
If you want to change the frequency of those measurements, you can set the
|
||
|
following configuration in your config file:
|
||
|
|
||
|
config :telemetry_poller, :default, period: 5_000 # the default
|
||
|
|
||
|
Or disable it completely with:
|
||
|
|
||
|
config :telemetry_poller, :default, false
|
||
|
|
||
|
The `:telemetry_poller` package also allows you to run your own poller, which is
|
||
|
useful to retrieve process information or perform custom measurements periodically.
|
||
|
For example, to keep track of the number of users, inside a supervision tree, you could do:
|
||
|
|
||
|
measurements = [
|
||
|
{:process_info,
|
||
|
event: [:my_app, :worker],
|
||
|
name: MyApp.Worker,
|
||
|
keys: [:message_queue_len, :memory]},
|
||
|
|
||
|
{MyApp, :measure_users, []}
|
||
|
]
|
||
|
|
||
|
Supervisor.start_link([
|
||
|
# Run the given measurements every 10 seconds
|
||
|
{:telemetry_poller, measurements: measurements(), period: 10_000}
|
||
|
], strategy: :one_for_one)
|
||
|
|
||
|
Where `MyApp.measure_users/0` could be written like this:
|
||
|
|
||
|
defmodule MyApp do
|
||
|
def measure_users do
|
||
|
:telemetry.execute([:my_app, :users], %{total: MyApp.users_count()}, %{})
|
||
|
end
|
||
|
end
|
||
|
|
||
|
Now with measurements in place, you can define the metrics for the
|
||
|
events above:
|
||
|
|
||
|
last_value("my_app.worker.memory", unit: :byte)
|
||
|
last_value("my_app.worker.message_queue_len")
|
||
|
last_value("my_app.users.total")
|
||
|
|
||
|
## Optionally Recording Metric Events
|
||
|
|
||
|
There may be occasions where you want to record metrics differently or not at all based
|
||
|
upon metadata. Rather than depending on changing event names with prefixes, you can instead
|
||
|
provide a predicate function which returns true when you want the metric to be
|
||
|
processed for this event and false when you do not.
|
||
|
|
||
|
Let's examine some a few examples where optional recording can be helpful.
|
||
|
|
||
|
### Filtering on Metadata
|
||
|
|
||
|
Let's assume you are using an HTTP client library in your application which has the following
|
||
|
event_name: `[:http_client, :request, :stop]`. You use this library in multiple places and
|
||
|
you'd like to monitor the request duration.
|
||
|
|
||
|
You can use the event provided by the library but you have very different acceptable
|
||
|
performance requirements for a critical request, so it would be better to provide a different
|
||
|
metric name for monitoring. The client library includes a user configured `:name` option which
|
||
|
you can set and is passed in the event metadata.
|
||
|
|
||
|
Let's create a default distribution metric and one for the high performance call.
|
||
|
|
||
|
distribution(
|
||
|
"http.client.request.duration",
|
||
|
event_name: [:http_client, :request, :stop],
|
||
|
drop: &(match?(%{name: :fast_client}, &1))
|
||
|
)
|
||
|
|
||
|
distribution(
|
||
|
"http.fast.client.request.duration",
|
||
|
event_name: [:http_client, :request, :stop],
|
||
|
keep: &(match?(%{name: :fast_client}, &1))
|
||
|
)
|
||
|
|
||
|
With this configuration, you can now monitor these requests separately. In the first example,
|
||
|
any events where the client's name is `:fast_client` will be dropped for that metric.
|
||
|
Conversely, any matching events in the second example metric will be kept.
|
||
|
|
||
|
Note: only `:keep` OR `:drop` may be set, never both.
|
||
|
|
||
|
### Other Uses
|
||
|
|
||
|
Another potential use case for `:keep | :drop` could be per metric sampling rates. As long
|
||
|
as your function returns a boolean, it can determine if the event should be processed.
|
||
|
|
||
|
### Reporter Support
|
||
|
|
||
|
Event optional recording must be supported by the reporter you are using. Check your reporter's
|
||
|
documentation before relying on this functionality.
|
||
|
|
||
|
The `keep` function should be evaluated by reporters _prior_ to `tag_values` using the
|
||
|
raw `:telemetry.medatadata()` values from the event.
|
||
|
|
||
|
## Reporters
|
||
|
|
||
|
So far, we have talked about metrics and how to describe them, but we haven't discussed
|
||
|
how those metrics are consumed and published to a system that provides data visualization,
|
||
|
aggregation, and more. The job of subscribing to events and processing the actual metrics
|
||
|
is a responsibility of reporters.
|
||
|
|
||
|
Generally speaking, a reporter is a process that you would start in your supervision
|
||
|
tree with a list of metrics as input. For example, `Telemetry.Metrics` ships with a
|
||
|
`Telemetry.Metrics.ConsoleReporter` module, which prints data to the terminal as an
|
||
|
example. You would start it as follows:
|
||
|
|
||
|
metrics = [
|
||
|
last_value("my_app.worker.memory", unit: :byte),
|
||
|
last_value("my_app.worker.message_queue_len"),
|
||
|
last_value("my_app.users.total")
|
||
|
]
|
||
|
|
||
|
Supervisor.start_link([
|
||
|
{Telemetry.Metrics.ConsoleReporter, metrics: metrics}
|
||
|
], strategy: :one_for_one)
|
||
|
|
||
|
Reporters take metric definitions as an input, subscribe to relevant events and
|
||
|
aggregate data when the events are emitted. Reporters may push metrics to StatsD,
|
||
|
some time-series database, or exposing a HTTP endpoint for Prometheus to scrape.
|
||
|
In a nutshell, `Telemetry.Metrics` defines only how metrics of particular type
|
||
|
should behave and reporters provide the actual implementation for these aggregations.
|
||
|
|
||
|
Official reporters, maintained by the Observability Working Group of the Erlang
|
||
|
Ecosystem Foundation, can be found on the [BEAM Telemetry organization on GitHub](https://github.com/beam-telemetry/).
|
||
|
You may also [find community reporters on hex.pm](https://hex.pm/packages?search=telemetry_metrics).
|
||
|
You can also read the [Writing Reporters](writing_reporters.html) page for general
|
||
|
information on how to write a reporter.
|
||
|
|
||
|
## Wiring it all up
|
||
|
|
||
|
Over the previous sections we discussed how to setup metrics and pass them to reporters
|
||
|
and how to configure a poller for measurements. We can wire it all up into a single
|
||
|
module as shown below. The example below would be used in the context of a Phoenix
|
||
|
application, where we have web metrics, database metrics (through Ecto) as well as
|
||
|
from the database, Phoenix metrics as well as VM metrics.
|
||
|
|
||
|
The first step is to add both `:telemetry_metrics` and `:telemetry_poller` as
|
||
|
dependencies:
|
||
|
|
||
|
[
|
||
|
{:telemetry_poller, "~> 0.4"},
|
||
|
{:telemetry_metrics, "~> 0.4"}
|
||
|
]
|
||
|
|
||
|
Then you could define a module that wires everything up:
|
||
|
|
||
|
defmodule MyAppWeb.Telemetry do
|
||
|
use Supervisor
|
||
|
import Telemetry.Metrics
|
||
|
|
||
|
def start_link(arg) do
|
||
|
Supervisor.start_link(__MODULE__, arg, name: __MODULE__)
|
||
|
end
|
||
|
|
||
|
def init(_arg) do
|
||
|
children = [
|
||
|
{:telemetry_poller,
|
||
|
measurements: periodic_measurements(),
|
||
|
period: 10_000},
|
||
|
# Or TelemetryMetricsPrometheus or TelemetryMetricsFooBar
|
||
|
{TelemetryMetricsStatsd, metrics: metrics()}
|
||
|
]
|
||
|
|
||
|
Supervisor.init(children, strategy: :one_for_one)
|
||
|
end
|
||
|
|
||
|
defp metrics do
|
||
|
[
|
||
|
# VM Metrics
|
||
|
last_value("vm.memory.total", unit: :byte),
|
||
|
last_value("vm.total_run_queue_lengths.total"),
|
||
|
last_value("vm.total_run_queue_lengths.cpu"),
|
||
|
last_value("vm.total_run_queue_lengths.io"),
|
||
|
|
||
|
last_value("my_app.worker.memory", unit: :byte),
|
||
|
last_value("my_app.worker.message_queue_len"),
|
||
|
|
||
|
# Database Time Metrics
|
||
|
summary("my_app.repo.query.total_time", unit: {:native, :millisecond}),
|
||
|
summary("my_app.repo.query.decode_time", unit: {:native, :millisecond}),
|
||
|
summary("my_app.repo.query.query_time", unit: {:native, :millisecond}),
|
||
|
summary("my_app.repo.query.idle_time", unit: {:native, :millisecond}),
|
||
|
summary("my_app.repo.query.queue_time", unit: {:native, :millisecond}),
|
||
|
|
||
|
# Phoenix Time Metrics
|
||
|
summary("phoenix.endpoint.stop.duration",
|
||
|
unit: {:native, :millisecond}),
|
||
|
summary(
|
||
|
"phoenix.router_dispatch.stop.duration",
|
||
|
unit: {:native, :millisecond},
|
||
|
tags: [:plug]
|
||
|
)
|
||
|
]
|
||
|
end
|
||
|
|
||
|
defp periodic_measurements do
|
||
|
[
|
||
|
{:process_info,
|
||
|
event: [:my_app, :worker],
|
||
|
name: Rumbl.Worker,
|
||
|
keys: [:message_queue_len, :memory]}
|
||
|
]
|
||
|
end
|
||
|
end
|
||
|
"""
|
||
|
|
||
|
require Logger
|
||
|
|
||
|
alias Telemetry.Metrics.{Counter, Sum, LastValue, Summary, Distribution}
|
||
|
|
||
|
@typedoc """
|
||
|
The name of the metric, either as string or a list of atoms.
|
||
|
"""
|
||
|
@type metric_name :: String.t() | normalized_metric_name()
|
||
|
|
||
|
@typedoc """
|
||
|
The name of the metric represented as a list of atoms.
|
||
|
"""
|
||
|
@type normalized_metric_name :: [atom(), ...]
|
||
|
|
||
|
@type measurement ::
|
||
|
term()
|
||
|
| (:telemetry.event_measurements() -> number())
|
||
|
| (:telemetry.event_measurements(), :telemetry.event_metadata() -> number())
|
||
|
@type tag :: term()
|
||
|
@type tags :: [tag()]
|
||
|
@type tag_values :: (:telemetry.event_metadata() -> :telemetry.event_metadata())
|
||
|
@type keep :: (:telemetry.event_metadata() -> boolean())
|
||
|
@type drop :: (:telemetry.event_metadata() -> boolean())
|
||
|
@type description :: nil | String.t()
|
||
|
@type unit :: atom()
|
||
|
@type time_unit_conversion() :: {time_unit(), time_unit()}
|
||
|
@type time_unit() :: :second | :millisecond | :microsecond | :nanosecond | :native
|
||
|
@type byte_unit() :: :megabyte | :kilobyte | :byte
|
||
|
@type byte_unit_conversion() :: {byte_unit(), byte_unit()}
|
||
|
@type reporter_options :: keyword()
|
||
|
@type metric_option ::
|
||
|
{:event_name, String.t() | :telemetry.event_name()}
|
||
|
| {:measurement, measurement()}
|
||
|
| {:tags, tags()}
|
||
|
| {:tag_values, tag_values()}
|
||
|
| {:keep, keep()}
|
||
|
| {:drop, drop()}
|
||
|
| {:description, description()}
|
||
|
| {:unit, unit() | time_unit_conversion() | byte_unit_conversion()}
|
||
|
| {:reporter_options, reporter_options()}
|
||
|
@type metric_options :: [metric_option()]
|
||
|
|
||
|
@typedoc """
|
||
|
One of the base metric definitions.
|
||
|
"""
|
||
|
@type t() :: Counter.t() | LastValue.t() | Sum.t() | Summary.t() | Distribution.t()
|
||
|
|
||
|
# API
|
||
|
|
||
|
@doc """
|
||
|
Returns a definition of counter metric.
|
||
|
|
||
|
Counter metric keeps track of the total number of specific events emitted.
|
||
|
|
||
|
The value of the counter is always incremented by one, regardless of the
|
||
|
value of the measurement. However, note the measurement must still be
|
||
|
available in the event, otherwise the event is not accounted for.
|
||
|
|
||
|
See the ["Metrics"](#module-metrics) section in the top-level documentation of this module for more
|
||
|
information.
|
||
|
|
||
|
## Example
|
||
|
|
||
|
counter(
|
||
|
"http.request.count",
|
||
|
tags: [:controller, :action]
|
||
|
)
|
||
|
"""
|
||
|
@spec counter(metric_name(), metric_options()) :: Counter.t()
|
||
|
def counter(metric_name, options \\ []) do
|
||
|
struct(Counter, common_fields(metric_name, options))
|
||
|
end
|
||
|
|
||
|
@doc """
|
||
|
Returns a definition of sum metric.
|
||
|
|
||
|
Sum metric keeps track of the sum of selected measurement's values carried by specific events.
|
||
|
|
||
|
See the ["Metrics"](#module-metrics) section in the top-level documentation of this module for more
|
||
|
information.
|
||
|
|
||
|
## Example
|
||
|
|
||
|
sum(
|
||
|
"user.session_count",
|
||
|
event_name: "user.session_count",
|
||
|
measurement: :delta,
|
||
|
tags: [:role]
|
||
|
)
|
||
|
"""
|
||
|
@spec sum(metric_name(), metric_options()) :: Sum.t()
|
||
|
def sum(metric_name, options \\ []) do
|
||
|
struct(Sum, common_fields(metric_name, options))
|
||
|
end
|
||
|
|
||
|
@doc """
|
||
|
Returns a definition of last value metric.
|
||
|
|
||
|
Last value keeps track of the selected measurement found in the most recent event.
|
||
|
|
||
|
See the ["Metrics"](#module-metrics) section in the top-level documentation of this module for more
|
||
|
information.
|
||
|
|
||
|
## Example
|
||
|
|
||
|
last_value(
|
||
|
"vm.memory.total",
|
||
|
description: "Total amount of memory allocated by the Erlang VM", unit: :byte
|
||
|
)
|
||
|
"""
|
||
|
@spec last_value(metric_name(), metric_options()) :: LastValue.t()
|
||
|
def last_value(metric_name, options \\ []) do
|
||
|
struct(LastValue, common_fields(metric_name, options))
|
||
|
end
|
||
|
|
||
|
@doc """
|
||
|
Returns a definition of summary metric.
|
||
|
|
||
|
This metric aggregates measurement's values into statistics, e.g. minimum and maximum, mean, or
|
||
|
percentiles. It is up to the reporter to decide which statistics exactly are exposed.
|
||
|
|
||
|
See the ["Metrics"](#module-metrics) section in the top-level documentation of this module for more
|
||
|
information.
|
||
|
|
||
|
## Example
|
||
|
|
||
|
summary(
|
||
|
"db.query.duration",
|
||
|
tags: [:table],
|
||
|
unit: {:native, :millisecond}
|
||
|
)
|
||
|
"""
|
||
|
@spec summary(metric_name(), metric_options()) :: Summary.t()
|
||
|
def summary(metric_name, options \\ []) do
|
||
|
struct(Summary, common_fields(metric_name, options))
|
||
|
end
|
||
|
|
||
|
@doc """
|
||
|
Returns a definition of distribution metric.
|
||
|
|
||
|
Distribution metric builds a histogram of selected measurement's values. It is up to the reporter
|
||
|
to decide how the boundaries of the distribution buckets are configured - via `:reporter_options`,
|
||
|
configuration of the aggregating system, or other means.
|
||
|
|
||
|
See the ["Metrics"](#module-metrics) section in the top-level documentation of this module for more
|
||
|
information.
|
||
|
|
||
|
## Example
|
||
|
|
||
|
distribution(
|
||
|
"http.request.duration",
|
||
|
tags: [:controller, :action],
|
||
|
)
|
||
|
|
||
|
"""
|
||
|
@spec distribution(metric_name(), metric_options()) :: Distribution.t()
|
||
|
def distribution(metric_name, options \\ []) do
|
||
|
fields = common_fields(metric_name, options)
|
||
|
struct(Distribution, fields)
|
||
|
end
|
||
|
|
||
|
# Helpers
|
||
|
|
||
|
@spec common_fields(metric_name(), [metric_option() | {atom(), term()}]) :: map()
|
||
|
defp common_fields(metric_name, options) do
|
||
|
metric_name = validate_metric_or_event_name!(metric_name)
|
||
|
{event_name, [measurement]} = Enum.split(metric_name, -1)
|
||
|
{event_name, options} = Keyword.pop(options, :event_name, event_name)
|
||
|
{measurement, options} = Keyword.pop(options, :measurement, measurement)
|
||
|
{keep, options} = Keyword.pop(options, :keep)
|
||
|
{drop, options} = Keyword.pop(options, :drop)
|
||
|
validate_recording_rule_fun_options!(keep, drop)
|
||
|
keep = validate_recording_rule_fun!(keep)
|
||
|
drop = validate_recording_rule_fun!(drop)
|
||
|
event_name = validate_metric_or_event_name!(event_name)
|
||
|
{unit, options} = Keyword.pop(options, :unit, :unit)
|
||
|
{unit, conversion_ratio} = validate_unit!(unit)
|
||
|
measurement = maybe_convert_measurement(measurement, conversion_ratio)
|
||
|
validate_metric_options!(options)
|
||
|
|
||
|
options
|
||
|
|> fill_in_default_metric_options()
|
||
|
|> Map.new()
|
||
|
|> Map.merge(%{
|
||
|
name: metric_name,
|
||
|
event_name: event_name,
|
||
|
measurement: measurement,
|
||
|
keep: keep_fun(keep, drop),
|
||
|
unit: unit
|
||
|
})
|
||
|
end
|
||
|
|
||
|
@spec validate_metric_or_event_name!(term()) :: [atom(), ...]
|
||
|
defp validate_metric_or_event_name!(metric_or_event_name)
|
||
|
when metric_or_event_name == [] or metric_or_event_name == "" do
|
||
|
raise ArgumentError, "metric or event name can't be empty"
|
||
|
end
|
||
|
|
||
|
defp validate_metric_or_event_name!(metric_or_event_name) when is_list(metric_or_event_name) do
|
||
|
if Enum.all?(metric_or_event_name, &is_atom/1) do
|
||
|
metric_or_event_name
|
||
|
else
|
||
|
raise ArgumentError,
|
||
|
"expected metric or event name to be a list of atoms or a string, " <>
|
||
|
"got #{inspect(metric_or_event_name)}"
|
||
|
end
|
||
|
end
|
||
|
|
||
|
defp validate_metric_or_event_name!(metric_or_event_name)
|
||
|
when is_binary(metric_or_event_name) do
|
||
|
segments = String.split(metric_or_event_name, ".")
|
||
|
|
||
|
if Enum.any?(segments, &(&1 == "")) do
|
||
|
raise ArgumentError,
|
||
|
"metric or event name #{metric_or_event_name} contains leading, " <>
|
||
|
"trailing or consecutive dots"
|
||
|
end
|
||
|
|
||
|
Enum.map(segments, &String.to_atom/1)
|
||
|
end
|
||
|
|
||
|
defp validate_metric_or_event_name!(term) do
|
||
|
raise ArgumentError,
|
||
|
"expected metric name to be a string or a list of atoms, got #{inspect(term)}"
|
||
|
end
|
||
|
|
||
|
defp validate_recording_rule_fun!(nil), do: nil
|
||
|
defp validate_recording_rule_fun!(fun) when is_function(fun, 1), do: fun
|
||
|
|
||
|
defp validate_recording_rule_fun!(term) do
|
||
|
raise ArgumentError,
|
||
|
"expected recording rule to be a function accepting metadata, got #{inspect(term)}"
|
||
|
end
|
||
|
|
||
|
defp validate_recording_rule_fun_options!(keep, drop) when is_nil(keep) or is_nil(drop), do: :ok
|
||
|
|
||
|
defp validate_recording_rule_fun_options!(_keep, _drop) do
|
||
|
raise ArgumentError,
|
||
|
"either the keep or drop option may be set, but not both"
|
||
|
end
|
||
|
|
||
|
defp keep_fun(nil, nil), do: nil
|
||
|
defp keep_fun(nil, drop), do: fn meta -> not drop.(meta) end
|
||
|
defp keep_fun(keep, nil), do: keep
|
||
|
|
||
|
@spec fill_in_default_metric_options([metric_option()]) :: [metric_option()]
|
||
|
defp fill_in_default_metric_options(options) do
|
||
|
Keyword.merge(default_metric_options(), options)
|
||
|
end
|
||
|
|
||
|
@spec default_metric_options() :: [metric_option()]
|
||
|
defp default_metric_options() do
|
||
|
[
|
||
|
tags: [],
|
||
|
tag_values: & &1,
|
||
|
nil: nil,
|
||
|
description: nil,
|
||
|
reporter_options: []
|
||
|
]
|
||
|
end
|
||
|
|
||
|
@spec validate_metric_options!([metric_option()]) :: :ok | no_return()
|
||
|
defp validate_metric_options!(options) do
|
||
|
if tags = Keyword.get(options, :tags), do: validate_tags!(tags)
|
||
|
if tag_values = Keyword.get(options, :tag_values), do: validate_tag_values!(tag_values)
|
||
|
if description = Keyword.get(options, :description), do: validate_description!(description)
|
||
|
|
||
|
if reporter_options = Keyword.get(options, :reporter_options),
|
||
|
do: validate_reporter_options!(reporter_options)
|
||
|
end
|
||
|
|
||
|
@spec validate_tags!(term()) :: :ok | no_return()
|
||
|
defp validate_tags!(list) when is_list(list) do
|
||
|
:ok
|
||
|
end
|
||
|
|
||
|
defp validate_tags!(term) do
|
||
|
raise ArgumentError, "expected tag keys to be a list, got: #{inspect(term)}"
|
||
|
end
|
||
|
|
||
|
@spec validate_tag_values!(term()) :: :ok | no_return()
|
||
|
defp validate_tag_values!(fun) when is_function(fun, 1) do
|
||
|
:ok
|
||
|
end
|
||
|
|
||
|
defp validate_tag_values!(term) do
|
||
|
raise ArgumentError,
|
||
|
"expected tag_values fun to be a one-argument function, got: #{inspect(term)}"
|
||
|
end
|
||
|
|
||
|
@spec validate_description!(term()) :: :ok | no_return()
|
||
|
defp validate_description!(term) do
|
||
|
if is_binary(term) and String.valid?(term) do
|
||
|
:ok
|
||
|
else
|
||
|
raise ArgumentError, "expected description to be a string, got #{inspect(term)}"
|
||
|
end
|
||
|
end
|
||
|
|
||
|
@spec validate_reporter_options!(term()) :: :ok | no_return()
|
||
|
defp validate_reporter_options!(reporter_options) do
|
||
|
if Keyword.keyword?(reporter_options) do
|
||
|
:ok
|
||
|
else
|
||
|
raise ArgumentError,
|
||
|
"expected reporter options to be a keyword list, got #{inspect(reporter_options)}"
|
||
|
end
|
||
|
end
|
||
|
|
||
|
@spec maybe_convert_measurement(measurement(), conversion_ratio :: non_neg_integer()) ::
|
||
|
measurement()
|
||
|
defp maybe_convert_measurement(measurement, 1) do
|
||
|
# Don't wrap measurement if no conversion is required.
|
||
|
measurement
|
||
|
end
|
||
|
|
||
|
defp maybe_convert_measurement(measurement, conversion_ratio)
|
||
|
when is_function(measurement, 2) do
|
||
|
fn measurements, metadata ->
|
||
|
if measurement = measurement.(measurements, metadata) do
|
||
|
measurement * conversion_ratio
|
||
|
end
|
||
|
end
|
||
|
end
|
||
|
|
||
|
defp maybe_convert_measurement(measurement, conversion_ratio)
|
||
|
when is_function(measurement, 1) do
|
||
|
fn measurements ->
|
||
|
if measurement = measurement.(measurements) do
|
||
|
measurement * conversion_ratio
|
||
|
end
|
||
|
end
|
||
|
end
|
||
|
|
||
|
defp maybe_convert_measurement(measurement, conversion_ratio) do
|
||
|
fn measurements ->
|
||
|
case measurements do
|
||
|
%{^measurement => nil} -> nil
|
||
|
%{^measurement => value} -> value * conversion_ratio
|
||
|
%{} -> nil
|
||
|
end
|
||
|
end
|
||
|
end
|
||
|
|
||
|
@spec validate_unit!(term()) :: {unit(), conversion_ratio :: number()} | no_return()
|
||
|
defp validate_unit!({from_unit, to_unit} = t) do
|
||
|
cond do
|
||
|
time_unit?(from_unit) and time_unit?(to_unit) ->
|
||
|
{to_unit, time_unit_conversion_ratio(from_unit, to_unit)}
|
||
|
|
||
|
byte_unit?(from_unit) and byte_unit?(to_unit) ->
|
||
|
{to_unit, byte_unit_conversion_ratio(from_unit, to_unit)}
|
||
|
|
||
|
true ->
|
||
|
raise ArgumentError,
|
||
|
"expected both elements of the unit conversion tuple " <>
|
||
|
"to represent time or byte units, got #{inspect(t)}"
|
||
|
end
|
||
|
end
|
||
|
|
||
|
defp validate_unit!(unit) when is_atom(unit) do
|
||
|
{unit, 1}
|
||
|
end
|
||
|
|
||
|
defp validate_unit!(term) do
|
||
|
raise ArgumentError,
|
||
|
"expected unit to be an atom or a two-element tuple, got #{inspect(term)}"
|
||
|
end
|
||
|
|
||
|
# Maybe warn or raise if the result of conversion is 0?
|
||
|
@spec time_unit_conversion_ratio(time_unit(), time_unit()) :: number()
|
||
|
defp time_unit_conversion_ratio(unit, unit), do: 1
|
||
|
|
||
|
defp time_unit_conversion_ratio(from_unit, to_unit) do
|
||
|
case System.convert_time_unit(1, from_unit, to_unit) do
|
||
|
0 ->
|
||
|
1 / System.convert_time_unit(1, to_unit, from_unit)
|
||
|
|
||
|
ratio ->
|
||
|
ratio
|
||
|
end
|
||
|
end
|
||
|
|
||
|
@spec byte_unit_conversion_ratio(byte_unit(), byte_unit()) :: number()
|
||
|
defp byte_unit_conversion_ratio(unit, unit), do: 1
|
||
|
|
||
|
defp byte_unit_conversion_ratio(from_unit, to_unit) do
|
||
|
multiplier = byte_unit_factor(to_unit)
|
||
|
divisor = byte_unit_factor(from_unit)
|
||
|
1 * multiplier / divisor
|
||
|
end
|
||
|
|
||
|
@spec byte_unit_factor(byte_unit()) :: number()
|
||
|
defp byte_unit_factor(:megabyte), do: 1
|
||
|
defp byte_unit_factor(:kilobyte), do: 1_000
|
||
|
defp byte_unit_factor(:byte), do: 1_000_000
|
||
|
|
||
|
@spec time_unit?(term()) :: boolean()
|
||
|
defp time_unit?(:native), do: true
|
||
|
defp time_unit?(:second), do: true
|
||
|
defp time_unit?(:millisecond), do: true
|
||
|
defp time_unit?(:microsecond), do: true
|
||
|
defp time_unit?(:nanosecond), do: true
|
||
|
defp time_unit?(_), do: false
|
||
|
|
||
|
@spec byte_unit?(term()) :: boolean()
|
||
|
defp byte_unit?(:byte), do: true
|
||
|
defp byte_unit?(:kilobyte), do: true
|
||
|
defp byte_unit?(:megabyte), do: true
|
||
|
defp byte_unit?(_), do: false
|
||
|
end
|