= RFC-0011: Step Library Packaging Format
:toc: preamble
:toclevels: 3
ifdef::env-github[]
:tip-caption: :bulb:
:note-caption: :information_source:
:important-caption: :heavy_exclamation_mark:
:caution-caption: :fire:
:warning-caption: :warning:
endif::[]
.Metadata
[cols="1h,1"]
|===
| RFC
| 0011
| Title
| Step Library Packaging Format
| Sponsor
| link:https://github.com/rtyler[R Tyler Croy]
| Status
| Draft :speech_balloon:
| Type
| Standards
| Created
| 2020-10-17
|===
== Abstract
To provide a simple and extensible way to implement steps, the step library
packaging format allows native tools and scripts to be distributed and loaded
by agents.
== Specification
Each step effectively has an `entrypoint` which is a binary executable. The
agents in Otto will execute this file with an <<invocation-file>> containing
all the necessary configuration and parameters.
[[manifest-file]]
=== Manifest file
The `manifest.yml` file is the step library package description. It contains
all the information on how the step can be invoked, as well as details about
how the build tooling should package the artifact for use within Otto.
.Example manifest file
[source,yaml]
----
# This manifest captures the basic functionality of the Jenkins Pipeline `sh`
# step
---
# The symbol defines how this step should present in the pipeline
symbol: sh
# Description is help text
description: |
  The `sh` step executes a shell script within the given execution context
# Instructs the agent to create a local cache for this step, this will default
# to false as _most_ steps do not need their own caching directory.
#
# Agents are expected to construct the caches in their
# workdir/caches/<step-symbol> and to pass that into the step via the "cache"
# invocation file parameter
cache: true
# List all the files/globs to include in the packaged artifact
includes:
  # Paths are treated as relative from wherever osp is invoked from
  - name: target/release/sh-step
    # Strips the entire prefix of the file name, placing the file in the root
    # of the artifact
    flatten: true
  # A name starting with ./ is treated as relative to the manifest.yml
  - name: ./README.adoc
# The entrypoint tells the Otto agent which actual binary to use when
# executing.
entrypoint:
  path: sh-step
  # Multiarch tells the agent that this should be executed on all platforms,
  # in which case it may be "blindly" invoked.
  #
  # Non-multiarch steps will be invoked with
  # `${entrypoint.path}-${arch}-${vendor}-${system}-${abi}`, similar to how
  # Rust manages target triples: https://doc.rust-lang.org/nightly/rustc/platform-support.html
  multiarch: false
# The configuration helps the step self-express the configuration variables it
# requires from Otto in order to execute properly
#
# The names of these variables are considered globally flat by default,
# allowing multiple steps to share the same configuration values. Should a
# step wish to _not_ share its configuration values, it should namespace them
# in the key name with the convention of `{step}.{key}` (e.g.
# `sh.default_shell`)
#
# The configuration variables are also a means of requesting credentials from
# Otto.
configuration:
  default_shell:
    description: |
      The default shell to use for the invocation of `sh` steps
    required: false
    default: '/bin/sh'
# The parameters array allows for keyword invocation and positional invocation
# of the step
parameters:
  - name: script
    required: true
    type: string
    description: |
      Runs a Bourne shell script, typically on a Unix node. Multiple lines are accepted.
      An interpreter selector may be used, for example: `#!/usr/bin/perl`
      Otherwise the system default shell will be run, using the `-xe` flags (you can specify `set +e` and/or `set +x` to disable those).
  - name: encoding
    description: |
      Encoding of the stdout/stderr output, not typically needed as the system
      will default to whatever `LC_TYPE` is defined.
    type: string
    required: false
  - name: label
    description: |
      A label to identify the shell step in a GUI.
    type: string
    required: false
  - name: returnStatus
    description: Compatibility support only, doesn't do anything
    type: boolean
    required: false
  - name: returnStdout
    description: Compatibility support only, doesn't do anything
    type: boolean
    required: false
----
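Since a packaged step library is ultimately just a tarball, the `includes` and
`flatten` rules above can be sketched in a few lines. The `package_step`
helper and the `step.tar.gz` name below are hypothetical illustrations, not
the actual `osp` packaging tool:

```python
import os
import tarfile


def package_step(manifest, output="step.tar.gz"):
    """Hypothetical sketch of applying a manifest's `includes` rules.

    The real packaging is done by the osp tooling mentioned in the
    manifest comments; this only illustrates the flatten behavior.
    """
    with tarfile.open(output, "w:gz") as tar:
        for include in manifest.get("includes", []):
            name = include["name"]
            # `flatten: true` strips the directory prefix, placing the
            # file at the root of the artifact
            arcname = os.path.basename(name) if include.get("flatten") else name
            tar.add(name, arcname=arcname)
    return output
```

Given the example manifest, `target/release/sh-step` would land in the
artifact root as `sh-step`, while `./README.adoc` would keep its name.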
[[invocation-file]]
=== Invocation file
The invocation file is a YAML file generated at runtime and made available to
the step binary on the agent. The invocation file should carry all parameters,
environment variables, and internal configuration necessary for the step binary
to execute correctly.
.Example invocation file passed to entrypoint
[source,yaml]
----
---
configuration:
  self: 'some-uuid-v4' # <1>
  ipc: '/tmp/agent-5171.sock' # <2>
  endpoints: # <3>
    objects:
      url: 'http://localhost:8080'
parameters:
  script: 'ls -lah'
----
<1> `self` contains the identifier the step can use to identify itself when interacting with the control socket or other services.
<2> `ipc` will have a path to the agent's control socket, which speaks HTTP.
<3> `endpoints` is a map of endpoints which the step may interact with to perform its functions.
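A step entrypoint only needs to parse this file and act on its contents. The
sketch below shows what an `sh`-style entrypoint might do with an
already-parsed invocation; the key names come from the examples above, and
parsing the YAML itself (e.g. with PyYAML's `yaml.safe_load`) is omitted.
This is an illustration, not the real Otto implementation:

```python
import subprocess


def run_sh_step(invocation):
    """Sketch of an `sh`-style step acting on its parsed invocation file."""
    config = invocation.get("configuration", {})
    # `default_shell` corresponds to the manifest's configuration section
    shell = config.get("default_shell", "/bin/sh")
    script = invocation["parameters"]["script"]
    # -xe matches the default flags described in the manifest's
    # `script` parameter documentation
    result = subprocess.run([shell, "-xec", script])
    return result.returncode


# e.g. run_sh_step({"configuration": {}, "parameters": {"script": "ls -lah"}})
```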
[[control-socket]]
=== Agent Control Socket
Each agent will open up a control socket for the steps it launches to safely
communicate back with the long-lived agent daemon. The agent _may_ create a
single long-lived IPC socket which is open for all steps, or generate a unique
IPC connection for each step. The messages must be JSON structured and steps
should wait for a response before proceeding to their next operation.
For the inter-process communication (IPC) between steps and the agent, the agent
should bind an HTTP service to a local unix socket on platforms which support it.
All requests should then be formed as JSON over HTTP, which is further described
in <<0005-json-over-http.adoc#abstract, RFC #5>>.
Examples of requests are detailed below.
.Response
[source,json]
----
{
  "type" : "Received"
}
----
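To make the transport concrete, the sketch below sends one such JSON message
over the agent's unix control socket and blocks until the agent responds, as
the specification requires. The `/control` URL path is a hypothetical
placeholder (RFC #5 governs the actual request shape); the socket path would
come from the `ipc` key of the invocation file:

```python
import http.client
import json
import socket


class AgentControlConnection(http.client.HTTPConnection):
    """HTTP connection bound to a unix domain socket rather than TCP."""

    def __init__(self, socket_path):
        # The host name is irrelevant for a unix socket but is required
        # by HTTPConnection's constructor
        super().__init__("localhost")
        self.socket_path = socket_path

    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(self.socket_path)


def send_message(socket_path, message):
    """POST a JSON control message and wait for the agent's response."""
    conn = AgentControlConnection(socket_path)
    conn.request(
        "POST",
        "/control",  # hypothetical path, not defined by this document
        body=json.dumps(message),
        headers={"Content-Type": "application/json"},
    )
    response = conn.getresponse()
    return json.loads(response.read())


# e.g. send_message('/tmp/agent-5171.sock',
#                   {'type': 'SetPipelineStatus', 'status': 'Unstable'})
```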
==== Manage pipeline status
The full set of pipeline statuses is the subject of another document, but for
example purposes assume there are: `Success`, `Failure`, `Unstable`, and `Aborted`.
[NOTE]
====
If the step exits with a non-zero code, the agent will automatically set the status to `Failure`.
See the `Status` enum in the agent code for a mapping of exit codes to status.
====
.Change the status
[source,json]
----
{
  "type" : "SetPipelineStatus",
  "status" : "Unstable"
}
----
.Terminate the pipeline
[source,json]
----
{
  "type" : "TerminatePipeline"
}
----
==== Variables
Capturing variables should be pretty straightforward.
.Example step capturing a variable
[source]
----
prompt msg: 'What color should the bike shed be?', into: 'color'
----
.Variable capture message
[source,json]
----
{
  "type" : "CaptureVariable",
  "name" : "color",
  "value" : "blue"
}
----
These can then be accessed in the steps remaining in the scope (e.g. a stage)
via a special environment variable: `VAR_COLOR`
Storing a variable under an existing name should replace the previous value, but a `drop` step should also exist, e.g.:
[source]
----
drop name: 'color'
----
.Drop variable message
[source,json]
----
{
  "type" : "DropVariable",
  "name" : "color"
}
----
== Motivation
Otto requires a means of defining pipeline behavior in a customizable fashion.
This includes a standard set of steps which address common user needs, along
with a pattern to allow for user-defined steps.
== Reasoning
The approach defined in this document intends to address some of the goals
defined at the outset of the Otto project, discussed below.
=== Safe extensibility
[quote]
====
Extensibility must not come at the expense of system integrity. Systems which
allow for administrator, or user-injected code at runtime cannot avoid system
reliability and security problems. Extensibility is an important characteristic
to support, but secondary to system integrity.
====
The process boundaries required by the design of step libraries are the **key
feature** of this design. The extensibility of the system is inherently
process-based which allows for steps to do numerous things which are not
currently known or defined.
=== User-defined extension
[quote]
====
Usage cannot grow across an organization without user-defined extension. The
operators of the system will not be able to provide for every eventual
requirement from users. Some mechanism for extending or consolidating aspects
of a continuous delivery process must exist.
====
Implied, but not defined by this document is the notion of a "standard library"
of steps to cover common cases which many users will have in _most_ of their
pipelines. That said, because step libraries are "just tarballs", a user
should be able to bring their own step libraries and in some cases even
override the step libraries defined by the standard library.
Because the step libraries are dumb tarballs, they also don't require any
specific platform support for users to bring their own implementations of
steps. An `entrypoint` in the <<manifest-file>> is nothing more than a file
executable by the agent. This allows users to write their own custom steps in
Bash, Ruby, Python, Java, C#, etc. So long as the process can read the
<<invocation-file>> and emit the appropriate outputs, any number of steps
implemented in different languages should have no problem co-existing within
the same pipeline.
=== Avoid mixing execution contexts
[quote]
====
Mixing of management and execution contexts causes a myriad of issues. Many
tools allow the management/orchestrator process to run user-defined workloads.
This allows breaches of isolation between user-defined workloads and
administrator configuration and data.
====
The step library approach pushes execution of user-defined code solely to the
agent which executes the pipeline. In many cases this will be a machine with
few privileges or an ephemeral cloud/container instance. As such there is zero
execution of user-defined workloads in the locations where system execution
occurs, such as in the parser, orchestrator, or other services.
=== Deterministic runtime behavior
[quote]
====
Non-deterministic runtime behavior adds instability. Without being able to
"explain" a set of operations which should occur before runtime, it is
impossible to determine whether or not a given delivery pipeline is correctly
constructed.
====
Execution of steps from a step library can and should be done without an
"interpreter." That is to say the modeling language which sits on top of the
step libraries doesn't need to be Turing-complete
footnote:[https://en.wikipedia.org/wiki/Turing_completeness] and can be
"explained" prior to execution.
This approach _should_ open the door to future enhancements which can perform a
type of static analysis on pipelines to find redundant steps or performance
bottlenecks that can be improved upon at a later date.
== Backwards Compatibility
Since Otto has no previous step subsystem, there are no backwards
compatibility concerns.
== Security
=== Variables
Variables must be stored within the agent per-pipeline, such that pipelines
cannot "pollute" the variable namespace of other pipelines. Current test
implementations require a single agent invocation per pipeline invocation, so
it's not yet possible to even have an agent run multiple successive or
concurrent pipelines. Should that ever become the case, it is expected that
variables will be associated with the pipeline which declares them.
=== Credentials
The retrieval and use of credentials by steps is not subject to this document,
and is at this time still under active deliberation.
== Testing
Testing of step libraries is covered in the main source repository. This
includes testing of the steps themselves, which are generally tested using
link:https://github.com/kward/shunit2/[shunit2].
== Prototype Implementation
The prototype implementation is found in the
link:https://github.com/rtyler/otto[main branch].
== References