otto/rfc/0011-step-library-format.adoc


<html lang="en"> <head> </head>

RFC-0011: Step Library Packaging Format

Abstract

In order to provide for a simple and extensible way to implement steps, the step library packaging format allows for native tools and scripts to be distributed and loaded by agents.

Specification

Each step effectively has an entrypoint which is a binary executable. The agents in Otto will execute this file with an Invocation file containing all the necessary configuration and parameters.

Manifest file

The manifest.yml file is the step library package description. It contains all the information on how the step can be invoked, as well as details about how the build tooling should package the artifact for use within Otto.

Example manifest file
# This manifest captures the basic functionality of the Jenkins Pipeline `sh`
# step
---
# The symbol defines how this step should present in the pipeline
symbol: sh
# Description is help text
description: |
  The `sh` step executes a shell script within the given execution context

# Instructs the agent to create a local cache for this step, this will default
# to false as _most_ steps do not need their own caching directory.
#
# Agents are expected to construct the caches in their
# workdir/caches/<step-symbol> and to pass that into the step via the "cache"
# invocation file parameter
cache: true

# List all the files/globs to include in the packaged artifact
includes:
  # Paths are treated as relative from wherever osp is invoked from
  - name: target/release/sh-step
    # Strips the entire prefix of the file name, placing the file in the root of
    # the artifact
    flatten: true
  # A name starting with ./ is treated as relative to the manifest.yml
  - name: ./README.adoc

# The entrypoint tells the Otto agent which actual binary to use when
# executing.
entrypoint:
  path: sh-step
  # Multiarch tells the agent that this should be executed on all platforms, in
  # which case it may be "blindly" invoked.
  #
  # For non-multiarch steps, the agent will attempt to invoke
  # `${entrypoint.path}-${arch}-${vendor}-${system}-${abi}`, similar to how
  # Rust manages target triples: https://doc.rust-lang.org/nightly/rustc/platform-support.html
  multiarch: false

# The configuration helps the step self-express the configuration variables it
# requires from Otto in order to execute properly
#
# The names of these variables are to be considered globally flat by default,
# allowing for multiple steps to share the same configuration values. Should a
# step wish to _not_ share its configuration values, it should namespace them
# in the key name with the convention of `{step}.{key}` (e.g.
# `sh.default_shell`)
#
# The configuration variables are also a means of requesting credentials from
# Otto.
configuration:
  default_shell:
    description: |
      The default shell to use for the invocation of `sh` steps
    required: false
    default: '/bin/sh'

# The parameters array allows for keyword invocation and positional invocation
# of the step
parameters:
  - name: script
    required: true
    type: string
    description: |
      Runs a Bourne shell script, typically on a Unix node. Multiple lines are accepted.

      An interpreter selector may be used, for example: `#!/usr/bin/perl`

      Otherwise the system default shell will be run, using the `-xe` flags (you can specify `set +e` and/or `set +x` to disable those).

  - name: encoding
    description: |
      Encoding of the stdout/stderr output, not typically needed as the system will
      default to whatever `LC_CTYPE` is defined.
    type: string
    required: false

  - name: label
    description: |
      A label to identify the shell step in a GUI.
    type: string
    required: false

  - name: returnStatus
    description: Compatibility support only, doesn't do anything
    type: boolean
    required: false

  - name: returnStdout
    description: Compatibility support only, doesn't do anything
    type: boolean
    required: false
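
The entrypoint lookup described in the manifest comments can be sketched as follows. This is a minimal illustration in Python, not the agent's implementation: the function name `resolve_entrypoint`, passing the platform in as a ready-made triple string, and treating a missing `multiarch` key as `false` are all assumptions made for the example.

```python
def resolve_entrypoint(entrypoint, target_triple):
    """Return the file name an agent would execute for a manifest entrypoint.

    `entrypoint` is the parsed `entrypoint` mapping from manifest.yml.
    `target_triple` is a hypothetical string such as
    "x86_64-unknown-linux-gnu" describing the agent's platform.
    """
    path = entrypoint["path"]
    # Multiarch entrypoints are invoked as-is on every platform.
    # Assumption: a missing `multiarch` key is treated as false.
    if entrypoint.get("multiarch", False):
        return path
    # Otherwise the platform triple is appended to the entrypoint path.
    return f"{path}-{target_triple}"


print(resolve_entrypoint({"path": "sh-step", "multiarch": False},
                         "x86_64-unknown-linux-gnu"))
print(resolve_entrypoint({"path": "sh-step", "multiarch": True},
                         "x86_64-unknown-linux-gnu"))
```

With the example manifest above, a non-multiarch `sh-step` would therefore need a per-platform binary inside the artifact, while a multiarch step ships a single file.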

Invocation file

The invocation file is a YAML file generated at runtime and made available to the step binary on the agent. The invocation file should carry all parameters, environment variables, and internal configuration necessary for the step binary to execute correctly.

Example invocation file passed to entrypoint
---
configuration:
  self: 'some-uuid-v4' # (1)
  ipc: '/tmp/agent-5171.sock' # (2)
  endpoints: # (3)
    objects:
      url: 'http://localhost:8080'

parameters:
  script: 'ls -lah'
  1. self contains the identifier the step can use to identify itself when interacting with the control socket or other services.

  2. ipc will have a path to the agent's control socket, which speaks HTTP.

  3. endpoints is a map of endpoints which the step may interact with to perform its functions.
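
A step, or the agent before launching it, might check the `parameters` of an invocation file against the manifest's `parameters` list along these lines. The sketch assumes both files have already been parsed into Python dictionaries by a YAML library. `check_parameters` and the `TYPES` table are hypothetical names, only the `string` and `boolean` types seen in the example manifest are handled, and rejecting undeclared parameters is an assumption rather than something this document specifies.

```python
# Maps manifest type names to Python types; only the types seen in the
# example manifest are covered here.
TYPES = {"string": str, "boolean": bool}


def check_parameters(manifest, invocation):
    """Return a list of problems found in the invocation's parameters."""
    problems = []
    given = invocation.get("parameters", {})
    declared = {p["name"]: p for p in manifest.get("parameters", [])}

    for name, spec in declared.items():
        if name not in given:
            # Only parameters marked `required: true` must be present.
            if spec.get("required", False):
                problems.append(f"missing required parameter: {name}")
            continue
        expected = TYPES.get(spec.get("type"))
        if expected is not None and not isinstance(given[name], expected):
            problems.append(f"parameter {name} should be a {spec['type']}")

    # Assumption: parameters not declared in the manifest are reported.
    for name in given:
        if name not in declared:
            problems.append(f"unknown parameter: {name}")
    return problems


manifest = {
    "parameters": [
        {"name": "script", "required": True, "type": "string"},
        {"name": "label", "required": False, "type": "string"},
    ]
}
print(check_parameters(manifest, {"parameters": {"script": "ls -lah"}}))
print(check_parameters(manifest, {"parameters": {"label": True}}))
```

The first call returns an empty list because the only required parameter is present; the second reports both the missing `script` and the mistyped `label`.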

Agent Control Socket

Each agent will open up a control socket for the steps it launches to safely communicate back with the long-lived agent daemon. The agent may create a single long-lived IPC socket which is open for all steps, or generate a unique IPC connection for each step. The messages must be JSON structured and steps should wait for a response before proceeding to their next operation.

For the inter-process communication (IPC) between steps and the agent, the agent should bind an HTTP service to a local Unix socket on platforms which support it. All requests should then be sent as JSON over HTTP, which is further described in RFC #5.

Examples of requests are detailed below.

Response
{
    "type" : "Received"
}
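
A step written in Python could talk to the control socket using only the standard library, roughly as sketched below. This is an illustration under stated assumptions rather than a reference client: the request path `/` and the use of `POST` are guesses (the actual HTTP details are left to RFC #5), and `UnixHTTPConnection`, `build_message`, and `send_message` are names invented for the example.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that connects to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=10.0):
        # The host name is only used for the Host header.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def build_message(message_type, **fields):
    """Build a control-socket message such as SetPipelineStatus."""
    return {"type": message_type, **fields}


def send_message(ipc_path, message, request_path="/"):
    """Send one JSON message and wait for the agent's JSON response.

    Assumption: messages are POSTed to `request_path`; see RFC #5 for
    the authoritative description of the HTTP interface.
    """
    conn = UnixHTTPConnection(ipc_path)
    try:
        conn.request("POST", request_path, body=json.dumps(message),
                     headers={"Content-Type": "application/json"})
        reply = json.loads(conn.getresponse().read())
    finally:
        conn.close()
    return reply
```

A step would take `ipc_path` from the `ipc` key of its invocation file and call, for example, `send_message(ipc_path, build_message("TerminatePipeline"))`, proceeding only once the response has been returned.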

Manage pipeline status

The full set of pipeline statuses is the subject of another document, but for example purposes assume there are four: Success, Failure, Unstable, and Aborted.

Note

If the step exits with a non-zero status, the pipeline status is automatically set to Failure. See the Status enum in the agent code for a mapping of exit codes to status.

Change the status
{
    "type" : "SetPipelineStatus",
    "status" : "Unstable"
}
Terminate the pipeline
{
    "type" : "TerminatePipeline"
}

Variables

A step captures a variable by sending a single message to the agent over the control socket.

Example step capturing a variable
prompt msg: 'What color should the bike shed be?', into: 'color'
Variable capture message
{
    "type" : "CaptureVariable",
    "name" : "color",
    "value" : "blue"
}

These can then be accessed in the steps remaining in the scope (e.g. a stage) via a special environment variable: VAR_COLOR

Storing a new variable should replace it, but a drop step should also exist, e.g.:

drop name: 'color'
Drop variable message
{
    "type" : "DropVariable",
    "name" : "color"
}
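
On the agent side, the capture, replace, and drop behavior could be kept in a small in-memory store like the one sketched below. This is a hypothetical illustration: the `VariableStore` class, keying variables by a pipeline identifier rather than by a narrower scope such as a stage, and deriving the environment variable name by upper-casing the variable name (as `color` becomes `VAR_COLOR` above) are assumptions, not part of this specification.

```python
class VariableStore:
    """Per-pipeline variable storage, as an agent might keep it in memory."""

    def __init__(self):
        # pipeline id -> {variable name -> value}
        self._pipelines = {}

    def handle(self, pipeline_id, message):
        """Apply a CaptureVariable or DropVariable message."""
        variables = self._pipelines.setdefault(pipeline_id, {})
        if message["type"] == "CaptureVariable":
            # Capturing an existing name replaces its value.
            variables[message["name"]] = message["value"]
        elif message["type"] == "DropVariable":
            # Dropping an unknown name is treated as a no-op (assumption).
            variables.pop(message["name"], None)
        else:
            raise ValueError(f"unsupported message type: {message['type']}")
        return {"type": "Received"}

    def environment(self, pipeline_id):
        """Render the pipeline's variables as VAR_<NAME> environment entries."""
        variables = self._pipelines.get(pipeline_id, {})
        return {f"VAR_{name.upper()}": str(value)
                for name, value in variables.items()}


store = VariableStore()
store.handle("pipeline-a", {"type": "CaptureVariable", "name": "color", "value": "blue"})
print(store.environment("pipeline-a"))
store.handle("pipeline-a", {"type": "DropVariable", "name": "color"})
print(store.environment("pipeline-a"))
```

Keeping a separate mapping per pipeline means a variable captured in one pipeline never appears in the environment of another.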

Motivation

Otto requires a means of defining pipeline behavior in a customizable fashion. This includes a standard set of steps which address common user needs, along with a pattern to allow for user-defined steps.

Reasoning

The approach defined in this document intends to address some of the goals defined at the outset of the Otto project, discussed below.

Safe extensibility

Extensibility must not come at the expense of system integrity. Systems which allow for administrator- or user-injected code at runtime cannot avoid system reliability and security problems. Extensibility is an important characteristic to support, but secondary to system integrity.

The process boundaries required by the design of step libraries are the key feature of this design. The extensibility of the system is inherently process-based, which allows for steps to do numerous things which are not currently known or defined.

User-defined extension

Usage cannot grow across an organization without user-defined extension. The operators of the system will not be able to provide for every eventual requirement from users. Some mechanism for extending or consolidating aspects of a continuous delivery process must exist.

Implied, but not defined by this document, is the notion of a "standard library" of steps to cover common cases which many users will have in most of their pipelines. That said, because step libraries are "just tarballs", a user should be able to bring their own step libraries and in some cases even override the step libraries defined by the standard library.

Because the step libraries are dumb tarballs, they also don't require any specific platform support for users to bring their own implementations of steps. An entrypoint in the Manifest file is nothing more than a file executable by the agent. This allows users to write their own custom steps in Bash, Ruby, Python, Java, C#, etc. So long as the process can read the Invocation file and emit the appropriate outputs, any number of steps implemented in different languages should have no problem co-existing within the same pipeline.

Avoid mixing execution contexts

Mixing of management and execution contexts causes a myriad of issues. Many tools allow the management/orchestrator process to run user-defined workloads. This allows breaches of isolation between user-defined workloads and administrator configuration and data.

The step library approach pushes execution of user-defined code solely to the agent which is executing the pipeline. In many cases this will be a machine with few privileges or an ephemeral cloud/container instance. As such there is zero execution of user-defined workloads in the locations where system execution occurs, such as in the parser, orchestrator, or other services.

Deterministic runtime behavior

Non-deterministic runtime behavior adds instability. Without being able to "explain" a set of operations which should occur before runtime, it is impossible to determine whether or not a given delivery pipeline is correctly constructed.

Execution of steps from a step library can and should be done without an "interpreter." That is to say, the modeling language which sits on top of the step libraries doesn't need to be Turing-complete [1] and can be "explained" prior to execution.

This approach should open the door to future enhancements which can perform a type of static analysis on pipelines to find redundant steps or performance bottlenecks that can be improved upon at a later date.

Backwards Compatibility

Since Otto has no previous step subsystem, there are no backwards compatibility concerns.

Security

Variables

Variables must be stored within the agent per-pipeline, such that pipelines cannot "pollute" the variable namespace of other pipelines. Current test implementations require a single agent invocation per pipeline invocation, so it's not yet possible to even have an agent run multiple successive or concurrent pipelines. Should that ever become the case, it is expected that variables will be associated with the pipeline which declares them.

Credentials

The retrieval and use of credentials by steps is outside the scope of this document, and is at this time still under active deliberation.

Testing

Testing of step libraries is covered in the main source repository. This includes tests of the steps themselves, which generally use shunit2.

Prototype Implementation

The prototype implementation is found in the main branch.

References

</html>