Models subcommands

Publish Machine Learning models

The publish subcommand exports local machine learning models to an Arkindex instance. These models used to be stored in Git repositories; a new storage system is now being implemented on Arkindex, and the publish subcommand is a temporary tool for migrating existing models to it.

Usage

arkindex models publish

This command reads the .arkindex.yml configuration file from the directory where it is launched. Multiple models may be declared, provided they follow the convention described below.

To make the connection to the Arkindex instance easier, you may set the ARKINDEX_API_URL and ARKINDEX_API_TOKEN variables in your environment. This is recommended when calling this command in Docker containers.
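
As an illustration of the environment-based setup, here is a minimal Python sketch that sets the two variables and then runs the documented command. The helper names and the example URL are hypothetical, not part of the Arkindex CLI. The same result can be obtained by exporting the variables in a shell or in a container definition.

```python
import os
import subprocess


def publish_environment(api_url, api_token, base_env=None):
    """Return a copy of the environment with the two Arkindex variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["ARKINDEX_API_URL"] = api_url
    env["ARKINDEX_API_TOKEN"] = api_token
    return env


def run_publish(api_url, api_token, cwd="."):
    """Run `arkindex models publish` from the folder holding .arkindex.yml."""
    env = publish_environment(api_url, api_token)
    # check=True raises CalledProcessError if the command exits with an error.
    return subprocess.run(["arkindex", "models", "publish"], env=env, cwd=cwd, check=True)


# Hypothetical call, with placeholder values:
# run_publish("https://arkindex.example.com", "my-secret-token")
```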

YAML configuration

The configuration file is always named .arkindex.yml and should be found at the root of the repository.

Required attributes

The following attributes are required in every .arkindex.yml file:

version : Version of the configuration file in use. An error will occur if the version number is not set to 2.
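
The version requirement can be pictured as a simple check on the parsed file. The sketch below assumes the YAML has already been loaded into a dictionary. The function name and the exception type are illustrative assumptions and do not describe the actual implementation.

```python
SUPPORTED_VERSION = 2


def check_version(config):
    """Raise ValueError unless the configuration declares version 2."""
    version = config.get("version")
    if version != SUPPORTED_VERSION:
        raise ValueError(
            f"Unsupported .arkindex.yml version: {version!r} "
            f"(expected {SUPPORTED_VERSION})"
        )
    return version
```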

Example configuration
---
version: 2
models:
  - models/config.yml

This would match models/config.yml starting at the root of the repository.

Model repository attributes

The models attribute is a list of the following:

  • Paths to a YAML file holding the configuration for a single model
  • Unix-style patterns matching paths to YAML files holding the configuration for a single model
  • The configuration of a single model embedded directly into the file
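
One way to picture how these three kinds of entries could be told apart is the following sketch. It assumes the models list has already been parsed from YAML and that patterns are expanded relative to the repository root with Python's glob module. The function name is hypothetical, and the actual tool may resolve entries differently.

```python
import glob
from pathlib import Path


def resolve_model_entries(entries, root):
    """Split `models` entries into embedded configurations and YAML file paths."""
    embedded, files = [], []
    for entry in entries:
        if isinstance(entry, dict):
            # A mapping is a model configuration embedded in .arkindex.yml.
            embedded.append(entry)
        else:
            # A string is a path or a Unix-style pattern; with recursive=True,
            # `**` also matches files in sub-directories.
            matches = sorted(glob.glob(str(Path(root) / entry), recursive=True))
            files.extend(Path(match) for match in matches)
    return embedded, files
```

A plain path that matches no existing file simply yields no result in this sketch; whether the real command reports that as an error is not covered here.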

Single model configuration

The following describes the attributes of a YAML file configuring one model, or of the configuration embedded directly in the .arkindex.yml file.

All attributes are optional unless explicitly specified.

name : Mandatory. Name of the model, used for display and uniqueness purposes. To publish a new version of an existing model, the name must be exactly the same as the existing one.

path : Mandatory. Path to the folder containing the model. The contents of this folder are compressed into an archive, placed at the archive's root, and uploaded to the Arkindex storage system.

description : Path to a file containing the model version's description.

tag : Model version's tag. If no tag is provided, the publication command will use the content of the CI_COMMIT_TAG environment variable.

parent : UUID of the model version's parent.

configuration : Mapping of string keys to values that can later be accessed in the worker's Python code.
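
The attribute rules above can be summarized in a short sketch covering three behaviours: the mandatory name and path, the tag falling back to CI_COMMIT_TAG, and the folder contents being stored at the root of the archive. All function names are hypothetical, and the gzip-compressed tar format is an assumption, since the archive format is not specified here.

```python
import os
import tarfile
import uuid
from pathlib import Path


def check_model_config(model):
    """Return a list of problems found in one model configuration mapping."""
    problems = []
    for key in ("name", "path"):
        if not model.get(key):
            problems.append(f"missing mandatory attribute: {key}")
    parent = model.get("parent")
    if parent is not None:
        try:
            uuid.UUID(str(parent))
        except ValueError:
            problems.append("parent is not a valid UUID")
    return problems


def resolve_tag(model, environ=None):
    """Use the configured tag, else the CI_COMMIT_TAG environment variable."""
    environ = os.environ if environ is None else environ
    return model.get("tag") or environ.get("CI_COMMIT_TAG")


def archive_model_folder(model_dir, archive_path):
    """Compress the folder's contents so they sit at the root of the archive."""
    model_dir = Path(model_dir)
    # Assumption: a gzip-compressed tar archive; the real format may differ.
    with tarfile.open(archive_path, "w:gz") as archive:
        for child in sorted(model_dir.iterdir()):
            # arcname drops the parent folder, so entries start at the root.
            archive.add(child, arcname=child.name)
    return Path(archive_path)
```

In this sketch the archive is written outside the model folder, so that it does not end up including itself.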

Example configuration
---
version: 2

models:
  # Path to a single YAML file
  - path/to/model.yml
  # Pattern matching any YAML file in the configuration folder
  # or in its sub-directories
  - configuration/**/*.yml
  # Configuration embedded directly into this file
  - name: Book of hours | Historical
    path: path/to/model
    description: path/to/file.md
    tag: official
    parent: cafecafe-cafe-cafe-cafe-cafecafecafe
    configuration:
      anyKey: anyValue
      classes: [X, Y, Z]