Elements commands

The elements subcommands allow you to perform operations on existing Arkindex elements.

Linking elements

The link subcommand can be used to create relations between parent and children elements, in order to organize your data on Arkindex.

arkindex elements link --parent $PARENT_ELEMENT_ID --child $CHILD_ELEMENT_ID

You can only link elements that belong to the same corpus. The link subcommand only creates new relations between elements; it does not destroy or replace existing ones.

Parent element

You can use two (required, mutually exclusive) arguments when specifying the parent element.

  • --parent: the ID of an existing element on Arkindex.
arkindex elements link --parent $PARENT_ELEMENT_ID --child $CHILD_ELEMENT_ID
  • --create: instead of providing the ID of an Arkindex element, create that element and link the child element(s) to the newly created element. You will be prompted to enter the ID of the corpus in which to create the parent element (the same corpus the child element(s) belong to, for the link to work), an element type (one that exists in the target corpus) and a name for that new element.
arkindex elements link --create --child $CHILD_ELEMENT_ID

Child element(s)

You can use four (required, mutually exclusive) arguments as child inputs.

  • --child: one or more Arkindex element ID(s).
arkindex elements link --parent $PARENT_ELEMENT_ID --child $ID_1 $ID_2 $ID_3
  • --uuid-list: a path to a text file containing a list of element IDs, with one ID per line.
arkindex elements link --parent $PARENT_ELEMENT_ID --uuid-list $PATH/TO/file.txt
  • --selection: link the elements in your selection on Arkindex to the parent element.
arkindex elements link --parent $PARENT_ELEMENT_ID --selection
  • --stray-pages: link all the page elements that have no parent element, and therefore sit directly at the root of a corpus, to the parent element.
arkindex elements link --parent $PARENT_ELEMENT_ID --stray-pages
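The file expected by --uuid-list can be prepared and sanity-checked before it is passed to the command. The snippet below is a minimal sketch: the two IDs are hypothetical placeholders, and the check assumes that element IDs use the standard 8-4-4-4-12 hexadecimal UUID layout. It only validates the file and does not call the CLI.

```shell
# Hypothetical IDs, for illustration only; replace them with real Arkindex element IDs.
uuid_list="$(mktemp)"
printf '%s\n' \
  "11111111-1111-4111-8111-111111111111" \
  "22222222-2222-4222-8222-222222222222" \
  > "$uuid_list"

# Count the lines that do not look like a UUID (assumed 8-4-4-4-12 hexadecimal layout).
# grep exits with status 1 when the count is 0, hence the "|| true".
invalid_count=$(grep -Evc '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$' "$uuid_list" || true)
echo "invalid lines: $invalid_count"

# With a real parent ID, the file can then be used as documented above:
# arkindex elements link --parent "$PARENT_ELEMENT_ID" --uuid-list "$uuid_list"
```

A non-zero count points to blank lines, stray whitespace or truncated IDs that are worth fixing before running the link.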

Unlinking elements

The unlink subcommand can be used to remove relations between parent and child elements, in order to reorganize your data on Arkindex.

arkindex elements unlink --parent $PARENT_ELEMENT_ID --child $CHILD_ELEMENT_ID

You can only unlink elements that belong to the same corpus. (Only elements belonging to the same corpus can be linked in the first place.)

Parent element

You have to specify the ID of the element from which you want to unlink the child element(s), using the --parent argument.

arkindex elements unlink --parent $PARENT_ELEMENT_ID --child $CHILD_ELEMENT_ID

Child element(s)

You can use three (required, mutually exclusive) arguments as child inputs.

  • --child: one or more Arkindex element ID(s).
arkindex elements unlink --parent $PARENT_ELEMENT_ID --child $ID_1 $ID_2 $ID_3
  • --uuid-list: a path to a text file containing a list of element IDs, with one ID per line.
arkindex elements unlink --parent $PARENT_ELEMENT_ID --uuid-list $PATH/TO/file.txt
  • --selection: unlink the elements in your selection on Arkindex from the parent element.
arkindex elements unlink --parent $PARENT_ELEMENT_ID --selection

Optional arguments

The --orphan flag allows you to unlink an element from the parent even if it does not have any other parent elements, which results in the element ending up directly at the root of the corpus.

arkindex elements unlink --parent $PARENT_ELEMENT_ID --child $CHILD_ELEMENT_ID --orphan

Without the --orphan flag, you cannot unlink an element from its parent if that element has no other parent.
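The rule described above can be modelled as a small decision function. This is only a sketch of the documented behaviour, not the CLI's implementation: can_unlink and its two inputs (the child's current number of parents, and whether --orphan is passed) are hypothetical names introduced for illustration.

```shell
# can_unlink PARENT_COUNT ORPHAN_FLAG
# PARENT_COUNT: number of parents the child currently has (hypothetical input).
# ORPHAN_FLAG: "yes" if --orphan is passed, anything else otherwise.
can_unlink() {
  parent_count=$1
  orphan_flag=$2
  if [ "$parent_count" -gt 1 ] || [ "$orphan_flag" = "yes" ]; then
    echo "allowed"
  else
    echo "refused"
  fi
}

can_unlink 2 no    # the child keeps another parent
can_unlink 1 no    # the child would end up at the root of the corpus
can_unlink 1 yes   # --orphan accepts that outcome
```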

Copying page elements

The page-copy subcommand can be used to copy Page elements to a folder, within or outside of their corpus of origin. It only copies Page elements, without any of their children. Technically, it creates a new Page element from the same Image, with the same name and type, inside another folder.

arkindex elements page-copy --folder $PARENT_ELEMENT_ID --pages $PAGE_ELEMENT_1 $PAGE_ELEMENT_2 $PAGE_ELEMENT_3

You can only copy pages to a folder-type element, using the --folder argument. It is not possible to create page elements at the root of a corpus using this subcommand.

The copied pages can be specified with three (required, mutually exclusive) arguments:

  • --pages: one or more page IDs as input.
arkindex elements page-copy --folder $PARENT_ELEMENT_ID --pages $PAGE_ELEMENT_ID
  • --selection: the pages to be copied are retrieved from your current selection on Arkindex.
arkindex elements page-copy --folder $PARENT_ELEMENT_ID --selection
  • --uuid-list: path to a text file containing a list of page IDs, one ID per line.
arkindex elements page-copy --folder $PARENT_ELEMENT_ID --uuid-list $PATH/TO/file.txt

Rejecting classifications

The reject-classifications subcommand can be used to reject (if the classification was created by a worker) or delete (if the classification was created manually) one or more classification(s) from one or more element(s).

arkindex elements reject-classifications --element $ELEMENT_ID_1 $ELEMENT_ID_2 --classes $ML_CLASS_NAME

Target elements

The elements to reject/remove classifications from can be retrieved using three (required, mutually exclusive) arguments:

  • --element: one or more element IDs.
arkindex elements reject-classifications --element $ELEMENT_ID --classes $ML_CLASS_NAME
  • --selection: the target elements are retrieved from your selection on Arkindex.
arkindex elements reject-classifications --selection --classes $ML_CLASS_NAME
  • --uuid-list: the target elements are retrieved from a text file containing a list of element IDs, with one ID per line.
arkindex elements reject-classifications --uuid-list $PATH/TO/file.txt --classes $ML_CLASS_NAME

Target classifications

The classes to reject/remove can be specified using two (required, mutually exclusive) arguments:

  • --all: all the classifications on the target elements will be rejected/deleted.
arkindex elements reject-classifications --element $ELEMENT_ID --all
  • --classes: specify which classes will be rejected/removed, using their names.
arkindex elements reject-classifications --element $ELEMENT_ID --classes $ML_CLASS_1 $ML_CLASS_2

Creating data splits for machine learning

The ml-splits subcommand can be used to organize data in an Arkindex project into splits to train machine learning models.

Basic usage

arkindex elements ml-splits --project $ARKINDEX_PROJECT_ID

This command will create, in the target project, a "Training dataset" dataset containing three sets: "train", "val" and "test". These sets will contain respectively 80%, 10% and 10% of all the page elements in the target project.
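To get a rough idea of the resulting set sizes, the default 80/10/10 ratios can be applied to an element count by hand. The sketch below is plain arithmetic, not the CLI's code. It assumes integer division, with any remainder assigned to the last set; how the command actually rounds is not specified here. split_sizes is a hypothetical helper name.

```shell
# split_sizes TOTAL: print the number of elements per set for an 80/10/10 split.
# Assumption: ratios are applied with integer division and the remainder goes to "test".
split_sizes() {
  total=$1
  train=$(( total * 80 / 100 ))
  val=$(( total * 10 / 100 ))
  test_count=$(( total - train - val ))
  echo "train=$train val=$val test=$test_count"
}

split_sizes 200
```

For a project with 200 page elements, this gives 160 elements in "train" and 20 each in "val" and "test".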

Required arguments

For the command to run, either the --project or the --folder argument has to be set. With a project UUID, the "Training dataset" dataset contains elements from this project regardless of any parent folders. With a parent element (folder) UUID, the "Training dataset" dataset only contains elements that are children of this parent.

Optional arguments

The ml-splits command can take a number of optional arguments.

  • --element-type: only use elements of one or more given type(s) to create your training dataset. Defaults to page.
arkindex elements ml-splits --project $ARKINDEX_PROJECT_ID --element-type $ELEMENT_TYPE_1 $ELEMENT_TYPE_2
  • --recursive: when using elements from a parent element (with the --folder argument), adding the --recursive option lists those elements recursively, instead of only retrieving those that are direct children of the parent element.
arkindex elements ml-splits --folder $ARKINDEX_ELEMENT_ID --recursive --element-type $ELEMENT_TYPE
  • --set: this argument defines the sets and the ratio of the data assigned to each of them, in the form name:ratio. Each ratio has to be greater than 0 and less than 1, and the ratios must sum to 1. Defaults to train:0.8 val:0.1 test:0.1.
arkindex elements ml-splits --folder $ARKINDEX_ELEMENT_ID --set train:0.7 val:0.1 test:0.2
  • --nb-elements: you can use this argument to limit the number of elements used to create the training dataset. If not set, the training dataset will contain all the retrieved elements of the given type and from the given parent/project.
arkindex elements ml-splits --folder $ARKINDEX_ELEMENT_ID --recursive --nb-elements 200
  • --dataset-name: set the name of the training dataset which will be created in the target project and contain the sets (train, val and test by default). Defaults to Training dataset.
arkindex elements ml-splits --folder $ARKINDEX_ELEMENT_ID --dataset-name $DATASET_NAME
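The constraints on --set values (each ratio strictly between 0 and 1, all ratios summing to 1) can be checked before launching the command. The helper below is a minimal sketch: check_sets is a hypothetical name, and the small tolerance on the sum is an assumption made to absorb floating-point rounding, not something the CLI is known to do.

```shell
# check_sets NAME:RATIO...
# Prints "ok" if every ratio is strictly between 0 and 1 and the ratios sum to 1,
# otherwise prints "invalid" and returns a non-zero status.
check_sets() {
  printf '%s\n' "$@" | awk -F: '
    {
      ratio = $2 + 0
      if (ratio <= 0 || ratio >= 1) bad = 1
      total += ratio
    }
    END {
      diff = total - 1
      if (diff < 0) diff = -diff
      # 1e-9 is an assumed tolerance for floating-point rounding.
      if (bad || diff > 1e-9) { print "invalid"; exit 1 }
      print "ok"
    }'
}

check_sets train:0.7 val:0.1 test:0.2
```

The same name:ratio values can then be passed unchanged to the --set argument.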