Saving workflows

To save a workflow, call the WorkflowGenerator.save() method:

wf.save('workflow.cwl')

By default, the paths in the run field of workflow steps are absolute. This means that a workflow created on one machine generally cannot be run on another machine, because the steps are unlikely to be at the same absolute paths there. However, there are several options for creating portable workflows.

Saving workflows with relative paths

To get relative paths in the run field of workflow steps, use relative=True:

wf.save('workflow.cwl', relative=True)

The paths in the run field are relative to the location where the workflow is saved. This option is convenient when all the steps you use to build workflows are stored in a single directory.
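
To illustrate what "relative to where the workflow is saved" means, the following sketch computes such a path with the standard library. It is not scriptcwl's implementation; the helper name relative_run_path and the example paths are hypothetical.

```python
import os


def relative_run_path(step_path, workflow_path):
    """Return step_path expressed relative to the directory of workflow_path.

    Hypothetical helper for illustration; not part of scriptcwl.
    """
    workflow_dir = os.path.dirname(os.path.abspath(workflow_path))
    return os.path.relpath(os.path.abspath(step_path), start=workflow_dir)


# The step lives in a subdirectory next to the saved workflow.
run_path = relative_run_path('/project/steps/add.cwl', '/project/workflow.cwl')
print(run_path)
```

Because the result depends only on the relative layout of the two files, moving the whole project directory to another machine keeps the path valid.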

Using a working directory

If you have multiple directories containing workflow steps, and the locations of these directories may differ depending on where software is installed (for example, if you want to use the generic NLP steps from nlppln, but also need project-specific data processing steps), you can specify a working directory when creating the WorkflowGenerator object. If you do this, all steps are copied to the working directory. When you save the workflow using wd=True, the paths in the run fields are set to the basename of the step (because all steps are in the same directory).

from scriptcwl import WorkflowGenerator

with WorkflowGenerator(working_dir='path/to/working_dir') as wf:
  wf.load(steps_dir='some/path/')
  wf.load(steps_dir='some/other/path/')

  # add inputs, steps and outputs

  wf.save('workflow', wd=True)

The workflow is saved in the working directory and then copied to the specified location. To be able to run the workflow, use the copy in the working directory (please note that the working directory is not deleted automatically).

Note that steps loaded from URLs are not copied to the working directory.
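
The idea behind the working directory can be sketched with the standard library alone: steps from several directories are copied into one place, after which a bare file name is enough to refer to each of them. This is an illustration only, not scriptcwl's code; the helper copy_steps_to_working_dir and the temporary directories are hypothetical.

```python
import os
import shutil
import tempfile


def copy_steps_to_working_dir(step_paths, working_dir):
    """Copy step files into working_dir and return their basenames.

    Hypothetical helper for illustration; not part of scriptcwl.
    """
    run_fields = []
    for path in step_paths:
        shutil.copy(path, working_dir)
        run_fields.append(os.path.basename(path))
    return run_fields


# Create two step files in separate directories, then gather them.
dir_a = tempfile.mkdtemp()
dir_b = tempfile.mkdtemp()
working_dir = tempfile.mkdtemp()
steps = [os.path.join(dir_a, 'add.cwl'), os.path.join(dir_b, 'multiply.cwl')]
for step in steps:
    with open(step, 'w') as f:
        f.write('cwlVersion: v1.0\n')

run_fields = copy_steps_to_working_dir(steps, working_dir)
print(run_fields)
```

As in the text above, the temporary directories created by this sketch are not removed automatically.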

Embedding steps in the workflow

It is also possible to embed the workflow steps in the workflow. Workflows with embedded steps do not have paths in the run fields and can therefore be run on any machine. To do this, use the inline=True option:

wf.save('workflow.cwl', inline=True)

(This is similar to the --pack option of cwltool, but the result is slightly more human readable.)

Please note that embedding CommandLineTools always works as expected, but embedding subworkflows is more complicated: naming conflicts arise if you include a subworkflow more than once. If you want a stand-alone version of a workflow with subworkflows, we recommend packing the workflow (see Pack workflows).

With inline set to True, the example workflow looks like:

#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: Workflow
inputs:
  num1: int
  num2: int
outputs:
  final_answer:
    type: int
    outputSource: multiply/answer
steps:
  add:
    run:
      cwlVersion: v1.0
      class: CommandLineTool
      baseCommand: [python, -m, scriptcwl.examples.add]

      inputs:
      - type: int
        inputBinding:
          position: 1
        id: _:add.cwl#x
      - type: int
        inputBinding:
          position: 2

        id: _:add.cwl#y
      stdout: cwl.output.json

      outputs:
      - type: int
        id: _:add.cwl#answer
      id: _:add.cwl
    in:
      y: num2
      x: num1
    out:
    - answer
  multiply:
    run:
      cwlVersion: v1.0
      class: CommandLineTool
      baseCommand: [python, -m, scriptcwl.examples.multiply]

      inputs:
      - type: int
        inputBinding:
          position: 1
        id: _:multiply.cwl#x
      - type: int
        inputBinding:
          position: 2

        id: _:multiply.cwl#y
      stdout: cwl.output.json

      outputs:
      - type: int
        id: _:multiply.cwl#answer
      id: _:multiply.cwl
    in:
      y: num2
      x: add/answer
    out:
    - answer
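
One way to check that a saved workflow no longer depends on step files is to confirm that every run field holds an embedded process rather than a path. The sketch below does this for an abbreviated version of the example, written directly as a Python dict to avoid needing a YAML parser. The helper steps_with_paths is hypothetical and not part of scriptcwl.

```python
# Abbreviated version of the inline example, written as a Python dict.
workflow = {
    'class': 'Workflow',
    'steps': {
        'add': {'run': {'class': 'CommandLineTool', 'id': '_:add.cwl'}},
        'multiply': {'run': {'class': 'CommandLineTool', 'id': '_:multiply.cwl'}},
    },
}


def steps_with_paths(workflow):
    """Return the names of steps whose run field is a string (a path or URL)
    instead of an embedded process description.

    Hypothetical helper for illustration; not part of scriptcwl.
    """
    return sorted(name for name, step in workflow['steps'].items()
                  if not isinstance(step['run'], dict))


# An empty list means every step is embedded.
print(steps_with_paths(workflow))
```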

Pack workflows

Another way to create a workflow with all steps in one file is to save it with pack=True:

wf.save('workflow.cwl', pack=True)

Please note that packed workflows cannot be used as a building block in scriptcwl. If you try to load a packed workflow, you will get a warning.

With pack set to True, the example workflow looks like:

{
    "cwlVersion": "v1.0",
    "$graph": [
        {
            "class": "CommandLineTool",
            "baseCommand": [
                "python",
                "-m",
                "scriptcwl.examples.add"
            ],
            "inputs": [
                {
                    "type": "int",
                    "inputBinding": {
                        "position": 1
                    },
                    "id": "#add.cwl/x"
                },
                {
                    "type": "int",
                    "inputBinding": {
                        "position": 2
                    },
                    "id": "#add.cwl/y"
                }
            ],
            "stdout": "cwl.output.json",
            "outputs": [
                {
                    "type": "int",
                    "id": "#add.cwl/answer"
                }
            ],
            "id": "#add.cwl"
        },
        {
            "class": "CommandLineTool",
            "baseCommand": [
                "python",
                "-m",
                "scriptcwl.examples.multiply"
            ],
            "inputs": [
                {
                    "type": "int",
                    "inputBinding": {
                        "position": 1
                    },
                    "id": "#multiply.cwl/x"
                },
                {
                    "type": "int",
                    "inputBinding": {
                        "position": 2
                    },
                    "id": "#multiply.cwl/y"
                }
            ],
            "stdout": "cwl.output.json",
            "outputs": [
                {
                    "type": "int",
                    "id": "#multiply.cwl/answer"
                }
            ],
            "id": "#multiply.cwl"
        },
        {
            "class": "Workflow",
            "inputs": [
                {
                    "type": "int",
                    "id": "#main/num1"
                },
                {
                    "type": "int",
                    "id": "#main/num2"
                }
            ],
            "outputs": [
                {
                    "type": "int",
                    "outputSource": "#main/multiply-1/answer",
                    "id": "#main/final_answer"
                }
            ],
            "steps": [
                {
                    "run": "#add.cwl",
                    "in": [
                        {
                            "source": "#main/num1",
                            "id": "#main/add-1/x"
                        },
                        {
                            "source": "#main/num2",
                            "id": "#main/add-1/y"
                        }
                    ],
                    "out": [
                        "#main/add-1/answer"
                    ],
                    "id": "#main/add-1"
                },
                {
                    "run": "#multiply.cwl",
                    "in": [
                        {
                            "source": "#main/add-1/answer",
                            "id": "#main/multiply-1/x"
                        },
                        {
                            "source": "#main/num2",
                            "id": "#main/multiply-1/y"
                        }
                    ],
                    "out": [
                        "#main/multiply-1/answer"
                    ],
                    "id": "#main/multiply-1"
                }
            ],
            "id": "#main"
        }
    ]
}
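
In the packed example above, each step's run field is a reference such as "#add.cwl" that points to another entry of the "$graph" list in the same file. The following sketch resolves these references with the standard json module. It uses an abbreviated copy of the example that keeps only the fields needed here, and the variable names are illustrative.

```python
import json

# Abbreviated version of the packed example: only the fields used below.
packed = json.loads('''
{
    "cwlVersion": "v1.0",
    "$graph": [
        {"class": "CommandLineTool", "id": "#add.cwl"},
        {"class": "CommandLineTool", "id": "#multiply.cwl"},
        {"class": "Workflow", "id": "#main",
         "steps": [
             {"run": "#add.cwl", "id": "#main/add-1"},
             {"run": "#multiply.cwl", "id": "#main/multiply-1"}
         ]}
    ]
}
''')

# Index every process in the graph by its id.
processes = {entry['id']: entry for entry in packed['$graph']}

# Look up the process each step of the main workflow refers to.
main = processes['#main']
resolved = {step['id']: processes[step['run']]['class']
            for step in main['steps']}
print(resolved)
```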

Workflow validation

Before the workflow is saved, it is validated using cwltool. Validation can also be triggered manually:

wf.validate()

It is also possible to disable workflow validation on save:

wf.save('workflow.cwl', validate=False)

File encoding

By default, the encoding used to save workflows is utf-8. If necessary, a different encoding can be specified:

wf.save('workflow.cwl', encoding='utf-16')
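
A file saved with a non-default encoding has to be read back with that same encoding. The sketch below shows this with the standard library only. The file it writes is a stand-in for a saved workflow, not actual scriptcwl output, and the path is a temporary one chosen for illustration.

```python
import io
import os
import tempfile

# Stand-in for a file written with encoding='utf-16'.
path = os.path.join(tempfile.mkdtemp(), 'workflow.cwl')
content = u'#!/usr/bin/env cwl-runner\ncwlVersion: v1.0\nclass: Workflow\n'
with io.open(path, 'w', encoding='utf-16') as f:
    f.write(content)

# Reading the file back requires the same encoding.
with io.open(path, encoding='utf-16') as f:
    read_back = f.read()

print(read_back == content)
```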