# graphrag/examples/interdependent_workflows/pipeline.yml

workflows:
  - name: aggregate_workflow
    steps:
      - verb: "aggregate"  # https://github.com/microsoft/datashaper/blob/main/python/datashaper/datashaper/verbs/aggregate.py
        args:
          groupby: "type"
          column: "col_multiplied"
          to: "aggregated_output"
          operation: "sum"
        input:
          source: "workflow:derive_workflow"  # depends on derive_workflow, which must run before this workflow
          # Note: the workflows are declared out of order; the indexing engine determines the correct execution order.

  - name: derive_workflow
    steps:
      - verb: "derive" # https://github.com/microsoft/datashaper/blob/main/python/datashaper/datashaper/verbs/derive.py
        args:
          column1: "col1"  # first operand column, taken from the source dataset
          column2: "col2"  # second operand column, taken from the source dataset
          to: "col_multiplied"  # name of the new column
          operator: "*"  # multiply the two columns
        # This workflow acts on the source dataset directly, so no explicit input is needed.
        # "input": { "source": "source" }  # uses the dataset as the input to this verb; this is the default, so it can be omitted.
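
# A minimal, hypothetical sketch (not part of the original example): a third
# workflow could chain off aggregate_workflow using the same "workflow:<name>"
# input reference. The name "select_workflow" is an assumption, as is the use
# of the datashaper "select" verb with a "columns" argument. It is left
# commented out so the pipeline above runs unchanged.
#
#   - name: select_workflow
#     steps:
#       - verb: "select"
#         args:
#           columns: ["type", "aggregated_output"]  # keep only the group key and the aggregated sum
#         input:
#           source: "workflow:aggregate_workflow"  # would run after aggregate_workflow, which in turn runs after derive_workflow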