Vector Strategies

This folder contains the default vector pipeline used by AIKPs that publish vector data through OCLI:

makegeojson.py → convert vector input into a styled GeoJSON + create a DB-facing metadata document
upload.py → upload the GeoJSON payload + DB metadata to Cloud Object Storage (COS)
publish.py → register (publish) the metadata document into the docs database

These strategies are typically called by an AIKP template (for example ocli.aikp.vector_basic.template.Template).

High-level flow

ai makegeojson (vector)
- Reads the vector source.
- Writes a processed GeoJSON for the UI.
- Writes a companion *_db.geojson document that points at the COS object.
ai upload
- Uploads both the processed GeoJSON and the *_db.geojson document to COS.
ai publish post
- Publishes the *_db.geojson document to the docs DB so it becomes visible in the UI.

The three scripts rely on consistent file naming and task/recipe fields (described below).

`makegeojson.py` (`VectorGeoJsonStrategy`)

Purpose: turn a vector dataset into:

a styled GeoJSON file (the “payload”)
a metadata GeoJSON document (the “DB doc”) that references the payload in COS

Inputs

From task.config:

file_path (optional): path to the vector input
output_file_name (optional): output GeoJSON filename; defaults to <input_stem>.geojson
group_by_column (optional): attribute to color by; defaults to index
colormap (optional): matplotlib colormap name; defaults to viridis
alpha (optional): alpha channel used in rgba(...); defaults to 128

From task.get_recipe():

OUTDIR: where outputs are written (this should typically be the task AI results folder)
COS.bucket, COS.ResultKey, COS.endpoint: used to populate the DB doc
GeoJSON.static.url: a string template used to build properties.source.url

From Click context (ai makegeojson populates these in ctx.meta):

--cos-key: overrides the recipe COS key; if it starts with +, it is treated as a suffix
--friendly-name: overrides the friendly name; if it starts with +, it is treated as a suffix
--print / --less: pretty-print the generated DB doc
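The override-versus-suffix rule for --cos-key and --friendly-name can be sketched as below. This is an illustrative sketch, not the strategy's actual code: the helper name resolve_override is hypothetical, and it assumes the text after + is appended directly to the recipe value with no separator.

```python
def resolve_override(default, override):
    """Return the effective value for a recipe default and an optional CLI override.

    Hypothetical helper: assumes a leading "+" means "append the rest to the default".
    """
    if not override:
        # No CLI value given: keep the recipe value.
        return default
    if override.startswith("+"):
        # Suffix mode: drop the "+" and append the remainder.
        return default + override[1:]
    # Plain value: replace the recipe value entirely.
    return override


print(resolve_override("results/roads", None))
print(resolve_override("results/roads", "+_v2"))
print(resolve_override("results/roads", "other/key"))
```

The same helper would cover both options, since the rule described above is identical for the COS key and the friendly name.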

What it produces

Processed GeoJSON written to OUTDIR:

File name: output_file_name or <input_stem>.geojson
Content: original features plus two styling fields:
- fill: rgba(r, g, b, a)
- stroke: same as fill
Note: upload.py/publish.py currently derive expected filenames from task.config.file_path (the <input_stem>). If you set output_file_name to a different stem, the built-in upload/publish steps will not find the files unless you also adjust those steps or keep the default naming.

Color assignment is computed by:

selecting group_by_column (default index)
assigning each unique group a color from the chosen matplotlib colormap
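A rough sketch of the color assignment, under stated assumptions: the function name assign_group_colors is hypothetical, groups are assumed to be spread evenly across the colormap, and a simple gray ramp stands in for the matplotlib colormap so the example has no dependencies. With matplotlib installed, matplotlib.colormaps["viridis"] could be passed in its place, since it is also callable with a value in [0, 1].

```python
def gray_ramp(t):
    """Stand-in colormap: maps t in [0, 1] to an (r, g, b) tuple of floats in [0, 1]."""
    return (t, t, t)


def assign_group_colors(groups, colormap, alpha=128):
    """Map each unique group value to an 'rgba(r, g, b, a)' string.

    Assumption: unique groups are sampled at evenly spaced colormap positions.
    """
    unique = list(dict.fromkeys(groups))  # unique values, first-seen order
    n = len(unique)
    colors = {}
    for i, group in enumerate(unique):
        t = i / (n - 1) if n > 1 else 0.0
        # Slice to three channels so RGBA-returning colormaps also work.
        r, g, b = colormap(t)[:3]
        colors[group] = f"rgba({round(r * 255)}, {round(g * 255)}, {round(b * 255)}, {alpha})"
    return colors


colors = assign_group_colors(["road", "river", "road", "rail"], gray_ramp)
for group, rgba in colors.items():
    print(group, rgba)
```

Each feature would then receive the color of its group as both fill and stroke.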

DB metadata document written to OUTDIR:

File name: <stem>_db.geojson
Content: a GeoJSON Feature with:
- properties.ResultKey, properties.Bucket, properties.Endpoint
- properties.friendly_name
- properties.source.url built from GeoJSON.static.url and the bucket + result key
- geometry set to the dataset bounding box
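The shape of the DB doc can be illustrated with a small sketch. Everything specific here is an assumption: the helper name build_db_doc, the {bucket} and {key} placeholders in the URL template, the Polygon encoding of the bounding box, and all sample values are hypothetical. Only the property names come from the list above.

```python
import json


def build_db_doc(bucket, result_key, endpoint, friendly_name, url_template, bbox):
    """Build a GeoJSON Feature resembling the DB doc described above (hypothetical helper)."""
    minx, miny, maxx, maxy = bbox
    # Closed ring around the bounding box (first point repeated at the end).
    ring = [[minx, miny], [maxx, miny], [maxx, maxy], [minx, maxy], [minx, miny]]
    return {
        "type": "Feature",
        "properties": {
            "ResultKey": result_key,
            "Bucket": bucket,
            "Endpoint": endpoint,
            "friendly_name": friendly_name,
            # Assumed placeholder names; the real GeoJSON.static.url format may differ.
            "source": {"url": url_template.format(bucket=bucket, key=result_key)},
        },
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }


doc = build_db_doc(
    bucket="example-bucket",
    result_key="vectors/roads",
    endpoint="https://cos.example.com",
    friendly_name="Roads",
    url_template="https://static.example.com/{bucket}/{key}",
    bbox=(10.0, 50.0, 11.0, 51.0),
)
print(json.dumps(doc, indent=2))
```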

Source discovery behavior

If task.config.file_path is not set, the strategy searches the task AI results directory for the first match in:

**/*.shp
**/*.kml
**/*.geojson
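A minimal sketch of this lookup using pathlib. It assumes the patterns are tried in the order listed and that matches within one pattern are sorted for a deterministic result; the function name find_vector_source is hypothetical.

```python
from pathlib import Path

# Patterns in the order listed above.
PATTERNS = ("**/*.shp", "**/*.kml", "**/*.geojson")


def find_vector_source(results_dir):
    """Return the first matching vector file under results_dir, or None if there is none."""
    root = Path(results_dir)
    for pattern in PATTERNS:
        # "**" also matches files directly inside results_dir.
        matches = sorted(root.glob(pattern))
        if matches:
            return matches[0]
    return None
```

Under these assumptions a .shp file anywhere in the tree wins over a .kml or .geojson file, so setting file_path explicitly is the safer choice when the folder holds more than one vector file.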

`upload.py` (`VectorUploadStrategy`)

Purpose: upload the processed GeoJSON and the DB doc to COS.

Inputs

From task.config or recipe:

file_path: used only to determine the output stem/name

From Click context (ai upload populates these in ctx.meta):

--dry-run: print what would be uploaded without uploading anything
--cos-key: overrides the upload key

Files it expects

The upload step looks in the task AI results folder for files derived from task.config.file_path:

<input_stem>.geojson
<input_stem>_db.geojson

It does not currently consider task.config.output_file_name when locating the files to upload.
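The naming dependency can be made concrete with a short sketch. The helper expected_files and the sample paths are hypothetical; the point is only that the stem comes from file_path, not from output_file_name.

```python
from pathlib import Path


def expected_files(results_dir, file_path):
    """Return the (payload, DB doc) paths the upload step would look for (hypothetical helper)."""
    stem = Path(file_path).stem  # "roads" for "/data/in/roads.shp"
    root = Path(results_dir)
    return root / f"{stem}.geojson", root / f"{stem}_db.geojson"


payload, db_doc = expected_files("/work/ai_results", "/data/in/roads.shp")
print(payload.name)
print(db_doc.name)

# If makegeojson was run with output_file_name="streets.geojson", the file on
# disk would be streets.geojson, which does not match the expected name.
print(payload.name == "streets.geojson")
```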

It then uploads them as:

COS_KEY
COS_KEY_db.geojson

Where COS_KEY is determined by:

--cos-key (if provided), otherwise
properties.ResultKey read from <input_stem>_db.geojson

--cos-key is passed through string.Template(...).substitute(task.config), so keys can reference task config values like ${id}.
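The key resolution and substitution described above can be sketched as follows. The function names resolve_cos_key and upload_targets and the sample values are hypothetical; string.Template.substitute is the standard-library call named in the text, and it raises KeyError when a placeholder has no matching config entry.

```python
from string import Template


def resolve_cos_key(cli_cos_key, task_config, db_doc):
    """Pick the COS key: the CLI value (with ${...} expanded) or the DB doc's ResultKey."""
    if cli_cos_key:
        # Raises KeyError if the key references a name missing from task_config.
        return Template(cli_cos_key).substitute(task_config)
    return db_doc["properties"]["ResultKey"]


def upload_targets(cos_key):
    """Return the two object keys: the payload and its companion DB doc."""
    return cos_key, f"{cos_key}_db.geojson"


config = {"id": "t42"}
db_doc = {"properties": {"ResultKey": "vectors/roads"}}

print(upload_targets(resolve_cos_key("vectors/${id}/roads", config, db_doc)))
print(upload_targets(resolve_cos_key(None, config, db_doc)))
```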

`publish.py` (`VectorPublishStrategy`)

Purpose: publish the DB doc (*_db.geojson) to the configured docs database.

Inputs

From Click context (set by ai publish post):

--dry-run
--print-json

From recipe:

APP_DOCS_DB: target docs database name/URL (consumed by ocli.cli.publish.publish_json)

File it publishes

The publish step reads the following file from the task AI results folder, with the stem derived from task.config.file_path, and sends it to the docs DB:

<input_stem>_db.geojson

It does not currently consider task.config.output_file_name when locating the DB doc to publish.

Using the strategies in an AIKP template

An AIKP template can call the strategy classes directly:

from ocli.classes.task_template import TaskTemplate
from ocli.ai.vector_strategies.makegeojson import VectorGeoJsonStrategy
from ocli.ai.vector_strategies.upload import VectorUploadStrategy
from ocli.ai.vector_strategies.publish import VectorPublishStrategy

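class Template(TaskTemplate):
    # The keyword arguments mirror the CLI options. They are not forwarded
    # here because, as described in the Inputs sections, the strategies take
    # these options from the Click context (ctx.meta).

    @classmethod
    def makegeojson(cls, task, cos_key=None, friendly_name=None):
        # Writes the processed GeoJSON and the *_db.geojson document.
        VectorGeoJsonStrategy(task).make_geojson()

    @classmethod
    def upload(cls, task, dry_run=False, cos_key=None):
        # Uploads both files to COS.
        VectorUploadStrategy(task).upload()

    @classmethod
    def publish(cls, task, dry_run=False, cos_key=None):
        # Publishes the *_db.geojson document to the docs DB.
        VectorPublishStrategy(task).publish()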
