API Reference¶

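mindoff-dataport has a deliberately small public surface: four functions that mirror the four steps of building a report (template extraction, input discovery, bundle compilation, and bundle export), plus the supporting ReportBundle type and the repeat_records helper. This page documents each one in full. The reference below is generated directly from the docstrings in the source, so it always matches the version you have installed.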

Import Alias¶

The recommended entry point bundles all four functions under one namespace:

from mindoff_dataport import mo_dataport

mo_dataport.extract(...)   # extract_template
mo_dataport.inputs(...)    # get_template_inputs
mo_dataport.compile(...)   # compile_report_bundle
mo_dataport.export(...)    # export_report_bundle

Every function is also importable at the top level under both its full name and a short alias:

from mindoff_dataport import (
    extract_template,        # alias: extract
    get_template_inputs,     # alias: inputs
    compile_report_bundle,   # alias: compile
    export_report_bundle,    # alias: export
)

Functions¶

1. Template Extraction¶

Read an .xlsx template and return a WorkbookSchema.

Extraction captures cell values, styles (font, fill, alignment, borders), column widths and row heights, merged regions, manual print breaks, theme colors, and every {{key:type}} placeholder it finds. The schema is the in-memory blueprint everything downstream is built from.

Usage

from mindoff_dataport import mo_dataport

schema = mo_dataport.extract("invoice_template.xlsx")

# or the explicit name:
from mindoff_dataport import extract_template
schema = extract_template("invoice_template.xlsx")

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `path` | `str` | Yes | Path to the `.xlsx` template file |

Returns: WorkbookSchema — the extracted template blueprint.

2. Input Discovery¶

Inspect a schema and report every input the template expects.

Returns a sheet-scoped dictionary keyed by sheet name, then by placeholder key, with the value being the placeholder type ("string", "number", "date", "dataframe", and so on). Call this before building a payload so you know exactly what compile() will ask for — no guessing, no trial runs.

Usage

contract = mo_dataport.inputs(schema)

# or the explicit name:
from mindoff_dataport import get_template_inputs
contract = get_template_inputs(schema)

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `template` | `WorkbookSchema` | Yes | Schema produced by `extract()` |

Returns: dict[str, dict[str, str | list]] — the per-sheet input contract.

Example output

{
    "Sales Summary": {
        "report_title": "string",
        "generated_on": "date",
        "sales_rows": "dataframe",
    }
}
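As a minimal sketch of how the contract can be used before a payload is built, the snippet below groups placeholders by their declared type so that, for example, every dataframe input can be prepared in one pass. It works on a plain dictionary copied from the example output above; `group_keys_by_type` is a hypothetical helper, not part of mindoff-dataport, and grouping non-string values under `"other"` is an assumption, since this page does not describe them.

```python
from typing import Any

# Contract copied from the example output above.
contract = {
    "Sales Summary": {
        "report_title": "string",
        "generated_on": "date",
        "sales_rows": "dataframe",
    }
}


def group_keys_by_type(
    contract: dict[str, dict[str, Any]],
) -> dict[str, list[tuple[str, str]]]:
    """Group (sheet, key) pairs by placeholder type (hypothetical helper)."""
    grouped: dict[str, list[tuple[str, str]]] = {}
    for sheet, placeholders in contract.items():
        for key, kind in placeholders.items():
            # Assumption: anything that is not a plain type string goes
            # under "other", because its shape is not documented here.
            label = kind if isinstance(kind, str) else "other"
            grouped.setdefault(label, []).append((sheet, key))
    return grouped


by_type = group_keys_by_type(contract)
print(by_type["dataframe"])
```

Grouping by type keeps the expensive inputs (dataframes) visibly separate from the scalar ones when assembling a payload.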

3. Bundle Compilation¶

Bind runtime data to a template and produce a ReportBundle.

Compilation validates the payload against the sheet contract, resolves scalar cells in place, materialises Polars DataFrames / LazyFrames to Parquet, stores compact dataframe anchors and repeat plans, and (optionally) shifts template content out of the way of expanding dataframes. The result is a portable bundle you can export now, or persist on disk and export later from any process.

Usage

bundle = mo_dataport.compile(
    template=schema,
    data=payload,
    bundle_path="out_bundle",      # omit to keep the bundle in memory
    dataframe_options=None,
    dataframe_shift="both",
)

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `template` | `WorkbookSchema` | Yes | Schema from `extract()` |
| `data` | `dict[str, Any]` | Yes | Sheet-scoped payload (see the Data Contract guide) |
| `bundle_path` | `str \| None` | No | Write the bundle to this directory; omit for in-memory only |
| `dataframe_options` | `dict[str, Any] \| None` | No | Per-sheet, per-placeholder column occupation and alignment overrides |
| `dataframe_shift` | `str` | No | How surrounding cells/merges move around dataframe output: `"both"`, `"horizontal"`, `"vertical"`, or `"none"` |

Returns: ReportBundle — the compiled, exportable bundle.

Raises: KeyError if a required placeholder key is missing from the payload.
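One way to surface missing keys before compilation raises is to compare the payload with the input contract first. The sketch below is a hypothetical pre-flight check written against plain dictionaries. It assumes the payload mirrors the contract's sheet-then-key nesting and treats every placeholder in the contract as required, which may be stricter than what `compile()` itself enforces.

```python
from typing import Any


def find_missing_keys(
    contract: dict[str, dict[str, Any]],
    payload: dict[str, Any],
) -> list[tuple[str, str]]:
    """Return (sheet, key) pairs present in the contract but not in the payload.

    Hypothetical helper, not part of mindoff-dataport.
    """
    missing: list[tuple[str, str]] = []
    for sheet, placeholders in contract.items():
        # A sheet that is absent from the payload misses all of its keys.
        sheet_payload = payload.get(sheet, {})
        for key in placeholders:
            if key not in sheet_payload:
                missing.append((sheet, key))
    return missing


# Contract shaped like the Input Discovery example output.
contract = {
    "Sales Summary": {
        "report_title": "string",
        "generated_on": "date",
        "sales_rows": "dataframe",
    }
}

# Illustrative payload that leaves out two inputs.
payload = {"Sales Summary": {"report_title": "Q3 Sales"}}

missing = find_missing_keys(contract, payload)
print(missing)
```

Reporting all missing keys at once can be friendlier than fixing them one `KeyError` at a time.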

4. Bundle Export¶

Render a compiled bundle to a file on disk.

Accepts either an in-memory ReportBundle or a path to a persisted bundle directory, and writes .xlsx or .pdf output. XLSX supports a full-fidelity mode and a low-memory streaming mode; PDF always paginates automatically. Format-specific keyword options (export mode, sizing, fonts, page size, and so on) are passed through **options.

Usage

mo_dataport.export(bundle, "report.xlsx", format="xlsx")
mo_dataport.export("out_bundle", "report.pdf", format="pdf")

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `bundle_or_path` | `ReportBundle \| str` | Yes | - | In-memory bundle or path to a bundle directory |
| `output_path` | `str` | Yes | - | Destination file path (`.xlsx` or `.pdf`) |
| `format` | `str` | No | `"xlsx"` | `"xlsx"` or `"pdf"`. `"image"` is reserved and raises `NotImplementedError` |
| `**options` | - | No | - | Sizing and format-specific options (see the Exporting guides) |

Returns: None for fidelity XLSX and all PDF exports. For streaming XLSX, a list[str]: one workbook path when no split is needed, or a single .zip path when the export is split across multiple workbooks.
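Because the return value differs by export mode, calling code may want to normalise it. The following sketch shows one possible way to do that with plain Python. `files_from_export_result` and `is_split_archive` are hypothetical helpers, and the assumption that a `None` result means the file was written at `output_path` follows from the parameter description above rather than from a documented guarantee.

```python
from pathlib import Path
from typing import Optional


def files_from_export_result(
    result: Optional[list[str]], output_path: str
) -> list[Path]:
    """Normalise an export() return value into a list of paths (hypothetical helper)."""
    if result is None:
        # Fidelity XLSX and PDF exports return None; assume the output
        # was written to the requested destination.
        return [Path(output_path)]
    # Streaming XLSX returns a list of paths.
    return [Path(p) for p in result]


def is_split_archive(paths: list[Path]) -> bool:
    """True when a streaming export was split and delivered as one .zip."""
    return len(paths) == 1 and paths[0].suffix.lower() == ".zip"


# Simulated return values, so the sketch runs without the library.
pdf_files = files_from_export_result(None, "report.pdf")
split_files = files_from_export_result(["report.zip"], "report.xlsx")
print(pdf_files, is_split_archive(split_files))
```

Normalising to a list of paths lets downstream code (uploading, attaching, archiving) treat every export mode the same way.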

Supporting Types¶

`ReportBundle`¶

The canonical intermediate artifact produced by compile() and consumed by export(). It can live in memory or be persisted to a directory and reloaded later:

from mindoff_dataport import ReportBundle, mo_dataport

bundle = ReportBundle.load("saved_bundle")   # reload a persisted bundle
mo_dataport.export(bundle, "report.xlsx")

The on-disk layout is documented in Architecture → The Pipeline.

`repeat_records(...)`¶

A helper that wraps source-backed repeat payloads (an ordered set of scalar record columns plus constant dataframe payloads) so large repeat sections don't have to materialise every record in memory.

from mindoff_dataport import repeat_records

records = repeat_records(scalar_records, constants={"line_items": shared_df})

See the Repeat Sections recipe for context on when to reach for it.
