API Reference
mindoff-dataport has a deliberately small public surface: four functions that mirror the four steps of building a report. This page documents each one in full. The reference below is generated directly from the docstrings in the source, so it always matches the version you have installed.
Import Alias
The recommended entry point bundles all four functions under one namespace:
from mindoff_dataport import mo_dataport
mo_dataport.extract(...) # extract_template
mo_dataport.inputs(...) # get_template_inputs
mo_dataport.compile(...) # compile_report_bundle
mo_dataport.export(...) # export_report_bundle
Every function is also importable at the top level under both its full name and a short alias:
from mindoff_dataport import (
    extract_template,       # alias: extract
    get_template_inputs,    # alias: inputs
    compile_report_bundle,  # alias: compile
    export_report_bundle,   # alias: export
)
Functions
1. Template Extraction
Read an .xlsx template and return a WorkbookSchema.
Extraction captures cell values, styles (font, fill, alignment, borders),
column widths and row heights, merged regions, manual print breaks, theme
colors, and every {{key:type}} placeholder it finds. The schema is the
in-memory blueprint everything downstream is built from.
Usage
from mindoff_dataport import mo_dataport
schema = mo_dataport.extract("invoice_template.xlsx")
# or the explicit name:
from mindoff_dataport import extract_template
schema = extract_template("invoice_template.xlsx")
| Parameter | Type | Required | Description |
|---|---|---|---|
| path | str | Yes | Path to the .xlsx template file |
Returns: WorkbookSchema — the extracted template blueprint.
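To make the {{key:type}} placeholder convention concrete, here is a minimal sketch of how such markers could be located in cell text. It is not the library's extraction code: it scans plain strings rather than an .xlsx file, and the regular expression, the `find_placeholders` helper, and the sample cells are all assumptions made for illustration.

```python
import re

# Assumed grammar: {{key:type}}, where key and type are word characters.
# The real parser may accept a different or richer syntax.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*:\s*(\w+)\s*\}\}")


def find_placeholders(cell_text: str) -> list[tuple[str, str]]:
    """Return (key, type) pairs for every placeholder in one cell's text."""
    return PLACEHOLDER.findall(cell_text)


# Hypothetical cell contents, keyed by cell reference.
cells = {
    "A1": "{{report_title:string}}",
    "B2": "Generated on {{generated_on:date}}",
    "A4": "{{sales_rows:dataframe}}",
    "C9": "Static label",
}

# Keep only the cells that actually contain a placeholder.
found = {}
for ref, text in cells.items():
    pairs = find_placeholders(text)
    if pairs:
        found[ref] = pairs

for ref, pairs in found.items():
    print(ref, pairs)
```

Cells without a placeholder, such as the static label, are simply left out of the result.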
2. Input Discovery
Inspect a schema and report every input the template expects.
Returns a sheet-scoped dictionary keyed by sheet name, then by placeholder
key, with the value being the placeholder type ("string", "number",
"date", "dataframe", and so on). Call this before building a payload so
you know exactly what compile() will ask for — no guessing, no trial runs.
Usage
contract = mo_dataport.inputs(schema)
# or the explicit name:
from mindoff_dataport import get_template_inputs
contract = get_template_inputs(schema)
| Parameter | Type | Required | Description |
|---|---|---|---|
| template | WorkbookSchema | Yes | Schema produced by extract() |
Returns: dict[str, dict[str, str | list]] — the per-sheet input contract.
Example output
{
    "Sales Summary": {
        "report_title": "string",
        "generated_on": "date",
        "sales_rows": "dataframe",
    }
}
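One way to use a contract like this is to turn it into an empty payload skeleton and fill in the blanks. The sketch below works on a plain dictionary shaped like the example output. The `payload_skeleton` helper is hypothetical, and it assumes the payload mirrors the contract's sheet-then-key layout; the Data Contract guide is the authority on the actual payload shape.

```python
def payload_skeleton(contract: dict[str, dict[str, object]]) -> dict[str, dict[str, None]]:
    """Mirror the contract's sheet/key layout with every value set to None."""
    return {
        sheet: {key: None for key in keys}
        for sheet, keys in contract.items()
    }


# Contract copied from the example output above.
contract = {
    "Sales Summary": {
        "report_title": "string",
        "generated_on": "date",
        "sales_rows": "dataframe",
    }
}

skeleton = payload_skeleton(contract)

# Print a checklist of what each sheet expects.
for sheet, keys in contract.items():
    for key, kind in keys.items():
        print(f"{sheet}: {key} expects {kind}")
```

Replacing each `None` with a real value gives a payload whose layout matches the contract key for key.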
3. Bundle Compilation
Bind runtime data to a template and produce a ReportBundle.
Compilation validates the payload against the sheet contract, resolves scalar cells in place, materialises Polars DataFrames / LazyFrames to Parquet, stores compact dataframe anchors and repeat plans, and (optionally) shifts template content out of the way of expanding dataframes. The result is a portable bundle you can export now, or persist on disk and export later from any process.
Usage
bundle = mo_dataport.compile(
    template=schema,
    data=payload,
    bundle_path="out_bundle",  # omit to keep the bundle in memory
    dataframe_options=None,
    dataframe_shift="both",
)
| Parameter | Type | Required | Description |
|---|---|---|---|
| template | WorkbookSchema | Yes | Schema from extract() |
| data | dict[str, Any] | Yes | Sheet-scoped payload (see the Data Contract guide) |
| bundle_path | str \| None | No | Write the bundle to this directory; omit for in-memory only |
| dataframe_options | dict[str, Any] \| None | No | Per-sheet, per-placeholder column occupation and alignment overrides |
| dataframe_shift | str | No | How surrounding cells/merges move around dataframe output: "both", "horizontal", "vertical", or "none" |
Returns: ReportBundle — the compiled, exportable bundle.
Raises: KeyError if a required placeholder key is missing from the payload.
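If you would rather see every missing key at once than hit a KeyError one key at a time, a pre-flight check against the input contract is one option. This is a sketch, not the validator compile() runs: `missing_inputs` and `require_inputs` are hypothetical helpers, and they assume every key in the contract is required and that the payload is nested by sheet name like the contract.

```python
def missing_inputs(contract: dict, payload: dict) -> list[tuple[str, str]]:
    """List (sheet, key) pairs that the contract names but the payload lacks."""
    missing = []
    for sheet, keys in contract.items():
        provided = payload.get(sheet, {})
        for key in keys:
            if key not in provided:
                missing.append((sheet, key))
    return missing


def require_inputs(contract: dict, payload: dict) -> None:
    """Raise one KeyError that names every missing placeholder key."""
    missing = missing_inputs(contract, payload)
    if missing:
        raise KeyError(f"missing placeholder keys: {missing}")


# Hypothetical contract and an incomplete payload.
contract = {
    "Sales Summary": {
        "report_title": "string",
        "generated_on": "date",
        "sales_rows": "dataframe",
    }
}
payload = {"Sales Summary": {"report_title": "Q3 Sales"}}

gaps = missing_inputs(contract, payload)
print(gaps)
```

The check only looks at key presence; whether a value has the right type is still left to compilation.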
4. Bundle Export
Render a compiled bundle to a file on disk.
Accepts either an in-memory ReportBundle or a path to a persisted bundle
directory, and writes .xlsx or .pdf output. XLSX supports a full-fidelity
mode and a low-memory streaming mode; PDF always paginates automatically.
Format-specific keyword options (export mode, sizing, fonts, page size, and
so on) are passed through **options.
Usage
mo_dataport.export(bundle, "report.xlsx", format="xlsx")
mo_dataport.export("out_bundle", "report.pdf", format="pdf")
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| bundle_or_path | ReportBundle \| str | Yes | - | In-memory bundle or path to a bundle directory |
| output_path | str | Yes | - | Destination file path (.xlsx or .pdf) |
| format | str | No | "xlsx" | "xlsx" or "pdf". "image" is reserved and raises NotImplementedError |
| **options | - | No | - | Sizing and format-specific options (see the Exporting guides) |
Returns: None for fidelity XLSX and all PDF exports. For streaming
XLSX, a list[str]: one workbook path when no split is needed, or a single
.zip path when the export is split across multiple workbooks.
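Because the return value depends on the export mode, calling code may want to normalise it into a single list of written files. The sketch below does that for the cases described above. `written_paths` and `is_split_archive` are hypothetical helpers, not part of the package, and they take the value returned by export() as a plain argument so the logic can be read on its own.

```python
from typing import Optional


def written_paths(result: Optional[list[str]], output_path: str) -> list[str]:
    """Return the files an export produced, whatever the export mode."""
    if result is None:
        # Fidelity XLSX and PDF exports return None and write to output_path.
        return [output_path]
    # Streaming XLSX returns a list: one workbook path or a single .zip path.
    return list(result)


def is_split_archive(paths: list[str]) -> bool:
    """True when a streaming export was split and delivered as one .zip."""
    return len(paths) == 1 and paths[0].lower().endswith(".zip")


# Stand-in return values for the three documented cases.
pdf_paths = written_paths(None, "report.pdf")
single_paths = written_paths(["report.xlsx"], "report.xlsx")
split_paths = written_paths(["report.zip"], "report.xlsx")

for paths in (pdf_paths, single_paths, split_paths):
    print(paths, "split" if is_split_archive(paths) else "single file")
```

In real use, `result` would be whatever mo_dataport.export(...) returned for the same `output_path`.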
Supporting Types
ReportBundle
The canonical intermediate artifact produced by compile() and consumed by export(). It can live in memory or be persisted to a directory and reloaded later:
from mindoff_dataport import ReportBundle, mo_dataport
bundle = ReportBundle.load("saved_bundle") # reload a persisted bundle
mo_dataport.export(bundle, "report.xlsx")
The on-disk layout is documented in Architecture → The Pipeline.
repeat_records(...)
A helper that wraps source-backed repeat payloads (an ordered set of scalar record columns plus constant dataframe payloads) so large repeat sections don't have to materialise every record in memory.
from mindoff_dataport import repeat_records
records = repeat_records(scalar_records, constants={"line_items": shared_df})
See the Repeat Sections recipe for context on when to reach for it.