The Data Contract¶
Payloads in mindoff-dataport are sheet-scoped: the top-level keys of your data dict are sheet names, and each maps to that sheet's values. This one rule keeps multi-sheet reports unambiguous: every value knows exactly which sheet it belongs to. If you ever forget the exact shape a template wants, mo_dataport.inputs(schema) will tell you (see Input Discovery).
Prerequisites¶
- Template extracted with mo_dataport.extract() (see Installation)
- Placeholder keys confirmed with mo_dataport.inputs(schema) before building a payload
Implementation¶
1. Static Sheet¶
The common case: a sheet with some scalar values and a table. The outer key matches the sheet name in the template; the inner keys match your placeholder keys.
{
    "Invoice": {
        "customer_name": "Acme Industries",       # string
        "invoice_number": 1024,                   # number
        "due_date": datetime.date(2026, 5, 1),    # date
        "line_items": polars_dataframe,           # dataframe / LazyFrame
    }
}
2. Repeat Section¶
When a sheet contains a repeat block ({{key:repeat-start}} ... {{key:repeat-end}}), the payload key matching that block holds an ordered list of record payloads, one per rendered block.
{
    "Sheet1": {
        "reports": [  # key matches the repeat-start/end key
            {"customer_name": "Acme", "line_items": acme_df},
            {"customer_name": "Globex", "line_items": globex_df},
        ]
    }
}
The blocks render top to bottom in list order. For the layout rules (sibling sections, static rows, merged-cell limits), see the Repeat Sections recipe.
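One way to assemble a repeat payload is sketched below. `build_repeat_payload` is a hypothetical helper, not part of mindoff-dataport, and it assumes you already have one (name, table) pair per block. The tables would normally be Polars DataFrames or LazyFrames; plain lists stand in for them here so the snippet runs on its own.

```python
from typing import Any, Iterable


def build_repeat_payload(
    sheet: str,
    repeat_key: str,
    records: Iterable[tuple[str, Any]],
) -> dict[str, dict[str, list[dict[str, Any]]]]:
    """Return a sheet-scoped payload with one record per repeat block.

    Hypothetical helper for illustration; not a mindoff-dataport function.
    """
    # List order is preserved, which is the order the blocks render in.
    blocks = [
        {"customer_name": name, "line_items": table}
        for name, table in records
    ]
    return {sheet: {repeat_key: blocks}}


# Stand-ins for real dataframes (assumption: any table value goes here).
acme_rows = [{"sku": "A-1", "qty": 2}]
globex_rows = [{"sku": "G-7", "qty": 5}]

payload = build_repeat_payload(
    "Sheet1",
    "reports",
    [("Acme", acme_rows), ("Globex", globex_rows)],
)
```

Because the list order decides the render order, sort the records before building the payload rather than after.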
3. Dynamic Sheet Group¶
When a template sheet is named exactly {{key}}, it's a stencil for many output sheets. Pass a dict of output_sheet_name -> payload under that placeholder key:
{
    "region_sheet": {          # the sheet-name placeholder key
        "North Sheet": {       # becomes an output sheet
            "region_name": "North",
            "owner": "Alice",
            "sales_rows": north_df,
        },
        "South Sheet": {
            "region_name": "South",
            "owner": "Bob",
            "sales_rows": south_df,
        },
    }
}
Output sheet order follows the payload dict's insertion order. Output sheet names must be unique.
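A Python dict cannot hold duplicate keys, so a repeated output sheet name would silently overwrite the earlier entry before the library ever sees it. The sketch below guards against that while building the group in a fixed order. `build_sheet_group` is a hypothetical helper, not a mindoff-dataport function, and the `sales_rows` tables are left out to keep the example self-contained.

```python
from typing import Any


def build_sheet_group(
    placeholder_key: str,
    sheets: list[tuple[str, dict[str, Any]]],
) -> dict[str, dict[str, dict[str, Any]]]:
    """Build {placeholder_key: {output_sheet_name: payload}} in list order.

    Hypothetical helper; raises ValueError on a duplicate output sheet name
    instead of letting a later entry overwrite an earlier one.
    """
    group: dict[str, dict[str, Any]] = {}
    for name, sheet_payload in sheets:
        if name in group:
            raise ValueError(f"duplicate output sheet name: {name!r}")
        group[name] = sheet_payload  # dicts preserve insertion order
    return {placeholder_key: group}


payload = build_sheet_group(
    "region_sheet",
    [
        ("North Sheet", {"region_name": "North", "owner": "Alice"}),
        ("South Sheet", {"region_name": "South", "owner": "Bob"}),
    ],
)
```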
When you introspect a template with a dynamic group, inputs(schema) reports it under the same placeholder key, using "*" to mean "every generated sheet shares this contract":
{
    "region_sheet": {
        "*": {
            "region_name": "string",
            "owner": "string",
            "sales_rows": "dataframe",
        }
    }
}
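The "*" contract can be checked against a payload before compiling. The sketch below is a hypothetical helper, not part of the library. It assumes the contract is a plain dict shaped like the output above, with static sheets reported as {sheet: {key: type}}, and it does not descend into repeat sections.

```python
from typing import Any


def missing_keys(contract: dict[str, Any], payload: dict[str, Any]) -> list[str]:
    """List contract keys absent from the payload as 'Sheet.key' paths.

    Hypothetical helper. A dynamic group is recognized by the "*" entry.
    """
    missing: list[str] = []
    for top_key, spec in contract.items():
        supplied = payload.get(top_key)
        if supplied is None:
            missing.append(top_key)
            continue
        if "*" in spec:
            # Dynamic group: every generated sheet must satisfy spec["*"].
            for sheet_name, sheet_payload in supplied.items():
                missing += [
                    f"{sheet_name}.{key}"
                    for key in spec["*"]
                    if key not in sheet_payload
                ]
        else:
            missing += [f"{top_key}.{key}" for key in spec if key not in supplied]
    return missing


contract = {"region_sheet": {"*": {"region_name": "string", "owner": "string"}}}
payload = {"region_sheet": {"North Sheet": {"region_name": "North"}}}
print(missing_keys(contract, payload))  # ['North Sheet.owner']
```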
4. Using Polars LazyFrames for Large Data¶
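For anything big, prefer a LazyFrame over a DataFrame. A LazyFrame created from a scan such as pl.scan_parquet(...) is a deferred query over the file rather than an in-memory table, so the data is never fully loaded into memory; the library reads it in batches at export time.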
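import polars as pl
import mo_dataport

# `schema` comes from the extraction step (see Prerequisites).
# scan_parquet only builds a lazy query; no rows are read at this point.
rows = pl.scan_parquet("sales.parquet").select(["product", "units", "revenue"])
bundle = mo_dataport.compile(schema, {"Sheet1": {"sales_rows": rows}})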
Rule of thumb
Use pl.scan_parquet(...) (lazy) for large or unbounded data and pl.DataFrame(...) (eager) for small, already-in-memory tables. Both are accepted anywhere a dataframe is expected.
Troubleshooting¶
- KeyError during compile(). A placeholder the template requires is missing from your payload. Run inputs(schema) and compare keys side by side.
- A sheet renders empty. Confirm the outer payload key exactly matches the sheet name (including spaces and case).
- Type validation failed. The value you passed doesn't match the placeholder type, for example a str where the template marked :number. The error names the offending key.
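For the empty-sheet case, a quick check can surface payload keys that differ from a sheet name only by case or stray whitespace. This is a minimal sketch with a hypothetical helper, not part of the library. It assumes you supply the expected sheet names yourself, for example from the top-level keys reported by inputs(schema).

```python
def near_miss_sheet_names(expected: list[str], payload: dict) -> dict[str, str]:
    """Map payload keys that match an expected sheet name only after
    ignoring case and surrounding whitespace to the name they likely meant.

    Hypothetical helper for illustration.
    """
    normalized = {name.strip().casefold(): name for name in expected}
    suggestions: dict[str, str] = {}
    for key in payload:
        if key in expected:
            continue  # exact match, nothing to report
        candidate = normalized.get(key.strip().casefold())
        if candidate is not None:
            suggestions[key] = candidate
    return suggestions


expected_sheets = ["Invoice", "Sheet1"]
payload = {"invoice ": {"customer_name": "Acme"}, "Sheet1": {}}
print(near_miss_sheet_names(expected_sheets, payload))  # {'invoice ': 'Invoice'}
```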