Python API Reference

Core C++ components exposed as a Python API via nanobind.

tguf.TGUFSchema(path: str, edge_capacity: int | None = None, msg_dim: int | None = None, label_dim: int | None = None, node_feat_capacity: int | None = None, node_feat_dim: int | None = None, label_capacity: int | None = None, negatives_start_e_id: int | None = None, negatives_per_edge: int | None = None, val_start: int | None = None, test_start: int | None = None)

Metadata defining the layout of a TGUF dataset.

This schema specifies dataset capacities, feature dimensions, and optional evaluation splits. A schema is required to construct a TGUFBuilder.

Parameters:

  • path (str) –

    Path to the .tguf binary file.

  • edge_capacity (int, default: None ) –

    Maximum number of edges.

  • msg_dim (int, default: None ) –

    Dimension of edge features.

  • label_dim (int, default: None ) –

    Dimension of label targets.

  • node_feat_capacity (int, default: None ) –

    Maximum number of nodes with static features.

  • node_feat_dim (int, default: None ) –

    Dimension of node features.

  • label_capacity (int, default: None ) –

    Maximum number of label events.

  • negatives_start_e_id (int, default: None ) –

    Edge index where precomputed negatives begin (for evaluation).

  • negatives_per_edge (int, default: None ) –

    Number of negatives per edge.

  • val_start (int, default: None ) –

    Global edge index where validation split begins.

  • test_start (int, default: None ) –

    Global edge index where test split begins.

Notes

If val_start or test_start is not provided, the entire dataset is treated as training data unless overridden during loading.
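As a minimal sketch of how the split fields might be filled in, the helper below derives val_start and test_start from split ratios. It assumes edges are stored in chronological order, so that each split is a contiguous range of edge indices. Both chronological_split_starts and make_schema are hypothetical helpers, not part of tguf; make_schema only passes constructor arguments documented above.

```python
def chronological_split_starts(
    num_edges: int, val_ratio: float = 0.15, test_ratio: float = 0.15
) -> tuple[int, int]:
    """Return (val_start, test_start) for a contiguous train/val/test split.

    Hypothetical helper: assumes edges are ordered by time, so the last
    edges form the test split and the ones before them the validation split.
    """
    if val_ratio < 0 or test_ratio < 0 or val_ratio + test_ratio >= 1:
        raise ValueError("ratios must be non-negative and sum to less than 1")
    test_start = num_edges - round(num_edges * test_ratio)
    val_start = test_start - round(num_edges * val_ratio)
    return val_start, test_start


def make_schema(path: str, num_edges: int, msg_dim: int):
    """Build a schema with split markers (requires the tguf package)."""
    import tguf  # imported lazily so the helper above works without tguf

    val_start, test_start = chronological_split_starts(num_edges)
    return tguf.TGUFSchema(
        path=path,
        edge_capacity=num_edges,
        msg_dim=msg_dim,
        val_start=val_start,
        test_start=test_start,
    )
```

With these ratios, training edges occupy indices below val_start, validation edges run from val_start up to test_start, and test edges run from test_start to the end.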

tguf.TGUFBuilder(schema: TGUFSchema)

High-performance writer for creating TGUF datasets on disk.

Uses an internal buffering strategy to minimize disk I/O.

Parameters:

  • schema (tguf._tguf_py.TGUFSchema) –

    Dataset schema defining layout and capacities.

See also
  • TGUFSchema
  • Batch

append_edges(batch: Batch) -> None

Append a batch of temporal edges to the dataset.

Parameters:

  • batch (tguf._tguf_py.Batch) –

    A batch of temporal edge data.

Notes

Releases the Python GIL during execution.

append_labels(n_id: NDArray, time: NDArray, target: NDArray) -> None

Append label events to the dataset.

Parameters:

  • n_id (ndarray) –

    Node IDs of shape [B], dtype=int64.

  • time (ndarray) –

    Event timestamps of shape [B], dtype=int64.

  • target (ndarray) –

    Label targets of shape [B, label_dim], dtype=float32.

Notes

Releases the Python GIL during execution.
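The shape and dtype requirements above can be checked before calling append_labels. The following is a sketch only: prepare_labels is a hypothetical helper, and making the arrays C-contiguous is an assumption about what the binding prefers, not documented behavior.

```python
import numpy as np


def prepare_labels(n_id, time, target, label_dim: int):
    """Coerce label inputs to the documented dtypes and validate shapes.

    Hypothetical helper: returns (n_id, time, target) with shapes
    [B], [B], and [B, label_dim] respectively.
    """
    n_id = np.ascontiguousarray(n_id, dtype=np.int64)
    time = np.ascontiguousarray(time, dtype=np.int64)
    target = np.ascontiguousarray(target, dtype=np.float32)

    if n_id.ndim != 1 or time.shape != n_id.shape:
        raise ValueError("n_id and time must both have shape [B]")
    if target.shape != (n_id.shape[0], label_dim):
        raise ValueError("target must have shape [B, label_dim]")
    return n_id, time, target
```

The returned tuple can then be unpacked into the call, for example builder.append_labels(*prepare_labels(ids, times, targets, label_dim)), where builder is a TGUFBuilder whose schema uses the same label_dim.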

append_node_feats(n_id: NDArray, node_feat: NDArray) -> None

Append static node features to the dataset.

Parameters:

  • n_id (ndarray) –

    Node IDs of shape [N], dtype=int64.

  • node_feat (ndarray) –

    Node features of shape [N, node_feat_dim], dtype=float32.

Notes

Releases the Python GIL during execution.
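For a large feature matrix, the inputs can be produced in chunks. The generator below is a hypothetical helper that assumes row i of the matrix holds the features of node ID i. Whether append_node_feats may be called repeatedly is inferred from its name and should be treated as an assumption.

```python
from typing import Iterator

import numpy as np


def iter_node_feat_chunks(
    node_feat, chunk_size: int
) -> Iterator[tuple[np.ndarray, np.ndarray]]:
    """Yield (n_id, node_feat) chunks with the documented dtypes.

    Hypothetical helper: assumes row i of node_feat belongs to node ID i.
    """
    node_feat = np.ascontiguousarray(node_feat, dtype=np.float32)
    if node_feat.ndim != 2:
        raise ValueError("node_feat must have shape [N, node_feat_dim]")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    num_nodes = node_feat.shape[0]
    for start in range(0, num_nodes, chunk_size):
        stop = min(start + chunk_size, num_nodes)
        n_id = np.arange(start, stop, dtype=np.int64)  # shape [chunk]
        yield n_id, node_feat[start:stop]  # shape [chunk, node_feat_dim]
```

Each yielded pair matches the documented argument shapes, so a loop of the form `for n_id, feats in iter_node_feat_chunks(x, 10_000): builder.append_node_feats(n_id, feats)` would write the matrix piece by piece.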

finalize() -> None

Finalize the dataset.

Writes headers and flushes all buffered data to disk.

Notes

Must be called after all data has been appended. Releases the Python GIL during execution.
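The full write path (schema, builder, append, finalize) can be sketched as follows. Both functions are hypothetical helpers. Sorting edges by timestamp before writing is an assumption made here because the split fields refer to edge positions; the reference above does not state that the builder requires sorted input.

```python
import numpy as np


def sort_edges_by_time(src, dst, time, msg):
    """Return edge arrays sorted by timestamp, using the documented dtypes."""
    time = np.asarray(time, dtype=np.int64)
    order = np.argsort(time, kind="stable")  # stable keeps ties in input order
    src = np.asarray(src, dtype=np.int64)[order]
    dst = np.asarray(dst, dtype=np.int64)[order]
    msg = np.asarray(msg, dtype=np.float32)[order]
    return src, dst, time[order], msg


def write_edges(path: str, src, dst, time, msg) -> None:
    """Write one batch of edges to a TGUF file (requires the tguf package)."""
    import tguf  # imported lazily so sort_edges_by_time works without tguf

    src, dst, time, msg = sort_edges_by_time(src, dst, time, msg)
    schema = tguf.TGUFSchema(path=path, edge_capacity=len(src), msg_dim=msg.shape[1])
    builder = tguf.TGUFBuilder(schema)
    builder.append_edges(tguf.Batch(src=src, dst=dst, time=time, msg=msg))
    builder.finalize()  # must come last: writes headers and flushes buffers
```

The key point is the ordering of calls: every append happens before finalize, which is called exactly once at the end.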

tguf.Batch(src: NDArray, dst: NDArray, time: NDArray, msg: NDArray, neg_dst: NDArray | None = None)

Container for temporal edge data.

This structure represents a batch of temporal interactions and is used as input to TGUFBuilder.append_edges.

Parameters:

  • src (ndarray) –

    Source node IDs of shape [B], dtype=int64.

  • dst (ndarray) –

    Destination node IDs of shape [B], dtype=int64.

  • time (ndarray) –

    Timestamps of shape [B], dtype=int64.

  • msg (ndarray) –

    Edge features of shape [B, msg_dim], dtype=float32.

  • neg_dst (ndarray, default: None ) –

    Negative destination nodes for link prediction of shape [B, negatives_per_edge], dtype=int64.

Notes

All inputs are converted to PyTorch tensors internally.

See also
  • TGUFBuilder
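To illustrate the expected shape of neg_dst, the sketch below draws negatives uniformly at random while excluding each edge's own destination. sample_negative_dst is a hypothetical helper, and uniform sampling is only one possible strategy, not necessarily how negatives are produced for TGUF datasets. It assumes node IDs lie in the range [0, num_nodes).

```python
import numpy as np


def sample_negative_dst(dst, num_nodes: int, negatives_per_edge: int, seed: int = 0):
    """Sample negatives of shape [B, negatives_per_edge], dtype=int64.

    Hypothetical helper: for each edge, draws node IDs uniformly from
    [0, num_nodes) excluding that edge's true destination.
    """
    if num_nodes < 2:
        raise ValueError("need at least two nodes to sample a negative")
    dst = np.asarray(dst, dtype=np.int64)
    rng = np.random.default_rng(seed)

    # Draw from num_nodes - 1 candidates, then shift values at or above the
    # true destination up by one so the true destination is never drawn.
    neg = rng.integers(
        0, num_nodes - 1, size=(dst.shape[0], negatives_per_edge), dtype=np.int64
    )
    neg += neg >= dst[:, None]
    return neg
```

The result can be passed as the neg_dst argument of Batch, provided negatives_per_edge matches the value given to the schema. Note that this only avoids the positive destination of the same edge; a sampled node may still be a true neighbor of the source at another time.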

dst: Annotated[NDArray[numpy.int64], dict(shape=1)] property

Destination node IDs

msg: Annotated[NDArray[numpy.float32], dict(shape=2)] property

Edge features

neg_dst: object property

Optional negative destinations for link prediction

src: Annotated[NDArray[numpy.int64], dict(shape=1)] property

Source node IDs

time: Annotated[NDArray[numpy.int64], dict(shape=1)] property

Edge timestamps