Python API Reference
Core C++ components exposed as a Python API via nanobind.
tguf.TGUFSchema(path: str, edge_capacity: int | None = None, msg_dim: int | None = None, label_dim: int | None = None, node_feat_capacity: int | None = None, node_feat_dim: int | None = None, label_capacity: int | None = None, negatives_start_e_id: int | None = None, negatives_per_edge: int | None = None, val_start: int | None = None, test_start: int | None = None)
Metadata defining the layout of a TGUF dataset.
This schema specifies dataset capacities, feature dimensions, and optional
evaluation splits. It is required to initialize a TGUFBuilder.
Parameters:
- path (str) – Path to the .tguf binary file.
- edge_capacity (int, default: None) – Maximum number of edges.
- msg_dim (int, default: None) – Dimension of edge features.
- label_dim (int, default: None) – Dimension of label targets.
- node_feat_capacity (int, default: None) – Maximum number of nodes with static features.
- node_feat_dim (int, default: None) – Dimension of node features.
- label_capacity (int, default: None) – Maximum number of label events.
- negatives_start_e_id (int, default: None) – Edge index where precomputed negatives begin (for evaluation).
- negatives_per_edge (int, default: None) – Number of negatives per edge.
- val_start (int, default: None) – Global edge index where the validation split begins.
- test_start (int, default: None) – Global edge index where the test split begins.
Notes
If val_start or test_start is not provided, the entire dataset is treated
as training data unless overridden during loading.
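As a rough sketch of how the split parameters might be derived, the snippet below computes val_start and test_start for a chronological split from a total edge count. The helper split_starts, the 70/15/15 proportions, the file name, and msg_dim=8 are illustrative assumptions and not part of tguf. The TGUFSchema call runs only if the package is importable and uses the keyword names from the signature above.

```python
def split_starts(num_edges: int, val_pct: int, test_pct: int) -> tuple[int, int]:
    """Return (val_start, test_start) for a chronological train/val/test split.

    Hypothetical helper, not part of tguf. Percentages are integers so the
    boundaries are computed with exact integer arithmetic.
    """
    if num_edges < 0 or val_pct < 0 or test_pct < 0 or val_pct + test_pct > 100:
        raise ValueError("invalid edge count or split percentages")
    test_start = num_edges - num_edges * test_pct // 100
    val_start = test_start - num_edges * val_pct // 100
    return val_start, test_start


num_edges = 1_000
val_start, test_start = split_starts(num_edges, val_pct=15, test_pct=15)

schema_kwargs = dict(
    path="example.tguf",      # placeholder path
    edge_capacity=num_edges,
    msg_dim=8,                # placeholder edge-feature dimension
    val_start=val_start,
    test_start=test_start,
)

try:
    import tguf  # only available where the package is installed
except ImportError:
    tguf = None

if tguf is not None:
    schema = tguf.TGUFSchema(**schema_kwargs)
```

Edges before val_start would then form the training split, edges from val_start up to test_start the validation split, and the remainder the test split.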
tguf.TGUFBuilder(schema: TGUFSchema)
High-performance writer for creating TGUF datasets on disk.
Uses an internal buffering strategy to minimize disk I/O.
Parameters:
- schema (tguf._tguf_py.TGUFSchema) – Dataset schema defining layout and capacities.
See also
- TGUFSchema
- Batch
append_edges(batch: Batch) -> None
Append a batch of temporal edges to the dataset.
Parameters:
- batch (tguf._tguf_py.Batch) – A batch of temporal edge data.
Notes
Releases the Python GIL during execution.
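Because the builder accepts edges batch by batch, a large edge list can be written in fixed-size chunks. The sketch below shows one way to do that. chunk_bounds and append_in_chunks are hypothetical helpers, and the builder and batch class are passed in as arguments so the example runs without tguf installed. The recording stand-ins exist only for demonstration.

```python
from typing import Iterator, Tuple

import numpy as np


def chunk_bounds(n: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, stop) index pairs covering range(n) in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, n, chunk_size):
        yield start, min(start + chunk_size, n)


def append_in_chunks(builder, batch_cls, src, dst, time, msg, chunk_size) -> int:
    """Append edges chunk by chunk; return the number of append_edges calls.

    `builder` is expected to behave like TGUFBuilder and `batch_cls` like
    tguf.Batch (hypothetical wiring; neither is imported here).
    """
    calls = 0
    for start, stop in chunk_bounds(len(src), chunk_size):
        batch = batch_cls(src[start:stop], dst[start:stop],
                          time[start:stop], msg[start:stop])
        builder.append_edges(batch)
        calls += 1
    return calls


class _RecordingBuilder:
    """Stand-in that records how many edges each append_edges call received."""

    def __init__(self):
        self.sizes = []

    def append_edges(self, batch):
        self.sizes.append(len(batch[0]))


def _tuple_batch(src, dst, time, msg):
    """Stand-in for tguf.Batch that simply keeps the arrays in a tuple."""
    return (src, dst, time, msg)


n = 10
src = np.arange(n, dtype=np.int64)
dst = (src + 1) % n
time = np.arange(n, dtype=np.int64) * 5
msg = np.zeros((n, 4), dtype=np.float32)

recorder = _RecordingBuilder()
num_calls = append_in_chunks(recorder, _tuple_batch, src, dst, time, msg, chunk_size=4)
```

With tguf installed, the same function would presumably be called with a TGUFBuilder instance and tguf.Batch in place of the stand-ins.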
append_labels(n_id: NDArray, time: NDArray, target: NDArray) -> None
Append label events to the dataset.
Parameters:
- n_id (ndarray) – Node IDs of shape [B], dtype=int64.
- time (ndarray) – Event timestamps of shape [B], dtype=int64.
- target (ndarray) – Label targets of shape [B, label_dim], dtype=float32.
Notes
Releases the Python GIL during execution.
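The reference does not say what the label targets encode. As one possible case, the sketch below assumes classification labels and builds a one-hot target array with the documented shape [B, label_dim] and dtype float32. The helper one_hot_targets and the sample values are hypothetical.

```python
import numpy as np


def one_hot_targets(class_ids, label_dim: int) -> np.ndarray:
    """Return a float32 array of shape [B, label_dim] with one 1.0 per row.

    Hypothetical helper, not part of tguf.
    """
    class_ids = np.asarray(class_ids, dtype=np.int64)
    if class_ids.ndim != 1:
        raise ValueError("class_ids must have shape [B]")
    if class_ids.size and (class_ids.min() < 0 or class_ids.max() >= label_dim):
        raise ValueError("class index out of range for label_dim")
    target = np.zeros((class_ids.shape[0], label_dim), dtype=np.float32)
    target[np.arange(class_ids.shape[0]), class_ids] = 1.0
    return target


n_id = np.array([3, 7], dtype=np.int64)       # nodes receiving a label
time = np.array([100, 250], dtype=np.int64)   # label event timestamps
target = one_hot_targets([1, 0], label_dim=3)

# With a real TGUFBuilder this would be passed as:
#   builder.append_labels(n_id, time, target)
```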
append_node_feats(n_id: NDArray, node_feat: NDArray) -> None
Append static node features to the dataset.
Parameters:
- n_id (ndarray) – Node IDs of shape [N], dtype=int64.
- node_feat (ndarray) – Node features of shape [N, node_feat_dim], dtype=float32.
Notes
Releases the Python GIL during execution.
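Node features are often held as a mapping from node ID to feature vector. The sketch below converts such a mapping into the two arrays this method expects. node_feats_to_arrays is a hypothetical helper, and ordering rows by ascending node ID is a choice of this example; the reference does not state whether any order is required.

```python
import numpy as np


def node_feats_to_arrays(feats, node_feat_dim: int):
    """Turn {node_id: feature_vector} into (n_id, node_feat) arrays.

    Hypothetical helper, not part of tguf. Returns n_id with shape [N]
    (int64) and node_feat with shape [N, node_feat_dim] (float32).
    """
    ids = sorted(feats)
    n_id = np.asarray(ids, dtype=np.int64)
    node_feat = np.zeros((len(ids), node_feat_dim), dtype=np.float32)
    for row, node in enumerate(ids):
        vec = np.asarray(feats[node], dtype=np.float32)
        if vec.shape != (node_feat_dim,):
            raise ValueError(f"node {node}: expected {node_feat_dim} features")
        node_feat[row] = vec
    return n_id, node_feat


n_id, node_feat = node_feats_to_arrays({5: [0.5, 1.5], 2: [1.0, 0.0]}, node_feat_dim=2)

# With a real TGUFBuilder this would be passed as:
#   builder.append_node_feats(n_id, node_feat)
```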
finalize() -> None
Finalize the dataset.
Writes headers and flushes all buffered data to disk.
Notes
Must be called after all data has been appended. Releases the Python GIL during execution.
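Since finalize() has to be called explicitly, it can be convenient to wrap the builder in a context manager. The sketch below is one way to do this. The wrapper finalizing is hypothetical and not part of tguf, and skipping finalize() after an exception is a design choice of this example rather than documented behavior. A stub builder is used so the snippet runs without tguf installed.

```python
from contextlib import contextmanager


@contextmanager
def finalizing(builder):
    """Yield `builder`, then call finalize() if the block exits without error.

    Hypothetical wrapper. If the body raises, the exception propagates and
    finalize() is not called, so a partially written dataset is not given
    final headers.
    """
    yield builder
    builder.finalize()


class _StubBuilder:
    """Stand-in used only to demonstrate the wrapper."""

    def __init__(self):
        self.finalized = False

    def finalize(self):
        self.finalized = True


ok = _StubBuilder()
with finalizing(ok) as b:
    pass  # append_edges / append_labels / append_node_feats calls would go here

failed = _StubBuilder()
try:
    with finalizing(failed):
        raise RuntimeError("simulated failure while appending")
except RuntimeError:
    pass
```

Here ok.finalized ends up True and failed.finalized stays False. How an unfinalized file on disk should be handled is not covered by this reference.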
tguf.Batch(src: NDArray, dst: NDArray, time: NDArray, msg: NDArray, neg_dst: NDArray | None = None)
Container for temporal edge data.
This structure represents a batch of temporal interactions and is used
as input to TGUFBuilder.append_edges.
Parameters:
- src (ndarray) – Source node IDs of shape [B], dtype=int64.
- dst (ndarray) – Destination node IDs of shape [B], dtype=int64.
- time (ndarray) – Timestamps of shape [B], dtype=int64.
- msg (ndarray) – Edge features of shape [B, msg_dim], dtype=float32.
- neg_dst (ndarray, default: None) – Negative destination nodes for link prediction, of shape [B, negatives_per_edge], dtype=int64.
Notes
All inputs are converted to PyTorch tensors internally.
See also
- TGUFBuilder
dst: Annotated[NDArray[numpy.int64], dict(shape=1)]
property
Destination node IDs
msg: Annotated[NDArray[numpy.float32], dict(shape=2)]
property
Edge features
neg_dst: object
property
Optional negative destinations for link prediction
src: Annotated[NDArray[numpy.int64], dict(shape=1)]
property
Source node IDs
time: Annotated[NDArray[numpy.int64], dict(shape=1)]
property
Edge timestamps
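To make the documented shapes and dtypes concrete, the sketch below prepares keyword arguments for Batch from plain Python lists. make_batch_kwargs is a hypothetical helper, and msg_dim=4 and negatives_per_edge=2 are placeholder values. The Batch constructor is called only if tguf is importable, with the parameter names listed above.

```python
import numpy as np


def make_batch_kwargs(src, dst, time, msg, neg_dst=None):
    """Build keyword arguments for tguf.Batch with the documented dtypes.

    Hypothetical helper; it coerces dtypes and checks that every array
    shares the same leading dimension B.
    """
    kwargs = {
        "src": np.ascontiguousarray(src, dtype=np.int64),
        "dst": np.ascontiguousarray(dst, dtype=np.int64),
        "time": np.ascontiguousarray(time, dtype=np.int64),
        "msg": np.ascontiguousarray(msg, dtype=np.float32),
    }
    if neg_dst is not None:
        kwargs["neg_dst"] = np.ascontiguousarray(neg_dst, dtype=np.int64)
    b = kwargs["src"].shape[0]
    for name, arr in kwargs.items():
        expected_ndim = 2 if name in ("msg", "neg_dst") else 1
        if arr.ndim != expected_ndim or arr.shape[0] != b:
            raise ValueError(f"{name}: unexpected shape {arr.shape}")
    return kwargs


kwargs = make_batch_kwargs(
    src=[0, 1, 2],
    dst=[1, 2, 0],
    time=[10, 20, 30],
    msg=np.zeros((3, 4)),                # msg_dim = 4 (placeholder)
    neg_dst=[[2, 3], [0, 3], [1, 3]],    # negatives_per_edge = 2 (placeholder)
)

try:
    import tguf  # only available where the package is installed
except ImportError:
    tguf = None

if tguf is not None:
    batch = tguf.Batch(**kwargs)
```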