Recipe100k

class dhg.data.Recipe100k(data_root=None)

Bases: dhg.data.base.BaseData

The Recipe100k dataset is a recipe-ingredient network dataset for the vertex classification task. The vertex features are bag-of-words vectors built from the sentences that describe how to make each recipe. Hyperedges are the ingredients of the recipe or the keywords for food preparation steps. The original dataset was created in the paper "SHARE: a System for Hierarchical Assistive Recipe Editing".

The content of the Recipe100k dataset includes the following:

  • num_classes: The number of classes: \(8\).

  • num_vertices: The number of vertices: \(101,585\).

  • num_edges: The number of edges: \(12,387\).

  • dim_features: The dimension of features: \(2,254\).

  • features: The vertex feature matrix. torch.Tensor with size \((101,585 \times 2,254)\).

  • edge_list: The edge list. List with length \(12,387\).

  • labels: The label list. torch.LongTensor with size \((101,585, )\).

Parameters

data_root (str, optional) – The root directory in which the data is stored. If set to None, the dataset is automatically downloaded from the server and saved to the default directory ~/.dhg/datasets/. Defaults to None.
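
Example

The following is a minimal sketch of how the items listed above might be read, not an official usage example. It assumes DHG is installed and that items are accessed by key (data["features"] and so on). It also assumes each entry of edge_list is a sequence of integer vertex indices. The helper summarize_edge_list and the toy edge list are hypothetical and exist only for illustration; the DHG import is deferred so that the helper can be tried without downloading the dataset.

```python
from collections import Counter


def load_recipe100k(data_root=None):
    """Load Recipe100k through DHG (assumed installed; may download on first use)."""
    # Imported lazily so the helper below stays usable without DHG.
    from dhg.data import Recipe100k

    data = Recipe100k(data_root=data_root)
    # Assumption: items are read by key, using the names listed above.
    return data["features"], data["edge_list"], data["labels"]


def summarize_edge_list(edge_list, num_vertices):
    """Hypothetical helper: basic sanity statistics for a hyperedge list."""
    degree = Counter()
    for edge in edge_list:
        for v in edge:
            if not 0 <= v < num_vertices:
                raise ValueError(f"vertex index {v} out of range")
            degree[v] += 1
    sizes = [len(edge) for edge in edge_list]
    return {
        "num_edges": len(edge_list),
        "max_edge_size": max(sizes, default=0),
        "isolated_vertices": num_vertices - len(degree),
    }


# Toy stand-in for data["edge_list"]: each hyperedge is a tuple of vertex indices.
toy_edges = [(0, 1, 2), (2, 3), (1, 4, 5, 6)]
toy_summary = summarize_edge_list(toy_edges, num_vertices=8)
print(toy_summary)
```

On the real dataset, the same helper could be called as summarize_edge_list(edge_list, 101585) with the edge_list returned by load_recipe100k(); the reported num_edges should then match the value documented above.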