This repository contains datasets generated using the Physics-Informed Generative Model for Extrapolating Beyond Known Motifs (PIGEN) and its conditioned variants.

Each dataset is provided as a CSV file in which each row corresponds to a single generative-model structure (deduplicated by composition). Columns include:

- cif – CIF string representation of the generated structure
- composition – chemical formula of the generated candidate
- SPP – structural validity score
- composition_novelty – novelty of the composition with respect to training data
- compactness – geometric compactness metric
- Iconf – structural complexity measure K per atom
- total_entropy – local-environment diversity (MLED)
- Additional conditioning-specific metrics depending on the model variant

The repository includes datasets from PIGEN models conditioned on single or multiple targets, such as compactness, MLED, hull energy, diversity, and complexity.

Files related to Figure 5

For the compositions LiNb₇N₈ and Rb₂Er₄O₇, two files are provided for each system:

- Generated structures – raw PIGEN-generated candidates in CIF format
- Optimised structures – globally optimised crystal structures obtained via crystal structure prediction workflows (CIF format)

These examples correspond to the structures highlighted in Figure 5 of the associated manuscript.
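
Reading the CSV files

The CSV layout described above can be read with Python's standard csv module. The following is a minimal sketch, not part of the released code. It assumes that the column names appear in the header exactly as listed, that the metric columns hold plain numbers, and that each CIF string is stored as a quoted multi-line field. The helper names read_rows and filter_rows, the sample compositions, and all sample values are hypothetical and only serve to make the example runnable.

```python
import csv
import io

# Numeric columns named in the description above. Variant-specific metric
# columns are left as strings because their names are not listed there.
NUMERIC_COLUMNS = ("SPP", "composition_novelty", "compactness", "Iconf", "total_entropy")


def read_rows(handle):
    """Parse an open CSV file object into a list of dicts, one per structure."""
    rows = []
    for row in csv.DictReader(handle):
        for name in NUMERIC_COLUMNS:
            value = row.get(name)
            if value not in (None, ""):
                row[name] = float(value)  # assumption: metrics are plain numbers
        rows.append(row)
    return rows


def filter_rows(rows, column, minimum):
    """Keep rows whose numeric `column` is present and at least `minimum`."""
    return [r for r in rows if isinstance(r.get(column), float) and r[column] >= minimum]


# Tiny in-memory stand-in for a dataset file (all values are made up).
sample = io.StringIO(
    'cif,composition,SPP,composition_novelty,compactness,Iconf,total_entropy\n'
    '"data_a\n_cell_length_a 4.0\n",NaCl,0.9,1.0,0.7,2.1,0.5\n'
    '"data_b\n_cell_length_a 5.0\n",MgO,0.4,1.0,0.6,3.0,0.8\n'
)
rows = read_rows(sample)
selected = filter_rows(rows, "SPP", 0.5)
print([r["composition"] for r in selected])  # prints ['NaCl']
```

To read one of the real files, pass read_rows a file object opened with newline="" so that line breaks inside the quoted CIF fields are preserved. The cif value of each row can then be written to a .cif file or handed to a structure parser of your choice.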