The data in this repository is the supporting information for Jack Simpson's PhD thesis "Computational Prediction of Gel Properties". It is organised by chapter as follows:

Chapter 2 - Classification Models

For each descriptor set explored in the chapter, the folder contains:
- Pipeline Protocol protocols for descriptor calculation
- R scripts for data cleaning and model building (true and randomised models)
- R objects containing the trained classifier models for each algorithm explored in Chapter 2 (true and randomised models)
- R and Python scripts that carry out the SHAP model interpretation (for all descriptor sets except fp_as_bits)

Chapter 3 - Non-Bayesian Regression
- Pipeline Protocol protocols for descriptor calculation
- R scripts for data cleaning and model building (true and randomised models)
- R objects containing the trained regression models for each algorithm explored in Chapter 3 and each data set generated (true and randomised models)
- R and Python scripts that carry out the SHAP model interpretation

Chapter 4 - Bayesian Regression
- Python scripts for:
  - Descriptor calculation
  - Model building (true and randomised models)
  - SHAP analysis

Chapter 5 - DeNovo Generation
- Python scripts for the genetic algorithm
- The scripts needed to train a recurrent neural network using the approach taken in Chapter 5, together with a range of trained models
- Benchmarking work, including:
  - Scripts used to run the benchmarking
  - Outputs for each run
  - Scripts used to analyse the benchmarks
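
The listing pairs "true" models with "randomised" models throughout. This README does not say how the randomisation was done. As an illustration only, the sketch below assumes the common y-randomisation control, where the labels are shuffled so that they no longer correspond to their descriptors and a model is then trained on the shuffled labels as a baseline. The helper `randomise_labels` and the toy labels are hypothetical and are not taken from the repository's scripts.

```python
import random
from collections import Counter


def randomise_labels(labels, seed=0):
    """Return a shuffled copy of `labels`.

    Hypothetical helper for illustration; not part of the repository.
    The input list is left unchanged.
    """
    rng = random.Random(seed)  # fixed seed so the shuffle is reproducible
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled


# Toy labels, invented for this example only.
true_labels = ["gel", "gel", "no_gel", "gel", "no_gel", "no_gel", "gel", "no_gel"]
random_labels = randomise_labels(true_labels, seed=42)

# Shuffling keeps the class balance but breaks the descriptor-label pairing,
# so a model trained on `random_labels` should perform near chance level.
assert Counter(random_labels) == Counter(true_labels)
```

Under this assumption, a "true" model that scores clearly better than its randomised counterpart gives some evidence that it has learned a real relationship rather than fitting noise.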