
Prediction of the retention time of natural product metabolites using transfer learning strategies

dc.contributor.advisor: Matsoukas, Minos-Timotheos
dc.contributor.author: Κατσάρα, Βασιλική
dc.date.accessioned: 2024-10-22T06:07:39Z
dc.date.available: 2024-10-22T06:07:39Z
dc.date.issued: 2024-10-11
dc.identifier.uri: https://polynoe.lib.uniwa.gr/xmlui/handle/11400/7837
dc.identifier.uri: http://dx.doi.org/10.26265/polynoe-7669
dc.description: The research for this thesis was conducted at CEU San Pablo University in Madrid, Spain. [el]
dc.description.abstract: Retention time (RT) prediction in chromatography plays an important role in numerous analytical applications, including drug discovery and environmental monitoring. This study aims to improve RT prediction accuracy with deep learning, focusing on transfer learning: models trained on synthetic compounds measured by high-pressure liquid chromatography-mass spectrometry (HPLC-MS) are adapted to predict RTs for natural products under different chromatographic methods. We used the extensive METLIN Small Molecule Retention Time (SMRT) dataset, comprising over 80,000 synthetic compounds, to train a deep neural network (DNN). This model was then fine-tuned on smaller natural-product datasets from the RepoRT database using a two-stage transfer learning approach. In the first stage, the DNN’s upper layers were frozen to retain knowledge about high-level features while training on the new data. In the second stage, all layers were unfrozen for further training with a reduced learning rate, so that both general and dataset-specific patterns were captured. Hyperparameters were optimized with Optuna, using 5-fold nested cross-validation to ensure robust performance. The evaluation metrics were the Mean Absolute Error (MAE), the Median Absolute Error (MedAE) and the Mean Absolute Percentage Error (MAPE). Transfer learning was then compared with DNNs trained from scratch on the RepoRT data; the strategy was successful according to the MAE and MedAE metrics, although not according to the MAPE. After removing outliers, transfer learning performed better on all three metrics. In the future, it will be necessary to refine this strategy to improve its performance, either by testing it on the same datasets or by incorporating additional data. [el]
dc.format.extent: 66 [el]
dc.language.iso: en [el]
dc.publisher: University of West Attica [el]
dc.rights: Attribution - NonCommercial - ShareAlike 4.0 International [*]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: Metabolomics [el]
dc.subject: Retention time [el]
dc.subject: Molecular fingerprints [el]
dc.subject: Molecular descriptors [el]
dc.subject: Transfer learning [el]
dc.subject: Machine learning [el]
dc.title: Prediction of the retention time of natural product metabolites using transfer learning strategies [el]
dc.title.alternative: Πρόβλεψη του χρόνου κατακράτησης μεταβολιτών φυσικών προϊόντων με χρήση στρατηγικών μεταφοράς μάθησης [el]
dc.type: Diploma thesis [el]
dc.contributor.committee: Athanasiadis, Emmanouil
dc.contributor.committee: Kostopoulos, Spiros
dc.contributor.faculty: School of Engineering [el]
dc.contributor.department: Department of Biomedical Engineering [el]
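
The three evaluation metrics named in the abstract (MAE, MedAE and MAPE) weight errors differently, which is one general reason they can rank two models differently. The sketch below is a minimal illustration using only the Python standard library. The function names and the retention times (in seconds) are hypothetical values chosen for the example; they are not taken from the thesis, and the thesis code may compute the metrics differently.

```python
from statistics import mean, median


def mae(y_true, y_pred):
    """Mean Absolute Error, in the same unit as the retention times."""
    return mean([abs(t - p) for t, p in zip(y_true, y_pred)])


def medae(y_true, y_pred):
    """Median Absolute Error, less sensitive to a few large errors."""
    return median([abs(t - p) for t, p in zip(y_true, y_pred)])


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent.

    Assumes every true retention time is non-zero.
    """
    return 100 * mean([abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)])


# Hypothetical retention times in seconds (illustrative only).
rt_true = [100.0, 200.0, 400.0, 800.0]
rt_pred = [110.0, 190.0, 420.0, 760.0]

print(mae(rt_true, rt_pred), medae(rt_true, rt_pred), mape(rt_true, rt_pred))

# Adding one early-eluting compound with a small true RT: its 10 s error
# lowers the MAE but inflates the MAPE, because the error is divided by
# a small true value.
rt_true_short = rt_true + [5.0]
rt_pred_short = rt_pred + [15.0]

print(mae(rt_true_short, rt_pred_short),
      medae(rt_true_short, rt_pred_short),
      mape(rt_true_short, rt_pred_short))
```

In this toy example the added compound moves the MAE and MedAE down while the MAPE rises sharply, so a comparison between two models can come out differently depending on which metric is used and on whether such points are kept.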



Except where otherwise noted, this item's license is described as Attribution - NonCommercial - ShareAlike 4.0 International