Faculty of Computer Science

Research Group Theoretical Computer Science

Oberseminar: Heterogeneous Formal Methods

Date: 2020, September 22
Time: 11:00
Place: Online
Author: Memariani, Adel
Title: Automatic Chemical Compound Classification Based on Modern Deep Neural Networks


In this thesis project, we evaluate how recent advances in deep learning, namely Bidirectional Encoder Representations from Transformers (BERT) and tree-structured recursive neural networks, can reduce the manual annotation work required in chemical ontologies such as ChEBI. The project is inspired by the resemblance between language modeling and chemical classification tasks. We use the SMILES representation of chemical compounds as the input to our models. From our perspective, SMILES is a language whose alphabet consists of atoms and their bonds, combined according to a number of grammatical rules. This suggests a correspondence between how a chemist understands a compound and how a speaker of a language understands a term. We therefore propose to formulate chemical classification as a multi-label classification task, similar to sentence tagging in natural language processing.
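To make the formulation concrete, the following minimal sketch shows how a SMILES string could be split into tokens (the "alphabet" of atoms and bonds) and how a compound's classes could be encoded as a multi-hot target vector for multi-label classification. It is an illustration only, not the method used in the thesis: the token pattern is a simplified assumption that does not cover the full SMILES grammar, and the names `tokenize_smiles`, `multi_hot`, and `LABEL_SPACE`, as well as the three class names, are hypothetical.

```python
import re

# Simplified SMILES token pattern (an assumption for illustration; it does not
# cover the full SMILES grammar): bracket atoms, two-letter halogens,
# single-letter atoms, two-digit ring closures, bond/branch symbols, digits.
TOKEN_PATTERN = re.compile(
    r"\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnosp]|%\d{2}|[=#\-+\\/:().@]|\d"
)


def tokenize_smiles(smiles):
    """Split a SMILES string into atom, bond, branch, and ring-closure tokens."""
    tokens = TOKEN_PATTERN.findall(smiles)
    # Refuse input containing characters the simplified pattern does not know,
    # instead of silently dropping them.
    if "".join(tokens) != smiles:
        raise ValueError(f"unsupported characters in SMILES: {smiles!r}")
    return tokens


def multi_hot(labels, label_space):
    """Encode a set of class labels as a 0/1 vector over a fixed label space."""
    active = set(labels)
    return [1 if name in active else 0 for name in label_space]


# Hypothetical label space; a real ontology has far more classes.
LABEL_SPACE = ["carboxylic acid", "alcohol", "aromatic compound"]

tokens = tokenize_smiles("CC(=O)O")  # acetic acid
target = multi_hot(["carboxylic acid"], LABEL_SPACE)
```

In this framing the token sequence plays the role of a sentence, and each position in the target vector answers an independent yes/no question about class membership, so a compound may belong to several classes at once.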
