Profile Summary
I am a software engineer and data scientist with an interdisciplinary background covering computational and experimental molecular biology as well as software engineering. Previous projects include: attempting to launch a start-up on multiplexed antibody quantification, predicting compound activity (QSAR) and drug mode of action using machine learning, using deep learning for representation learning in cell painting experiments (JUMP-CP consortium), and image processing and quantification to study gene regulation in yeast.
Career and Education
Since 2023/11: Data scientist at Bayer AG, Computational Life Sciences unit.
2023/02 - 2023/08: Data scientist and software engineer at HMS Analytical Software.
2022/03 - 2023/01: Project co-lead at "Seromux" biotech spin-off
- Development of a business plan and an R&D concept to acquire funding.
- Reached 2nd stage of the federal “EXIST” tech transfer funding scheme.
- Data analysis for a novel multiplexed antibody quantification technology.
2021/02 - 2022/01: Data scientist at Bayer AG, one-year Life Science Collaboration project
- Developed and deployed a machine learning tool to predict compound activity.
- Integrated explainable AI tools to interpret model predictions.
- Used self-supervised representation learning with convolutional neural networks to derive embeddings from cell painting imaging data.
2017 - 2020: Postdoc on predicting drug mode of action (joint EIPOD between EMBL Heidelberg and Stanford)
- Assessed and optimized machine learning algorithms powered by chemical genetics data to predict drug mode of action in E. coli. Identified mode-of-action-specific fingerprint genes.
- Design and analysis of follow-up experiments (thermal proteome profiling, imaging, cell biological assays).
2013 - 2017: PhD project studying gene regulation by quantitative imaging
- Used high-throughput yeast genetics and quantitative high-throughput microscopy to study antisense RNAs in yeast.
- Integration of genomics and transcriptomics data revealed features of antisense-regulated genes. Identification of a new gene regulation mechanism.
2011 - 2012: Researcher, University of Vienna
- Bioinformatic and experimental studies on repeats in human gene expression.
2010 - 2011: MSc in Molecular Medicine, Imperial College London
- Master's thesis on retrotransposons in aspergillosis, distinction.
2006 - 2009: BSc in Molecular Biology, University of Vienna
- Bachelor's thesis on Draxin in neurogenesis, distinction.
Skills
Statistics and data analysis
- Machine learning: Supervised: decision trees, deep neural networks, convolutional neural networks, network architectures, regularisation procedures, correct tuning and evaluation of models, performance metrics. Unsupervised: standard dimensionality reduction and clustering algorithms. Explainability and uncertainty metrics.
- Statistics: probability theory, hypothesis testing, parameter estimates, uni- and multivariate regression methods, PCA, enrichment analysis.
- Image processing and computer vision: image postprocessing techniques and algorithms, segmentation, CNNs, image quantification. Deep understanding of fluorescence microscopy.
Programming
- R: in-depth, >10 years of experience, e.g. tidyverse, Bioconductor, mlr, Shiny.
- Python: advanced, 5 years of experience. Standard library and SciPy stack (e.g. NumPy, Pandas, sklearn, skimage). PyTorch for deep learning. RDKit for cheminformatics.
- Experienced with Git, Linux, Bash, and working in HPC environments.
- Current main interest: Usage of DevOps/MLOps tools and cloud technologies for efficient data science project management and model deployment.
Life sciences
- High-throughput biology: Design and analysis of high-throughput screens, analysis and integration of genomics data. Expert in fluorescence microscopy.
- Mechanistic studies: Expert in the design, execution, and analysis of experiments to elucidate molecular mechanisms. Genetic engineering and biochemistry techniques.
- Deep understanding of gene regulation, mode of action, and microbiology.
Social skills and community building
- R and machine learning trainer at Uni Heidelberg and in “Data Carpentry” workshops.
- Co-founder and former coordinator of “emblr”, an R user group at EMBL.
Publications
First or shared first authorship
Huber, F., Bassler, S., Dubois, L., Knopp, M., Mateus, A., Savitskii, M.M., Zeller, G., and Typas, A. Predicting drug mode of action using machine learning and chemical genetics reveals thiolutin as a cell wall damaging agent in Escherichia coli. In preparation.