Machine Learning and Data Mining

Andreas Maunz received his diploma (German equivalent to M. Sc.) in computer science from Albert-Ludwigs University Freiburg in 2007, and received his doctoral degree from Technical University Munich in 2013. He is a principal scientist in research and early development at Roche's main site in Basel (CH).

Data Mining (clustering, pattern detection, signal processing), Machine Learning (gradient boosting, deep learning), and Inferential Statistics (parametric models of biological processes with quantification of uncertainty). Clinical trial data, biostatistics, data-driven workflows for quantitative assessment of data quality.

Practical applications:
Machine Learning on images and image-derived data (deep learning), image analysis (processing, graph cut methods, kernels), advanced statistical models and simulations, e.g. mixed-effects models for hierarchical time series data in longitudinal studies, and analysis of mobile sensor data.

Technical Platforms: Long-term experience in R (RStudio, R Shiny, package authoring, in-depth knowledge of all major statistical packages), Python (Keras, TensorFlow, NumPy), as well as C/C++, JavaScript (D3), Java (IntelliJ, Maven, Java 8), SQL (Oracle) and document DBs (MongoDB). Systems level: Linux (shells, pipelines, POSIX), high performance computing (Slurm).


  • 2018: SABS Graduate Program, Oxford University: “Deep Learning in the Life Sciences”


  • 2011: Article in Special Issue of “Machine Learning” (Springer-Verlag), see publications
  • 2007: Ranked among the top third of Freiburg CS graduates.
  • 2003 – 2007: Scholarship granted by Hans-Böckler-Stiftung.

Research Project Activities

International research projects:

  • 2011 – 2013: Toxicological Risk Assessment in the food industry.
    Research collaboration with Nestlé SA (Vevey, Switzerland) for structure-based modeling of chronic toxicity and carcinogenicity (TD50).
    Activity: Responsible for leading the complete modeling in the project using web technology and statistical learners in R.
  • 2008 – 2011: Modelling of adverse chemical properties in pre-clinical screening.
    Research Project “OpenTox” of the Seventh Framework Programme of the European Union – development of an interoperable programming framework for predictive toxicology.
    Activity: design, testing and implementation of data mining and prediction modules using Ruby web services and diverse modelling algorithms.
  • 2005 – 2010: Research Project “Sens-it-iv” of the Seventh Framework Programme of the European Union – development of “in vitro” alternatives to animal testing in the risk assessment of allergy pathogens.
    Activity: implementation and operation of an inductive database for experimental data, using Ruby on Rails and statistical toolboxes.

National research projects:

  • 2011 – 2013: Formation of chemical categories for the prediction of repeated-dose toxicity: Project “REACH” of the German Federal Ministry of Education and Research
    Activity: Implementation of a statistical toolbox using R.


  • Machine Learning Journal (2014)
  • KDD Conference (2014, 2011), ECML/PKDD Conference (2010)
  • Journal of Intelligent Information Systems (2012)
  • Journal of Chemical Information and Modelling (2011)
  • ICDM Conference (2010)


  • LibBBRC for mining backbone refinement class representatives
  • LibLAST for latent structure pattern mining
  • Ready-to-use appliance with both BBRC and LAST, based on VirtualBox.
  • Multi-View Clustering (Bickel and Scheffer, ICDM 2004) in R (also available from CRAN).

Conference Posters

  • J. Sahni, A. Maunz, F. Arcadu, YP. Zhang, Y. Li, T. Albrecht, A. Thalhammer, F. Benmansour. „Machine Learning Approach to Predict Response to anti-VEGF Treatment in Patients with Neovascular Age-Related Macular Degeneration using SD-OCT“, ARVO Annual Meeting, Vancouver, Canada (April 28 – May 02, 2019).
  • F. Arcadu, F. Benmansour, A. Maunz, J. Willis, M. Prunotto, Z. Haskova. „Deep Learning Algorithm for Patient-Level Prediction of Diabetic Retinopathy Response to Vascular Endothelial Growth Factor Inhibition“, ARVO Annual Meeting, Vancouver, Canada (April 28 – May 02, 2019).
  • F. Arcadu, J. Willis, A. Maunz, J. Michon, Z. Haskova, F. Benmansour. „Deep Learning Predicts OCT Measures of Diabetic Macular Thickening From Color Fundus Photographs“, ARVO Annual Meeting, Vancouver, Canada (April 28 – May 02, 2019).
  • F. Arcadu, F. Benmansour, A. Maunz, J. Michon, Z. Haskova, D. McClintock, J. Willis, M. Prunotto. „Automated Image Quality Evaluation of Color Fundus Photographs Using Deep Learning Architecture“, ARVO Imaging in the Eye Conference 2018, Honolulu (HI), USA (April 28, 2018).
  • Maunz A, Ulrich E, Sternberger L, Blumenroehr C. „Custom Scientific Visualizations in TIBCO Spotfire for Better Informed Decisions“, 15th Annual Bio-IT World Conference & Expo, Boston, USA, April 5-7, 2016.
  • Amrein K, Vercruysse M, Prunotto M, Wolf L, Schmucki R, Racek T, Clausen I, Blum Marti R, Araujo Del Rosario A, Benmansour F, Maunz A, Jensen Zoffmann S. „Mode of action characterization of antibiotics using expression profile fingerprint“, Modern Phenotypic drug development, the path forward (Keystone Symposium), Big Sky, Montana, USA, April 2—6, 2016.
  • Maier A, Dhar S, Maunz A, Zeitouni B, Peille A-L, Giesemann T, Fiebig HH. „High-throughput analysis of 3D tumor colony formation of primary cell suspensions derived from xenografts to identify efficacy of anti-tumor agents in single agent or combination therapy“, 27th AACR-NCI-EORTC International Conference on Molecular Targets and Cancer Therapeutics, Boston, MA, USA, November 5-9, 2015, abstract B70.
  • Maunz A, Gilsdorf M, Fournier S, Blumenröhr C, Horstmöller R, Schmiedle J. „Visual Analysis of Gene Interaction Networks using Hive Plots“, VIZBI 2015, The 6th International Meeting on Visualizing Biological Data, Broad Institute of MIT and Harvard (March 25-27 2015).
  • Vorgrimmler D, Rautenberg M, Gütlein M, Maunz A, Gebele D, and Helma C. „Lazar – A Modular Predictive Toxicology Framework“, OpenTox Euro 2013 – Innovation in Predictive Toxicology, Johannes Gutenberg University of Mainz (30 September – 2 October 2013).

Conference Talks

  • F. Arcadu, F. Benmansour, A. Maunz, J. Willis, M. Prunotto, Z. Haskova. „Deep Learning Algorithm to Predict Diabetic Retinopathy Progression on the Individual Patient Level“, ARVO Annual Meeting, Vancouver, Canada (April 28 – May 02, 2019).
  • A. Adamis, F. Arcadu, F. Benmansour, A. Maunz, J. Michon, Z. Haskova, D. McClintock, M. Prunotto, J. Willis. „Deep Learning Predicts OCT Measures of Diabetic Macular Thickening From Color Fundus Photographs“, 42nd Annual Macula Society Meeting, Bonita Springs (FL), USA (February 13-16, 2019).
  • S. Zoffmann, A. Maunz, L. Wolf, F. Benmansour, M. Vercruysse, K. Amrein, M. Burcin, M. Prunotto. „Machine Learning Powered Antibiotics Phenotypic Drug Discovery“, Keystone Symposium on Phenotypic Drug Discovery, Breckenridge (CO), USA (March 03-07, 2019).
  • Maunz A, Wolf L, Jensen Zoffmann S, Prunotto M, Amrein K, Vercruysse M, Blum Marti R, Zhu S, Ding H, Benmansour F. „Automating a high content screening assay“, 15th Annual Bio-IT World Conference & Expo, Boston, USA, April 5-7, 2016.
  • Helma C, Maunz A. „Vorstellung des Tools Lazar (Lazy Structure-Activity Relationships)“ [Introducing the Lazar tool], Read-Across und Grouping zur Füllung von Datenlücken unter REACH [Read-across and grouping for filling data gaps under REACH], DGPT Annual Meeting 2013, Halle/Saale (March 5 – 7, 2013).
  • A. Maunz, C. Helma, and S. Kramer: Large Scale Graph Mining using Backbone Refinement Classes. In 7th International Workshop on Mining and Learning with Graphs, Leuven, Belgium (02-04 July 2009).
    The submitted abstract was one of eight that were selected for oral presentation (21 accepted in total) at the conference. Following the conference, the work was selected for publication in the joint MLG/SRL/ILP workshop special issue of the Machine Learning Journal (among five other works out of 83).
  • Maunz A, Helma C. „New Lazar Developments“, presented at the eCheminfo Community of Practice InterAction Meeting, Autumn 2008, Bryn Mawr College, Philadelphia (13-17 October 2008).
  • Maunz A: „Instance-based Regression Models for Quantitative Biological Activities using Support Vector Machines and Multilinear Models“, presented at the Scarlet Workshop on in silico methods for carcinogenicity and mutagenicity, Milano (April 2008).


  • F. Arcadu, F. Benmansour, A. Maunz, J. Willis, Z. Haskova, M. Prunotto. “Deep learning algorithm predicts diabetic retinopathy progression in individual patients”. npj Digital Medicine 2, 92, Sep 2019.
  • F. Arcadu, J. Willis, A. Maunz, J. Michon, Z. Haskova, M. Prunotto, F. Benmansour. “Deep Learning Predicts OCT Measures of Diabetic Macular Thickening From Color Fundus Photographs”. Investigative Ophthalmology and Visual Science, 60:852–857, Mar 2019.
  • S. Zoffmann, M. Vercruysse, F. Benmansour, A. Maunz, L. Wolf, R. Blum Marti, T. Heckel, H. Ding, H. Truong, M. Prummer, R. Schmucki, C. Mason, K. Bradley, A. I. Jacob, C. Lerner, A. Araujo del Rosario, M. Burcin, K. Amrein, and M. Prunotto. “Machine Learning-Powered Antibiotics Phenotypic Drug Discovery”. Nature Scientific Reports, 9:5013, 2019.
  • A. Moisan, M. Gubler, J. D. Zhang, Y. Tessier, K. Dumong Erichsen, S. Sewing, R. Gerard, B. Avignon, S. Huber, F. Benmansour, X. Chen, R. Villasenor, A. Braendli-Baiocco, M. Festag, A. Maunz, T. Singer, F. Schuler, and A. B. Roth. “Inhibition of EGF Uptake by Nephrotoxic Antisense Drugs In Vitro and Implications for Preclinical Safety Profiling”. Molecular Therapy Nucleic Acids, 6:89–105, Mar 2017.
  • Batke M, Gütlein M, Partosch F, Gundert-Remy U, Helma C, Kramer S, Maunz A, Seeland M, and Bitsch A. “Innovative Strategies to Develop Chemical Categories using a Combination of Structural and Toxicological Properties” Frontiers in Pharmacology, 7:321, 2016.
  • Lo Piparo E, Maunz A, Helma C, Vorgrimmler D, Schilter B. (2014) “Automated and Reproducible Read-Across like Models for Predicting Carcinogenic Potency” Regulatory Toxicology and Pharmacology 70(1).
  • Seeland M, Maunz A, Karwath A, Kramer S. “Extracting Information from Support Vector Machines for Pattern-Based Classification” Proceedings of the 29th Symposium On Applied Computing, SAC ’14, pages 129–135, New York, NY, USA, 2014. ACM. [pdf, bib]
  • Maunz A, Vorgrimmler D, Helma C. “Out-of-Bag Discriminative Graph Mining” Proceedings of the 28th Symposium On Applied Computing, SAC ’13, 109–114, New York, NY, USA, 2013. ACM. [pdf, bib]
  • Maunz A, Gütlein M, Rautenberg M, Vorgrimmler D, Gebele D, Helma C. “Lazar: A Modular Predictive Toxicology Framework” Frontiers in Pharmacology 4:38, 2013. [pdf]
  • Batke M, Bitsch A, Gundert-Remy U, Guetlein M, Helma Ch, Kramer S, Maunz A, Partosch F, Seeland M, Stahlmann R. “New Strategies to develop Chemical Categories in the Context of REACH-Work in progress” Toxicology Letters, 221:84, 2013.
  • Maunz A, Helma C, and Kramer S. “Efficient Mining for Structurally Diverse Subgraph Patterns in Large Molecular Databases” Machine Learning, 83:2, 193-218, Springer Netherlands, 2011. [pdf]
  • Maunz A, Helma C, Cramer T, and Kramer S. “Latent Structure Pattern Mining” ECML/PKDD 2010: Machine Learning and Knowledge Discovery in Databases, 6322, 353-368, Springer Berlin / Heidelberg, 2011. [pdf, poster pdf]
  • Suenderhauf, C, Hammann, F, Maunz, A, Helma, C, and Huwyler, J. “Combinatorial QSAR Modeling of Human Intestinal Absorption” Molecular Pharmaceutics, 8(1):213-224, 2011.
  • Hardy, B, Douglas, N, Helma, C, Rautenberg, M, Jeliazkova, N, Jeliazkov, V, Nikolova, I, Benigni, R, Tcheremenskaia, O, Kramer, S, Girschick, T, Buchwald, F, Wicker, J, Karwath, A, Gütlein, M, Maunz, A, Sarimveis, H, Melagraki, G, Afantitis, A, Sopasakis, P, Gallagher, D, Poroikov, V, Filimonov, D, Zakharov, A, Langunin, A, Gloriozova, T, Novikov, S, Skvortsova, N, Druzhilovsky, D, Chawla, S, Gosh, I, Ray, S, Patel, H, and Escher, S. “Collaborative Development of Predictive Toxicology Applications” Journal of Cheminformatics, 2:7, 2010. [pdf]
  • Maunz A, Helma C, and Kramer S (2009). “Large Scale Graph Mining using Backbone Refinement Classes” KDD ’09: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 617-626, New York, NY, USA, ACM. [pdf]
    Video lecture: slides 1-3.exe, 4-6.exe, 7-10.exe, 11-15.exe, 16-19.exe, 20-23.exe or view the conference recordings.
  • Hammann F, Gutmann H, Jecklin U, Maunz A, Helma C, and Drewe J. “Development of Decision Tree Models for Substrates, Inhibitors, and Inducers of p-Glycoprotein” Current Drug Metabolism, 10:4, 339-346, 2009.
  • Maunz A and Helma C. “Prediction of Toxic Effects of Pharmaceutical Agents” Pharmaceutical Data Mining: Approaches and Applications for Drug Discovery, ed. by Konstantin V. Balakin, Sean Ekins. Wiley, New York, NY, USA, chap. 5, pp. 145-176, 2009. [abstract]
  • Maunz A and Helma C. “Prediction of Chemical Toxicity with Local Support Vector Regression and Activity-Specific Kernels” SAR and QSAR in Environmental Research, 19(5-6):413-431, 2008. [pdf]

Unpublished Material

  • Maunz A: “A Quantitative Extension to the Lazar Algorithm for the Prediction of Chemical Properties.” [pdf diplomarbeit.pdf].
  • Maunz A: “On the Number of Backbone Refinement Classes”, derivation and proof of a formula for counting the number of BBRCs in a perfect binary tree. Comparison to the total number of subtrees [pdf bbrc-no.pdf].
  • Maunz A: “On the Co-Occurrence and Diversity of Backbone Refinement Classes”, Euclidean embedding of BBRC features and instances in 2D as well as feature similarity akin to the ORIGAMI approach [pdf bbrc-rep.pdf].
  • Maunz A: “Support Vectors and the Margin in a Nutshell”, brief outline of support vector theory and the concept of margin (see ch. 1, 2, and 7 of Schölkopf and Smola, 2002) [pdf sv-margin.pdf].

Selected Talks and Tutorials

  • A. Maunz, C. Helma, and S. Kramer: Large Scale Graph Mining using Backbone Refinement Classes. In 7th International Workshop on Mining and Learning with Graphs, Leuven, Belgium (02-04 July 2009). The submitted abstract was one of eight (out of 21) that were selected for oral presentation at the conference. Following the conference, the work was selected for publication in the joint MLG/SRL/ILP workshop special issue of the Machine Learning Journal (among five other works out of 83).

Short Scientific CV

I studied computer science in Freiburg from 2001 to 2007 and graduated in June 2007. During this time I worked as an assistant at the Machine Learning Lab. My thesis addressed the quantitative prediction of toxicological properties of small molecules (e.g., carcinogenicity, mutagenicity).

In 2009, I became a Ph.D. student at the Bioinformatics Department I/12 at Technical University Munich, chair of Prof. Dr. Stefan Kramer. During this phase I published more than ten technical articles in conferences and journals, including the two largest international data mining conferences (KDD, ECML/PKDD) and a special issue of the Machine Learning Journal (Springer-Verlag).

From 2011 to 2012, I led a research collaboration with Nestlé SA for structure-based prediction of chronic toxicity and carcinogenicity, which gave me experience in the food industry. The resulting statistical models are used in production by Nestlé for establishing levels of safety concern, and for regulatory purposes with the Swiss authorities. They were also made publicly available as web services. The work was published in 2014 in Regulatory Toxicology and Pharmacology.

In December 2013 I received my doctoral degree from Technical University Munich for my thesis “Graph Mining methods for Predictive Toxicology” with grade “magna cum laude” (very good). During 2013 and throughout most of 2014 I worked at Oncotest GmbH (Freiburg) on the computer-based synergy determination of drug combinations in preclinical screenings of anti-cancer agents, on visualization in Data Mining workflows, and on Data Mining interfaces for mutational data at.

In November 2014, I joined Hoffmann-La Roche AG (Basel) as a scientist in their Pharma Research and Early Development Informatics division. My projects at their main site in Basel include drug-project work, establishing a JavaScript programming framework in TIBCO Spotfire, development of D3-based visualizations, and Data Mining and Machine Learning on high-content screening data. I was promoted to senior scientist in October 2016.


  • Related to Publications:
    • “Virtuelle Versuchskaninchen”, on Deutschlandradio about Lazar [html].
    • “Mehr Tierversuche durch Chemikalienrichtlinie Reach”, TV report on 3sat about animal testing and alternative test methods in the EU, featuring an interview with Christoph Helma [html].
  • Other material:
    • “Machine Learning”, course by Andrew Ng at Stanford University, 2009 [html].
    • “Graph Mining and Graph Kernels”, video lecture by Karsten Borgwardt and Xifeng Yan at KDD ’08 [html].
