Friday, 2nd October – Bioinformatics for Lipidomics – Online Symposium

The objective of this symposium is to consolidate the bioinformatics community within the lipidomics field and to provide a platform for interaction and exchange between internationally acknowledged scientists and young researchers. The format uses 5-minute flash talks followed by 5-minute discussions. The symposium will close with an open discussion round, moderated by the organizers and session chairs.

Please see the program page for a detailed schedule.

Abstracts


Session Speaker Title Abstract
Keynote Steffen Neumann The metaRbolomics Toolbox in Bioconductor and beyond Metabolomics aims to measure and characterise the complex composition
of metabolites in a biological system using analytical techniques such
as mass spectrometry and nuclear magnetic resonance spectroscopy. The
scientific community has developed a wide range of open source
software, providing freely available advanced processing and analysis
approaches. The programming and statistics environment R has emerged as
one of the most popular environments to process and analyse
metabolomics datasets. A major benefit of such an environment is the
possibility of connecting different tools into more complex workflows.
We provide an extensive overview of existing packages in R for
different steps in a typical computational metabolomics workflow,
including data processing, biostatistics, metabolite annotation and
identification, and biochemical network and pathway analysis.
Multifunctional workflows, possible user interfaces and integration
into workflow management systems are also covered. We also address the
Findability, Accessibility, Interoperability and Reusability of
software, and how these can be improved through repositories and
semantic annotation. We now maintain the resource as an Open Source
book with Continuous Integration as part of the RforMassSpectrometry
initiative, which aims to provide efficient, thoroughly documented,
tested and flexible R software for the analysis and interpretation of
high throughput mass spectrometry data.
Databases, Ontologies and Online Resources Alan Bridge SwissLipids, a knowledge resource for lipid biology

Lucila Aimo, Robin Liechti, Nevila Hyka-Nouspikel, Lou Götz, Anne Niknejad, Anne Gleizes, Dmitry Kuznetsov, Fabrice P.A. David, Gisou van der Goot, Howard Riezman, Ioannis Xenarios, Elisabeth Coudert, Alan Bridge

SwissLipids (www.swisslipids.org) is an expert curated knowledge resource for lipids and their biology that includes more than 590,000 known and theoretical lipid structures belonging to over 500 lipid classes, each enriched with information on lipid components, reactions and enzymes, with supporting links to primary literature. All lipids are mapped to the widely used chemical ontology ChEBI (www.ebi.ac.uk/chebi/), while lipid reactions are described using the Rhea knowledgebase of biochemical reactions (www.rhea-db.org) (itself based on ChEBI) and enzymes using proteins from the UniProt knowledgebase UniProtKB (www.uniprot.org).

SwissLipids features a hierarchical classification based on the shorthand notation for mass spectrometry (PMID: 23549332) that maps lipidomics data to structures and biological knowledge, like this:

PC(38:4) -> PC(18:0_20:4) -> PC(18:0/20:4) -> PC(18:0/20:4(5Z,8Z,11Z,14Z)) -> PLA2G4A
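
The hierarchy above moves from the least to the most detailed structural level. As a minimal sketch of the reverse direction, the following Python collapses a more detailed shorthand name back to its species-level (sum composition) name. The function name `sum_composition` is hypothetical, the parsing covers only simple names of the form shown above, and none of this is SwissLipids code.

```python
import re

# Matches one "carbons:double_bonds" chain descriptor, e.g. "18:0".
CHAIN = re.compile(r"(\d+):(\d+)")


def sum_composition(name: str) -> str:
    """Collapse e.g. 'PC(18:0_20:4)' or 'PC(18:0/20:4)' to 'PC(38:4)'.

    Hypothetical helper for illustration; assumes names shaped like
    CLASS(chain[sep chain ...]) with optional double-bond positions.
    """
    head, _, rest = name.partition("(")
    inner = rest[:-1]  # drop the closing parenthesis of the class bracket
    # Remove double-bond position annotations such as (5Z,8Z,11Z,14Z).
    inner = re.sub(r"\([^()]*\)", "", inner)
    chains = CHAIN.findall(inner)
    carbons = sum(int(c) for c, _ in chains)
    double_bonds = sum(int(d) for _, d in chains)
    return f"{head}({carbons}:{double_bonds})"


for level in ("PC(18:0_20:4)", "PC(18:0/20:4)", "PC(18:0/20:4(5Z,8Z,11Z,14Z))"):
    print(level, "->", sum_composition(level))
```

All three detailed names map to the same species-level entry, which is what lets measurements reported at different levels of detail meet in one hierarchy.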

Users can browse the lipid hierarchy, as well as search for lipids (by lipid name, abbreviation, formula, SMILES, InChIKey, mass) and enzymes (using gene names and UniProtKB accessions), or use the ID mapping tool to map identifiers from reference resources like LIPID MAPS and HMDB to their corresponding structures in SwissLipids, reactions from Rhea, and enzymes from UniProtKB. Our REST API provides programmatic access to search, browse and ID mapping tools, and all data is freely available for download and reuse under a CC-BY-4.0 license.

Current work focuses on improving the coverage and annotation of SwissLipids, using enzyme knowledge from UniProtKB (also using Rhea, see PMID: 31688925) as the basis to build and annotate more extensive lipid libraries, eventually covering all taxa in UniProtKB.

Databases, Ontologies and Online Resources Valerie O’Donnell LIPID MAPS LIPID MAPS supports bioinformatics analysis of lipidomics datasets through several mechanisms. We provide a suite of databases, including manually curated and computationally generated software. Tools for analysing lipidomics data online include LipidFinder and BioPAN. LipidFinder is designed to clean up high resolution MS data (first analysed using XCMS) so that only real lipids are retained. BioPAN is a tool that interrogates your data to predict which genes may be involved in regulating lipids in biological samples, and is interfaced with LipidLynx. LIPID MAPS also curates new lipid structures, and organises these into an international classification system. Shorthand nomenclature has been recently updated to specifically support MS lipidomics analysis. Guidelines for lipidomics data are being developed in collaboration with LSI.
Visualization and Exploration Michelle Hill An open source software for analysis of lipidomics experiments – lipidr

Ahmed Mohamed, Jeffrey Molendijk, Michelle M Hill

This talk will present lipidr, an open source R/Bioconductor package for comprehensive analysis of lipidomics experiments. Input includes both targeted and untargeted lipidomics datasets, with univariate and multivariate analyses available to determine the differences in lipid class, chain length, total unsaturation levels and lipid set enrichment in specified experimental groups. Outputs include tables, graphs and interactive visualization. A companion online guide with two example datasets is available at https://www.lipidr.org/.

Visualization and Exploration Bo Burla Supervised Data Processing and Quality Control Workflows in MS-based Lipidomics and the R/Shiny toolbox MIDAR Post-processing and quality control of metabolomics/lipidomics datasets follow certain assumptions and can be a source of variability, artefacts and errors affecting downstream data interpretation. Insufficient documentation of data processing methods and differences in reporting/sharing of datasets further impact comparability of lipidomics results and their value for the community. We here present our modular data processing and QA/QC workflow and an accompanying software pipeline, MIDAR, implemented as an R package and a Shiny app. We show some examples of how this pipeline is used in our lab for supervised, automated processing of established large-scale assays, as well as in method optimization and development. We hope that this workflow and pipeline can contribute to improved lipidomics data reproducibility and quality.
Visualization and Exploration Dominik Schwudke Systematic Lipidome Comparisons Applying a Homology Concept
Data Integration and Applications Nils Hoffmann FAIR Data Standards and Workflow Interoperability in metabolomics and lipidomics – the mzTab-M data exchange format Working within the consortia of the Metabolomics Standards Initiative, Proteomics Standards Initiative, and the Metabolomics Society, we published mzTab for Metabolomics (mzTab-M) in 2019 as a common output format from analytical approaches using MS on small molecules. The format reports final quantification values from analyses, as well as the evidence trail in terms of features measured directly from MS and different types of approaches used to identify molecules.
Data Integration and Applications Michael Witting Lipids, pathways and genome-scale metabolic models – Many gaps to fill… Lipidomics technologies enable the high-throughput, quantitative analysis of several hundreds to thousands of lipids in a short time. While the analytical methods are very advanced and continue to improve, analysis of the obtained results in the context of biology is still a manual and sometimes tedious process. One possibility for improved, (semi)automatic analysis is the representation of lipidomics data in the context of biochemical pathways or genome-scale metabolic models (GSMs). However, these are often not on par with the detail of the lipidomic analysis. Therefore, new approaches are required. In this presentation, a short overview of current shortcomings of GSMs with regard to lipid metabolism, together with potential solutions, is presented.
Data Integration and Applications Robert Ahrends Lipidomics Informatics for Life Science
Data Integration and Applications Tim Rose Embedding lipidomics into the omics landscape Multi-level omics data is becoming available for an increasing number
of experiments. This requires dedicated computational tools that are able
to integrate and mine heterogeneous types of data and produce
insights going beyond the separate analysis of each data set. Such
tools usually require databases, with functional interactions, such as
biological networks. While methods are already available and used in
other omics disciplines, they are only partially applicable to
lipidomics data. In this talk, existing available databases and
computational tools in the field of lipidomics are discussed with
Data Integration and Applications Dominik Kopczynski Goslin: A Grammar of Succinct Lipid Nomenclature Goslin is a polyglot grammar for normalizing common lipid nomenclatures such as the LIPID MAPS, SwissLipids, and HMDB nomenclatures and the shorthand nomenclature by Liebisch and coauthors. Goslin's key features are (1) simplifying the implementation of lipid name handling for developers of mass spectrometry-based lipidomics tools, (2) offering a tool that unifies the main existing lipid name dialects, enabling lipidomics analysis in a high-throughput fashion, and (3) providing a consistent mapping from lipid shorthand names to lipid building blocks and structural properties. We provide implementations of Goslin in four major programming languages, namely, C++, Java, Python 3, and R for rapid adoption and integration. For direct usage or automated pipelines, we set up a web service with an easy-to-use API. All implementations are available free of charge under a permissive open source license.
Identification and Quantification Tools Jürgen Hartler Lipid Data Analyzer: Automated Lipid Species Annotation by Decision Rule Sets Technological advances in mass spectrometry now make broad-based quantitative lipid profiling possible. Automated annotation of this wealth of data by computational means is challenging, due to the structural diversity of lipids and the presence of isomeric/isobaric lipid species and fragments. We present a reliable tool for structural annotation and quantitation of lipid species that is based on decision rule sets. Decision rule sets are a reflection of an MSn interpretation process by a trained expert, consisting of the definition of fragments, their intensity relations, and rules for differentiating between potential isomeric/isobaric candidates. Besides the reliability in species annotation, this concept allows for utmost flexibility, such as independence of MS platforms and detection of coeluting molecular lipid species, and it covers possible variations of the lipid base structure, e.g., number of hydroxylation sites on sphingolipids. Further, the software can easily be extended to include additional lipid categories, subclasses and adducts.
Identification and Quantification Tools Hiroshi Tsugawa Elucidating the diversity of lipid structures by computational mass spectrometry Lipids are extremely diverse molecules, and this is evidenced by the curation of >40,000 structures in LIPID MAPS. To grasp the diversity, liquid chromatography tandem mass spectrometry (LC-MS/MS) and ion mobility MS/MS are the gold-standard techniques. Importantly, the full potential of lipidomics is only realized by the advances in computational mass spectrometry (CompMS) aiming at the conversion of raw MS data into molecular structures. Here, we recently established an untargeted lipidomics program MS-DIAL 4 (http://prime.psc.riken.jp/) untangling lipid mass fragmentations of 117 lipid subclass categories. In addition, we created a comprehensive database of retention time and collision cross section (CCS) based on machine learning to make an accurate atlas of lipids (FDR of approximately 1 to 2%). In this 5-min talk, the standardized lipidomics procedure containing raw data import, peak picking, annotation, nomenclature, semi-quantitative definitions, and mzTab-M export is introduced to enhance the harmonization of lipidomics data across laboratories.
Identification and Quantification Tools Oliver Alka DIAMetAlyzer: Automated, false-discovery rate controlled analysis for data-independent acquisition in metabolomics The extraction of meaningful biological knowledge from high-throughput mass spectrometry relies on accurate analyte identification, limiting false discoveries to a manageable amount. We present a fully automated open-source workflow for high-throughput metabolomics that combines data-dependent and data-independent acquisition for library generation, analysis and statistical validation, with control of the false-discovery rate while matching manual analysis with regard to quantification accuracy.
Keynote Sebastian Böcker SIRIUS, CANOPUS and COSMIC: Turning tandem mass spectra into metabolite structure information Liquid Chromatography Mass Spectrometry is a highly sensitive experimental platform for the analysis of metabolites and other small molecules. Unfortunately, structural elucidation of metabolites from tandem MS data remains highly challenging. I will present computational tools developed in my group for this task: SIRIUS 4 computes fragmentation trees and integrates CSI:FingerID for the annotation of tandem mass spectra with structures; ZODIAC refines the assignment of molecular formulas using complete LC-MS datasets; and the workflow COSMIC combines all of the above but also assigns confidence to structure annotations. Finally, CANOPUS tests and assigns thousands of compound classes to an unknown query compound based on its tandem MS data. All of these tools are best-of-class for the respective tasks. I will also report some biological results such as testing complete compound classes for fold change over hundreds of LC-MS runs, or the discovery of novel structures currently not contained in any structure database.
Identification and Quantification Tools Tobias Kind Lipid identifications with the LipidBlast in-silico MS/MS libraries

One of the central dilemmas in lipidomics is the identification of unknown lipids. Computer generated in-silico tandem mass spectral libraries have contributed to a large increase in lipid annotations in mammals, plants, bacteria and algae. Yet many challenges remain such as comprehensive adduct ion coverage, modelling of accurate collision voltages from QTOF and orbital ion trap instruments and the identification of oxidized lipids. We discuss some of the recent advances in LipidBlast MS/MS tandem mass spectral library matching that have allowed the detection of novel biologically important lipids such as BMPs, DMEDs and acyl-CoAs.

Identification and Quantification Tools Kai Schuhmann Accurate quantification of molecular lipid species by LipidXte In bottom-up shotgun lipidomics, molecular lipid species are quantified using specific fragments consistently detected in tandem mass spectra. Glycerophospholipids in particular are quantified by the carboxylate anion (CA) fragments produced from the fatty acid moieties of the precursor anions. Here, synthetic lipid standards whose fatty acid moieties differ in mass from the endogenous analytes serve as internal references for species of the same lipid class. However, their structural and resulting fragmentation differences can introduce strong biases to the MS/MS-based quantification, which can easily exceed 50%. We therefore developed the open source software LipidXte, which relies upon a generic and portable fragmentation model that harmonizes the abundances of CA fragments of Orbitrap HCD FT MS/MS originating from different sn-1/2 positions on the glycerol backbone, hydrocarbon chain length or the number and position of double bonds. The model is generic and independent of instrument settings (e.g. collision energy) and allows unbiased and accurate absolute (molar) quantification independently of employed internal standards, which is particularly important for the analysis of polyunsaturated lipids.
Identification and Quantification Tools Ronny Herzog Developing a GMP-certified lipid identification software Applying lipidomics in a regulated environment such as Good Manufacturing Practice (GMP) not only poses analytical challenges with regard to method validation, but also puts high demands on the software used for lipid identification. In order to meet these requirements, we have successfully implemented GAMP5 recommendations for software development and validation for LipotypeXplorer. These include system requirements specifications, architectural specifications, tests for each specification level and comprehensive validation tests as well as audit trails and change control processes. We provide an overview of these measures, which enabled the use of LipotypeXplorer as an integral part of GMP-certified lipidomics assays.
Identification and Quantification Tools Ni Zhixu From high throughput Lipid profiling to data integration in systems level LipidHunter2, a software for high-throughput lipid identification, was updated in terms of lipid class coverage and now, in addition to PLs, supports identification of lysoPL, DG, TG, SM, Cer, and HexCer, providing the possibility of comprehensive lipidome description at the molecular species level. To support further conversion and matching of lipid annotations provided by LipidHunter or other lipid identification software to a unified shorthand notation system, we developed LipidLynxX, serving as a hub facilitating data flow from high-throughput lipidomics analysis to systems biology data integration. LipidLynxX allows direct linking of lipidomics datasets to multiple data integration resources including lipid ontology, pathway and network mapping. LipidHunter and LipidLynxX are flexible, customizable open-access tools freely available for download at https://github.com/SysMedOs.
Visualization and Exploration Ansgar Korf Multidimensional Kendrick Mass Plots: A graphical analysis tool for lipids Latest advances in high resolution mass spectrometry allow the mapping of complex lipidomes. In particular, the combination of multidimensional separation techniques, such as liquid chromatography and ion mobility, results in highly informative, yet even more challenging data sets to process. In this presentation, conventional Kendrick Mass Plot analysis is extended by characteristics of the above-mentioned separation techniques, such as retention time and collision cross section, to extract as much information as possible from lipidomics data sets. Multidimensional Kendrick Mass Plots are enabled within MetaboScape software to support overall lipid identification workflows and enable fast spotting of false positive annotations.
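
For readers unfamiliar with the conventional Kendrick Mass Plot that this talk extends, here is a minimal Python sketch of the underlying calculation. It assumes the common CH2 base unit and the sign convention "nominal Kendrick mass minus Kendrick mass"; other conventions exist. The function names are hypothetical and this is not MetaboScape code.

```python
CH2_EXACT = 14.01565   # monoisotopic mass of CH2 (12.00000 + 2 * 1.007825)
CH2_NOMINAL = 14.0     # CH2 is rescaled to exactly 14 on the Kendrick scale


def kendrick_mass(mz: float) -> float:
    """Rescale an m/z value so that each CH2 unit adds exactly 14."""
    return mz * CH2_NOMINAL / CH2_EXACT


def kendrick_mass_defect(mz: float) -> float:
    """Nominal Kendrick mass minus Kendrick mass (assumed sign convention)."""
    km = kendrick_mass(mz)
    return round(km) - km


# Illustrative m/z values only: the second differs from the first by one CH2.
mz_a = 760.5851
mz_b = mz_a + CH2_EXACT
print(kendrick_mass_defect(mz_a), kendrick_mass_defect(mz_b))
```

Members of a homologous series that differ only in the number of CH2 units share the same Kendrick mass defect, so they line up horizontally in the plot. Adding retention time or collision cross section as further axes, as described in the abstract, then separates species that the mass defect alone cannot.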
Keynote Christer Ejsing Functional lipidomics – a molecular perspective on cellular lipid biochemistry
Identification and Quantification Tools Bing Peng LipidCreator: A workbench to probe the lipidomic landscape We have developed LipidCreator, a workbench and knowledge base for the automated generation of targeted lipidomics MS assays and the in-silico spectral libraries for data evaluation. LipidCreator offers a comprehensive framework to compute MS/MS fragment masses for over 60 lipid classes, providing all functionalities needed to define fragments, manage stable isotope labeling, and optimize collision energy. Assay generation can be conducted with a graphical user interface (GUI) or by using command line functionality, making it accessible to both analytical researchers and bioinformaticians. LipidCreator can not only work standalone, but is also fully integratable into the small molecule system of Skyline (a well-known software for targeted proteomics), which allows vendor-independent assay usage, data visualization and quality control of MS and MS/MS data.
Identification and Quantification Tools Laura Goracci Tailoring bioinformatics tools to research needs: the Lipostar approach Lipidomics has emerged as an extremely powerful discipline not only to monitor the cellular functions of known lipids in health and disease, but also for biomarker discovery and for characterization of the epilipidome. Due to its versatility, lipidomics analysis can be performed by applying several analytical techniques and protocols, and more will come in the near future. In this context, a flexible and versatile architecture is required when bioinformatics and cheminformatics tools for lipidomics are designed. Lipostar, a vendor-neutral high-throughput software to support targeted and untargeted LC-MS lipidomics, was designed to cover a number of different research workflows, with easy cross-talk among different aspects of lipidomics analysis, including statistics, lipid identification, pathway analysis and quantification, and with the user being able to improve software capabilities by direct intervention. During this talk, a brief overview of the Lipostar approach will be provided.
Identification and Quantification Tools Jeremy Koelmel Millions of Possibilities: The Uncharted Waters of Redox Lipidomics Redox lipidomics, the comprehensive measurement of oxidized lipids, has significant implications across multiple fields. In clinical science, oxidative stress can occur due to numerous disease states, and hence oxidized lipids may indicate both mechanisms and biomarkers of early stages of biological stress. In food science, oxidized lipids are a by-product of cooking and affect the health, taste, and storage life of food. To date, little is known about the diversity of individual oxidized lipid species, and how they may contribute to health and disease. Employing liquid chromatography high-resolution tandem mass spectrometry (LC-HRMS/MS) and in-silico generation of oxidized lipids and their fragmentation patterns within our lipidomics software platform, LipidMatch Flow, we discover thousands of individual oxidized lipids across a diverse range of lipid classes in both biological samples and cooking oils. Identifications were benchmarked against LPPtiger, which uses more stringent identification criteria. Current progress and limitations in redox lipidomics employing LC-HRMS/MS will be discussed.
Identification and Quantification Tools Douglas McCloskey SmartPeak Technological advances in high-resolution mass spectrometry have vastly increased the number of samples that can be processed in a life science experiment, as well as the volume and complexity of the generated data. To address the bottleneck of high throughput data processing, we present SmartPeak (https://github.com/dmccloskey/SmartPeak2), an application that encapsulates advanced algorithms to enable fast, accurate, and automated processing of CE-, GC- and LC-MS(/MS) data, and HPLC data for targeted and semi-targeted metabolomics, lipidomics, and fluxomics experiments. The application allows for an approximately 100-fold reduction in the data processing time compared to manual processing while enhancing quality and reproducibility of the results.
Identification and Quantification Tools Justin van der Hooft Spec2Vec: Improved mass spectral similarity scoring through learning of structural relationships Spectral similarity is used as a proxy for structural similarity in many tandem mass spectrometry (MS/MS) based metabolomics analyses, such as library matching and molecular networking. This is based upon the assumption that spectral similarity resembles structural similarity. Although weaknesses in the relationship between common spectral similarity scores and the true structural similarities have been pointed out, little development of alternative scores has been undertaken. Here, I introduce Spec2Vec, a novel spectral similarity score inspired by a natural language processing algorithm, Word2Vec. Where Word2Vec learns relationships between words in sentences, Spec2Vec does so for mass fragments and neutral losses in MS/MS spectra. The spectral similarity score is based on spectral embeddings learnt from the fragmental relationships within a large set of spectral data. I will demonstrate the advantages of Spec2Vec in the key task of library matching. Finally, Spec2Vec is also computationally more scalable, allowing us to search for structural analogues in a large database within seconds.
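
To make the idea of an embedding-based similarity score concrete, the following Python sketch treats each peak as a "word", builds a spectrum vector as an intensity-weighted sum of word vectors, and compares spectra by cosine similarity. The word labels and the three-dimensional vectors are made up for illustration; in the approach described above they would be learnt from a large set of spectra. This is not the Spec2Vec implementation.

```python
import math

# Toy word vectors (assumption): hand-written, not learnt from data.
TOY_VECTORS = {
    "peak@184.07": [0.9, 0.1, 0.0],
    "peak@104.11": [0.8, 0.2, 0.1],
    "peak@86.10": [0.7, 0.3, 0.0],
    "peak@264.27": [0.0, 0.2, 0.9],
}


def embed(spectrum):
    """Intensity-weighted sum of word vectors for (word, intensity) pairs."""
    dim = len(next(iter(TOY_VECTORS.values())))
    total = [0.0] * dim
    for word, intensity in spectrum:
        vector = TOY_VECTORS.get(word)
        if vector is None:  # words without a vector are skipped in this sketch
            continue
        for i, value in enumerate(vector):
            total[i] += intensity * value
    return total


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


spec_a = [("peak@184.07", 1.0), ("peak@104.11", 0.3)]
spec_b = [("peak@184.07", 1.0), ("peak@86.10", 0.4)]
spec_c = [("peak@264.27", 1.0)]
print(cosine(embed(spec_a), embed(spec_b)), cosine(embed(spec_a), embed(spec_c)))
```

Because fragments that tend to co-occur get nearby vectors, two spectra can score as similar even when they share few identical peaks, which is the property that plain peak-matching scores lack.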
Visualization and Exploration Jennifer Kyle Lipid Mini-On: mining and ontology tool for enrichment analysis of lipidomic data Lipid Mini-On is an open-source tool that performs enrichment analyses and visualizations of lipidomics data. Lipid Mini-On uses a text-mining process to group individual lipid names into multiple lipid ontology groups based on the classification (e.g. LipidMaps) and other characteristics, such as chain length and number of double bonds as well as total acyl carbons and double bonds. Lipid classes can be added to customize the user’s database to remain current as novel lipid classes are discovered or unique samples are analyzed. The tool contains five statistical approaches (e.g., Fisher’s Exact, EASE score) for enrichment analyses, and visualization of results is available for all classification options. Results are also visualized through an editable network between the individual lipids and their associated lipid ontology terms. Lipid Mini-On is available as an R script and a Shiny app (https://omicstools.pnnl.gov/shiny/lipid-mini-on/).
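
As an illustration of one of the statistical approaches named above, the following Python sketch computes a one-sided Fisher's exact test for enrichment as a hypergeometric upper tail. It is a generic textbook formulation using only the standard library, not Lipid Mini-On's implementation; the function name, argument names and example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(hits_in_query: int, query_size: int,
                       hits_in_background: int, background_size: int) -> float:
    """Probability of seeing at least `hits_in_query` lipids of a given
    ontology group in a query list drawn from the background by chance."""
    total = comb(background_size, query_size)
    upper = min(query_size, hits_in_background)
    favourable = sum(
        comb(hits_in_background, i)
        * comb(background_size - hits_in_background, query_size - i)
        for i in range(hits_in_query, upper + 1)
    )
    return favourable / total


# Hypothetical counts: 8 of 20 query lipids belong to a group that
# contains 30 of the 300 lipids in the background.
print(enrichment_p_value(8, 20, 30, 300))
```

A small p-value suggests the group is over-represented in the query list; when many groups are tested, a multiple-testing correction would normally follow.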