What can I do to my data with missing values or zeros before creating visualisations?

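A zero can mean different things depending on the omics data type: it may be a placeholder for a missing measurement or a genuine observed value. This page outlines the recommended approaches for handling missing values and zeros in your omics data within Mass Dynamics.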

Some visualisations and analyses, such as PCA and heatmaps, may fail if the dataset contains missing values or zero values. This can happen because:

  • the analysis requires a complete numerical matrix, or
  • a log2-transformation, which cannot be performed on zero values, is applied before the plot is generated.
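The two failure modes above can be illustrated outside the platform with a minimal Python sketch. The helper name `safe_log2` and the example intensities are hypothetical, and this is not how Mass Dynamics implements its plots. It only shows that log2 of zero is undefined and that missing or zero entries leave gaps in the matrix.

```python
import math

def safe_log2(value):
    """Return log2(value), or None when the value is missing, zero or negative."""
    if value is None or value <= 0:
        return None
    return math.log2(value)

# Hypothetical intensities for one feature across four samples.
intensities = [1024.0, 0.0, None, 8.0]

logged = [safe_log2(v) for v in intensities]

# Any None entry is a gap, so the matrix is no longer complete and
# methods such as PCA cannot use it without imputation or filtering.
has_gaps = any(v is None for v in logged)

# Calling math.log2 directly on zero raises an error.
try:
    math.log2(0)
    zero_failed = False
except ValueError:
    zero_failed = True
```

Here `logged` keeps real values only for the first and last samples, which is why the preprocessing steps below are needed before plotting.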

The recommended preprocessing steps depend on your omics dataset type.

Mass spectrometry-style datasets

Examples:

  • Proteins
  • Peptides
  • PTMs
  • Metabolites

These datasets commonly contain missing values. Visualisations requiring complete data cannot be generated if missing values are still present.

What to do

Before generating the plot:

  • Impute missing values using the Normalization & Imputation workflow.
  • Ensure missing values are correctly represented in the dataset when you upload your data via the MD Format.
    For example, if missing values were uploaded as 0 intensities, the Imputed column for those rows must be set to 1. Otherwise, the system will treat these values as real measurements instead of missing values, and they will not be imputed correctly.
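As a rough illustration of the flagging rule above, the following Python sketch sets Imputed to 1 wherever the intensity is 0 before a table is uploaded. Only the Imputed column is named on this page. The other column names (`ProteinId`, `SampleName`, `Intensity`) and the function name are hypothetical placeholders, and the sketch assumes that every zero intensity in the table stands for a missing value.

```python
def flag_zero_intensities(rows, intensity_key="Intensity", imputed_key="Imputed"):
    """Return copies of the rows with Imputed set to 1 wherever the intensity is 0."""
    flagged = []
    for row in rows:
        updated = dict(row)  # copy so the input rows are left unchanged
        if float(updated[intensity_key]) == 0.0:
            updated[imputed_key] = 1
        flagged.append(updated)
    return flagged

# Hypothetical long-format rows; real column names may differ.
rows = [
    {"ProteinId": "P1", "SampleName": "S1", "Intensity": 15230.5, "Imputed": 0},
    {"ProteinId": "P1", "SampleName": "S2", "Intensity": 0, "Imputed": 0},
]

flagged = flag_zero_intensities(rows)
```

After this step the second row carries Imputed = 1, so it would be treated as missing rather than as a real measurement of zero.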

Gene count datasets

Examples:

  • RNA-Seq gene count datasets

In these datasets, zero values are valid and expected because some genes may not be detected in a sample.

However, many visualisations, such as PCA and heatmaps, are generated using log-transformed data rather than raw counts. Log-transformation stabilises the variance, reduces the impact of highly expressed genes, and improves comparisons between samples.

What to do

Before generating the plot:

  • Apply a Counts Per Million (CPM) transformation using the Normalization & Imputation workflow.

This transformation allows the data to be log2-transformed safely before the visualisation is generated.
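For readers who want to see the arithmetic, the sketch below computes CPM for a single sample and then takes log2. CPM divides each gene count by the sample's total count and multiplies by one million. The offset of 1 added before the log is an assumption made here so that zero counts map to 0. This page does not state which offset, if any, Mass Dynamics applies, and the function names are hypothetical.

```python
import math

def counts_per_million(counts):
    """Scale raw gene counts for one sample to counts per million."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no counts")
    return [c * 1_000_000 / total for c in counts]

def log2_cpm(counts, offset=1.0):
    """Return log2(CPM + offset); the offset keeps zero counts finite (assumed value)."""
    return [math.log2(cpm + offset) for cpm in counts_per_million(counts)]

# Hypothetical raw counts for four genes in one sample.
sample_counts = [0, 250, 750, 9000]

cpm = counts_per_million(sample_counts)
logged = log2_cpm(sample_counts)
```

With these example counts the gene with zero reads stays at 0 after the log step instead of producing an undefined value, and the CPM values for the sample sum to one million.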