ImpLiMet: Getting optimal imputation for each dataset in LIpidomics, METabolomics, and general data mining

March 7th, 2025, by Janice Ou

Missing values occur frequently in metabolite measurements for experimental or biological reasons. If left untreated, they can compromise downstream analysis by limiting the number of applicable statistical methods and by introducing bias. Depending on how the missingness relates to the observed values, missing values fall into three types: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). To address this problem, we developed Imputation for Lipidomics and Metabolomics (ImpLiMet), a user-friendly web platform that lets researchers impute missing values in their datasets using one of eight implemented methods: five univariate and three multivariate. A unique feature of ImpLiMet is its ability to suggest an optimal method and parameter settings for the user’s dataset. Using user-defined data thresholds, ImpLiMet takes a section of the user’s dataset that has no missing values, generates missing data in it under all three mechanisms, imputes the resulting datasets with every available method, and compares each imputed dataset with the original complete data; the method with the lowest error is selected automatically as the optimal one. After imputation, users can visually assess its effects on data distribution, skewness, and kurtosis. ImpLiMet is freely available at https://complimet.ca/shiny/implimet/, and the source code is accessible at https://github.com/complimet/ImpLiMet.
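
The method-selection idea described above (mask a complete section of the data, impute it with every method, keep the method with the lowest error) can be illustrated with a small Python sketch. This is not ImpLiMet's code: the function names, the three simple column-wise imputers standing in for the eight methods, the masking rules used for MAR and MNAR, and the use of RMSE as the error measure are all assumptions made for illustration.

```python
import math
import random
import statistics

def mask_mcar(data, frac, rng):
    """MCAR (assumed rule): every cell has the same chance of being removed."""
    return [[None if rng.random() < frac else v for v in row] for row in data]

def mask_mar(data, frac):
    """MAR (assumed rule): an even-indexed feature is removed in the samples
    where its fully observed right-hand neighbour is highest."""
    out = [row[:] for row in data]
    n_drop = int(frac * len(data))
    for j in range(0, len(data[0]) - 1, 2):
        order = sorted(range(len(data)), key=lambda i: data[i][j + 1], reverse=True)
        for i in order[:n_drop]:
            out[i][j] = None
    return out

def mask_mnar(data, frac):
    """MNAR (assumed rule): the lowest values of each feature are removed,
    mimicking measurements below a detection limit."""
    out = [row[:] for row in data]
    n_drop = int(frac * len(data))
    for j in range(len(data[0])):
        order = sorted(range(len(data)), key=lambda i: data[i][j])
        for i in order[:n_drop]:
            out[i][j] = None
    return out

def impute(masked, stat):
    """Fill each feature's gaps with stat(observed values of that feature)."""
    out = [row[:] for row in masked]
    for j in range(len(masked[0])):
        observed = [row[j] for row in masked if row[j] is not None]
        fill = stat(observed) if observed else 0.0  # fallback for an empty column
        for row in out:
            if row[j] is None:
                row[j] = fill
    return out

def rmse(truth, masked, imputed):
    """Root-mean-square error over the cells that were masked."""
    errs = [(imputed[i][j] - truth[i][j]) ** 2
            for i, row in enumerate(masked)
            for j, v in enumerate(row) if v is None]
    return math.sqrt(sum(errs) / len(errs)) if errs else 0.0

# Hypothetical stand-ins for the imputation methods being compared.
METHODS = {"mean": statistics.mean, "median": statistics.median, "min": min}

def suggest_method(complete, frac=0.2, seed=0):
    """Return the lowest-error method per missingness mechanism, plus all scores."""
    rng = random.Random(seed)
    masked_sets = {
        "MCAR": mask_mcar(complete, frac, rng),
        "MAR": mask_mar(complete, frac),
        "MNAR": mask_mnar(complete, frac),
    }
    scores = {
        (mech, name): rmse(complete, masked, impute(masked, stat))
        for mech, masked in masked_sets.items()
        for name, stat in METHODS.items()
    }
    best = {mech: min(METHODS, key=lambda m: scores[(mech, m)]) for mech in masked_sets}
    return best, scores

# Synthetic complete data: 50 samples x 4 features, log-normally distributed.
data_rng = random.Random(1)
complete = [[data_rng.lognormvariate(0, 1) for _ in range(4)] for _ in range(50)]
best, scores = suggest_method(complete)
print(best)
```

In this sketch the MNAR mask removes only the lowest values of each feature, so minimum imputation wins for that mechanism by construction. Which method wins under MCAR and MAR depends on the data, which is the reason for evaluating the methods on the user's own dataset.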

Want to learn more?

Ou, H., Surendra, A., McDowell, G. S. V., Hashimoto-Roth, E., Xia, J., Bennett, S. A. L., and Čuperlović-Culf, M. (2025) Imputation for Lipidomics and Metabolomics (ImpLiMet): a web-based application for optimization and method selection for missing data imputation. Bioinform Adv 5, vbae209, doi: 10.1093/bioadv/vbae209.