Small groups in multidimensional feature space: Two examples of supervised two-group classification from biomedicine
Journal of Bioinformatics and Computational Biology (IF 1), Pub Date: 2024-01-10, DOI: 10.1142/s0219720023500257
Dmitriy Karpenko, Aleksei Bigildeev

Some biomedical datasets contain few samples but many features per sample. This makes analysis challenging and prone to errors such as overfitting and misinterpretation. To improve the accuracy and reliability of analysis in such cases, we present a tutorial that demonstrates a mathematical approach to a supervised two-group classification problem using two medical datasets. The tutorial shows how to address uncertainties and handle missing values without removing or imputing data. We describe a method that considers the size and shape of feature distributions, and that treats the pairwise relations between measured features as separate derived features and prognostic factors. Additionally, we explain how to perform similarity calculations that account for both the variation in feature values within groups and inaccuracies in individual measurements. Following these steps can yield a more accurate and reliable analysis of biomedical datasets that have a small sample size and many features.
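
The abstract does not give formulas, so the following is only a minimal sketch of the kind of calculation it describes, not the authors' published procedure. It assumes a Gaussian-kernel similarity whose width combines the within-group standard deviation with a per-feature measurement error, skips missing values (encoded as `None`) instead of imputing them, and builds pairwise ratios as derived features. All function names, the kernel choice, and the toy numbers are hypothetical.

```python
import math
from itertools import combinations

def group_stats(samples):
    """Per-feature (mean, sample std) for one group, ignoring missing (None) values."""
    stats = []
    for j in range(len(samples[0])):
        vals = [s[j] for s in samples if s[j] is not None]
        if len(vals) < 2:
            stats.append(None)  # too few observations to estimate spread
            continue
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / (len(vals) - 1)
        stats.append((mean, math.sqrt(var)))
    return stats

def similarity(sample, stats, meas_err):
    """Mean Gaussian-kernel similarity over the features available on both sides."""
    scores = []
    for x, st, err in zip(sample, stats, meas_err):
        if x is None or st is None:
            continue  # skip the feature rather than impute a value
        mean, sd = st
        scale2 = sd ** 2 + err ** 2  # group spread plus measurement error (assumed form)
        if scale2 == 0:
            continue
        scores.append(math.exp(-0.5 * (x - mean) ** 2 / scale2))
    return sum(scores) / len(scores) if scores else float("nan")

def add_pairwise_ratios(sample):
    """Append the ratio of every feature pair as a derived feature (None if not computable)."""
    derived = list(sample)
    for i, j in combinations(range(len(sample)), 2):
        a, b = sample[i], sample[j]
        derived.append(a / b if a is not None and b not in (None, 0) else None)
    return derived

def classify(sample, group_a, group_b, meas_err):
    """Assign the sample to whichever group it is more similar to."""
    sim_a = similarity(sample, group_stats(group_a), meas_err)
    sim_b = similarity(sample, group_stats(group_b), meas_err)
    return "A" if sim_a >= sim_b else "B"

# Toy data (made up): two features, some values missing.
group_a = [[1.0, 10.0], [1.2, None], [0.8, 11.0]]
group_b = [[3.0, 20.0], [3.2, 21.0], [2.8, None]]
meas_err = [0.1, 0.5]

label_1 = classify([1.1, None], group_a, group_b, meas_err)
label_2 = classify([3.1, 20.5], group_a, group_b, meas_err)
```

To classify on derived features as well, `add_pairwise_ratios` would be applied to every sample before calling `classify`, and `meas_err` would need an error estimate for each added ratio; how to propagate those errors is left open here.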




Updated: 2024-01-10