J. M. Sánchez Santos, J. De Las Rivas, F. J. Campos Laborie
Most methods for detecting differences in omics-wide data in biomedical studies (e.g. SAM, LIMMA, the t-test, or modified t-tests such as COPA, OS, ORT and MOST) calculate differential expression from significant changes in the mean or median. They perform well when biomarkers do not behave heterogeneously within each group.
We are developing an algorithm in R to address such heterogeneity, which may arise from wrong class labelling, and to identify specific markers for each class. We build an incidence matrix that records how often each gene shows differential signal expression across different subsets. We then apply a non-symmetric correspondence analysis (NSCA) to represent the genes in the space of the samples, followed by a cluster analysis to find highly significant associations between genes and samples. Finally, we assign each gene a score that reflects both the significance of its differential signal and the number of samples with which it is associated.
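The first two steps can be illustrated with a minimal sketch. It is not the authors' R implementation: it uses Python with NumPy purely for illustration, and every name and parameter in it (incidence_matrix, nsca, n_draws, subset_size, t_cut) is hypothetical. It assumes that the "subsets" are random subsets of samples drawn from two classes, that a gene counts as differential in a draw when its absolute Welch t statistic exceeds a fixed cut-off, and that samples act as the predictor and genes as the response in the NSCA. The clustering and scoring steps are left out because the abstract does not specify them.

```python
import numpy as np


def incidence_matrix(expr, labels, n_draws=200, subset_size=5, t_cut=3.0, seed=0):
    """Count, per gene and sample, how often the gene looks differential
    in random two-class subsets that contain the sample.
    Assumed construction; the abstract does not define the subsets."""
    rng = np.random.default_rng(seed)
    idx_a = np.flatnonzero(labels == 0)
    idx_b = np.flatnonzero(labels == 1)
    counts = np.zeros(expr.shape)
    for _ in range(n_draws):
        sub_a = rng.choice(idx_a, size=subset_size, replace=False)
        sub_b = rng.choice(idx_b, size=subset_size, replace=False)
        xa, xb = expr[:, sub_a], expr[:, sub_b]
        # Welch t statistic per gene (stand-in for any differential test).
        se = np.sqrt((xa.var(axis=1, ddof=1) + xb.var(axis=1, ddof=1)) / subset_size)
        t = (xa.mean(axis=1) - xb.mean(axis=1)) / se
        hit = np.abs(t) > t_cut
        # Credit every sample of this draw for each gene called differential.
        counts[np.ix_(hit, np.concatenate([sub_a, sub_b]))] += 1
    return counts


def nsca(table, n_dims=2):
    """Non-symmetric correspondence analysis of a genes x samples table,
    with samples (columns) as predictor and genes (rows) as response."""
    P = table / table.sum()
    row_mass = P.sum(axis=1)
    col_mass = P.sum(axis=0)
    if np.any(col_mass == 0):
        raise ValueError("every sample needs at least one non-zero count")
    # Centred column profiles: p_ij / p_.j - p_i.
    Pi = P / col_mass - row_mass[:, None]
    # Generalised SVD with column weights, computed as an ordinary SVD.
    U, s, Vt = np.linalg.svd(Pi * np.sqrt(col_mass), full_matrices=False)
    gene_coords = U[:, :n_dims] * s[:n_dims]
    sample_coords = (Vt[:n_dims].T / np.sqrt(col_mass)[:, None]) * s[:n_dims]
    return gene_coords, sample_coords, s


# Synthetic demo: 50 genes, 20 samples, the first 5 genes shifted in class 1.
rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 10)
expr = rng.normal(size=(50, 20))
expr[:5, labels == 1] += 4.0
counts = incidence_matrix(expr, labels)
gene_coords, sample_coords, singular_values = nsca(counts)
```

In this sketch the squared singular values sum to the numerator of the Goodman-Kruskal tau index, which is the quantity NSCA decomposes. The gene coordinates could then be passed to any clustering routine; the choice of clustering method and of the final gene score remains open here.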
Keywords: marker genes, heterogeneous data, non-symmetric correspondence analysis, clustering
Scheduled
X04 Coffee Break. Poster Session. TEST Meeting - Building 1
September 7, 2016, 11:40
Building 1