Prof. Sanghamitra Bandyopadhyay: Studying the MicroRNA Induced Regulatory Network in Colorectal and Breast Cancer
It is well known that certain proteins, called transcription factors (TFs), regulate the expression of other genes, weaving a complex regulatory network in the cell. The discovery of microRNAs (miRNAs), small non-coding RNAs that alter gene expression at a post-transcriptional level, has added a new dimension to this picture. Our studies indicate that microRNAs play a crucial role in fine-tuning the balance of myriad cellular activities.
It is well established that miRNAs regulate the expression of their target genes at a post-transcriptional level, i.e., after the genes have produced the corresponding messenger RNAs (mRNAs). Identifying the target mRNAs of a miRNA is an important step in determining its regulatory role on a global scale. Understanding how the miRNAs themselves are regulated is equally important, particularly for intergenic miRNAs, which are expected to have independent transcriptional machinery. In this regard, knowing the miRNA transcription start site (TSS) becomes important. Both tasks, miRNA target prediction and TSS prediction, can be formulated as classification problems, enabling the application of machine learning techniques. Here we provide an overall view of the related work under way in our group at the Indian Statistical Institute, Kolkata. We will first briefly describe our approaches to miRNA target prediction and TSS prediction. This will be followed by the integration of several predicted and validated results to build the TF-miRNA-gene network. Finally, we focus on this network for two types of cancer, namely colorectal and breast. Graph-theoretic analysis of the network yields an interesting three-level hierarchical structure. MicroRNAs form the majority in the topmost level of this hierarchy, indicating that these molecules may be useful for quick signal propagation and could also be potential biomarkers. Their importance in facilitating cross-talk between transcription factors is also highlighted.
Dr. Ujjwal Maulik: Multiobjective clustering with SVM based ensembling for analysis of gene expression data
Data clustering is a popular unsupervised data mining tool used for partitioning a given data set into homogeneous groups based on some similarity/dissimilarity metric. Two well-known clustering algorithms in the literature are K-means and fuzzy C-means (FCM). We will demonstrate how metaheuristic techniques can be used to solve the K-means/FCM clustering problem. Results will be demonstrated for pixel classification of satellite images.
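As a rough illustration of treating clustering as a search problem, the sketch below minimizes the usual K-means objective (within-cluster sum of squared distances) with a very simple (1+1) evolutionary search over the cluster centers. This is only one possible metaheuristic, chosen here for brevity; it is not claimed to be the technique presented in the talk. The function names, the toy data points, and the parameters `iterations`, `step`, and `seed` are all assumptions made for this example.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centers):
    """Index of the nearest center for each point."""
    return [min(range(len(centers)), key=lambda k: sq_dist(pt, centers[k]))
            for pt in points]

def wcss(points, centers):
    """K-means objective: total squared distance of each point to its nearest center."""
    return sum(min(sq_dist(pt, ctr) for ctr in centers) for pt in points)

def evolve_centers(points, k, iterations=2000, step=0.5, seed=0):
    """(1+1) evolutionary search: mutate all centers, keep the mutant if it is not worse."""
    rng = random.Random(seed)
    best = [list(pt) for pt in rng.sample(points, k)]  # start from k data points
    best_cost = wcss(points, best)
    for _ in range(iterations):
        # Gaussian perturbation of every coordinate of every center.
        cand = [[c + rng.gauss(0.0, step) for c in ctr] for ctr in best]
        cost = wcss(points, cand)
        if cost <= best_cost:
            best, best_cost = cand, cost
    return best, best_cost

# Hypothetical toy data: two well-separated groups in the plane.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.9, 5.3)]
centers, cost = evolve_centers(points, k=2)
labels = assign(points, centers)
```

For FCM the same search loop could in principle be reused with the fuzzy objective in place of `wcss`; that variant is not shown here.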
In the second part of the talk we will discuss multiobjective clustering, in which multiple objective functions are optimized simultaneously. Selecting one solution from the set of Pareto-optimal solutions is always a critical issue. We will also discuss how machine learning techniques such as the Support Vector Machine (SVM) can be used to combine the Pareto-optimal solutions into an even better solution, along with results for the classification of gene microarray data.
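To make the notion of a Pareto-optimal set concrete, here is a minimal sketch that filters candidate solutions by dominance, assuming every objective is to be minimized. The objective values are invented for illustration (they could stand for, say, a compactness score and a separation score of different clusterings), and the SVM-based combination step mentioned above is not reproduced.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(solutions):
    """Keep only the solutions that no other solution dominates."""
    return [s for s in solutions if not any(dominates(t, s) for t in solutions)]

# Hypothetical (objective_1, objective_2) values of six candidate clusterings.
candidates = [(1.0, 9.0), (2.0, 7.0), (3.0, 8.0), (4.0, 4.0), (6.0, 5.0), (7.0, 2.0)]
front = pareto_front(candidates)
```

Here `(3.0, 8.0)` is dominated by `(2.0, 7.0)` and `(6.0, 5.0)` by `(4.0, 4.0)`, so four candidates remain. None of the four is better than the others in both objectives, which is why choosing a single one, or combining them, needs a further step.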