Latest news & announcements

Asthma: disease-causing microRNAs identified

Background: Asthma is classified according to severity and inflammatory phenotype and is likely to be distinguished by specific microRNA (miRNA) expression profiles.

Objective: We sought to associate miRNA expression in sputum supernatants with the inflammatory cell profile and disease severity in asthmatic patients.

Needles: Toward Large-Scale Genomic Prediction with Marker-by-Environment Interaction

Genomic prediction relies on genotypic marker information to predict the agronomic performance of future hybrid breeds based on trial records. Because the effect of markers may vary substantially under the influence of different environmental conditions, marker-by-environment interaction effects have to be taken into account. However, this may lead to a dramatic increase in the computational resources needed for analyzing large-scale trial data. A high-performance computing solution, called Needles, is presented for handling such data sets.
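The abstract does not describe Needles' implementation, but the underlying model can be sketched as a penalized regression with one main-effect coefficient per marker plus one deviation per marker per environment; all sizes, names, and parameter values below are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 100 hybrids genotyped at 50 markers (coded -1/0/1),
# each evaluated in 4 environments.
n_hybrids, n_markers, n_envs = 100, 50, 4
X = rng.choice([-1, 0, 1], size=(n_hybrids, n_markers)).astype(float)

# Simulate phenotypes with main marker effects plus marker-by-environment
# interaction effects (one deviation per marker per environment).
beta = rng.normal(0, 1, n_markers)
gamma = rng.normal(0, 0.5, (n_envs, n_markers))
y = np.concatenate([X @ (beta + gamma[e]) + rng.normal(0, 1, n_hybrids)
                    for e in range(n_envs)])

# Design matrix: a main-effect block followed by one interaction block per
# environment (block-diagonal copies of X). Note how the column count grows
# by a factor of (1 + n_envs) -- the computational burden the abstract
# mentions.
Z = np.hstack([np.tile(X, (n_envs, 1)),
               np.kron(np.eye(n_envs), X)])

# Ridge (SNP-BLUP-style) solution; lam plays the role of a variance-ratio
# shrinkage parameter.
lam = 10.0
effects = np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ y)

# Predicted performance of the hybrids in environment 0: main effects plus
# that environment's interaction deviations.
pred_env0 = X @ effects[:n_markers] + X @ effects[n_markers:2 * n_markers]
```

The dense solve above is exactly what becomes infeasible at scale; the point of a high-performance solution is to exploit the block structure of Z rather than materialize it.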

Of dups and dinos: evolution at the K/Pg boundary

Fifteen years into sequencing entire plant genomes, more than 30 paleopolyploidy events could be mapped on the tree of flowering plants (and many more when transcriptome data sets are also considered). While some genome duplications are very old and occurred early in the evolution of dicots and monocots, or even before, others are more recent and seem to have occurred independently in many different plant lineages. Strikingly, a majority of these duplications date to somewhere between 55 and 75 million years ago (mya), and thus likely correlate with the K/Pg boundary.

Designing biomedical proteomics experiments: state-of-the-art and future perspectives

With the current expanded technical capabilities for performing mass spectrometry-based biomedical proteomics experiments, an improved focus on the design of experiments is crucial. Since ignoring the importance of a good design leads to a high rate of false discoveries that contaminate the results, more and more tools are being developed to help researchers design proteomics experiments.
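The abstract does not name a specific technique, but one standard safeguard against the false discoveries it warns about is controlling the false discovery rate across the thousands of simultaneous tests in a proteomics experiment, e.g. with the Benjamini-Hochberg step-up procedure. A minimal sketch (the p-values are made up for illustration):

```python
def benjamini_hochberg(pvals, alpha=0.05):
    """Return a boolean mask of discoveries at FDR level alpha
    (Benjamini-Hochberg step-up procedure)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k with p_(k) <= (k / m) * alpha ...
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            k = rank
    # ... and reject the k smallest p-values.
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k:
            reject[i] = True
    return reject

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
# Only the two smallest p-values survive at alpha = 0.05:
# [True, True, False, False, False, False, False, False, False, False]
print(benjamini_hochberg(pvals, alpha=0.05))
```

Note that 0.039 is rejected at a naive per-test threshold of 0.05 but not here: its rank-adjusted threshold is 3/10 * 0.05 = 0.015.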

Evolutionary approaches for prototype selection algorithms

Evolutionary approaches are among the most accurate prototype selection algorithms: preprocessing techniques that select a subset of instances from the data before applying nearest neighbor classification to it. These algorithms achieve very high accuracy and reduction rates, but unfortunately come at a substantial computational cost. In this paper, we introduce a framework that efficiently reuses the intermediate results of prototype selection algorithms to further increase their accuracy.
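To make the general idea concrete (this is an illustration of evolutionary prototype selection, not the paper's framework), a genetic algorithm can evolve bit masks over the training instances, with a fitness that trades 1-NN accuracy against the reduction rate; every parameter value below is an arbitrary choice for the sketch:

```python
import random

def one_nn_accuracy(prototypes, test):
    """Classify each (point, label) pair in 'test' by its nearest neighbour
    (squared Euclidean distance) among the selected prototypes."""
    correct = 0
    for x, y in test:
        nearest = min(prototypes,
                      key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))
        correct += nearest[1] == y
    return correct / len(test)

def evolve_prototypes(data, pop_size=20, generations=30, alpha=0.5, seed=1):
    """Genetic-algorithm prototype selection: each chromosome is a bit mask
    over the instances; fitness balances 1-NN accuracy against the
    reduction rate (fraction of instances discarded)."""
    rng = random.Random(seed)
    n = len(data)

    def fitness(mask):
        subset = [d for d, keep in zip(data, mask) if keep]
        if not subset:
            return 0.0
        acc = one_nn_accuracy(subset, data)
        reduction = 1 - len(subset) / n
        return alpha * acc + (1 - alpha) * reduction

    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]          # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)            # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n)                 # single-bit mutation
            child[i] = not child[i]
            children.append(child)
        pop = survivors + children
    best = max(pop, key=fitness)
    return [d for d, keep in zip(data, best) if keep]
```

The cost the abstract refers to is visible here: every fitness evaluation runs a full nearest neighbor pass over the data, repeated for each chromosome in each generation.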

Noncoding RNAs Not So Noncoding

Bits of the transcriptome once believed to function as RNA molecules are in fact translated into small proteins. Gerben Menschaert (UGent BIG N2N) and colleagues established a searchable sORF database with the aim of accumulating and centralizing data on sORFs and their translation potential.

(From TheScientist, by Ruth Williams)

New BIG N2N partner: Prof. Willem Waegeman

Willem Waegeman is a professor at Ghent University, and a member of the research unit Knowledge-based Systems (KERMIT) of the Department of Mathematical Modelling, Statistics and Bioinformatics. His main interests are machine learning and data science, including theoretical research and various applications in the life sciences. Specific interests include multi-target prediction problems, constructive machine learning and preference learning.

Jabba: hybrid error correction for long sequencing reads

Background: Third-generation sequencing platforms produce longer reads with higher error rates than second-generation technologies. While the improved read length can provide useful information for downstream analysis, the underlying algorithms are challenged by the high error rate. Error correction methods in which accurate short reads are used to correct noisy long reads are an attractive way to generate high-quality long reads. However, methods that align short reads directly to long reads do not make optimal use of the information contained in the second-generation data, and suffer from large runtimes.
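The short-reads-correct-long-reads idea can be illustrated with a deliberately naive pileup consensus: given hypothetical alignment offsets of short reads on a long read, replace each position by the majority base. This is only a sketch of the general principle; Jabba itself takes a different, graph-based route precisely because direct alignment like this is lossy and slow:

```python
from collections import Counter

def correct_long_read(long_read, alignments):
    """Naive consensus correction: 'alignments' is a list of
    (offset, short_read) pairs placing accurate short reads on the long
    read. Each position is replaced by the majority base in its pileup;
    the original base votes too, so uncovered positions stay unchanged."""
    pileup = [Counter({base: 1}) for base in long_read]
    for offset, short_read in alignments:
        for i, base in enumerate(short_read):
            if 0 <= offset + i < len(long_read):
                pileup[offset + i][base] += 1
    return "".join(c.most_common(1)[0][0] for c in pileup)

# Toy example: a noisy long read with errors at positions 4 and 10,
# corrected by three overlapping accurate short reads.
corrected = correct_long_read(
    "ACGTAGGTACCT",
    [(0, "ACGTTGGT"), (4, "TGGTACGT"), (6, "GTACGT")])
# corrected == "ACGTTGGTACGT": both errors are outvoted by the short reads.
```

Real correctors must additionally handle the insertions and deletions that dominate long-read error profiles, which is why simple substitution pileups like this one fall short in practice.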