Every day is cloudy, so how should I come up with a title… Putting two independent topics in one title does not make much sense, but it is fine for a blog.
- /PhylDiag: identifying complex synteny blocks that include tandem duplications using phylogenetic gene trees/, BMC Bioinformatics, 2014: I’ve seen this paper cited by the SCORPiOs paper (Synteny-guided CORrection of Paralogies and Orthologies in gene trees, kind of a nonsense name, sorry…). Like i-ADHoRe, it identifies syntenic blocks (only for pairwise comparisons, though), but PhylDiag uses information from gene trees, including family membership as well as orthologous and paralogous relationships. Note that the trees are built with TreeBeST, the one used in the EnsemblCompara pipeline. It is an old gene tree inference pipeline guided by a species tree, and it seems to be able to infer duplication and speciation events. SCORPiOs also reimplemented TreeBeST in their package, which might be helpful.
- /Is phylotranscriptomics as reliable as phylogenomics?/, Molecular Biology and Evolution, 2020. Many phylogenetic studies nowadays use transcriptomes because RNA-Seq is relatively cheap and can scale up species sampling enormously. Here, the paper shows that orthology identification is the critical issue when using transcriptomes to infer phylogenies. In particular, orthologs identified by the tree-based approach developed by Yang and Smith (2014) produce phylogenetic trees more similar to the phylogenomic trees than tree-free methods do. For phylogenetic analysis, the purpose is to curate a robust dataset of single-copy genes, so orthology identification can afford to drop a lot of data and keep only the best genes. In contrast, if all the gene families matter to the analysis (e.g., building gene trees with all unigenes), it is still unclear how comparable transcriptomes or unigenes are to predicted genes in genomes.
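
To make the "drop lots of data and keep only the best" point concrete, here is a minimal sketch of a single-copy filter. Note that this is a plain copy-number filter, not the tree-based approach of Yang and Smith (2014) and not anything taken from the paper. The function name `single_copy_families`, the `min_occupancy` parameter, and the toy input are all my own assumptions for illustration.

```python
from collections import Counter


def single_copy_families(families, species, min_occupancy=1.0):
    """Keep gene families with at most one copy per species.

    families: dict mapping a family ID to a list of (species, gene_id) pairs.
    species: list of all sampled species.
    min_occupancy: hypothetical threshold, the minimum fraction of sampled
        species that must be present in a family for it to be kept.
    """
    kept = {}
    for fam_id, members in families.items():
        counts = Counter(sp for sp, _ in members)
        # Reject the family if any species has more than one copy.
        if any(n > 1 for n in counts.values()):
            continue
        # Require the family to be present in enough of the sampled species.
        present = sum(1 for sp in species if sp in counts)
        if present / len(species) >= min_occupancy:
            kept[fam_id] = {sp: gene for sp, gene in members}
    return kept


# Toy data: fam2 has a duplicate in species A, fam3 is missing species C.
families = {
    "fam1": [("A", "a1"), ("B", "b1"), ("C", "c1")],
    "fam2": [("A", "a2"), ("A", "a3"), ("B", "b2"), ("C", "c2")],
    "fam3": [("A", "a4"), ("B", "b3")],
}
species = ["A", "B", "C"]

strict = single_copy_families(families, species)
relaxed = single_copy_families(families, species, min_occupancy=0.6)
```

With the strict setting only `fam1` survives, and relaxing the occupancy threshold also lets `fam3` through. Either way `fam2` is thrown away entirely, which is fine for species-tree inference but is exactly the kind of data loss that matters when every gene family is of interest.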