Wednesday, June 4, 2008

Over- Vs. Underpartitioning in Bayesian Phylogenetics

Although it was published nearly a year ago, I'm still learning from Brown and Lemmon's important contribution on partitioning in Bayesian phylogenetic analyses (2007. The importance of data partitioning and the utility of Bayes factors in Bayesian phylogenetics. Systematic Biology 56:643-655). One of the things they're most likely to be cited for is the conclusion that under-parameterization is considerably more problematic than over-parameterization.

This result is evident in their Fig. 4, which uses the same type of visualization implemented by AWTY. Each dot in the figure represents a single node, plotted by the posterior probabilities it received in two separate analyses. The farther a point strays from the diagonal, the more the two Bayesian analyses disagree about support for that node. As the figure shows, points stray more from the diagonal in the plots toward the upper right corner (where analyses are under-parameterized) than in those toward the lower left corner (where models are over-parameterized).

Although it seems clear that partitioning can have an important impact on the tree topologies obtained from Bayesian analysis, it also seems worth noting that the worst-case scenario (a node that is strongly supported under one partitioning strategy and absent under the other) is never realized in Brown & Lemmon's simulations. Has anybody obtained such a result with real data? I tend to get plots similar to Brown & Lemmon's, which leads me to believe that really well-supported nodes are generally robust to alternative partitioning strategies.
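For anyone who wants to run this kind of comparison on their own results, here is a minimal Python sketch of the idea behind the plot. It is not Brown & Lemmon's code or AWTY's implementation. It assumes you have already summarized each analysis as a table of bipartitions and their posterior probabilities; the function name `compare_support`, the 0.95 and 0.05 cutoffs, and the example numbers are all hypothetical and chosen only for illustration.

```python
def compare_support(pp_a, pp_b, strong=0.95, weak=0.05):
    """Compare node support between two analyses.

    pp_a, pp_b: dicts mapping a bipartition label to its posterior
    probability. A bipartition missing from one analysis is treated
    as having posterior probability 0.0 in that analysis.
    The strong/weak cutoffs are arbitrary illustrative choices.
    """
    rows = []
    for split in sorted(set(pp_a) | set(pp_b)):
        a = pp_a.get(split, 0.0)
        b = pp_b.get(split, 0.0)
        # abs(a - b) is the vertical distance of the point (a, b)
        # from the diagonal of the bivariate plot.
        rows.append((split, a, b, abs(a - b)))

    mean_dev = sum(r[3] for r in rows) / len(rows) if rows else 0.0

    # "Worst case" nodes: strongly supported in one analysis and
    # (nearly) absent in the other.
    worst = [
        r for r in rows
        if (r[1] >= strong and r[2] <= weak) or (r[2] >= strong and r[1] <= weak)
    ]
    return rows, mean_dev, worst


# Invented posterior probabilities for a five-taxon example,
# standing in for two partitioning strategies.
pp_a = {"AB|CDE": 1.0, "ABC|DE": 0.75, "ABD|CE": 0.25}
pp_b = {"AB|CDE": 1.0, "ABC|DE": 1.0}

rows, mean_dev, worst = compare_support(pp_a, pp_b)
for split, a, b, dev in rows:
    print(f"{split}\t{a:.2f}\t{b:.2f}\t{dev:.2f}")
print("mean deviation from diagonal:", round(mean_dev, 3))
print("worst-case nodes:", worst)
```

An empty `worst` list corresponds to what I describe above: the two analyses may disagree on weakly supported nodes, but no node is strongly supported in one and missing from the other.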

