State of the Art

Flowering plants (angiosperms) are the dominant life form on land. They drive terrestrial ecosystems by using photosynthesis to produce much of our oxygen and complex nutrients. We depend on them, directly or indirectly, for almost all of our food, as well as for pharmaceuticals, textiles, timber, biofuel and the sheer aesthetic pleasure of flowers and other garden plants (1). Most plant biology research is directed at crop plants of major economic importance in Europe and elsewhere and at a few very successful “model plant” systems that have very small genomes and are simple to cultivate and handle in molecular research: especially the “thale cress”, Arabidopsis thaliana (2), a model for the dicots, and more recently the “false brome”, Brachypodium distachyon (3), a model for the monocots. Substantial recent effort has focused on developing and applying genomic tools for these species [1].

A major difficulty in interpreting vast genomic data sets from single species is that plant genomes are known to be fluid over relatively short evolutionary time spans, owing to substantial genomic turnover through gene duplication, diversification and extinction. This holds for most of the major gene families governing organismal biology (4). Turnover includes tandem gene duplication events on individual chromosomes and occasional genome duplication events (5-7). Since all plant genes ultimately derive from duplication and subsequent diversification processes (8,9), understanding deep evolutionary history (phylogeny) is important for predicting the function of genes in multigene families (i.e. most genes). Most plants that are useful to humanity belong to one of two flowering plant groups, the monocots and the eudicots. We now know that these two major groups diverged from each other close to the origin of the flowering plants (10).

It would therefore be invaluable to develop genomic tools for a model plant lineage that predates this major evolutionary split. Such tools would be broadly useful for plant biology research. They would allow us to pinpoint which gene families predate and which postdate the origin of most of the world’s major crop plants, and so to distinguish recent from ancient genes. This in turn would permit the inference of orthologs (gene copies inherited in different plants through speciation events) versus paralogs (gene copies derived from gene or genome duplication events, which may often gain new functionality) (11,12).