Comparative Functional Genomics of Nectaries and Nectar in the Dicots

Metabolomic Analysis

Nectar and nectary metabolomes will be analyzed using methods optimized for the preliminary data. These methods are based on procedures used to determine the metabolomes of several plant organs, including Arabidopsis seedling leaves, soybean seeds, maize leaves, and Jacaranda nectar. The procedure extracts metabolites regardless of their chemical and physical properties and is optimized for efficient recovery of all metabolites in the sample. The two resulting fractions, representing polar and non-polar metabolites, can be analyzed by a variety of methods, including GC-MS, LC-MS and CE-MS, all of which are available at the WM Keck Metabolomics facility and can be used as needed. In the preliminary study we chose to analyze these samples by GC-MS because it is a highly sensitive, robust and stable analytical platform, and because chemical identification relies on comprehensive mass-spectral libraries that have proven reliable in identifying metabolites. However, as evidenced by the compounds that were identified and by prior metabolomics studies, the GC-MS platform is limited to the detection of relatively small, non-polar metabolites, because analytes must be volatilized into the gas phase for separation and analysis. Therefore, to expand the types of compounds that can be evaluated, we will conduct parallel analyses using two LC-MS platforms: an Applied Biosystems QSTAR XL Hybrid System and a Bruker SolariX FT-ICR. The former will be used for quantification, whereas the latter will be used for chemical identification of the analytes. The latter is particularly important in these experiments because there are no reliable and robust “libraries” that can be used to chemically identify analytes from LC-MS platforms.
Based on prior experience at the WM Keck Metabolomics Research Laboratory, we have developed a small in-house mass-spectral library that provides a means of identifying about 50 compounds, enriched in phenolics and terpenes. However, additional effort will be required to expand this capability, and the recent NSF-MRI-supported acquisition of the Bruker SolariX FT-ICR has greatly enhanced the Laboratory's ability to do so. This is primarily because the instrument has very high mass accuracy (about 1 ppm) and mass resolution, which provide a robust means of determining the empirical chemical formulae of analytes and their MS-MS fragments. This should therefore provide a means of identifying a larger portion of the approximately two-thirds of analytes that currently are not annotated in these types of metabolomics studies.
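
To illustrate how mass accuracy constrains the assignment of an empirical formula, the following minimal Python sketch filters candidate formulae by their deviation, in ppm, from a measured mass. It is not the Laboratory's software: the function names, the candidate list and the measured value are hypothetical, and the measured value is assumed to be a neutral monoisotopic mass that has already been corrected for the ionizing adduct.

```python
# Monoisotopic masses (u) of the most abundant isotopes of C, H, N and O.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}


def monoisotopic_mass(formula):
    """Return the monoisotopic mass of a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def ppm_error(measured, theoretical):
    """Signed deviation of the measured mass from the theoretical mass, in ppm."""
    return (measured - theoretical) / theoretical * 1e6


def candidates_within(measured, formulas, tol_ppm=1.0):
    """Names of the candidate formulae whose mass lies within tol_ppm of measured."""
    return [
        name
        for name, formula in formulas.items()
        if abs(ppm_error(measured, monoisotopic_mass(formula))) <= tol_ppm
    ]


# Hypothetical candidates that share a nominal mass of 180.
CANDIDATES = {
    "C6H12O6": {"C": 6, "H": 12, "O": 6},
    "C9H8O4": {"C": 9, "H": 8, "O": 4},
}

# Hypothetical measured neutral mass (assumption, not project data).
matches = candidates_within(180.0634, CANDIDATES, tol_ppm=1.0)
print(matches)
```

With a 1 ppm tolerance only one of the two nominally isobaric candidates is retained, whereas a tolerance of several hundred ppm would not distinguish them. In practice the candidate list would be far larger and would also be constrained by isotope patterns and MS-MS fragments.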

In brief, nectar and pollen samples frozen in liquid nitrogen will be spiked with two internal standards for quantification (ribitol and nonadecanoic acid), and metabolites will be extracted with a monophasic tripartite solvent system (methanol, chloroform, and water) using a Mixer Mill 301 for pulverization. Following centrifugation, the extract will separate into two phases. The upper (polar) layer and the lower (non-polar) layer will be removed separately and dried. For GC-MS analyses, the two extracts will be methoximated and silylated. The derivatized samples (1 µL aliquot) will be analyzed with an Agilent 6890 GC interfaced to a 5975C mass spectrometer using an HP-5ms column. The GC-MS data files will be analyzed and searched against the NIST 08 Mass Spectral Library. Analyte peaks will be integrated, and peak areas will be normalized relative to the internal standards and to the volume of nectar or the weight of pollen used in the analysis.
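
The normalization described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the Laboratory: the function name and the numerical values are hypothetical, and we assume that each analyte is paired with the internal standard recovered in the same fraction.

```python
def normalize_peak_area(analyte_area, standard_area, sample_amount):
    """Relative response of an analyte per unit of sample.

    analyte_area  : integrated peak area of the analyte
    standard_area : integrated peak area of the internal standard
                    in the same fraction (assumed pairing)
    sample_amount : volume of nectar (e.g., in µL) or weight of pollen
                    (e.g., in mg) that was extracted
    """
    if standard_area <= 0 or sample_amount <= 0:
        raise ValueError("standard_area and sample_amount must be positive")
    return analyte_area / standard_area / sample_amount


# Hypothetical values: analyte peak area, internal-standard peak area,
# and 0.5 units of sample.
relative_response = normalize_peak_area(50000.0, 25000.0, 0.5)
print(relative_response)  # 4.0
```

Dividing by the internal-standard area corrects for differences in extraction and derivatization efficiency between samples, and dividing by the sample amount makes nectar and pollen samples of different sizes comparable.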

For LC-MS, the polar fraction will not be derivatized, but will be subjected to parallel LC-MS analyses on the Applied Biosystems QSTAR XL Hybrid System and the Bruker SolariX FT-ICR. The same reverse-phase LC system will be used for separation on both instruments, so retention times should be comparable. As explained above, the former instrument will be used for quantification, and all biological replicates will be subjected to this analysis. The latter instrument will be used for chemical identification, and for each species the most complex pollen and nectar samples will be subjected to two analytical runs. The first will be an LC-MS run without ion fragmentation, which will provide the m/z values of all “intact” analytes. The second will be conducted in MS-MS mode to provide fragmentation patterns of the analytes, which can be used for identification.
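
Because the same LC system feeds both instruments, features quantified on one instrument can in principle be paired with features identified on the other by retention time and m/z. The Python sketch below shows one simple way to do this. It is an illustration only: the function name, the feature records, and both tolerances are hypothetical and would need to be set from the observed reproducibility of the two instruments.

```python
def match_features(quant_features, id_features, rt_tol_min=0.2, mz_tol=0.01):
    """Pair each quantified feature with the closest identified feature.

    Each feature is a dict with keys "id", "rt" (retention time, min) and "mz".
    A pair is accepted only if both the retention-time and m/z differences
    fall within the given tolerances; otherwise the feature is left unmatched.
    """
    matches = []
    for q in quant_features:
        best = None
        for f in id_features:
            rt_diff = abs(q["rt"] - f["rt"])
            mz_diff = abs(q["mz"] - f["mz"])
            if rt_diff <= rt_tol_min and mz_diff <= mz_tol:
                # Among acceptable candidates keep the closest retention time.
                if best is None or rt_diff < abs(q["rt"] - best["rt"]):
                    best = f
        matches.append((q["id"], best["id"] if best else None))
    return matches


# Hypothetical feature lists (not project data).
quant = [
    {"id": "Q1", "rt": 5.10, "mz": 355.102},
    {"id": "Q2", "rt": 12.40, "mz": 611.161},
]
ident = [
    {"id": "F1", "rt": 5.15, "mz": 355.1024},
    {"id": "F2", "rt": 9.00, "mz": 611.1612},
]

pairs = match_features(quant, ident)
print(pairs)  # [('Q1', 'F1'), ('Q2', None)]
```

In this example the second quantified feature is left unmatched because no identified feature elutes within the retention-time tolerance, even though one has a similar m/z.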

Interpretation. The goal of this objective is to document the presence, identity and concentration of secondary chemicals in the nectar and pollen of plant species that are resources for bees. In addition to providing these data as an important resource for growers and beekeepers, we will use descriptive statistics to (1) test whether wild species have higher levels of secondary compounds than crops, as is often the case in vegetative tissue, (2) ascertain the extent to which compounds are unique or shared across species and across tissues within species (nectar, pollen and leaves, using published studies of leaf chemicals), (3) assess whether secondary compound levels are higher in pollen than in nectar, and (4) determine whether nectar or pollen composition may explain why some species are particularly preferred (e.g., raspberry) or avoided (e.g., onion) by bees. Our extensive dataset will provide ecological and evolutionary insights into how flowers are defended, as well as a practical record of the compounds to which bees are exposed in agricultural and wild habitats.
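
As an illustration of comparison (2), the following Python sketch partitions compounds into those shared by more than one species and those unique to a single species. It is a minimal example under stated assumptions: the function name is hypothetical, the species and compound labels are placeholders rather than results, and the same logic would apply to tissues within a species.

```python
from collections import Counter


def shared_and_unique(profiles):
    """Split compounds into shared (found in >1 species) and unique per species.

    profiles maps a species name to the list of compounds detected in it.
    """
    # Count in how many species each compound occurs (duplicates ignored).
    counts = Counter(c for compounds in profiles.values() for c in set(compounds))
    shared = {c for c, n in counts.items() if n > 1}
    unique = {
        species: {c for c in set(compounds) if counts[c] == 1}
        for species, compounds in profiles.items()
    }
    return shared, unique


# Placeholder labels only; these are not measured data.
profiles = {
    "species_A": ["compound_1", "compound_2"],
    "species_B": ["compound_2", "compound_3"],
    "species_C": ["compound_4"],
}

shared, unique = shared_and_unique(profiles)
print(sorted(shared))
print({species: sorted(compounds) for species, compounds in unique.items()})
```

This presence/absence summary ignores concentration; comparisons of compound levels, such as wild versus crop species or pollen versus nectar, would use the normalized peak areas instead.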

Metabolomics Database

A general framework for a metabolomics database, including the importance of data consistency and deposition of full metadata, has been described. At present, only a few publicly accessible metabolomics databases exist; most of these contain datasets from carefully defined samples with a common biological theme. For example, one such database, Plant Metabolomics, contains metabolomics data from Arabidopsis seedlings representing 200 mutants in genes of unknown function; mutations in these genes do not produce an obvious morphological phenotype. Hence, the sole criterion for additional research on these mutants would be the metabolic differences that these data reveal to the research community. Such data have enabled research on the role of novel plant lipids, such as Lipid A, a lipid that is considered unique to Gram-negative bacteria, and on the role of enzyme redundancies associated with the FAE1-like and ELO-like fatty acid elongase components of Arabidopsis, and have informed novel evolutionary and functional insights into the non-enzymatic FAP proteins. AtMetExpress contains data and comprehensive metadata from carefully defined organs and developmental stages of Arabidopsis (for which microarray data are available from AtGeneExpress), as well as from 20 ecotypes of this species. A third example, the Medicinal Plant Metabolomics Resource, presents metabolomics data for 12 species. Its companion database, the Medicinal Plant Genomics Resource, contains transcriptomics data from the same biological samples.
To date, these resources have supported the identification of metabolic intermediates, reactions and genes from medicinal species, including: identification of a gene encoding a cytochrome P450 that catalyzes a reaction in the synthesis of the alkaloid 19-O-acetylhorhammericine in Catharanthus roseus; identification of unusual acylphloroglucinols in Hypericum gentianoides; characterization of the evolutionary origin of different accessions of Prunella vulgaris; insight into the stereospecificity of quinoline alkaloid synthesis in Camptotheca acuminata; cloning of three enzymes (cardenolide synthase, C4 sterol methyloxidase, and progesterone 5β-reductase) from Digitalis purpurea; and identification of genes encoding valerena-1,10-diene synthase in Valeriana officinalis, and of their role in the synthesis of sesquiterpenes in that species that have biological activity in mammals.