Materials and methods:
GENOME-WIDE ASSOCIATION STUDY
SAMPLE COLLECTION AND PREPARATION
This panel was grown and harvested in Pullman, WA in 2016 and in Pullman, Central Ferry, and Mansfield, WA in 2017. Pullman is located at 46.73° N, 117.18° W on the eastern border of Washington state at an elevation of 717 m above sea level. The average annual precipitation exceeds 500 mm, soil types range from silt loams to silty clay loams and are part of the Palouse soil series, and winter wheat grain yields average around 7900 kg/ha. Central Ferry is located approximately 95 km southwest of Pullman at 46.62° N, 117.79° W, was irrigated (600 mm), and averages approximately 6700 kg/ha of grain yield. It sits at an elevation of 195 m on silt loam soils classified mostly under the Chard soil series. Mansfield, located approximately 320 km northwest of Pullman at 47.81° N, 119.64° W, has an annual precipitation of less than 300 mm, and grain yields average around 3550 kg/ha. The soils are mostly part of the Touhey soil series, classified as ashy fine sandy loam, and the elevation is 692 m. One half-meter row of straw from each cultivar in the panel was harvested at ground level at harvest maturity (stage 11.4 on Feekes’ scale) and the heads were removed. For consistency, the leaves and nodes were removed from the samples. The remaining internode portion was cut into 1-2 cm pieces and then ground to pass a 1 mm sieve using a FOSS Cyclotec 1093 (FOSS North America, Eden Prairie, MN).
FIBER AND NUTRIENT ANALYSIS
Specialized filter bags (ANKOM Technology, Macedon, NY) were used to enclose 0.5-0.55 g of ground winter wheat straw, which was analyzed for neutral detergent fiber (NDF), acid detergent fiber (ADF), and acid detergent lignin (ADL) using the ANKOM protocol with minor modifications. The NDF procedure removes starches, sugars, free amino acids, and other water-soluble components, leaving hemicellulose, cellulose, and ADL. The ADF procedure removes the hemicelluloses, leaving only the cellulose and ADL. The ADL procedure removes the cellulose from the straw. The NDF and ADF procedures were performed sequentially using an ANKOM 200 Fiber Analyzer (ANKOM Technology, Macedon, NY) following ANKOM procedures. The straw samples were then digested in 72% H2SO4 to determine the ADL. After each procedure, the samples were air-dried overnight in a lab hood and then dried in a hybridization incubator for a minimum of 6 hours at 64 °C. The samples were then removed from the incubator, placed in plastic bags with desiccators, and individually weighed. Cellulose and hemicellulose values were derived from the NDF, ADF, and ADL values. Dry combustion with a LECO TruSpec Analyzer (LECO Corp., St. Joseph, MI) was used to determine carbon (C), nitrogen (N), and C/N ratio.
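The derivation of cellulose and hemicellulose can be illustrated with a minimal Python sketch. It assumes the usual difference calculation implied by the sequential procedure (hemicellulose = NDF - ADF, cellulose = ADF - ADL); the exact formulas used in this study are not stated, and the function name and example values below are hypothetical.

```python
def fiber_fractions(ndf, adf, adl):
    """Derive hemicellulose and cellulose by difference.

    All inputs share one unit (for example, % of dry matter).
    Assumption: the sequential detergent fractions are nested,
    so NDF >= ADF >= ADL >= 0.
    """
    if not (ndf >= adf >= adl >= 0):
        raise ValueError("expected NDF >= ADF >= ADL >= 0")
    return {
        "hemicellulose": ndf - adf,  # fraction removed by the ADF step
        "cellulose": adf - adl,      # fraction removed by the 72% H2SO4 step
        "lignin": adl,               # residue reported as ADL
    }


# Hypothetical values for a single straw sample, not data from this study.
example = fiber_fractions(ndf=80.0, adf=55.0, adl=8.0)
print(example)
```

Validating the ordering of the three inputs catches weighing or data-entry errors before they propagate into negative derived fractions.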
STATISTICAL ANALYSIS
Prior to any statistical analysis, influential outliers were removed based upon analysis of the residuals. Summary statistics were calculated in R (R Foundation for Statistical Computing, Vienna, Austria) using a one-way ANOVA and Tukey post hoc test. Trait correlations were calculated in JMP Genomics (SAS Institute Inc., Cary, NC, USA). Population structure is present in this panel and was accounted for using principal component analysis (PCA) calculated in GAPIT2. For each trait, the number of principal components used was determined by examining the Q-Q plots and selecting the best-fitting model. Principal components one and three were used for NDF, ADF, and C. Principal components two and three were the best fit for ADL. The first principal component was used for cellulose, whereas only the second principal component was used for N. No principal components were used for hemicellulose, indicating that the population structure within the sampled cultivars did not have a significant influence on this trait. Because the phenotypic data for each trait varied widely across environments, best linear unbiased predictions (BLUPs) were calculated in R with both genotype and environment included as random effects. The marker-trait association analysis was performed using a mixed linear model (MLM) in Fixed and random model Circulating Probability Unification (FarmCPU) and implemented in GAPIT2. To correct for multiple testing, the Bonferroni correction was applied with a significance level of α = 0.05.
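The Bonferroni step can be made concrete with a short Python sketch: the per-marker significance threshold is α divided by the number of tests. The marker count below is a hypothetical placeholder, since the number of SNPs tested is not given here, and the helper name is illustrative rather than part of GAPIT2.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Return the per-test p-value threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Hypothetical marker count; substitute the number of SNPs actually tested.
n_markers = 10_000
threshold = bonferroni_threshold(0.05, n_markers)

# The same threshold on the -log10(p) scale used in Manhattan plots.
neg_log10_threshold = -math.log10(threshold)

# A marker-trait association is declared significant when p < threshold.
p_values = {"snp_a": 1.2e-7, "snp_b": 3.0e-4}  # illustrative values only
significant = [name for name, p in p_values.items() if p < threshold]
print(threshold, round(neg_log10_threshold, 2), significant)
```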
NEAR-INFRARED SPECTROSCOPY
SAMPLE COLLECTION AND PREPARATION
Two separate populations were used in this study. The first was a panel of 480 advanced soft white winter wheat cultivars from breeding programs in the Pacific Northwest (Washington State University, University of Idaho, Oregon State University, USDA-ARS). This panel was harvested from Pullman, WA in 2016 and from Pullman, Central Ferry, and Mansfield, WA in 2017. Pullman is located at 46.73° N, 117.18° W on the eastern border of Washington state at an elevation of 717 m above sea level. The average annual precipitation exceeds 500 mm, soil types range from silt loams to silty clay loams (USDA-NRCS, 2017), and winter wheat grain yields average around 7900 kg/ha (Wheat & Small Grains, 2018). Central Ferry is located approximately 95 km southwest of Pullman at 46.62° N, 117.79° W, was irrigated (600 mm), and averages approximately 6700 kg/ha of grain yield. It sits at an elevation of 195 m on silt loam soils (USDA-NRCS, 2017). Mansfield, located approximately 320 km northwest of Pullman at 47.81° N, 119.64° W, has an annual precipitation of less than 300 mm, and grain yields average around 3550 kg/ha (Wheat & Small Grains, 2018). The soils are classified as ashy fine sandy loam and the elevation is 692 m. An augmented design was used for this population, with repeating checks every 20 entries and one replicate per location. The second population included 167 recombinant inbred lines (RIL) developed through single seed descent after crossing the winter wheat cultivars Finch (PI 628640) and Eltan (PI 536994) (Peterson et al., 1991; Garland-Campbell et al., 2005; Balow et al., 2019). These two cultivars were selected for crossing because both were released for production in the low rainfall areas of Washington and because of the stark contrast in decomposition potential between them (Stubbs et al., 2009). This population was harvested from Pullman, Mansfield, and Waterville, WA in 2015 and from Pullman and Mansfield in 2017.
Waterville is located approximately 50 km southwest of Mansfield, has mostly silty loam and sandy loam soils, and has grain yields comparable to Mansfield. The coordinates of Waterville are 47.65° N, 120.07° W and the elevation is 800 m. The field design for this population was a randomized complete block with two replicates per location and a repeating check every 20 entries. One half-meter row of straw from each cultivar in both populations was cut just above ground level at harvest maturity (stage 11.4 on Feekes’ scale) (Large, 1954; Feekes, 1941) and placed in brown paper sacks with the grain removed. For consistency, the leaves and the nodes were removed, leaving only the internode portion of the straw. The internode residue was cut into 1-2 cm pieces and then ground to pass through a 1 mm sieve using a FOSS Cyclotec Sample Mill (FOSS North America, Eden Prairie, MN).
FIBER AND NUTRIENT ANALYSIS
Neutral detergent fiber (NDF), acid detergent fiber (ADF), and acid detergent lignin (ADL) were determined by analyzing 0.5-0.55 g of ground winter wheat straw using the Van Soest et al. (1991) procedure, modified slightly by using an ANKOM automated system with specialized filter bags (ANKOM Technology, Macedon, NY). The NDF procedure removes starches, sugars, free amino acids, and other water-soluble components, leaving hemicellulose, cellulose, and ADL. The ADF procedure removes the hemicelluloses, leaving only the cellulose and ADL. The ADL procedure removes the cellulose from the straw. The NDF and ADF procedures were performed sequentially using an ANKOM 200 Fiber Analyzer (ANKOM Technology, Macedon, NY) following ANKOM procedures. The straw samples were then digested in 72% H2SO4 to determine the ADL. After each procedure, the samples were air-dried overnight in a lab hood and then dried in a hybridization incubator for a minimum of 6 hours at 64 °C. The samples were then removed from the incubator, placed in desiccator pouches to cool, and individually weighed. Cellulose and hemicellulose values were derived from the NDF, ADF, and ADL values. Dry combustion with a LECO TruSpec Analyzer (LECO Corp., St. Joseph, MI) was used to determine carbon (C), nitrogen (N), and C/N ratio as described by Gazulla et al. (2012).
NEAR-INFRARED SPECTROSCOPY
Finely ground winter wheat residue was enclosed in metal ring cups (36 mm inside diameter) and scanned with a FOSS XDS Rapid Content Analyzer (FOSS North America, Eden Prairie, MN) using ISIscan software, version 3.10 (Infrasoft International, State College, PA). Each sample was scanned twice, with the cup rotated 90° between the first and second scan, using the wavelength range 400-2498 nm at 2 nm intervals. The two resulting spectra from each sample were averaged.
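The scanning scheme can be sketched in a few lines of Python. The sketch builds the wavelength grid described above (400-2498 nm in 2 nm steps, which gives 1050 data points) and averages the two scans of one sample point by point. The absorbance values are synthetic placeholders and the function name is hypothetical; the instrument software performs this step in practice.

```python
# Wavelength grid: 400, 402, ..., 2498 nm.
wavelengths = list(range(400, 2499, 2))


def average_scans(scan_a, scan_b):
    """Average two spectra of the same sample, wavelength by wavelength."""
    if len(scan_a) != len(scan_b):
        raise ValueError("both scans must cover the same wavelength grid")
    return [(a + b) / 2 for a, b in zip(scan_a, scan_b)]


# Synthetic flat spectra standing in for the first scan and the scan taken
# after rotating the cup by 90 degrees.
first_scan = [0.5] * len(wavelengths)
second_scan = [0.7] * len(wavelengths)
mean_spectrum = average_scans(first_scan, second_scan)
print(len(wavelengths), mean_spectrum[0])
```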
STATISTICS
The four environments from the first population were combined into one set to develop broad-range prediction equations. After the laboratory reference data were combined with the corresponding spectra, the data set was randomly divided into two parts, one for developing calibration equations (n = 1360) and one for validating the equations (n = 500), using WinISI software, version 4.0 (Infrasoft International, State College, PA). Principal component analysis was used to eliminate spectral outliers, which were defined as spectra with Mahalanobis distance (H) values greater than 3.0. Modified partial least-squares (MPLS) and cross validation were used to develop prediction equations. Standard normal variate and detrend (SNV-D) was applied as a scatter correction, along with math treatments for derivative order number, gap, first smoothing, and second smoothing. For all equations, the second smoothing was set at 1 to indicate that no second smoothing was used. The best prediction equation was identified as the one with the highest 1-variance ratio (1-VR) and the lowest standard error of cross validation (SECV). The ratio of the standard deviation (SD) to the SECV was calculated because it indicates whether an equation is acceptable for quantitative prediction, suitable for screening only, or not useful. The second population of RIL was used to test the equations’ performance on breeding populations, and analyses were repeated as described above.
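As an illustration of the scatter correction, the following Python sketch applies the standard normal variate (SNV) transform to a single spectrum: each spectrum is centered on its own mean and scaled by its own standard deviation. This is a simplified sketch under stated assumptions, not the WinISI implementation; it omits the detrend step (typically a polynomial baseline fit across wavelength) and the derivative math treatments, and the input values are synthetic.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own
    mean and sample standard deviation."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    if sd == 0:
        raise ValueError("spectrum is constant; SNV is undefined")
    return [(value - mean) / sd for value in spectrum]


# Synthetic absorbance values for a handful of wavelengths.
raw_spectrum = [0.42, 0.45, 0.51, 0.60, 0.58]
corrected = snv(raw_spectrum)

# After SNV, each spectrum has mean 0 and standard deviation 1.
print(round(statistics.fmean(corrected), 6), round(statistics.stdev(corrected), 6))
```

Because the transform is applied spectrum by spectrum, it needs no reference to the rest of the calibration set, which is why it can be applied identically to calibration and validation samples.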
Research results and discussion:
Analysis of the phenotypic data showed that fiber constituents varied across environments. On average, NDF, ADF, ADL, and cellulose values followed a consistent trend: the high rainfall environments (Pullman 2016, Pullman 2017) had the highest values, whereas the low rainfall environment (Mansfield 2017) had the lowest. Our irrigated environment (Central Ferry 2017) always had intermediate values. However, this trend did not hold for hemicellulose, nitrogen, and carbon. There was no significant difference between average hemicellulose content in Mansfield 2017 and Pullman 2016. Central Ferry 2017 had the highest average nitrogen values. Also, there was no significant difference between Mansfield 2017 and Pullman 2017 for average carbon content.
A genome-wide association study (GWAS) was conducted using these phenotypic data along with corresponding genotypic data collected prior to the beginning of this project. The GWAS identified five chromosomal locations that correspond with cellulose, ADF, and NDF and may be useful as a reference for future fiber/nutrient genetic analyses. We identified a total of 23 marker-trait associations distributed across 12 wheat chromosomes. No single marker explained more than 11% of the phenotypic variation of any trait, so none is likely to be useful in marker-assisted selection. However, considering the very low heritability of these traits (ranging from 0.03 to 0.21), the phenotypic variation explained by several of these markers was fairly high. The phenotypic variation explained by a marker should not exceed the heritability for the trait involved in the marker-trait association. This wide distribution of low-effect markers allowed us to conclude that the genetic architecture of these fiber/nutrient traits is very complex and that GWAS may not be the best tool for assisting in selection of these decomposition constituents. The results did help to estimate the usefulness of each trait for similar future studies. Carbon and N were the least useful for identifying regions of interest, as only three SNPs, each explaining very little of the phenotypic variation, were associated with the two traits. Cellulose, ADF, and NDF contributed the most to identifying chromosomal regions of interest and provided valuable information regarding the genetic composition of fiber production in straw. Nevertheless, genomic selection may be better suited for analyzing traits such as these.
Near-infrared spectroscopy calibration equations have been developed for prediction of NDF, ADF, ADL, cellulose, hemicellulose, C, and N. The equations accurately predicted NDF and ADF in a validation subset, with coefficients of determination (R2) of 0.85 and 0.86, respectively. The R2 for the cellulose prediction was 0.88. Nitrogen was predicted with moderate accuracy in the validation subset (R2 = 0.73). The ratio of the standard deviation (SD) to the standard error of cross validation (SECV) is commonly used to determine whether an equation is useful for quantitative prediction. Equations with SD/SECV ratios > 3.0 are useful for quantitative prediction, whereas ratios between 2.5 and 3.0 are effective for screening purposes only. An equation with an SD/SECV ratio < 2.5 is not useful. According to these guidelines, only our cellulose equation is suitable for quantitative prediction (ratio = 3.17), whereas our equations for NDF and ADF are useful for screening purposes (ratios = 2.71 and 2.84, respectively).
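The SD/SECV guidelines can be expressed as a small decision rule. The Python sketch below applies the cutoffs stated above to the three ratios reported for cellulose, NDF, and ADF. The function name is hypothetical, and the handling of ratios exactly equal to 3.0 or 2.5 is an assumption, since the guidelines do not say which class those boundary values fall into.

```python
def classify_equation(sd_secv_ratio):
    """Classify a calibration equation by its SD/SECV ratio.

    Assumption: ratios exactly at a cutoff fall into the lower class.
    """
    if sd_secv_ratio > 3.0:
        return "quantitative prediction"
    if sd_secv_ratio > 2.5:
        return "screening only"
    return "not useful"


# SD/SECV ratios reported in the text.
reported_ratios = {"cellulose": 3.17, "NDF": 2.71, "ADF": 2.84}
classes = {trait: classify_equation(r) for trait, r in reported_ratios.items()}
for trait, label in classes.items():
    print(f"{trait}: {label}")
```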
Lab reference data and NIR spectra have already been collected for a second population. This population was derived from a cross between two commercial cultivars from the Pacific Northwest, Finch and Eltan, and is representative of the populations we hope to assess with our NIRS equations. The data from this population encompass five environments. Overall, our equations were less accurate for this population than for the first validation set. As in the first validation set, however, the highest R2 values were generally found for NDF, ADF, and cellulose across all five environments. The R2 values ranged from 0.33 to 0.77 for NDF, 0.40 to 0.69 for ADF, and 0.44 to 0.79 for cellulose. Hemicellulose was consistently difficult to predict, as shown by R2 values that ranged from 0.10 to 0.50. NIRS was generally unsuccessful in predicting C and N.
Our NIRS equations predicted NDF, ADF, and cellulose in the first validation set with high accuracy, while hemicellulose and ADL were predicted with lower accuracy. The overall predictive ability of NIRS decreased when it was used to predict the same traits in the Finch x Eltan breeding population, but it remained moderately high for NDF, ADF, and cellulose and suitable for our purpose of estimating decomposition potential. The C and N prediction accuracies were too low across every environment to be trusted. Using NIRS for screening, rather than prediction, will identify whether a sample falls into a range of high or low NDF, ADF, and cellulose values, which will be sufficient for recommendations and breeding purposes. By grouping NDF, ADF, and cellulose values into a “high” or “low” category and combining them with C and N values obtained through TruSpec analysis, an estimate of fast or slow decomposition can be made for individual cultivars. Additional samples from a wider range of environments would be beneficial for increasing the predictive accuracy and reliability of the NIRS equations.
We evaluated 36 entries from the WSU variety testing program, all commercial varieties available for production across the various production regions of the state. Lines were sampled from three different locations across both high and low rainfall regions according to the methods previously described. Wet chemistry analysis and NIRS were performed as previously described. Based on these data, nine cultivars could confidently be categorized as having rapid straw breakdown; these cultivars are currently adapted to the high rainfall regions of the state. This information will be disseminated at summer grower field days across the state to encourage growers to use these cultivars if they are operating no-till production systems. Seven cultivars were identified as slower-decomposing lines, which would be useful under low rainfall production systems. Of these seven lines, only four are adapted to the deep-furrow planting traditional in the low rainfall region. These cultivars will also be discussed at field days during the summer months to better inform growers' cultivar selection decisions. One cultivar gave conflicting data, with one location indicating fast breakdown and two locations indicating slow breakdown. The remaining 19 lines could not be classified as either fast or slow straw breakdown lines. This indicates that they are either intermediate lines or that additional data are required to properly categorize them. Regardless, we were able to collect enough information to begin giving preliminary recommendations of cultivars to be grown in different production regions based on the needs and desires of individual growers.