Can low-cost NIR reflectometers predict Potential Mineralizable Nitrogen in organic farms?

Final report for GNC23-368

Project Type: Graduate Student
Funds awarded in 2023: $14,999.00
Projected End Date: 09/02/2025
Grant Recipient: Michigan State University
Region: North Central
State: Michigan
Graduate Student:
Faculty Advisor:
Dr. Kimberly Cassida
Michigan State University

Project Information

Summary:

A majority of growers report seeking further support in managing nutrients on their farms. Nutrient supply is challenging in organic systems because synthetic fertilizers are prohibited. Knowledge of the biotic and abiotic processes on an organic farm is key to understanding nutrient demand in organic systems. As such, farmers usually send soil samples to soil testing centers to learn the soil properties that inform decisions about managing nitrogen (N) on their farms. These traditional lab analyses are time-consuming and expensive, and they require harsh chemicals. NIR reflectometers are cheap, rapid, and relatively easy to use, and they allow non-destructive, repeated measurement of soil near-infrared reflectance, which has been tested for predicting several soil properties of interest to farmers.

Organic farms sometimes must rely on organic sources of N such as cover crop residues, because demand for manure is very high. Potential mineralizable N (PMN) is a useful metric that allows farmers to estimate the amount of N supplied by organic inputs (manure, residue, etc.). While NIR reflectometers are widely used to estimate total soil C and N, their use to estimate PMN is limited. PMN is mediated by microbial communities and indicates a soil's inherent capacity to supply N to cash crops from organic sources.

In this project, we examined the use of a handheld reflectometer to rapidly estimate soil PMN. Handheld NIR spectrometers are cost-effective, rapid, and non-destructive, and they do not require chemical reagents. Developing predictive models that tie the NIR spectral signal to PMN could enable farmers to make rapid, site-specific N management decisions tailored to their fields. Our study used soil samples from a cover crop study across 20 organic farms in the Southwest and Thumb regions of Michigan, each planted with three cover crop treatments (cereal rye, crimson clover, and a four-way mix of cereal rye, crimson clover, rapeseed, and oats). The study was a randomized complete block design with four replicates per farm, resulting in a total of 240 samples.

Lab analyses included PMN by 7-day anaerobic incubation, soil characterization at the block level, NIR scans with a Si-Ware NeoSpectra handheld reflectometer (1350-2550 nm range, six replicate scans per sample), and benchtop attenuated total reflectance Fourier transform infrared (ATR-FTIR) spectroscopy in the MIR (2500-25000 nm range). Both NIR and MIR spectra were preprocessed using standard normal variate (SNV) transformation and Savitzky-Golay smoothing. Principal component analysis was used to reduce wavelength dimensions while preserving the variation in the spectra. We used partial least squares regression (PLSR) to develop three models: (1) PMN as a function of NIR, (2) PMN as a function of MIR, and (3) PMN as a function of NIR and MIR combined. Each model was evaluated with leave-one-farm-out cross-validation (LOFO-CV) and 10-fold CV, resulting in six models. We used the coefficient of determination (R²) to assess model prediction accuracy.

We found that PMN did not vary significantly across the three treatments. PMN varied across farms, ranging from 9 to 108 mg kg-1 soil. Plotting the soil NIR and MIR spectra in principal component space showed that the spectral signatures differed across farms. NIR spectra explained 21% of the variation in PMN when generalized across farms (LOFO-CV). However, within farms, when samples from every farm were included in training (10-fold CV), NIR explained 51% of the variation in PMN. Prediction of PMN using MIR spectra produced unreliable estimates, and combining NIR and MIR spectra did not improve model performance. Overall, our findings suggest that within-farm generalization of PMN estimates using NIR spectra gives moderate performance, but across-farm generalization has limited predictive value. The MIR spectra did not contain spectral information helpful in predicting PMN. Either larger datasets are needed, so that models are trained on a wide enough range of farms to extrapolate to new farms, or spectral models should be confined to predicting static soil properties such as total N and total C rather than PMN, which is a dynamic soil metric mediated by microbial communities.

 

 

Project Objectives:

The objectives of this study were to:

  1. Evaluate the efficacy of hand-held spectrometers to estimate PMN.
  2. Develop predictive models using hand-held spectrometers that provide measures of PMN.
  3. Compare PMN estimates from NIR hand-held spectrometers with bench top MIR spectroscopy to develop guides tailored to farmers.

We hypothesized that: 

  1. The variation in PMN across farms and cover crop treatments can be explained by the NIR spectral information from the reflectometer.
  2. The MIR spectroscopy estimates are better associated with PMN lab measurements than NIR spectroscopy.

This project focused on foundational research to assess the feasibility of using a handheld NIR reflectometer for PMN prediction on organic farms. The participating growers contributed site and soil sample access, making this collaborative research possible. While farmer learning outcomes and adoption actions were part of the proposal, our results showed that across-farm generalization of PMN estimates using NIR is not reliable, so the educational program was not pursued to avoid promoting an unreliable method for estimating PMN. Research findings will be disseminated through a poster presentation at the Tri-Societies Meeting 2025 and via a peer-reviewed publication to inform the broader scientific community.

Cooperators

  • Faisal Sheriff (Researcher)

Research

Materials and methods:

Definitions: 

  1. Handheld spectrometer: A handheld device that measures the near-infrared spectrum in the range of 1350 - 2500 nm.
  2. Potential mineralizable N (PMN): Amount of organic N that can be converted to plant-available (mineral) N under anaerobic conditions in the lab.
  3. Si-Ware NeoSpectra scanner: Handheld spectrometer used in our study.
  4. NIR spectroscopy: Technique that analyzes the chemical, physical, and biological properties of solids (soils in our study) using the near-infrared (NIR) spectrum.
  5. MIR spectroscopy: Technique that analyzes the chemical, physical, and biological properties of solids (soils in our study) using the mid-infrared (MIR) spectrum.

Study design and sample collection 

This study was conducted across 20 organic grain farms in the Southwest and Thumb regions of Michigan. The experiment was a randomized complete block design with four replicates on each farm and three cover crop treatments: cereal rye monoculture (RYE), crimson clover monoculture (CCLO), and a four-way mix of RYE, CCLO, rapeseed, and oats (4-WAY MIX). Soil samples were collected at the plot level from the 0-20 cm depth, for a total of 240 samples.

Laboratory analyses

We measured PMN in the lab using air-dried, 2-mm sieved soil samples. One subsample was extracted with 2 M KCl, and the extract was analyzed colorimetrically to determine initial nitrate and ammonium. A second subsample was mixed with deionized water, purged with N2 gas to create anaerobic conditions, and incubated for 7 days. After 7 days, samples were analyzed for ammonium content. Under anaerobic conditions, nitrate is microbially converted into ammonium, and the PMN in our study is the difference between the ammonium content after incubation and the initial ammonium content. Soil characterization was performed at the block level to capture baseline soil properties.
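
The PMN calculation described above is a simple difference. A minimal Python sketch is shown here for illustration only; the function name and the ammonium values are hypothetical and are not data from this study.

```python
def potential_mineralizable_n(nh4_initial, nh4_incubated):
    """Return PMN (mg N per kg soil): ammonium-N after the 7-day
    anaerobic incubation minus the initial ammonium-N."""
    return nh4_incubated - nh4_initial


# Hypothetical ammonium-N values (mg per kg soil) for three samples.
initial = [4.0, 6.5, 3.5]
incubated = [55.0, 40.5, 20.5]

pmn = [potential_mineralizable_n(a, b) for a, b in zip(initial, incubated)]
print(pmn)  # [51.0, 34.0, 17.0]
```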

Spectroscopic measurements

A Si-Ware NeoSpectra handheld reflectometer (1350-2550 nm) was used to collect NIR spectra of the soil samples. Six replicate scans were collected per sample and averaged to yield one NIR spectrum per sample. Similarly, attenuated total reflectance Fourier transform infrared (ATR-FTIR) spectroscopy was performed using the Perkin Elmer II spectrometer to collect MIR spectra in the 2500-25000 nm range (4000-400 cm⁻¹) for each soil sample. Six replicate scans per sample were measured and averaged, as for the NIR spectra.
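
Averaging replicate scans can be sketched as follows. This is an illustrative Python example, not the study's R code; `average_scans` is a hypothetical helper, and the example uses two short made-up scans rather than six full spectra.

```python
def average_scans(scans):
    """Average replicate scans wavelength by wavelength.

    `scans` is a list of equal-length lists, one per replicate scan.
    """
    n_scans = len(scans)
    n_points = len(scans[0])
    if any(len(s) != n_points for s in scans):
        raise ValueError("all replicate scans must share the same wavelength grid")
    return [sum(s[j] for s in scans) / n_scans for j in range(n_points)]


# Two hypothetical replicate scans at three wavelengths.
replicates = [[0.25, 0.50, 0.75],
              [0.75, 0.50, 0.25]]
print(average_scans(replicates))  # [0.5, 0.5, 0.5]
```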

Spectral preprocessing

Both NIR and MIR spectra were preprocessed to reduce scattering effects and enhance spectral features. Standard normal variate (SNV) transformation was applied to correct for light scattering, and Savitzky-Golay smoothing was applied to reduce noise while preserving spectral peaks. For MIR spectra, a subset analysis was also performed in the 1400-400 cm⁻¹ range.
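
For readers unfamiliar with these transforms, the sketch below shows what SNV and a Savitzky-Golay filter do to a single spectrum. It is a plain-Python illustration, not the prospectr code used in the study. The 5-point window and quadratic polynomial are assumptions chosen because their coefficients are well known; the report does not state the window or polynomial order actually used.

```python
import math


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its
    own mean and (sample) standard deviation."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]


# Savitzky-Golay coefficients for a 5-point window, quadratic polynomial
# (assumed settings, for illustration only).
SG_SMOOTH_5 = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]
SG_DERIV1_5 = [-2 / 10, -1 / 10, 0.0, 1 / 10, 2 / 10]  # per index step


def savgol5(spectrum, coeffs):
    """Apply a 5-point filter; the two points at each edge are dropped."""
    return [sum(c * spectrum[i + k] for k, c in enumerate(coeffs))
            for i in range(len(spectrum) - 4)]


print(snv([1.0, 2.0, 3.0]))  # [-1.0, 0.0, 1.0]
```

A quadratic filter reproduces a quadratic signal exactly (up to floating-point rounding), which is a convenient sanity check: smoothing `[x**2 for x in range(7)]` returns values close to 4, 9, and 16, and the first-derivative filter returns values close to 4, 6, and 8.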

Data reduction and modeling

Principal component analysis (PCA) was employed to reduce the dimensionality of the spectral data and to compare spectral information between farms by plotting the spectra in principal component space. Partial least squares regression (PLSR) models were then developed to predict PMN from the spectral data. Three models were constructed: PMN = f(NIR), PMN = f(MIR), and PMN = f(NIR + MIR).
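
To illustrate the idea behind PLSR with a single response such as PMN, the sketch below implements the textbook NIPALS algorithm for PLS1 in plain Python. It is not the implementation in the R pls package used for this study, and the small data set is made up. Each component is the direction in the spectra that covaries most with the remaining unexplained response, and both the spectra and the response are deflated after each component.

```python
import math


def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS. X is a list of rows, y a list of responses."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred X
    f = [v - y_mean for v in y]                                 # centred y
    components = []
    for _ in range(n_components):
        # Weight vector: covariance between each variable and the residual y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p_load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y by the part explained by this component.
        E = [[E[i][j] - t[i] * p_load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p_load, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict the response for one new row x using the fitted components."""
    x_mean, y_mean, components = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, p_load, q in components:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * p_load[j] for j in range(len(e))]
    return y_hat


# Made-up data: the response is an exact linear function of three variables.
X_demo = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1], [1, 1, 1]]
y_demo = [2 * a - b + 0.5 * c + 3 for a, b, c in X_demo]
demo_model = pls1_fit(X_demo, y_demo, n_components=3)
print(round(pls1_predict(demo_model, [2, 0, 1]), 6))
```

With as many components as variables and a noise-free linear response, PLS1 coincides with ordinary least squares, so the printed prediction should match 2*2 - 0 + 0.5*1 + 3 = 7.5. With real spectra, far fewer components than wavelengths are kept.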

For both NIR and MIR spectra, models were tested with three preprocessing approaches: raw absorbance, SNV-transformed spectra, and SNV with first-derivative Savitzky-Golay smoothing. Model optimization involved testing up to 20 components, with the optimal number selected based on the minimum root mean square error of cross-validation (RMSECV).
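
Component selection by minimum RMSECV can be sketched as below. This Python example is illustrative only: `cv_predictions` is a hypothetical mapping from a candidate number of components to the cross-validated predictions obtained with that many components, and the numbers are made up.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


def select_n_components(observed, cv_predictions):
    """Return the component count with the lowest RMSECV, plus all RMSECVs.

    Ties are resolved in favour of the smaller number of components.
    """
    rmsecv = {k: rmse(observed, preds) for k, preds in cv_predictions.items()}
    best = min(sorted(rmsecv), key=lambda k: rmsecv[k])
    return best, rmsecv


# Made-up observed PMN and cross-validated predictions for 1 to 3 components.
observed_pmn = [10.0, 20.0, 30.0]
cv_predictions = {
    1: [12.0, 18.0, 33.0],
    2: [10.0, 21.0, 30.0],
    3: [11.0, 21.0, 29.0],
}
best_k, rmsecv_by_k = select_n_components(observed_pmn, cv_predictions)
print(best_k)  # 2
```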

Model Validation

We employed two complementary cross-validation methods in our study. First, a leave-one-farm-out cross-validation (LOFO-CV) was used, where one farm was held out as the test set while the model was trained on the remaining 19 farms. This process was repeated 20 times, with each farm serving as the test set once. LOFO-CV evaluates the model's ability to generalize across farms with varying soil conditions and management histories. A 10-fold cross-validation was also used, in which the training data were randomly divided into 10 subsets, with each subset serving as the test set once while the model was trained on the other 9 subsets. This method was used internally during model training to select the optimal number of components for the PLSR models.

Separate from the LOFO-CV analysis described above, the entire dataset (240 samples) was randomly divided into 10 subsets (folds) using stratified sampling. Each fold was used as a test set once while the model was trained on the remaining 9 folds. For each fold, internal 10-fold cross-validation was performed on the training data to select the optimal number of components. Because this validation ignores farm identity, it evaluates within-dataset prediction accuracy: samples from the same farm can appear in both the training and test sets. It therefore provides an upper bound on model performance when samples are randomly distributed rather than spatially clustered by farm.
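
The difference between the two validation schemes comes down to how sample indices are split. The Python sketch below is illustrative and is not the study's R code. The farm labels are hypothetical, the random folds are not stratified (unlike the stratified sampling described above), and the seed value of 2025 will not reproduce the R fold assignments.

```python
import random


def leave_one_farm_out(farm_ids):
    """Yield (held_out_farm, train_indices, test_indices) once per farm."""
    for farm in sorted(set(farm_ids)):
        test = [i for i, f in enumerate(farm_ids) if f == farm]
        train = [i for i, f in enumerate(farm_ids) if f != farm]
        yield farm, train, test


def random_k_fold(n_samples, k=10, seed=2025):
    """Yield (train_indices, test_indices) for k random folds.

    Farm identity is ignored, so one farm can contribute samples to both
    the training and test sets of the same fold.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for test in folds:
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, sorted(test)


# 20 hypothetical farms x 12 plots (4 blocks x 3 treatments) = 240 samples.
farm_ids = [farm for farm in range(1, 21) for _ in range(12)]

lofo_splits = list(leave_one_farm_out(farm_ids))
kfold_splits = list(random_k_fold(len(farm_ids), k=10))
print(len(lofo_splits), len(kfold_splits))  # 20 10
```

In the LOFO splits, the held-out farm never appears in the training indices, which is why LOFO-CV is the stricter test of generalization to new farms.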

Block-Level Aggregation

To account for the hierarchical structure of the data, block-level aggregation was performed by averaging spectral and soil measurements across the three cover crop treatments within each replicate. The aggregated data were used for certain soil property correlations and PCA visualizations.
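
Block-level aggregation amounts to a grouped mean. A minimal Python sketch follows, with hypothetical record fields (`farm`, `block`, `treatment`, `pmn`) and made-up values; the same grouping would apply to each spectral variable.

```python
from collections import defaultdict


def aggregate_by_block(records, value_key):
    """Average one measurement across treatments within each (farm, block)."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["farm"], rec["block"])].append(rec[value_key])
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


# Made-up plot-level records for two blocks on one farm.
plots = [
    {"farm": 1, "block": 1, "treatment": "RYE", "pmn": 30.0},
    {"farm": 1, "block": 1, "treatment": "CCLO", "pmn": 36.0},
    {"farm": 1, "block": 1, "treatment": "4-WAY MIX", "pmn": 42.0},
    {"farm": 1, "block": 2, "treatment": "RYE", "pmn": 50.0},
    {"farm": 1, "block": 2, "treatment": "CCLO", "pmn": 52.0},
    {"farm": 1, "block": 2, "treatment": "4-WAY MIX", "pmn": 54.0},
]
block_means = aggregate_by_block(plots, "pmn")
print(block_means)  # {(1, 1): 36.0, (1, 2): 52.0}
```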

Model Performance Evaluation

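Model prediction was assessed using the coefficient of determination (R²), calculated as R² = 1 - RSS/TSS, where RSS is the residual sum of squares (observed minus predicted PMN) and TSS is the total sum of squares (observed PMN minus its mean). R² measures the proportion of variance in observed PMN that is explained by the predicted values; it is negative when predictions are worse than simply predicting the mean PMN.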
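
The R² definition above can be written out directly. The short Python sketch below uses made-up numbers to show the three characteristic cases: a perfect prediction, a prediction equal to the mean, and a prediction worse than the mean.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, R2 = 1 - RSS / TSS."""
    mean_obs = sum(observed) / len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    tss = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - rss / tss


observed = [10.0, 20.0, 30.0]
print(r_squared(observed, [10.0, 20.0, 30.0]))  # 1.0  (perfect prediction)
print(r_squared(observed, [20.0, 20.0, 20.0]))  # 0.0  (always predicts the mean)
print(r_squared(observed, [30.0, 20.0, 10.0]))  # -3.0 (worse than the mean)
```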

All data were pre-processed, analyzed, and visualized using the statistical software R (version 4.5.1). We used tidyverse for data processing and visualization, prospectr for spectral preprocessing, pls for PLSR modeling, and factoextra for PCA visualization. The random seed was set to 2025 to ensure reproducibility.

Research results and discussion:

Potential mineralizable nitrogen 

Potential mineralizable N (PMN) differed across farms but not between the cover crop treatments (p = 0.823). PMN ranged from 9 to 108 mg kg-1 soil across the farms studied, with an overall average of 51 mg kg-1 soil (Figure 1). These values are typical of PMN values found across the temperate region of the US Midwest. Variability in PMN is expected across farms with diverse soil properties, environmental conditions, and management practices. The fact that PMN was not significantly different among the cover crop treatments suggests that the variability in PMN was due to differences among farms rather than the cover crop treatments.

NIR and MIR spectra visualization

Principal component analysis of the NIR and MIR spectra showed distinct spectral information for one farm, which was classified as a Thomas Muck soil by the Web Soil Survey staff (Figure 2, Figure 3). This spectral differentiation indicates that farms possess unique soil characteristics that influence reflectance patterns, which may be related to differences in soil organic matter composition, mineral content, texture, and moisture. The fact that farms cluster distinctly in spectral space suggests that spectroscopy captures comprehensive soil compositional information. However, PMN is a dynamic biological property, which may be influenced by factors not fully captured by spectral reflectance.

Model performance: NIR spectroscopy

NIR model performance varied with the validation approach. With 10-fold CV, NIR-based PLSR models explained 51% of the variation in PMN (R² = 0.51). This suggests that NIR may provide useful information for PMN prediction when interpolating, that is, when soil samples from all farms are included in the training set. However, under the stricter leave-one-farm-out CV, model performance dropped to R² = 0.21, explaining only 21% of the PMN variation. NIR-based models developed on one set of farms therefore have limited ability to generalize to new farms that are not represented in the training dataset.

Among the NIR preprocessing methods tested (raw absorbance, SNV transformation, and Savitzky-Golay first derivative), raw absorbance gave the best performance in both validation scenarios. The Savitzky-Golay preprocessing resulted in negative R², which means that the models were worse than simple mean predictions.

Model performance: MIR spectroscopy

Mid-infrared spectroscopy produced unreliable estimates of PMN under both validation approaches. Using LOFO-CV, the MIR-based model produced a negative R² value (R² = -0.21), indicating that predictions were worse than using the mean PMN value. Similarly poor performance was observed with the truncated MIR range (1400-400 cm⁻¹) that focused on the fingerprint region. These results suggest that despite the sensitivity of MIR to functional groups, the spectral information in the MIR range does not capture the dynamic, biologically mediated processes that determine PMN. Unlike total soil nitrogen or carbon, which are relatively stable chemical properties directly related to molecular bond vibrations detectable by infrared spectroscopy, PMN represents the potential for biological nitrogen mineralization mediated by microbial communities. This biological activity depends on factors such as microbial community composition, substrate availability, and soil moisture conditions, which may not have distinct spectral signatures in the infrared range.

Model Performance: Combined NIR and MIR Spectroscopy

Combining NIR and MIR spectra did not improve model performance over using NIR or MIR spectra alone. Under LOFO-CV, the combined spectral model performed similarly to the MIR models, suggesting that the additional MIR information did not add predictive value and instead introduced noise that interfered with the NIR signal. This finding supports the view that MIR spectra do not contain complementary information useful for PMN prediction, and that model complexity should not be increased without corresponding improvements in predictive accuracy. A summary of model performance in this study is listed in Table 1.

Participation Summary
13 Farmers participating in research

Educational & Outreach Activities

1 Journal articles
1 Webinars / talks / presentations

Participation Summary:

13 Farmers participated
Education/outreach description:

We completed the research assessing the feasibility of using handheld NIR spectroscopy to predict PMN in organic farming systems in Michigan. We leveraged established relationships with 13 participating farmers and their soils data to assess predictive models using NIR and MIR spectral data. Research findings will be presented as a poster at the Tri-Societies Meeting 2025 (Tri Societies 2025 Poster). A scientific manuscript is being prepared for submission to PLOS ONE, which is an excellent platform for publishing studies with null results. Given poor model performance, with the best across-farm model explaining only 21% of the variability in PMN, our models are not reliable enough to promote this technology as it is. Instead, we focus on presenting our results to the scientific community, where these findings can inform other researchers about the challenges of predicting dynamic, microbially mediated soil properties like PMN using NIR spectroscopy and about the need for a larger, more representative sample so that trained models can be extrapolated to new farms. Data generated from this project will be made available in a public repository.

Project Outcomes

1 New working collaboration
Project outcomes:

While our study did not support the hypotheses tested, the knowledge that dynamic properties like PMN are challenging to predict using NIR can help farmers make informed decisions. We demonstrated that PMN estimates from NIR spectroscopy are unreliable across farms, and we avoided over-optimistic bias by using strict cross-validation methods. NIR responds to soil chemistry, so static soil properties such as total N and C may be better predicted using NIR spectroscopy than PMN, which is inherently a function of biology. This information can provide an economic benefit to informed farmers. That said, our findings can still help guide future studies toward alternative testing strategies.

Knowledge Gained:

This project highlighted that dynamic biological properties mediated by living microbes, such as PMN, cannot be predicted effectively using NIR and MIR spectroscopy, which respond to chemistry rather than biology. While my advisers and I were optimistic about the performance, the data suggested otherwise. The literature has used different validation methods for predictions; we used the one that most strictly tests generalization to new farms (LOFO-CV), and given our moderate sample size, we realized that the training set did not have enough variability to learn PMN patterns across farms. While static properties like total C and total N can be predicted with high reliability (R² > 0.9), neither NIR nor MIR spectra contained enough signal to predict a dynamic property such as PMN. Within-farm generalization of PMN estimates may be reliable, but it still requires lab-measured PMN values to calibrate the model. In addition, soils were unique to the farms studied, and a "one-size-fits-all" model was inappropriate given the dataset size (240 samples) used in this study.

Recommendations:

For future studies, we recommend that NIR spectroscopy target stable soil properties (total N, total C, pH, SOM). While NIR spectroscopy has proven applications for predicting plant tissue properties (lignin and cellulose content), it is challenging to predict dynamic soil properties, which usually vary spatially and biologically. NIR relates to chemistry, but dynamic soil processes like PMN relate to biology. If PMN prediction using NIR spectroscopy is pursued further, multi-regional databases may help improve model performance.

Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and should not be construed to represent any official USDA or U.S. Government determination or policy.