Water Availability Sources for Land Value Prediction Using Machine Learning Methods

Guillermo, Marcillo
Vestal, Mallory
Guerrero, Bridget
Thompson, Kevyn
Golden, Bill

Accurately predicting land values requires a solid understanding of the geographical, economic, and management factors affecting water availability in a given region. To this end, most parcel valuation studies use hedonic regression, or one of its variants, to model final land prices as the aggregate of all the intrinsic and market-driven characteristics of the land unit. However, the predictive capability of hedonic pricing models is limited when they are fitted to datasets of increasing complexity across time and space, especially those derived from merging multiple layers of information. In this paper, we used a land valuation dataset, originally comprising 65,000 sale-transaction records paired with 160 candidate geographical, socioeconomic, and agricultural features, to evaluate three penalized regression methods (LASSO, Ridge, and Elastic-Net) and two ensemble learning methods (Random Forest and Gradient Boosting Regression). We expect these learning algorithms to increase land-price prediction accuracy relative to conventional hedonic pricing models. This project will also evaluate the generalization capabilities of the algorithms in terms of the prediction error attributable to bias (BIAS) and variance (VAR). Future work is expected to also assess local stationarity and potential sources of dependence between observations in our dataset.
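
The abstract does not state how BIAS and VAR are computed. As a point of reference only, the standard decomposition of expected squared prediction error is sketched below; the symbols (true price function f, fitted model \hat{f}, noise variance \sigma^2) are assumptions introduced for illustration and are not taken from the poster.

```latex
% Assumed setup: y = f(x) + \varepsilon, with E[\varepsilon] = 0 and Var(\varepsilon) = \sigma^2.
% Expectations are taken over training sets and noise.
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{BIAS}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{VAR}}
  + \sigma^2
```

Under this reading, penalized regressions such as LASSO and Ridge typically accept some added bias in exchange for lower variance, while ensemble methods such as Random Forest mainly reduce variance by averaging many trees.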

Geolocated data were sourced from the Water Information Management and Analysis System (WIMAS, 2015) and the Kansas Department of Revenue (KDR, 2021). Land sale prices (adjusted for inflation to 2020 dollars) were obtained from 2,000 dryland and irrigated land sale transactions (2012 to 2021). Water and crop data were recovered from 2,497 individual wells in GMD4 from 2005 to 2015. Prior to modeling, the two georeferenced databases were merged using nearest-distance analysis based on both Euclidean and curvature-corrected (haversine) distances.
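
The haversine-based nearest-distance merge described above might look roughly like the following minimal sketch. The record layout (dictionaries with `lat`, `lon`, `well_id`), the function names, and the Earth-radius constant are all assumptions made for illustration; the poster does not describe the actual implementation, and a brute-force search is used here only for clarity.

```python
import math

# Assumed mean Earth radius in kilometres (not stated in the source).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def merge_nearest(parcels, wells):
    """Attach to each parcel the id of, and distance to, its nearest well.

    Hypothetical record layout: each parcel and well is a dict with
    'lat' and 'lon' keys; wells also carry a 'well_id' key.
    """
    merged = []
    for parcel in parcels:
        # Brute-force search over all wells: O(parcels x wells).
        best = min(
            wells,
            key=lambda w: haversine_km(parcel["lat"], parcel["lon"],
                                       w["lat"], w["lon"]),
        )
        dist = haversine_km(parcel["lat"], parcel["lon"],
                            best["lat"], best["lon"])
        merged.append({**parcel,
                       "well_id": best["well_id"],
                       "well_dist_km": dist})
    return merged
```

For a few thousand parcels and wells, the brute-force loop is fast enough; a spatial index would be the usual choice for much larger datasets.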
2023 Faculty Research Poster Session and Research Fair, West Texas A&M University, Department of Agricultural Sciences, Land value, Hedonic pricing model