Conference paper
Geospatial Sampling by Maximizing Information Entropy
Abstract
To improve unsupervised training of geospatial models, we introduce a sampling method that emphasizes diverse, clean datasets. We extract finer-resolution metrics such as land use, temperature, and precipitation, then cluster samples with similar statistics to characterize the data distribution. Weighted sampling based on cluster size keeps the selected data points representative, while a down-weighting strategy favors less frequent data to increase diversity. The result is a balanced dataset representation that significantly improves the accuracy of the geospatial foundation model. Our study underscores the potential of optimized geospatial data sampling to enhance model accuracy and broaden practical applications.
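The cluster-size weighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes cluster labels have already been computed, and the function names, the `alpha` exponent, and the toy land-use labels are all hypothetical. It relies on the fact that Shannon entropy over clusters is highest when every cluster is equally represented, so weighting each point by the inverse of its cluster size pushes the sample toward that maximum.

```python
import math
import random
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the empirical cluster distribution."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def inverse_frequency_weights(labels, alpha=1.0):
    """Per-point weight proportional to 1 / cluster_size**alpha.

    alpha is a hypothetical knob: 0 gives uniform sampling over points,
    1 makes every cluster equally likely in expectation.
    """
    counts = Counter(labels)
    return [counts[lab] ** -alpha for lab in labels]


def sample_indices(labels, k, alpha=1.0, seed=0):
    """Draw k point indices, with replacement, using the weights above."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels, alpha)
    return rng.choices(range(len(labels)), weights=weights, k=k)


# Toy, made-up cluster labels: one dominant cluster and two rare ones.
labels = ["cropland"] * 900 + ["urban"] * 80 + ["wetland"] * 20

picked = sample_indices(labels, k=3000, alpha=1.0)
sampled_labels = [labels[i] for i in picked]

print("entropy before:", round(shannon_entropy(labels), 3))
print("entropy after: ", round(shannon_entropy(sampled_labels), 3))
```

With `alpha=1.0` the three clusters are drawn with roughly equal probability, so the entropy of the sampled labels approaches the three-cluster maximum of log2(3) bits. Intermediate values of `alpha` trade diversity against fidelity to the original distribution.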