pyDMS

@author: Radoslaw Guzinski Copyright: (C) 2017, Radoslaw Guzinski

class DecisionTreeRegressorWithLinearLeafRegression(linearRegressionExtrapolationRatio=0.25, decisionTreeRegressorOpt={})

Decision tree regressor with added linear (bayesian ridge) regression for all the data points falling within each decision tree leaf node.

Parameters
  • linearRegressionExtrapolationRatio (float (optional, default: 0.25)) – A limit on extrapolation allowed in the per-leaf linear regressions. The ratio is multiplied by the range of values present in each leaf’s training dataset and added to (subtracted from) the maximum (minimum) value.

  • decisionTreeRegressorOpt (dictionary (optional, default: {})) – Options to pass to DecisionTreeRegressor constructor. See http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html for possibilities.

Return type

None
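
The extrapolation limit described for linearRegressionExtrapolationRatio can be illustrated with a small worked example. This is a minimal sketch, not the pyDMS implementation: the helper names are hypothetical, and it assumes the "range of values" refers to the target values of the training samples that fall in one leaf.

```python
def extrapolation_bounds(leaf_targets, ratio=0.25):
    """Return (lower, upper) limits for predictions made within one leaf."""
    lowest, highest = min(leaf_targets), max(leaf_targets)
    margin = ratio * (highest - lowest)
    return lowest - margin, highest + margin


def clip_prediction(value, leaf_targets, ratio=0.25):
    """Clamp a per-leaf linear prediction to the allowed range."""
    lower, upper = extrapolation_bounds(leaf_targets, ratio)
    return max(lower, min(upper, value))


# Hypothetical target values of the training samples in one leaf.
leaf_targets = [290.0, 294.0, 298.0]
# The range is 8.0, so the margin is 0.25 * 8.0 = 2.0.
bounds = extrapolation_bounds(leaf_targets)
# A linear prediction far outside the training range is pulled back to the limit.
clipped = clip_prediction(305.0, leaf_targets)
```

With a ratio of 0, predictions would be confined to the range seen during training; larger ratios allow the per-leaf linear fit to extrapolate further.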

fit(X, y, sample_weight, fitOpt={})

Build a decision tree regressor from the training set (X, y).

Parameters
  • X (array-like or sparse matrix, shape = [n_samples, n_features]) – The training input samples. Internally, it will be converted to dtype=np.float32 and, if a sparse matrix is provided, to a sparse csc_matrix.

  • y (array-like, shape = [n_samples] or [n_samples, n_outputs]) – The target values (real numbers). Use dtype=np.float64 and order=’C’ for maximum efficiency.

  • sample_weight (array-like, shape = [n_samples] or None) – Sample weights. If None, then samples are equally weighted. Splits that would create child nodes with net zero or negative weight are ignored while searching for a split in each node.

  • fitOpt (dictionary (optional, default: {})) – Options to pass to DecisionTreeRegressor fit function. See http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html for possibilities.

Return type

Self

predict(X, predictOpt={})

Predict class or regression value for X.

Parameters
Returns

y – The predicted classes, or the predicted values.

Return type

array of shape = [n_samples] or [n_samples, n_outputs]

set_fit_request(*, fitOpt: Union[bool, None, str] = '$UNCHANGED$', sample_weight: Union[bool, None, str] = '$UNCHANGED$') DecisionTreeRegressorWithLinearLeafRegression

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters
  • fitOpt (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for fitOpt parameter in fit.

  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns

self – The updated object.

Return type

object

set_predict_request(*, predictOpt: Union[bool, None, str] = '$UNCHANGED$') DecisionTreeRegressorWithLinearLeafRegression

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters

predictOpt (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predictOpt parameter in predict.

Returns

self – The updated object.

Return type

object

set_score_request(*, sample_weight: Union[bool, None, str] = '$UNCHANGED$') DecisionTreeRegressorWithLinearLeafRegression

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns

self – The updated object.

Return type

object

class DecisionTreeSharpener(highResFiles, lowResFiles, lowResQualityFiles=[], lowResGoodQualityFlags=[], cvHomogeneityThreshold=0, movingWindowSize=0, disaggregatingTemperature=False, perLeafLinearRegression=True, linearRegressionExtrapolationRatio=0.25, regressorOpt={}, baggingRegressorOpt={})

Decision tree based sharpening (disaggregation) of low-resolution images using high-resolution images. The implementation is mostly based on Gao et al. [1], 2012.

A decision tree based regressor is trained with high-resolution data resampled to low resolution and with low-resolution data. It is then applied directly to the high-resolution data to obtain a high-resolution representation of the low-resolution data.

The implementation includes selecting training data based on homogeneity statistics and using the homogeneity as a weight factor (Gao et al. [1], 2012, section 2.2), performing linear regression with samples located within each regression tree leaf node (Gao et al. [1], 2012, section 2.1), using an ensemble of regression trees (Gao et al. [1], 2012, section 2.1), performing local (moving window) and global regression and combining them based on residuals (Gao et al. [1], 2012, section 2.3), and performing residual analysis and bias correction (Gao et al. [1], 2012, section 2.4).
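
The overall workflow can be sketched with the constructor and methods documented in this section. This is only an outline under stated assumptions: the import path pyDMS.pyDMS is a guess at the package layout, the function name and file paths are hypothetical, and the chosen option values are illustrative. The import is placed inside the function so that the sketch can be read and loaded without the package or any input files.

```python
def sharpen_sketch(high_res_path, low_res_path):
    """Train on one image pair, apply the sharpener and correct the bias."""
    # Assumed import path; adjust it to match the installed package.
    from pyDMS.pyDMS import DecisionTreeSharpener

    sharpener = DecisionTreeSharpener(
        highResFiles=[high_res_path],
        lowResFiles=[low_res_path],
        movingWindowSize=0,              # 0 means global regression only
        disaggregatingTemperature=True,  # the low-resolution input is a temperature
    )
    sharpener.trainSharpener()

    # applySharpener returns an in-memory GDAL file object.
    out_image = sharpener.applySharpener(high_res_path, low_res_path)

    # residualAnalysis accepts either a path or a GDAL file object.
    residual_image, corrected_image = sharpener.residualAnalysis(
        out_image, low_res_path
    )
    return residual_image, corrected_image
```

Calling sharpen_sketch requires pyDMS, GDAL and real raster files, so it is not executed here.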

Parameters
  • highResFiles (list of strings) – A list of file paths to high-resolution images to be used during the training of the sharpener.

  • lowResFiles (list of strings) – A list of file paths to low-resolution images to be used during the training of the sharpener. There must be one low-resolution image for each high-resolution image.

  • lowResQualityFiles (list of strings (optional, default: [])) – A list of file paths to low-resolution quality images to be used to mask out low-quality low-resolution pixels during training. If provided there must be one quality image for each low-resolution image.

  • lowResGoodQualityFlags (list of integers (optional, default: [])) – A list of values indicating which pixel values in the low-resolution quality images should be considered as good quality.

  • cvHomogeneityThreshold (float (optional, default: 0)) – A threshold of the coefficient of variation below which high-resolution pixels resampled to low resolution are considered homogeneous and usable during the training of the disaggregator. If the threshold is 0 or negative then it is set automatically such that 80% of pixels are below it.

  • movingWindowSize (integer (optional, default: 0)) – The size of local regression moving window in low-resolution pixels. If set to 0 then only global regression is performed.

  • disaggregatingTemperature (boolean (optional, default: False)) – Flag indicating whether the parameter to be disaggregated is temperature (e.g. land surface temperature). If so, it needs to be converted to radiance at some points of the processing. This is because sensors measure energy, not temperature, and radiance is the physical quantity that it makes sense to average, whereas radiometric temperature does not behave linearly.

  • perLeafLinearRegression (boolean (optional, default: True)) – Flag indicating if linear regression should be performed on all data points falling within each regression tree leaf node.

  • linearRegressionExtrapolationRatio (float (optional, default: 0.25)) – A limit on extrapolation allowed in the per-leaf linear regressions. The ratio is multiplied by the range of values present in each leaf’s training dataset and added to (subtracted from) the maximum (minimum) value.

  • regressorOpt (dictionary (optional, default: {})) – Options to pass to DecisionTreeRegressor constructor. See http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html for possibilities. Note that the max_leaf_nodes and min_samples_leaf parameters will be overwritten in the code.

  • baggingRegressorOpt (dictionary (optional, default: {})) – Options to pass to BaggingRegressor constructor. See http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.BaggingRegressor.html for possibilities.

Return type

None
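
The reason given for disaggregatingTemperature, that radiance rather than temperature should be averaged, can be shown with a short calculation. This is a minimal sketch that assumes radiance is proportional to the fourth power of temperature (Stefan-Boltzmann law); the constant and the emissivity are omitted because they cancel when converting back. The function name is hypothetical and the exact conversion used by the library is not stated in this section.

```python
def mean_temperature_via_radiance(temperatures_k):
    """Average temperatures (in kelvin) in radiance space, then convert back."""
    radiances = [t ** 4 for t in temperatures_k]
    mean_radiance = sum(radiances) / len(radiances)
    return mean_radiance ** 0.25


# Two hypothetical high-resolution pixels inside one low-resolution pixel.
pixels = [280.0, 320.0]
naive_mean = sum(pixels) / len(pixels)
radiometric_mean = mean_temperature_via_radiance(pixels)
# The radiometric mean is higher than the arithmetic mean of 300 K,
# because the warmer pixel contributes disproportionately more energy.
```

The larger the temperature contrast within a low-resolution pixel, the more the two averages differ.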

trainSharpener()

Train the sharpener using high- and low-resolution input files and settings specified in the constructor. Local (moving window) and global regression decision trees are trained with high-resolution data resampled to low resolution and with low-resolution data. The training dataset is selected based on the homogeneity of the resampled high-resolution data being below the specified threshold and on the quality mask (if given) of the low-resolution data. The homogeneity statistics are also used as weight factors for the training samples (the more homogeneous the sample, the higher its weight).

Parameters

None

Return type

None
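
The selection and weighting of training samples described above can be sketched as follows. This is an illustration under assumptions, not the library's code: each "block" is the list of high-resolution pixel values that fall inside one low-resolution pixel, the automatic threshold is chosen so that about 80% of the blocks lie below it, and the weight formula 1 / (1 + CV) is a hypothetical choice that merely gives more homogeneous samples a higher weight. All function names are made up for this example.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    mean = statistics.fmean(values)
    return statistics.pstdev(values) / mean if mean else float("inf")


def automatic_threshold(cvs, fraction=0.8):
    """Pick a CV value that has roughly `fraction` of the samples below it."""
    ordered = sorted(cvs)
    index = min(len(ordered) - 1, math.ceil(fraction * len(ordered)))
    return ordered[index]


def select_training_samples(blocks, threshold=0.0):
    """Return (indices, weights) of the blocks considered homogeneous."""
    cvs = [coefficient_of_variation(block) for block in blocks]
    if threshold <= 0:
        # Mirrors the documented behaviour for a zero or negative threshold.
        threshold = automatic_threshold(cvs)
    indices = [i for i, cv in enumerate(cvs) if cv < threshold]
    # Hypothetical weighting: a lower CV yields a higher weight.
    weights = [1.0 / (1.0 + cvs[i]) for i in indices]
    return indices, weights


blocks = [
    [10, 10, 10, 10],   # perfectly homogeneous
    [10, 11, 10, 11],
    [10, 12, 10, 12],
    [8, 12, 8, 12],
    [5, 15, 5, 15],     # most heterogeneous
]
indices, weights = select_training_samples(blocks)
```

In this example the most heterogeneous block is excluded and the remaining four are weighted in order of decreasing homogeneity.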

applySharpener(highResFilename, lowResFilename=None)

Apply the trained sharpener to a given high-resolution image to derive the corresponding disaggregated low-resolution image. If local regressions were used during training then they will only be applied where their moving window extent overlaps with the high-resolution image passed to this function. Global regression will be applied to the whole high-resolution image without geographic constraints.

Parameters
  • highResFilename (string) – Path to the high-resolution image file to be used during disaggregation.

  • lowResFilename (string (optional, default: None)) – Path to the low-resolution image file corresponding to the high-resolution input file. If local regressions were trained and low-resolution filename is given then the local and global regressions will be combined based on residual values of the different regressions to the low-resolution image (see Gao et al. [1], 2012, section 2.3). If local regressions were trained and low-resolution filename is not given then only the local regressions will be used.

Returns

outImage – The file object contains an in-memory, georeferenced disaggregator output.

Return type

GDAL memory file object
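
Combining local and global regressions "based on residual values" can be pictured as a weighted average in which the regression with the smaller residual gets the larger weight. The sketch below assumes inverse squared residual weights; that specific formula and the function name are assumptions made for illustration, since this section does not state the weighting that is actually used.

```python
def combine_predictions(local_pred, global_pred, local_residual, global_residual):
    """Weight two predictions by the inverse of their squared residuals."""
    eps = 1e-12  # avoids division by zero when a residual is exactly 0
    w_local = 1.0 / (local_residual ** 2 + eps)
    w_global = 1.0 / (global_residual ** 2 + eps)
    return (w_local * local_pred + w_global * global_pred) / (w_local + w_global)


# Equal residuals: the result is the plain average of the two predictions.
equal = combine_predictions(300.0, 304.0, 1.0, 1.0)
# Smaller local residual: the result moves towards the local prediction.
local_favoured = combine_predictions(300.0, 304.0, 0.5, 2.0)
```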

residualAnalysis(disaggregatedFile, lowResFilename, lowResQualityFilename=None, doCorrection=True)

Perform residual analysis and (optional) correction on the disaggregated file (see Gao et al. [1], 2012, section 2.4).

Parameters
  • disaggregatedFile (string or GDAL file object) – If string, path to the disaggregated image file; if GDAL file object, the disaggregated image.

  • lowResFilename (string) – Path to the low-resolution image file corresponding to the high-resolution disaggregated image.

  • lowResQualityFilename (string (optional, default: None)) – Path to low-resolution quality image file. If provided then low quality values are masked out during residual analysis. Otherwise all values are considered to be of good quality.

  • doCorrection (boolean (optional, default: True)) – Flag indicating whether residual (bias) correction should be performed or not.

Returns

  • residualImage (GDAL memory file object) – The file object contains an in-memory, georeferenced residual image.

  • correctedImage (GDAL memory file object) – The file object contains an in-memory, georeferenced residual corrected disaggregated image, or None if doCorrection was set to False.
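
The idea behind residual analysis and bias correction can be sketched on plain nested lists. The assumptions here are that the residual is the low-resolution value minus the mean of the disaggregated pixels covering it, and that the correction adds each residual back to all of those pixels without any smoothing. The function names and the integer scale factor are hypothetical; the library itself works on georeferenced GDAL images.

```python
def block_mean(image, row, col, factor):
    """Mean of the factor x factor high-resolution block under one low-resolution pixel."""
    values = [
        image[r][c]
        for r in range(row * factor, (row + 1) * factor)
        for c in range(col * factor, (col + 1) * factor)
    ]
    return sum(values) / len(values)


def residual_analysis_sketch(disaggregated, low_res, factor, do_correction=True):
    """Return (residual, corrected); corrected is None if no correction is done."""
    residual = [
        [low_val - block_mean(disaggregated, r, c, factor)
         for c, low_val in enumerate(low_row)]
        for r, low_row in enumerate(low_res)
    ]
    if not do_correction:
        return residual, None
    corrected = [
        [value + residual[r // factor][c // factor] for c, value in enumerate(row)]
        for r, row in enumerate(disaggregated)
    ]
    return residual, corrected


disaggregated = [[299.0, 301.0], [300.0, 302.0]]  # 2 x 2 high-resolution output
low_res = [[301.5]]                               # one low-resolution pixel
residual, corrected = residual_analysis_sketch(disaggregated, low_res, factor=2)
```

After the correction, the mean of the corrected high-resolution block matches the original low-resolution value, while the spatial pattern of the disaggregated image is preserved.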

[1]

Feng Gao, William P Kustas, and Martha C Anderson. A data mining approach for sharpening thermal satellite imagery over land. Remote Sensing, 4(11):3287–3319, 2012.