lidar_platform.classification.cc_3dmasc

Created on Fri Aug 5 18:12:03 2022

@author: Mathilde Letard, Baptiste Feldmann

lidar_platform.classification.cc_3dmasc.apply_confidence_threshold(pred, true, confid_pred, threshold)[source]

Compute the classification accuracy when discarding points classified with a prediction probability below a given threshold.

Parameters:
  • pred – numpy array containing the predicted labels

  • true – numpy array containing the true labels

  • confid_pred – numpy array containing the prediction probabilities

  • threshold – float, prediction probability threshold to apply

Returns: float

new classification accuracy
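As an illustration of the computation described above, here is a minimal NumPy sketch. It is not the module's implementation: the function name accuracy_above_threshold is hypothetical, and keeping points whose probability equals the threshold is an assumption.

```python
import numpy as np

def accuracy_above_threshold(pred, true, confid_pred, threshold):
    """Accuracy computed only on points whose probability reaches the threshold."""
    pred = np.asarray(pred)
    true = np.asarray(true)
    confid_pred = np.asarray(confid_pred)
    # Assumption: points exactly at the threshold are kept.
    keep = confid_pred >= threshold
    if not keep.any():
        return float("nan")  # no point left to evaluate
    return float(np.mean(pred[keep] == true[keep]))

# Toy data: the only misclassified point (third one) has a low probability.
pred = np.array([1, 2, 2, 1])
true = np.array([1, 2, 1, 1])
confid = np.array([0.9, 0.8, 0.55, 0.6])

acc_all = accuracy_above_threshold(pred, true, confid, 0.5)       # all points kept
acc_filtered = accuracy_above_threshold(pred, true, confid, 0.7)  # two points kept
```

With this toy data, raising the threshold discards the misclassified point, so the accuracy on the remaining points increases.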

lidar_platform.classification.cc_3dmasc.classif_errors_confidence(pred, true, confid_pred)[source]

Get statistics about the prediction probability obtained by misclassified points.

Parameters:
  • pred – numpy array containing the predicted labels

  • true – numpy array containing the true labels

  • confid_pred – numpy array containing the prediction probabilities

Returns: dict

dictionary with mean, median, min, max, and std of prediction probability for misclassified points.
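The statistics listed above could be gathered along the following lines. This is a sketch only: the function name and the dictionary keys are hypothetical, and the population standard deviation (NumPy's default) is an assumption.

```python
import numpy as np

def misclassified_confidence_stats(pred, true, confid_pred):
    """Summary statistics of the prediction probability of misclassified points."""
    pred = np.asarray(pred)
    true = np.asarray(true)
    confid_pred = np.asarray(confid_pred)
    wrong = confid_pred[pred != true]
    if wrong.size == 0:
        return {}  # nothing was misclassified
    return {
        "mean": float(wrong.mean()),
        "median": float(np.median(wrong)),
        "min": float(wrong.min()),
        "max": float(wrong.max()),
        "std": float(wrong.std()),  # assumption: population standard deviation
    }

# Toy data: points 2 and 3 are misclassified, with probabilities 0.6 and 0.8.
pred = np.array([1, 2, 2, 1, 3])
true = np.array([1, 1, 1, 1, 3])
confid = np.array([0.9, 0.6, 0.8, 0.7, 0.95])
stats = misclassified_confidence_stats(pred, true, confid)
```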

lidar_platform.classification.cc_3dmasc.confidence_filtering_report(pred, true, confid_pred)[source]

Get a report of how classification accuracy evolves with the prediction probability threshold applied. All points with a prediction probability below a given threshold are excluded from the accuracy computation.

Parameters:
  • pred – numpy array containing the predicted labels

  • true – numpy array containing the true labels

  • confid_pred – numpy array containing the prediction probabilities

Returns: dict

dictionary containing the accuracy and the percentage of remaining points for the following prediction probability thresholds: 0.5, 0.6, 0.7, 0.8, 0.9, 0.95
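A report of this kind can be sketched with NumPy as below. The thresholds come from the description above; the function name, the layout of the returned dictionary, and its key names are assumptions made for illustration.

```python
import numpy as np

THRESHOLDS = (0.5, 0.6, 0.7, 0.8, 0.9, 0.95)

def filtering_report(pred, true, confid_pred, thresholds=THRESHOLDS):
    """Accuracy and percentage of remaining points for each threshold."""
    pred = np.asarray(pred)
    true = np.asarray(true)
    confid_pred = np.asarray(confid_pred)
    report = {}
    for t in thresholds:
        keep = confid_pred >= t  # assumption: points at the threshold are kept
        if keep.any():
            accuracy = float(np.mean(pred[keep] == true[keep]))
        else:
            accuracy = float("nan")  # no point left at this threshold
        report[t] = {
            "accuracy": accuracy,
            "remaining_percent": float(100.0 * keep.mean()),
        }
    return report

pred = np.array([1, 2, 2, 1])
true = np.array([1, 2, 1, 1])
confid = np.array([0.97, 0.85, 0.55, 0.65])
report = filtering_report(pred, true, confid)
```

Reading the accuracy together with the percentage of remaining points shows the trade-off: a higher threshold usually improves accuracy but leaves fewer classified points.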

lidar_platform.classification.cc_3dmasc.feature_clean(features)[source]

Remove NaN and Inf values from the feature set (no normalization is applied; only NaN and Inf values are cleaned).

Parameters:

features (numpy array) – input features dataset (e.g., the “features” field of a dict obtained with load_sbf_features).

Returns:

dataset – a dataset containing no NaN or Inf values.

Return type:

numpy array
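The description does not say whether non-finite values are dropped or replaced. One possible approach, shown here purely as an assumption, is to drop every row (point) that contains a NaN or Inf value; the function name drop_non_finite_rows is hypothetical.

```python
import numpy as np

def drop_non_finite_rows(features):
    """Keep only the rows in which every feature value is finite."""
    features = np.asarray(features, dtype=float)
    finite_rows = np.isfinite(features).all(axis=1)
    return features[finite_rows]

features = np.array([
    [1.0, 2.0],
    [np.nan, 3.0],   # dropped: contains NaN
    [4.0, np.inf],   # dropped: contains Inf
    [5.0, 6.0],
])
cleaned = drop_non_finite_rows(features)
```

Note that dropping rows changes the number of points, so any labels or coordinates stored alongside the features would need the same mask applied.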

lidar_platform.classification.cc_3dmasc.get_acc_expe(trads, testds, plot=True, save=False, model=0)[source]

Train a random forest model for point cloud feature classification and get metrics describing its performance.

Parameters:
  • trads (dictionary of numpy arrays) – training features dictionary.

  • testds (dict of numpy arrays) – test features dictionary.

  • save (bool) – defines if plot must be saved.

  • plot (bool) – defines if plot must be opened.

  • model (int (0 or 1)) – type of model. 0 = scikit-learn random forest, 1 = OpenCV random forest

Returns:

  • accuracy (float) – overall accuracy of the classifier

  • fscore (float) – F1-score (averaged over all classes)

  • numpy.mean(confid_pred) (float) – mean prediction confidence

  • recall (float) – recall (averaged over all classes)

  • precision (float) – precision (averaged over all classes)

  • uas (numpy.array(float)) – user’s accuracies (per class)

  • pas (numpy.array(float)) – producer’s accuracies (per class)

  • fscores (numpy.array(float)) – F1-score per class

  • confc (numpy.array(float)) – mean prediction confidence per class

  • recalls (numpy.array(float)) – recall per class

  • precisions (numpy.array(float)) – precision per class

  • labels (numpy.array(float)) – labels

  • feat_imptce (numpy.array(float)) – feature importance values

  • classifier (sklearn RandomForestClassifier or OpenCV RTrees) – trained classifier

  • labels_pred (np.array(int)) – model predictions
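To make the per-class metrics above concrete, the sketch below derives user's and producer's accuracies from a confusion matrix using the usual remote-sensing definitions (user's accuracy corresponds to precision, producer's accuracy to recall). It is an illustration with a hypothetical function name, not the code used by get_acc_expe.

```python
import numpy as np

def per_class_accuracies(true, pred):
    """Return labels, user's accuracies, producer's accuracies, overall accuracy."""
    true = np.asarray(true)
    pred = np.asarray(pred)
    labels = np.unique(np.concatenate([true, pred]))
    n = labels.size
    # Confusion matrix: rows = true class, columns = predicted class.
    cm = np.zeros((n, n), dtype=int)
    np.add.at(cm, (np.searchsorted(labels, true), np.searchsorted(labels, pred)), 1)
    correct = np.diag(cm).astype(float)
    n_true = cm.sum(axis=1)  # points per true class
    n_pred = cm.sum(axis=0)  # points per predicted class
    # Classes that never occur get an accuracy of 0 instead of a division by zero.
    producers = np.divide(correct, n_true, out=np.zeros(n), where=n_true > 0)
    users = np.divide(correct, n_pred, out=np.zeros(n), where=n_pred > 0)
    overall = float(correct.sum() / cm.sum())
    return labels, users, producers, overall

true = np.array([1, 1, 1, 2, 2, 3])
pred = np.array([1, 1, 2, 2, 2, 3])
labels_out, users, producers, overall = per_class_accuracies(true, pred)
```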

lidar_platform.classification.cc_3dmasc.get_shap_expl(classifier, testds, save=True)[source]

Get the shap summary plot of a random forest classifier trained on the given dataset (only works with scikit-learn models).

Parameters:
  • classifier (scikit-learn RandomForestClassifier) – trained classifier.

  • testds (dict) – dataset with the keys ‘features’: numpy.array of computed features; ‘names’: list of str, name of each feature column; ‘labels’: list of int, class labels.

  • save (bool) – whether to save the resulting plot.

lidar_platform.classification.cc_3dmasc.load_sbf_features(sbf_filepath, params_filepath, labels=False, coords=False)[source]

Read an SBF file containing a point cloud with 3DMASC features.

Parameters:
  • sbf_filepath (str) – absolute path to the core-points SBF file (containing the features).

  • params_filepath (str) – txt parameters file for 3DMASC.

  • labels (bool, default=False) – set to True to read the labels, which is needed to train a model.

  • coords (bool, default=False) – set to True to also get the coordinates.

Returns:

data – dictionary with the keys ‘features’: numpy.array containing the features present in the SBF file; ‘names’: list of str, name of each feature column; ‘labels’: list of int, class labels; ‘coords’: numpy.array of point coordinates.

Return type:

dict
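The layout of the returned dictionary can be illustrated with a hand-built example. The sketch below does not read an SBF file; the helper check_feature_dict and the feature names are hypothetical, and the assumption is that ‘features’ holds one row per point and one column per feature.

```python
import numpy as np

def check_feature_dict(data):
    """Check that the optional fields are consistent with the features array."""
    n_points, n_features = data["features"].shape
    if len(data["names"]) != n_features:
        raise ValueError("one name is expected per feature column")
    if "labels" in data and len(data["labels"]) != n_points:
        raise ValueError("one label is expected per point")
    if "coords" in data and len(data["coords"]) != n_points:
        raise ValueError("one coordinate row is expected per point")
    return n_points, n_features

# Hand-built dictionary mimicking the documented keys (values are made up).
data = {
    "features": np.zeros((4, 2)),
    "names": ["feat_a", "feat_b"],  # hypothetical feature names
    "labels": [1, 1, 2, 2],
    "coords": np.zeros((4, 3)),
}
shape = check_feature_dict(data)
```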

lidar_platform.classification.cc_3dmasc.plot_corr_mat(trads, plot=True, save=False)[source]

Visualize correlation between features as a heatmap.

Parameters:
  • trads (dictionary of numpy arrays) – training features dictionary.

  • save (bool) – defines if plot must be saved.

  • plot (bool) – defines if plot must be opened.
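The matrix behind such a heatmap can be computed with NumPy alone. The sketch below assumes Pearson correlation between feature columns, which the description does not confirm, and leaves the plotting step out.

```python
import numpy as np

# Toy features: column 1 is a linear function of column 0, column 2 is its opposite.
x = np.arange(5.0)
features = np.column_stack([x, 2.0 * x + 1.0, -x])

# rowvar=False treats each column as one variable (one feature).
corr = np.corrcoef(features, rowvar=False)
```

Each entry corr[i, j] lies between -1 and 1, so a diverging colour map centred on 0 is a natural choice when the matrix is displayed as a heatmap.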

lidar_platform.classification.cc_3dmasc.plot_feat_imp(feat_imp, trads, save=False, plot=True)[source]

Plot the random forest feature importance.

Parameters:
  • feat_imp (numpy.array) – array containing the feature importance values.

  • trads (dictionary of numpy arrays) – training features dictionary.

  • save (bool) – defines if plot must be saved.

  • plot (bool) – defines if plot must be opened.
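Before plotting, it is often convenient to pair each importance value with its feature name and sort them. The sketch below assumes the names come from the ‘names’ field of the training dictionary; the function name and the feature names are hypothetical.

```python
import numpy as np

def rank_features(feat_imp, names):
    """Return (name, importance) pairs sorted from most to least important."""
    feat_imp = np.asarray(feat_imp, dtype=float)
    order = np.argsort(feat_imp)[::-1]  # indices in descending order of importance
    return [(names[i], float(feat_imp[i])) for i in order]

feat_imp = np.array([0.1, 0.6, 0.3])
names = ["feat_a", "feat_b", "feat_c"]  # hypothetical feature names
ranking = rank_features(feat_imp, names)
```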

lidar_platform.classification.cc_3dmasc.test(testds, classifier, model=0)[source]

Test a trained random forest model. The model is applied to the test dataset, and classification metrics are computed.

Parameters:
  • testds (dict of numpy arrays) – test features dictionary.

  • classifier (sklearn RandomForestClassifier or OpenCV RTrees) – trained random forest model ready for use.

  • model (int (0 or 1)) – type of the model. 0 = scikit-learn random forest, 1 = OpenCV random forest

Returns:

  • labels_pred (numpy array) – labels predicted by the model for each set of features.

  • confid_pred (numpy array) – prediction confidence for each set of features.

  • feat_imptce (numpy array) – importance of each feature (as computed in sklearn’s RandomForestClassifier).

  • oa (float) – overall accuracy of the classifier

  • fs (float) – F1-score of the classifier averaged on all classes
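The description does not state how confid_pred is derived. A common choice, assumed here, is to take the highest class probability of each point as its confidence and the corresponding class as its predicted label. The probability matrix and class labels below are made up.

```python
import numpy as np

# One row per point, one column per class (rows sum to 1).
proba = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
])
classes = np.array([2, 5, 9])  # hypothetical class labels, in column order

labels_pred = classes[proba.argmax(axis=1)]  # most probable class per point
confid_pred = proba.max(axis=1)              # its probability, used as confidence
```

With a scikit-learn classifier, a matrix of this shape is what predict_proba returns, with columns ordered as in the classifier's classes_ attribute.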

lidar_platform.classification.cc_3dmasc.train(trads, model=0)[source]

Train a random forest model for point cloud features classification. This function handles two types of RF models: scikit-learn (parallelized computing), and OpenCV (same as in CloudCompare, but not parallelized).

Parameters:
  • trads (dictionary of numpy arrays) – training features dictionary.

  • model (int (0 or 1)) – type of model. 0 = scikit-learn random forest, 1 = OpenCV random forest

Returns:

model – trained random forest model ready for use.

Return type:

sklearn RandomForestClassifier or OpenCV RTrees
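For the scikit-learn case (model=0), training could look like the sketch below. It is not the module's code: the hyperparameters are illustrative, the data are synthetic, and the ‘features’ and ‘labels’ keys are assumed from the dictionary described for load_sbf_features.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic, well-separated two-class dataset standing in for real 3DMASC features.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(0.0, 0.5, size=(20, 3)),
    rng.normal(5.0, 0.5, size=(20, 3)),
])
labels = np.array([0] * 20 + [1] * 20)
trads = {"features": features, "labels": labels}  # assumed dictionary layout

# n_jobs=-1 uses all available cores; the values below are illustrative only.
clf = RandomForestClassifier(n_estimators=50, n_jobs=-1, random_state=0)
clf.fit(trads["features"], trads["labels"])

train_pred = clf.predict(trads["features"])
importances = clf.feature_importances_  # one value per feature column
```

The fitted object exposes predict, predict_proba, and feature_importances_, which is what the testing and plotting functions in this module would need from a scikit-learn model.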