lidar_platform.classification.cc_3dmasc

Created on Fri Aug 5 18:12:03 2022

@author: Mathilde Letard, Baptiste Feldmann

lidar_platform.classification.cc_3dmasc.apply_confidence_threshold(pred, true, confid_pred, threshold)

Compute the classification accuracy when discarding points classified with a prediction probability below a given threshold.

Parameters:

pred (numpy array) – array containing the predicted labels
true (numpy array) – array containing the true labels
confid_pred (numpy array) – array containing the prediction probabilities
threshold (float) – prediction probability threshold to apply

Returns: float: new classification accuracy
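
The filtering described above can be sketched with a boolean mask. This is a minimal illustration, not the library's implementation: the helper name accuracy_above_threshold is hypothetical, and keeping points that sit exactly at the threshold is an assumption, since the documentation only says that points below it are discarded.

```python
import numpy as np


def accuracy_above_threshold(pred, true, confid_pred, threshold):
    """Accuracy computed only on points whose prediction probability
    reaches the threshold (hypothetical helper, for illustration)."""
    pred = np.asarray(pred)
    true = np.asarray(true)
    confid_pred = np.asarray(confid_pred)
    # Assumption: points exactly at the threshold are kept.
    keep = confid_pred >= threshold
    if not keep.any():
        return float("nan")  # no point left to evaluate
    return float(np.mean(pred[keep] == true[keep]))


pred = np.array([1, 2, 2, 3, 1])
true = np.array([1, 2, 3, 3, 2])
confid = np.array([0.95, 0.80, 0.55, 0.90, 0.40])

acc_filtered = accuracy_above_threshold(pred, true, confid, 0.6)
acc_all = accuracy_above_threshold(pred, true, confid, 0.0)
```

In this toy data the two misclassified points are also the two least confident ones, so discarding them raises the accuracy.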

lidar_platform.classification.cc_3dmasc.classif_errors_confidence(pred, true, confid_pred)

Get statistics about the prediction probability obtained by misclassified points.

Parameters:

pred (numpy array) – array containing the predicted labels
true (numpy array) – array containing the true labels
confid_pred (numpy array) – array containing the prediction probabilities

Returns: dict: dictionary with mean, median, min, max, and std of prediction probability for misclassified points.
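
As a rough sketch of such statistics, the misclassified points can be selected with a mask and summarized with NumPy. The helper name, the dictionary keys, and the use of the population standard deviation (ddof=0) are assumptions made for this example; the actual function may differ.

```python
import numpy as np


def misclassified_confidence_stats(pred, true, confid_pred):
    """Summary statistics of the prediction probability of misclassified
    points (hypothetical helper; key names are illustrative)."""
    wrong = np.asarray(pred) != np.asarray(true)
    conf = np.asarray(confid_pred, dtype=float)[wrong]
    if conf.size == 0:
        return {}  # nothing was misclassified
    return {
        "mean": float(conf.mean()),
        "median": float(np.median(conf)),
        "min": float(conf.min()),
        "max": float(conf.max()),
        "std": float(conf.std()),  # population standard deviation (ddof=0)
    }


pred = np.array([1, 2, 2, 3, 1])
true = np.array([1, 2, 3, 3, 2])
confid = np.array([0.95, 0.80, 0.55, 0.90, 0.40])
stats = misclassified_confidence_stats(pred, true, confid)
```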

lidar_platform.classification.cc_3dmasc.confidence_filtering_report(pred, true, confid_pred)

Get a report of how classification accuracy evolves with the prediction probability threshold applied. All points with a prediction probability below a given threshold are excluded from the accuracy computation.

Parameters:

pred (numpy array) – array containing the predicted labels
true (numpy array) – array containing the true labels
confid_pred (numpy array) – array containing the prediction probabilities

Returns: dict: dictionary containing the accuracy and the percentage of remaining points for the following prediction probability thresholds: 0.5, 0.6, 0.7, 0.8, 0.9, 0.95
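
A report of this kind could be assembled by looping over the listed thresholds. The sketch below is illustrative only: the helper name, the layout of the returned dictionary, and the handling of points exactly at a threshold (kept here) are assumptions, not the documented behavior.

```python
import numpy as np

# Thresholds listed in the documentation above.
THRESHOLDS = (0.5, 0.6, 0.7, 0.8, 0.9, 0.95)


def filtering_report(pred, true, confid_pred, thresholds=THRESHOLDS):
    """Accuracy and percentage of remaining points per threshold
    (hypothetical helper; the dictionary layout is illustrative)."""
    pred, true, confid_pred = (np.asarray(a) for a in (pred, true, confid_pred))
    report = {}
    for t in thresholds:
        keep = confid_pred >= t  # assumption: ties with the threshold are kept
        remaining = 100.0 * float(keep.mean())
        if keep.any():
            accuracy = float(np.mean(pred[keep] == true[keep]))
        else:
            accuracy = float("nan")  # no point left at this threshold
        report[t] = {"accuracy": accuracy, "remaining_percent": remaining}
    return report


pred = np.array([1, 2, 2, 3, 1])
true = np.array([1, 2, 3, 3, 2])
confid = np.array([0.95, 0.80, 0.55, 0.90, 0.40])
report = filtering_report(pred, true, confid)
```

Reading accuracy and remaining percentage side by side shows the trade-off: a stricter threshold usually improves accuracy but leaves fewer classified points.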

lidar_platform.classification.cc_3dmasc.feature_clean(features)

Remove NaN and Inf values from the feature set (no normalization is applied; only NaN and Inf values are cleaned).

Parameters: features (numpy array) – input feature dataset (e.g., the “features” field of a dict obtained with load_sbf_features).
Returns: dataset – a dataset containing no NaN or Inf values.
Return type: numpy array
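
The documentation does not say whether non-finite values are replaced or whether the affected points are dropped. The sketch below shows one possible interpretation, dropping every row that contains a NaN or Inf; both the helper name and this choice are assumptions, so check the source before relying on it.

```python
import numpy as np


def drop_non_finite_rows(features):
    """Keep only rows in which every feature is finite
    (hypothetical helper; row removal is an assumed strategy)."""
    features = np.asarray(features, dtype=float)
    finite_rows = np.isfinite(features).all(axis=1)
    return features[finite_rows]


features = np.array([
    [1.0, 2.0],
    [np.nan, 0.5],   # dropped: contains NaN
    [3.0, np.inf],   # dropped: contains Inf
    [4.0, 5.0],
])
cleaned = drop_non_finite_rows(features)
```

If rows are dropped this way, any label or coordinate array aligned with the features has to be filtered with the same mask.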

lidar_platform.classification.cc_3dmasc.get_acc_expe(trads, testds, plot=True, save=False, model=0)

Train a random forest model for point cloud feature classification and get metrics describing its performance.

Parameters:

trads (dictionary of numpy arrays) – training features dictionary.
testds (dict of numpy arrays) – test features dictionary.
save (bool) – defines if plot must be saved.
plot (bool) – defines if plot must be opened.
model (int (0 or 1)) – type of model. 0 = scikit-learn random forest, 1 = OpenCV random forest

Returns:

accuracy (float) – overall accuracy of the classifier
fscore (float) – F1-score (averaged over all classes)
numpy.mean(confid_pred) (float) – mean prediction confidence
recall (float) – recall (averaged over all classes)
precision (float) – precision (averaged over all classes)
uas (numpy.array(float)) – user’s accuracies (per class)
pas (numpy.array(float)) – producer’s accuracies (per class)
fscores (numpy.array(float)) – F1-score per class
confc (numpy.array(float)) – mean prediction confidence per class
recalls (numpy.array(float)) – recall per class
precisions (numpy.array(float)) – precision per class
labels (numpy.array(float)) – labels
feat_imptce (numpy.array(float)) – feature importance values
classifier (sklearn RandomForestClassifier or OpenCV RTrees) – the classifier
labels_pred (np.array(int)) – model predictions
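
To make the per-class metrics concrete, the sketch below derives them from a confusion matrix with NumPy. In remote sensing usage, user's accuracy corresponds to precision and producer's accuracy to recall. The helper name and the returned dictionary are hypothetical, and the module may compute these values differently (for instance through scikit-learn).

```python
import numpy as np


def per_class_metrics(true, pred):
    """Per-class precision (user's accuracy), recall (producer's accuracy)
    and F1-score from a confusion matrix (hypothetical helper)."""
    labels = np.unique(np.concatenate([true, pred]))
    index = {label: i for i, label in enumerate(labels)}
    # Rows are true classes, columns are predicted classes.
    cm = np.zeros((labels.size, labels.size), dtype=int)
    for t, p in zip(true, pred):
        cm[index[t], index[p]] += 1
    tp = np.diag(cm).astype(float)
    zeros = np.zeros_like(tp)
    predicted_totals = cm.sum(axis=0)
    true_totals = cm.sum(axis=1)
    precision = np.divide(tp, predicted_totals, out=zeros.copy(), where=predicted_totals > 0)
    recall = np.divide(tp, true_totals, out=zeros.copy(), where=true_totals > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=zeros.copy(), where=denom > 0)
    return {
        "labels": labels,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "overall_accuracy": float(tp.sum() / cm.sum()),
    }


true = np.array([1, 2, 3, 3, 2])
pred = np.array([1, 2, 2, 3, 1])
metrics = per_class_metrics(true, pred)
```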

lidar_platform.classification.cc_3dmasc.get_shap_expl(classifier, testds, save=True)

Get the shap summary plot of a random forest classifier trained on the given dataset (only works with scikit-learn models).

Parameters:

classifier (scikit-learn RandomForestClassifier) – trained classifier.
testds (dict) – dataset dictionary with the following keys: ‘features’: numpy.array of computed features; ‘names’: list of str, name of each feature column; ‘labels’: list of int, class labels.
save (bool) – whether to save the resulting plot.

lidar_platform.classification.cc_3dmasc.load_sbf_features(sbf_filepath, params_filepath, labels=False, coords=False)

Read an SBF file containing a point cloud with 3DMASC features.

Parameters:

sbf_filepath (str) – absolute path to the core-points SBF file (containing the features).
params_filepath (str) – txt parameters file for 3DMASC.
labels (bool, default=False) – set to True to read the labels (required to train a model).
coords (bool, default=False) – set to True to also get the coordinates.

Returns:

data – dictionary with the following keys: ‘features’: numpy.array containing the features present in the SBF file; ‘names’: list of str, name of each feature column; ‘labels’: list of int, class labels; ‘coords’: numpy.array of point coordinates

Return type:

dict

lidar_platform.classification.cc_3dmasc.plot_corr_mat(trads, plot=True, save=False)

Visualize correlation between features as a heatmap.

Parameters:

trads (dictionary of numpy arrays) – training features dictionary.
save (bool) – defines if plot must be saved.
plot (bool) – defines if plot must be opened.
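
The matrix that such a heatmap displays can be sketched with NumPy alone. The example assumes that trads follows the same layout as the dictionary returned by load_sbf_features (‘features’ and ‘names’ keys) and that the Pearson correlation is used; neither point is confirmed by the documentation, and the synthetic features and their names are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
base = rng.normal(size=100)

# Synthetic dataset: the second feature is a linear function of the first,
# the third is its opposite, so the expected correlations are +1 and -1.
trads = {
    "features": np.column_stack([base, 2.0 * base + 1.0, -base]),
    "names": ["feat_a", "feat_b", "feat_c"],  # hypothetical feature names
}

# rowvar=False: each column is a feature, each row is a point.
corr = np.corrcoef(trads["features"], rowvar=False)

for name, row in zip(trads["names"], corr):
    print(name, np.round(row, 2))
```

Strongly correlated feature pairs (values close to +1 or -1) carry redundant information, which is the usual reason for inspecting this matrix before training.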

lidar_platform.classification.cc_3dmasc.plot_feat_imp(feat_imp, trads, save=False, plot=True)

Plot the random forest feature importance.

Parameters:

feat_imp (numpy.array) – array containing feature importance values.
trads (dictionary of numpy arrays) – training features dictionary.
save (bool) – defines if plot must be saved.
plot (bool) – defines if plot must be opened.

lidar_platform.classification.cc_3dmasc.test(testds, classifier, model=0)

Test the random forest model obtained. The model is tested on the test dataset, and classification metrics are computed.

Parameters:

testds (dict of numpy arrays) – test features dictionary.
classifier (sklearn RandomForestClassifier or OpenCV RTrees) – trained random forest model ready for use.
model (int (0 or 1)) – type of the model. 0 = scikit-learn random forest, 1 = OpenCV random forest

Returns:

labels_pred (numpy array) – labels predicted by the model for each set of features.
confid_pred (numpy array) – prediction confidence for each set of features.
feat_imptce (numpy array) – importance of each feature (as computed in sklearn’s RandomForestClassifier).
oa (float) – overall accuracy of the classifier
fs (float) – F1-score of the classifier averaged on all classes

lidar_platform.classification.cc_3dmasc.train(trads, model=0)

Train a random forest model for point cloud feature classification. This function handles two types of RF models: scikit-learn (parallelized computing) and OpenCV (same as in CloudCompare, but not parallelized).

Parameters:

trads (dictionary of numpy arrays) – training features dictionary.
model (int (0 or 1)) – type of model. 0 = scikit-learn random forest, 1 = OpenCV random forest

Returns:

model – trained random forest model ready for use.

Return type:

sklearn RandomForestClassifier or OpenCV RTrees
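
For the scikit-learn case (model=0), a train-then-test workflow might look like the sketch below. It calls scikit-learn directly rather than this module: the synthetic data, the hyperparameters, and the definition of prediction confidence as the highest class probability are all assumptions, and the dictionaries only mimic the ‘features’ and ‘labels’ keys described for load_sbf_features.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)


def make_dataset(n_points):
    """Synthetic two-class dataset; only the first feature is informative."""
    labels = rng.integers(0, 2, size=n_points)
    features = rng.normal(size=(n_points, 3))
    features[:, 0] += 4.0 * labels
    return {"features": features, "labels": labels}


trads = make_dataset(200)   # training dataset
testds = make_dataset(50)   # test dataset

# n_jobs=-1 uses all available cores (the parallelized scikit-learn case).
classifier = RandomForestClassifier(n_estimators=50, n_jobs=-1, random_state=0)
classifier.fit(trads["features"], trads["labels"])

probabilities = classifier.predict_proba(testds["features"])
labels_pred = classifier.classes_[np.argmax(probabilities, axis=1)]
# Assumption: confidence is taken as the highest class probability.
confid_pred = probabilities.max(axis=1)
overall_accuracy = float(np.mean(labels_pred == testds["labels"]))
feat_imptce = classifier.feature_importances_
```

The arrays labels_pred and confid_pred have the shape expected by the confidence-based functions documented at the top of this page.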