Metrics
auc
get_auc(real, synthetic, n_folds=10)
Calculate the AUC score of a dataset using a Random Forest Classifier.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
real | DataFrame | Real dataset. | required |
synthetic | DataFrame | Either decoded or synthetic dataset. | required |
n_folds | int | Number of folds for cross-validation. Defaults to 10. | 10 |
Returns:
Type | Description |
---|---|
Tuple[float, float, int] | Partial AUC, AUC, and number of samples. |
Raises:
Type | Description |
---|---|
ValueError | If "VISIT" or "SUBJID" columns are present in the dataset. |
Source code in vambn/metrics/auc.py
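To make the returned AUC easier to interpret, the following minimal sketch computes an AUC from classifier scores using the rank (Mann-Whitney) formulation. It is not the implementation in `vambn/metrics/auc.py`: the function name `rank_auc`, the label convention (1 = real, 0 = synthetic), and the example scores are assumptions made for illustration, and the Random Forest training, cross-validation, and partial AUC are left out.

```python
def rank_auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    Assumption: label 1 marks real rows and label 0 marks synthetic rows.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("Both classes must be present to compute an AUC.")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for three real and three synthetic rows.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc_example = rank_auc(scores, labels)  # 8 of 9 pairs are ordered correctly
```

An AUC near 0.5 means the scores do not separate the two groups, while a value near 1.0 means they are easy to tell apart.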
categorical
accuracy(pred, target, mask)
Calculate the accuracy of predictions for a categorical variable.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pred | Tensor | Predictions of shape (batch_size, n_categories). | required |
target | Tensor | Ground truth of shape (batch_size, n_categories). | required |
mask | Tensor | Mask of shape (batch_size, n_categories). | required |
Returns:
Name | Type | Description |
---|---|---|
Tensor | Tensor | The accuracy of predictions. |
Source code in vambn/metrics/categorical.py
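As a rough illustration of a masked accuracy over inputs of the documented shapes, here is a plain-Python sketch that uses nested lists in place of tensors. Several details are assumptions rather than documented behavior: the helper name `masked_accuracy`, taking the argmax of each row as the predicted and true category, and skipping any row whose mask is not fully set.

```python
def masked_accuracy(pred, target, mask):
    """Fraction of observed rows whose predicted category matches the target."""
    correct = 0
    observed = 0
    for p_row, t_row, m_row in zip(pred, target, mask):
        if not all(m_row):
            continue  # assumption: a row counts only when fully observed
        observed += 1
        # Compare the index of the largest score with the index of the
        # one-hot target entry.
        if p_row.index(max(p_row)) == t_row.index(max(t_row)):
            correct += 1
    if observed == 0:
        raise ValueError("No observed rows to score.")
    return correct / observed


pred = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]]
target = [[1, 0, 0], [0, 0, 1], [0, 0, 1], [0, 1, 0]]
mask = [[1, 1, 1], [1, 1, 1], [1, 1, 1], [0, 0, 0]]
acc_example = masked_accuracy(pred, target, mask)  # 2 of 3 observed rows match
```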
continous
nrmse(pred, target, mask)
Calculate the normalized root mean squared error (NRMSE).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pred | Tensor | The predicted values. | required |
target | Tensor | The target values. | required |
mask | LongTensor | The mask to be applied; must be the same size as pred and target. | required |
Returns:
Type | Description |
---|---|
Tensor | The normalized root mean squared error. |
Source code in vambn/metrics/continous.py
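The reference above does not state which normalization is used, so the following sketch only illustrates one common convention: RMSE over the unmasked entries, divided by the range of the observed targets. The helper name `masked_nrmse` and the range normalization are assumptions. Other conventions divide by the standard deviation or the mean instead.

```python
import math


def masked_nrmse(pred, target, mask):
    """RMSE over entries with a nonzero mask, divided by the observed target range."""
    pairs = [(p, t) for p, t, m in zip(pred, target, mask) if m]
    if not pairs:
        raise ValueError("Mask removes every entry.")
    mse = sum((p - t) ** 2 for p, t in pairs) / len(pairs)
    observed_targets = [t for _, t in pairs]
    # Assumption: normalize by max - min of the observed targets.
    value_range = max(observed_targets) - min(observed_targets)
    if value_range == 0:
        raise ValueError("Observed targets are constant; range is zero.")
    return math.sqrt(mse) / value_range


pred = [2.0, 4.0, 7.0, 100.0]
target = [1.0, 5.0, 9.0, 0.0]
mask = [1, 1, 1, 0]  # the last entry is ignored
nrmse_example = masked_nrmse(pred, target, mask)  # sqrt(2) / 8
```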
jensen_shannon
jensen_shannon_distance(real, synthetic, data_type)
Calculate the Jensen-Shannon distance between two tensors.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
real | ndarray \| Tensor | Real data tensor. | required |
synthetic | ndarray \| Tensor | Synthetic data tensor. | required |
data_type | str | Type of data. Possible values are "real", "pos", "truncate_norm", "count", "cat". | required |
Returns:
Name | Type | Description |
---|---|---|
float | float | Jensen-Shannon distance. |
Raises:
Type | Description |
---|---|
Exception | If the data type is unknown or all columns contain too many NaN values and were removed. |
Source code in vambn/metrics/jensen_shannon.py
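For reference, the Jensen-Shannon distance is the square root of the Jensen-Shannon divergence between two probability distributions. The sketch below computes it for categorical data in plain Python. The helper names, the use of log base 2 (which bounds the distance to [0, 1]), and the frequency estimate are assumptions for illustration, and the library's handling of each `data_type` is not reproduced here.

```python
import math
from collections import Counter


def js_distance(p, q):
    """Square root of the Jensen-Shannon divergence (log base 2)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return math.sqrt(max(0.0, 0.5 * kl(p, m) + 0.5 * kl(q, m)))


def category_frequencies(values, categories):
    """Relative frequency of each category, in a fixed category order."""
    counts = Counter(values)
    return [counts[c] / len(values) for c in categories]


categories = ["a", "b", "c"]
real = ["a", "a", "b", "c"]        # hypothetical real column
synthetic = ["a", "b", "b", "c"]   # hypothetical synthetic column
js_example = js_distance(
    category_frequencies(real, categories),
    category_frequencies(synthetic, categories),
)
```

Identical distributions give a distance of 0, and distributions with no shared support give 1 under the base-2 convention.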
jensen_shannon_distance_kde(tensor1, tensor2, data_type, bins=30)
Calculate the Jensen-Shannon distance between two tensors.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tensor1 | ndarray \| Tensor | Tensor 1. | required |
tensor2 | ndarray \| Tensor | Tensor 2. | required |
data_type | str | Type of data. Possible values are "real", "pos", "truncate_norm", "count", "cat", "gamma". | required |
bins | int | Number of bins for count. Defaults to 30. | 30 |
Returns:
Name | Type | Description |
---|---|---|
float | float | Jensen-Shannon distance. |
Raises:
Type | Description |
---|---|
Exception | If the data type is unknown. |
Source code in vambn/metrics/jensen_shannon.py
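To show how a `bins` argument can enter such a computation, the sketch below discretizes two numeric samples on a shared grid and then takes the Jensen-Shannon distance of the resulting histograms. It is a simplified stand-in, not the kernel density estimate that the function name suggests: the helper `binned_js_distance`, the equal-width shared bins, and the base-2 logarithm are all assumptions.

```python
import math


def js_distance(p, q):
    """Square root of the Jensen-Shannon divergence (log base 2)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return math.sqrt(max(0.0, 0.5 * kl(p, m) + 0.5 * kl(q, m)))


def binned_js_distance(x, y, bins=30):
    """JS distance between histograms of x and y built on shared equal-width bins."""
    lo = min(min(x), min(y))
    hi = max(max(x), max(y))
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 when all values coincide

    def histogram(values):
        counts = [0] * bins
        for v in values:
            # Clamp so the maximum value falls into the last bin.
            counts[min(int((v - lo) / width), bins - 1)] += 1
        return [c / len(values) for c in counts]

    return js_distance(histogram(x), histogram(y))


same = binned_js_distance([0.1, 0.5, 0.9], [0.1, 0.5, 0.9])
far_apart = binned_js_distance([0.0, 0.1, 0.2], [10.0, 10.1, 10.2])
```

Using one grid for both samples matters: binning each sample over its own range would make the two histograms incomparable.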
relative_correlation
RelativeCorrelation
Class for calculating relative correlation metrics between data sets.
Source code in vambn/metrics/relative_correlation.py
error(real, synthetic, method='spearman')
staticmethod
Calculate the relative error of correlation between two pandas DataFrames.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
real | DataFrame | First DataFrame. | required |
synthetic | DataFrame | Second DataFrame. | required |
method | str | Method for correlation. Defaults to "spearman". | 'spearman' |
Returns:
Name | Type | Description |
---|---|---|
tuple | tuple[Any, DataFrame, DataFrame] | A tuple of three items: the relative error of correlation between the two DataFrames (float), the correlation matrix of the real DataFrame (pd.DataFrame), and the correlation matrix of the synthetic DataFrame (pd.DataFrame). |
Source code in vambn/metrics/relative_correlation.py
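The reference does not spell out how the relative error is defined. One plausible reading, sketched below in plain Python, is the Frobenius norm of the difference between the two Spearman correlation matrices divided by the Frobenius norm of the real one. The helper names and the choice of norm are assumptions, and columns are passed as lists rather than DataFrames to keep the example dependency-free.

```python
import math


def _ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def _pearson(x, y):
    # Note: constant columns (zero variance) are not handled in this sketch.
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_matrix(columns):
    """Spearman correlation = Pearson correlation of the ranked columns."""
    ranked = [_ranks(c) for c in columns]
    return [[_pearson(a, b) for b in ranked] for a in ranked]


def relative_correlation_error(real_cols, synth_cols):
    c_real = spearman_matrix(real_cols)
    c_syn = spearman_matrix(synth_cols)
    diff = math.sqrt(sum((r - s) ** 2
                         for rr, sr in zip(c_real, c_syn)
                         for r, s in zip(rr, sr)))
    norm = math.sqrt(sum(r ** 2 for rr in c_real for r in rr))
    return diff / norm, c_real, c_syn


real_cols = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10]]    # perfectly correlated
synth_cols = [[1, 2, 3, 4, 5], [10, 8, 6, 4, 2]]   # perfectly anti-correlated
err, c_real, c_syn = relative_correlation_error(real_cols, synth_cols)
```

Like the documented method, the sketch returns the error together with both correlation matrices, so the caller can inspect where the two datasets disagree.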