Simplifying Neural Network Training Under Class Imbalance
- Small batch sizes shine in class-imbalanced settings
- Data augmentations have an amplified impact on performance under class imbalance, especially on minority-class accuracy
- Adding a self-supervised loss during training can improve feature representations
- Label smoothing, especially on minority-class examples, helps prevent overfitting. We adapt label smoothing for the class-imbalanced setting by applying more smoothing to minority-class examples than to majority-class examples
- A small modification of Sharpness-Aware Minimization (SAM) pulls decision boundaries away from minority samples and significantly improves minority-group accuracy (a sketch of a basic SAM training step appears after this list)
- Loss reweighting. Reweighting methods assign different weights to the loss contributions of the majority and minority classes, increasing the influence of minority samples that would otherwise play little role in the total loss
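As a concrete reference for the SAM bullet above, here is a minimal sketch of a single vanilla SAM training step in TensorFlow. This is plain SAM with one global perturbation radius rho; the class-imbalance modification mentioned in the list changes how the perturbation treats minority-class examples, which is not reproduced here. The function name sam_train_step and the value rho=0.05 are illustrative, not from the paper.
import tensorflow as tf

def sam_train_step(model, optimizer, loss_fn, x, y, rho=0.05):
    # First pass: gradients at the current weights
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x, training=True))
    grads = tape.gradient(loss, model.trainable_variables)

    # Ascent step: perturb the weights towards higher loss (sharpest direction)
    grad_norm = tf.linalg.global_norm(grads) + 1e-12
    e_ws = [rho * g / grad_norm for g in grads]
    for var, e_w in zip(model.trainable_variables, e_ws):
        var.assign_add(e_w)

    # Second pass: gradients at the perturbed weights
    with tf.GradientTape() as tape:
        perturbed_loss = loss_fn(y, model(x, training=True))
    perturbed_grads = tape.gradient(perturbed_loss, model.trainable_variables)

    # Restore the original weights, then apply the sharpness-aware update
    for var, e_w in zip(model.trainable_variables, e_ws):
        var.assign_sub(e_w)
    optimizer.apply_gradients(zip(perturbed_grads, model.trainable_variables))
    return loss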
Label smoothing is a technique often used in training deep learning models, particularly for classification tasks. It modifies the target labels, making them a blend of the original hard labels and some uniform or prior distribution. This can lead to better generalization by preventing the model from becoming too confident about its predictions. In a class-imbalanced setting, where some classes have significantly more examples than others, label smoothing can help by reducing the model's bias towards the more frequent classes.
Label smoothing for the class-imbalanced setting: Python example
import tensorflow as tf
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical
from sklearn.datasets import make_classification
from tensorflow.keras.optimizers import Adam
import numpy as np
import matplotlib.pyplot as plt

# Configure the random seed for reproducibility
seed = 42
np.random.seed(seed)
tf.random.set_seed(seed)

def smooth_labels(labels, smoothing_factor=0.1):
    # Assume `labels` are given as one-hot encoded vectors
    # If `labels` are given as class indices, convert them to one-hot vectors first
    num_classes = labels.shape[1]
    return (1 - smoothing_factor) * labels + (smoothing_factor / num_classes)

# Create a synthetic imbalanced dataset (70% / 20% / 10% class split)
X, y = make_classification(
    n_samples=1000, n_features=20, n_classes=3, n_clusters_per_class=1,
    weights=[0.7, 0.2, 0.1], flip_y=0, random_state=seed
)

# Split the dataset into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=seed)

# Convert class vectors to one-hot encoded vectors
y_train_one_hot = to_categorical(y_train)
y_val_one_hot = to_categorical(y_val)

# Create a simple neural network model
def create_model():
    model = Sequential([
        Dense(64, input_shape=(X_train.shape[1],), activation='relu'),
        Dense(64, activation='relu'),
        Dense(3, activation='softmax')
    ])
    model.compile(optimizer=Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# Train the model without label smoothing
model_without_smoothing = create_model()
history_without_smoothing = model_without_smoothing.fit(
    X_train, y_train_one_hot, epochs=50,
    validation_data=(X_val, y_val_one_hot), verbose=0
)

# Train the model with label smoothing
# We'll smooth the training labels, but not the validation labels
smoothed_y_train = smooth_labels(y_train_one_hot, smoothing_factor=0.1)
model_with_smoothing = create_model()
history_with_smoothing = model_with_smoothing.fit(
    X_train, smoothed_y_train, epochs=50,
    validation_data=(X_val, y_val_one_hot), verbose=0
)

# Plot the validation accuracy for both models
plt.figure(figsize=(12, 8))
plt.plot(history_without_smoothing.history['val_accuracy'], label='Without Label Smoothing')
plt.plot(history_with_smoothing.history['val_accuracy'], label='With Label Smoothing')
plt.title('Validation Accuracy with and without Label Smoothing')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
In practice, label smoothing does not change the dataset's inherent imbalance; it softens the target distributions by moving a portion of the probability mass from the peak (the hard label) to the other classes, which helps during training by preventing the model from becoming overly confident on the majority class.
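As a quick check, reusing the smooth_labels function from the example above, a smoothing factor of 0.1 with three classes turns the hard label [1, 0, 0] into roughly [0.93, 0.03, 0.03]:
hard = np.array([[1.0, 0.0, 0.0]])
print(smooth_labels(hard, smoothing_factor=0.1))
# [[0.93333333 0.03333333 0.03333333]]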
Loss reweighting for the class-imbalanced setting: Python example
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.utils.class_weight import compute_class_weight
from tensorflow.keras.utils import to_categorical
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Create an artificial imbalanced classification dataset
X, y = make_classification(n_samples=1000, n_features=20, n_clusters_per_class=1,
                           n_classes=3, weights=[0.7, 0.2, 0.1], random_state=42)

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Convert labels to one-hot encoding
y_train_one_hot = to_categorical(y_train)
y_test_one_hot = to_categorical(y_test)

# Compute the class weights (inversely proportional to class frequency)
class_weights = compute_class_weight('balanced', classes=np.unique(y_train), y=y_train)
class_weights_dict = dict(enumerate(class_weights))

# Create the model
model = Sequential([
    Dense(64, activation='relu', input_dim=20),
    Dense(64, activation='relu'),
    Dense(3, activation='softmax')  # Output layer for 3 classes
])

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Fit the model with class weights
model.fit(X_train, y_train_one_hot, epochs=20,
          validation_data=(X_test, y_test_one_hot),
          class_weight=class_weights_dict)

# Evaluate the model
loss, accuracy = model.evaluate(X_test, y_test_one_hot)
print(f'Test accuracy: {accuracy}')
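Overall accuracy can look fine while the minority classes are being misclassified, so it is worth inspecting per-class metrics as well. A minimal follow-up using scikit-learn's classification_report on the variables from the example above:
from sklearn.metrics import classification_report

# Predicted class indices from the softmax probabilities
y_pred = np.argmax(model.predict(X_test), axis=1)
print(classification_report(y_test, y_pred, digits=3))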
Per the Keras documentation, class_weight is an optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only).
Let's import the module first
from sklearn.utils import class_weight
To calculate the class weights, do the following (recent scikit-learn versions require classes and y to be passed by keyword):
class_weights = class_weight.compute_class_weight(class_weight='balanced', classes=np.unique(y_train), y=y_train)
Finally, pass it to the model fitting as a dictionary, since Keras expects a mapping from class index to weight rather than a raw array:
model.fit(X_train, y_train, class_weight=dict(enumerate(class_weights)))
Keep Exploring!!!