"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

December 09, 2023

Simplifying Neural Network Training Under Class Imbalance

  • Small batch sizes: class-imbalanced settings are where small batch sizes shine.
  • Data augmentations have an amplified impact on performance under class imbalance, especially on minority-class accuracy
  • Adding a self-supervised loss during training can improve feature representations (a toy auxiliary-loss sketch follows this list)
  • Label smoothing, especially on minority-class examples, helps prevent overfitting. The paper adapts label smoothing for the class-imbalanced setting by applying more smoothing to minority-class examples than to majority-class examples (a class-dependent sketch appears after the uniform-smoothing example below)
  • A small modification of Sharpness-Aware Minimization (SAM) pulls decision boundaries away from minority samples and significantly improves minority-group accuracy (a minimal SAM sketch also follows this list)
  • Loss reweighting. Reweighting methods assign different weights to majority and minority class loss functions, increasing the influence of minority samples which would otherwise play little role in the loss function
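
The self-supervised bullet above names the idea without a recipe. Below is a toy sketch, assuming a tabular setup like the examples later in this post: a classifier with an auxiliary denoising-reconstruction head. The layer sizes, head names, noise level, and the 0.1 loss weight are all illustrative choices, not the paper's.

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

# Shared trunk feeding two heads: classification + input reconstruction
inputs = tf.keras.Input(shape=(20,))
h = layers.Dense(64, activation='relu')(inputs)
h = layers.Dense(64, activation='relu')(h)
class_out = layers.Dense(3, activation='softmax', name='cls')(h)
recon_out = layers.Dense(20, name='recon')(h)  # reconstructs the clean input

model = Model(inputs, [class_out, recon_out])
model.compile(optimizer='adam',
              loss={'cls': 'categorical_crossentropy', 'recon': 'mse'},
              loss_weights={'cls': 1.0, 'recon': 0.1})  # 0.1 is an assumed weight

# Usage (with X_train / y_train_one_hot as in the examples below):
# X_noisy = X_train + np.random.normal(0, 0.1, X_train.shape)
# model.fit(X_noisy, {'cls': y_train_one_hot, 'recon': X_train}, epochs=20)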
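
Similarly, the paper's exact SAM modification is not reproduced here; the sketch below is plain Sharpness-Aware Minimization written as a custom TensorFlow training step, assuming a model, optimizer, and loss_fn are already built. The function name sam_train_step and the rho default are illustrative.

import tensorflow as tf

@tf.function
def sam_train_step(model, optimizer, loss_fn, x, y, rho=0.05):
    # First pass: gradients at the current weights
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x, training=True))
    grads = tape.gradient(loss, model.trainable_variables)
    grad_norm = tf.linalg.global_norm(grads) + 1e-12
    # Step to the (approximate) worst-case point in an L2 ball of radius rho
    eps = [g * (rho / grad_norm) for g in grads]
    for v, e in zip(model.trainable_variables, eps):
        v.assign_add(e)
    # Second pass: gradients at the perturbed weights
    with tf.GradientTape() as tape:
        perturbed_loss = loss_fn(y, model(x, training=True))
    sam_grads = tape.gradient(perturbed_loss, model.trainable_variables)
    # Undo the perturbation, then update the original weights with SAM gradients
    for v, e in zip(model.trainable_variables, eps):
        v.assign_sub(e)
    optimizer.apply_gradients(zip(sam_grads, model.trainable_variables))
    return loss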

Label smoothing is a technique often used when training deep learning models, particularly for classification tasks. It modifies the target labels, making them a blend of the original hard labels and a uniform (or other prior) distribution; for example, with three classes and a smoothing factor of 0.1, the hard label [1, 0, 0] becomes [0.933, 0.033, 0.033]. This can lead to better generalization by preventing the model from becoming too confident in its predictions. In a class-imbalanced setting, where some classes have significantly more examples than others, label smoothing can help by reducing the model's bias towards the more frequent classes.

Label smoothing for the class-imbalanced setting: Python example

import tensorflow as tf
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical
from sklearn.datasets import make_classification
from tensorflow.keras.optimizers import Adam
import numpy as np
import matplotlib.pyplot as plt

# Configure the random seed for reproducibility
seed = 42
np.random.seed(seed)
tf.random.set_seed(seed)

def smooth_labels(labels, smoothing_factor=0.1):
    # Assumes `labels` are one-hot encoded vectors;
    # if they are class indices, convert them with to_categorical first
    num_classes = labels.shape[1]
    return (1 - smoothing_factor) * labels + (smoothing_factor / num_classes)

# Create a synthetic imbalanced dataset (70% / 20% / 10%)
X, y = make_classification(
    n_samples=1000, n_features=20, n_classes=3, n_clusters_per_class=1,
    weights=[0.7, 0.2, 0.1], flip_y=0, random_state=seed
)

# Split the dataset into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=seed)

# Convert class vectors to one-hot encoded vectors
y_train_one_hot = to_categorical(y_train)
y_val_one_hot = to_categorical(y_val)

# Create a simple neural network model
def create_model():
    model = Sequential([
        Dense(64, input_shape=(X_train.shape[1],), activation='relu'),
        Dense(64, activation='relu'),
        Dense(3, activation='softmax')
    ])
    model.compile(optimizer=Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# Train the model without label smoothing
model_without_smoothing = create_model()
history_without_smoothing = model_without_smoothing.fit(
    X_train, y_train_one_hot, epochs=50,
    validation_data=(X_val, y_val_one_hot), verbose=0
)

# Train the model with label smoothing
# (smooth the training labels only, not the validation labels)
smoothed_y_train = smooth_labels(y_train_one_hot, smoothing_factor=0.1)
model_with_smoothing = create_model()
history_with_smoothing = model_with_smoothing.fit(
    X_train, smoothed_y_train, epochs=50,
    validation_data=(X_val, y_val_one_hot), verbose=0
)

# Plot the validation accuracy for both models
plt.figure(figsize=(12, 8))
plt.plot(history_without_smoothing.history['val_accuracy'], label='Without Label Smoothing')
plt.plot(history_with_smoothing.history['val_accuracy'], label='With Label Smoothing')
plt.title('Validation Accuracy with and without Label Smoothing')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()


In practice, label smoothing does not change the dataset's inherent imbalance. It softens the target distributions by moving a portion of the probability mass from the peak (the hard label) onto the other classes, which helps during training by preventing the model from becoming overly confident on the majority class.
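
The example above applies the same smoothing factor everywhere. To match the bullet-point idea of smoothing minority classes more, here is a minimal class-dependent variant; the helper name and the base/extra schedule are assumptions for illustration, not the paper's exact formula:

import numpy as np

def class_dependent_smooth_labels(labels_one_hot, class_counts, base=0.05, extra=0.1):
    # Per-class smoothing factor: rarer classes get more smoothing,
    # from `base` for the most frequent class toward `base + extra` for rare ones
    num_classes = labels_one_hot.shape[1]
    counts = np.asarray(class_counts, dtype=float)
    eps_per_class = base + extra * (1.0 - counts / counts.max())
    # Each example inherits the smoothing factor of its true class
    eps = (labels_one_hot @ eps_per_class)[:, None]
    return (1 - eps) * labels_one_hot + eps / num_classes

# Usage with the training labels from the example above:
# smoothed = class_dependent_smooth_labels(y_train_one_hot, y_train_one_hot.sum(axis=0))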



Loss reweighting for the class-imbalanced setting: Python example

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.utils.class_weight import compute_class_weight
from tensorflow.keras.utils import to_categorical
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Create an artificial imbalanced classification dataset (70% / 20% / 10%)
X, y = make_classification(n_samples=1000, n_features=20, n_clusters_per_class=1,
                           n_classes=3, weights=[0.7, 0.2, 0.1], random_state=42)

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Convert labels to one-hot encoding
y_train_one_hot = to_categorical(y_train)
y_test_one_hot = to_categorical(y_test)

# Compute the class weights ('balanced' weights inversely to class frequency)
class_weights = compute_class_weight('balanced', classes=np.unique(y_train), y=y_train)
class_weights_dict = dict(enumerate(class_weights))

# Create the model
model = Sequential([
    Dense(64, activation='relu', input_dim=20),
    Dense(64, activation='relu'),
    Dense(3, activation='softmax')  # Output layer for 3 classes
])

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Fit the model with class weights
model.fit(X_train, y_train_one_hot, epochs=20,
          validation_data=(X_test, y_test_one_hot), class_weight=class_weights_dict)

# Evaluate the model
loss, accuracy = model.evaluate(X_test, y_test_one_hot)
print(f'Test accuracy: {accuracy}')


From the Keras documentation: class_weight is an optional dictionary mapping class indices (integers) to a weight (float), used for weighting the loss function (during training only).

First, import what is needed (NumPy is used below for np.unique):

import numpy as np
from sklearn.utils import class_weight

Next, compute the class weights; recent scikit-learn versions require `classes` and `y` to be passed as keyword arguments:

class_weights = class_weight.compute_class_weight('balanced', classes=np.unique(y_train), y=y_train)

Finally, convert the weights to a dictionary (Keras expects a mapping from class index to weight) and pass it to model.fit:

class_weights_dict = dict(enumerate(class_weights))
model.fit(X_train, y_train, class_weight=class_weights_dict)
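
For reference, the 'balanced' mode computes weight_c = n_samples / (n_classes * count_c), so each class is weighted inversely to its frequency. A quick sanity check on toy labels (the 70/20/10 split mirrors the dataset above):

import numpy as np

y_toy = np.array([0] * 70 + [1] * 20 + [2] * 10)  # toy imbalanced labels
counts = np.bincount(y_toy)
manual_weights = len(y_toy) / (len(counts) * counts)
print(manual_weights)  # -> [0.476 1.667 3.333], matching compute_class_weight('balanced', ...)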

Keep Exploring!!!
