pyBIA.cnn_model

Created on Thu Sep 16 22:40:39 2021

@author: daniel

Classes

Classifier

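Creates and trains a convolutional neural network for binary classification, with optional normalization, augmentation, simple cross-validation, and convenience utilities for saving, loading, and visualization.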

Functions

custom_model(positive_class, negative_class[, ...])

Build and train a configurable CNN with 1–3 Conv2D(+pool) blocks followed by up to 3 dense layers.

AlexNet(positive_class, negative_class[, ...])

Build and train an AlexNet-style CNN adapted for binary classification of astronomical images.

VGG16(positive_class, negative_class[, ...])

Build and train a VGG16-style CNN for binary classification of astronomical images.

Resnet18(positive_class, negative_class[, ...])

Build and train a ResNet-18–style CNN with a configurable stem and residual block widths, among other settings.

resnet_block(x, filters_in, filters_out[, ...])

Basic residual block with two conv layers and an identity/projection skip.

f1_score(y_true, y_pred)

Binary F1 score (harmonic mean of precision and recall) for the current batch.

calculate_tp_fp(model, sample, y_true)

Compute batch true positives (TP) and false positives (FP) from model predictions.

focal_loss(y_true, y_pred[, gamma, alpha])

Binary focal loss for imbalanced classification (Lin et al., 2017).

dice_loss(y_true, y_pred[, smooth])

Dice loss for binary segmentation.

jaccard_loss(y_true, y_pred[, smooth])

Jaccard (IoU) loss for binary segmentation.

weighted_binary_crossentropy(weight)

Weighted binary cross-entropy (positive-class scaling).

get_optimizer(optimizer, lr[, momentum, decay, rho, ...])

Return a configured Keras optimizer instance.

get_loss_function(loss[, weight])

Return a Keras-compatible loss given a symbolic name.

print_params(batch_size, lr, decay, momentum, ...)

Print a formatted summary of training hyperparameters and model architecture settings.

format_labels(→ list)

Convert raw parameter keys into human-readable display labels.
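
As an illustration of the focal_loss entry listed above, the following is a minimal NumPy sketch of the binary focal loss from Lin et al. (2017), FL = −α_t (1 − p_t)^γ log(p_t). It is not the pyBIA implementation: the function name binary_focal_loss, the eps clipping constant, and the defaults gamma=2.0 and alpha=0.25 (the values suggested in the paper) are assumptions made for the example.

```python
import numpy as np

def binary_focal_loss(y_true, y_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss (illustrative; defaults follow the paper, not pyBIA)."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities so that log() never receives 0.
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    # p_t is the probability the model assigns to the true class.
    p_t = np.where(y_true == 1, y_pred, 1.0 - y_pred)
    # alpha_t weights positives by alpha and negatives by (1 - alpha).
    alpha_t = np.where(y_true == 1, alpha, 1.0 - alpha)
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))

labels = np.array([1, 0, 1, 0])
confident = np.array([0.95, 0.05, 0.90, 0.10])
uncertain = np.array([0.55, 0.45, 0.60, 0.40])

loss_confident = binary_focal_loss(labels, confident)
loss_uncertain = binary_focal_loss(labels, uncertain)
# With gamma=0 and alpha=0.5 the loss reduces to half the binary cross-entropy.
loss_gamma0 = binary_focal_loss(labels, uncertain, gamma=0.0, alpha=0.5)
```

The (1 − p_t)^γ factor shrinks the contribution of well-classified samples, which is why the confident predictions give a much smaller loss than the uncertain ones.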

Module Contents

class pyBIA.cnn_model.Classifier(positive_class=None, negative_class=None, val_positive=None, val_negative=None, img_num_channels=1, clf='alexnet', normalize=False, min_pixel=0, max_pixel=100, epochs=25, patience=5, metric='loss', opt_cv=None, augment_data=False, batch_positive=10, batch_negative=1, balance=True, image_size=70, shift=10, rotation=False, horizontal=False, vertical=False, mask_size=None, num_masks=None, blend_positive=0, blending_func='mean', num_images_to_blend=2, blend_negative=0, zoom_range=(0.9, 1.1), skew_angle=0, batch_size=32, optimizer='sgd', lr=0.0001, momentum=0.9, decay=0.0, nesterov=False, rho=0.9, beta_1=0.9, beta_2=0.999, amsgrad=False, conv_init='uniform_scaling', dense_init='truncated_normal', activation_conv='relu', activation_dense='relu', conv_reg=0, dense_reg=0, padding='same', model_reg='batch_norm', verbose=2, path=None, use_gpu=False)[source]

Creates and trains a convolutional neural network for binary classification, with optional normalization, augmentation, simple cross-validation, and convenience utilities for saving, loading, and visualization.

Parameters:
  • positive_class (ndarray or None, optional) – Training images for the positive class. Accepts (N, H, W) or (N, H, W, C) arrays where N is the number of samples, H×W are spatial dimensions, and C is the number of channels. Default is None.

  • negative_class (ndarray or None, optional) – Training images for the negative class. Accepts (N, H, W) or (N, H, W, C) arrays with the same conventions as positive_class. Default is None.

  • val_positive (ndarray or None, optional) – Optional validation images for the positive class using the same shape rules as training data. Default is None.

  • val_negative (ndarray or None, optional) – Optional validation images for the negative class using the same shape rules as training data. Default is None.

  • img_num_channels (int, optional) – Number of channels per image (last dimension). Inferred from 4-D inputs when possible; may be set explicitly for legacy compatibility. Default is 1.

  • clf ({'alexnet','vgg16','resnet18','custom_cnn'}, optional) – Backbone architecture to build and train. Default is ‘alexnet’.

  • normalize (bool, optional) – If True, min–max normalize each image/channel using min_pixel and max_pixel before training or prediction. Default is False.

  • min_pixel (float, optional) – Lower clamp applied during min–max normalization (used only if normalize=True). Default is 0.

  • max_pixel (float or list, optional) – Upper clamp applied during min–max normalization (used only if normalize=True). If multi-channel, a list may specify per-channel maxima. Default is 100.

  • epochs (int, optional) – Number of training epochs. If set to 0, the model is constructed but not trained. Default is 25.

  • patience (int, optional) – Early-stopping patience (epochs) for the monitored metric. Default is 5.

  • metric ({'loss','binary_accuracy','f1_score','all','val_loss','val_binary_accuracy','val_f1_score'}, optional) – Metric used for monitoring/selection during training/early stopping. Default is ‘loss’.

  • opt_cv (int or None, optional) – If set to an integer K, perform simple K-fold-like training by rotating validation blocks (requires val_positive/val_negative). Default is None.

  • augment_data (bool, optional) – If True, apply the configured augmentation pipeline to the training data (positive class, and optionally negative). Default is False.

  • batch_positive (int, optional) – Augmentation multiplier applied to the positive class (outputs per input). Default is 10.

  • batch_negative (int, optional) – Augmentation multiplier applied to the negative class (0 disables negative augmentation). Default is 1.

  • balance (bool, optional) – After augmentation/resizing, trim the larger class to match the smaller. Default is True.

  • image_size (int, optional) – Target square side length used by augmentation/resize utilities. Default is 70.

  • shift (int, optional) – Maximum absolute pixel shift applied horizontally and vertically during augmentation. Default is 10.

  • rotation (bool, optional) – If True, allow random rotations in the full 0–360° range. Default is False.

  • horizontal (bool, optional) – If True, allow random horizontal flips in augmentation. Default is False.

  • vertical (bool, optional) – If True, allow random vertical flips in augmentation. Default is False.

  • mask_size (int or tuple or None, optional) – Side length of random square cutouts applied during augmentation; if a tuple (low, high) is given, sizes are sampled uniformly from the range. Default is None.

  • num_masks (int or tuple or None, optional) – Number of cutouts per image when mask_size is set; if a tuple (low, high) is given, counts are sampled uniformly from the range. Default is None.

  • blend_positive (float, optional) – Blended-image synthesis factor for the positive class (≥1 adds synthetic samples, 0 disables blending). Default is 0.

  • blending_func ({'mean','max','min','random'}, optional) – Operator used when blending multiple images to synthesize samples. Default is ‘mean’.

  • num_images_to_blend (int, optional) – Number of images combined per synthetic blend operation. Default is 2.

  • blend_negative (float, optional) – Blended-image synthesis factor for the negative class (≥1 adds synthetic samples, 0 disables blending). Default is 0.

  • zoom_range (tuple of (float, float) or None, optional) – Random zoom range specified as (min_zoom, max_zoom). Default is (0.9, 1.1).

  • skew_angle (float, optional) – Maximum absolute skew angle in degrees; the actual angle is sampled uniformly from [−skew_angle, +skew_angle]. Default is 0.

  • batch_size (int, optional) – Mini-batch size used during training. Default is 32.

  • optimizer ({'sgd','adam','rmsprop','adadelta',...}, optional) – Optimizer name forwarded to the model builders. Default is ‘sgd’.

  • lr (float, optional) – Optimizer learning rate. Default is 0.0001.

  • momentum (float, optional) – Momentum parameter used by SGD-like optimizers. Default is 0.9.

  • decay (float, optional) – Per-epoch learning-rate decay. Default is 0.0.

  • nesterov (bool, optional) – If True, use Nesterov momentum with SGD. Default is False.

  • rho (float, optional) – Rho parameter for Adadelta/RMSprop optimizers. Default is 0.9.

  • beta_1 (float, optional) – Beta1 parameter for Adam-type optimizers. Default is 0.9.

  • beta_2 (float, optional) – Beta2 parameter for Adam-type optimizers. Default is 0.999.

  • amsgrad (bool, optional) – If True, use the AMSGrad variant of Adam. Default is False.

  • conv_init (str, optional) – Kernel initializer for convolutional layers. Default is ‘uniform_scaling’.

  • dense_init (str, optional) – Kernel initializer for dense layers. Default is ‘truncated_normal’.

  • activation_conv (str, optional) – Activation function used in convolutional layers. Default is ‘relu’.

  • activation_dense (str, optional) – Activation function used in dense layers. Default is ‘relu’.

  • conv_reg (float, optional) – L2 regularization strength applied to convolutional layers. Default is 0.

  • dense_reg (float, optional) – L2 regularization strength applied to dense layers. Default is 0.

  • padding ({'same','valid'}, optional) – Convolution padding mode used throughout the network. Default is ‘same’.

  • model_reg ({'batch_norm', None, ...}, optional) – Model-level regularization utility applied to the network. Default is ‘batch_norm’.

  • verbose ({0,1,2}, optional) – Keras verbosity level (0 = silent, 1 = progress bar, 2 = per-epoch line). Default is 2.

  • path (str or None, optional) – Base directory for saving/loading artifacts; the home directory is used when None. Default is None.

  • use_gpu (bool, optional) – If False, disable GPU use by setting the environment variable CUDA_VISIBLE_DEVICES='-1'. Default is False.
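
The normalize, min_pixel, and max_pixel parameters above describe a clamped min–max scaling. A minimal NumPy sketch of that idea follows. It assumes the conventional formula (clip to [min_pixel, max_pixel], then rescale to [0, 1]); the helper name min_max_normalize is hypothetical and the exact pyBIA implementation may differ.

```python
import numpy as np

def min_max_normalize(images, min_pixel=0.0, max_pixel=100.0):
    """Clip to [min_pixel, max_pixel] and rescale to [0, 1] (illustrative sketch)."""
    images = np.asarray(images, dtype=float)
    if images.ndim == 3:
        # Promote (N, H, W) to (N, H, W, 1) so channels are always the last axis.
        images = images[..., np.newaxis]
    # A scalar or a per-channel list both broadcast along the channel axis.
    max_pixel = np.asarray(max_pixel, dtype=float)
    clipped = np.clip(images, min_pixel, max_pixel)
    return (clipped - min_pixel) / (max_pixel - min_pixel)

rng = np.random.default_rng(0)
raw = rng.uniform(-10.0, 200.0, size=(4, 70, 70, 2))

# Two-channel input with a different upper clamp per channel.
scaled = min_max_normalize(raw, min_pixel=0.0, max_pixel=[100.0, 50.0])

# Single-channel input: values below/above the clamps map to 0 and 1.
ramp = min_max_normalize(np.array([[[-5.0, 50.0, 150.0]]]))
```

Because the clamps are fixed rather than computed per image, the same min_pixel and max_pixel need to be used for training and prediction data for the scaling to be consistent.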

model[source]

Trained model (or list of models if opt_cv is used).

Type:

keras.Model or list[keras.Model] or None

history[source]

Keras history object(s) from training.

Type:

keras.callbacks.History or list[History] or None

model_train_metrics

Stacked training metrics with one row per epoch and columns [binary_accuracy, loss, f1_score]. When opt_cv is used, a list with one array per fold.

Type:

ndarray or list[ndarray]

model_val_metrics

Stacked validation metrics in the same layout as model_train_metrics; populated only if validation data was provided.

Type:

ndarray or list[ndarray]
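
The f1_score column in these metric arrays is described elsewhere in this module as the harmonic mean of precision and recall. A small NumPy sketch of that quantity is shown below; the helper name binary_f1 and the 0.5 decision threshold are assumptions for illustration and are not taken from the pyBIA source.

```python
import numpy as np

def binary_f1(y_true, y_prob, threshold=0.5):
    """Binary F1 from probabilities (illustrative; threshold is an assumption)."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_prob) >= threshold
    tp = int(np.sum(y_true & y_pred))    # true positives
    fp = int(np.sum(~y_true & y_pred))   # false positives
    fn = int(np.sum(y_true & ~y_pred))   # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
score = binary_f1(y_true, y_prob)  # TP=2, FP=1, FN=1, so F1 = 2/3
```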

path[source]

Folder where artifacts are saved/loaded (set by save()/load()).

Type:

str or None

positive_class = None[source]
negative_class = None[source]
val_positive = None[source]
val_negative = None[source]
img_num_channels = 1[source]
clf = 'alexnet'[source]
normalize = False[source]
min_pixel = 0[source]
max_pixel = 100[source]
epochs = 25[source]
patience = 5[source]
metric = 'loss'[source]
opt_cv = None[source]
augment_data = False[source]
batch_positive = 10[source]
batch_negative = 1[source]
balance = True[source]
image_size = 70[source]
shift = 10[source]
rotation = False[source]
horizontal = False[source]
vertical = False[source]
mask_size = None[source]
num_masks = None[source]
blending_func = 'mean'[source]
num_images_to_blend = 2[source]
blend_negative = 0[source]
zoom_range = (0.9, 1.1)[source]
skew_angle = 0[source]
blend_positive = 0[source]
batch_size = 32[source]
optimizer = 'sgd'[source]
lr = 0.0001[source]
momentum = 0.9[source]
decay = 0.0[source]
nesterov = False[source]
rho = 0.9[source]
beta_1 = 0.9[source]
beta_2 = 0.999[source]
amsgrad = False[source]
conv_init = 'uniform_scaling'[source]
dense_init = 'truncated_normal'[source]
activation_conv = 'relu'[source]
activation_dense = 'relu'[source]
conv_reg = 0[source]
dense_reg = 0[source]
padding = 'same'[source]
model_reg = 'batch_norm'[source]
verbose = 2[source]
path = None[source]
use_gpu = False[source]
model = None[source]
history = None[source]
create(overwrite_training=False, save_training=False)[source]

Build and train the configured CNN (optionally with augmentation and CV).

Models and histories are stored on the instance (self.model, self.history). If epochs == 0, the network is constructed but not trained.

Parameters:
  • overwrite_training (bool, optional) – If True, replace positive_class/negative_class (and validation sets, if any) with the processed arrays actually used for training (after normalization, resizing, and augmentation). Default is False.

  • save_training (bool, optional) – If True, persist the processed training/validation arrays alongside the model artifacts (location determined by path). Default is False.

Return type:

None

save(dirname=None, overwrite=False)[source]

Save the trained model(s), metrics, and class attributes to disk.

Creates a folder named ‘pyBIA_cnn_model’ under the base directory (path or home directory). If multiple CV models exist, each is saved separately.

Parameters:
  • dirname (str or None, optional) – Optional subdirectory created beneath the base path before saving (e.g., to group experiments). If the subdirectory already exists, an error is raised. Default is None.

  • overwrite (bool, optional) – If True, delete any existing ‘pyBIA_cnn_model’ folder at the target location and recreate it to avoid duplicates. Default is False.

Return type:

None

Raises:

ValueError – If the destination folder already exists and overwrite is False.

load(path=None, load_training_data=False)[source]

Load a saved model (or CV models), metrics, and class attributes from disk.

Looks for a ‘pyBIA_cnn_model’ folder under path (or the home directory if path is None). Optionally restores the saved training/validation arrays.

Parameters:
  • path (str or None, optional) – Base directory that contains the ‘pyBIA_cnn_model’ folder to load from. If None, the home directory is used. Default is None.

  • load_training_data (bool, optional) – If True, also load the saved positive_class, negative_class, and optional validation arrays (when present). Default is False.

Return type:

None

predict(data, target='LAB', return_proba=False, cv_model=0)[source]

Predict class labels for new images using the trained CNN.

Images are preprocessed using the current normalization settings and resized to the model’s required input size when necessary.

Parameters:
  • data (ndarray) – Input images as (N, H, W) for single-channel or (N, H, W, C) for multi-channel. A single image may be passed as (H, W) or (H, W, C) and will be promoted.

  • target (str, optional) – String label used for the positive class when returning class names. Default is ‘LAB’.

  • return_proba (bool, optional) – If True, also return predicted probabilities for the positive class. Default is False.

  • cv_model (int or {'all'}, optional) – Index of the CV model to use when multiple models were trained, or ‘all’ to average probabilities across all models. Default is 0.

Returns:

If return_proba is False, an array of predicted class strings with shape (N,). If return_proba is True, an array of shape (N, 2) with columns [predicted_label, probability_for_target].

Return type:

ndarray

Raises:

ValueError – If no trained model is available or if inputs are incompatible with the model.
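
To make the return_proba and cv_model='all' behaviour described above concrete, here is a hedged NumPy sketch that averages per-model probabilities and builds an (N, 2) array of [predicted_label, probability_for_target]. The helper name labels_from_probabilities, the negative-class label 'OTHER', and the 0.5 threshold are hypothetical choices for the example; the source does not state which label or threshold predict() uses.

```python
import numpy as np

def labels_from_probabilities(model_probs, target="LAB", other="OTHER", threshold=0.5):
    """Average probabilities over models and attach string labels (illustrative)."""
    # Accept a single model's (N,) output or a stack of shape (n_models, N).
    model_probs = np.atleast_2d(np.asarray(model_probs, dtype=float))
    mean_prob = model_probs.mean(axis=0)
    labels = np.where(mean_prob >= threshold, target, other)
    # An object array lets one column hold strings and the other floats.
    out = np.empty((mean_prob.size, 2), dtype=object)
    out[:, 0] = labels
    out[:, 1] = mean_prob
    return out

# Positive-class probabilities from two hypothetical CV models for three images.
fold_probs = [[0.9, 0.2, 0.6],
              [0.7, 0.4, 0.3]]
result = labels_from_probabilities(fold_probs)
```

Note how the third image is labelled positive by the first model alone (0.6) but negative after averaging (0.45), which is the practical effect of combining folds.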

augment_positive(batch=1, width_shift=0, height_shift=0, horizontal=False, vertical=False, rotation=False, fill='nearest', image_size=None, zoom_range=None, mask_size=None, num_masks=None, blend_multiplier=0, blending_func='mean', num_images_to_blend=2, skew_angle=0)[source]

Apply the configured augmentation pipeline to the positive class and replace positive_class with the augmented images.

Parameters:
  • batch (int, optional) – Number of augmented outputs to create per input image. Default is 1.

  • width_shift (int, optional) – Maximum horizontal pixel shift applied uniformly at random. Default is 0.

  • height_shift (int, optional) – Maximum vertical pixel shift applied uniformly at random. Default is 0.

  • horizontal (bool, optional) – If True, allow random horizontal flips. Default is False.

  • vertical (bool, optional) – If True, allow random vertical flips. Default is False.

  • rotation (bool, optional) – If True, allow random rotations in the full 0–360° range. Default is False.

  • fill ({'constant','nearest','reflect','wrap'}, optional) – Fill mode for pixels introduced by rotations/shifts. Default is ‘nearest’.

  • image_size (int or None, optional) – Target square size after augmentation; if None, keep original size. Default is None.

  • zoom_range (tuple of (float, float) or None, optional) – Random zoom range given as (min_zoom, max_zoom). Default is None.

  • mask_size (int or None, optional) – Side length of each random square cutout; if None, disable cutouts. Default is None.

  • num_masks (int or None, optional) – Number of cutouts applied per image when mask_size is set. Default is None.

  • blend_multiplier (float, optional) – Synthetic blending factor (≥1 adds blended samples, 0 disables blending). Default is 0.

  • blending_func ({'mean','max','min','random'}, optional) – Operator used when blending multiple images. Default is ‘mean’.

  • num_images_to_blend (int, optional) – Number of images combined per synthetic blend operation. Default is 2.

  • skew_angle (float, optional) – Maximum absolute skew angle in degrees; sampled uniformly from [−skew_angle, +skew_angle]. Default is 0.

Return type:

None
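
The mask_size and num_masks options above describe random square cutouts. The sketch below shows one way such cutouts can be applied with NumPy. It is an assumption-laden illustration, not the pyBIA routine: the helper name apply_cutouts is hypothetical, masked pixels are set to zero, and each square is kept fully inside the image.

```python
import numpy as np

def apply_cutouts(image, mask_size, num_masks, rng):
    """Zero out num_masks random squares of side mask_size (illustrative sketch)."""
    out = np.array(image, dtype=float, copy=True)  # leave the input untouched
    height, width = out.shape[:2]
    for _ in range(num_masks):
        # Upper-left corner chosen so the square stays inside the image.
        top = rng.integers(0, height - mask_size + 1)
        left = rng.integers(0, width - mask_size + 1)
        # The trailing ellipsis covers an optional channel axis.
        out[top:top + mask_size, left:left + mask_size, ...] = 0.0
    return out

rng = np.random.default_rng(0)
image = np.ones((70, 70))
masked = apply_cutouts(image, mask_size=10, num_masks=3, rng=rng)
num_zeroed = int((masked == 0).sum())
```

Squares may overlap, so the number of zeroed pixels is at most num_masks × mask_size² rather than exactly that value.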

augment_negative(batch=1, width_shift=0, height_shift=0, horizontal=False, vertical=False, rotation=False, fill='nearest', image_size=None, zoom_range=None, mask_size=None, num_masks=None, blend_multiplier=0, blending_func='mean', num_images_to_blend=2, skew_angle=0)[source]

Apply the configured augmentation pipeline to the negative class and replace negative_class with the augmented images.

Parameters:
  • batch (int, optional) – Number of augmented outputs to create per input image. Default is 1.

  • width_shift (int, optional) – Maximum horizontal pixel shift applied uniformly at random. Default is 0.

  • height_shift (int, optional) – Maximum vertical pixel shift applied uniformly at random. Default is 0.

  • horizontal (bool, optional) – If True, allow random horizontal flips. Default is False.

  • vertical (bool, optional) – If True, allow random vertical flips. Default is False.

  • rotation (bool, optional) – If True, allow random rotations in the full 0–360° range. Default is False.

  • fill ({'constant','nearest','reflect','wrap'}, optional) – Fill mode for pixels introduced by rotations/shifts. Default is ‘nearest’.

  • image_size (int or None, optional) – Target square size after augmentation; if None, keep original size. Default is None.

  • zoom_range (tuple of (float, float) or None, optional) – Random zoom range given as (min_zoom, max_zoom). Default is None.

  • mask_size (int or None, optional) – Side length of each random square cutout; if None, disable cutouts. Default is None.

  • num_masks (int or None, optional) – Number of cutouts applied per image when mask_size is set. Default is None.

  • blend_multiplier (float, optional) – Synthetic blending factor (≥1 adds blended samples, 0 disables blending). Default is 0.

  • blending_func ({'mean','max','min','random'}, optional) – Operator used when blending multiple images. Default is ‘mean’.

  • num_images_to_blend (int, optional) – Number of images combined per synthetic blend operation. Default is 2.

  • skew_angle (float, optional) – Maximum absolute skew angle in degrees; sampled uniformly from [−skew_angle, +skew_angle]. Default is 0.

Return type:

None
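
The blending options shared by augment_positive and augment_negative (blending_func, num_images_to_blend) combine several images into one synthetic sample. The following NumPy sketch shows the general idea under stated assumptions: the helper name blend_images is hypothetical, images are drawn without replacement, and 'random' is interpreted here as picking one of the other three operators at random, which the source does not confirm.

```python
import numpy as np

def blend_images(images, num_images_to_blend=2, blending_func="mean", rng=None):
    """Combine randomly chosen images pixel-wise (illustrative sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    images = np.asarray(images, dtype=float)
    # Pick distinct source images for this blend.
    idx = rng.choice(len(images), size=num_images_to_blend, replace=False)
    stack = images[idx]
    ops = {"mean": np.mean, "max": np.max, "min": np.min}
    if blending_func == "random":
        # Assumption: 'random' selects one of the operators above.
        blending_func = str(rng.choice(list(ops)))
    return ops[blending_func](stack, axis=0)

# Three constant images with pixel values 1, 3, and 5.
images = np.stack([np.full((4, 4), v) for v in (1.0, 3.0, 5.0)])
rng = np.random.default_rng(1)

blended_mean = blend_images(images, 3, "mean", rng)  # every pixel is 3.0
blended_max = blend_images(images, 3, "max", rng)    # every pixel is 5.0
blended_min = blend_images(images, 3, "min", rng)    # every pixel is 1.0
```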

plot_tsne(legend_loc='upper center', title='Feature Parameter Space', savefig=False)[source]

Plot a t-SNE projection of images (train and optional validation).

Data are flattened per image and embedded into 2D using sklearn’s TSNE.

Parameters:
  • legend_loc (str, optional) – Matplotlib legend location string (e.g., ‘upper center’). Default is ‘upper center’.

  • title (str, optional) – Figure title displayed above the plot. Default is ‘Feature Parameter Space’.

  • savefig (bool, optional) – If True, save the figure to ‘Images_tSNE_Projection.png’ instead of showing it. Default is False.

Return type:

None

Notes

Data should be normalized prior to plotting for meaningful distances (see the normalize option during training).

plot_performance(metric='acc', combine=False, cv_model=0, ylabel=None, title=None, xlim=None, ylim=None, xlog=False, ylog=False, legend_loc=9, savefig=False)[source]

Plot training (and optional validation) curves for a chosen metric.

Parameters:
  • metric ({'acc','loss','f1'}, optional) – Metric to visualize: accuracy, loss, or F1 score. Default is ‘acc’.

  • combine (bool, optional) – If True, include the corresponding validation curves when available. Default is False.

  • cv_model (int or {'all'}, optional) – Which CV model’s history to plot, or ‘all’ to overlay every fold. Default is 0.

  • ylabel (str or None, optional) – Custom y-axis label; if None, derived from metric. Default is None.

  • title (str or None, optional) – Custom plot title; if None, derived from metric. Default is None.

  • xlim (tuple or None, optional) – Matplotlib-style (xmin, xmax) limits for the x-axis. Default is None.

  • ylim (tuple or None, optional) – Matplotlib-style (ymin, ymax) limits for the y-axis. Default is None.

  • xlog (bool, optional) – If True, use a logarithmic x-axis. Default is False.

  • ylog (bool, optional) – If True, use a logarithmic y-axis. Default is False.

  • legend_loc (int or str, optional) – Legend location (Matplotlib convention). Default is 9.

  • savefig (bool, optional) – If True, save to ‘CNN_Training_History_<metric>.png’ instead of showing. Default is False.

Return type:

None

Raises:

ValueError – If histories are not available or combine=True without validation metrics.

_plot_positive(index=0, channel=0, default_scale=True, vmin=None, vmax=None, cmap='gray', title='')[source]

Display a single positive-class image (optionally a single channel or colorized).

Parameters:
  • index (int, optional) – Index of the sample within the positive class to display. Default is 0.

  • channel (int or {'all'}, optional) – Channel index (0-based) to display, or ‘all’ to show a colorized composite. Default is 0.

  • default_scale (bool, optional) – If True, use Matplotlib’s default scaling; if False, use vmin/vmax or compute robust limits. Default is True.

  • vmin (float or None, optional) – Lower display limit when default_scale is False; if None, compute robust limits. Default is None.

  • vmax (float or None, optional) – Upper display limit when default_scale is False; if None, compute robust limits. Default is None.

  • cmap (str, optional) – Colormap used when displaying a single channel. Default is ‘gray’.

  • title (str, optional) – Title displayed above the image. Default is ‘’.

Return type:

None

_plot_negative(index=0, channel=0, default_scale=True, vmin=None, vmax=None, cmap='gray', title='')[source]

Display a single negative-class image (optionally a single channel or colorized).

Parameters:
  • index (int, optional) – Index of the sample within the negative class to display. Default is 0.

  • channel (int or {'all'}, optional) – Channel index (0-based) to display, or ‘all’ to show a colorized composite. Default is 0.

  • default_scale (bool, optional) – If True, use Matplotlib’s default scaling; if False, use vmin/vmax or compute robust limits. Default is True.

  • vmin (float or None, optional) – Lower display limit when default_scale is False; if None, compute robust limits. Default is None.

  • vmax (float or None, optional) – Upper display limit when default_scale is False; if None, compute robust limits. Default is None.

  • cmap (str, optional) – Colormap used when displaying a single channel. Default is ‘gray’.

  • title (str, optional) – Title displayed above the image. Default is ‘’.

Return type:

None

pyBIA.cnn_model.custom_model(positive_class, negative_class, img_num_channels=1, normalize=True, min_pixel=0, max_pixel=100, val_positive=None, val_negative=None, epochs=100, batch_size=32, optimizer='sgd', lr=0.0001, momentum=0.9, decay=0.0, nesterov=False, rho=0.9, beta_1=0.9, beta_2=0.999, amsgrad=False, loss='binary_crossentropy', conv_init='uniform_scaling', dense_init='truncated_normal', activation_conv='relu', activation_dense='relu', conv_reg=0, dense_reg=0, padding='same', model_reg='batch_norm', filter_1=256, filter_size_1=7, strides_1=1, pooling_1='average', pool_size_1=3, pool_stride_1=3, filter_2=0, filter_size_2=0, strides_2=0, pooling_2=None, pool_size_2=0, pool_stride_2=0, filter_3=0, filter_size_3=0, strides_3=0, pooling_3=None, pool_size_3=0, pool_stride_3=0, dense_neurons_1=4096, dropout_1=0.5, dense_neurons_2=0, dropout_2=0, dense_neurons_3=0, dropout_3=0, patience=0, metric='binary_accuracy', early_stop_callback=None, checkpoint=False, weight=None, verbose=1, save_training_data=False, path=None)[source]

Build and train a configurable CNN with 1–3 Conv2D(+pool) blocks followed by up to 3 dense layers.

The function optionally normalizes the inputs, shuffles each class, builds the training and validation sets, and trains the network with optional early stopping and checkpointing. Batch normalization is applied after each Conv2D when model_reg='batch_norm'.

Parameters:
  • positive_class (ndarray) – Training images for the positive class. Shape (N, H, W) for single-channel or (N, H, W, C) for multi-channel. Required.

  • negative_class (ndarray) – Training images for the negative class. Shape (N, H, W) or (N, H, W, C). Required.

  • img_num_channels (int, optional) – Number of channels per image (C). Used by preprocessing and input shape. Default is 1.

  • normalize (bool, optional) – If True, apply min–max normalization using min_pixel/max_pixel. Default is True.

  • min_pixel (float, optional) – Lower clip bound used during normalization when normalize is True. Default is 0.

  • max_pixel (float, optional) – Upper clip bound used during normalization when normalize is True. Default is 100.

  • val_positive (ndarray or None, optional) – Validation images for the positive class, same shape convention as training. Default is None.

  • val_negative (ndarray or None, optional) – Validation images for the negative class, same shape convention as training. Default is None.

  • epochs (int, optional) – Number of training epochs. Default is 100.

  • batch_size (int, optional) – Mini-batch size. Default is 32.

  • optimizer ({'sgd','adam','rmsprop','adadelta','adamw'} or str, optional) – Optimizer name understood by get_optimizer. Default is ‘sgd’.

  • lr (float, optional) – Base learning rate passed to the optimizer. Default is 1e-4.

  • momentum (float, optional) – Momentum parameter for SGD-like optimizers. Default is 0.9.

  • decay (float, optional) – Learning-rate decay per epoch (if supported by the optimizer). Default is 0.0.

  • nesterov (bool, optional) – If True, enable Nesterov momentum for SGD. Default is False.

  • rho (float, optional) – Decay factor used by Adadelta/RMSprop-style optimizers. Default is 0.9.

  • beta_1 (float, optional) – First-moment decay for Adam-style optimizers. Default is 0.9.

  • beta_2 (float, optional) – Second-moment decay for Adam-style optimizers. Default is 0.999.

  • amsgrad (bool, optional) – If True, use the AMSGrad variant for Adam-style optimizers. Default is False.

  • loss (str, optional) – Loss identifier passed to get_loss_function (supports class weighting via weight). Default is ‘binary_crossentropy’.

  • conv_init (str or tf.keras.initializers.Initializer, optional) – Convolution kernel initializer. The alias ‘uniform_scaling’ maps to VarianceScaling(scale=1.0, mode='fan_in', distribution='uniform'). Default is ‘uniform_scaling’.

  • dense_init (str or tf.keras.initializers.Initializer, optional) – Dense kernel initializer. The alias ‘uniform_scaling’ maps to the same VarianceScaling initializer as for conv_init. Default is ‘truncated_normal’.

  • activation_conv (str, optional) – Activation applied after each Conv2D (post-BN if enabled). Default is ‘relu’.

  • activation_dense (str, optional) – Activation applied in dense layers (post-BN if enabled). Default is ‘relu’.

  • conv_reg (float, optional) – L2 weight for convolution kernels. Default is 0.

  • dense_reg (float, optional) – L2 weight for dense kernels. Default is 0.

  • padding ({'same','valid'}, optional) – Padding mode for Conv2D and pooling layers. Default is ‘same’.

  • model_reg ({'batch_norm','local_response',None}, optional) – Per-block regularization: batch normalization or local response normalization (LRN after pooling), or None. Default is ‘batch_norm’.

  • filter_1 (int, optional) – Number of filters in Conv2D block 1. Set ≤0 to disable the block. Default is 256.

  • filter_size_1 (int, optional) – Kernel size (square) for block 1 Conv2D. Default is 7.

  • strides_1 (int, optional) – Convolution stride for block 1. Default is 1.

  • pooling_1 ({'max','average','min',None}, optional) – Pooling type after block-1 Conv2D (custom ‘min’ pooling supported). Default is ‘average’.

  • pool_size_1 (int, optional) – Pooling window size for block 1. Default is 3.

  • pool_stride_1 (int, optional) – Pooling stride for block 1. Default is 3.

  • filter_2 (int, optional) – Number of filters in Conv2D block 2. Set ≤0 to disable the block. Default is 0.

  • filter_size_2 (int, optional) – Kernel size (square) for block 2 Conv2D (required if block enabled). Default is 0.

  • strides_2 (int, optional) – Convolution stride for block 2 (required if block enabled). Default is 0.

  • pooling_2 ({'max','average','min',None}, optional) – Pooling type after block-2 Conv2D. Default is None.

  • pool_size_2 (int, optional) – Pooling window size for block 2. Default is 0.

  • pool_stride_2 (int, optional) – Pooling stride for block 2. Default is 0.

  • filter_3 (int, optional) – Number of filters in Conv2D block 3. Set ≤0 to disable the block. Default is 0.

  • filter_size_3 (int, optional) – Kernel size (square) for block 3 Conv2D (required if block enabled). Default is 0.

  • strides_3 (int, optional) – Convolution stride for block 3 (required if block enabled). Default is 0.

  • pooling_3 ({'max','average','min',None}, optional) – Pooling type after block-3 Conv2D. Default is None.

  • pool_size_3 (int, optional) – Pooling window size for block 3. Default is 0.

  • pool_stride_3 (int, optional) – Pooling stride for block 3. Default is 0.

  • dense_neurons_1 (int, optional) – Units in the first fully-connected (dense) layer. Default is 4096.

  • dropout_1 (float, optional) – Dropout rate after the first dense layer (0–1). Default is 0.5.

  • dense_neurons_2 (int, optional) – Units in the second dense layer; set ≤0 to skip the layer. Default is 0.

  • dropout_2 (float, optional) – Dropout rate after the second dense layer. Default is 0.

  • dense_neurons_3 (int, optional) – Units in the third dense layer; set ≤0 to skip the layer. Default is 0.

  • dropout_3 (float, optional) – Dropout rate after the third dense layer. Default is 0.

  • patience (int, optional) – Early-stopping patience (epochs without improvement on metric). A value of 0 disables early stopping. Default is 0.

  • metric ({'loss','val_loss','binary_accuracy','val_binary_accuracy','f1_score','val_f1_score'}, optional) – Metric monitored by early stopping and checkpointing. The token ‘all’ is coerced internally to ‘loss’ or ‘val_loss’. Default is ‘binary_accuracy’.

  • early_stop_callback (keras.callbacks.Callback or None, optional) – Additional callback (e.g., from an external optimizer) to signal pruning. Default is None.

  • checkpoint (bool, optional) – If True, save the best model weights, as judged by metric, to ‘~/checkpoint.hdf5’. Default is False.

  • weight (float or None, optional) – Class weight used by certain custom loss wrappers (see get_loss_function). Default is None.

  • verbose ({0,1,2}, optional) – Keras verbosity level (0=silent, 1=progress bar, 2=one line per epoch). Default is 1.

  • save_training_data (bool, optional) – If True, save the processed training/validation arrays to path. Default is False.

  • path (str or None, optional) – Directory used when saving training data. If None, the home directory is used. Default is None.

Returns:

  • model (tf.keras.Model) – The compiled and trained Keras model (binary sigmoid output).

  • history (tf.keras.callbacks.History) – Keras history object containing per-epoch metrics.

Raises:

ValueError – If any enabled convolutional block is missing its required filter_size_* or strides_* arguments.

Notes

  • Inputs are shuffled within each class before constructing the training set.

  • When validation data are provided, the same normalization/clipping is applied.

  • Batch Normalization can be unstable with very small batch_size; if training diverges (NaNs), try a larger batch size or a smaller learning rate.

  • model_reg=‘local_response’ inserts LRN after pooling in each enabled block.
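
The normalization/clipping mentioned in these notes can be pictured with a small framework-free sketch. It assumes that values are first clipped to [min_pixel, max_pixel] and then rescaled to [0, 1]; the helper name normalize_pixels is hypothetical and the library's own routine may differ in detail.

```python
def normalize_pixels(pixels, min_pixel=0.0, max_pixel=100.0):
    """Clip each value to [min_pixel, max_pixel], then rescale to [0, 1].

    Hypothetical helper for illustration; not part of pyBIA.
    """
    if max_pixel <= min_pixel:
        raise ValueError("max_pixel must be greater than min_pixel")
    span = max_pixel - min_pixel
    normalized = []
    for value in pixels:
        # Clip first so outliers cannot leave the [0, 1] range.
        clipped = min(max(value, min_pixel), max_pixel)
        normalized.append((clipped - min_pixel) / span)
    return normalized


print(normalize_pixels([-5, 0, 50, 100, 250]))
```

Applying the same bounds to training and validation images keeps both sets on one scale, which is the property the note above describes.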

pyBIA.cnn_model.AlexNet(positive_class, negative_class, img_num_channels=1, normalize=True, min_pixel=0, max_pixel=100, val_positive=None, val_negative=None, epochs=100, batch_size=32, optimizer='sgd', lr=0.0001, momentum=0.9, decay=0.0, nesterov=False, rho=0.9, beta_1=0.9, beta_2=0.999, amsgrad=False, loss='binary_crossentropy', conv_init='uniform_scaling', dense_init='truncated_normal', activation_conv='relu', activation_dense='relu', conv_reg=0, dense_reg=0, padding='same', model_reg='local_response', filter_1=96, filter_size_1=11, strides_1=4, pooling_1='max', pool_size_1=3, pool_stride_1=2, filter_2=256, filter_size_2=5, strides_2=1, pooling_2='max', pool_size_2=3, pool_stride_2=2, filter_3=384, filter_size_3=3, strides_3=1, pooling_3='max', pool_size_3=3, pool_stride_3=2, filter_4=384, filter_size_4=3, strides_4=1, filter_5=256, filter_size_5=3, strides_5=1, dense_neurons_1=4096, dense_neurons_2=4096, dropout_1=0.5, dropout_2=0.5, patience=0, metric='binary_accuracy', early_stop_callback=None, checkpoint=False, weight=None, verbose=1, save_training_data=False, path=None)[source]

Build and train an AlexNet-style CNN adapted for binary classification of astronomical images.

The architecture follows the classic Conv–Pool blocks with Local Response Normalization (LRN) by default and provides options for Batch Normalization instead. Inputs can be min–max normalized to mitigate gradient issues; the model trains with optional early stopping and checkpointing.

Parameters:
  • positive_class (ndarray) – Training images for the positive class. Shape (N, H, W) for single-channel or (N, H, W, C) for multi-channel. Required.

  • negative_class (ndarray) – Training images for the negative class. Shape (N, H, W) or (N, H, W, C). Required.

  • img_num_channels (int, optional) – Number of channels per image (C). Used for preprocessing and model input shape. Default is 1.

  • normalize (bool, optional) – If True, apply min–max normalization using min_pixel and max_pixel. Default is True.

  • min_pixel (float, optional) – Lower clip bound used during normalization when normalize is True. Default is 0.

  • max_pixel (float, optional) – Upper clip bound used during normalization when normalize is True. Default is 100.

  • val_positive (ndarray or None, optional) – Validation images for the positive class, same shape convention as training. Default is None.

  • val_negative (ndarray or None, optional) – Validation images for the negative class, same shape convention as training. Default is None.

  • epochs (int, optional) – Number of training epochs. Default is 100.

  • batch_size (int, optional) – Mini-batch size. Default is 32.

  • optimizer ({'sgd','adam','rmsprop','adadelta','adamw'} or str, optional) – Optimizer name understood by get_optimizer. Default is ‘sgd’.

  • lr (float, optional) – Base learning rate passed to the optimizer. Default is 1e-4.

  • momentum (float, optional) – Momentum parameter for SGD-like optimizers. Default is 0.9.

  • decay (float, optional) – Learning-rate decay per epoch (if supported by the optimizer). Default is 0.0.

  • nesterov (bool, optional) – If True, enable Nesterov momentum for SGD. Default is False.

  • rho (float, optional) – Decay factor used by Adadelta/RMSprop-style optimizers. Default is 0.9.

  • beta_1 (float, optional) – First-moment decay for Adam-style optimizers. Default is 0.9.

  • beta_2 (float, optional) – Second-moment decay for Adam-style optimizers. Default is 0.999.

  • amsgrad (bool, optional) – If True, use the AMSGrad variant for Adam-style optimizers. Default is False.

  • loss (str, optional) – Loss identifier passed to get_loss_function (supports class weighting via weight). Default is ‘binary_crossentropy’.

  • conv_init (str or tf.keras.initializers.Initializer, optional) – Convolution kernel initializer. The alias ‘uniform_scaling’ maps to VarianceScaling(scale=1.0, mode=‘fan_in’, distribution=‘uniform’). Default is ‘uniform_scaling’.

  • dense_init (str or tf.keras.initializers.Initializer, optional) – Dense kernel initializer. The alias ‘uniform_scaling’ maps as above when used. Default is ‘truncated_normal’.

  • activation_conv (str, optional) – Activation applied after each Conv2D (post-BN if enabled). Default is ‘relu’.

  • activation_dense (str, optional) – Activation applied in dense layers (post-BN if enabled). Default is ‘relu’.

  • conv_reg (float, optional) – L2 weight for convolution kernels. Default is 0.

  • dense_reg (float, optional) – L2 weight for dense kernels. Default is 0.

  • padding ({'same','valid'}, optional) – Padding mode for Conv2D and pooling layers. Default is ‘same’.

  • model_reg ({'batch_norm','local_response',None}, optional) – Block-level regularization: Batch Normalization (after Conv2D), Local Response Normalization (after pooling, AlexNet-style), or None. Default is ‘local_response’.

  • filter_1 (int, optional) – Number of filters in Conv2D block 1. Default is 96.

  • filter_size_1 (int, optional) – Kernel size (square) for block-1 Conv2D. Default is 11.

  • strides_1 (int, optional) – Convolution stride for block-1. Default is 4.

  • pooling_1 ({'max','average','min',None}, optional) – Pooling type after block-1 Conv2D (custom ‘min’ pooling supported). Default is ‘max’.

  • pool_size_1 (int, optional) – Pooling window size for block-1. Default is 3.

  • pool_stride_1 (int, optional) – Pooling stride for block-1. Default is 2.

  • filter_2 (int, optional) – Number of filters in Conv2D block 2. Default is 256.

  • filter_size_2 (int, optional) – Kernel size (square) for block-2 Conv2D. Default is 5.

  • strides_2 (int, optional) – Convolution stride for block-2. Default is 1.

  • pooling_2 ({'max','average','min',None}, optional) – Pooling type after block-2 Conv2D. Default is ‘max’.

  • pool_size_2 (int, optional) – Pooling window size for block-2. Default is 3.

  • pool_stride_2 (int, optional) – Pooling stride for block-2. Default is 2.

  • filter_3 (int, optional) – Number of filters in Conv2D block 3. Default is 384.

  • filter_size_3 (int, optional) – Kernel size (square) for block-3 Conv2D. Default is 3.

  • strides_3 (int, optional) – Convolution stride for block-3. Default is 1.

  • pooling_3 ({'max','average','min',None}, optional) – Pooling type for the third (final) pooling stage; in this variant it is applied after block-5 rather than directly after block-3. Default is ‘max’.

  • pool_size_3 (int, optional) – Pooling window size for the final pooling stage. Default is 3.

  • pool_stride_3 (int, optional) – Pooling stride for the final pooling stage. Default is 2.

  • filter_4 (int, optional) – Number of filters in Conv2D block 4. Default is 384.

  • filter_size_4 (int, optional) – Kernel size (square) for block-4 Conv2D. Default is 3.

  • strides_4 (int, optional) – Convolution stride for block-4. Default is 1.

  • filter_5 (int, optional) – Number of filters in Conv2D block 5. Default is 256.

  • filter_size_5 (int, optional) – Kernel size (square) for block-5 Conv2D. Default is 3.

  • strides_5 (int, optional) – Convolution stride for block-5. Default is 1.

  • dense_neurons_1 (int, optional) – Units in the first fully-connected (dense) layer. Default is 4096.

  • dense_neurons_2 (int, optional) – Units in the second fully-connected (dense) layer. Default is 4096.

  • dropout_1 (float, optional) – Dropout rate after the first dense layer (0–1). Default is 0.5.

  • dropout_2 (float, optional) – Dropout rate after the second dense layer (0–1). Default is 0.5.

  • patience (int, optional) – Early-stopping patience (epochs without improvement on metric). A value of 0 disables early stopping. Default is 0.

  • metric ({'loss','val_loss','binary_accuracy','val_binary_accuracy','f1_score','val_f1_score','all'}, optional) – Metric monitored by early stopping and checkpointing. The token ‘all’ is coerced internally to a single monitor (‘loss’ or ‘val_loss’). Default is ‘binary_accuracy’.

  • early_stop_callback (keras.callbacks.Callback or None, optional) – Additional callback (e.g., from an external optimizer) to signal pruning. Default is None.

  • checkpoint (bool, optional) – If True, save the best model weights to ‘~/checkpoint.hdf5’ monitored by metric. Default is False.

  • weight (float or None, optional) – Class weight used by certain custom loss wrappers (see get_loss_function). Default is None.

  • verbose ({0,1,2}, optional) – Keras verbosity level (0=silent, 1=progress bar, 2=one line per epoch). Default is 1.

  • save_training_data (bool, optional) – If True, save the processed training/validation arrays to path. Default is False.

  • path (str or None, optional) – Directory used when saving training data. If None, the home directory is used. Default is None.

Returns:

  • model (tf.keras.Model) – The compiled and trained Keras model (single-neuron sigmoid output for binary classification).

  • history (tf.keras.callbacks.History) – Keras history object containing per-epoch metrics.

Notes

  • Inputs are shuffled within each class before constructing the training set.

  • When validation data are provided, the same normalization/clipping is applied.

  • Batch Normalization can be unstable with very small batch_size; if training diverges (NaNs), try a larger batch size or a smaller learning rate.

  • model_reg=‘local_response’ inserts LRN after pooling (AlexNet-style); model_reg=‘batch_norm’ places BatchNorm after Conv2D and before the activation.
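
To see how the default strides and pooling settings shrink an image, the following sketch traces one spatial dimension through the default layer order (conv1, pool1, conv2, pool2, conv3 to conv5, pool3). It relies on the Keras output-size convention for ‘same’ and ‘valid’ padding; the 70-pixel input is an assumption taken from the Classifier default image_size, and the helper names are hypothetical.

```python
import math


def out_size(size, kernel, stride, padding="same"):
    """Spatial output size of a Conv2D or pooling layer (Keras convention)."""
    if padding == "same":
        return math.ceil(size / stride)
    if padding == "valid":
        return (size - kernel) // stride + 1
    raise ValueError(f"unknown padding: {padding!r}")


def alexnet_trace(size=70, padding="same"):
    """Follow one spatial dimension through the default AlexNet layers."""
    # (name, kernel or pool size, stride), using the documented defaults.
    layers = [
        ("conv1", 11, 4), ("pool1", 3, 2),
        ("conv2", 5, 1), ("pool2", 3, 2),
        ("conv3", 3, 1), ("conv4", 3, 1), ("conv5", 3, 1),
        ("pool3", 3, 2),
    ]
    trace = []
    for name, kernel, stride in layers:
        size = out_size(size, kernel, stride, padding)
        trace.append((name, size))
    return trace


for name, size in alexnet_trace():
    print(f"{name}: {size} x {size}")
```

Under these assumptions a 70-pixel input ends at 3 x 3 with padding=‘same’, whereas padding=‘valid’ reaches a non-positive size at conv3, so small cutouts generally need ‘same’ padding or gentler strides.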

pyBIA.cnn_model.VGG16(positive_class, negative_class, img_num_channels=1, normalize=True, min_pixel=0, max_pixel=100, val_positive=None, val_negative=None, epochs=100, batch_size=32, optimizer='sgd', lr=0.0001, momentum=0.9, decay=0.0, nesterov=False, rho=0.9, beta_1=0.9, beta_2=0.999, amsgrad=False, loss='binary_crossentropy', conv_init='uniform_scaling', dense_init='truncated_normal', activation_conv='relu', activation_dense='relu', conv_reg=0, dense_reg=0, padding='same', model_reg=None, filter_1=64, filter_size_1=3, strides_1=1, pooling_1='max', pool_size_1=2, pool_stride_1=2, filter_2=128, filter_size_2=3, strides_2=1, pooling_2='max', pool_size_2=2, pool_stride_2=2, filter_3=256, filter_size_3=3, strides_3=1, pooling_3='max', pool_size_3=2, pool_stride_3=2, filter_4=512, filter_size_4=3, strides_4=1, pooling_4='max', pool_size_4=2, pool_stride_4=2, filter_5=512, filter_size_5=3, strides_5=1, pooling_5='max', pool_size_5=2, pool_stride_5=2, dense_neurons_1=4096, dense_neurons_2=4096, dropout_1=0.5, dropout_2=0.5, patience=0, metric='binary_accuracy', early_stop_callback=None, checkpoint=False, weight=None, verbose=1, save_training_data=False, path=None)[source]

Build and train a VGG16-style CNN for binary classification of astronomical images.

The model follows five convolutional blocks (with small 3×3 kernels) and two fully-connected layers, with configurable pooling and optional Batch Normalization (after Conv2D) or Local Response Normalization (after pooling). Inputs can be min–max normalized prior to training.

Parameters:
  • positive_class (ndarray) – Training images for the positive class. Shape (N, H, W) or (N, H, W, C). Required.

  • negative_class (ndarray) – Training images for the negative class. Shape (N, H, W) or (N, H, W, C). Required.

  • img_num_channels (int, optional) – Number of channels per image (C) for preprocessing/input shape. Default is 1.

  • normalize (bool, optional) – If True, apply min–max normalization using min_pixel/max_pixel. Default is True.

  • min_pixel (float, optional) – Lower clip bound used during normalization when normalize is True. Default is 0.

  • max_pixel (float, optional) – Upper clip bound used during normalization when normalize is True. Default is 100.

  • val_positive (ndarray or None, optional) – Validation images for the positive class, same shape convention as training. Default is None.

  • val_negative (ndarray or None, optional) – Validation images for the negative class, same shape convention as training. Default is None.

  • epochs (int, optional) – Number of training epochs. Default is 100.

  • batch_size (int, optional) – Mini-batch size. Default is 32.

  • optimizer ({'sgd','adam','rmsprop','adadelta','adamw'} or str, optional) – Optimizer identifier understood by get_optimizer. Default is ‘sgd’.

  • lr (float, optional) – Base learning rate passed to the optimizer. Default is 1e-4.

  • momentum (float, optional) – Momentum for SGD-like optimizers. Default is 0.9.

  • decay (float, optional) – Learning-rate decay per epoch (if supported by optimizer). Default is 0.0.

  • nesterov (bool, optional) – If True, enable Nesterov momentum for SGD. Default is False.

  • rho (float, optional) – Decay factor used by Adadelta/RMSprop-style optimizers. Default is 0.9.

  • beta_1 (float, optional) – First-moment decay for Adam-style optimizers. Default is 0.9.

  • beta_2 (float, optional) – Second-moment decay for Adam-style optimizers. Default is 0.999.

  • amsgrad (bool, optional) – If True, use the AMSGrad variant of Adam. Default is False.

  • loss (str, optional) – Loss identifier passed to get_loss_function (supports class weighting via weight). Default is ‘binary_crossentropy’.

  • conv_init (str or tf.keras.initializers.Initializer, optional) – Convolution kernel initializer; ‘uniform_scaling’ maps to VarianceScaling(…). Default is ‘uniform_scaling’.

  • dense_init (str or tf.keras.initializers.Initializer, optional) – Dense kernel initializer; ‘uniform_scaling’ maps to VarianceScaling(…). Default is ‘truncated_normal’.

  • activation_conv (str, optional) – Activation applied after each Conv2D (post-BN if enabled). Default is ‘relu’.

  • activation_dense (str, optional) – Activation applied in dense layers (post-BN if enabled). Default is ‘relu’.

  • conv_reg (float, optional) – L2 weight for convolution kernels. Default is 0.

  • dense_reg (float, optional) – L2 weight for dense kernels. Default is 0.

  • padding ({'same','valid'}, optional) – Padding mode for Conv2D and pooling layers. Default is ‘same’.

  • model_reg ({'batch_norm','local_response',None}, optional) – Block regularization: BatchNorm (after Conv2D), LRN (after pooling), or None. Default is None.

  • filter_1 (int, optional) – Number of filters in block-1 Conv2D layers. Default is 64.

  • filter_size_1 (int, optional) – Kernel size (square) for block-1 Conv2D. Default is 3.

  • strides_1 (int, optional) – Convolution stride for block-1. Default is 1.

  • pooling_1 ({'max','average','min',None}, optional) – Pooling type after block-1. Default is ‘max’.

  • pool_size_1 (int, optional) – Pooling window size for block-1. Default is 2.

  • pool_stride_1 (int, optional) – Pooling stride for block-1. Default is 2.

  • filter_2 (int, optional) – Number of filters in block-2 Conv2D layers. Default is 128.

  • filter_size_2 (int, optional) – Kernel size (square) for block-2 Conv2D. Default is 3.

  • strides_2 (int, optional) – Convolution stride for block-2. Default is 1.

  • pooling_2 ({'max','average','min',None}, optional) – Pooling type after block-2. Default is ‘max’.

  • pool_size_2 (int, optional) – Pooling window size for block-2. Default is 2.

  • pool_stride_2 (int, optional) – Pooling stride for block-2. Default is 2.

  • filter_3 (int, optional) – Number of filters in block-3 Conv2D layers. Default is 256.

  • filter_size_3 (int, optional) – Kernel size (square) for block-3 Conv2D. Default is 3.

  • strides_3 (int, optional) – Convolution stride for block-3. Default is 1.

  • pooling_3 ({'max','average','min',None}, optional) – Pooling type after block-3. Default is ‘max’.

  • pool_size_3 (int, optional) – Pooling window size for block-3. Default is 2.

  • pool_stride_3 (int, optional) – Pooling stride for block-3. Default is 2.

  • filter_4 (int, optional) – Number of filters in block-4 Conv2D layers. Default is 512.

  • filter_size_4 (int, optional) – Kernel size (square) for block-4 Conv2D. Default is 3.

  • strides_4 (int, optional) – Convolution stride for block-4. Default is 1.

  • pooling_4 ({'max','average','min',None}, optional) – Pooling type after block-4. Default is ‘max’.

  • pool_size_4 (int, optional) – Pooling window size for block-4. Default is 2.

  • pool_stride_4 (int, optional) – Pooling stride for block-4. Default is 2.

  • filter_5 (int, optional) – Number of filters in block-5 Conv2D layers. Default is 512.

  • filter_size_5 (int, optional) – Kernel size (square) for block-5 Conv2D. Default is 3.

  • strides_5 (int, optional) – Convolution stride for block-5. Default is 1.

  • pooling_5 ({'max','average','min',None}, optional) – Pooling type after block-5. Default is ‘max’.

  • pool_size_5 (int, optional) – Pooling window size for block-5. Default is 2.

  • pool_stride_5 (int, optional) – Pooling stride for block-5. Default is 2.

  • dense_neurons_1 (int, optional) – Units in the first fully-connected (dense) layer. Default is 4096.

  • dense_neurons_2 (int, optional) – Units in the second fully-connected (dense) layer. Default is 4096.

  • dropout_1 (float, optional) – Dropout rate after the first dense layer (0–1). Default is 0.5.

  • dropout_2 (float, optional) – Dropout rate after the second dense layer (0–1). Default is 0.5.

  • patience (int, optional) – Early-stopping patience (epochs without improvement on metric); 0 disables early stopping. Default is 0.

  • metric ({'loss','val_loss','binary_accuracy','val_binary_accuracy','f1_score','val_f1_score','all'}, optional) – Metric monitored by early stopping/checkpointing; ‘all’ is coerced internally to a single monitor. Default is ‘binary_accuracy’.

  • early_stop_callback (keras.callbacks.Callback or None, optional) – Additional callback (e.g., for pruning within external HPO). Default is None.

  • checkpoint (bool, optional) – If True, save best model weights to ‘~/checkpoint.hdf5’ monitored by metric. Default is False.

  • weight (float or None, optional) – Class weight used by certain custom loss wrappers (see get_loss_function). Default is None.

  • verbose ({0,1,2}, optional) – Keras verbosity level (0=silent, 1=progress bar, 2=one line/epoch). Default is 1.

  • save_training_data (bool, optional) – If True, save processed training/validation arrays to path. Default is False.

  • path (str or None, optional) – Directory used when saving training data; home directory is used if None. Default is None.

Returns:

  • model (tf.keras.Model) – The compiled and trained Keras model (single-neuron sigmoid output for binary classification).

  • history (tf.keras.callbacks.History) – Keras history object with per-epoch metrics.

Notes

  • Inputs are shuffled within each class prior to constructing the training set.

  • When validation data are provided, the same normalization/clipping is applied.

  • Batch Normalization can be unstable with very small batch_size; if training diverges (NaNs), try a larger batch size or a smaller learning rate.

pyBIA.cnn_model.Resnet18(positive_class, negative_class, img_num_channels=1, normalize=True, min_pixel=0, max_pixel=100, val_positive=None, val_negative=None, epochs=100, batch_size=32, optimizer='sgd', lr=0.0001, momentum=0.9, decay=0.0, nesterov=False, rho=0.9, beta_1=0.9, beta_2=0.999, amsgrad=False, loss='binary_crossentropy', conv_init='uniform_scaling', dense_init='truncated_normal', activation_conv='relu', activation_dense='relu', conv_reg=0, dense_reg=0, padding='same', model_reg=None, filters=64, filter_size=7, strides=2, pooling='max', pool_size=3, pool_stride=2, block_filters_1=64, block_filters_2=128, block_filters_3=256, block_filters_4=512, block_filters_size=3, patience=0, metric='binary_accuracy', early_stop_callback=None, checkpoint=False, weight=None, verbose=1, save_training_data=False, path=None)[source]

Build and train a ResNet-18–style CNN with configurable stem, residual block widths, optimization hyperparameters, and optional Batch Normalization in the stem and residual blocks.

Parameters:
  • positive_class (ndarray) – Training images for the positive class, shape (N, H, W) or (N, H, W, C). Required.

  • negative_class (ndarray) – Training images for the negative class, shape (N, H, W) or (N, H, W, C). Required.

  • img_num_channels (int, optional) – Number of channels per image (C) for preprocessing/input shape. Default is 1.

  • normalize (bool, optional) – If True, apply min–max normalization using min_pixel/max_pixel. Default is True.

  • min_pixel (float, optional) – Lower clip bound used during normalization when normalize is True. Default is 0.

  • max_pixel (float, optional) – Upper clip bound used during normalization when normalize is True. Default is 100.

  • val_positive (ndarray or None, optional) – Validation images for the positive class, same shape convention as training. Default is None.

  • val_negative (ndarray or None, optional) – Validation images for the negative class, same shape convention as training. Default is None.

  • epochs (int, optional) – Number of training epochs. Default is 100.

  • batch_size (int, optional) – Mini-batch size. Default is 32.

  • optimizer ({'sgd','adam','rmsprop','adadelta','adamw'} or str, optional) – Optimizer identifier understood by get_optimizer. Default is ‘sgd’.

  • lr (float, optional) – Base learning rate passed to the optimizer. Default is 1e-4.

  • momentum (float, optional) – Momentum for SGD-like optimizers. Default is 0.9.

  • decay (float, optional) – Learning-rate decay per epoch (if supported by optimizer). Default is 0.0.

  • nesterov (bool, optional) – If True, enable Nesterov momentum for SGD. Default is False.

  • rho (float, optional) – Decay factor used by Adadelta/RMSprop-style optimizers. Default is 0.9.

  • beta_1 (float, optional) – First-moment decay for Adam-style optimizers. Default is 0.9.

  • beta_2 (float, optional) – Second-moment decay for Adam-style optimizers. Default is 0.999.

  • amsgrad (bool, optional) – If True, use the AMSGrad variant of Adam. Default is False.

  • loss (str, optional) – Loss identifier passed to get_loss_function (supports class weighting via weight). Default is ‘binary_crossentropy’.

  • conv_init (str or tf.keras.initializers.Initializer, optional) – Convolution kernel initializer; ‘uniform_scaling’ maps to VarianceScaling(…). Default is ‘uniform_scaling’.

  • dense_init (str or tf.keras.initializers.Initializer, optional) – Dense kernel initializer; ‘truncated_normal’ or initializer instance. Default is ‘truncated_normal’.

  • activation_conv (str, optional) – Activation used inside residual blocks and stem (post-BN if enabled). Default is ‘relu’.

  • activation_dense (str, optional) – Activation for dense layers (not used in standard ResNet-18 head). Default is ‘relu’.

  • conv_reg (float, optional) – L2 weight for convolution kernels. Default is 0.

  • dense_reg (float, optional) – L2 weight for dense kernels (not used by the default single-neuron head). Default is 0.

  • padding ({'same','valid'}, optional) – Convolution padding mode within residual blocks. Default is ‘same’.

  • model_reg ({'batch_norm', None}, optional) – If ‘batch_norm’, apply Batch Normalization in the stem and blocks; otherwise omit. Default is None.

  • filters (int, optional) – Number of filters in the stem 7×7 convolution. Default is 64.

  • filter_size (int, optional) – Kernel size (square) of the stem convolution. Default is 7.

  • strides (int, optional) – Stride of the stem convolution. Default is 2.

  • pooling ({'max','average','min'}, optional) – Pooling type after the stem convolution. Default is ‘max’.

  • pool_size (int, optional) – Pooling window size for the stem pooling layer. Default is 3.

  • pool_stride (int, optional) – Stride for the stem pooling layer. Default is 2.

  • block_filters_1 (int, optional) – Filters in stage-1 residual blocks (two blocks, no downsample). Default is 64.

  • block_filters_2 (int, optional) – Filters in stage-2 residual blocks (first block downsamples). Default is 128.

  • block_filters_3 (int, optional) – Filters in stage-3 residual blocks (first block downsamples). Default is 256.

  • block_filters_4 (int, optional) – Filters in stage-4 residual blocks (first block downsamples). Default is 512.

  • block_filters_size (int, optional) – Kernel size (square) for all residual block convolutions. Default is 3.

  • patience (int, optional) – Early-stopping patience (epochs without improvement on metric); 0 disables early stopping. Default is 0.

  • metric ({'loss','val_loss','binary_accuracy','val_binary_accuracy','f1_score','val_f1_score','all'}, optional) – Metric monitored by early stopping/checkpointing; ‘all’ is coerced internally to a single monitor. Default is ‘binary_accuracy’.

  • early_stop_callback (keras.callbacks.Callback or None, optional) – Additional callback (e.g., for pruning within external HPO). Default is None.

  • checkpoint (bool, optional) – If True, save best model weights to ‘~/checkpoint.hdf5’ monitored by metric. Default is False.

  • weight (float or None, optional) – Class weight used by certain custom loss wrappers (see get_loss_function). Default is None.

  • verbose ({0,1,2}, optional) – Keras verbosity level (0=silent, 1=progress bar, 2=one line/epoch). Default is 1.

  • save_training_data (bool, optional) – If True, save processed training/validation arrays to path. Default is False.

  • path (str or None, optional) – Directory used when saving training data; home directory is used if None. Default is None.

Returns:

  • model (tf.keras.Model) – The compiled and trained Keras model (global-average-pooling head with sigmoid output).

  • history (tf.keras.callbacks.History) – Keras history object with per-epoch metrics.

Notes

  • Inputs are shuffled within each class prior to constructing the training set; when validation arrays are provided and normalize is True, they receive the same normalization and clipping.

  • The stem applies ZeroPadding2D(3) before the 7×7 stride-2 convolution, followed by pooling.

  • Four residual stages are built as (2, 2, 2, 2) basic blocks; stages 2–4 downsample in the first block.

  • Batch Normalization can be unstable with very small batch_size; if training diverges (NaNs), try a larger batch size or a smaller learning rate.

pyBIA.cnn_model.resnet_block(x, filters_in, filters_out, filter_size=3, activation='relu', stride=1, padding='same', kernel_initializer='he_normal', model_reg='batch_norm')[source]

Basic residual block with two conv layers and an identity/projection skip.

Parameters:
  • x (tf.Tensor) – Input feature map tensor to be transformed by the block. Required.

  • filters_in (int) – Number of channels in the input tensor (used to decide if projection is needed). Required.

  • filters_out (int) – Number of output channels for both convolutions (and the projection, if used). Required.

  • filter_size (int, optional) – Square kernel size for both convolutions in the block. Default is 3.

  • activation (str, optional) – Activation applied after BatchNorm (or directly after conv if BN is disabled). Default is ‘relu’.

  • stride (int, optional) – Stride of the first convolution (and projection); controls downsampling. Default is 1.

  • padding ({'same','valid'}, optional) – Convolution padding for both conv layers in the block. Default is ‘same’.

  • kernel_initializer (str or tf.keras.initializers.Initializer, optional) – Kernel initializer for all conv layers and the projection. Default is ‘he_normal’.

  • model_reg ({'batch_norm', None}, optional) – If ‘batch_norm’, apply BatchNormalization after each conv; otherwise omit normalization. Default is ‘batch_norm’.

Returns:

Output tensor after two convolutions and residual addition (with projection when shape/stride differs).

Return type:

tf.Tensor

Notes

  • A 1×1 projection on the skip path is used when stride != 1 or filters_in != filters_out.

  • The final activation is applied after the residual addition.
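
The projection rule in the notes can be checked against the (2, 2, 2, 2) stage layout documented for Resnet18. The sketch below is a plain-Python planning aid, not the library's code: it assumes that the downsampling blocks use stride 2 and that the stem feeds 64 channels into stage 1 (the default filters value); the function names are hypothetical.

```python
def needs_projection(filters_in, filters_out, stride):
    """Rule stated in the notes: project when stride or channel count changes."""
    return stride != 1 or filters_in != filters_out


def resnet18_plan(stem_filters=64, block_filters=(64, 128, 256, 512)):
    """List (stage, block, filters_in, filters_out, stride, projection)."""
    plan = []
    filters_in = stem_filters
    for stage, filters_out in enumerate(block_filters, start=1):
        for block in (1, 2):
            # Assumption: stages 2-4 downsample with stride 2 in block 1.
            stride = 2 if (stage > 1 and block == 1) else 1
            plan.append((stage, block, filters_in, filters_out, stride,
                         needs_projection(filters_in, filters_out, stride)))
            filters_in = filters_out
    return plan


for row in resnet18_plan():
    print(row)
```

With the default widths, only the first block of stages 2 to 4 needs the 1×1 projection; every other block can use an identity skip.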

pyBIA.cnn_model.f1_score(y_true, y_pred)[source]

Binary F1 score (harmonic mean of precision and recall) for the current batch.

Parameters:
  • y_true (tf.Tensor) – Ground-truth binary labels with shape (N,) or (N, 1). Values are expected to be 0 or 1.

  • y_pred (tf.Tensor) – Model outputs with shape matching y_true. Values are expected to be probabilities in [0, 1] (e.g., sigmoid outputs). A fixed threshold of 0.5 is applied internally via rounding.

Returns:

Scalar tensor containing the batch F1 score in [0, 1].

Return type:

tf.Tensor
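
As a framework-free illustration of the batch F1 computation described above, the sketch below rounds probabilities at 0.5 and combines precision and recall. The small eps term guarding against division by zero is an assumption, and f1_score_sketch is a hypothetical name; the library function operates on tensors rather than lists.

```python
def f1_score_sketch(y_true, y_pred, eps=1e-7):
    """Batch F1 from 0/1 labels and probabilities (hypothetical helper)."""
    # round() resolves exact ties to the even integer, so 0.5 maps to 0.
    labels = [round(min(max(t, 0.0), 1.0)) for t in y_true]
    preds = [round(min(max(p, 0.0), 1.0)) for p in y_pred]
    tp = sum(t * p for t, p in zip(labels, preds))
    predicted_pos = sum(preds)
    actual_pos = sum(labels)
    precision = tp / (predicted_pos + eps)
    recall = tp / (actual_pos + eps)
    return 2 * precision * recall / (precision + recall + eps)


print(f1_score_sketch([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

Because the score is computed per batch, averaging it over batches is not the same as the F1 score of the full dataset.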

pyBIA.cnn_model.calculate_tp_fp(model, sample, y_true)[source]

Compute batch true positives (TP) and false positives (FP) from model predictions.

The model’s predicted probabilities are thresholded at 0.5 (via rounding) to obtain binary predictions. Labels are clipped to [0, 1] and rounded. Sums are computed over the batch, returning scalar tensors.

Parameters:
  • model (tf.keras.Model or compatible) – Trained model providing predict(sample) → probabilities for the positive class.

  • sample (ndarray or tf.Tensor) – Input batch for inference, typically with shape (N, H, W, C) or (N, d). No default; must be provided.

  • y_true (ndarray or tf.Tensor) – Ground-truth labels for sample. Shape (N,) or (N, 1) with values in {0, 1}. One-hot labels must be preconverted to a single positive column. No default.

Returns:

  • tp (tf.Tensor) – Scalar tensor equal to the count of true positives in the batch.

  • fp (tf.Tensor) – Scalar tensor equal to the count of false positives in the batch.
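
The counting logic can be sketched without TensorFlow. ConstantModel below is a hypothetical stand-in that only mimics a predict method, and tp_fp_sketch is an illustrative name; the real function returns scalar tensors instead of Python integers.

```python
class ConstantModel:
    """Hypothetical stand-in exposing a Keras-like predict method."""

    def __init__(self, probabilities):
        self.probabilities = probabilities

    def predict(self, sample):
        # Ignores the inputs and returns one stored probability per item.
        return self.probabilities[:len(sample)]


def tp_fp_sketch(model, sample, y_true):
    """Count true and false positives after rounding at 0.5."""
    preds = [round(p) for p in model.predict(sample)]
    # Labels are clipped to [0, 1] and rounded, as described above.
    labels = [round(min(max(t, 0), 1)) for t in y_true]
    tp = sum(t * p for t, p in zip(labels, preds))
    fp = sum((1 - t) * p for t, p in zip(labels, preds))
    return tp, fp


model = ConstantModel([0.9, 0.8, 0.2, 0.7])
print(tp_fp_sketch(model, [None] * 4, [1, 0, 1, 0]))
```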

pyBIA.cnn_model.focal_loss(y_true, y_pred, gamma=2.0, alpha=0.25)[source]

Binary focal loss for imbalanced classification (Lin et al., 2017).

Down-weights easy examples and focuses training on hard, misclassified ones. This implementation uses binary cross-entropy with from_logits=True, so y_pred must be raw logits (pre-sigmoid).

Parameters:
  • y_true (tf.Tensor) – Ground-truth binary labels in {0, 1}; shape broadcastable to y_pred. No default; must be provided.

  • y_pred (tf.Tensor) – Model outputs as logits (before sigmoid); same shape as y_true. No default; must be provided.

  • gamma (float, optional) – Focusing parameter; larger values increase down-weighting of easy examples. Default is 2.0.

  • alpha (float, optional) – Global weighting factor for the loss (often the positive-class weight). Default is 0.25.

Returns:

Element-wise focal loss with the same shape as y_true. Reduce with tf.reduce_mean or tf.reduce_sum for a scalar loss.

Return type:

tf.Tensor
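
For a single example, one common way to write a focal loss on logits is alpha * (1 − p_t)^gamma * BCE, where p_t = exp(−BCE) is the probability assigned to the true class. The sketch below uses that form as an assumption consistent with the description above (global alpha, logits as input); it is not taken from the library source, and the function names are hypothetical.

```python
import math


def bce_from_logit(y_true, logit):
    """Numerically stable binary cross-entropy for one logit."""
    return max(logit, 0.0) - logit * y_true + math.log1p(math.exp(-abs(logit)))


def focal_loss_sketch(y_true, logit, gamma=2.0, alpha=0.25):
    """Assumed form: alpha * (1 - p_t)**gamma * BCE, with p_t = exp(-BCE)."""
    bce = bce_from_logit(y_true, logit)
    p_t = math.exp(-bce)  # probability assigned to the true class
    return alpha * (1.0 - p_t) ** gamma * bce


easy = focal_loss_sketch(1, 4.0)    # confident and correct
hard = focal_loss_sketch(1, -4.0)   # confident and wrong
print(easy, hard)
```

The confidently correct example contributes far less than the confidently wrong one, which is the down-weighting effect that gamma controls.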

pyBIA.cnn_model.dice_loss(y_true, y_pred, smooth=1e-07)[source]

Dice loss for binary segmentation.

Parameters:
  • y_true (tf.Tensor) – Ground-truth binary mask in {0, 1}; shape broadcastable to y_pred. No default; must be provided.

  • y_pred (tf.Tensor) – Predicted mask as probabilities in [0, 1] (apply sigmoid if logits); same shape as y_true. No default; must be provided.

  • smooth (float, optional) – Smoothing constant added to numerator and denominator for numerical stability. Default is 1e-7.

Returns:

Scalar Dice loss for the batch (1 − Dice coefficient).

Return type:

tf.Tensor

pyBIA.cnn_model.jaccard_loss(y_true, y_pred, smooth=1e-07)[source]

Jaccard (IoU) loss for binary segmentation.

Parameters:
  • y_true (tf.Tensor) – Ground-truth binary mask in {0, 1}; shape broadcastable to y_pred. No default; must be provided.

  • y_pred (tf.Tensor) – Predicted mask as probabilities in [0, 1] (apply sigmoid if logits); same shape as y_true. No default; must be provided.

  • smooth (float, optional) – Smoothing constant added to numerator and denominator for numerical stability. Default is 1e-7.

Returns:

Scalar Jaccard loss for the batch (1 − IoU).

Return type:

tf.Tensor
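
The same kind of worked example applies to "1 − IoU". The sketch below uses the common soft-Jaccard form over flat Python lists; jaccard_loss_sketch is a hypothetical name, and the placement of smooth follows the parameter description rather than the source code.

```python
def jaccard_loss_sketch(y_true, y_pred, smooth=1e-7):
    """Soft Jaccard (IoU) loss over flattened masks (assumed formulation)."""
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    union = sum(y_true) + sum(y_pred) - intersection
    iou = (intersection + smooth) / (union + smooth)
    return 1.0 - iou


# One of three "on" pixels is shared, so IoU is about 1/3 and the loss about 2/3.
loss = jaccard_loss_sketch([1, 1, 0, 0], [1.0, 0.0, 1.0, 0.0])
print(round(loss, 4))
```

For the same pair of masks the Jaccard loss is never smaller than the Dice loss, because IoU ≤ Dice.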

pyBIA.cnn_model.weighted_binary_crossentropy(weight)[source]

Weighted binary cross-entropy (positive-class scaling).

Returns a Keras-compatible loss function that computes binary cross-entropy with the positive term multiplied by weight. Use this to counter class imbalance by up-weighting positives (weight > 1) or down-weighting them (0 < weight < 1).

Parameters:

weight (float) – Non-negative scalar applied to the positive class term. Values > 1 increase the penalty for false negatives; values in (0, 1) decrease it. No default; must be provided.

Returns:

A loss function loss(y_true, y_pred) that returns the mean weighted binary cross-entropy over the last axis.

Return type:

Callable[[tf.Tensor, tf.Tensor], tf.Tensor]
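
The factory pattern described above (a function that returns a loss closure) can be sketched in plain Python. The name weighted_bce_sketch and the clipping constant eps are assumptions for illustration; the library version operates on tensors and may clip or reduce differently.

```python
import math


def weighted_bce_sketch(weight, eps=1e-7):
    """Return loss(y_true, y_pred) with the positive term scaled by weight."""
    def loss(y_true, y_pred):
        terms = []
        for t, p in zip(y_true, y_pred):
            p = min(max(p, eps), 1.0 - eps)  # assumed clipping to avoid log(0)
            terms.append(-(weight * t * math.log(p) + (1 - t) * math.log(1.0 - p)))
        return sum(terms) / len(terms)       # mean over the examples
    return loss


y, p = [1, 0], [0.5, 0.5]
plain = weighted_bce_sketch(1.0)(y, p)      # ordinary binary cross-entropy
upweighted = weighted_bce_sketch(3.0)(y, p) # positives count three times as much
print(plain, upweighted)
```

Only the positive example's contribution changes with weight; the negative term is identical in both calls.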

pyBIA.cnn_model.get_optimizer(optimizer, lr, momentum=None, decay=None, rho=0.9, nesterov=False, beta_1=0.9, beta_2=0.999, amsgrad=False)[source]

Return a configured Keras optimizer instance.

Parameters:
  • optimizer ({'sgd','adam','adamax','nadam','adadelta','rmsprop'}) – Optimizer name to instantiate. No default.

  • lr (float) – Learning rate for the optimizer. No default.

  • momentum (float, optional) – Momentum term for SGD (ignored by other optimizers). Default is None.

  • decay (float, optional) – Learning-rate time decay (not used by this function). Default is None.

  • rho (float, optional) – Discounting factor for the moving average of squared grads (Adadelta/RMSprop). Default is 0.9.

  • nesterov (bool, optional) – Use Nesterov momentum with SGD. Default is False.

  • beta_1 (float, optional) – Exponential decay rate for first-moment estimates (Adam-family). Default is 0.9.

  • beta_2 (float, optional) – Exponential decay rate for second-moment estimates (Adam-family). Default is 0.999.

  • amsgrad (bool, optional) – Use the AMSGrad variant of Adam. Default is False.

Returns:

A configured tf.keras.optimizers.Optimizer instance.

Return type:

tf.keras.optimizers.Optimizer

Raises:

ValueError – If optimizer is not one of the supported names.
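
The parameter list above implies that each optimizer consumes only a subset of the arguments. The following sketch makes that mapping explicit without importing TensorFlow: it returns the keyword arguments that would be forwarded to the matching Keras optimizer class. The helper optimizer_kwargs_sketch is hypothetical and is not how the library necessarily structures its dispatch.

```python
def optimizer_kwargs_sketch(optimizer, lr, momentum=None, rho=0.9, nesterov=False,
                            beta_1=0.9, beta_2=0.999, amsgrad=False):
    """Select the hyperparameters relevant to each supported optimizer name."""
    adam_family = {"beta_1": beta_1, "beta_2": beta_2}
    table = {
        "sgd": {"momentum": momentum, "nesterov": nesterov},
        "adam": {**adam_family, "amsgrad": amsgrad},
        "adamax": adam_family,
        "nadam": adam_family,
        "adadelta": {"rho": rho},
        "rmsprop": {"rho": rho},
    }
    if optimizer not in table:
        # Mirrors the documented behavior for unsupported names.
        raise ValueError(f"Unsupported optimizer: {optimizer!r}")
    return {"learning_rate": lr, **table[optimizer]}


print(optimizer_kwargs_sketch("sgd", 0.01, momentum=0.9))
print(optimizer_kwargs_sketch("rmsprop", 0.001))
```

A table-driven dispatch like this keeps the list of supported names in one place, so the ValueError check and the supported set cannot drift apart.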

pyBIA.cnn_model.get_loss_function(loss, weight=None)[source]

Return a Keras-compatible loss given a symbolic name.

Parameters:
  • loss ({'binary_crossentropy','hinge','squared_hinge','kld','logcosh','focal_loss','dice_loss','jaccard_loss','weighted_binary_crossentropy'}) – Name of the loss to construct. No default.

  • weight (float or None, optional) – Positive-class weight used only when loss='weighted_binary_crossentropy'; ignored for all other losses. Default is None.

Returns:

loss_fn – Keras-compatible loss object. This may be a string identifier (for ‘binary_crossentropy’), a tf.keras.losses.* instance (e.g., Hinge(), KLDivergence(), LogCosh()), or a callable such as focal_loss, dice_loss, jaccard_loss, or the result of weighted_binary_crossentropy(weight).

Return type:

str or tf.keras.losses.Loss or callable

pyBIA.cnn_model.print_params(batch_size, lr, decay, momentum, nesterov, loss, optimizer, model_reg, conv_init, activation_conv, dense_init, activation_dense, filter1, filter2, filter3, filter4, filter5, filter_size_1, filter_size_2, filter_size_3, filter_size_4, filter_size_5, pooling_1, pooling_2, pooling_3, pooling_4, pooling_5, pool_size_1, pool_size_2, pool_size_3, pool_size_4, pool_size_5, conv_reg, dense_reg, dense_neurons_1, dense_neurons_2, dense_neurons_3, dropout_1, dropout_2, dropout_3, beta_1, beta_2, amsgrad, rho)[source]

Print a formatted summary of training hyperparameters and model architecture settings.

Parameters:
  • batch_size (int) – Number of samples per gradient update. No default.

  • lr (float) – Optimizer learning rate. No default.

  • decay (float) – Learning-rate time decay (per update/epoch, depending on optimizer). No default.

  • momentum (float) – SGD momentum coefficient. No default.

  • nesterov (bool) – Whether SGD uses Nesterov momentum. No default.

  • loss (str) – Name of the loss function (e.g., ‘binary_crossentropy’). No default.

  • optimizer (str) – Optimizer identifier (e.g., 'sgd', 'adam', 'rmsprop', 'adadelta', 'adamax', 'nadam'). No default.

  • model_reg (str or None) – Model-level regularization flag (e.g., 'batch_norm', 'local_response', or None). No default.

  • conv_init (str or tf.keras.initializers.Initializer) – Convolutional kernel initializer (e.g., 'he_normal', 'glorot_uniform', 'uniform_scaling'). No default.

  • activation_conv (str) – Activation function used after convolutional layers (e.g., ‘relu’). No default.

  • dense_init (str or tf.keras.initializers.Initializer) – Dense layer kernel initializer. No default.

  • activation_dense (str) – Activation function used in dense layers (e.g., 'relu', 'tanh'). No default.

  • filter1 (int) – Number of filters in convolutional layer/block 1; zero disables the layer. No default.

  • filter2 (int) – Number of filters in convolutional layer/block 2; zero disables the layer. No default.

  • filter3 (int) – Number of filters in convolutional layer/block 3; zero disables the layer. No default.

  • filter4 (int) – Number of filters in convolutional layer/block 4; zero disables the layer. No default.

  • filter5 (int) – Number of filters in convolutional layer/block 5; zero disables the layer. No default.

  • filter_size_1 (int) – Kernel size for convolutional layer/block 1. No default.

  • filter_size_2 (int) – Kernel size for convolutional layer/block 2. No default.

  • filter_size_3 (int) – Kernel size for convolutional layer/block 3. No default.

  • filter_size_4 (int) – Kernel size for convolutional layer/block 4. No default.

  • filter_size_5 (int) – Kernel size for convolutional layer/block 5. No default.

  • pooling_1 ({'max','average','min', None}) – Pooling mode after layer/block 1; None disables pooling. No default.

  • pooling_2 ({'max','average','min', None}) – Pooling mode after layer/block 2; None disables pooling. No default.

  • pooling_3 ({'max','average','min', None}) – Pooling mode after layer/block 3; None disables pooling. No default.

  • pooling_4 ({'max','average','min', None}) – Pooling mode after layer/block 4; None disables pooling. No default.

  • pooling_5 ({'max','average','min', None}) – Pooling mode after layer/block 5; None disables pooling. No default.

  • pool_size_1 (int) – Pool window size after layer/block 1 (if pooling enabled). No default.

  • pool_size_2 (int) – Pool window size after layer/block 2 (if pooling enabled). No default.

  • pool_size_3 (int) – Pool window size after layer/block 3 (if pooling enabled). No default.

  • pool_size_4 (int) – Pool window size after layer/block 4 (if pooling enabled). No default.

  • pool_size_5 (int) – Pool window size after layer/block 5 (if pooling enabled). No default.

  • conv_reg (float) – L2 regularization coefficient applied to convolutional kernels. No default.

  • dense_reg (float) – L2 regularization coefficient applied to dense kernels. No default.

  • dense_neurons_1 (int) – Number of units in dense layer 1. No default.

  • dense_neurons_2 (int) – Number of units in dense layer 2; zero disables the layer. No default.

  • dense_neurons_3 (int) – Number of units in dense layer 3; zero disables the layer. No default.

  • dropout_1 (float) – Dropout rate applied after dense layer 1 (0–1). No default.

  • dropout_2 (float) – Dropout rate applied after dense layer 2 (0–1). No default.

  • dropout_3 (float or str) – Dropout rate after dense layer 3 (0–1); may be ‘N/A’ for models without this layer. No default.

  • beta_1 (float) – Adam/Nadam first-moment decay (β₁). No default.

  • beta_2 (float) – Adam/Nadam second-moment decay (β₂). No default.

  • amsgrad (bool) – Whether to use the AMSGrad variant of Adam. No default.

  • rho (float) – Exponential decay factor for Adadelta/RMSprop. No default.

Returns:

This function prints to stdout and returns nothing.

Return type:

None

pyBIA.cnn_model.format_labels(labels: list) → list[source]

Convert raw parameter keys into human-readable display labels.

Parameters:

labels (list of str) – Sequence of raw label strings to format.

Returns:

List of formatted labels (same order and length as input).

Return type:

list of str

Notes

This function applies a set of explicit replacements; if no rule matches, the label is converted by replacing underscores with spaces and applying title case.
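
The two-stage rule in the note (explicit replacements first, then a generic fallback) can be sketched as follows. The name format_labels_sketch and the entries in the replacement table are hypothetical, and exact-key matching is an assumption; the library's actual replacement rules are not listed in this reference.

```python
def format_labels_sketch(labels, replacements=None):
    """Format raw parameter keys for display (illustrative rules only)."""
    if replacements is None:
        # Hypothetical explicit rules; the real table is not documented here.
        replacements = {"lr": "Learning Rate", "batch_size": "Batch Size"}
    formatted = []
    for label in labels:
        if label in replacements:
            formatted.append(replacements[label])
        else:
            # Fallback: underscores become spaces, then title case.
            formatted.append(label.replace("_", " ").title())
    return formatted


print(format_labels_sketch(["lr", "dense_neurons_1", "amsgrad"]))
```

The output preserves the order and length of the input, matching the documented return value.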