Convolutional Neural Networks

Documentation status (last updated May 19, 2026)

This documentation is outdated and will change soon!

Overview

While we configured pyBIA for astrophysical image filtering of diffuse Lyman-alpha emission, the program modules can be called directly to create any type of image classifier using Convolutional Neural Networks (CNN). It supports…

Example

The multi-band data for 20 lens candidates can be downloaded here. An accompanying set of 500 images to be used for the negative class can be downloaded here.

Model Creation

Start by loading the data arrays.

import numpy as np

lenses = np.load('lenses.npy')
other = np.load('other.npy')

The loaded arrays are 4-dimensional as per CNN convention: (num_images, img_width, img_height, img_num_channels). Note that the image size is 100x100 pixels, which is typically too large in the context of astrophysical filtering. The images are intentionally saved to be larger than ideal, so that if the data is oversampled via image augmentation techniques, the distorted, outer boundaries of the augmented image can be cropped out. To generate the classifier, initialize the Classifier class from the cnn_model module – note the argument img_num_channels, which should be set to be the number of channels in the data (3 filters for this example – gri).

The data is background subtracted but not normalized. Normalization is especially important for deep learning, as the range of pixel values directly impacts the model’s ability to learn. We must therefore set our normalization parameters, which will be used to apply min-max normalization:

from pyBIA import cnn_model

model = cnn_model.Classifier(lenses, other, img_num_channels=3, normalize=True, min_pixel=0, max_pixel=[100,100,100])

Note that each class is input individually, as they will be labeled 1 (positive) and 0 (negative), respectively. In this example the max_pixel argument must be a list containing one value per band, ordered as in the final axis of the 4-dimensional input array. Here the maximum pixel value for the min-max normalization is set to 100 for all three channels: any pixels less than 0 will be set to 0, and any greater than 100 will be set to 100, after which the normalization will rescale the pixels to lie between 0 and 1. If your data is not normalized, the gradient during backpropagation will likely explode or vanish!
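The clipping and rescaling described above can be sketched in plain NumPy. This is only an illustration of min-max normalization with one maximum per channel, not pyBIA's internal code; the function name min_max_normalize and the toy array are hypothetical.

```python
import numpy as np

def min_max_normalize(images, min_pixel, max_pixel):
    """Clip each channel to [min_pixel, max_pixel[c]] and rescale to [0, 1].

    Hypothetical helper for illustration; pyBIA's own routine may differ.
    """
    images = np.asarray(images, dtype=np.float64)
    max_pixel = np.asarray(max_pixel, dtype=np.float64)  # one value per channel
    # Broadcasting aligns max_pixel with the final (channel) axis.
    clipped = np.clip(images, min_pixel, max_pixel)
    return (clipped - min_pixel) / (max_pixel - min_pixel)

# Toy stand-in for the real data: 2 images, 4x4 pixels, 3 channels.
rng = np.random.default_rng(0)
toy = rng.uniform(-50, 300, size=(2, 4, 4, 3))
normalized = min_max_normalize(toy, min_pixel=0, max_pixel=[100, 100, 100])
print(normalized.shape, normalized.min(), normalized.max())
```

Because the limits are applied along the last axis, a different max_pixel can be supplied for each band if the filters have different dynamic ranges.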

Currently, pyBIA supports the implementation of two popular CNN architectures: AlexNet, and VGG16. These are controlled by the clf attribute, which defaults to ‘alexnet’. To create a classifier using the AlexNet architecture, set the clf accordingly and call the create method:

model.clf = 'alexnet'
model.create()

By default, the verbose attribute is set to 0 following the Keras convention, but this can be set to 1 to visualize the model’s performance as it trains, epoch by epoch. The epochs attribute controls the total number of training epochs, which is set to 25 by default. To input validation data, set the val_positive and/or the val_negative attributes. To configure early-stopping criteria, the patience attribute can be set to a non-zero integer. This parameter sets the number of epochs without training improvement after which the training is stopped, as a lack of improvement is indicative of over/under fitting behavior.
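The patience criterion can be illustrated with a short, library-free sketch. It follows the usual early-stopping logic (stop once the metric has failed to improve for patience epochs) and is not taken from pyBIA or Keras; the function stopping_epoch and the sample loss values are hypothetical, and treating a patience of 0 as "disabled" is an assumption.

```python
def stopping_epoch(values, patience, lower_is_better=True):
    """Return the 1-based epoch at which training would stop, or None if it never stops early."""
    if patience <= 0:
        return None  # assumption: zero patience means early stopping is disabled
    best = None
    epochs_without_improvement = 0
    for epoch, value in enumerate(values, start=1):
        if best is None:
            improved = True
        elif lower_is_better:
            improved = value < best
        else:
            improved = value > best
        if improved:
            best = value
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# Hypothetical validation loss per epoch: the best value is reached at epoch 3.
val_loss = [0.90, 0.70, 0.65, 0.66, 0.67, 0.68, 0.50]
print(stopping_epoch(val_loss, patience=3))  # prints 6
```

Note that the improvement at epoch 7 is never seen with a patience of 3, which is the trade-off of a small patience value.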

Our CNN pipeline supports three distinct tracking metrics, configured via the metric attribute: binary_accuracy, loss, and f1_score, as well as their validation equivalents (val_binary_accuracy, val_loss, and val_f1_score).

Note that the Classifier does not support weighted loss functions, which are especially useful when the classes are imbalanced, as in this particular example. While data augmentation techniques are recommended in this scenario, if you wish to keep the training classes imbalanced, a weighted loss function can be applied by calling the CNN model functions directly,

model, history = cnn_model.AlexNet(lenses, other, img_num_channels=3, loss='weighted_binary_crossentropy', normalize=False, weight=2.0)

where the weight argument is a scalar factor that controls the relative weight of the positive class. When weight is greater than 1, for example, the loss function will assign more importance to the positive class, and vice versa (although in practice the positive class is usually the minority class in binary classification problems, so the weight should not be less than 1). Note that setting weight equal to 1 is equivalent to using the standard binary cross-entropy loss function. Calling the models directly allows for maximum flexibility, as every argument is available for tuning, including learning parameters, optional model callbacks and model-specific architecture arguments. For a full overview of the configurable model parameters, refer to the model-specific API documentation.
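To make the role of weight concrete, here is a minimal NumPy sketch of one common form of weighted binary cross-entropy, in which the positive-class term is multiplied by the weight. The exact formula used by pyBIA is not shown in this page, so this form, the function name and the toy labels are all assumptions for illustration.

```python
import numpy as np

def weighted_binary_crossentropy(y_true, y_pred, weight=1.0, eps=1e-7):
    """Mean binary cross-entropy with the positive-class term scaled by `weight`.

    Illustrative only; pyBIA's loss may be defined differently.
    """
    y_true = np.asarray(y_true, dtype=np.float64)
    # Clip predictions away from 0 and 1 so the logarithms stay finite.
    y_pred = np.clip(np.asarray(y_pred, dtype=np.float64), eps, 1.0 - eps)
    positive_term = weight * y_true * np.log(y_pred)
    negative_term = (1.0 - y_true) * np.log(1.0 - y_pred)
    return float(-np.mean(positive_term + negative_term))

labels = [1, 0, 1, 0]
predictions = [0.9, 0.2, 0.6, 0.4]
standard = weighted_binary_crossentropy(labels, predictions, weight=1.0)
weighted = weighted_binary_crossentropy(labels, predictions, weight=2.0)
print(standard, weighted)
```

With weight=1.0 the expression reduces to the standard binary cross-entropy, and with weight=2.0 errors on positive samples contribute twice as much to the loss.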

Data Augmentation

In this example, we suffer from major class-imbalance as we have only 20 positive lenses and 500 negative others (1:25 imbalance). The example below demonstrates how to apply image augmentation techniques to create new synthetic images.
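As a quick check of the numbers involved, the snippet below computes the imbalance ratio and how many augmented copies per positive image would be needed to match the negative class. It assumes each original image is replaced by its augmented copies, and the variable names are illustrative.

```python
import math

n_positive, n_negative = 20, 500

# Ratio of negative to positive samples (the 1:25 imbalance).
imbalance = n_negative / n_positive

# Augmented copies required per positive image to reach the negative class size.
copies_needed = math.ceil(n_negative / n_positive)

print(f"imbalance 1:{imbalance:.0f}")
print(f"copies per positive image: {copies_needed}")
print(f"resulting positive class size: {n_positive * copies_needed}")
```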

The Classifier class allows you to augment your positive and/or negative data by using the following methods:

model.augment_positive()
model.augment_negative()

Running these methods automatically updates the positive_class and negative_class accordingly, but as no arguments were provided, the classes will remain unchanged. The number of augmentations to perform per individual sample is determined by the batch argument (1 by default). The current API supports the following variety of augmentation routines, which must be set directly when calling the augment_positive or augment_negative methods, all disabled by default:

width_shift (int): The max pixel shift allowed in either horizontal direction. If set to zero no horizontal shifts will be performed. Defaults to 0 pixels.
height_shift (int): The max pixel shift allowed in either vertical direction. If set to zero no vertical shifts will be performed. Defaults to 0 pixels.
horizontal (bool): If False no horizontal flips are allowed. Defaults to False.
vertical (bool): If False no random vertical reflections are allowed. Defaults to False.
rotation (bool): If False no random 0-360 rotation is allowed. Defaults to False.
fill (str): This is the treatment for data outside the boundaries after rotation and shifts. Default is set to ‘nearest’ which repeats the closest pixel values. Can be set to: {“constant”, “nearest”, “reflect”, “wrap”}.
image_size (int, bool): The length/width of the cropped image. This can be used to remove anomalies caused by the fill (defaults to 50). This can also be set to None in which case the image in its original size is returned.
mask_size (int): The size of the cutout mask. Defaults to None to disable random cutouts.
num_masks (int): Number of masks to apply to each image. Defaults to None, must be an integer if mask_size is used as this designates how many masks of that size to randomly place in the image.
blend_multiplier (float): Sets the amount of synthetic images to make via image blending. Must be a ratio greater than or equal to 1. If set to 1, the data will be replaced with randomly blended images; if set to 1.5, it will increase the training set by 50% with blended images, and so forth. Defaults to 0, which disables this feature.
blending_func (str): The blending function to use. Options are ‘mean’, ‘max’, ‘min’, and ‘random’. Only used when blend_multiplier >= 1. Defaults to ‘mean’.
num_images_to_blend (int): The number of images to randomly select for blending. Only used when blend_multiplier >= 1. Defaults to 2.
zoom_range (tuple): Tuple of floats (min_zoom, max_zoom) specifying the range of zoom in/out values. If set to (0.9, 1.1), for example, the zoom will be randomly chosen between 90% and 110% of the original image size. Note that the image size thus increases if the randomly selected zoom is greater than 1, so it is recommended to also input an appropriate image_size. Defaults to None, which disables this procedure.
skew_angle (float): The maximum absolute value of the skew angle, in degrees. This is the maximum because the actual angle to skew by will be chosen from a uniform distribution between the negative and positive skew_angle values. Defaults to 0, which disables this feature. Using this feature is not recommended!

Rotating, skewing, and flipping images can make the training model more robust to variations in the orientation and perspective of the input images. Likewise, shifting left/right and up/down will help make the model translation invariant and thus robust to the position of the object of interest within the image. These are the recommended methods to try at first, as other techniques such as blending and applying random mask cutouts may alter the classes too dramatically.

Image blending can help to generate new samples through the combination of different images using a variety of blending criteria. Note that by default two random images will be blended together to create one synthetic sample, and since this procedure is applied post-batch creation, the same unique sample may be randomly blended, which could be problematic if the configured augmentation parameters do not generate a sufficiently varied training class. Random cutouts can help increase the diversity of the training set and reduce overfitting, as applying this technique prevents the training model from relying too heavily on specific features of the image, thus encouraging the model to learn more general image attributes. As noted above, applying these techniques may result in an unstable classification engine as you may end up generating a synthetic class with image features that are too different, use with caution!
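The blending idea can be sketched with NumPy as follows. This is not pyBIA's implementation: the function blend_images is hypothetical, it reuses the parameter names num_images_to_blend and blending_func from the list above, and interpreting ‘random’ as "pick one of the three functions at random" is an assumption.

```python
import numpy as np

def blend_images(images, num_images_to_blend=2, blending_func="mean", rng=None):
    """Blend randomly chosen images pixel by pixel into one synthetic sample."""
    rng = np.random.default_rng() if rng is None else rng
    reducers = {"mean": np.mean, "max": np.max, "min": np.min}
    if blending_func == "random":
        # Assumption: 'random' selects one of the blending functions at random.
        blending_func = str(rng.choice(sorted(reducers)))
    # Sample distinct indices so an image is not blended with itself.
    picks = rng.choice(len(images), size=num_images_to_blend, replace=False)
    return reducers[blending_func](images[picks], axis=0)

# Toy batch: five constant 4x4 images with 3 channels and pixel values 0..4.
batch = np.stack([np.full((4, 4, 3), float(v)) for v in range(5)])
blended = blend_images(batch, num_images_to_blend=2, blending_func="mean",
                       rng=np.random.default_rng(1))
print(blended.shape)
```

Sampling distinct indices only prevents an image from being blended with itself; augmented copies of the same original can still be selected together, which is the caveat raised above.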

These techniques, when enabled, are applied in the following order:

(1) Random shift + flip + rotation: Generates batch number of images.

(2) Random zoom in or out.

(3) If image_size is set, the image is resized so as to crop the distorted boundary.

(4) Random image skewness is applied, with the skew_angle controlling the maximum angle, in degrees, to distort the image from its original position.

(5) The batch size is now increased by a factor of blend_multiplier, where each unique sample is generated by randomly merging num_images_to_blend images together according to the blending function blending_func. Because the selection is random, augmented copies of the same original sample may be blended together at this stage, but with enough variation this may not be a problem.

(6) Circular cutouts of size mask_size are randomly placed in the image, whereby the cutouts replace the pixel values with zeroes. Note that as per the random nature of the routine, if num_masks is greater than 1, overlap between each cutout may occur, depending on the corresponding image size to mask_size ratio.
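Step (6) can be sketched in NumPy as shown below. The sketch treats mask_size as the diameter of each circle and allows centers anywhere in the image; both choices are assumptions, and the function apply_circular_cutouts is hypothetical rather than part of pyBIA.

```python
import numpy as np

def apply_circular_cutouts(image, mask_size, num_masks, rng=None):
    """Return a copy of `image` with `num_masks` circular regions set to zero."""
    rng = np.random.default_rng() if rng is None else rng
    out = image.copy()
    height, width = out.shape[:2]
    rows, cols = np.ogrid[:height, :width]
    radius = mask_size / 2.0  # assumption: mask_size is a diameter in pixels
    for _ in range(num_masks):
        center_row = rng.integers(0, height)
        center_col = rng.integers(0, width)
        inside = (rows - center_row) ** 2 + (cols - center_col) ** 2 <= radius ** 2
        out[inside] = 0  # zeroes every channel at the masked pixels
    return out

# Toy 67x67 image with 3 channels, all pixels equal to 1.
image = np.ones((67, 67, 3))
masked = apply_circular_cutouts(image, mask_size=5, num_masks=5,
                                rng=np.random.default_rng(0))
print(int((masked[..., 0] == 0).sum()), "pixels masked per channel")
```

Since the centers are drawn independently, two cutouts can overlap, so the number of masked pixels is at most num_masks times the area of one circle.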

Note that pyBIA’s data augmentation routine is for offline data augmentation. Online augmentation may be preferred in certain cases as that exposes the training model to significantly more varied samples. If multiple image filters are being used, the data augmentation procedure will save the seeds from the augmentation of the first filter, after which the seeds will be applied to the remaining filters, thus ensuring the same augmentation procedure is applied across all channels.
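The seed-sharing idea can be illustrated with a small NumPy sketch in which every channel is transformed by a generator initialised with the same seed, so all channels receive identical flips and shifts. The functions below are hypothetical and far simpler than pyBIA's routine (np.roll wraps pixels around the edges, comparable to the ‘wrap’ fill mode).

```python
import numpy as np

def augment_channel(channel, seed, max_shift=10):
    """Randomly flip and shift one 2D channel; equal seeds give equal transforms."""
    rng = np.random.default_rng(seed)
    if rng.random() < 0.5:
        channel = np.fliplr(channel)
    if rng.random() < 0.5:
        channel = np.flipud(channel)
    dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
    # Shift rows by dy and columns by dx, wrapping around the image edges.
    return np.roll(channel, shift=(int(dy), int(dx)), axis=(0, 1))

def augment_image(image, seed, max_shift=10):
    """Apply the same seeded transform to every channel of an (H, W, C) image."""
    channels = [augment_channel(image[..., c], seed, max_shift)
                for c in range(image.shape[-1])]
    return np.stack(channels, axis=-1)

# Toy 3-channel image in which channel c is (c + 1) times a base pattern.
base = np.arange(36.0).reshape(6, 6)
image = np.stack([base, 2 * base, 3 * base], axis=-1)
augmented = augment_image(image, seed=42, max_shift=2)
print(np.array_equal(augmented[..., 1], 2 * augmented[..., 0]))  # prints True
```

The final check holds because every channel is permuted in exactly the same way, so the relationship between the bands at each pixel is preserved.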

For this example, we will augment each unique sample in the positive_class 25 times by setting the batch parameter, with each augmented sample generated by randomizing the enabled procedures:

batch = 25; image_size = 67
width_shift = height_shift = 10
vertical = horizontal = rotation = True
zoom_range = (0.9, 1.1)
mask_size = num_masks = 5

model.augment_positive(batch=batch, width_shift=width_shift, height_shift=height_shift, vertical=vertical, horizontal=horizontal, rotation=rotation, zoom_range=zoom_range, image_size=image_size, mask_size=mask_size, num_masks=num_masks)

The positive_class will now contain 500 images (20 original images times 25 augmentations) so as to match our negative_class. Alternatively, we could have set batch to 10, and enabled the blend_multiplier option with a value of 2.5, to bring the final sample to 500 (20 original images times 10 augmentations times a 2.5 blending multiplier). When applying mask cutouts, it is advised to apply similar cutouts to the negative_class so as to prevent the model from associating random cutouts with the positive class:

model.augment_negative(mask_size=mask_size, num_masks=num_masks, image_size=image_size)

Note that the image_size parameter was set to 67 when augmenting the positive_class, so even if you wish to leave the other training class unchanged, you would still have to resize your data by running the augment_negative method with only the image_size argument. The _plot_positive and _plot_negative methods can be used for quick visualization.

model._plot_positive()
model._plot_negative()

If an image appears dark, run the methods again but manually set the vmin and vmax arguments, as by default these limits are derived using a robust scaling. To re-do the augmentations, simply reset the positive and negative class attributes and try again:

model.positive_class = lenses
model.augment_positive(blend_multiplier=50, num_images_to_blend=3, blending_func='mean', image_size=image_size)

model.negative_class = other
model.augment_negative(blend_multiplier=1, num_images_to_blend=3, blending_func='mean', image_size=image_size)

In this attempt we apply only the blending routine. Note that blend_multiplier is set to 1 for the negative class, so that blending is applied to the other class while its original size is kept the same. When the classes are ready for training, simply call the create method.

No current options are available for augmenting the validation data, but this can be accomplished manually via the data_augmentation module.

Optimization

If you know which augmentation procedures are appropriate for your dataset but don’t know what specific thresholds to use, you can configure the Classifier class to identify the optimal augmentation parameters to apply to your dataset. To enable optimization, set optimize to True. This will always optimize the model learning parameters, including the learning rate, decay, optimizer, loss, activation functions, initializers and batch size.

pyBIA supports two optimization options. The first is opt_aug, which when set to True will optimize the augmentation options that have been enabled. The class attributes that control the augmentation optimization include:

batch_min (int): The minimum number of augmentations to perform per image on the positive class, only applicable if opt_aug=True. Defaults to 2.
batch_max (int): The maximum number of augmentations to perform per image on the positive class, only applicable if opt_aug=True. Defaults to 25.
batch_other (int): The number of augmentations to perform to the other class, presumed to be the majority class. Defaults to 1. This is done to ensure augmentation techniques are applied consistently across both classes.
image_size_min (int): The minimum image size to assess, only applicable if opt_aug=True. Defaults to 50.
image_size_max (int): The maximum image size to assess, only applicable if opt_aug=True. Defaults to 100.
opt_max_min_pix (int, optional): The minimum max pixel value to use when tuning the normalization procedure, only applicable if opt_aug=True. Defaults to None.
opt_max_max_pix (int, optional): The maximum max pixel value to use when tuning the normalization procedure, only applicable if opt_aug=True. Defaults to None.
shift (int): The max allowed vertical/horizontal shifts to use during the data augmentation routine, only applicable if opt_aug=True. Defaults to 10 pixels.
mask_size (int, optional): If enabled, this will set the pixel length of a square cutout, to be randomly placed somewhere in the augmented image. This cutout will replace the image values with 0, therefore serving as a regularizer. Only applicable if opt_aug=True. Defaults to None.
num_masks (int, optional): The number of masks to create, to be used alongside the mask_size parameter. If this is set to a value greater than one, overlap may occur.
blend_max (float): The maximum blending multiplier to accept for the minority class, tuned only when opt_aug=True. For example, if opt_aug=True and blend_max=5, the optimization will return an optimal value between 1 and 5. A value of 1 applies the blending procedure while keeping the minority class size the same; a value of 5 increases the minority class to five times its original size via the blending routine. Defaults to 0, which disables this feature. To enable it when opt_aug=True, set it to a value greater than or equal to 1.1 (a maximum increase of at least 10% is required), in which case values between 1 and 1.1 would be tried during the optimization.
blend_other (float): Must be greater than or equal to 1. Can be set to zero to avoid applying augmentation to the majority class. It is recommended to enable this if applying blending and/or cutouts so as to avoid training a classifier that associates these features with the positive class only.
zoom_range (tuple): This sets the allowed zoom in/out range. This is not optimized, and must be set carefully according to the data being used. During the optimization, random zooms will occur according to this designated range. Can be set to zero to disable.

The second optimization routine is enabled by setting opt_model to True. This will optimize the pooling types to apply (min, max or average) as well as the main regularizer (either batch normalization or LRN) to apply to the selected CNN architecture. If limit_search is set to False, the model hyperparameters will also be optimized (individual filter information including size and stride). It is not advised to disable the limit_search option if the augmentation procedure is also being optimized, as this will yield memory errors!

The following example configures the CNN model and sets the optimization parameters. For information regarding the parameters please refer to the documentation.

import numpy as np
from pyBIA import cnn_model

lenses = np.load('lenses.npy')
val_lenses = lenses[:4] #This will be used to validate the optimization
lenses = lenses[4:] #Positive class training data, will be augmented
opt_cv = 5  # Will cross-validate the positive class, since there are 20 images and 4 were selected for validation, this will create 5 models with a unique training data sample each time

other = np.load('other.npy')

# Model creation and optimization

clf='alexnet' # AlexNet CNN architecture will be used
img_num_channels = 3 # Creating a 3-Channel model
normalize = True # Will min-max normalize the images so all pixels are between 0 and 1

optimize = True # Activating the optimization routine
n_iter = 100 # Will run the optimization routine for 100 trials
batch_size_min, batch_size_max = 16, 64 # The training batch size will be optimized according to these bounds

opt_model = limit_search = True # Will also optimize the CNN model architecture but with limit search on, therefore only pooling type and the regularizer are optimized
train_epochs = 10 # Each optimization trial will train a model up to 10 epochs
epochs = 0 # The final model will not be generated, will instead be trained post-processing
patience = 3 # The model patience which will be applied during optimization

opt_aug = True # Will also optimize the data augmentation procedure (positive class only by design)
batch_min, batch_max = 10, 250 # The amount to augment EACH positive sample by, the optimal value will be selected according to this range
shift = 5 # Will randomly shift (horizontally & vertically) each augmented image between 0 and 5 pixels
rotation = horizontal = vertical = True # Will randomly apply rotations (0-360), and horizontal/vertical flips to each augmented image
zoom_range = (0.9,1.1) # Will randomly apply zooming in/out between plus and minus 10% to each augmented image
batch_other = 0 # The number of augmentations to perform to the negative class
balance = True # Will balance the negative class according to how many positive samples were generated during augmentation

image_size_min, image_size_max = 50, 100 # Will try different image sizes within these bounds, the optimal value will be selected according to this range
opt_max_min_pix, opt_max_max_pix = 10, 1500 # Will try different normalization values (the max pixel for the min-max normalization), one for each filter, the optimal value(s) will be selected according to this range

metric = 'val_loss' # The optimization routine will operate according to this metric's value at the end of each trial, which must also follow the patience criteria
average = True # Will average out the above metric across all training epochs, this will be the trial value at the end

metric2 = 'f1_score' # Optional metric that will stop trials if this doesn't improve according to the patience
metric3 = 'binary_accuracy' # Optional metric that will stop trials if this doesn't improve according to the patience

monitor1 = 'binary_accuracy' # Hard stop, trials will be terminated if this metric rises above the specified threshold
monitor1_thresh = 0.99+1e-6 # Specified threshold, in this case the optimization trial will terminate if the training accuracy rises above this limit

monitor2 = 'loss' # Hard stop, trials will be terminated if this metric falls below the specified threshold
monitor2_thresh = 0.01-1e-6 # Specified threshold, in this case the optimization trial will terminate if the training loss falls below this limit

model = cnn_model.Classifier(positive_class=lenses, negative_class=other, val_positive=val_lenses, img_num_channels=img_num_channels, clf=clf, normalize=normalize, optimize=optimize, n_iter=n_iter, batch_size_min=batch_size_min, batch_size_max=batch_size_max, epochs=epochs, patience=patience, metric=metric, metric2=metric2, metric3=metric3, average=average, opt_model=opt_model, train_epochs=train_epochs, opt_cv=opt_cv, opt_aug=opt_aug, batch_min=batch_min, batch_max=batch_max, batch_other=batch_other, balance=balance, image_size_min=image_size_min, image_size_max=image_size_max, shift=shift, opt_max_min_pix=opt_max_min_pix, opt_max_max_pix=opt_max_max_pix, rotation=rotation, horizontal=horizontal, vertical=vertical, zoom_range=zoom_range, limit_search=limit_search, monitor1=monitor1, monitor1_thresh=monitor1_thresh, monitor2=monitor2, monitor2_thresh=monitor2_thresh, use_gpu=True, verbose=1)

model.create()
model.save(dirname='Optimized_CNN_Model_CV5')
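One way to picture the opt_cv setting is as a rotating split of the 20 positive images into 5 folds of 4, where each fold serves once as the validation set. How pyBIA partitions the data internally is not described in this page, so the sketch below is only an assumption based on the comment next to opt_cv; the generator rotating_validation_folds is hypothetical.

```python
import numpy as np

def rotating_validation_folds(num_samples, num_folds):
    """Yield (train_indices, validation_indices) pairs, one per fold."""
    folds = np.array_split(np.arange(num_samples), num_folds)
    for k, val_idx in enumerate(folds):
        train_idx = np.concatenate([fold for j, fold in enumerate(folds) if j != k])
        yield train_idx, val_idx

splits = list(rotating_validation_folds(num_samples=20, num_folds=5))
for train_idx, val_idx in splits:
    print(len(train_idx), len(val_idx))  # prints "16 4" for each of the 5 folds
```

Under this reading, each of the five models is trained on a different set of 16 positive images and validated on the remaining 4.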

As the epochs parameter was set to zero, the final model(s) are not saved after the optimization is completed. While the models created during the optimization routine were trained to 10 epochs as per the train_epochs parameter, this is a lower limit and in principle the final model could be trained for more epochs. The train_epochs argument is set lower so as to avoid having to create full models during optimization, as this would increase the optimization time significantly. In practice, we create test models during optimization that are shallower than the final model. Nonetheless, in this example we will set the epochs to 10 as a starting point, as we are dealing with a limited data set that may not require us to train the network for much more:

import numpy as np
from pyBIA import cnn_model

lenses = np.load('lenses.npy')
val_lenses = lenses[:4]
lenses = lenses[4:]

other = np.load('other.npy')


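model = cnn_model.Classifier(positive_class=lenses, negative_class=other, val_positive=val_lenses) # Negative class is 'other'; val_lenses is passed as validation data only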
model.load('Optimized_CNN_Model_CV5')
model.epochs=10
model.create()