Morphological Catalog

The catalog module is the core of pyBIA’s morphological feature extraction and catalog-generation pipeline. It performs source detection, aperture photometry, and segmentation-based morphology measurements designed for training machine-learning models. This is all handled by the Catalog class.

This module combines image segmentation (via photutils, an Astropy-affiliated package) with moment-based descriptors (from the image_moments module) to convert image data into a feature matrix containing:

  • Photometry: aperture fluxes and flux errors (and magnitudes if a zeropoint is provided).

  • Moments: raw, central, geometrically centered, Hu-invariant, and Legendre moments computed on the segmented source.

  • Segmentation properties: shape and intensity statistics (e.g., ellipticity, eccentricity, Gini, bounding box and other metadata).

Quick Start

Automatic detection (no input positions)

If you provide a 2D image containing various sources and do not specify positions, pyBIA will run segmentation on the full frame and build a catalog for all detected sources:

from pyBIA import catalog

# Instantiate the Catalog class (data is the image array)
cat = catalog.Catalog(data)

# Run source detection and compute photometric and morphological features
cat.create(save_file=True)

# The catalog is stored in the ``cat.cat`` attribute
print(cat.cat)

Targeted extraction (user-supplied positions)

To analyze a specific source (or a set of sources) at known pixel positions, pass the x and y arguments, which give the source coordinate(s) in pixels relative to the input image:

from pyBIA import catalog

# Example: 100x100 stamp with a source at the center
cat = catalog.Catalog(data, x=50, y=50)
cat.create(save_file=True)

print(cat.cat)
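
The snippets above assume that data already holds a 2D image array. For quick experimentation, a synthetic stamp can stand in for real data. The following is a minimal sketch using only NumPy; the helper name make_test_stamp and all of its parameters are hypothetical and are not part of pyBIA.

```python
import numpy as np

def make_test_stamp(size=100, fwhm=6.0, peak=50.0, noise=1.0, seed=0):
    """Return a size x size image with one Gaussian source at the center.

    Hypothetical helper for testing; not part of pyBIA.
    """
    rng = np.random.default_rng(seed)
    yy, xx = np.mgrid[:size, :size]
    # Convert FWHM to the Gaussian standard deviation
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    center = size // 2
    r2 = (xx - center) ** 2 + (yy - center) ** 2
    source = peak * np.exp(-r2 / (2.0 * sigma ** 2))
    # Add Gaussian noise on top of the noiseless source model
    return source + rng.normal(0.0, noise, size=(size, size))

data = make_test_stamp()
peak_y, peak_x = np.unravel_index(np.argmax(data), data.shape)
print(data.shape, (peak_x, peak_y))
```

The resulting array can be passed as data to the Catalog calls shown above, with x=50 and y=50 for the targeted case.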

Computed Features

When morph_params=True (default), pyBIA returns a morphology vector containing various source properties including segmentation-based image moments.

Moments are computed on the segmented source pixels (all non-source pixels are zeroed prior to measurement). The moments table contains 47 features:

  • Raw moments up to 3rd order:

    M00, M10, M01, M20, M11, M02, M30, M21, M12, M03
    
  • Central moments up to 3rd order:

    mu00, mu10, mu01, mu20, mu11, mu02, mu30, mu21, mu12, mu03
    
  • Geometrically centered polynomial moments up to 3rd order:

    G00, G10, G01, G20, G11, G02, G30, G21, G12, G03
    
  • Hu invariants:

    Hu1, Hu2, Hu3, Hu4, Hu5, Hu6, Hu7
    
  • Legendre moments (orthonormal) up to total order 3 (n+m ≤ 3):

    L00, L10, L01, L20, L11, L02, L30, L21, L12, L03
    
  • Shape/geometry:

    eccentricity, ellipticity, elongation, orientation, perimeter, equivalent_radius, fwhm
    
  • Intensity/statistics:

    gini, max_value, min_value, plus index metadata for extrema
    
  • Covariance/ellipse terms:

    covar_sigx2, covar_sigy2, covar_sigxy, cxx, cxy, cyy, covariance_eigval1, covariance_eigval2
    
  • Bounds:

    bbox_xmin, bbox_xmax, bbox_ymin, bbox_ymax
    
  • Scalar flag:

    isscalar (stored as 1 for True, 0 for False)
    

In total, the default morphology vector contains 77 features per source (47 moment features + 30 segmentation properties), plus optional photometry and metadata columns (e.g., flux, flux_err, mag, mag_err, median_bkg, xpix, ypix, obj_name, field_name, flag), depending on which inputs are provided (e.g., zp, error, bkg, and position mode).
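
To make the moment definitions concrete, the sketch below computes a few raw, central, and Hu moments on the segmented pixels of a synthetic source using only NumPy. It is an illustration of the standard definitions and not pyBIA's image_moments implementation; the function name, the returned keys, and the convention that the first index refers to x (columns) are assumptions.

```python
import numpy as np

def image_moments(cutout, mask):
    """Raw, central and first two Hu moments of the masked source pixels."""
    img = np.where(mask, cutout, 0.0)  # zero all non-source pixels
    yy, xx = np.mgrid[:img.shape[0], :img.shape[1]]

    def raw(p, q):
        # M_pq = sum over pixels of x^p * y^q * I(x, y)
        return float(np.sum(xx ** p * yy ** q * img))

    m00 = raw(0, 0)
    xbar, ybar = raw(1, 0) / m00, raw(0, 1) / m00  # intensity-weighted centroid

    def central(p, q):
        # mu_pq: same sum, measured relative to the centroid
        return float(np.sum((xx - xbar) ** p * (yy - ybar) ** q * img))

    def eta(p, q):
        # Scale-normalized central moment used by the Hu invariants
        return central(p, q) / m00 ** (1 + (p + q) / 2)

    hu1 = eta(2, 0) + eta(0, 2)
    hu2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return {"M00": m00, "xbar": xbar, "ybar": ybar,
            "mu20": central(2, 0), "mu02": central(0, 2),
            "mu11": central(1, 1), "Hu1": hu1, "Hu2": hu2}

# Synthetic source elongated along x, centered at (20, 20)
yy, xx = np.mgrid[:41, :41]
stamp = np.exp(-((xx - 20) ** 2 / (2 * 4.0 ** 2) + (yy - 20) ** 2 / (2 * 2.0 ** 2)))
mask = stamp > 0.01  # stand-in for a segmentation mask
m = image_moments(stamp, mask)
print(m["xbar"], m["ybar"], m["mu20"] > m["mu02"])
```

Because the test source is elongated along x, mu20 exceeds mu02 and Hu2 is nonzero, while mu11 vanishes for an axis-aligned source.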

Methodology

Catalog generation follows one of two workflows depending on whether positions are provided:

Auto-detect mode (x/y not provided)

  1. Background subtraction (optional): if bkg=None, a robust global/tiled sigma-clipped background is subtracted from the full frame prior to segmentation.

  2. Segmentation: the background-subtracted image is convolved with a Gaussian kernel (FWHM = 9 pixels; window size set by kernel_size), then thresholded at nsig using photutils.detect_sources (with an option to deblend the detected sources).

  3. Photometry: circular-aperture photometry is measured at the detected centroids using radius aperture (pixels).

  4. Morphology: for each detected centroid, a local square cutout of length size is extracted and morphology features are computed on the segmented pixels.
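
The roles of npixels and connectivity in the segmentation step can be illustrated with a simplified connected-component labeling routine. This is a sketch written in plain NumPy as a stand-in for what a routine such as photutils.detect_sources does conceptually; it omits the Gaussian smoothing and background handling, and the function name label_sources is hypothetical.

```python
from collections import deque
import numpy as np

def label_sources(image, threshold, npixels=9, connectivity=8):
    """Label connected groups of pixels above `threshold` (0 = background)."""
    if connectivity == 8:
        steps = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                 if (dy, dx) != (0, 0)]
    elif connectivity == 4:
        steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    else:
        raise ValueError("connectivity must be 4 or 8")

    above = image > threshold
    visited = np.zeros(image.shape, dtype=bool)
    labels = np.zeros(image.shape, dtype=int)
    next_label = 1
    for start in zip(*np.nonzero(above)):
        if visited[start]:
            continue
        # Breadth-first flood fill from this seed pixel
        visited[start] = True
        queue, members = deque([start]), [start]
        while queue:
            y, x = queue.popleft()
            for dy, dx in steps:
                ny, nx = y + dy, x + dx
                if (0 <= ny < image.shape[0] and 0 <= nx < image.shape[1]
                        and above[ny, nx] and not visited[ny, nx]):
                    visited[ny, nx] = True
                    queue.append((ny, nx))
                    members.append((ny, nx))
        # Keep the group only if it has at least `npixels` members
        if len(members) >= npixels:
            for pix in members:
                labels[pix] = next_label
            next_label += 1
    return labels

# Two 3x3 blocks touching only at a corner, plus one isolated bright pixel
image = np.zeros((20, 20))
image[2:5, 2:5] = 1.0
image[5:8, 5:8] = 1.0
image[15, 15] = 1.0

labels8 = label_sources(image, threshold=0.5, connectivity=8)
labels4 = label_sources(image, threshold=0.5, connectivity=4)
print(labels8.max(), labels4.max())
```

With connectivity=8 the two corner-touching blocks merge into a single source, whereas connectivity=4 keeps them separate; the isolated pixel is rejected in both cases because it falls below npixels.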

Targeted mode (x/y provided)

When positions are input, pyBIA treats each coordinate pair as an independent source/measurement:

  1. Photometry: circular-aperture flux is measured at (x, y) using radius aperture (pixels). If bkg=None, a sigma-clipped annulus median (radii annulus_in and annulus_out in pixels) is used for local background subtraction and stored as median_bkg.

  2. Cutout extraction: a square cutout of length size is cropped around the target coordinate (automatically reduced if the image is smaller than size).

  3. Segmentation (cutout): the cutout is convolved with the Gaussian kernel and segmented using the same nsig/kernel_size/npixels/connectivity settings (with optional deblend). If exptime is provided, the cutout is divided by exptime for segmentation and morphology only (photometry is measured on the original data).

  4. Central validation: the detection is accepted only if a segmented region is present near the cutout center. If threshold==0, the central pixel must lie in the segmentation; otherwise, at least one segmented pixel must fall within a central circular mask of radius threshold. The segmented object whose centroid is closest to the cutout center is retained.

  5. Morphology: moment features and segmentation properties are computed on the retained segmented pixels. If validation fails, morphology columns are set to -999 while photometry is still reported.
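
The central validation step can be sketched as follows, given a segmentation label map in which 0 marks background. This is an illustrative reading of the rule described above and not pyBIA's internal code; the function name is hypothetical, and restricting the candidates to segments that intersect the central mask is an assumption.

```python
import numpy as np

def select_central_segment(labels, threshold=10):
    """Return the label of the central source, or None for a non-detection."""
    ny, nx = labels.shape
    cy, cx = ny // 2, nx // 2
    yy, xx = np.mgrid[:ny, :nx]
    dist = np.hypot(yy - cy, xx - cx)

    # With threshold == 0 the mask reduces to the central pixel alone
    inside = (dist <= threshold) & (labels > 0)
    candidates = np.unique(labels[inside])
    if candidates.size == 0:
        return None  # validation failed: nothing segmented near the center

    # Retain the candidate whose centroid lies closest to the cutout center
    best, best_dist = None, np.inf
    for lab in candidates:
        sel = labels == lab
        d = np.hypot(yy[sel].mean() - cy, xx[sel].mean() - cx)
        if d < best_dist:
            best, best_dist = int(lab), d
    return best

# One segment at the center and one in a corner of a 41x41 cutout
labels = np.zeros((41, 41), dtype=int)
labels[18:23, 18:23] = 1
labels[0:5, 0:5] = 2
central = select_central_segment(labels, threshold=10)

# Only the corner segment is present: treated as a non-detection
corner_only = np.zeros((41, 41), dtype=int)
corner_only[0:5, 0:5] = 2
missed = select_central_segment(corner_only, threshold=10)
print(central, missed)
```

A None result corresponds to the case in which the morphology columns are filled with -999.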

Key Parameters

The Catalog class includes the following parameters:

| Parameter | Default | Description |
|---|---|---|
| x, y | None | Source center pixel coordinates. If None, pyBIA runs automatic detection on the full frame. |
| bkg | None | Background handling. Use 0 if the image is already background-subtracted; use None to estimate local background (targeted mode) or subtract a global background (auto-detect mode). |
| error | None | Optional per-pixel error map (same shape as data). Enables flux_err and mag_err. |
| zp | None | Zeropoint for magnitude calculation. If provided, mag and mag_err are computed from aperture fluxes. |
| exptime | None | Exposure time (seconds). If provided, cutouts are divided by exptime for segmentation/morphology (photometry is measured on the original data). |
| morph_params | True | If True, compute morphology features (moments + segmentation properties). If False, only photometry/metadata columns are produced. |
| nsig | 0.3 | Segmentation detection threshold (in sigma above background). |
| threshold | 10 | Central validation radius (pixels). Use 0 to require exact central-pixel membership in a segment. |
| deblend | False | If True, attempt to split overlapping sources during segmentation. |
| size | 100 | Side length (pixels) of the square cutout used to segment each source when computing morphology. |
| aperture | 15 | Circular-aperture radius (pixels) for photometry. |
| annulus_in | 20 | Inner radius (pixels) of the background annulus (targeted mode; bkg=None). |
| annulus_out | 35 | Outer radius (pixels) of the background annulus. |
| kernel_size | 21 | Gaussian convolution window size (pixels) used prior to segmentation. |
| npixels | 9 | Minimum number of connected pixels required to define a detection. |
| connectivity | 8 | Pixel connectivity (4 = edge-connected, 8 = edge+corner-connected). |
| invert | False | If True, swap (x, y) ordering when cropping cutouts (useful for row/column-style indexing). |
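
To show how aperture, annulus_in, annulus_out, and zp interact, the sketch below performs a simplified circular-aperture measurement with a sigma-clipped annulus median for the local background. It uses whole-pixel masks in plain NumPy rather than exact aperture overlaps, the function name is hypothetical, and the magnitude relation mag = zp - 2.5 log10(flux) is the standard convention, assumed here rather than taken from pyBIA's source.

```python
import numpy as np

def aperture_photometry_sketch(data, x, y, aperture=15, annulus_in=20,
                               annulus_out=35, zp=None):
    """Whole-pixel aperture flux with a local annulus background."""
    yy, xx = np.mgrid[:data.shape[0], :data.shape[1]]
    r = np.hypot(xx - x, yy - y)  # x indexes columns, y indexes rows
    in_aperture = r <= aperture
    in_annulus = (r >= annulus_in) & (r <= annulus_out)

    # Iterative 3-sigma clipping of the annulus pixels
    sky = data[in_annulus]
    for _ in range(5):
        med, std = np.median(sky), np.std(sky)
        keep = np.abs(sky - med) <= 3.0 * std
        if keep.all():
            break
        sky = sky[keep]
    median_bkg = float(np.median(sky))

    # Background-subtracted flux summed over the aperture pixels
    flux = float(np.sum(data[in_aperture] - median_bkg))
    mag = None
    if zp is not None and flux > 0:
        mag = float(zp - 2.5 * np.log10(flux))
    return {"flux": flux, "median_bkg": median_bkg, "mag": mag}

# Flat background of 5 counts with 1000 counts added at the central pixel
data = np.full((100, 100), 5.0)
data[50, 50] += 1000.0
phot = aperture_photometry_sketch(data, x=50, y=50, zp=27)
print(phot)
```

In this toy case the annulus median recovers the flat background exactly, so the measured flux equals the injected 1000 counts.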

Note

Non-detections (morphology): if the segmentation does not contain a valid central object, pyBIA flags the source as a non-detection and sets all morphology columns to -999. Aperture photometry is still recorded.
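
Because -999 is a sentinel rather than a measurement, it should be masked before computing statistics or training a model. The sketch below shows one way to do this with NumPy; the feature values are made up for illustration. For a pandas catalog, df.replace(-999, np.nan) achieves the same masking.

```python
import numpy as np

# Hypothetical morphology columns for three sources; the second is a non-detection
morph = np.array([[0.41, 12.3],
                  [-999.0, -999.0],
                  [0.12, 8.7]])

# Flag rows in which every morphology column carries the sentinel
is_nondetection = np.all(morph == -999, axis=1)

# Replace the sentinel with NaN so that it cannot bias summary statistics
clean = np.where(morph == -999, np.nan, morph)
column_means = np.nanmean(clean, axis=0)
print(is_nondetection, column_means)
```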

Example

In this example, we generate a source catalog for a dataset of 20,000 simulated extragalactic sources (10k strong gravitational lenses, 10k non-lensed galaxies). These sources are simulated in five of the bands LSST will observe (g, r, i, z, y).

Data Access: You can download the sample binary files here:

Processing Script: We will process each band individually, constructing separate catalogs for lenses and non-lenses, and then merging them.

import numpy as np
import pandas as pd
from pyBIA import catalog

# Load the images: binary files of 41x41-pixel stamps in five filters (g, r, i, z, y)
lenses = np.load('lenses_10k.npy')
nonlenses = np.load('nonlenses_10k.npy')

# pyBIA catalog parameters

error = None  # The corresponding error map, for computing photometric errors
xpix = ypix = lenses.shape[-1] // 2  # Pixel position of the source centroid; the sources are centered in the cutouts
bkg = None  # None if background subtraction is required; set to 0 if the data are already background-subtracted
aperture = 10  # Aperture radius (in pixels) for the photometry
annulus_in = 15  # Inner radius (in pixels) of the background annulus for local sky estimation
annulus_out = 50  # Outer radius (in pixels) of the background annulus
nsig = 0.3  # The image segmentation detection threshold
threshold = 1  # Retain the closest object within a circular mask of radius 1 pixel about the center
exptime = 1  # Exposure time used to normalize the flux
zp = 27  # The instrumental zeropoint, for computing the apparent magnitudes
deblend = False  # Whether to deblend the detected source(s)
kernel_size = 21  # Gaussian filter kernel size used to convolve the data prior to segmentation
npixels = 9  # Number of connected pixels above the sigma threshold required to detect a source
connectivity = 8  # How pixels are grouped into a detected source: 4 (touch along edges) or 8 (edges and corners)

# Process one band at a time, and save each catalog individually
for i, band in enumerate(['g', 'r', 'i', 'z', 'y']):

    # Collects the individual catalogs for this band
    master_catalog = []

    # Loop through each individual lens source
    for j in range(len(lenses)):
        obj_name = j  # The object name in the catalog is the order in which it appears in the data
        flag = 1  # The positive class label

        # Instantiate the Catalog class
        cat = catalog.Catalog(
            data=lenses[j][i],  # First axis selects the individual source, second axis the band
            error=error,
            x=xpix,
            y=ypix,
            bkg=bkg,
            aperture=aperture,
            annulus_in=annulus_in,
            annulus_out=annulus_out,
            nsig=nsig,
            threshold=threshold,
            exptime=exptime,
            zp=zp,
            deblend=deblend,
            kernel_size=kernel_size,
            npixels=npixels,
            connectivity=connectivity,
            obj_name=obj_name,
            flag=flag
        )

        # Create the catalog, which is stored in the `cat` class attribute
        cat.create(save_file=False)

        # Append the created `cat` attribute to the master list
        master_catalog.append(cat.cat)

    ##
    ## Now repeat for the non-lenses ##
    ##

    # Loop through each individual non-lens source
    for j in range(len(nonlenses)):
        obj_name = len(lenses) + j  # Continue the numbering after the lenses
        flag = 0  # The negative class label

        # Instantiate the Catalog class
        cat = catalog.Catalog(
            data=nonlenses[j][i],  # First axis selects the individual source, second axis the band
            error=error,
            x=xpix,
            y=ypix,
            bkg=bkg,
            aperture=aperture,
            annulus_in=annulus_in,
            annulus_out=annulus_out,
            nsig=nsig,
            threshold=threshold,
            exptime=exptime,
            zp=zp,
            deblend=deblend,
            kernel_size=kernel_size,
            npixels=npixels,
            connectivity=connectivity,
            obj_name=obj_name,
            flag=flag
        )

        # Create the catalog, which is stored in the `cat` class attribute
        cat.create(save_file=False)

        # Append the created `cat` attribute to the master list
        master_catalog.append(cat.cat)

    # Merge all individual catalogs into one master dataframe and save
    df = pd.concat(master_catalog, ignore_index=True)
    df.to_csv(f'segm_catalog_{band}_band.csv', index=False)

The five catalogs generated above are available for download here:

These catalogs will be merged and used to train a binary classifier using the ensemble_model module. This example is provided in the Supervised Learning Algorithms page.

NOTE: The catalog module also provides a standalone function, plot_objects_segmentation, that plots individual sources alongside their image segmentation patches for a given set of parameters, so that users can inspect the segmentation masks overlaid on the source. Because the morphological features depend on the resulting segmentation, it is important to confirm that the generated patches are representative of the source morphology. The function accepts up to four segmentation detection thresholds (sigma_values) to show how different values affect the resulting source extent.

In the example below, we inspect a lens where only the i-band yields a positive detection at the strictest threshold (\(\sigma=5.0\)). The other four filters at this detection level would be non-detections and the corresponding morphological features would thus be cataloged with -999 values.

import numpy as np
from pyBIA import catalog

# Load the lenses
lens = np.load('lenses_10k.npy')

# Plotting parameters
median_bkg = None  # Background to subtract (set to None if background subtraction is required)
pix_conversion = 5.8  # Survey pixels per arcsecond (for setting the axes)
crop_size = None  # Crop the image to this size; set to None to skip cropping
xpix = ypix = lens.shape[2] // 2  # The cropped image is centered on these coordinates; set to None if not cropping
r_in = 15  # Inner radius (in pixels) of the background annulus for local sky estimation
r_out = 50  # Outer radius (in pixels) of the background annulus

# Figure parameters
fig_title = r'Example Lens'  # Figure suptitle
sup_titles = [r'$g$', r'$r$', r'$i$', r'$z$', r'$y$']  # Title(s) above each individual panel
cmap = 'viridis'  # Colormap for the input image; the segmentation patches always use binary

# Segmentation detection parameters
sigma_vals = [0.3, 1.0, 3.0, 5.0]  # The detection threshold(s) to apply
deblend = False  # Whether to deblend detected sources
kernel_size = 21  # Gaussian filter kernel size used to convolve the data prior to segmentation
npixels = 9  # Number of connected pixels above the sigma threshold required to detect a source
connectivity = 8  # How pixels are grouped into a detected source: 4 (touch along edges) or 8 (edges and corners)
threshold = 0  # Plot the object present at the image center
savefig = True  # Whether to save the figure; if False, it is shown instead
savepath = 'segm_example_lens.png'  # Path (and/or filename) to save in/as

i = 4  # Plot the fifth source in the array

# This function takes in up to 5 images, and plots the detection thresholds (up to 4 thresholds allowed)
catalog.plot_objects_segmentation(
    lens[i][0],
    lens[i][1],
    lens[i][2],
    lens[i][3],
    lens[i][4],
    pix_conversion=pix_conversion,
    sigma_values=sigma_vals,
    deblend=deblend,
    kernel_size=kernel_size,
    npixels=npixels,
    connectivity=connectivity,
    threshold=threshold,
    titles=sup_titles,
    suptitle=fig_title,
    cmap=cmap,
    xpix=xpix,
    ypix=ypix,
    size=crop_size,
    median_bkg=median_bkg,
    savefig=savefig,
    r_in=r_in,
    r_out=r_out,
    savepath=savepath
)

Figure (Segmentation Example): Visualization of segmentation maps across 5 bands. The binary masks illustrate the detected morphology at increasing sigma thresholds, computed from the sigma_values.
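
The dependence of source extent on the detection threshold can also be illustrated numerically. The sketch below thresholds a noiseless Gaussian source model at several sigma levels and counts the pixels that survive. It is a toy model in plain NumPy, not a call into pyBIA; the function name, the unit background RMS, and the source peaks and width are assumptions chosen for illustration.

```python
import numpy as np

def segment_area(peak, sigma_values, bkg_rms=1.0, size=41, width=4.0):
    """Count the pixels above nsig * bkg_rms for a centered Gaussian source."""
    yy, xx = np.mgrid[:size, :size]
    c = size // 2
    image = peak * np.exp(-((xx - c) ** 2 + (yy - c) ** 2) / (2.0 * width ** 2))
    return [int(np.sum(image > nsig * bkg_rms)) for nsig in sigma_values]

sigma_vals = [0.3, 1.0, 3.0, 5.0]
bright = segment_area(peak=20.0, sigma_values=sigma_vals)  # detected at all thresholds
faint = segment_area(peak=4.0, sigma_values=sigma_vals)    # peak lies below 5 sigma
print(bright)
print(faint)
```

The bright source shrinks steadily as the threshold rises, while the faint source has no pixels left at the strictest threshold, which is the situation in which the morphology columns are recorded as -999.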