Data Models

Data models for the spectroscopic reduction pipeline.

Per data-model.md: RawFrame hierarchy, calibration classes, observation sets, wavelength solutions, and extracted spectra.

class pykosmospp.models.RawFrame(file_path)[source]

Bases: ABC

Abstract base class for raw FITS files from the telescope.

Per data-model.md §1: Represents a single FITS file with metadata (observation type, target, exposure time, instrument configuration).

__init__(file_path)[source]

Initialize RawFrame from FITS file.

Parameters:

file_path (Path) – Path to FITS file

classmethod from_fits(file_path, gain=1.4, readnoise=3.7, saturate=58982.0)[source]

Load RawFrame from FITS file with detector parameters.

Parameters:
  • file_path (Path) – Path to FITS file

  • gain (float) – CCD gain in e-/ADU

  • readnoise (float) – Read noise in e-

  • saturate (float) – Saturation level in ADU

Return type:

RawFrame subclass instance

validate_header()[source]

Validate required FITS header keywords are present.

Returns:

True if header valid

Return type:

bool

Raises:

ValueError – If required keywords missing
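
The keyword check described above can be sketched as follows. This is a minimal illustration, not the pipeline's implementation: the keyword list and the helper name check_required_keywords are hypothetical, and a plain dict stands in for a FITS header.

```python
# Hypothetical keyword list; the keywords the pipeline actually requires
# are not given in this reference.
REQUIRED_KEYWORDS = ("IMAGETYP", "EXPTIME", "DATE-OBS")


def check_required_keywords(header, required=REQUIRED_KEYWORDS):
    """Return True if every required keyword is present in the header mapping.

    Raises ValueError naming the missing keywords otherwise.
    """
    missing = [key for key in required if key not in header]
    if missing:
        raise ValueError("Missing required FITS keywords: " + ", ".join(missing))
    return True


# A plain dict stands in for a FITS header here.
good_header = {"IMAGETYP": "Bias", "EXPTIME": 0.0, "DATE-OBS": "2024-01-01T00:00:00"}
header_ok = check_required_keywords(good_header)
```

Subclasses such as BiasFrame could layer frame-specific checks (for example, zero exposure time) on top of a shared check like this one.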

detect_saturation()[source]

Detect saturated pixels in frame.

Return type:

tuple[bool, float]

Returns:

  • saturated (bool) – True if any pixels saturated

  • fraction (float) – Fraction of pixels above saturation threshold
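
A saturation check of this kind can be sketched with NumPy. The helper name saturation_fraction is hypothetical; the default threshold is the saturate default from from_fits, and the strict "above" comparison is an assumption based on the wording of the return description.

```python
import numpy as np


def saturation_fraction(data, saturate=58982.0):
    """Return (saturated, fraction) for a 2D pixel array in ADU."""
    data = np.asarray(data, dtype=float)
    above = data > saturate            # boolean map of saturated pixels
    fraction = float(above.mean())     # fraction of pixels above threshold
    return bool(above.any()), fraction


# 4 x 5 frame with a single saturated pixel.
frame = np.full((4, 5), 1000.0)
frame[0, 0] = 60000.0
saturated, fraction = saturation_fraction(frame)
```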

class pykosmospp.models.BiasFrame(file_path)[source]

Bases: RawFrame

Bias calibration frame.

Per data-model.md §2: Captures detector readout bias pattern.

__init__(file_path)[source]

Initialize RawFrame from FITS file.

Parameters:

file_path (Path) – Path to FITS file

validate_header()[source]

Validate bias frame has zero exposure time.

Return type:

bool

class pykosmospp.models.FlatFrame(file_path)[source]

Bases: RawFrame

Flat field calibration frame.

Per data-model.md §3: Captures pixel-to-pixel sensitivity and illumination.

__init__(file_path)[source]

Initialize RawFrame from FITS file.

Parameters:

file_path (Path) – Path to FITS file

validate_header()[source]

Validate flat frame header.

Return type:

bool

class pykosmospp.models.ArcFrame(file_path)[source]

Bases: RawFrame

Arc lamp calibration frame for wavelength calibration.

Per data-model.md §4: Contains emission line spectrum.

__init__(file_path)[source]

Initialize RawFrame from FITS file.

Parameters:

file_path (Path) – Path to FITS file

validate_header()[source]

Validate arc frame header and detect lamp type.

Return type:

bool

class pykosmospp.models.ScienceFrame(file_path)[source]

Bases: RawFrame

Science observation frame (target spectrum).

Per data-model.md §5: 2D spectral image of astronomical target.

__init__(file_path)[source]

Initialize RawFrame from FITS file.

Parameters:

file_path (Path) – Path to FITS file

validate_header()[source]

Validate science frame header and extract metadata.

Return type:

bool

class pykosmospp.models.MasterBias(data, n_combined, bias_level, bias_stdev, provenance=<factory>)[source]

Bases: object

Combined master bias frame.

Per data-model.md §7: Median-combined bias with provenance.

data: CCDData
n_combined: int
bias_level: float
bias_stdev: float
provenance: Dict[str, Any]

validate()[source]

Validate master bias quality.

Per data-model.md §7: bias_stdev <10 ADU

Return type:

bool
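
A median combine with the quality check quoted above (bias_stdev < 10 ADU) might look like the sketch below. The helper name combine_bias is hypothetical, and the definitions of bias_level (median of the master) and bias_stdev (standard deviation of the master) are assumptions; the reference does not state how the pipeline computes them.

```python
import numpy as np


def combine_bias(frames, max_stdev=10.0):
    """Median-combine bias frames and report simple quality statistics."""
    stack = np.stack([np.asarray(f, dtype=float) for f in frames])
    master = np.median(stack, axis=0)        # pixel-wise median across frames
    bias_level = float(np.median(master))    # assumed definition of bias_level
    bias_stdev = float(np.std(master))       # assumed definition of bias_stdev
    is_valid = bias_stdev < max_stdev        # threshold from data-model.md §7
    return master, bias_level, bias_stdev, is_valid


# Three constant frames; the median rejects the outlying 110 ADU frame.
frames = [np.full((3, 3), level) for level in (100.0, 102.0, 110.0)]
master, bias_level, bias_stdev, is_valid = combine_bias(frames)
```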

__init__(data, n_combined, bias_level, bias_stdev, provenance=<factory>)

class pykosmospp.models.MasterFlat(data, n_combined, normalization_region, bad_pixel_fraction, provenance=<factory>)[source]

Bases: object

Combined master flat field frame.

Per data-model.md §8: Median-combined flat normalized to unity.

data: CCDData
n_combined: int
normalization_region: tuple
bad_pixel_fraction: float
provenance: Dict[str, Any]

validate()[source]

Validate master flat quality.

Per data-model.md §8: bad_pixel_fraction <0.05

Return type:

bool

__init__(data, n_combined, normalization_region, bad_pixel_fraction, provenance=<factory>)

class pykosmospp.models.CalibrationSet(master_bias, master_flat, bad_pixel_mask=None)[source]

Bases: object

Complete set of calibrations for science frame reduction.

Per data-model.md §6: Master bias, flat, bad pixel mask.

master_bias: MasterBias
master_flat: MasterFlat
bad_pixel_mask: Optional[ndarray] = None

apply_to_frame(science_frame, propagate_uncertainty=True)[source]

Apply calibrations to science frame with proper uncertainty propagation.

Per T112: Propagates read noise + Poisson noise through bias subtraction and flat fielding per FR-014.

Uncertainty propagation:

  1. Raw frame: σ² = (readnoise)² + (data × gain) [Poisson + read noise]

  2. Bias subtraction: σ²_calib = σ²_science + σ²_bias

  3. Flat fielding: σ²_final = σ²_calib / flat² + (calib × σ_flat / flat²)²

Parameters:
  • science_frame (ScienceFrame) – Raw science frame

  • propagate_uncertainty (bool, optional) – Whether to compute and propagate uncertainties (default: True)

Returns:

Calibrated science data with uncertainty

Return type:

CCDData
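
The three propagation steps listed for apply_to_frame can be sketched with plain NumPy arrays. This follows the formulas exactly as written above and is not the pipeline's code: the function name propagate_variance and its argument names are hypothetical, and the default gain and read noise are the from_fits defaults.

```python
import numpy as np


def propagate_variance(raw, bias, flat, var_bias, var_flat, gain=1.4, readnoise=3.7):
    """Calibrate a raw frame and propagate variance through each step."""
    raw = np.asarray(raw, dtype=float)

    # Step 1: raw-frame variance (read noise + Poisson term, as documented).
    var_raw = readnoise**2 + raw * gain

    # Step 2: bias subtraction adds the bias variance.
    calib = raw - bias
    var_calib = var_raw + var_bias

    # Step 3: flat fielding, including the flat's own uncertainty.
    final = calib / flat
    var_final = var_calib / flat**2 + (calib * np.sqrt(var_flat) / flat**2) ** 2
    return final, var_final


# Single-pixel worked example.
final, var_final = propagate_variance(
    raw=np.array([[1100.0]]),
    bias=np.array([[100.0]]),
    flat=np.array([[2.0]]),
    var_bias=np.array([[4.0]]),
    var_flat=np.array([[0.01]]),
)
```

For this pixel the calibrated value is (1100 - 100) / 2 = 500, and the two variance terms in step 3 are 1557.69 / 4 and (1000 × 0.1 / 4)².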

validate()[source]

Validate all calibration components.

Return type:

bool

__init__(master_bias, master_flat, bad_pixel_mask=None)

class pykosmospp.models.Spectrum2D(data, variance, source_frame, mask=None, cosmic_ray_mask=None)[source]

Bases: object

2D spectrum with calibration-applied data and detected traces.

Per data-model.md §9: Contains calibrated 2D data (bias-subtracted, flat-fielded), variance map, mask, cosmic ray flags, and list of detected/selected traces.

__init__(data, variance, source_frame, mask=None, cosmic_ray_mask=None)[source]

Initialize 2D spectrum.

Parameters:
  • data (np.ndarray) – Calibrated 2D data (spatial x spectral)

  • variance (np.ndarray) – Variance map (same shape as data)

  • source_frame (ScienceFrame) – Source science frame

  • mask (np.ndarray, optional) – Bad pixel mask (True = bad)

  • cosmic_ray_mask (np.ndarray, optional) – Cosmic ray mask (True = cosmic ray)

detect_traces(min_snr=3.0, **kwargs)[source]

Detect spectral traces using cross-correlation.

Parameters:
  • min_snr (float) – Minimum SNR for trace detection

  • **kwargs – Additional arguments for trace detection

Returns:

Detected traces

Return type:

List[Trace]

subtract_sky(sky_buffer=30)[source]

Estimate and subtract sky background.

Parameters:

sky_buffer (int) – Buffer pixels from trace edges

Returns:

Sky-subtracted 2D data

Return type:

np.ndarray
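
One simple way to estimate sky with a buffer around the trace is a per-column median over rows far from the trace. The sketch below assumes a single, straight trace at a known row and measures the buffer from the trace center rather than its edges; the function name and the trace_center argument are hypothetical, and the pipeline's actual estimator is not described in this reference.

```python
import numpy as np


def subtract_sky_median(data_2d, trace_center, sky_buffer=30):
    """Subtract a per-column median sky from a (spatial x spectral) array."""
    data_2d = np.asarray(data_2d, dtype=float)
    rows = np.arange(data_2d.shape[0])
    # Rows farther than sky_buffer pixels from the trace are treated as sky.
    sky_rows = np.abs(rows - trace_center) > sky_buffer
    if not sky_rows.any():
        raise ValueError("No sky rows left outside the buffer region")
    sky = np.median(data_2d[sky_rows, :], axis=0)   # one sky value per column
    return data_2d - sky[np.newaxis, :]


# Flat 50 ADU sky with a 200 ADU source on rows 48 to 52.
image = np.full((100, 8), 50.0)
image[48:53, :] += 200.0
sky_subtracted = subtract_sky_median(image, trace_center=50, sky_buffer=30)
```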

extract_spectrum(trace, method='optimal')[source]

Extract 1D spectrum from trace.

Parameters:
  • trace (Trace) – Trace to extract

  • method (str) – Extraction method ('optimal' or 'aperture')

Returns:

Extracted 1D spectrum

Return type:

Spectrum1D

class pykosmospp.models.Trace(trace_id, spatial_positions, spectral_pixels, snr_estimate, spatial_profile=None, wavelength_solution=None, user_selected=False)[source]

Bases: object

Spectral trace with position, profile, and wavelength solution.

Per data-model.md §10: Spatial position as function of spectral pixel, fitted spatial profile, wavelength solution, SNR estimate.

__init__(trace_id, spatial_positions, spectral_pixels, snr_estimate, spatial_profile=None, wavelength_solution=None, user_selected=False)[source]

Initialize trace.

Parameters:
  • trace_id (int) – Unique trace identifier

  • spatial_positions (np.ndarray) – Spatial (Y) position at each spectral pixel

  • spectral_pixels (np.ndarray) – Spectral (X) pixel array

  • snr_estimate (float) – Estimated median SNR

  • spatial_profile (SpatialProfile, optional) – Fitted spatial profile

  • wavelength_solution (WavelengthSolution, optional) – Wavelength calibration

  • user_selected (bool) – Whether user manually selected this trace

fit_profile(data_2d, variance_2d, aperture_width=10)[source]

Fit spatial profile to trace.

Parameters:
  • data_2d (np.ndarray) – 2D spectral data

  • variance_2d (np.ndarray) – 2D variance map

  • aperture_width (int) – Width for profile extraction

Returns:

Fitted profile

Return type:

SpatialProfile

apply_wavelength_solution(wavelength_solution)[source]

Apply wavelength calibration to this trace.

Parameters:

wavelength_solution (WavelengthSolution) – Wavelength solution to apply

extract_optimal(data_2d, variance_2d)[source]

Extract optimal 1D spectrum from trace.

Parameters:
  • data_2d (np.ndarray) – 2D spectral data

  • variance_2d (np.ndarray) – 2D variance map

Return type:

Tuple[ndarray, ndarray]

Returns:

  • flux (np.ndarray) – Extracted flux

  • variance (np.ndarray) – Extracted variance
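
Optimal extraction is commonly implemented as an inverse-variance, profile-weighted sum along the spatial axis. The sketch below shows that general scheme under the assumption that a 2D profile image is already available; the function name and the profile_2d argument are hypothetical, and the reference does not confirm that extract_optimal uses this exact weighting.

```python
import numpy as np


def optimal_extract(data_2d, variance_2d, profile_2d):
    """Profile-weighted extraction along axis 0 (spatial) of a 2D spectrum."""
    # Normalize the profile so each spectral column sums to one.
    profile = profile_2d / profile_2d.sum(axis=0, keepdims=True)
    weights = profile / variance_2d
    denom = (profile * weights).sum(axis=0)          # sum of P^2 / V
    flux = (weights * data_2d).sum(axis=0) / denom   # sum of P D / V over sum of P^2 / V
    variance = 1.0 / denom
    return flux, variance


# Noise-free test image: a fixed 3-pixel profile scaled by the true flux.
true_flux = np.array([10.0, 20.0, 30.0, 40.0])
profile_2d = np.tile(np.array([[1.0], [2.0], [1.0]]), (1, 4))
data_2d = (profile_2d / 4.0) * true_flux
variance_2d = np.full((3, 4), 3.0)
flux, variance = optimal_extract(data_2d, variance_2d, profile_2d)
```

With uniform variance the weights reduce to the profile itself, so the extraction recovers the input flux exactly in this noise-free case.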

class pykosmospp.models.SpatialProfile(profile_type, center, width, amplitude, profile_function, chi_squared)[source]

Bases: object

Fitted spatial profile (cross-dispersion direction).

Per data-model.md §11: Profile type (Gaussian, Moffat, empirical), parameters (center, width, amplitude), fit quality (chi-squared).

__init__(profile_type, center, width, amplitude, profile_function, chi_squared)[source]

Initialize spatial profile.

Parameters:
  • profile_type (str) – Profile type ('gaussian', 'moffat', 'empirical')

  • center (float) – Profile center position (pixels)

  • width (float) – Profile width (FWHM in pixels)

  • amplitude (float) – Profile amplitude (peak value)

  • profile_function (callable) – Function to evaluate profile at positions

  • chi_squared (float) – Chi-squared of fit

evaluate(positions)[source]

Evaluate profile at given positions.

Parameters:

positions (np.ndarray) – Spatial positions

Returns:

Profile values

Return type:

np.ndarray
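
For the 'gaussian' profile type, a profile function parameterized by center, FWHM width, and peak amplitude can be sketched as below. The function name gaussian_profile is hypothetical; only the FWHM-to-sigma conversion is standard.

```python
import numpy as np

# For a Gaussian, FWHM = 2 * sqrt(2 * ln 2) * sigma.
FWHM_TO_SIGMA = 1.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))


def gaussian_profile(positions, center, width, amplitude):
    """Evaluate a Gaussian whose width is given as FWHM in pixels."""
    positions = np.asarray(positions, dtype=float)
    sigma = width * FWHM_TO_SIGMA
    return amplitude * np.exp(-0.5 * ((positions - center) / sigma) ** 2)


# At center +/- FWHM/2 the profile falls to half its peak value.
values = gaussian_profile([48.0, 50.0, 52.0], center=50.0, width=4.0, amplitude=100.0)
```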

class pykosmospp.models.WavelengthSolution(coefficients, order, arc_frame, n_lines_identified, rms_residual, wavelength_range, poly_type='chebyshev', pixel_range=None, calibration_method='line_matching', template_used=None, dtw_parameters=None)[source]

Bases: object

Wavelength calibration solution mapping pixel to wavelength.

Per data-model.md §12: Polynomial coefficients, arc line identifications, RMS residual, wavelength range.

__init__(coefficients, order, arc_frame, n_lines_identified, rms_residual, wavelength_range, poly_type='chebyshev', pixel_range=None, calibration_method='line_matching', template_used=None, dtw_parameters=None)[source]

Initialize wavelength solution.

Parameters:
  • coefficients (np.ndarray) – Polynomial coefficients

  • order (int) – Polynomial order

  • arc_frame (ArcFrame) – Source arc frame

  • n_lines_identified (int) – Number of arc lines identified

  • rms_residual (float) – RMS residual of fit in Angstroms

  • wavelength_range (tuple) – (min_wavelength, max_wavelength) in Angstroms

  • poly_type (str) – Polynomial type ('chebyshev', 'legendre', 'polynomial')

  • pixel_range (tuple, optional) – (min_pixel, max_pixel) used for normalization. If None, uses (0, 4095)

  • calibration_method (str) – Method used for calibration: 'line_matching' or 'dtw'

  • template_used (str, optional) – Name of arc template file used (for DTW method)

  • dtw_parameters (dict, optional) – DTW parameters used (e.g., peak_threshold, step_pattern)

wavelength(pixels, max_wavelength=10000.0)[source]

Evaluate wavelength at pixel positions.

Parameters:
  • pixels (np.ndarray) – Pixel positions

  • max_wavelength (float, optional) – Maximum allowed wavelength in Angstroms (default: 10000.0). Limits polynomial extrapolation to the optical range, since ground-based non-IR spectrographs do not record wavelengths beyond 10000 Å.

Returns:

Wavelengths in Angstroms (clipped to max_wavelength)

Return type:

np.ndarray
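
For the default 'chebyshev' type, evaluation can be sketched with numpy.polynomial.chebyshev.chebval. The mapping of pixel_range onto [-1, 1] is an assumption about how the normalization is done, and the function name evaluate_wavelength is hypothetical.

```python
import numpy as np
from numpy.polynomial import chebyshev


def evaluate_wavelength(pixels, coefficients, pixel_range=(0, 4095), max_wavelength=10000.0):
    """Evaluate a Chebyshev pixel-to-wavelength solution, clipped from above."""
    pixels = np.asarray(pixels, dtype=float)
    lo, hi = pixel_range
    # Assumed normalization: map pixel_range linearly onto [-1, 1].
    x = 2.0 * (pixels - lo) / (hi - lo) - 1.0
    wavelengths = chebyshev.chebval(x, coefficients)
    # Cap extrapolated values at max_wavelength.
    return np.minimum(wavelengths, max_wavelength)


# Linear solution 6000 + 2000 * x: 4000 Å at pixel 0, 8000 Å at pixel 4095.
wl = evaluate_wavelength([0.0, 2047.5, 4095.0], [6000.0, 2000.0])
# A solution that would reach 11000 Å at the red end is capped at 10000 Å.
wl_clipped = evaluate_wavelength([4095.0], [9000.0, 2000.0])
```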

inverse(wavelengths)[source]

Approximate inverse: wavelength to pixel (via interpolation).

Parameters:

wavelengths (np.ndarray) – Wavelengths in Angstroms

Returns:

Pixel positions

Return type:

np.ndarray
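
An interpolation-based inverse can be sketched by tabulating the forward solution on a pixel grid and interpolating with numpy.interp. The function name, the grid size, and the requirement that the solution be monotonic are assumptions of this sketch, not documented behavior.

```python
import numpy as np


def wavelength_to_pixel(wavelengths, pixel_to_wavelength, pixel_range=(0, 4095), n_grid=4096):
    """Approximate inverse of a monotonic pixel-to-wavelength function."""
    grid = np.linspace(pixel_range[0], pixel_range[1], n_grid)
    wave_grid = np.asarray(pixel_to_wavelength(grid), dtype=float)
    steps = np.diff(wave_grid)
    if np.all(steps < 0):
        # np.interp needs increasing x values, so flip a decreasing solution.
        wave_grid, grid = wave_grid[::-1], grid[::-1]
    elif not np.all(steps > 0):
        raise ValueError("Wavelength solution is not monotonic over pixel_range")
    # Wavelengths outside the tabulated range map to the end pixels.
    return np.interp(wavelengths, wave_grid, grid)


# Linear test solution: 1 Å per pixel starting at 4000 Å.
pixels = wavelength_to_pixel([5000.0, 4500.5], lambda p: 4000.0 + p)
```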

validate()[source]

Validate wavelength solution quality.

Returns:

True if valid

Return type:

bool

Raises:

ValueError – If RMS too high or too few lines

class pykosmospp.models.Spectrum1D[source]

Bases: object

Placeholder for 1D spectrum class (data-model.md §13)

class pykosmospp.models.QualityMetrics[source]

Bases: object

Quality metrics for reduced spectra.

Per data-model.md §14: SNR, wavelength RMS, sky residuals, cosmic ray fraction, overall quality grade.

__init__()[source]

Initialize quality metrics.

compute(spectrum_1d, spectrum_2d=None)[source]

Compute all quality metrics.

Parameters:
  • spectrum_1d (Spectrum1D) – Extracted 1D spectrum

  • spectrum_2d (Spectrum2D, optional) – Source 2D spectrum

generate_report()[source]

Generate formatted quality report.

Returns:

Formatted report string

Return type:

str

class pykosmospp.models.PipelineConfig[source]

Bases: object

Placeholder for pipeline config class (data-model.md §15)

class pykosmospp.models.ObservationSet(observation_date, target_name, bias_frames=<factory>, flat_frames=<factory>, arc_frames=<factory>, science_frames=<factory>, calibration_set=None)[source]

Bases: object

Collection of frames for a single observation sequence.

Per data-model.md §16: Groups bias, flat, arc, and science frames with methods for validation and AB pair grouping.

observation_date: datetime
target_name: str
bias_frames: List[BiasFrame]
flat_frames: List[FlatFrame]
arc_frames: List[ArcFrame]
science_frames: List[ScienceFrame]
calibration_set: Optional[CalibrationSet] = None

classmethod from_directory(input_dir, config)[source]

Create ObservationSet by discovering FITS files in directory.

Parameters:
  • input_dir (Path) – Directory with arcs/, flats/, biases/, science/ subdirectories

  • config (dict) – Pipeline configuration

Returns:

Populated observation set

Return type:

ObservationSet

group_ab_pairs(max_time_diff=600.0)[source]

Group science frames into AB nod pairs.

Per data-model.md §16: Matches by nod_position='A'/'B' or by observation time proximity (< 10 minutes).

Parameters:

max_time_diff (float, optional) – Maximum time difference in seconds (default: 600 = 10 minutes)

Returns:

List of (A_frame, B_frame) pairs

Return type:

list of tuples
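
A pairing rule of this kind can be sketched as a greedy nearest-in-time match between A and B frames. The Exposure dataclass is a hypothetical stand-in for ScienceFrame, and the greedy strategy is an assumption; the reference only states the matching criteria, not the algorithm.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Exposure:
    """Hypothetical stand-in for ScienceFrame with only the fields used here."""
    name: str
    obs_time: datetime
    nod_position: str  # 'A' or 'B'


def pair_ab_frames(frames, max_time_diff=600.0):
    """Pair each A frame with the closest unused B frame within max_time_diff seconds."""
    a_frames = sorted((f for f in frames if f.nod_position == "A"), key=lambda f: f.obs_time)
    b_frames = sorted((f for f in frames if f.nod_position == "B"), key=lambda f: f.obs_time)
    used = set()
    pairs = []
    for a in a_frames:
        best_index, best_dt = None, None
        for i, b in enumerate(b_frames):
            if i in used:
                continue
            dt = abs((b.obs_time - a.obs_time).total_seconds())
            if dt <= max_time_diff and (best_dt is None or dt < best_dt):
                best_index, best_dt = i, dt
        if best_index is not None:
            used.add(best_index)
            pairs.append((a, b_frames[best_index]))
    return pairs


t0 = datetime(2024, 1, 1, 0, 0, 0)
frames = [
    Exposure("sci_A1", t0, "A"),
    Exposure("sci_B1", t0 + timedelta(seconds=300), "B"),
    Exposure("sci_A2", t0 + timedelta(seconds=1200), "A"),
    Exposure("sci_B2", t0 + timedelta(seconds=2400), "B"),  # 20 min after sci_A2
]
pairs = pair_ab_frames(frames)
```

Here sci_A1 and sci_B1 are 5 minutes apart and form a pair, while sci_A2 has no B frame within 10 minutes and stays unpaired.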

validate_completeness()[source]

Validate observation set has required calibrations.

Returns:

True if complete

Return type:

bool

Raises:

ValueError – If required frames missing

__init__(observation_date, target_name, bias_frames=<factory>, flat_frames=<factory>, arc_frames=<factory>, science_frames=<factory>, calibration_set=None)

class pykosmospp.models.ReducedData(source_frame, spectrum_2d, spectra_1d=<factory>, diagnostic_plots=<factory>, processing_log=<factory>, reduction_timestamp=<factory>, quality_metrics=None)[source]

Bases: object

Container for fully reduced data products.

Per data-model.md §17: Contains source frame, 2D spectrum, extracted 1D spectra, diagnostic plots, processing log.

__init__(source_frame, spectrum_2d, spectra_1d=<factory>, diagnostic_plots=<factory>, processing_log=<factory>, reduction_timestamp=<factory>, quality_metrics=None)

source_frame: ScienceFrame
spectrum_2d: Spectrum2D
spectra_1d: List
diagnostic_plots: Dict[str, Path]
processing_log: List[str]
reduction_timestamp: datetime
quality_metrics: Optional[QualityMetrics] = None

save_to_disk(output_dir)[source]

Save all reduced data products to disk.

Parameters:

output_dir (Path) – Output directory

generate_summary_report()[source]

Generate summary report of reduction.

Returns:

Summary report

Return type:

str

class pykosmospp.models.InteractiveSelection[source]

Bases: object

Placeholder for interactive selection class (data-model.md §18)

class pykosmospp.models.ProcessingLog[source]

Bases: object

Placeholder for processing log class (data-model.md §19)