Ghost Analysis

`sigil_watermark.ghost.spectral_analysis`

Ghost signature spectral analysis.

Analyzes images for traces of an author's spectral fingerprint — the "ghost" that propagates through AI model training.

The ghost signal uses multiplicative magnitude modulation at specific frequency bands. Extraction applies spectral whitening (normalizing each bin by its local magnitude) so that the ±modulation pattern can be detected regardless of image content. This keeps detection robust on natural images with complex spectra.
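The whitening idea can be illustrated with a toy NumPy sketch. It is not the library's implementation: the helper names (`hermitian_signs`, `band_mask`, `whitened_correlation`), the 9×9 box window, the 0.1 to 0.3 band, and the exaggerated modulation depth `alpha=0.2` are all assumptions chosen for a clear demonstration on a white-noise image.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def hermitian_signs(shape, rng):
    """Random +/-1 pattern with s[-k] == s[k], so that scaling a real
    image's spectrum by it keeps the image real (hypothetical helper)."""
    r = rng.standard_normal(shape)
    mirrored = np.roll(r[::-1, ::-1], 1, axis=(0, 1))  # r at the negated frequency
    return np.sign(r + mirrored)


def band_mask(shape, lo, hi):
    """Boolean mask of FFT bins whose normalized radial frequency is in (lo, hi)."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    radius = np.hypot(fy, fx)
    return (radius > lo) & (radius < hi)


def whitened_correlation(image, signs, mask, window=9):
    """Normalized correlation between the whitened magnitude spectrum and signs."""
    mag = np.abs(np.fft.fft2(image))
    pad = window // 2
    padded = np.pad(mag, pad, mode="wrap")  # the DFT is periodic, so wrap around
    local_mean = sliding_window_view(padded, (window, window)).mean(axis=(-2, -1))
    whitened = mag / (local_mean + 1e-12) - 1.0  # close to zero-mean without a mark
    w = whitened[mask] - whitened[mask].mean()
    s = signs[mask]
    return float(w @ s / (np.linalg.norm(w) * np.linalg.norm(s)))


def demo(alpha=0.2, size=128, seed=0):
    """Return (correlation on clean image, correlation on modulated image)."""
    rng = np.random.default_rng(seed)
    image = rng.standard_normal((size, size))
    signs = hermitian_signs(image.shape, rng)
    mask = band_mask(image.shape, 0.1, 0.3)
    # Multiplicative modulation: |F| -> |F| * (1 + alpha * sign) inside the band.
    marked = np.fft.ifft2(np.fft.fft2(image) * (1.0 + alpha * signs * mask)).real
    return (whitened_correlation(image, signs, mask),
            whitened_correlation(marked, signs, mask))
```

Dividing by the local mean removes the slowly varying spectral envelope, so the remaining ratio mostly carries the per-bin ± pattern. In this toy setup the correlation for the clean image should sit near zero, while the modulated image should correlate clearly with the sign pattern.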

`GhostAnalysisResult` `dataclass`

Result of ghost signature spectral analysis.

Attributes:

Name	Type	Description
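`ghost_detected`	`bool`	`True` if a statistically significant ghost signal was found. Single-image analysis requires correlation > 0.01 and p-value < 0.05; `batch_analyze_ghost` requires average correlation > 0.005 and combined p-value < 0.01.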
`correlation`	`float`	Normalized correlation between whitened spectrum and expected ghost PN pattern. Higher values indicate stronger signal.
`band_energies`	`dict[float, float]`	Average spectral magnitude at each ghost frequency band, keyed by normalized frequency.
`p_value`	`float`	Statistical significance under the null hypothesis of no watermark. Combined via Fisher's method across channels for RGB inputs, and across images in `batch_analyze_ghost`.
`ghost_hash`	`list[int] \| None`	Extracted ghost hash bits (blind, no author key needed), or `None` if extraction failed. Used for O(1) author lookup. `batch_analyze_ghost` leaves this field at its default of `None`.

Source code in src/sigil_watermark/ghost/spectral_analysis.py

@dataclass
class GhostAnalysisResult:
    """Result of ghost signature spectral analysis.

    Attributes:
        ghost_detected: ``True`` if a statistically significant ghost signal
            was found (correlation > 0.01 and p-value < 0.05).
        correlation: Normalized correlation between whitened spectrum and
            expected ghost PN pattern. Higher values indicate stronger signal.
        band_energies: Average spectral magnitude at each ghost frequency
            band, keyed by normalized frequency.
        p_value: Statistical significance under the null hypothesis of no
            watermark. Combined across channels via Fisher's method for
            RGB inputs.
        ghost_hash: Extracted ghost hash bits (blind, no author key needed),
            or ``None`` if extraction failed. Used for O(1) author lookup.
    """

    ghost_detected: bool
    correlation: float
    band_energies: dict[float, float]
    p_value: float
    ghost_hash: list[int] | None = None

`extract_ghost_hash(image, config=DEFAULT_CONFIG)`

Extract ghost hash bits from an image (blind, no key needed).

Detects multiplicative magnitude modulation at ghost frequency bands. Each hash bit's PN sign pattern produces a detectable modulation. Uses spectral whitening for robustness to natural image content.

Parameters:

Name	Type	Description	Default
`image`	`ndarray`	Grayscale (H,W) or RGB (H,W,3) image.	required
`config`	`SigilConfig`	Sigil configuration.	`DEFAULT_CONFIG`

Returns:

Type	Description
`tuple[list[int], list[float]]`	`(hash_bits, confidences)`, where `hash_bits` is a list of 0/1 values and `confidences` is a list of absolute correlation magnitudes.

Source code in src/sigil_watermark/ghost/spectral_analysis.py

def extract_ghost_hash(
    image: np.ndarray,
    config: SigilConfig = DEFAULT_CONFIG,
) -> tuple[list[int], list[float]]:
    """Extract ghost hash bits from an image (blind, no key needed).

    Detects multiplicative magnitude modulation at ghost frequency bands.
    Each hash bit's PN sign pattern produces a detectable modulation.
    Uses spectral whitening for robustness to natural image content.

    Args:
        image: Grayscale (H,W) or RGB (H,W,3) image.
        config: Sigil configuration.

    Returns:
        (hash_bits, confidences) where hash_bits is a list of 0/1 values
        and confidences is a list of absolute correlation magnitudes.
    """
    if image.ndim == 3 and image.shape[2] == 3:
        # Multi-channel extraction with majority vote
        all_channel_bits = []
        all_channel_confs = []
        for ch in range(3):
            bits_ch, confs_ch = _extract_ghost_hash_single(image[:, :, ch], config)
            all_channel_bits.append(bits_ch)
            all_channel_confs.append(confs_ch)

        # Majority vote across channels
        bits = []
        confidences = []
        for i in range(config.ghost_hash_bits):
            votes = [cb[i] for cb in all_channel_bits]
            bits.append(1 if sum(votes) >= 2 else 0)
            confidences.append(max(cc[i] for cc in all_channel_confs))
        return bits, confidences

    return _extract_ghost_hash_single(image, config)
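The RGB branch above can be reproduced in isolation to see how per-channel results combine. The following is a standalone sketch; `majority_vote` is a hypothetical helper written for illustration, not part of `sigil_watermark`, and the input values are made up.

```python
def majority_vote(channel_bits, channel_confs):
    """Combine per-channel bit decisions by majority vote.

    channel_bits:  one list of 0/1 values per channel.
    channel_confs: one list of confidences per channel, same layout.
    Returns (bits, confidences), keeping the largest confidence seen
    for each bit position, as the listing above does.
    """
    n_channels = len(channel_bits)
    n_bits = len(channel_bits[0])
    bits, confs = [], []
    for i in range(n_bits):
        ones = sum(ch[i] for ch in channel_bits)
        bits.append(1 if 2 * ones > n_channels else 0)  # strict majority of ones
        confs.append(max(c[i] for c in channel_confs))
    return bits, confs


# Made-up per-channel outputs for a 4-bit hash.
example_bits = [[1, 0, 1, 1], [1, 1, 0, 1], [0, 0, 1, 1]]
example_confs = [[0.9, 0.2, 0.4, 0.8], [0.7, 0.6, 0.1, 0.5], [0.3, 0.1, 0.5, 0.9]]
voted_bits, voted_confs = majority_vote(example_bits, example_confs)
```

Because the confidence is the maximum over all channels, it can come from a channel that was outvoted. In the example, bit 1 is decided as 0, yet its reported confidence of 0.6 belongs to the one channel that voted 1, so the confidence is best read as "strongest response at this bit" rather than "confidence in the returned value".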

`analyze_ghost_signature(image, public_key, config=DEFAULT_CONFIG)`

Analyze a single image for ghost signature traces.

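Uses the composite ghost PN (encoding the author's ghost hash bits) to measure correlation. Also extracts the ghost hash bits blindly. Supports both grayscale (H,W) and RGB (H,W,3) input. For RGB, each channel is analyzed independently: correlations and band energies are averaged (a sqrt(3) SNR improvement when channel noise is independent), p-values are combined with Fisher's method, and hash bits are decided by majority vote across channels.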

Source code in src/sigil_watermark/ghost/spectral_analysis.py

def analyze_ghost_signature(
    image: np.ndarray,
    public_key: bytes,
    config: SigilConfig = DEFAULT_CONFIG,
) -> GhostAnalysisResult:
    """Analyze a single image for ghost signature traces.

    Uses the composite ghost PN (encoding the author's ghost hash bits)
    to measure correlation. Also extracts the ghost hash bits blindly.
    Supports both grayscale (H,W) and RGB (H,W,3) input — RGB channels
    are analyzed independently and averaged for sqrt(3) SNR improvement.
    """
    if image.ndim == 3 and image.shape[2] == 3:
        channel_results = [
            _analyze_ghost_single_channel(image[:, :, ch], public_key, config) for ch in range(3)
        ]
        avg_corr = np.mean([r.correlation for r in channel_results])
        avg_band_energies = {}
        for band in config.ghost_bands:
            avg_band_energies[band] = np.mean(
                [r.band_energies.get(band, 0) for r in channel_results]
            )
        p_values = [r.p_value for r in channel_results if r.p_value > 0]
        if p_values:
            from scipy.stats import chi2

            chi2_stat = -2 * sum(np.log(max(p, 1e-300)) for p in p_values)
            combined_p = float(1.0 - chi2.cdf(chi2_stat, df=2 * len(p_values)))
        else:
            combined_p = 1.0

        # Majority-vote ghost hash across channels
        all_channel_hashes = [r.ghost_hash for r in channel_results if r.ghost_hash]
        if all_channel_hashes:
            ghost_hash = []
            for bit_idx in range(config.ghost_hash_bits):
                votes = [h[bit_idx] for h in all_channel_hashes]
                ghost_hash.append(1 if sum(votes) > len(votes) / 2 else 0)
        else:
            ghost_hash = channel_results[0].ghost_hash

        ghost_detected = bool(avg_corr > 0.01 and combined_p < 0.05)
        return GhostAnalysisResult(
            ghost_detected=ghost_detected,
            correlation=float(avg_corr),
            band_energies=avg_band_energies,
            p_value=combined_p,
            ghost_hash=ghost_hash,
        )

    return _analyze_ghost_single_channel(image, public_key, config)
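The p-value combination in the RGB branch is Fisher's method: under the null hypothesis, `-2 * sum(log(p_i))` follows a chi-squared distribution with `2k` degrees of freedom for `k` independent p-values. The sketch below reproduces that step without SciPy, using the closed-form survival function available for even degrees of freedom. The names `chi2_sf_even` and `fisher_combine` are illustrative, not library functions.

```python
import math


def chi2_sf_even(x, df):
    """Survival function P(X > x) of a chi-squared variable with even df.

    For df = 2k: exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


def fisher_combine(p_values, floor=1e-300):
    """Combine independent p-values with Fisher's method."""
    if not p_values:
        return 1.0
    stat = -2.0 * sum(math.log(max(p, floor)) for p in p_values)
    return chi2_sf_even(stat, 2 * len(p_values))
```

A single p-value passes through unchanged (`fisher_combine([0.5])` is 0.5), while three channels that each reach only p = 0.05 combine to roughly 0.0063. The listing above also filters out p-values that are not strictly positive before combining, so a channel reporting exactly 0 does not contribute to the statistic there.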

`batch_analyze_ghost(images, public_key, config=DEFAULT_CONFIG)`

Analyze multiple images for collective ghost signature.

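More reliable than single-image analysis because the ghost signal is consistent across all images from the same author, while noise averages out. Per-image correlations and band energies are averaged, per-image p-values are combined with Fisher's method, and detection requires an average correlation > 0.005 and a combined p-value < 0.01. An empty image list yields a negative result with correlation 0.0 and p-value 1.0.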

Source code in src/sigil_watermark/ghost/spectral_analysis.py

def batch_analyze_ghost(
    images: list[np.ndarray],
    public_key: bytes,
    config: SigilConfig = DEFAULT_CONFIG,
) -> GhostAnalysisResult:
    """Analyze multiple images for collective ghost signature.

    More reliable than single-image analysis because the ghost signal
    is consistent across all images from the same author, while noise
    averages out.
    """
    if not images:
        return GhostAnalysisResult(
            ghost_detected=False,
            correlation=0.0,
            band_energies={},
            p_value=1.0,
        )

    results = [analyze_ghost_signature(img, public_key, config) for img in images]

    # Average correlations and band energies
    avg_corr = np.mean([r.correlation for r in results])
    avg_band_energies = {}
    for band in config.ghost_bands:
        avg_band_energies[band] = np.mean([r.band_energies.get(band, 0) for r in results])

    # Fisher's method to combine p-values
    p_values = [r.p_value for r in results if r.p_value > 0]
    if p_values:
        from scipy.stats import chi2

        chi2_stat = -2 * sum(np.log(max(p, 1e-300)) for p in p_values)
        combined_p = float(1.0 - chi2.cdf(chi2_stat, df=2 * len(p_values)))
    else:
        combined_p = 1.0

    ghost_detected = bool(avg_corr > 0.005 and combined_p < 0.01)

    return GhostAnalysisResult(
        ghost_detected=ghost_detected,
        correlation=float(avg_corr),
        band_energies=avg_band_energies,
        p_value=combined_p,
    )
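The claim that noise averages out can be checked with a small simulation. This is a sketch under a simplifying assumption that is not taken from the library: each image's correlation is modeled as a fixed signal plus independent Gaussian noise. The function name and the default values for `signal` and `noise_std` are arbitrary.

```python
import numpy as np


def averaged_correlation_spread(n_images, signal=0.02, noise_std=0.05,
                                trials=20000, seed=0):
    """Simulate batch-averaged correlations.

    Returns (mean, standard deviation) of the batch average over many
    simulated batches of n_images images each.
    """
    rng = np.random.default_rng(seed)
    per_image = signal + noise_std * rng.standard_normal((trials, n_images))
    batch_mean = per_image.mean(axis=1)  # one averaged correlation per batch
    return float(batch_mean.mean()), float(batch_mean.std())


mean_1, spread_1 = averaged_correlation_spread(1)
mean_16, spread_16 = averaged_correlation_spread(16)
```

Under this model the average stays at the signal level while its spread shrinks by about `1 / sqrt(n_images)`, so 16 images give roughly a fourfold reduction. The same reasoning with three independent channels gives the sqrt(3) factor quoted for RGB input. Correlated noise, for example near-duplicate images, would reduce the gain.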
