Ghost Analysis

sigil_watermark.ghost.spectral_analysis

Ghost signature spectral analysis.

Analyzes images for traces of an author's spectral fingerprint — the "ghost" that propagates through AI model training.

The ghost signal uses multiplicative magnitude modulation at specific frequency bands. Extraction uses spectral whitening (normalizing each coefficient by its local magnitude) to detect the ±modulation pattern regardless of image content. Because whitening divides out the image's own spectral envelope, detection stays robust on natural images with complex spectra.
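As a rough illustration of the whitening idea described above, the following one-dimensional toy divides each magnitude by its local average and then correlates the result with a ±1 pattern. It is a sketch under stated assumptions, not the library's implementation: the function names, the window radius, the modulation strength, and the synthetic "spectrum" are all hypothetical.

```python
import math
import random


def whiten(magnitudes, radius=4):
    """Divide each magnitude by the mean of its neighbourhood, then subtract 1."""
    n = len(magnitudes)
    whitened = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        local_mean = sum(magnitudes[lo:hi]) / (hi - lo)
        whitened.append(magnitudes[i] / local_mean - 1.0)
    return whitened


def normalized_correlation(values, pattern):
    """Cosine similarity between two equal-length sequences."""
    dot = sum(v * p for v, p in zip(values, pattern))
    norm = math.sqrt(sum(v * v for v in values)) * math.sqrt(sum(p * p for p in pattern))
    return dot / norm if norm else 0.0


rng = random.Random(0)
n_bins = 256
strength = 0.1  # hypothetical modulation depth

# Smoothly decaying stand-in for an image's magnitude spectrum.
base = [100.0 * math.exp(-i / 100.0) for i in range(n_bins)]
pn = [rng.choice((-1.0, 1.0)) for _ in range(n_bins)]

# Multiplicative modulation: each bin is scaled up or down by the PN sign.
marked = [b * (1.0 + strength * s) for b, s in zip(base, pn)]

corr_marked = normalized_correlation(whiten(marked), pn)
corr_clean = normalized_correlation(whiten(base), pn)
print(f"marked: {corr_marked:.3f}  clean: {corr_clean:.3f}")
```

In this toy the whitened, modulated spectrum correlates strongly with the pattern while the unmodulated one does not, even though the raw magnitudes span more than an order of magnitude across bins.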

GhostAnalysisResult dataclass

Result of ghost signature spectral analysis.

Attributes:

Name Type Description
ghost_detected bool

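True if a statistically significant ghost signal was found. For single-image analysis with analyze_ghost_signature this means correlation > 0.01 and p-value < 0.05; batch_analyze_ghost applies its own thresholds.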

correlation float

Normalized correlation between whitened spectrum and expected ghost PN pattern. Higher values indicate stronger signal.

band_energies dict[float, float]

Average spectral magnitude at each ghost frequency band, keyed by normalized frequency.

p_value float

Statistical significance under the null hypothesis of no watermark. Combined across channels via Fisher's method for RGB inputs.

ghost_hash list[int] | None

Extracted ghost hash bits (blind, no author key needed), or None if extraction failed. Used for O(1) author lookup.
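The ghost_hash attribute is described as supporting O(1) author lookup. One way that could work is to pack the bits into an integer and use it as a dictionary key. The sketch below assumes a fixed hash length and an exact-match registry; `bits_to_key`, `lookup_author`, and the registry contents are hypothetical and not part of the documented API.

```python
def bits_to_key(bits):
    """Pack a list of 0/1 values into an integer, most significant bit first."""
    key = 0
    for bit in bits:
        key = (key << 1) | (bit & 1)
    return key


# Hypothetical registry mapping packed ghost hashes to author identifiers.
registry = {
    bits_to_key([1, 0, 1, 1, 0, 0, 1, 0]): "alice",
    bits_to_key([0, 1, 1, 0, 1, 0, 0, 1]): "bob",
}


def lookup_author(ghost_hash, registry):
    """Return the registered author for a ghost hash, or None if unknown."""
    if ghost_hash is None:  # extraction failed
        return None
    return registry.get(bits_to_key(ghost_hash))


print(lookup_author([1, 0, 1, 1, 0, 0, 1, 0], registry))  # alice
```

An exact-match lookup like this does not tolerate bit errors, so a single flipped bit would miss the registry entry; a real deployment would likely need some form of error tolerance on top.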

Source code in src/sigil_watermark/ghost/spectral_analysis.py
@dataclass
class GhostAnalysisResult:
    """Result of ghost signature spectral analysis.

    Attributes:
        ghost_detected: ``True`` if a statistically significant ghost signal
            was found (correlation > 0.01 and p-value < 0.05).
        correlation: Normalized correlation between whitened spectrum and
            expected ghost PN pattern. Higher values indicate stronger signal.
        band_energies: Average spectral magnitude at each ghost frequency
            band, keyed by normalized frequency.
        p_value: Statistical significance under the null hypothesis of no
            watermark. Combined across channels via Fisher's method for
            RGB inputs.
        ghost_hash: Extracted ghost hash bits (blind, no author key needed),
            or ``None`` if extraction failed. Used for O(1) author lookup.
    """

    ghost_detected: bool
    correlation: float
    band_energies: dict[float, float]
    p_value: float
    ghost_hash: list[int] | None = None

extract_ghost_hash(image, config=DEFAULT_CONFIG)

Extract ghost hash bits from an image (blind, no key needed).

Detects multiplicative magnitude modulation at ghost frequency bands. Each hash bit's PN sign pattern produces a detectable modulation. Uses spectral whitening for robustness to natural image content.

Parameters:

Name Type Description Default
image ndarray

Grayscale (H,W) or RGB (H,W,3) image.

required
config SigilConfig

Sigil configuration.

DEFAULT_CONFIG

Returns:

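Type Description
tuple[list[int], list[float]]

(hash_bits, confidences), where hash_bits is a list of 0/1 values and confidences is a list of absolute correlation magnitudes. For RGB input, each bit is decided by majority vote across the three channels and each confidence is the maximum across channels.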

Source code in src/sigil_watermark/ghost/spectral_analysis.py
def extract_ghost_hash(
    image: np.ndarray,
    config: SigilConfig = DEFAULT_CONFIG,
) -> tuple[list[int], list[float]]:
    """Extract ghost hash bits from an image (blind, no key needed).

    Detects multiplicative magnitude modulation at ghost frequency bands.
    Each hash bit's PN sign pattern produces a detectable modulation.
    Uses spectral whitening for robustness to natural image content.

    Args:
        image: Grayscale (H,W) or RGB (H,W,3) image.
        config: Sigil configuration.

    Returns:
        (hash_bits, confidences) where hash_bits is a list of 0/1 values
        and confidences is a list of absolute correlation magnitudes.
    """
    if image.ndim == 3 and image.shape[2] == 3:
        # Multi-channel extraction with majority vote
        all_channel_bits = []
        all_channel_confs = []
        for ch in range(3):
            bits_ch, confs_ch = _extract_ghost_hash_single(image[:, :, ch], config)
            all_channel_bits.append(bits_ch)
            all_channel_confs.append(confs_ch)

        # Majority vote across channels
        bits = []
        confidences = []
        for i in range(config.ghost_hash_bits):
            votes = [cb[i] for cb in all_channel_bits]
            bits.append(1 if sum(votes) >= 2 else 0)
            confidences.append(max(cc[i] for cc in all_channel_confs))
        return bits, confidences

    return _extract_ghost_hash_single(image, config)
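To make the RGB path above concrete, here is a small worked example of the per-bit majority vote and the per-bit maximum confidence. The helper name `vote_across_channels` and the per-channel numbers are made up for illustration; only the voting rule mirrors the listing.

```python
def vote_across_channels(channel_bits, channel_confs):
    """Per-bit majority vote over channels; confidence is the per-bit maximum."""
    n_bits = len(channel_bits[0])
    bits, confidences = [], []
    for i in range(n_bits):
        votes = sum(cb[i] for cb in channel_bits)
        bits.append(1 if votes * 2 > len(channel_bits) else 0)
        confidences.append(max(cc[i] for cc in channel_confs))
    return bits, confidences


# Made-up per-channel outputs for a 4-bit hash (R, G, B).
channel_bits = [[1, 0, 1, 0], [1, 1, 0, 0], [0, 1, 1, 0]]
channel_confs = [
    [0.30, 0.05, 0.20, 0.10],
    [0.25, 0.15, 0.02, 0.12],
    [0.01, 0.18, 0.22, 0.08],
]

bits, confs = vote_across_channels(channel_bits, channel_confs)
print(bits)   # [1, 1, 1, 0]
print(confs)  # [0.3, 0.18, 0.22, 0.12]
```

Note that the confidence is taken as the maximum over all channels regardless of how each channel voted, so it can come from a channel that disagreed with the winning bit.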

analyze_ghost_signature(image, public_key, config=DEFAULT_CONFIG)

Analyze a single image for ghost signature traces.

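Uses the composite ghost PN (encoding the author's ghost hash bits) to measure correlation, and also extracts the ghost hash bits blindly. Supports both grayscale (H,W) and RGB (H,W,3) input. For RGB, each channel is analyzed independently: correlations and band energies are averaged (up to a sqrt(3) SNR improvement if channel noise is independent), p-values are combined with Fisher's method, and hash bits are decided by majority vote.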

Source code in src/sigil_watermark/ghost/spectral_analysis.py
def analyze_ghost_signature(
    image: np.ndarray,
    public_key: bytes,
    config: SigilConfig = DEFAULT_CONFIG,
) -> GhostAnalysisResult:
    """Analyze a single image for ghost signature traces.

    Uses the composite ghost PN (encoding the author's ghost hash bits)
    to measure correlation. Also extracts the ghost hash bits blindly.
    Supports both grayscale (H,W) and RGB (H,W,3) input — RGB channels
    are analyzed independently and averaged for sqrt(3) SNR improvement.
    """
    if image.ndim == 3 and image.shape[2] == 3:
        channel_results = [
            _analyze_ghost_single_channel(image[:, :, ch], public_key, config) for ch in range(3)
        ]
        avg_corr = np.mean([r.correlation for r in channel_results])
        avg_band_energies = {}
        for band in config.ghost_bands:
            avg_band_energies[band] = np.mean(
                [r.band_energies.get(band, 0) for r in channel_results]
            )
        # Fisher's method to combine per-channel p-values (exact zeros are skipped)
        p_values = [r.p_value for r in channel_results if r.p_value > 0]
        if p_values:
            from scipy.stats import chi2

            chi2_stat = -2 * sum(np.log(max(p, 1e-300)) for p in p_values)
            combined_p = float(1.0 - chi2.cdf(chi2_stat, df=2 * len(p_values)))
        else:
            combined_p = 1.0

        # Majority-vote ghost hash across channels
        all_channel_hashes = [r.ghost_hash for r in channel_results if r.ghost_hash]
        if all_channel_hashes:
            ghost_hash = []
            for bit_idx in range(config.ghost_hash_bits):
                votes = [h[bit_idx] for h in all_channel_hashes]
                ghost_hash.append(1 if sum(votes) > len(votes) / 2 else 0)
        else:
            ghost_hash = channel_results[0].ghost_hash

        ghost_detected = bool(avg_corr > 0.01 and combined_p < 0.05)
        return GhostAnalysisResult(
            ghost_detected=ghost_detected,
            correlation=float(avg_corr),
            band_energies=avg_band_energies,
            p_value=combined_p,
            ghost_hash=ghost_hash,
        )

    return _analyze_ghost_single_channel(image, public_key, config)
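The p-value combination in the listing above is Fisher's method: the statistic -2·Σ ln(p_i) is compared against a chi-square distribution with 2k degrees of freedom. The listing relies on SciPy's `chi2`; the sketch below instead uses the closed-form chi-square survival function for even degrees of freedom so that it runs with the standard library only. The function names are hypothetical, and the method assumes the p-values are independent.

```python
import math


def chi2_sf_even_df(x, df):
    """Chi-square survival function for even df, using the closed-form series."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combine(p_values, floor=1e-300):
    """Combine independent p-values with Fisher's method."""
    if not p_values:
        return 1.0
    stat = -2.0 * sum(math.log(max(p, floor)) for p in p_values)
    return chi2_sf_even_df(stat, 2 * len(p_values))


print(fisher_combine([0.05, 0.05, 0.05]))  # about 0.0063
```

Three channels that are each only borderline (p = 0.05) combine to roughly 0.0063. Because the colour channels of a natural image are usually correlated, the independence assumption may not hold exactly, and the combined value can be optimistic.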

batch_analyze_ghost(images, public_key, config=DEFAULT_CONFIG)

Analyze multiple images for collective ghost signature.

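More reliable than single-image analysis because the ghost signal is consistent across all images from the same author, while noise averages out. Per-image correlations and band energies are averaged and p-values are combined with Fisher's method; detection requires an average correlation > 0.005 and a combined p-value < 0.01. An empty list yields a non-detection with p-value 1.0, and the batch result leaves ghost_hash as None.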

Source code in src/sigil_watermark/ghost/spectral_analysis.py
def batch_analyze_ghost(
    images: list[np.ndarray],
    public_key: bytes,
    config: SigilConfig = DEFAULT_CONFIG,
) -> GhostAnalysisResult:
    """Analyze multiple images for collective ghost signature.

    More reliable than single-image analysis because the ghost signal
    is consistent across all images from the same author, while noise
    averages out.
    """
    if not images:
        return GhostAnalysisResult(
            ghost_detected=False,
            correlation=0.0,
            band_energies={},
            p_value=1.0,
        )

    results = [analyze_ghost_signature(img, public_key, config) for img in images]

    # Average correlations and band energies
    avg_corr = np.mean([r.correlation for r in results])
    avg_band_energies = {}
    for band in config.ghost_bands:
        avg_band_energies[band] = np.mean([r.band_energies.get(band, 0) for r in results])

    # Fisher's method to combine p-values
    p_values = [r.p_value for r in results if r.p_value > 0]
    if p_values:
        from scipy.stats import chi2

        chi2_stat = -2 * sum(np.log(max(p, 1e-300)) for p in p_values)
        combined_p = float(1.0 - chi2.cdf(chi2_stat, df=2 * len(p_values)))
    else:
        combined_p = 1.0

    ghost_detected = bool(avg_corr > 0.005 and combined_p < 0.01)

    return GhostAnalysisResult(
        ghost_detected=ghost_detected,
        correlation=float(avg_corr),
        band_energies=avg_band_energies,
        p_value=combined_p,
    )
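The claim that noise averages out across a batch can be illustrated with a small simulation: if every image carries the same underlying correlation plus independent noise, the spread of the batch average shrinks roughly as 1/sqrt(N). The signal level, noise level, and helper name below are hypothetical, and independent Gaussian noise is an assumption of the sketch rather than a property stated by the source.

```python
import random
import statistics


def simulate_mean_correlation(n_images, signal, noise_std, rng):
    """Average n_images noisy per-image correlations that share one signal level."""
    samples = [signal + rng.gauss(0.0, noise_std) for _ in range(n_images)]
    return sum(samples) / n_images


rng = random.Random(0)
signal, noise_std, trials = 0.01, 0.05, 2000  # hypothetical values

single = [simulate_mean_correlation(1, signal, noise_std, rng) for _ in range(trials)]
batch = [simulate_mean_correlation(16, signal, noise_std, rng) for _ in range(trials)]

ratio = statistics.stdev(batch) / statistics.stdev(single)
print(f"spread ratio (16 images vs 1): {ratio:.2f}")
```

With 16 images the spread of the averaged correlation is about a quarter of the single-image spread, which is why a weak but consistent signal becomes easier to separate from zero in a batch.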