CLIP Latent Space — Conformity & Likelihood (ViT-L/14)

This Space operates in the CLIP ViT-L/14 latent space and computes two metrics per modality:

  1. Conformity — measures how typical (common) the sample is (based on The Double-Ellipsoid Geometry of CLIP)
  2. Log-Likelihood — measures how likely the sample is (based on Whitened CLIP as a Likelihood Surrogate of Images and Captions)

All modality means and W matrices are stored internally and loaded from w_mats/*.pt.

Data provenance
Modality means and precision matrices (W) are computed from MS-COCO features.
They are loaded from precomputed .pt files in the Space repo.


Implementation details:

  • Embeddings: openai/clip-vit-large-patch14 via 🤗 Transformers; features are L2-normalized.
  • Conformity: cosine similarity to stored modality means mu_image, mu_text.
  • Log-likelihood: -0.5 * (x-mu)^T W (x-mu) using MS-COCO-based precision W.
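
The two metrics above can be sketched in a few lines of NumPy. This is a minimal illustration, not the Space's actual code: the toy `mu_image` and `W` below are random/identity stand-ins for the MS-COCO statistics that the Space loads from w_mats/*.pt, and 768 is ViT-L/14's projected embedding dimension.

```python
import numpy as np

def conformity(x, mu):
    # Cosine similarity between the embedding and the stored modality mean.
    x = x / np.linalg.norm(x)
    mu = mu / np.linalg.norm(mu)
    return float(x @ mu)

def log_likelihood(x, mu, W):
    # -0.5 * (x - mu)^T W (x - mu): unnormalized Gaussian log-density
    # with precision matrix W.
    d = x - mu
    return -0.5 * float(d @ W @ d)

# Toy stand-ins for the stored statistics (the real ones come from w_mats/*.pt).
rng = np.random.default_rng(0)
dim = 768                       # CLIP ViT-L/14 embedding dimension
mu_image = rng.normal(size=dim) # placeholder modality mean
W = np.eye(dim)                 # placeholder precision matrix
x = rng.normal(size=dim)        # placeholder CLIP feature

print(conformity(x, mu_image), log_likelihood(x, mu_image, W))
```

With a positive semi-definite W (as a precision matrix is), the log-likelihood is always ≤ 0 and is maximized (at 0) when the embedding coincides with the modality mean.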