# 🌌 SpecCLIP: Aligning and Translating Spectroscopic Measurements for Stars

[![arXiv](https://img.shields.io/badge/arXiv-2507.01939-b31b1b.svg)](https://arxiv.org/abs/2507.01939)
[![GitHub](https://img.shields.io/badge/GitHub-Repo-black)](https://github.com/Xiaosheng-Zhao/SpecCLIP)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://github.com/Xiaosheng-Zhao/SpecCLIP/blob/main/LICENSE)

**SpecCLIP** is a foundation model that combines contrastive alignment with domain-preserving objectives to align **LAMOST LRS** spectra with **Gaia XP** spectrophotometric data.
It learns a **general-purpose, 768-dimensional spectral embedding** that supports:

* **Stellar parameter estimation**
* **Cross-survey spectral translation** (LAMOST LRS ⟷ Gaia XP)
* **Similarity retrieval** across LAMOST LRS and Gaia XP spectra
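
As a rough illustration of similarity retrieval, the sketch below ranks a gallery of embeddings by cosine similarity to a query embedding. It assumes the 768-dimensional embeddings have already been computed and stored as NumPy arrays. `top_k_matches` is a hypothetical helper, not part of the SpecCLIP codebase, and the random arrays only stand in for real embeddings.

```python
import numpy as np


def top_k_matches(query, gallery, k=5):
    """Return indices and cosine similarities of the k gallery rows closest to query.

    query:   array of shape (D,), e.g. one LAMOST LRS embedding
    gallery: array of shape (N, D), e.g. N Gaia XP embeddings
    """
    # Normalize so that the dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    scores = g @ q
    # Sort by descending similarity and keep the first k entries.
    order = np.argsort(-scores)[:k]
    return order, scores[order]


# Placeholder data: random vectors instead of real SpecCLIP embeddings.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(100, 768))
query = gallery[17] + 0.01 * rng.normal(size=768)  # noisy copy of row 17
indices, similarities = top_k_matches(query, gallery, k=3)
print(indices, similarities)
```

Because the LRS and XP encoders share one embedding space, the same ranking could in principle be run with a query from one survey against a gallery from the other.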

For full documentation, installation instructions, examples, and end-to-end usage, please visit the **GitHub repository**:
👉 [https://github.com/Xiaosheng-Zhao/SpecCLIP](https://github.com/Xiaosheng-Zhao/SpecCLIP)

---

## 🔧 Available Models

The following pretrained weights are included in this model repository:

| File                                         | Description                                      | Embedding Dim | Parameters |
| -------------------------------------------- | ------------------------------------------------ | ------------- | ---------- |
| `encoders/lrs_encoder.ckpt`                  | LAMOST LRS masked transformer encoder            | 768           | 43M        |
| `encoders/xp_encoder.ckpt`                   | Gaia XP masked transformer encoder               | 768           | 43M        |
| `encoders/xp_encoder_mlp.ckpt`               | Gaia XP autoencoder (MLP head)                   | 768           | 43M        |
| `specclip/specclip_model_base.ckpt`          | Gaia XP ⟷ LAMOST LRS contrastive alignment       | 768           | 100M       |
| `specclip/specclip_model_predrecon_mlp.ckpt` | CLIP alignment + prediction and reconstruction   | 768           | 168M       |
| `specclip/specclip_model_split_mlp.ckpt`     | CLIP alignment + split prediction/reconstruction | 768           | 126M       |
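
One way to fetch a single file from the table is through `huggingface_hub`. The sketch below is a minimal example under stated assumptions: the short names in `CHECKPOINTS` and both helper functions are hypothetical conveniences, and `repo_id` must be supplied by the caller because this card does not state the repository identifier. Only `hf_hub_download` is a real library call.

```python
# In-repo paths copied from the table above; the short keys are hypothetical aliases.
CHECKPOINTS = {
    "lrs_encoder": "encoders/lrs_encoder.ckpt",
    "xp_encoder": "encoders/xp_encoder.ckpt",
    "xp_encoder_mlp": "encoders/xp_encoder_mlp.ckpt",
    "specclip_base": "specclip/specclip_model_base.ckpt",
    "specclip_predrecon_mlp": "specclip/specclip_model_predrecon_mlp.ckpt",
    "specclip_split_mlp": "specclip/specclip_model_split_mlp.ckpt",
}


def checkpoint_path(name):
    """Map a short alias to the file path inside the model repository."""
    if name not in CHECKPOINTS:
        raise ValueError(f"unknown checkpoint {name!r}; choose from {sorted(CHECKPOINTS)}")
    return CHECKPOINTS[name]


def fetch_checkpoint(repo_id, name):
    """Download one checkpoint and return its path in the local cache.

    repo_id is the Hugging Face identifier of this model repository
    (an assumption: it is not given in this card).
    """
    # Imported lazily so the lookup above works without the dependency installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=checkpoint_path(name))
```

For example, `fetch_checkpoint(repo_id, "specclip_base")` would return the local path of the cached contrastive model file. Loading the checkpoint into a model is covered in the GitHub repository.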

---

## 🧠 What the Model Does

SpecCLIP consists of:

* **Two masked transformer encoders**
  * LAMOST LRS
  * Gaia XP
* **Contrastive alignment loss (CLIP-style)**
* **Domain-preserving prediction & reconstruction heads**
* **Cross-modal decoder** for spectrum translation

It produces a **shared embedding space** for LAMOST LRS and Gaia XP spectra, enabling multi-survey astrophysical analysis.
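
To make the CLIP-style alignment term concrete, here is a minimal NumPy sketch of a symmetric contrastive loss over a batch of paired embeddings. It is a generic formulation of the technique, not the SpecCLIP training code: the function names are hypothetical, the temperature value is a placeholder, and the prediction and reconstruction terms are omitted.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length, guarding against division by zero."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def clip_loss(lrs_emb, xp_emb, temperature=0.07):
    """Symmetric contrastive loss for N paired embeddings of shape (N, D).

    Row i of lrs_emb and row i of xp_emb are assumed to describe the same star.
    """
    lrs = l2_normalize(lrs_emb)
    xp = l2_normalize(xp_emb)
    # Pairwise cosine similarities, sharpened by the temperature.
    logits = lrs @ xp.T / temperature
    idx = np.arange(logits.shape[0])
    # Cross-entropy with the matching pair (the diagonal) as the target,
    # computed in both directions and averaged.
    loss_lrs_to_xp = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_xp_to_lrs = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_lrs_to_xp + loss_xp_to_lrs)
```

The loss is small when each spectrum's embedding is closest to its own counterpart from the other survey and grows when pairs are mismatched, which is what pulls the two encoders into a common space.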

---

## 📄 Full Documentation

To keep the Hugging Face card concise, **all detailed instructions**, including:

* Installation
* Parameter prediction
* Spectral translation
* Retrieval
* Full examples (Python + figures)
* Acknowledgments

are available at the GitHub repo:

👉 **[https://github.com/Xiaosheng-Zhao/SpecCLIP](https://github.com/Xiaosheng-Zhao/SpecCLIP)**

---

## 📊 Citation

```bibtex
@ARTICLE{2025arXiv250701939Z,
       author = {{Zhao}, Xiaosheng and {Huang}, Yang and {Xue}, Guirong and {Kong}, Xiao and
                 {Liu}, Jifeng and {Tang}, Xiaoyu and {Beers}, Timothy C. and
                 {Ting}, Yuan-Sen and {Luo}, A-Li},
        title = "{SpecCLIP: Aligning and Translating Spectroscopic Measurements for Stars}",
      journal = {arXiv e-prints},
     keywords = {Instrumentation and Methods for Astrophysics, Solar and Stellar Astrophysics,
                 Artificial Intelligence, Machine Learning},
         year = 2025,
        month = jul,
          eid = {arXiv:2507.01939},
        pages = {arXiv:2507.01939},
          doi = {10.48550/arXiv.2507.01939},
archivePrefix = {arXiv},
       eprint = {2507.01939},
 primaryClass = {astro-ph.IM},
}
```

---

## 📬 Contact

* GitHub Issues: [https://github.com/Xiaosheng-Zhao/SpecCLIP/issues](https://github.com/Xiaosheng-Zhao/SpecCLIP/issues)
* Email: [xzhao113@jh.edu](mailto:xzhao113@jh.edu)