# 🌌 SpecCLIP: Aligning and Translating Spectroscopic Measurements for Stars

[![arXiv](https://img.shields.io/badge/arXiv-2507.01939-b31b1b.svg)](https://arxiv.org/abs/2507.01939)
[![GitHub](https://img.shields.io/badge/GitHub-Repo-black)](https://github.com/Xiaosheng-Zhao/SpecCLIP)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://github.com/Xiaosheng-Zhao/SpecCLIP/blob/main/LICENSE)

**SpecCLIP** is a contrastive + domain-preserving foundation model designed to align **LAMOST LRS** spectra with **Gaia XP** spectrophotometric data. It learns a **general-purpose spectral embedding (768-dim)** that supports:

* **Stellar parameter estimation**
* **Cross-survey spectral translation** (LAMOST LRS ⟷ Gaia XP)
* **Similarity retrieval** across LAMOST LRS and Gaia XP spectra

For full documentation, installation instructions, examples, and end-to-end usage, please visit the **GitHub repository**:

👉 [https://github.com/Xiaosheng-Zhao/SpecCLIP](https://github.com/Xiaosheng-Zhao/SpecCLIP)

---

## 🔧 Available Models

The following pretrained weights are included in this model repository:

| File                                         | Description                           | Embedding Dim | Parameters |
| -------------------------------------------- | ------------------------------------- | ------------- | ---------- |
| `encoders/lrs_encoder.ckpt`                  | LAMOST LRS masked transformer encoder | 768           | 43M        |
| `encoders/xp_encoder.ckpt`                   | Gaia XP masked transformer encoder    | 768           | 43M        |
| `encoders/xp_encoder_mlp.ckpt`               | Gaia XP autoencoder (MLP head)        | 768           | 43M        |
| `specclip/specclip_model_base.ckpt`          | Gaia XP ⟷ LAMOST contrastive          | 768           | 100M       |
| `specclip/specclip_model_predrecon_mlp.ckpt` | CLIP alignment + pred+recon           | 768           | 168M       |
| `specclip/specclip_model_split_mlp.ckpt`     | CLIP alignment + split pred/recon     | 768           | 126M       |

---

## 🧠 What the Model Does

SpecCLIP consists of:

* **Two masked transformer encoders**
  * LAMOST LRS
  * Gaia XP
* **Contrastive alignment loss (CLIP-style)**
* **Domain-preserving prediction & reconstruction heads**
* **Cross-modal decoder** for spectrum translation

It produces **shared embeddings** enabling multi-survey astrophysical analysis.

---

## 📄 Full Documentation

To keep the Hugging Face card concise, **all detailed instructions**, including:

* Installation
* Parameter prediction
* Spectral translation
* Retrieval
* Full examples (Python + figures)
* Acknowledgments

are available at the GitHub repo:

👉 **[https://github.com/Xiaosheng-Zhao/SpecCLIP](https://github.com/Xiaosheng-Zhao/SpecCLIP)**

---

## 📊 Citation

```bibtex
@ARTICLE{2025arXiv250701939Z,
       author = {{Zhao}, Xiaosheng and {Huang}, Yang and {Xue}, Guirong and {Kong}, Xiao and {Liu}, Jifeng and {Tang}, Xiaoyu and {Beers}, Timothy C. and {Ting}, Yuan-Sen and {Luo}, A-Li},
        title = "{SpecCLIP: Aligning and Translating Spectroscopic Measurements for Stars}",
      journal = {arXiv e-prints},
     keywords = {Instrumentation and Methods for Astrophysics, Solar and Stellar Astrophysics, Artificial Intelligence, Machine Learning},
         year = 2025,
        month = jul,
          eid = {arXiv:2507.01939},
        pages = {arXiv:2507.01939},
          doi = {10.48550/arXiv.2507.01939},
archivePrefix = {arXiv},
       eprint = {2507.01939},
 primaryClass = {astro-ph.IM},
}
```

---

## 📬 Contact

* GitHub Issues: [https://github.com/Xiaosheng-Zhao/SpecCLIP/issues](https://github.com/Xiaosheng-Zhao/SpecCLIP/issues)
* Email: [xzhao113@jh.edu](mailto:xzhao113@jh.edu)
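
---

## 🧪 Illustrative Sketch: Contrastive Alignment and Retrieval

The CLIP-style contrastive alignment and the similarity retrieval mentioned above can be sketched in a few lines of plain Python. This is only a conceptual illustration, not the SpecCLIP implementation: the function names (`clip_loss`, `retrieve`, and the helpers) are hypothetical, the temperature of `0.07` is an assumed value borrowed from common CLIP setups, and the toy embeddings are 3-dimensional rather than 768-dimensional. For the actual API, see the GitHub repository.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def similarity_matrix(a, b):
    """Cosine similarity between every row of `a` and every row of `b`."""
    a_n = [l2_normalize(row) for row in a]
    b_n = [l2_normalize(row) for row in b]
    return [[sum(x * y for x, y in zip(ra, rb)) for rb in b_n] for ra in a_n]


def _cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def clip_loss(lrs_emb, xp_emb, temperature=0.07):
    """Symmetric contrastive (InfoNCE) loss over paired embeddings.

    Row i of `lrs_emb` and row i of `xp_emb` are assumed to describe the
    same star. The temperature is an assumed value, not SpecCLIP's.
    """
    sim = similarity_matrix(lrs_emb, xp_emb)
    logits = [[s / temperature for s in row] for row in sim]
    logits_t = [list(col) for col in zip(*logits)]  # XP -> LRS direction
    return 0.5 * (_cross_entropy_diag(logits) + _cross_entropy_diag(logits_t))


def retrieve(query, gallery, k=1):
    """Indices of the k gallery embeddings most similar to `query`."""
    sims = similarity_matrix([query], gallery)[0]
    return sorted(range(len(sims)), key=lambda j: sims[j], reverse=True)[:k]


# Toy paired embeddings standing in for LAMOST LRS and Gaia XP encoder outputs.
lrs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
xp = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]]

aligned = clip_loss(lrs, xp)          # matching pairs on the diagonal
shuffled = clip_loss(lrs, xp[::-1])   # pairs deliberately mismatched
best_match = retrieve(lrs[0], xp, k=1)
```

With correctly paired inputs the loss is small, and it grows when the pairing is shuffled, which is the signal that pulls the two surveys into a shared embedding space. Retrieval then reduces to ranking gallery embeddings by cosine similarity to a query embedding.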