CrossEncoder based on BAAI/bge-reranker-v2-m3

This is a Cross Encoder model fine-tuned from BAAI/bge-reranker-v2-m3 on the basalam-query-triplet-grmma27-1_m dataset using the sentence-transformers library. It computes relevance scores for pairs of texts, which can be used for text reranking and semantic search.

Model Details

Model Description

  • Model Type: Cross Encoder
  • Base model: BAAI/bge-reranker-v2-m3
  • Maximum Sequence Length: 8192 tokens
  • Number of Output Labels: 1 label
  • Training Dataset: basalam-query-triplet-grmma27-1_m

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: sentence-transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: https://huggingface.co/mjaliz/bge-reranker-finetuned-basalam-query-triplet-grmma27-1M-beir

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference:

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("mjaliz/bge-reranker-finetuned-basalam-query-triplet-grmma27-1M-beir")
# Get scores for pairs of texts
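# Each pair is a Persian (search query, product title) example;
# e.g. 'کیف بیسیک' ≈ "basic bag", 'کیف دستی زنانه ساده' ≈ "simple women's handbag"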
pairs = [
    ['کیف بیسیک', 'کیف دستی زنانه ساده'],
    ['کتاب راس الحسین', 'کتاب داستان راس الحسین'],
    ['بلوز گپ بافت موهر', 'بلوز بافتنی موهر گپ زنانه'],
    ['پشت گردنی سفر', 'پشت گردنی مسافرتی بادی'],
    ['فروشگاه ایفون', 'خرید گوشی آیفون'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'کیف بیسیک',
    [
        'کیف دستی زنانه ساده',
        'کتاب داستان راس الحسین',
        'بلوز بافتنی موهر گپ زنانه',
        'پشت گردنی مسافرتی بادی',
        'خرید گوشی آیفون',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
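
To display the reranked documents rather than raw indices, each corpus_id can be mapped back into the candidate list; a small illustrative continuation of the example above:

documents = [
    'کیف دستی زنانه ساده',
    'کتاب داستان راس الحسین',
    'بلوز بافتنی موهر گپ زنانه',
    'پشت گردنی مسافرتی بادی',
    'خرید گوشی آیفون',
]
ranks = model.rank('کیف بیسیک', documents)
for entry in ranks:
    # entry['corpus_id'] indexes into `documents`; entry['score'] is the relevance score
    print(f"{entry['score']:.4f}\t{documents[entry['corpus_id']]}")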

Evaluation

Metrics

Cross Encoder Reranking

Reranking performance on the two evaluation sets, custom_beir and eval_reranking:

Metric    custom_beir   eval_reranking
map       0.6741        0.971
mrr@10    0.751         0.971
ndcg@10   0.7177        0.9834
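
These numbers are the kind reported by sentence-transformers' CrossEncoderRerankingEvaluator. A minimal sketch of how such metrics can be computed, assuming a hypothetical eval_samples list (the card does not ship the actual evaluation data):

from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.evaluation import CrossEncoderRerankingEvaluator

model = CrossEncoder("mjaliz/bge-reranker-finetuned-basalam-query-triplet-grmma27-1M-beir")

# Hypothetical samples: each entry pairs a query with its known positive
# documents and a pool of negatives to be reranked
eval_samples = [
    {
        "query": "کیف بیسیک",
        "positive": ["کیف دستی زنانه ساده"],
        "negative": ["کیس کامپیوتر", "خرید گوشی آیفون"],
    },
    # ... more samples
]

evaluator = CrossEncoderRerankingEvaluator(samples=eval_samples, at_k=10, name="eval_reranking")
results = evaluator(model)
print(results)  # contains map, mrr@10, and ndcg@10 keyed by the evaluator name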

Training Details

Training Dataset

basalam-query-triplet-grmma27-1_m

  • Dataset: basalam-query-triplet-grmma27-1_m at 5a80579
  • Size: 991,734 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    • anchor: string; min: 3, mean: 17.09, max: 74 characters
    • positive: string; min: 8, mean: 25.5, max: 76 characters
    • negative: string; min: 7, mean: 19.37, max: 55 characters
  • Samples (Persian; English glosses added):
    • anchor: کیف بیسیک ("basic bag") | positive: کیف دستی زنانه ساده ("simple women's handbag") | negative: کیس کامپیوتر ("computer case")
    • anchor: کتاب راس الحسین ("Ra's al-Husayn book") | positive: کتاب داستان راس الحسین ("Ra's al-Husayn story book") | negative: کتاب شعر راس الحسین ("Ra's al-Husayn poetry book")
    • anchor: بلوز گپ بافت موهر ("Gap mohair knit blouse") | positive: بلوز بافتنی موهر گپ زنانه ("women's Gap mohair knit blouse") | negative: شال بافت موهر گپ ("Gap mohair knit shawl")
  • Loss: CachedMultipleNegativesRankingLoss with these parameters (a construction sketch follows this list):
    {
        "scale": 20.0,
        "num_negatives": 4,
        "activation_fn": "torch.nn.modules.activation.Sigmoid",
        "mini_batch_size": 32
    }
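
A minimal sketch of constructing this loss in sentence-transformers (v4+), mirroring the parameters listed above; the CrossEncoder initialization is an assumption, not the card's exact training code:

import torch
from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.losses import CachedMultipleNegativesRankingLoss

# Assumed initialization of the model being fine-tuned
model = CrossEncoder("BAAI/bge-reranker-v2-m3", num_labels=1)

# In-batch-negatives ranking loss with gradient caching, mirroring the
# parameters listed in this card
loss = CachedMultipleNegativesRankingLoss(
    model=model,
    num_negatives=4,                    # negatives sampled per anchor
    scale=20.0,                         # scaling applied to the activated scores
    activation_fn=torch.nn.Sigmoid(),   # activation applied to the raw logits
    mini_batch_size=32,                 # chunk size for the cached forward passes
)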
    

Training Hyperparameters

Non-Default Hyperparameters

  • overwrite_output_dir: True
  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • gradient_accumulation_steps: 2
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • num_train_epochs: 4
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • tf32: True
  • dataloader_num_workers: 4
  • load_best_model_at_end: True
  • push_to_hub: True
  • hub_model_id: mjaliz/bge-reranker-finetuned-basalam-query-triplet-grmma27-1M-beir
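
A minimal end-to-end sketch wiring the non-default hyperparameters above into sentence-transformers' CrossEncoderTrainer, reusing the model and loss from the loss sketch earlier. The dataset hub path mjaliz/basalam-query-triplet-grmma27-1_m is an assumption (the card names the dataset but not its full hub ID), and the evaluator wiring is omitted:

from datasets import load_dataset
from sentence_transformers.cross_encoder import (
    CrossEncoderTrainer,
    CrossEncoderTrainingArguments,
)

# Assumed hub path for the dataset named in this card
train_dataset = load_dataset("mjaliz/basalam-query-triplet-grmma27-1_m", split="train")

args = CrossEncoderTrainingArguments(
    output_dir="bge-reranker-finetuned",
    overwrite_output_dir=True,
    eval_strategy="steps",
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=2,
    learning_rate=2e-5,
    weight_decay=0.01,
    num_train_epochs=4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    bf16=True,
    tf32=True,
    dataloader_num_workers=4,
    load_best_model_at_end=True,
    push_to_hub=True,
    hub_model_id="mjaliz/bge-reranker-finetuned-basalam-query-triplet-grmma27-1M-beir",
)

trainer = CrossEncoderTrainer(
    model=model,          # the CrossEncoder from the loss sketch above
    args=args,
    train_dataset=train_dataset,
    loss=loss,            # the CachedMultipleNegativesRankingLoss from above
    # eval_dataset / evaluator omitted for brevity; eval_strategy="steps" and
    # load_best_model_at_end require one in a real run
)
trainer.train()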

All Hyperparameters

  • overwrite_output_dir: True
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 2
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: True
  • local_rank: 3
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 4
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: mjaliz/bge-reranker-finetuned-basalam-query-triplet-grmma27-1M-beir
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}
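
Note on the effective batch size: per_device_train_batch_size = 32 combined with gradient_accumulation_steps = 2 yields 64 samples per device per optimizer step. The training logs below show roughly 3,876 steps per epoch over 991,734 samples (about 256 samples per step), which implies four training devices, consistent with the local_rank: 3 entry above.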

Training Logs

Epoch Step Training Loss custom_beir_ndcg@10 eval_reranking_ndcg@10
0.0258 100 0.5313 - -
0.0516 200 0.5197 - -
0.0774 300 0.4847 - -
0.1033 400 0.4391 - -
0.1291 500 0.4011 - -
0.1549 600 0.3645 - -
0.1807 700 0.3382 - -
0.2065 800 0.3034 - -
0.2323 900 0.2705 - -
0.2582 1000 0.2478 0.7949 0.979
0.2840 1100 0.2296 - -
0.3098 1200 0.2175 - -
0.3356 1300 0.1985 - -
0.3614 1400 0.1879 - -
0.3872 1500 0.191 - -
0.4131 1600 0.1844 - -
0.4389 1700 0.1956 - -
0.4647 1800 0.1793 - -
0.4905 1900 0.1871 - -
0.5163 2000 0.1808 0.7906 0.9852
0.5421 2100 0.184 - -
0.5680 2200 0.1839 - -
0.5938 2300 0.1773 - -
0.6196 2400 0.1864 - -
0.6454 2500 0.1808 - -
0.6712 2600 0.1847 - -
0.6970 2700 0.1869 - -
0.7229 2800 0.1852 - -
0.7487 2900 0.2085 - -
0.7745 3000 0.2042 0.7642 0.9852
0.8003 3100 0.2079 - -
0.8261 3200 0.1982 - -
0.8519 3300 0.2195 - -
0.8778 3400 0.2223 - -
0.9036 3500 0.2287 - -
0.9294 3600 0.2139 - -
0.9552 3700 0.2214 - -
0.9810 3800 0.2161 - -
1.0067 3900 0.218 - -
1.0325 4000 0.2042 0.7526 0.9867
1.0583 4100 0.2006 - -
1.0842 4200 0.2215 - -
1.1100 4300 0.2025 - -
1.1358 4400 0.2031 - -
1.1616 4500 0.2203 - -
1.1874 4600 0.2322 - -
1.2132 4700 0.216 - -
1.2391 4800 0.2195 - -
1.2649 4900 0.2137 - -
1.2907 5000 0.2175 0.7417 0.9858
1.3165 5100 0.2101 - -
1.3423 5200 0.2072 - -
1.3681 5300 0.1956 - -
1.3940 5400 0.2233 - -
1.4198 5500 0.214 - -
1.4456 5600 0.2139 - -
1.4714 5700 0.2141 - -
1.4972 5800 0.2041 - -
1.5230 5900 0.2123 - -
1.5489 6000 0.205 0.7516 0.9863
1.5747 6100 0.2135 - -
1.6005 6200 0.2213 - -
1.6263 6300 0.2055 - -
1.6521 6400 0.2091 - -
1.6779 6500 0.2073 - -
1.7038 6600 0.2192 - -
1.7296 6700 0.2072 - -
1.7554 6800 0.2192 - -
1.7812 6900 0.2081 - -
1.8070 7000 0.2137 0.7369 0.9858
1.8328 7100 0.231 - -
1.8587 7200 0.215 - -
1.8845 7300 0.216 - -
1.9103 7400 0.2084 - -
1.9361 7500 0.2079 - -
1.9619 7600 0.2031 - -
1.9877 7700 0.2116 - -
2.0134 7800 0.2225 - -
2.0392 7900 0.2217 - -
2.0651 8000 0.2164 0.7283 0.9863
2.0909 8100 0.2044 - -
2.1167 8200 0.2133 - -
2.1425 8300 0.2207 - -
2.1683 8400 0.2106 - -
2.1941 8500 0.2164 - -
2.2200 8600 0.1968 - -
2.2458 8700 0.2089 - -
2.2716 8800 0.2223 - -
2.2974 8900 0.2228 - -
2.3232 9000 0.2276 0.7205 0.9856
2.3490 9100 0.2182 - -
2.3749 9200 0.2183 - -
2.4007 9300 0.2284 - -
2.4265 9400 0.2149 - -
2.4523 9500 0.2065 - -
2.4781 9600 0.2199 - -
2.5039 9700 0.2217 - -
2.5298 9800 0.1966 - -
2.5556 9900 0.2163 - -
2.5814 10000 0.2173 0.7158 0.9834
2.6072 10100 0.2139 - -
2.6330 10200 0.237 - -
2.6588 10300 0.2129 - -
2.6847 10400 0.217 - -
2.7105 10500 0.2181 - -
2.7363 10600 0.2338 - -
2.7621 10700 0.2244 - -
2.7879 10800 0.2251 - -
2.8137 10900 0.2276 - -
2.8396 11000 0.2179 0.7150 0.9841
2.8654 11100 0.2186 - -
2.8912 11200 0.2291 - -
2.9170 11300 0.2093 - -
2.9428 11400 0.2202 - -
2.9686 11500 0.2262 - -
2.9944 11600 0.2249 - -
3.0201 11700 0.2238 - -
3.0460 11800 0.2028 - -
3.0718 11900 0.2244 - -
3.0976 12000 0.2181 0.7151 0.9834
3.1234 12100 0.2192 - -
3.1492 12200 0.2139 - -
3.1750 12300 0.2075 - -
3.2009 12400 0.2258 - -
3.2267 12500 0.2291 - -
3.2525 12600 0.2136 - -
3.2783 12700 0.2207 - -
3.3041 12800 0.2248 - -
3.3299 12900 0.2269 - -
3.3558 13000 0.23 0.7167 0.9841
3.3816 13100 0.214 - -
3.4074 13200 0.218 - -
3.4332 13300 0.2315 - -
3.4590 13400 0.2241 - -
3.4848 13500 0.2175 - -
3.5106 13600 0.2167 - -
3.5365 13700 0.2141 - -
3.5623 13800 0.2163 - -
3.5881 13900 0.2219 - -
3.6139 14000 0.2218 0.7178 0.9836
3.6397 14100 0.2113 - -
3.6655 14200 0.2132 - -
3.6914 14300 0.2234 - -
3.7172 14400 0.2259 - -
3.7430 14500 0.2151 - -
3.7688 14600 0.2273 - -
3.7946 14700 0.2192 - -
3.8204 14800 0.2253 - -
3.8463 14900 0.2237 - -
3.8721 15000 0.217 0.7177 0.9834   ← saved checkpoint
3.8979 15100 0.2108 - -
3.9237 15200 0.2219 - -
3.9495 15300 0.2298 - -
3.9753 15400 0.2132 - -
  • The marked row (step 15000) denotes the saved checkpoint; its metrics match those reported under Evaluation above.

Framework Versions

  • Python: 3.12.3
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.3
  • PyTorch: 2.9.1+cu128
  • Accelerate: 1.12.0
  • Datasets: 4.4.1
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}