sileod committed on
Commit 40f472f · verified · 1 Parent(s): a98d39f

Add new SentenceTransformer model

Files changed (2)
  1. README.md +126 -129
  2. model.safetensors +1 -1
README.md CHANGED
@@ -37,27 +37,30 @@ widget:
37
  \ pediatrician, or paediatrician. The word pediatrics and its cognates mean healer\
38
  \ of children; they derive from two Greek words: παῖς (pais child)\
39
  \ and ἰατρός (iatros doctor, healer)."
40
- - source_sentence: Creek Township borders Elsinboro Township , Pennsville Township
41
- and Salem .
42
  sentences:
43
- - Today , Galesburg-Augusta Community Schools consists of a primary school and a
44
- high school in Galesburg and a middle school in Augusta .
45
- - Elsinboro Township borders with the Lower Alloways Creek Township , Pennsville
46
- Township and Salem .
47
- - In 1953 , he married the actress Gilda Neeltje , sister of the actress Diane Holland
48
- .
49
- - source_sentence: A man is riding on one wheel on a motorcycle.
50
  sentences:
51
- - A person is performing tricks on a motorcycle.
52
- - A boy jumping in the air on the beach.
53
- - A woman is pouring ingredients into a frying pan.
54
- - source_sentence: '''Why don''t you find out?'
 
 
55
  sentences:
56
- - He is suggesting that the lack of effort focusing on the concept is making it
57
- seem unrealistic.
58
- - The military stated that the 244th Engineer Battalion has been handling the construction
59
- of playgrounds, cleaning up the rubble and restoring irrigation services in Iraq.
60
- - Why you haven't find out?.
 
 
61
  - source_sentence: what are the three subatomic particles called?
62
  sentences:
63
  - Subatomic particles include electrons, the negatively charged, almost massless
@@ -168,7 +171,7 @@ print(query_embeddings.shape, document_embeddings.shape)
168
  # Get the similarity scores for the embeddings
169
  similarities = model.similarity(query_embeddings, document_embeddings)
170
  print(similarities)
171
- # tensor([[ 0.6600, -0.0148, 0.0229]])
172
  ```
173
 
174
  <!--
@@ -221,13 +224,13 @@ You can finetune this model on your own dataset.
221
  | | sentence1 | sentence2 | label |
222
  |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------------------------------------------|
223
  | type | string | string | int |
224
- | details | <ul><li>min: 11 tokens</li><li>mean: 27.65 tokens</li><li>max: 57 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 27.73 tokens</li><li>max: 57 tokens</li></ul> | <ul><li>0: ~57.50%</li><li>1: ~42.50%</li></ul> |
225
  * Samples:
226
- | sentence1 | sentence2 | label |
227
- |:------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
228
- | <code>Ceremonial music ( `` rokon fada '' ) is listed as a status symbol , and musicians are generally chosen for political reasons as opposed to musical ones .</code> | <code>Ceremonial music ( `` rokon fada '' ) is performed as a status symbol , and musicians are generally chosen for musical reasons as opposed to political ones .</code> | <code>0</code> |
229
- | <code>In 1989 he travelled to South Africa , Johannesburg and Angola , Mozambique on a peace-seeking mission .</code> | <code>In 1989 , he traveled to Mozambique , Johannesburg , and Angola , South Africa on a peace-seeking mission .</code> | <code>1</code> |
230
- | <code>In this way , the Nestorian faith was established in the East under tragic signs .</code> | <code>In this way , under Nestorian auspices , the tragic faith was established in the East .</code> | <code>0</code> |
231
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
232
  ```json
233
  {
@@ -244,16 +247,16 @@ You can finetune this model on your own dataset.
244
  * Size: 11,004 training samples
245
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
246
  * Approximate statistics based on the first 1000 samples:
247
- | | sentence1 | sentence2 | label |
248
- |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------------------------------------------|
249
- | type | string | string | int |
250
- | details | <ul><li>min: 11 tokens</li><li>mean: 27.23 tokens</li><li>max: 52 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 27.29 tokens</li><li>max: 53 tokens</li></ul> | <ul><li>0: ~33.10%</li><li>1: ~66.90%</li></ul> |
251
  * Samples:
252
- | sentence1 | sentence2 | label |
253
- |:------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
254
- | <code>Tony Blair has taken a hardline stance arguing nothing should be done to lessen the pressure on Mugabe at the gathering in the capital Abuja .</code> | <code>The Prime Minister has taken a hardline stance arguing nothing should be done to lessen the pressure on Mugabe .</code> | <code>0</code> |
255
- | <code>The identical rovers will act as robotic geologists , searching for evidence of past water .</code> | <code>The rovers act as robotic geologists , moving on six wheels .</code> | <code>0</code> |
256
- | <code>" We make no apologies for finding every legal way possible to protect the American public from further terrorist attack , " Barbara Comstock said .</code> | <code>" We make no apologies for finding every legal way possible to protect the American public from further terrorist attacks , " said Barbara Comstock , Ashcroft 's press secretary .</code> | <code>1</code> |
257
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
258
  ```json
259
  {
@@ -273,13 +276,13 @@ You can finetune this model on your own dataset.
273
  | | sentence1 | sentence2 | label |
274
  |:--------|:----------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|:------------------------------------------------|
275
  | type | string | string | int |
276
- | details | <ul><li>min: 7 tokens</li><li>mean: 13.65 tokens</li><li>max: 28 tokens</li></ul> | <ul><li>min: 28 tokens</li><li>mean: 318.06 tokens</li><li>max: 1024 tokens</li></ul> | <ul><li>0: ~30.20%</li><li>1: ~69.80%</li></ul> |
277
  * Samples:
278
- | sentence1 | sentence2 | label |
279
- |:-----------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
280
- | <code>Batman: The Killing Joke features characters.</code> | <code>notice. Cantonese Pinyin -LRB- , also known as 教院式拼音方案 -RRB- is a romanization system for Cantonese developed by Rev. Yu Ping Chiu in 1971 , and subsequently modified by the Education Department -LRB- merged into the Education and Manpower Bureau since 2003 -RRB- of Hong Kong and Prof. Zhan Bohui of the Chinese Dialects Research Centre of the Jinan University , Guangdong , PRC , and honorary professor of the School of Chinese , University of Hong Kong .. romanization. romanization. Cantonese. Cantonese. Education and Manpower Bureau. Education and Manpower Bureau. Zhan Bohui. Zhan Bohui. It is the only romanization system accepted by Education and Manpower Bureau of Hong Kong and Hong Kong Examinations and Assessment Authority .. romanization. romanization. Education and Manpower Bureau. Education and Manpower Bureau. Hong Kong Examinations and Assessment Authority. Hong Kong Examinations and Assessment Authority. The formal and short forms of the system 's Chinese names mean respectiv...</code> | <code>1</code> |
281
- | <code>Jon Snow is played by a person.</code> | <code>Cao'an is a temple in Jinjiang , Fujian .. Originally constructed by Chinese Manicheans , it was viewed by later worshipers as a Buddhist temple .. Manicheans. Manichaeism. This `` Manichean temple in Buddhist disguise ''. is seen by modern experts on Manichaeism as `` the only extant Manichean temple in China '' , or `` the only Manichean building which has survived intact '' .</code> | <code>1</code> |
282
- | <code>Scotland includes islands.</code> | <code>Scotland -LRB- -LSB- ˈskɒt.lənd -RSB- Scots : -LSB- - scoˈskɔt.lənd -RSB- Alba -LSB- ˈalˠapə -RSB- -RRB- is a country that is part of the United Kingdom and covers the northern third of the island of Great Britain .. Scots. Scots language. Scotland. Scots Law. Alba. Alba. country. country. part. Countries of the United Kingdom. United Kingdom. United Kingdom. Great Britain. Great Britain. It shares a border with England to the south , and is otherwise surrounded by the Atlantic Ocean , with the North Sea to the east and the North Channel and Irish Sea to the south-west .. England. England. Atlantic Ocean. Atlantic Ocean. North Sea. North Sea. North Channel. North Channel ( British Isles ). Irish Sea. Irish Sea. In addition to the mainland , the country is made up of more than 790 islands , including the Northern Isles and the Hebrides .. country. country. Northern Isles. Northern Isles. Hebrides. Hebrides. The Kingdom of Scotland emerged as an independent sovereign state in the Early ...</code> | <code>0</code> |
283
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
284
  ```json
285
  {
@@ -299,13 +302,13 @@ You can finetune this model on your own dataset.
299
  | | sentence1 | sentence2 | label |
300
  |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:------------------------------------------------|
301
  | type | string | string | int |
302
- | details | <ul><li>min: 6 tokens</li><li>mean: 22.21 tokens</li><li>max: 61 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 21.48 tokens</li><li>max: 49 tokens</li></ul> | <ul><li>0: ~54.80%</li><li>1: ~45.20%</li></ul> |
303
  * Samples:
304
- | sentence1 | sentence2 | label |
305
- |:---------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
306
- | <code>access to device itself application specific data (network services, dns, html, http, etc)</code> | <code>(upper layer data)facilitates communication between such programs and lower-layer network services. high-level apis, including resource sharing, remote file access.</code> | <code>0</code> |
307
- | <code>an important element of information management, but it is just one part of a larger whole</code> | <code>converting facts and figures into useful information</code> | <code>0</code> |
308
- | <code>web site that has a field for you to type in a search query, as it will search the internet for you using your search criteria.</code> | <code>web-based search tool that locates a web page using a keyword</code> | <code>1</code> |
309
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
310
  ```json
311
  {
@@ -322,16 +325,16 @@ You can finetune this model on your own dataset.
322
  * Size: 10,047 training samples
323
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
324
  * Approximate statistics based on the first 1000 samples:
325
- | | sentence1 | sentence2 | label |
326
- |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------------------------------------------|
327
- | type | string | string | int |
328
- | details | <ul><li>min: 4 tokens</li><li>mean: 17.32 tokens</li><li>max: 213 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 16.46 tokens</li><li>max: 121 tokens</li></ul> | <ul><li>0: ~35.80%</li><li>1: ~64.20%</li></ul> |
329
  * Samples:
330
- | sentence1 | sentence2 | label |
331
- |:------------------------------------------------------------------|:-------------------------------------------------------------------------|:---------------|
332
- | <code>Watch out.</code> | <code>U.S. Bank</code> | <code>0</code> |
333
- | <code>Oh! we spent all night, used all the fancy machines.</code> | <code>We spent all night using the luxurious equipment.</code> | <code>1</code> |
334
- | <code>I'm willing to give you all this information...</code> | <code>This information, all of it, I'm inclined to provide you...</code> | <code>1</code> |
335
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
336
  ```json
337
  {
@@ -351,13 +354,13 @@ You can finetune this model on your own dataset.
351
  | | sentence1 | sentence2 | label |
352
  |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
353
  | type | string | string | float |
354
- | details | <ul><li>min: 6 tokens</li><li>mean: 14.68 tokens</li><li>max: 57 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 14.84 tokens</li><li>max: 68 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 2.64</li><li>max: 5.0</li></ul> |
355
  * Samples:
356
- | sentence1 | sentence2 | label |
357
- |:----------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------|:--------------------------------|
358
- | <code>Mandela's condition has 'improved'</code> | <code>Mandela's condition has 'worsened over past 48 hours'</code> | <code>1.0</code> |
359
- | <code>the cfe is very important for european security.</code> | <code>the cfe is a cornerstone of european security.</code> | <code>5.0</code> |
360
- | <code>The Nasdaq fell about 1.3% for the month, snapping a seven-month winning streak.</code> | <code>The Nasdaq is down roughly 0.4 percent for the month, on track to snap a 7-month streak of gains.</code> | <code>2.4000000953674316</code> |
361
  * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
362
  ```json
363
  {
@@ -377,13 +380,13 @@ You can finetune this model on your own dataset.
377
  | | sentence1 | sentence2 | label |
378
  |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
379
  | type | string | string | float |
380
- | details | <ul><li>min: 6 tokens</li><li>mean: 12.25 tokens</li><li>max: 28 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 12.11 tokens</li><li>max: 38 tokens</li></ul> | <ul><li>min: 1.0</li><li>mean: 3.51</li><li>max: 5.0</li></ul> |
381
  * Samples:
382
- | sentence1 | sentence2 | label |
383
- |:------------------------------------------------------|:------------------------------------------------------------------------------------|:--------------------------------|
384
- | <code>A cold cyclist is celebrating</code> | <code>A bike is being held over his head by a bicyclist in a group of people</code> | <code>2.299999952316284</code> |
385
- | <code>Nobody is cutting a capsicum into pieces</code> | <code>The person is slicing a clove of garlic into pieces</code> | <code>3.0999999046325684</code> |
386
- | <code>A woman is not cutting shrimps</code> | <code>A man is chopping butter into a container</code> | <code>1.7999999523162842</code> |
387
  * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
388
  ```json
389
  {
@@ -400,16 +403,16 @@ You can finetune this model on your own dataset.
400
  * Size: 14,280 training samples
401
  * Columns: <code>label</code>, <code>sentence1</code>, and <code>sentence2</code>
402
  * Approximate statistics based on the first 1000 samples:
403
- | | label | sentence1 | sentence2 |
404
- |:--------|:---------------------------------------------------------------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
405
- | type | float | string | string |
406
- | details | <ul><li>min: 0.0</li><li>mean: 3.13</li><li>max: 5.0</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 18.95 tokens</li><li>max: 91 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 17.55 tokens</li><li>max: 269 tokens</li></ul> |
407
  * Samples:
408
- | label | sentence1 | sentence2 |
409
- |:-----------------|:-----------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------|
410
- | <code>4.2</code> | <code>I am calling BS!!! NYTimes: Morsi Says His Slurs of Jews Were Taken Out of Context</code> | <code>Morsi Says Slurs of Jews Were Taken Out of Context</code> |
411
- | <code>3.0</code> | <code>The driver of the coach tried to avoid it by swerving hard, but still grazed the right side of the lorry.</code> | <code>The driver of the last to try to avoid it through a sudden move, but he fell short by his right side.</code> |
412
- | <code>5.0</code> | <code>create a mess or disorder</code> | <code>make a mess of or create disorder in.</code> |
413
  * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
414
  ```json
415
  {
@@ -728,15 +731,11 @@ You can finetune this model on your own dataset.
728
  ### Training Hyperparameters
729
  #### Non-Default Hyperparameters
730
 
731
- - `per_device_train_batch_size`: 360
732
- - `learning_rate`: 8e-05
733
- - `weight_decay`: 5e-05
734
  - `num_train_epochs`: 1
735
- - `warmup_ratio`: 0.03
736
  - `fp16`: True
737
- - `gradient_checkpointing`: True
738
- - `torch_compile`: True
739
- - `torch_compile_backend`: inductor
740
 
741
  #### All Hyperparameters
742
  <details><summary>Click to expand</summary>
@@ -745,15 +744,15 @@ You can finetune this model on your own dataset.
745
  - `do_predict`: False
746
  - `eval_strategy`: no
747
  - `prediction_loss_only`: True
748
- - `per_device_train_batch_size`: 360
749
  - `per_device_eval_batch_size`: 8
750
  - `per_gpu_train_batch_size`: None
751
  - `per_gpu_eval_batch_size`: None
752
  - `gradient_accumulation_steps`: 1
753
  - `eval_accumulation_steps`: None
754
  - `torch_empty_cache_steps`: None
755
- - `learning_rate`: 8e-05
756
- - `weight_decay`: 5e-05
757
  - `adam_beta1`: 0.9
758
  - `adam_beta2`: 0.999
759
  - `adam_epsilon`: 1e-08
@@ -762,7 +761,7 @@ You can finetune this model on your own dataset.
762
  - `max_steps`: -1
763
  - `lr_scheduler_type`: linear
764
  - `lr_scheduler_kwargs`: {}
765
- - `warmup_ratio`: 0.03
766
  - `warmup_steps`: 0
767
  - `log_level`: passive
768
  - `log_level_replica`: warning
@@ -828,7 +827,7 @@ You can finetune this model on your own dataset.
828
  - `hub_private_repo`: None
829
  - `hub_always_push`: False
830
  - `hub_revision`: None
831
- - `gradient_checkpointing`: True
832
  - `gradient_checkpointing_kwargs`: None
833
  - `include_inputs_for_metrics`: False
834
  - `include_for_metrics`: []
@@ -842,8 +841,8 @@ You can finetune this model on your own dataset.
842
  - `torchdynamo`: None
843
  - `ray_scope`: last
844
  - `ddp_timeout`: 1800
845
- - `torch_compile`: True
846
- - `torch_compile_backend`: inductor
847
  - `torch_compile_mode`: None
848
  - `include_tokens_per_second`: False
849
  - `include_num_input_tokens_seen`: no
@@ -866,45 +865,43 @@ You can finetune this model on your own dataset.
866
  ### Training Logs
867
  | Epoch | Step | Training Loss |
868
  |:------:|:-----:|:-------------:|
869
- | 0.0251 | 500 | 5.0537 |
870
- | 0.0501 | 1000 | 3.6206 |
871
- | 0.0752 | 1500 | 3.249 |
872
- | 0.1003 | 2000 | 3.5885 |
873
- | 0.1254 | 2500 | 3.2479 |
874
- | 0.1504 | 3000 | 3.2033 |
875
- | 0.1755 | 3500 | 2.7123 |
876
- | 0.2006 | 4000 | 2.8247 |
877
- | 0.2257 | 4500 | 2.7694 |
878
- | 0.2507 | 5000 | 3.0215 |
879
- | 0.2758 | 5500 | 2.6723 |
880
- | 0.3009 | 6000 | 2.8297 |
881
- | 0.3259 | 6500 | 2.4046 |
882
- | 0.3510 | 7000 | 2.2289 |
883
- | 0.3761 | 7500 | 2.4628 |
884
- | 0.4012 | 8000 | 2.4032 |
885
- | 0.4262 | 8500 | 2.5024 |
886
- | 0.4513 | 9000 | 2.0948 |
887
- | 0.4764 | 9500 | 2.4389 |
888
- | 0.5015 | 10000 | 2.4771 |
889
- | 0.5265 | 10500 | 2.6465 |
890
- | 0.5516 | 11000 | 2.5892 |
891
- | 0.5767 | 11500 | 2.3557 |
892
- | 0.6017 | 12000 | 2.2359 |
893
- | 0.6268 | 12500 | 2.5839 |
894
- | 0.6519 | 13000 | 2.4216 |
895
- | 0.6770 | 13500 | 2.3211 |
896
- | 0.7020 | 14000 | 2.1171 |
897
- | 0.7271 | 14500 | 2.1206 |
898
- | 0.7522 | 15000 | 2.2557 |
899
- | 0.7773 | 15500 | 2.2815 |
900
- | 0.8023 | 16000 | 2.0951 |
901
- | 0.8274 | 16500 | 2.3415 |
902
- | 0.8525 | 17000 | 2.2792 |
903
- | 0.8775 | 17500 | 2.3113 |
904
- | 0.9026 | 18000 | 2.1932 |
905
- | 0.9277 | 18500 | 2.1134 |
906
- | 0.9528 | 19000 | 1.9995 |
907
- | 0.9778 | 19500 | 1.8916 |
908
 
909
 
910
  ### Framework Versions
 
37
  \ pediatrician, or paediatrician. The word pediatrics and its cognates mean healer\
38
  \ of children; they derive from two Greek words: παῖς (pais child)\
39
  \ and ἰατρός (iatros doctor, healer)."
40
+ - source_sentence: However , in 1919 , concluded that no more operational awards would
41
+ be made for the recently decreed war .
42
  sentences:
43
+ - At executive level , EEAA represents the central arm of the ministry .
44
+ - In 1919 , however , no operational awards would be made for the recently concluded
45
+ war .
46
+ - He was asked his opinion about the books `` Mission to Moscow '' by Joseph E.
47
+ Davies and `` One World '' by Wendell Willkie .
48
+ - source_sentence: Twelve killed in bomb blast on Pakistani train
 
49
  sentences:
50
+ - Five killed by bomb blast in East India
51
+ - Five million citizens get unofficial salary in Ukraine
52
+ - Above that, seniors would be responsible for 100 percent of drug costs until the
53
+ out-of-pocket total reaches $3,600.
54
+ - source_sentence: Pen Hadow, who became the first person to reach the geographic
55
+ North Pole unsupported from Canada, has just over two days of rations left.
56
  sentences:
57
+ - Remnants of highly enriched uranium were found near an Iranian nuclear facility
58
+ by United Nations inspectors, deepening fears that Iran possibly has a secret
59
+ nuclear weapons program.
60
+ - However, the singer believes that artists similar to his self should not receive
61
+ any blame.
62
+ - Pen Hadow, the first person to to reach the North Pole, has only a little more
63
+ than two days of rations left.
64
  - source_sentence: what are the three subatomic particles called?
65
  sentences:
66
  - Subatomic particles include electrons, the negatively charged, almost massless
 
171
  # Get the similarity scores for the embeddings
172
  similarities = model.similarity(query_embeddings, document_embeddings)
173
  print(similarities)
174
+ # tensor([[0.6104, 0.0070, 0.0514]])
175
  ```
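For context, `query_embeddings` and `document_embeddings` in the snippet above are produced earlier in the card's usage example. A minimal, self-contained sketch of that flow is shown below; the repository id is a placeholder (the real model name is not shown in this diff), and it uses plain `SentenceTransformer.encode`, which may differ slightly from the card's exact snippet.

```python
# Minimal sketch of the usage flow referenced above (hedged, not the card's exact code).
# "your-username/your-sentence-transformer" is a placeholder repository id.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("your-username/your-sentence-transformer")

# Example texts taken from the widget samples in this card.
queries = ["A man is riding on one wheel on a motorcycle."]
documents = [
    "A person is performing tricks on a motorcycle.",
    "A boy jumping in the air on the beach.",
    "A woman is pouring ingredients into a frying pan.",
]

# One embedding per input text; shapes are (len(queries), dim) and (len(documents), dim).
query_embeddings = model.encode(queries)
document_embeddings = model.encode(documents)
print(query_embeddings.shape, document_embeddings.shape)

# Similarity matrix between every query and every document (a torch tensor).
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
```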
176
 
177
  <!--
 
224
  | | sentence1 | sentence2 | label |
225
  |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------------------------------------------|
226
  | type | string | string | int |
227
+ | details | <ul><li>min: 10 tokens</li><li>mean: 27.74 tokens</li><li>max: 54 tokens</li></ul> | <ul><li>min: 10 tokens</li><li>mean: 27.73 tokens</li><li>max: 55 tokens</li></ul> | <ul><li>0: ~54.60%</li><li>1: ~45.40%</li></ul> |
228
  * Samples:
229
+ | sentence1 | sentence2 | label |
230
+ |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
231
+ | <code>Göttsche received international acclaim with his formula for the generating function for the Hilbert numbers of the Betti scheme of points on an algebraic surface :</code> | <code>With his formula for the producing function for the Betti - numbers of the Hilbert scheme of points on an algebraic surface , Göttsche received international recognition :</code> | <code>0</code> |
232
+ | <code>The former AFL players Tarkyn Lockyer ( Collingwood ) and Ryan Brabazon ( Sydney ) , Jason Mandzij ( Gold Coast ) , started their football careers and played for the Kangas .</code> | <code>Former AFL players Ryan Brabazon ( Collingwood ) and Tarkyn Lockyer ( Sydney ) , Jason Mandzij ( Gold Coast ) started their football careers playing for the Kangas .</code> | <code>0</code> |
233
+ | <code>Potter married in 1945 . He and his wife Anne ( a weaver ) had two children , Julian ( born 1947 ) and Mary ( born 1952 ) .</code> | <code>He and his wife Anne ( a weaver ) had two children , Julian ( born 1947 ) and Mary ( born in 1952 ) .</code> | <code>0</code> |
234
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
235
  ```json
236
  {
 
247
  * Size: 11,004 training samples
248
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
249
  * Approximate statistics based on the first 1000 samples:
250
+ | | sentence1 | sentence2 | label |
251
+ |:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:------------------------------------------------|
252
+ | type | string | string | int |
253
+ | details | <ul><li>min: 10 tokens</li><li>mean: 27.33 tokens</li><li>max: 48 tokens</li></ul> | <ul><li>min: 12 tokens</li><li>mean: 27.3 tokens</li><li>max: 48 tokens</li></ul> | <ul><li>0: ~32.40%</li><li>1: ~67.60%</li></ul> |
254
  * Samples:
255
+ | sentence1 | sentence2 | label |
256
+ |:---------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
257
+ | <code>Passed in 1999 but never put into effect , the law would have made it illegal for bar and restaurant patrons to light up .</code> | <code>Passed in 1999 but never put into effect , the smoking law would have prevented bar and restaurant patrons from lighting up , but exempted private clubs from the regulation .</code> | <code>0</code> |
258
+ | <code>" Indeed , Iran should be put on notice that efforts to try to remake Iraq in their image will be aggressively put down , " he said .</code> | <code>" Iran should be on notice that attempts to remake Iraq in Iran 's image will be aggressively put down , " he said .</code> | <code>1</code> |
259
+ | <code>But U.S. troops will not shrink from mounting raids and attacking their foes when their locations can be pinpointed .</code> | <code>But American troops will not shrink from mounting raids in the locations of their foes that can be pinpointed .</code> | <code>1</code> |
260
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
261
  ```json
262
  {
 
276
  | | sentence1 | sentence2 | label |
277
  |:--------|:----------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|:------------------------------------------------|
278
  | type | string | string | int |
279
+ | details | <ul><li>min: 6 tokens</li><li>mean: 13.83 tokens</li><li>max: 33 tokens</li></ul> | <ul><li>min: 29 tokens</li><li>mean: 340.09 tokens</li><li>max: 1024 tokens</li></ul> | <ul><li>0: ~31.40%</li><li>1: ~68.60%</li></ul> |
280
  * Samples:
281
+ | sentence1 | sentence2 | label |
282
+ |:----------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
283
+ | <code>Performance (film) is a religion.</code> | <code>The associative model of data is a data model for database systems .. data model. data model. database. database. Other data models , such as the relational model and the object data model , are record-based .. data model. data model. relational model. relational model. These models involve encompassing attributes about a thing , such as a car , in a record structure .. Such attributes might be registration , colour , make , model , etc. .. In the associative model , everything which has `` discrete independent existence '' is modeled as an entity , and relationships between them are modeled as associations .. The granularity at which data is represented is similar to schemes presented by Chen -LRB- Entity-relationship model -RRB- ; Bracchi , Paolini and Pelagatti -LRB- Binary Relations -RRB- ; and Senko -LRB- The Entity Set Model -RRB- .. Entity-relationship model. Entity-relationship model. A number of claims made about the model by Simon Williams , in his book The Associative Model ...</code> | <code>1</code> |
284
+ | <code>American Gods (TV series) has one showrunner, whose name is Greg Berlanti.</code> | <code>American Gods is an American television series based on the novel of the same name , written by Neil Gaiman and originally published in 2001 .. American Gods. American Gods. Neil Gaiman. Neil Gaiman. novel of the same name. American Gods. The television series was developed by Bryan Fuller and Michael Green for the premium cable network Starz .. Bryan Fuller. Bryan Fuller. Michael Green. Michael Green ( writer ). Starz. Starz. Fuller and Green are the showrunners for the series .. Gaiman serves as an executive producer along with Fuller , Green , Craig Cegielski , Stefanie Berk , and Thom Beers .. Thom Beers. Thom Beers. The first episode premiered on the Starz network and through their streaming application on April 30 , 2017 .. Starz. Starz. In May 2017 , the series was renewed for a second season .</code> | <code>0</code> |
285
+ | <code>The Ren & Stimpy Show was one of the original four Nicktoons.</code> | <code>Cloud was a browser-based operating system created by Good OS LLC , a Los Angeles-based corporation .. Los Angeles. Los Angeles. The company initially launched a Linux distribution called gOS which is heavily based on Ubuntu , now in its third incarnation .. gOS. gOS ( operating system ). Ubuntu. Ubuntu ( operating system )</code> | <code>1</code> |
286
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
287
  ```json
288
  {
 
302
  | | sentence1 | sentence2 | label |
303
  |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:------------------------------------------------|
304
  | type | string | string | int |
305
+ | details | <ul><li>min: 6 tokens</li><li>mean: 22.32 tokens</li><li>max: 61 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 22.15 tokens</li><li>max: 46 tokens</li></ul> | <ul><li>0: ~54.50%</li><li>1: ~45.50%</li></ul> |
306
  * Samples:
307
+ | sentence1 | sentence2 | label |
308
+ |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------|:---------------|
309
+ | <code>the process of shrinking the size of a file by removing data or recoding it more efficiently</code> | <code>reducing the amount of space needed to store a piece of data/bandwidth to transmit it (ex. zip files)</code> | <code>0</code> |
310
+ | <code>the siem software can ensure that the time is the same across devices so the security events across devices are recorded at the same time.</code> | <code>feature of a siem that makes sure all products are synced up so they are running with the same timestamps.</code> | <code>1</code> |
311
+ | <code>a model that is part of a dssa to describe the context and domain semantics important to understand a reference architecture and its architectural decisions</code> | <code>provide a means of information about that class of system and of comparing different architectures</code> | <code>0</code> |
312
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
313
  ```json
314
  {
 
325
  * Size: 10,047 training samples
326
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
327
  * Approximate statistics based on the first 1000 samples:
328
+ | | sentence1 | sentence2 | label |
329
+ |:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:------------------------------------------------|
330
+ | type | string | string | int |
331
+ | details | <ul><li>min: 4 tokens</li><li>mean: 17.82 tokens</li><li>max: 124 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 17.3 tokens</li><li>max: 143 tokens</li></ul> | <ul><li>0: ~37.20%</li><li>1: ~62.80%</li></ul> |
332
  * Samples:
333
+ | sentence1 | sentence2 | label |
334
+ |:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
335
+ | <code>"Kahuku Ranch has world - class qualities - tremendous resources, tremendous beauty and tremendous value to global biodiversity."</code> | <code>"TRENENDOUS BEAUTY AND TREMENDOUS VALUE TO GLOBAL BIODERVERSITY TRENENDOUS RESOURCES-CLASS QUALITIES KAHUKU RANCH HAS WORLD"</code> | <code>1</code> |
336
+ | <code>In Damascus, Syrian Information Minister Ahmad al-Hassan called the charges "baseless and illogical".</code> | <code>The Syrian Information Minister Ahmad al-Hassan, in Damascus, termed the charges without base and with no logic behind</code> | <code>1</code> |
337
+ | <code>We'd talk about the stars... ...and whether there might be somebody else like us out in space,... ...places we wanted to go and... it made our trials seem smaller.</code> | <code>We often would talk about the stars and if somebody else is similar to us out in the universe, places we wanted to visit and it made our problems seem minuscule.</code> | <code>1</code> |
338
  * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
339
  ```json
340
  {
 
354
  | | sentence1 | sentence2 | label |
355
  |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
356
  | type | string | string | float |
357
+ | details | <ul><li>min: 6 tokens</li><li>mean: 15.23 tokens</li><li>max: 50 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.39 tokens</li><li>max: 51 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 2.73</li><li>max: 5.0</li></ul> |
358
  * Samples:
359
+ | sentence1 | sentence2 | label |
360
+ |:--------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------|:--------------------------------|
361
+ | <code>China's anger at N. Korea overcomes worry over US</code> | <code>China's anger at North Korea overcomes worry over U.S. stealth flights</code> | <code>3.200000047683716</code> |
362
+ | <code>Declining issues outnumbered advancers nearly 2 to 1 on the New York Stock Exchange.</code> | <code>Advancers outnumbered decliners by nearly 8 to 3 on the NYSE and more than 11 to 5 on Nasdaq.</code> | <code>1.7999999523162842</code> |
363
+ | <code>The computers were reportedly located in the U.S., Canada and South Korea.</code> | <code>The PCs are scattered across the United States, Canada and South Korea.</code> | <code>4.75</code> |
364
  * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
365
  ```json
366
  {
 
380
  | | sentence1 | sentence2 | label |
381
  |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
382
  | type | string | string | float |
383
+ | details | <ul><li>min: 6 tokens</li><li>mean: 12.39 tokens</li><li>max: 30 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 12.16 tokens</li><li>max: 38 tokens</li></ul> | <ul><li>min: 1.0</li><li>mean: 3.48</li><li>max: 5.0</li></ul> |
384
  * Samples:
385
+ | sentence1 | sentence2 | label |
386
+ |:-------------------------------------------------------------------|:------------------------------------------------------------|:-------------------------------|
387
+ | <code>Someone is cutting some paper with scissors</code> | <code>The piece of paper is being cut</code> | <code>4.5</code> |
388
+ | <code>A man is hanging up the phone</code> | <code>A man is making a phone call</code> | <code>3.799999952316284</code> |
389
+ | <code>A person is pouring olive oil into a pot on the stove</code> | <code>A person is pouring oil for cooking into a pot</code> | <code>4.300000190734863</code> |
390
  * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
391
  ```json
392
  {
 
403
  * Size: 14,280 training samples
404
  * Columns: <code>label</code>, <code>sentence1</code>, and <code>sentence2</code>
405
  * Approximate statistics based on the first 1000 samples:
406
+ | | label | sentence1 | sentence2 |
407
+ |:--------|:---------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
408
+ | type | float | string | string |
409
+ | details | <ul><li>min: 0.0</li><li>mean: 3.16</li><li>max: 5.0</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 19.29 tokens</li><li>max: 80 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 17.45 tokens</li><li>max: 82 tokens</li></ul> |
410
  * Samples:
411
+ | label | sentence1 | sentence2 |
412
+ |:------------------|:------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------|
413
+ | <code>1.0</code> | <code>How do I wire a bathroom exhaust fan/light to two switches?</code> | <code>How do I wire a combo with two supplies?</code> |
414
+ | <code>4.2</code> | <code>How an all-American hero fell to earth - . (Where have all the REAL heroes gone?)</code> | <code>How all-American hero fell to earth</code> |
415
+ | <code>3.75</code> | <code>Be larger in number, quantity, power, status, or importance, without personally having sovereign power.</code> | <code>be larger in number, quantity, power, status or importance.</code> |
416
  * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
417
  ```json
418
  {
 
731
  ### Training Hyperparameters
732
  #### Non-Default Hyperparameters
733
 
734
+ - `per_device_train_batch_size`: 384
735
+ - `learning_rate`: 1.0
736
+ - `weight_decay`: 6e-05
737
  - `num_train_epochs`: 1
 
738
  - `fp16`: True
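The non-default values above correspond one-to-one to fields of `SentenceTransformerTrainingArguments` (inherited from `transformers.TrainingArguments`). The sketch below shows how such a configuration could be wired into a v3-style trainer with the `AnglELoss` named in this card; the base model, toy dataset, and output directory are placeholder assumptions, not the author's actual training setup.

```python
# Hedged sketch only: placeholder base model, toy data, and output_dir; the listed
# hyperparameters are copied from the card, not tuned for this toy example.
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)

model = SentenceTransformer("distilroberta-base")  # placeholder base model

# Tiny pair dataset with the sentence1/sentence2/label columns described in this card.
train_dataset = Dataset.from_dict({
    "sentence1": ["A man is riding on one wheel on a motorcycle.", "Watch out."],
    "sentence2": ["A person is performing tricks on a motorcycle.", "U.S. Bank"],
    "label": [1.0, 0.0],
})

loss = losses.AnglELoss(model)  # the card also uses CoSENTLoss for the score-labelled datasets

args = SentenceTransformerTrainingArguments(
    output_dir="angle-finetune",        # placeholder
    per_device_train_batch_size=384,    # as listed above
    learning_rate=1.0,                  # as listed above
    weight_decay=6e-05,                 # as listed above
    num_train_epochs=1,
    fp16=True,                          # requires a CUDA device
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```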
 
 
 
739
 
740
  #### All Hyperparameters
741
  <details><summary>Click to expand</summary>
 
744
  - `do_predict`: False
745
  - `eval_strategy`: no
746
  - `prediction_loss_only`: True
747
+ - `per_device_train_batch_size`: 384
748
  - `per_device_eval_batch_size`: 8
749
  - `per_gpu_train_batch_size`: None
750
  - `per_gpu_eval_batch_size`: None
751
  - `gradient_accumulation_steps`: 1
752
  - `eval_accumulation_steps`: None
753
  - `torch_empty_cache_steps`: None
754
+ - `learning_rate`: 1.0
755
+ - `weight_decay`: 6e-05
756
  - `adam_beta1`: 0.9
757
  - `adam_beta2`: 0.999
758
  - `adam_epsilon`: 1e-08
 
761
  - `max_steps`: -1
762
  - `lr_scheduler_type`: linear
763
  - `lr_scheduler_kwargs`: {}
764
+ - `warmup_ratio`: 0.0
765
  - `warmup_steps`: 0
766
  - `log_level`: passive
767
  - `log_level_replica`: warning
 
827
  - `hub_private_repo`: None
828
  - `hub_always_push`: False
829
  - `hub_revision`: None
830
+ - `gradient_checkpointing`: False
831
  - `gradient_checkpointing_kwargs`: None
832
  - `include_inputs_for_metrics`: False
833
  - `include_for_metrics`: []
 
841
  - `torchdynamo`: None
842
  - `ray_scope`: last
843
  - `ddp_timeout`: 1800
844
+ - `torch_compile`: False
845
+ - `torch_compile_backend`: None
846
  - `torch_compile_mode`: None
847
  - `include_tokens_per_second`: False
848
  - `include_num_input_tokens_seen`: no
 
865
  ### Training Logs
866
  | Epoch | Step | Training Loss |
867
  |:------:|:-----:|:-------------:|
868
+ | 0.0267 | 500 | 4.3558 |
869
+ | 0.0535 | 1000 | 3.0724 |
870
+ | 0.0802 | 1500 | 2.979 |
871
+ | 0.1070 | 2000 | 2.9205 |
872
+ | 0.1337 | 2500 | 3.0679 |
873
+ | 0.1604 | 3000 | 2.837 |
874
+ | 0.1872 | 3500 | 3.2635 |
875
+ | 0.2139 | 4000 | 2.7602 |
876
+ | 0.2407 | 4500 | 2.6911 |
877
+ | 0.2674 | 5000 | 2.6963 |
878
+ | 0.2941 | 5500 | 2.8504 |
879
+ | 0.3209 | 6000 | 2.7501 |
880
+ | 0.3476 | 6500 | 2.6315 |
881
+ | 0.3744 | 7000 | 2.5372 |
882
+ | 0.4011 | 7500 | 2.8814 |
883
+ | 0.4278 | 8000 | 2.2826 |
884
+ | 0.4546 | 8500 | 2.764 |
885
+ | 0.4813 | 9000 | 2.4418 |
886
+ | 0.5080 | 9500 | 2.3762 |
887
+ | 0.5348 | 10000 | 2.5542 |
888
+ | 0.5615 | 10500 | 2.2653 |
889
+ | 0.5883 | 11000 | 2.5098 |
890
+ | 0.6150 | 11500 | 2.3009 |
891
+ | 0.6417 | 12000 | 2.4029 |
892
+ | 0.6685 | 12500 | 2.1538 |
893
+ | 0.6952 | 13000 | 2.6398 |
894
+ | 0.7220 | 13500 | 2.3101 |
895
+ | 0.7487 | 14000 | 2.8489 |
896
+ | 0.7754 | 14500 | 2.3822 |
897
+ | 0.8022 | 15000 | 2.3035 |
898
+ | 0.8289 | 15500 | 2.4212 |
899
+ | 0.8557 | 16000 | 2.1447 |
900
+ | 0.8824 | 16500 | 1.985 |
901
+ | 0.9091 | 17000 | 2.1427 |
902
+ | 0.9359 | 17500 | 2.3002 |
903
+ | 0.9626 | 18000 | 2.2671 |
904
+ | 0.9894 | 18500 | 2.3033 |
 
 
905
 
906
 
907
  ### Framework Versions
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:3d9729ed5a375cb33fdfe9941bf4032235f8e37c6b27fa88b752ff736b85616b
3
  size 127538496
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e77447660a82d5a1c32834edf181ece19138af5fe0d1a489194f8674d0a79f19
3
  size 127538496