STS embedding models benchmark

by Joseph.Lee 2026. 5. 26.

While looking for a few embedding models to use for search, I compared several candidates and am sharing the results here.

All results were measured on an NVIDIA L4.

STS

| Model | Dataset | Samples | Spearman | Elapsed (s) | Total tokens | Tokens/s | Avg sec/pair |
|---|---|---|---|---|---|---|---|
| jhgan/ko-sbert-sts | English (STS-B val) | 1500 | 0.7762 | 31.88 | 84667 | 2655.79 | 0.02125 |
| jhgan/ko-sbert-sts | Korean (KLUE-STS val) | 519 | 0.7863 | 8.26 | 20184 | 2443.12 | 0.01592 |
| snunlp/KR-SBERT-V40K-klueNLI-augSTS | English (STS-B val) | 1500 | 0.6292 | 35.90 | 103223 | 2875.46 | 0.02393 |
| snunlp/KR-SBERT-V40K-klueNLI-augSTS | Korean (KLUE-STS val) | 519 | 0.7341 | 7.06 | 16447 | 2329.41 | 0.01360 |
| sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | English (STS-B val) | 1500 | 0.8747 | 7.28 | 53774 | 7387.00 | 0.00485 |
| sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | Korean (KLUE-STS val) | 519 | 0.6590 | 2.87 | 20702 | 7219.38 | 0.00553 |
| Xenova/paraphrase-multilingual-MiniLM-L12-v2 (fp16) | English (STS-B val) | 1500 | 0.8747 | 18.16 | 53774 | 2960.95 | 0.01211 |
| Xenova/paraphrase-multilingual-MiniLM-L12-v2 (fp16) | Korean (KLUE-STS val) | 519 | 0.6589 | 9.23 | 20702 | 2241.85 | 0.01779 |
| Xenova/paraphrase-multilingual-MiniLM-L12-v2 (int8) | English (STS-B val) | 1500 | 0.8731 | 15.84 | 53774 | 3395.30 | 0.01056 |
| Xenova/paraphrase-multilingual-MiniLM-L12-v2 (int8) | Korean (KLUE-STS val) | 519 | 0.6486 | 7.37 | 20702 | 2807.42 | 0.01421 |
| Xenova/all-MiniLM-L12-v2 (original) | English (STS-B val) | 1500 | 0.8750 | 14.68 | 48709 | 3317.26 | 0.00979 |
| Xenova/all-MiniLM-L12-v2 (original) | Korean (KLUE-STS val) | 519 | 0.3142 | 17.31 | 49995 | 2888.67 | 0.03335 |
| Xenova/all-MiniLM-L12-v2 (int8) | English (STS-B val) | 1500 | 0.8711 | 15.28 | 48709 | 3188.24 | 0.01019 |
| Xenova/all-MiniLM-L12-v2 (int8) | Korean (KLUE-STS val) | 519 | 0.3012 | 19.48 | 49995 | 2566.08 | 0.03754 |
| unsloth/embeddinggemma-300m-GGUF (Q8_0) | English (STS-B val) | 1500 | 0.8800 | 1.87 | 49015 | 26198.40 | 0.00125 |
| unsloth/embeddinggemma-300m-GGUF (Q8_0) | Korean (KLUE-STS val) | 519 | 0.8630 | 0.74 | 22363 | 30413.22 | 0.00142 |
| unsloth/embeddinggemma-300m-GGUF (Q4_0) | English (STS-B val) | 1500 | 0.8812 | 1.80 | 49015 | 27236.45 | 0.00120 |
| unsloth/embeddinggemma-300m-GGUF (Q4_0) | Korean (KLUE-STS val) | 519 | 0.8592 | 0.72 | 22363 | 30941.49 | 0.00139 |
| sentence-transformers/distiluse-base-multilingual-cased-v2 (original) | English (STS-B val) | 1500 | 0.8193 | 10.93 | 52212 | 4777.15 | 0.00729 |
| sentence-transformers/distiluse-base-multilingual-cased-v2 (original) | Korean (KLUE-STS val) | 519 | 0.7856 | 4.66 | 25317 | 5438.42 | 0.00897 |
| Xenova/distiluse-base-multilingual-cased-v2 (int8) | English (STS-B val) | 1500 | 0.7813 | 17.63 | 52212 | 2962.25 | 0.01175 |
| Xenova/distiluse-base-multilingual-cased-v2 (int8) | Korean (KLUE-STS val) | 519 | 0.7563 | 9.69 | 25317 | 2613.16 | 0.01867 |
| jc-lab/multilingual-e5-small-ko-v2-gguf (Q8_0) | English (STS-B val) | 1500 | 0.8715 | 0.97 | 53842 | 55644.30 | 0.00065 |
| jc-lab/multilingual-e5-small-ko-v2-gguf (Q8_0) | Korean (KLUE-STS val) | 519 | 0.8040 | 0.32 | 20780 | 64254.70 | 0.00062 |
| jc-lab/multilingual-e5-small-ko-v2-gguf (Q4_K_M) | English (STS-B val) | 1500 | 0.8705 | 0.88 | 53842 | 60864.01 | 0.00059 |
| jc-lab/multilingual-e5-small-ko-v2-gguf (Q4_K_M) | Korean (KLUE-STS val) | 519 | 0.8031 | 0.33 | 20780 | 62507.24 | 0.00064 |
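
For readers unfamiliar with the Spearman column, here is a minimal sketch of how an STS score of this kind is typically computed. It is not the benchmark script from this post. It assumes that each sentence pair is scored by the cosine similarity of its two embeddings and that those similarities are then rank-correlated with the human gold scores. The toy vectors and gold scores are made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def sts_spearman(embedding_pairs, gold_scores):
    """Rank-correlate cosine similarities with human similarity labels."""
    sims = [cosine(u, v) for u, v in embedding_pairs]
    return spearman(sims, gold_scores)


# Hypothetical 2-D "embeddings" for four sentence pairs, most to least similar.
toy_pairs = [
    ([1.0, 0.0], [1.0, 0.0]),
    ([1.0, 0.0], [1.0, 1.0]),
    ([1.0, 0.0], [0.0, 1.0]),
    ([1.0, 0.0], [-1.0, 0.0]),
]
toy_gold = [5.0, 3.2, 1.0, 0.0]  # made-up gold scores on a 0-5 scale

toy_score = sts_spearman(toy_pairs, toy_gold)
print(f"Spearman: {toy_score:.4f}")
```

Because only the ordering matters, a model can score well here even if its raw cosine values are compressed into a narrow range.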

Retrieval

| Model | Dataset | Queries | Docs | Recall@1 | Recall@3 | Recall@5 | MRR@5 | Elapsed (s) | Total tokens | Tokens/s | Avg sec/query |
|---|---|---|---|---|---|---|---|---|---|---|---|
| jhgan/ko-sbert-sts | English (MSMARCO Passage Ranking top250 test) | 43 | 6609 | 0.4884 | 0.6977 | 0.7907 | 0.5973 | 231.28 | -1 | 0.00 | 5.37859 |
| jhgan/ko-sbert-sts | Korean (MIRACL-ko top250 train) | 211 | 43421 | 0.1991 | 0.3649 | 0.4692 | 0.2968 | 1337.67 | -1 | 0.00 | 6.33966 |
| snunlp/KR-SBERT-V40K-klueNLI-augSTS | English (MSMARCO Passage Ranking top250 test) | 43 | 6609 | 0.3023 | 0.3488 | 0.4186 | 0.3407 | 235.91 | -1 | 0.00 | 5.48622 |
| snunlp/KR-SBERT-V40K-klueNLI-augSTS | Korean (MIRACL-ko top250 train) | 211 | 43421 | 0.1564 | 0.3223 | 0.4076 | 0.2469 | 1312.00 | -1 | 0.00 | 6.21799 |
| sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | English (MSMARCO Passage Ranking top250 test) | 43 | 6609 | 0.8372 | 0.9535 | 0.9535 | 0.8915 | 81.84 | -1 | 0.00 | 1.90324 |
| sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | Korean (MIRACL-ko top250 train) | 211 | 43421 | 0.2654 | 0.4123 | 0.5071 | 0.3520 | 559.02 | -1 | 0.00 | 2.64940 |
| Xenova/paraphrase-multilingual-MiniLM-L12-v2 (fp16) | English (MSMARCO Passage Ranking top250 test) | 43 | 6609 | 0.8372 | 0.9302 | 0.9535 | 0.8857 | 184.46 | -1 | 0.00 | 4.28982 |
| Xenova/paraphrase-multilingual-MiniLM-L12-v2 (fp16) | Korean (MIRACL-ko top250 train) | 211 | 43421 | 0.2322 | 0.3886 | 0.4360 | 0.3095 | 1658.68 | -1 | 0.00 | 7.86106 |
| Xenova/paraphrase-multilingual-MiniLM-L12-v2 (int8) | English (MSMARCO Passage Ranking top250 test) | 43 | 6609 | 0.7442 | 0.9302 | 0.9535 | 0.8380 | 190.47 | -1 | 0.00 | 4.42946 |
| Xenova/paraphrase-multilingual-MiniLM-L12-v2 (int8) | Korean (MIRACL-ko top250 train) | 211 | 43421 | 0.1991 | 0.3270 | 0.4028 | 0.2716 | 1727.12 | -1 | 0.00 | 8.18540 |
| Xenova/all-MiniLM-L12-v2 (original) | English (MSMARCO Passage Ranking top250 test) | 43 | 6609 | 0.9302 | 1.0000 | 1.0000 | 0.9651 | 154.73 | -1 | 0.00 | 3.59842 |
| Xenova/all-MiniLM-L12-v2 (original) | Korean (MIRACL-ko top250 train) | 211 | 43421 | 0.0284 | 0.0379 | 0.0521 | 0.0355 | 1580.13 | -1 | 0.00 | 7.48878 |
| Xenova/all-MiniLM-L12-v2 (int8) | English (MSMARCO Passage Ranking top250 test) | 43 | 6609 | 0.8837 | 0.9767 | 1.0000 | 0.9271 | 168.95 | -1 | 0.00 | 3.92914 |
| Xenova/all-MiniLM-L12-v2 (int8) | Korean (MIRACL-ko top250 train) | 211 | 43421 | 0.0237 | 0.0379 | 0.0427 | 0.0302 | 1729.01 | -1 | 0.00 | 8.19435 |
| unsloth/embeddinggemma-300m-GGUF (Q8_0) | English (MSMARCO Passage Ranking top250 test) | 43 | 6609 | 0.8140 | 0.9767 | 0.9767 | 0.8837 | 20.33 | -1 | 0.00 | 0.47275 |
| unsloth/embeddinggemma-300m-GGUF (Q8_0) | Korean (MIRACL-ko top250 train) | 211 | 43421 | 0.6066 | 0.8057 | 0.8768 | 0.7101 | 211.36 | -1 | 0.00 | 1.00168 |
| unsloth/embeddinggemma-300m-GGUF (Q4_0) | English (MSMARCO Passage Ranking top250 test) | 43 | 6609 | 0.8605 | 0.9767 | 0.9767 | 0.9147 | 20.09 | -1 | 0.00 | 0.46724 |
| unsloth/embeddinggemma-300m-GGUF (Q4_0) | Korean (MIRACL-ko top250 train) | 211 | 43421 | 0.5592 | 0.7962 | 0.8531 | 0.6787 | 209.15 | -1 | 0.00 | 0.99121 |
| sentence-transformers/distiluse-base-multilingual-cased-v2 (original) | English (MSMARCO Passage Ranking top250 test) | 43 | 6609 | 0.5116 | 0.6512 | 0.7674 | 0.5953 | 98.78 | -1 | 0.00 | 2.29715 |
| sentence-transformers/distiluse-base-multilingual-cased-v2 (original) | Korean (MIRACL-ko top250 train) | 211 | 43421 | 0.1469 | 0.2133 | 0.2844 | 0.1906 | 665.74 | -1 | 0.00 | 3.15514 |
| Xenova/distiluse-base-multilingual-cased-v2 (int8) | English (MSMARCO Passage Ranking top250 test) | 43 | 6609 | 0.5349 | 0.7674 | 0.7674 | 0.6279 | 214.55 | -1 | 0.00 | 4.98956 |
| Xenova/distiluse-base-multilingual-cased-v2 (int8) | Korean (MIRACL-ko top250 train) | 211 | 43421 | 0.1564 | 0.2275 | 0.3081 | 0.2055 | 1886.81 | -1 | 0.00 | 8.94223 |
| jc-lab/multilingual-e5-small-ko-v2-gguf (Q8_0) | English (MSMARCO Passage Ranking top250 test) | 43 | 6609 | 0.8140 | 0.9767 | 1.0000 | 0.8961 | 9.35 | -1 | 0.00 | 0.21737 |
| jc-lab/multilingual-e5-small-ko-v2-gguf (Q8_0) | Korean (MIRACL-ko top250 train) | 211 | 43421 | 0.5592 | 0.8009 | 0.8626 | 0.6812 | 90.83 | -1 | 0.00 | 0.43047 |
| jc-lab/multilingual-e5-small-ko-v2-gguf (Q4_K_M) | English (MSMARCO Passage Ranking top250 test) | 43 | 6609 | 0.8372 | 0.9767 | 1.0000 | 0.9078 | 9.57 | -1 | 0.00 | 0.22258 |
| jc-lab/multilingual-e5-small-ko-v2-gguf (Q4_K_M) | Korean (MIRACL-ko top250 train) | 211 | 43421 | 0.5545 | 0.7867 | 0.8626 | 0.6738 | 91.60 | -1 | 0.00 | 0.43413 |
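
As a reading aid for the Recall@k and MRR@5 columns, the sketch below shows one common way to compute them. It is an illustration, not the benchmark script from this post. It assumes Recall@k counts a query as a hit when at least one relevant document appears in the top k; the reported values are all multiples of 1/43 or 1/211, which is consistent with that per-query definition but does not prove it. The query IDs, document IDs, and relevance labels are hypothetical.

```python
def hit_at_k(ranked_ids, relevant_ids, k):
    """1.0 if any relevant document is within the top-k results, else 0.0."""
    return 1.0 if any(doc_id in relevant_ids for doc_id in ranked_ids[:k]) else 0.0


def reciprocal_rank_at_k(ranked_ids, relevant_ids, k):
    """1/rank of the first relevant document within the top k, else 0.0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def evaluate(rankings, qrels, ks=(1, 3, 5), mrr_k=5):
    """Average the per-query metrics over all queries.

    rankings: query id -> document ids sorted by descending similarity
    qrels:    query id -> set of relevant document ids
    """
    n = len(rankings)
    metrics = {}
    for k in ks:
        hits = sum(hit_at_k(rankings[q], qrels[q], k) for q in rankings)
        metrics[f"recall@{k}"] = hits / n
    rr = sum(reciprocal_rank_at_k(rankings[q], qrels[q], mrr_k) for q in rankings)
    metrics[f"mrr@{mrr_k}"] = rr / n
    return metrics


# Hypothetical rankings for three queries.
toy_rankings = {
    "q1": ["d3", "d1", "d7"],              # relevant doc at rank 1
    "q2": ["d2", "d9", "d4", "d5", "d8"],  # relevant doc at rank 3
    "q3": ["d6", "d0", "d1", "d2", "d3"],  # relevant doc not retrieved
}
toy_qrels = {"q1": {"d3"}, "q2": {"d4"}, "q3": {"d9"}}

toy_metrics = evaluate(toy_rankings, toy_qrels)
for name, value in toy_metrics.items():
    print(f"{name}: {value:.4f}")
```

MRR@5 rewards placing the first relevant document higher, so two models with the same Recall@5 can still differ on it, as the two embeddinggemma quantizations do on the English set.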

The code used for the tests and the full results are available here:

https://github.com/jclab-joseph/blog-artifacts/blob/main/2026-05/26-embedding-models/benchmark_results.md

 


 

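multilingual-e5-small-ko-v2 seems to offer the best value for the cost. It is a model of under 100M, yet it comes close to the 300M EmbeddingGemma on the Korean benchmarks while running roughly twice as fast.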
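
To put numbers on that trade-off, the short sketch below divides the Q8_0 rows of the two models against each other for the Korean datasets. The scores and elapsed times are copied from the tables above; the dictionary layout and the `compare` helper are only illustrative.

```python
# (score, elapsed seconds) taken from the Q8_0 rows of the tables above.
korean_results = {
    "KLUE-STS (Spearman)": {
        "e5_small_ko": (0.8040, 0.32),
        "embeddinggemma": (0.8630, 0.74),
    },
    "MIRACL-ko (Recall@5)": {
        "e5_small_ko": (0.8626, 90.83),
        "embeddinggemma": (0.8768, 211.36),
    },
}


def compare(results):
    """Return the e5 score relative to EmbeddingGemma and the e5 speedup."""
    summary = {}
    for task, rows in results.items():
        e5_score, e5_secs = rows["e5_small_ko"]
        gemma_score, gemma_secs = rows["embeddinggemma"]
        summary[task] = {
            "relative_score": e5_score / gemma_score,
            "speedup": gemma_secs / e5_secs,
        }
    return summary


summary = compare(korean_results)
for task, row in summary.items():
    print(f"{task}: {row['relative_score']:.1%} of the score, "
          f"{row['speedup']:.2f}x faster")
```

On these rows the smaller model keeps about 93% of the Spearman score and about 98% of Recall@5 while finishing about 2.3 times faster.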

 

I have uploaded the quantized models here:

https://huggingface.co/jc-lab/multilingual-e5-small-ko-v2-gguf

 
