Semantic Textual Similarity

Modeling classes for Semantic Textual Similarity (STS)

class pororo.tasks.semantic_textual_similarity.PororoStsFactory(task: str, lang: str, model: Optional[str])

Bases: pororo.tasks.utils.base.PororoFactoryBase

Computes sentence similarity with semantic textual similarity (STS) models for the KorSTS and STS-B datasets listed below.

Korean (brainbert.base.ko.korsts)

  • dataset: KorSTS (Ham et al. 2020)

  • metric: Spearman (83.00)

Korean (brainsbert.base.ko.kornli.korsts)

  • dataset: KorSTS (Ham et al. 2020)

  • metric: Spearman (83.46)

English (roberta.base.en.sts)

  • dataset: STS-B (Daniel Cer et al. 2017)

  • metric: Spearman (91.2)

Japanese (jaberta.base.ja.sts)

  • dataset: Translated STS-B (Daniel Cer et al. 2017)

  • metric: Spearman (82.80)

Chinese (zhberta.base.zh.sts)

  • dataset: Translated STS-B (Daniel Cer et al. 2017)

  • metric: Spearman (83.65)
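
The metric reported for every model above is the Spearman rank correlation between predicted and gold similarity scores. As a minimal sketch of how such a figure could be computed, the pure-Python helper below ranks both score lists (averaging ties) and takes the Pearson correlation of the ranks. The function names and the sample scores are made up for illustration, and scaling the result to 0-100 is an assumption based on how the figures above are written; this is not the evaluation script used for these models.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over a run of equal values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def spearman(predicted, gold):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(predicted), average_ranks(gold))


# Made-up scores for illustration only (gold on a 0-5 scale, predictions on 0-1).
gold = [4.8, 1.2, 3.0, 0.4]
predicted = [0.92, 0.35, 0.41, 0.10]
print(round(100 * spearman(predicted, gold), 2))
```

Because only the ordering matters, the predicted scores do not need to be on the same scale as the gold annotations.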

Examples

>>> sts = Pororo(task="similarity", lang="ko")
>>> sts("나는 동물을 좋아하는 사람이야", "강아지를 좋아하는 아버지")  # ["I am a person who likes animals", "A father who likes puppies"]
0.415
>>> sts = Pororo(task="similarity", lang="ja")
>>> sts("ベビーパンダがスライドを下ります。", "パンダがスライドを下って滑ります。")  # ["A baby panda goes down the slide.", "The panda slides down the slide."]
0.746
>>> sts = Pororo(task="similarity", lang="zh")
>>> sts('三名男子在街上做同样的舞蹈。', '街上有三个无衬衫的男人在跳舞。')  # ["Three men are doing the same dance on the street.", "There are three shirtless men dancing on the street."]
0.669
>>> sts = Pororo(task="similarity", lang="en")
>>> sts("Two dogs and one cat sitting on couch.", "Two dogs and a cat resting on a couch.")
0.921

static get_available_langs()

static get_available_models()

load(device: str)

Load user-selected task-specific model

Parameters

device (str) – device on which to load the model (for example "cpu" or "cuda")

Returns

User-selected task-specific model

Return type

object
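
To make the factory's role concrete, here is a minimal sketch of a language-to-model lookup built from the model names listed in this section. The table, the `available_langs` and `resolve_model` helpers, and the choice of the first entry as the default are all hypothetical; they illustrate the idea only and do not reproduce what `get_available_langs()`, `get_available_models()`, or `load()` actually do.

```python
# Model names are copied from the list in this section.
# The table layout and both helpers are hypothetical, for illustration only.
AVAILABLE_MODELS = {
    "ko": ["brainbert.base.ko.korsts", "brainsbert.base.ko.kornli.korsts"],
    "en": ["roberta.base.en.sts"],
    "ja": ["jaberta.base.ja.sts"],
    "zh": ["zhberta.base.zh.sts"],
}


def available_langs():
    """Return the supported language codes in sorted order."""
    return sorted(AVAILABLE_MODELS)


def resolve_model(lang, model=None):
    """Pick a model name for `lang`, validating an explicit choice."""
    if lang not in AVAILABLE_MODELS:
        raise ValueError(f"Unsupported language: {lang!r}")
    candidates = AVAILABLE_MODELS[lang]
    if model is None:
        # Assumption: treat the first listed model as the default.
        return candidates[0]
    if model not in candidates:
        raise ValueError(f"Model {model!r} is not available for {lang!r}")
    return model


print(available_langs())
print(resolve_model("ko", "brainsbert.base.ko.kornli.korsts"))
```

Validating the language and model name up front keeps a mistyped identifier from surfacing later as a harder-to-read loading error.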

class pororo.tasks.semantic_textual_similarity.PororoBertSts(model, config)

Bases: pororo.tasks.utils.base.PororoBiencoderBase

predict(sent_a: str, sent_b: str)

Compute the semantic textual similarity of two sentences with BERT

Parameters
  • sent_a (str) – first sentence to be encoded

  • sent_b (str) – second sentence to be encoded

Returns

similarity score

Return type

float

class pororo.tasks.semantic_textual_similarity.PororoSBertSts(model, config)

Bases: pororo.tasks.utils.base.PororoBiencoderBase

predict(sent_a: str, sent_b: str, **kwargs) → float

Compute the semantic textual similarity of two sentences with S-BERT

Parameters
  • sent_a (str) – first sentence to be encoded

  • sent_b (str) – second sentence to be encoded

Returns

similarity score

Return type

float
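
Sentence-BERT style bi-encoders typically encode each sentence into a vector on its own and then compare the two vectors, most often with cosine similarity. The sketch below shows that comparison step on its own in pure Python. That `PororoSBertSts.predict` scores sentences exactly this way is an assumption, and the three-dimensional vectors are made-up stand-ins for real sentence embeddings.

```python
from math import sqrt


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm_a = sqrt(sum(a * a for a in vec_a))
    norm_b = sqrt(sum(b * b for b in vec_b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings; a real encoder would produce hundreds of dimensions.
emb_a = [1.0, 2.0, 2.0]
emb_b = [2.0, 4.0, 4.0]  # same direction as emb_a, twice the length
emb_c = [2.0, -1.0, 0.0]  # orthogonal to emb_a

print(cosine_similarity(emb_a, emb_b))
print(cosine_similarity(emb_a, emb_c))
```

Cosine similarity ignores vector length, so `emb_a` and `emb_b` score as identical in direction while the orthogonal pair scores zero.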