Paraphrase Generation

Paraphrase Generation modeling class

class pororo.tasks.paraphrase_generation.PororoParaphraseFactory(task: str, lang: str, model: Optional[str])[source]

Bases: pororo.tasks.utils.base.PororoFactoryBase

Paraphrase generation using Transformer Seq2Seq

Multi (transformer.large.multi.mtpg)

  • dataset: Internal data

  • metric: BLEU score

    Language   BLEU score
    --------   ----------
    Average    33.00
    English    54
    Korean     50
    Japanese   20
    Chinese    8

Multi (transformer.large.multi.fast.mtpg)

  • dataset: Internal data

  • metric: BLEU score

    Language   BLEU score
    --------   ----------
    Average    33.50
    English    56
    Korean     50
    Japanese   20
    Chinese    8

Parameters
  • text (str) – input sentence to be paraphrased

  • beam (int) – beam search size

  • temperature (float) – temperature scale

  • top_k (int) – top-K sampling vocabulary size

  • top_p (float) – top-p (nucleus) sampling ratio

  • no_repeat_ngram_size (int) – size of n-grams that are blocked from repeating in the output

  • len_penalty (float) – length penalty ratio
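The top_k and top_p parameters restrict sampling to a subset of the vocabulary at each decoding step. A minimal sketch of how top-K and nucleus (top-p) filtering typically work, independent of Pororo's internals (the function name and the toy distribution below are illustrative, not part of the library):

```python
def top_k_top_p_filter(probs, top_k=-1, top_p=-1.0):
    """Return the token ids kept by top-K and/or top-p (nucleus) filtering.

    probs: dict mapping token -> probability (assumed to sum to 1).
    top_k <= 0 or top_p <= 0 disables the corresponding filter,
    mirroring the -1 defaults in predict().
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]          # keep only the K most likely tokens
    if top_p > 0:
        kept, cumulative = [], 0.0
        for tok, p in ranked:
            kept.append((tok, p))
            cumulative += p
            if cumulative >= top_p:      # smallest set whose mass reaches top_p
                break
        ranked = kept
    return [tok for tok, _ in ranked]

# Illustrative next-token distribution
probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
print(top_k_top_p_filter(probs, top_k=2))    # ['a', 'b']
print(top_k_top_p_filter(probs, top_p=0.9))  # ['a', 'b', 'c']
```

With both filters disabled (the defaults), every token remains a sampling candidate and temperature alone shapes the distribution.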

Returns

generated paraphrase

Return type

str

Examples

>>> pg = Pororo(task="pg", lang="ko")
>>> pg("노는게 제일 좋아. 친구들 모여라. 언제나 즐거워.")  # "Playing is the best. Friends, gather round. It's always fun."
'노는 것이 가장 좋습니다. 친구들끼리 모여 주세요. 언제나 즐거운 시간 되세요.'  # "Playing is the best. Please gather with your friends. Always have a fun time."
>>> pg = Pororo("pg", lang="zh")
>>> pg("我喜欢足球")  # "I like soccer"
'我喜欢球球球'  # "I like balls"
>>> pg = Pororo(task="pg", lang="ja")
>>> pg("雨の日を聞く良い音楽をお勧めしてくれ。")  # "Recommend me good music to listen to on a rainy day"
'雨の日を聞くいい音楽を教えてください。'          # "Please tell me good music to listen to on a rainy day"
>>> pg = Pororo("pg", lang="en")
>>> pg("There is someone at the door.")
"Someone's at the door."
>>> pg("I'm good, but thanks for the offer.")
"I'm fine, but thanks for the deal."
static get_available_langs()[source]
static get_available_models()[source]
load(device: str)[source]

Load user-selected task-specific model

Parameters

device (str) – device information

Returns

User-selected task-specific model

Return type

object

class pororo.tasks.paraphrase_generation.PororoTransformerTransMulti(model, config, tokenizer)[source]

Bases: pororo.tasks.utils.base.PororoGenerationBase

predict(text: str, beam: int = 5, temperature: float = 1.0, top_k: int = -1, top_p: float = -1, no_repeat_ngram_size: int = 4, len_penalty: float = 1.0, **kwargs) → str[source]

Conduct paraphrase generation via machine translation

Parameters
  • text (str) – input sentence to be paraphrased

  • beam (int) – beam search size

  • temperature (float) – temperature scale

  • top_k (int) – top-K sampling vocabulary size

  • top_p (float) – top-p (nucleus) sampling ratio

  • no_repeat_ngram_size (int) – size of n-grams that are blocked from repeating in the output

  • len_penalty (float) – length penalty ratio

Returns

machine translated sentence

Return type

str
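The no_repeat_ngram_size=4 default above blocks any 4-gram from occurring twice in the generated output. A sketch of the standard blocking rule used by fairseq-style generators, which Pororo builds on (the function name is illustrative, not Pororo's API):

```python
def banned_next_tokens(generated, no_repeat_ngram_size):
    """Tokens that would complete an n-gram already present in `generated`.

    With no_repeat_ngram_size = n, if the last n-1 generated tokens match
    the first n-1 tokens of an earlier n-gram, the token that completed
    that earlier n-gram is banned at this step.
    """
    n = no_repeat_ngram_size
    if n <= 0 or len(generated) < n:
        return set()
    prefix = tuple(generated[-(n - 1):]) if n > 1 else tuple()
    banned = set()
    for i in range(len(generated) - n + 1):
        ngram = tuple(generated[i:i + n])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned

# With n=3, once ["the", "cat"] reappears, "sat" is banned because the
# 3-gram "the cat sat" already occurred earlier in the sequence.
print(banned_next_tokens(["the", "cat", "sat", "on", "the", "cat"], 3))
# {'sat'}
```

Setting the parameter higher makes the constraint looser (only long verbatim repeats are forbidden); setting it to 0 or below disables blocking entirely.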

class pororo.tasks.paraphrase_generation.PororoTransformerParaphrase(model, config, tokenizer)[source]

Bases: pororo.tasks.utils.base.PororoGenerationBase

predict(text: str, beam: int = 1, temperature: float = 1.0, top_k: int = -1, top_p: float = -1, no_repeat_ngram_size: int = 4, len_penalty: float = 1.0, **kwargs)[source]

Conduct paraphrase generation using Transformer Seq2Seq

Parameters
  • text (str) – input sentence to be paraphrased

  • beam (int) – beam search size

  • temperature (float) – temperature scale

  • top_k (int) – top-K sampling vocabulary size

  • top_p (float) – top-p (nucleus) sampling ratio

  • no_repeat_ngram_size (int) – size of n-grams that are blocked from repeating in the output

  • len_penalty (float) – length penalty ratio

Returns

generated paraphrase

Return type

str
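The len_penalty parameter controls how beam search trades off short against long candidates. A minimal sketch of the usual length normalization, in which a hypothesis's cumulative log-probability is divided by length**len_penalty (the convention in fairseq, which Pororo builds on; the function name and scores below are illustrative):

```python
def beam_score(log_prob_sum, length, len_penalty=1.0):
    """Length-normalized beam score: higher (closer to 0) is better.

    len_penalty > 1.0 favors longer outputs, < 1.0 favors shorter ones,
    and 1.0 is plain per-token averaging.
    """
    return log_prob_sum / (length ** len_penalty)

# Two finished hypotheses: a short one and a longer one.
short = beam_score(-4.0, length=4)    # -1.0 per token
long_ = beam_score(-9.0, length=10)   # -0.9 per token
print(long_ > short)  # True: at len_penalty=1.0 the longer hypothesis wins

# Lowering len_penalty shifts the preference toward brevity.
print(beam_score(-9.0, 10, len_penalty=0.5) > beam_score(-4.0, 4, len_penalty=0.5))
# False: with len_penalty=0.5 the shorter hypothesis now wins
```

So a len_penalty above 1.0 nudges the paraphraser toward longer rewrites, while a value below 1.0 nudges it toward terser ones.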