Part-of-Speech Tagging

Part-of-speech tagging related modeling classes

class pororo.tasks.pos_tagging.PororoPosFactory(task: str, lang: str, model: Optional[str])[source]

Bases: pororo.tasks.utils.base.PororoFactoryBase

Conduct Part-of-Speech tagging

Korean (mecab-ko)

  • dataset: N/A

  • metric: N/A

Japanese (mecab-ipadic)

  • dataset: N/A

  • metric: N/A

English (nltk)

  • dataset: N/A

  • metric: N/A

Chinese (jieba)

  • dataset: N/A

  • metric: N/A

Parameters

sent (str) – input sentence to be tagged

Returns

list of (token, POS tag) tuples

Return type

List[Tuple[str, str]]

Examples

>>> pos = Pororo(task="pos", lang="ko")
>>> pos("안녕하세요. 제 이름은 카터입니다.")
[('안녕', 'NNG'), ('하', 'XSV'), ('시', 'EP'), ('어요', 'EF'), ('.', 'SF'), (' ', 'SPACE'),
 ('저', 'NP'), ('의', 'JKG'), (' ', 'SPACE'), ('이름', 'NNG'), ('은', 'JX'), (' ', 'SPACE'),
 ('카터', 'NNP'), ('이', 'VCP'), ('ᄇ니다', 'EF'), ('.', 'SF')]
>>> pos = Pororo("pos", lang="ja")
>>> pos("日本語でペラペラではないです")
[('日本語', '名詞'), ('で', '助詞'), ('ペラペラ', '副詞'), ('で', '助動詞'),
 ('は', '助詞'), ('ない', '助動詞'), ('です', '助動詞')]
>>> pos = Pororo("pos", lang="en")
>>> pos("The striped bats are hanging, on their feet for best.")
[('The', 'DT'), (' ', 'SPACE'), ('striped', 'JJ'), (' ', 'SPACE'), ('bats', 'NNS'),
 (' ', 'SPACE'), ('are', 'VBP'), (' ', 'SPACE'), ('hanging', 'VBG'), (',', ','),
 (' ', 'SPACE'), ('on', 'IN'), (' ', 'SPACE'), ('their', 'PRP$'), (' ', 'SPACE'),
 ('feet', 'NNS'), (' ', 'SPACE'), ('for', 'IN'), (' ', 'SPACE'), ('best', 'JJS'), ('.', '.')]
>>> pos = Pororo("pos", lang="zh")
>>> pos("乒乓球拍卖完了")
[('乒乓球', 'n'), ('拍卖', 'v'), ('完', 'v'), ('了', 'ul')]
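
The English output above interleaves (' ', 'SPACE') pairs with the word tokens. For downstream counting it can be convenient to drop them. The following is a minimal sketch that works on the plain list of tuples; `drop_spaces` is a hypothetical helper and not part of pororo, and the sample list is a shortened copy of the English example output.

```python
from typing import List, Tuple

Tagged = List[Tuple[str, str]]


def drop_spaces(tagged: Tagged) -> Tagged:
    """Return the tagged pairs without the entries whose tag is 'SPACE'."""
    return [(tok, tag) for tok, tag in tagged if tag != "SPACE"]


# Shortened copy of the English example output (hypothetical variable names).
sample = [("The", "DT"), (" ", "SPACE"), ("striped", "JJ"), (" ", "SPACE"), ("bats", "NNS")]
words_only = drop_spaces(sample)
print(words_only)  # [('The', 'DT'), ('striped', 'JJ'), ('bats', 'NNS')]
```

Filtering on the tag rather than on the token text keeps punctuation such as (',', ',') in place, which may or may not be what a given pipeline wants.
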

static get_available_langs()[source]

static get_available_models()[source]

load(device: str)[source]

Load user-selected task-specific model

Parameters

device (str) – device on which to load the model

Returns

User-selected task-specific model

Return type

object

class pororo.tasks.pos_tagging.PororoMecabPos(model, config)[source]

Bases: pororo.tasks.utils.base.PororoSimpleBase

stringfy(result: List[Tuple[str, str]]) → str[source]
predict(sent: str, **kwargs) → Union[Tuple[str, str], str][source]

Conduct Part-of-Speech tagging using mecab-ko

Parameters
  • sent (str) – input sentence to be tagged

  • return_surface (bool) – whether to return tokens in their surface form

  • return_string (bool) – whether to return the result as a single string

Returns

list of (token, POS tag) tuples, or a string when return_string is set

Return type

List[Tuple[str, str]]
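
When a single string is easier to log than a list of tuples, the pairs can also be joined by hand. The sketch below uses a `token/TAG` layout chosen purely for illustration; it is an assumption and not necessarily the format that `stringfy` or `return_string` produces. `join_tagged` is a hypothetical helper, not part of pororo.

```python
from typing import List, Tuple


def join_tagged(tagged: List[Tuple[str, str]], sep: str = "+") -> str:
    """Render (token, tag) pairs as 'token/TAG' items joined by `sep`.

    The layout is illustrative only, not the library's own string format.
    """
    return sep.join(f"{tok}/{tag}" for tok, tag in tagged)


# First few pairs of the Korean example output in this document.
pairs = [("안녕", "NNG"), ("하", "XSV"), ("시", "EP"), ("어요", "EF"), (".", "SF")]
rendered = join_tagged(pairs)
print(rendered)  # 안녕/NNG+하/XSV+시/EP+어요/EF+./SF
```
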

class pororo.tasks.pos_tagging.PororoMecabJap(model, config)[source]

Bases: pororo.tasks.utils.base.PororoSimpleBase

predict(sent: str, **kwargs)[source]

Conduct Part-of-Speech tagging using the mecab and ipadic modules

Parameters

sent (str) – input sentence to be tagged

Returns

list of (token, POS tag) tuples

Return type

List[Tuple[str, str]]

class pororo.tasks.pos_tagging.PororoJieba(model, config)[source]

Bases: pororo.tasks.utils.base.PororoSimpleBase

predict(sent: str, **kwargs)[source]

Conduct Part-of-Speech tagging using the jieba module

Parameters

sent (str) – input sentence to be tagged

Returns

list of (token, POS tag) tuples

Return type

List[Tuple[str, str]]

class pororo.tasks.pos_tagging.PororoNLTKPosTagger(model, config)[source]

Bases: pororo.tasks.utils.base.PororoSimpleBase

predict(sent: str, **kwargs)[source]

Conduct Part-of-Speech tagging using NLTK

Parameters

sent (str) – input sentence to be tagged

Returns

list of (token, POS tag) tuples

Return type

List[Tuple[str, str]]
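
Each backend emits its own tag set (for instance 'NNG' for Korean, '名詞' for Japanese, 'NNS' for English and 'n' for Chinese in the examples in this document), so code that compares languages usually needs a mapping step. The following is a minimal sketch for the English tags only, assuming they follow Penn Treebank conventions as the example output suggests. The prefix table and the names `COARSE_BY_PREFIX`, `coarse_tag` and `coarsen` are hypothetical and not part of pororo.

```python
from typing import Dict, List, Tuple

# Hypothetical prefix-to-category table for Penn Treebank style tags.
COARSE_BY_PREFIX: Dict[str, str] = {
    "NN": "NOUN",
    "VB": "VERB",
    "JJ": "ADJ",
    "RB": "ADV",
}


def coarse_tag(tag: str) -> str:
    """Map a fine-grained tag to a coarse label, or 'OTHER' if no prefix matches."""
    for prefix, coarse in COARSE_BY_PREFIX.items():
        if tag.startswith(prefix):
            return coarse
    return "OTHER"


def coarsen(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Replace every tag in a tagged sentence with its coarse label."""
    return [(tok, coarse_tag(tag)) for tok, tag in tagged]


# A few pairs taken from the English example output in this document.
sample = [("The", "DT"), ("striped", "JJ"), ("bats", "NNS"), ("are", "VBP"), ("best", "JJS")]
coarse = coarsen(sample)
print(coarse)
```

Tags that match no prefix, including 'SPACE' and punctuation, fall through to 'OTHER'. A real mapping would need a separate table for each backend's tag set.
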