kyoto_reader.sentence module

class kyoto_reader.sentence.Sentence(knp_string: str, dtid_offset: int, dmid_offset: int, doc_id: str)

Bases: object

A class to represent a single sentence.

blist

The pyknp BList object for this sentence.

Type: BList
doc_id

The document ID of this sentence.

Type: str
bps

Base phrases in this sentence.

Type: List[BasePhrase]
__init__(knp_string: str, dtid_offset: int, dmid_offset: int, doc_id: str) → None
Parameters:
  • knp_string (str) – KNP format string of this sentence.
  • dtid_offset (int) – The document-wide tag ID of the previous base phrase.
  • dmid_offset (int) – The document-wide morpheme ID of the previous morpheme.
  • doc_id (str) – The document ID of this sentence.
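
The two offsets let each sentence continue the document-wide numbering where the previous sentence stopped. The following is a minimal sketch of that idea in plain Python, not the library's implementation. `number_sentences` is a hypothetical helper, the sentences are stand-in lists of base-phrase strings rather than parsed KNP output, and the sketch assumes zero-based numbering in which the first base phrase of a sentence receives the running offset itself. The exact convention used by kyoto_reader is not confirmed here.

```python
from typing import List, Tuple


def number_sentences(sentences: List[List[str]]) -> List[List[Tuple[int, str]]]:
    """Assign document-wide IDs to the base phrases of each sentence.

    Hypothetical helper: `sentences` holds one list of base-phrase strings
    per sentence, standing in for parsed KNP output.
    """
    numbered = []
    dtid_offset = 0  # assumption: numbering starts at 0 for the first sentence
    for phrases in sentences:
        # Each phrase gets the running offset plus its position in the sentence.
        numbered.append([(dtid_offset + i, p) for i, p in enumerate(phrases)])
        # The next sentence continues where this one stopped.
        dtid_offset += len(phrases)
    return numbered


doc = [["太郎は", "本を", "読んだ"], ["花子も", "読んだ"]]
result = number_sentences(doc)
```

The same running-counter pattern would apply to `dmid_offset`, advancing by the number of morphemes instead of the number of base phrases.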
bnst_list()

Return the list of pyknp Bunsetsu objects in this sentence.

dtids

The document-wide tag IDs in this sentence.

mrph2dmid

A mapping from morpheme to its document-wide ID.
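
As an illustration of what such a mapping could look like, the sketch below builds a dictionary keyed by morpheme objects. It is not taken from the library: `FakeMorpheme` is a hypothetical stand-in for a pyknp Morpheme, `build_mrph2dmid` is a hypothetical helper, and the sketch assumes the document-wide ID is the offset plus the morpheme's position in the sentence.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class FakeMorpheme:
    """Hypothetical stand-in for a pyknp Morpheme (only two fields)."""
    mrph_id: int  # position of the morpheme within its sentence
    midasi: str   # surface form


def build_mrph2dmid(mrphs: List[FakeMorpheme], dmid_offset: int) -> Dict[FakeMorpheme, int]:
    """Map each morpheme to a document-wide ID (assumed: offset + position)."""
    return {mrph: dmid_offset + i for i, mrph in enumerate(mrphs)}


morphemes = [FakeMorpheme(0, "本"), FakeMorpheme(1, "を"), FakeMorpheme(2, "読んだ")]
mapping = build_mrph2dmid(morphemes, dmid_offset=7)
```

Keying the dictionary by the morpheme object, rather than by its surface form, keeps repeated words in a sentence distinct.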

mrph_list()

Return the list of pyknp Morpheme objects in this sentence.

sid

The sentence ID.

surf

The surface expression of this sentence.

tag_list()

Return the list of pyknp Tag objects in this sentence.
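
The attributes documented above can be read together to summarize a sentence. The sketch below is only illustrative: `describe` is a hypothetical helper, and a `SimpleNamespace` with made-up values stands in for a real `Sentence` so that the example runs without pyknp or a KNP-format string. It relies only on the attribute names `doc_id`, `sid`, `surf`, and `bps` listed on this page.

```python
from types import SimpleNamespace


def describe(sentence) -> str:
    """Format a one-line summary from the documented Sentence attributes."""
    return (
        f"{sentence.doc_id}/{sentence.sid}: {sentence.surf} "
        f"({len(sentence.bps)} base phrases)"
    )


# Stand-in object with made-up values; a real Sentence would be built
# from a KNP-format string instead.
fake_sentence = SimpleNamespace(
    doc_id="doc-1",
    sid="doc-1-001",
    surf="太郎は本を読んだ",
    bps=["太郎は", "本を", "読んだ"],
)
summary = describe(fake_sentence)
```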