@zhengyuhong 2015-04-09T08:06:55Z

NLTK Notes

NLTK Python text-mining NLP


>>> import nltk
>>> sentence = """At eight o'clock on Thursday morning
... Arthur didn't feel very good."""
>>> tokens = nltk.word_tokenize(sentence)
>>> tokens
['At', 'eight', "o'clock", 'on', 'Thursday', 'morning',
'Arthur', 'did', "n't", 'feel', 'very', 'good', '.']
>>> tagged = nltk.pos_tag(tokens)
>>> tagged[0:6]
[('At', 'IN'), ('eight', 'CD'), ("o'clock", 'JJ'), ('on', 'IN'),
('Thursday', 'NNP'), ('morning', 'NN')]
>>> entities = nltk.chunk.ne_chunk(tagged)
>>> entities
Tree('S', [('At', 'IN'), ('eight', 'CD'), ("o'clock", 'JJ'),
('on', 'IN'), ('Thursday', 'NNP'), ('morning', 'NN'),
Tree('PERSON', [('Arthur', 'NNP')]),
('did', 'VBD'), ("n't", 'RB'), ('feel', 'VB'),
('very', 'RB'), ('good', 'JJ'), ('.', '.')])
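
The tagger output in the session above is a plain list of (token, tag) tuples, so ordinary Python is enough to work with it. Below is a minimal sketch that groups tokens by their part-of-speech tag. The helper name `group_by_tag` is hypothetical and not part of NLTK, and the `tagged` list is copied by hand from the `tagged[0:6]` output so the snippet runs without NLTK or its data packages installed.

```python
from collections import defaultdict

# (token, tag) pairs copied from the pos_tag output shown in the session,
# hard-coded here as an assumption so the sketch needs no NLTK install.
tagged = [
    ('At', 'IN'), ('eight', 'CD'), ("o'clock", 'JJ'), ('on', 'IN'),
    ('Thursday', 'NNP'), ('morning', 'NN'),
]


def group_by_tag(pairs):
    """Map each POS tag to the tokens carrying it, in order of appearance.

    Hypothetical helper for illustration; not an NLTK function.
    """
    groups = defaultdict(list)
    for token, tag in pairs:
        groups[tag].append(token)
    return dict(groups)


by_tag = group_by_tag(tagged)
print(by_tag['IN'])   # prints ['At', 'on']
```

In a live session the same helper could presumably be applied to the full `tagged` list returned by `nltk.pos_tag(tokens)` instead of the hard-coded copy.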