Python: string index out of range
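I am training a chunker in Python using the method below, which I found in a book. I want to use this chunker to chunk tagged bigrams with pattern = "NP: {<DT>*<NN.*>}". I put together a simple version, but an error occurs when I pass in this pattern. I do not know what causes it and would appreciate any help. Here is the code: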
>>> import nltk
>>> from nltk.corpus import conll2000
>>> test = conll2000.chunked_sents('test.txt', chunk_types=['NP'])
>>> train = conll2000.chunked_sents('train.txt', chunk_types=['NP'])
>>> class ChunkParser(nltk.ChunkParserI):
...     def __init__(self, train):
...         train_data = [[(t, c) for w, t, c in nltk.chunk.tree2conlltags(sent)]
...                       for sent in train]
...         self.tagger = nltk.UnigramTagger(train_data)
...     def parse(self, sentence):
...         pos_tags = [pos for (word, pos) in sentence]
...         tagged_pos_tags = self.tagger.tag(pos_tags)
...         chunktags = [chunktag for (pos, chunktag) in tagged_pos_tags]
...         conlltags = [(word, pos, chunktag) for ((word, pos), chunktag)
...                      in zip(sentence, chunktags)]
...         return nltk.chunk.tree2conlltags(conlltags)
...
>>> NPChunker = ChunkParser(train)
>>> print NPChunker.evaluate(test)
ChunkParse score:
    IOB Accuracy: 43.4%
    Precision: 0.0%
    Recall: 0.0%
    F-Measure: 0.0%
>>> NPChunker = ChunkParser(pattern)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __init__
  File "C:\Python27\lib\site-packages\nltk\chunk\util.py", line 427, in tree2conlltags
    tags.append((child[0], child[1], "O"))
IndexError: string index out of range
1 answer
Hello:
The cause of this error is as follows:
a string object is being indexed;
the string is probably empty, so the index is out of range.
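As a minimal sketch in plain Python (no NLTK needed), here is one way to see where the index goes out of range. The traceback shows `child[0], child[1]` being read inside `tree2conlltags`, and `ChunkParser(pattern)` hands the constructor a string where it expects a list of chunked sentences. Assuming `pattern` holds the string quoted in the question, iterating over it yields single characters, and a one-character string has no index 1, so the string does not have to be empty for this to fail.

```python
# Assumption: `pattern` is the grammar string quoted in the question.
pattern = "NP: {<DT>*<NN.*>}"

# `for sent in train` over a string yields one character at a time.
sents = list(pattern)
first_sent = sents[0]            # the first "sentence" is a single character

# Iterating over that one-character "sentence" yields the same character,
# which is then treated as a (word, tag) leaf and indexed at 0 and 1.
child = list(first_sent)[0]

try:
    leaf = (child[0], child[1], "O")   # same indexing as in the traceback
    error = None
except IndexError as exc:
    error = str(exc)

print(repr(first_sent), repr(child), error)
```

The same failure would occur for any non-empty string passed in place of the training sentences.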
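Follow-up question
So what should I change in order to chunk tagged bigrams? Or is my NP pattern set up incorrectly?
For example:
>>> bigrams[:10]
[(('It', 'PRP'), ('was', 'VBD')), (('was', 'VBD'), ('now', 'RB')),..]
To chunk a bigram, I thought changing UnigramTagger above to BigramTagger would be enough, but that still does not work.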
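A hedged sketch of how the two pieces are usually kept apart in NLTK: a grammar string such as the NP pattern goes to `nltk.RegexpParser`, while a tagger-based chunker is trained on chunked sentences (trees), never on the pattern itself. The class below follows the question's code with `BigramTagger` swapped in and with `parse` returning `conlltags2tree(...)`; the question's version returns `tree2conlltags(conlltags)`, which tags every token `O` and is the likely reason precision and recall came out as 0.0%. The names `BigramChunker` and `toy_train` and the three-word sentence are made up for illustration, and the toy training set stands in for `conll2000.chunked_sents('train.txt', chunk_types=['NP'])`.

```python
import nltk

pattern = "NP: {<DT>*<NN.*>}"
sentence = [("the", "DT"), ("dog", "NN"), ("barked", "VBD")]

# 1) Rule-based chunking: the pattern is a grammar for RegexpParser.
regexp_chunker = nltk.RegexpParser(pattern)
regexp_tree = regexp_chunker.parse(sentence)


# 2) Trained chunking: the constructor takes chunked sentences (trees).
class BigramChunker(nltk.ChunkParserI):
    def __init__(self, train_sents):
        # Learn a mapping from POS tags to IOB chunk tags.
        train_data = [[(t, c) for w, t, c in nltk.chunk.tree2conlltags(sent)]
                      for sent in train_sents]
        self.tagger = nltk.BigramTagger(train_data)

    def parse(self, sentence):
        pos_tags = [pos for (word, pos) in sentence]
        tagged_pos_tags = self.tagger.tag(pos_tags)
        chunktags = [chunktag for (pos, chunktag) in tagged_pos_tags]
        conlltags = [(word, pos, chunktag)
                     for ((word, pos), chunktag) in zip(sentence, chunktags)]
        # Build a tree from the IOB triples (not the reverse conversion).
        return nltk.chunk.conlltags2tree(conlltags)


# Toy stand-in for the CoNLL-2000 training trees (illustration only).
toy_train = [regexp_tree]
bigram_chunker = BigramChunker(toy_train)
bigram_tree = bigram_chunker.parse(sentence)

print(nltk.chunk.tree2conlltags(regexp_tree))
print(nltk.chunk.tree2conlltags(bigram_tree))
```

With the real corpus, `BigramChunker(train)` would replace the toy training list, and the pattern would only ever be passed to `nltk.RegexpParser`.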
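Follow-up answer
Hello:
I have not used this package.
You could print the result of each step
and check whether it is what you expect,
or wrap the code in try ... except.
These are standard debugging techniques.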
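The print-each-step and try ... except advice can be sketched roughly as follows. The helper name `first_bad_item` and the stand-in `convert` function are hypothetical; in the question's setting, `convert` would be `nltk.chunk.tree2conlltags` and `items` would be whatever was passed to `ChunkParser`.

```python
def first_bad_item(items, convert):
    """Return (index, item, error text) for the first item that
    `convert` rejects, or None if every item converts cleanly."""
    for index, item in enumerate(items):
        try:
            convert(item)
        except Exception as exc:
            return index, item, "%s: %s" % (type(exc).__name__, exc)
    return None


def convert(sent):
    # Stand-in converter: reads each leaf the way the traceback line does.
    return [(child[0], child[1], "O") for child in sent]


good = [[("the", "DT"), ("dog", "NN")]]   # a list of tagged sentences
bad = "NP: {<DT>*<NN.*>}"                 # a string passed by mistake

# Step 1: print what was actually passed in.
print(type(good).__name__, type(bad).__name__)

# Step 2: find the first item that fails, instead of reading a bare traceback.
good_report = first_bad_item(good, convert)
bad_report = first_bad_item(bad, convert)
print(good_report)
print(bad_report)
```

Reporting the offending item alongside the exception makes it clear that the input itself has the wrong type, which a bare traceback does not show.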