Need Parser for english language

Post by Hemant Jos » Sun, 07 Nov 2004 03:08:19


Hello Basavraj,
Splitting text into sentences reliably is still an open problem for
many researchers. For example, a sentence usually ends with '.', but
it can also end with '?' or '!', and a '.' does not always end a
sentence (abbreviations are one such case). Take a look at section 4.2,
pg. 123 of the book "Foundations of Statistical Natural Language
Processing" by Christopher Manning and Hinrich Schütze, published by
MIT Press.

Also, if you plan to use Java, the java.text.BreakIterator class has
basic functionality for finding sentence boundaries in a given text.
You can use it as
BreakIterator st = BreakIterator.getSentenceInstance(); then call
st.setText(...) and iterate over the boundaries to get each sentence.
My personal experience with this class has not been great. You will
need to do a little more work to detect sentence boundaries correctly.
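
To make the BreakIterator usage concrete, here is a minimal sketch. The
class name BreakIteratorSentences, the split helper, and the choice of
Locale.US are my own assumptions for illustration; only
java.text.BreakIterator itself comes from the JDK.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, not part of any library.
public class BreakIteratorSentences {

    // Returns the sentences that BreakIterator finds in the given text.
    public static List<String> split(String text) {
        // Locale.US is an assumption; pick the locale of your text.
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.US);
        it.setText(text);

        List<String> sentences = new ArrayList<>();
        int start = it.first();
        // Each call to next() returns the offset of the next boundary,
        // or BreakIterator.DONE when the text is exhausted.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        for (String s : split("Parsing is hard. Is it solved? Not yet!")) {
            System.out.println(s);
        }
    }
}
```

Test it on your own text before relying on it; in my experience, text
with many abbreviations is where the built-in rules need extra help.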

My suggestion would be to look for Perl scripts, or maybe write one
yourself, using regular expression patterns to identify sentences in a
given text.
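
As a rough idea of what the regular expression approach could look
like, here is a sketch in Java rather than Perl, so it can be compared
with BreakIterator. The class name RegexSentences and the pattern are
my own assumptions, not a standard recipe: it splits after '.', '?' or
'!' only when followed by whitespace and an uppercase ASCII letter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class illustrating a naive regex-based splitter.
public class RegexSentences {

    // Boundary: whitespace that follows '.', '?' or '!' and precedes
    // an uppercase ASCII letter. This is a heuristic, not a full rule set.
    private static final Pattern BOUNDARY =
            Pattern.compile("(?<=[.?!])\\s+(?=[A-Z])");

    public static List<String> split(String text) {
        List<String> sentences = new ArrayList<>();
        for (String part : BOUNDARY.split(text.trim())) {
            if (!part.isEmpty()) {
                sentences.add(part);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "The parser failed. Did it recover? Yes! It costs 3.5 dollars.";
        for (String s : split(text)) {
            System.out.println(s);
        }
    }
}
```

The "3.5" is left alone because no whitespace follows the period, but
a pattern this simple will still split wrongly after abbreviations
such as "Dr. Smith", so you would need an exception list or extra
patterns for real text.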

I hope this helps.
-Hemant Joshi
http://www.yqcomputer.com/
University of Arkansas at Little Rock