Adding Porter stemmer #19
base: main
("relate", "relat"),
("pirate", "pirat"),
("necessitate", "necessit"),
("you", "you"),
Add tests for "" and "a".
"ent", "ou", "ism", "ate", "iti", "ous", "ive", "ize"
];

fn double_consonant(word: &str, exceptions: Option<&str>) -> bool {
Simplify
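One possible simplification, as a minimal sketch: the body of the PR's function is not visible here, so this assumes that double_consonant should return true when the word ends in two identical consonants and that `exceptions` lists final letters to reject (for example "lsz" in Porter step 1b). It also assumes only a, e, i, o, u are vowels and leaves out the context-dependent handling of y.

```rust
/// Returns true when `word` ends in a doubled consonant that is not listed
/// in `exceptions`.
/// Assumption: only a, e, i, o, u count as vowels; 'y' is treated as a consonant.
fn double_consonant(word: &str, exceptions: Option<&str>) -> bool {
    // Walk the word from the end to read its last two characters.
    let mut rev = word.chars().rev();
    match (rev.next(), rev.next()) {
        (Some(last), Some(prev)) => {
            last == prev
                && !"aeiou".contains(last)
                && !exceptions.map_or(false, |ex| ex.contains(last))
        }
        // Fewer than two characters: there is no double letter.
        _ => false,
    }
}
```

Reading the characters from the end avoids byte-index arithmetic, so the empty string and single-letter words return false without a separate length check.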
src/stemmers/en/porter.rs
Outdated
};
}

static STEP3: Steps<7> = [
Steps is not required.
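A sketch of what dropping the alias could look like, assuming Steps<N> stands for a fixed-size array of (suffix, replacement) pairs. The PR's actual table is not shown on this page, so the seven entries below are taken from the published Porter step 3 rule list, and match_rule is a hypothetical helper added only for illustration.

```rust
// Porter step 3 rules as (suffix, replacement) pairs.
// A slice type carries no length parameter, so no `Steps<N>` alias is needed.
static STEP3: &[(&str, &str)] = &[
    ("icate", "ic"),
    ("ative", ""),
    ("alize", "al"),
    ("iciti", "ic"),
    ("ical", "ic"),
    ("ful", ""),
    ("ness", ""),
];

/// Hypothetical helper: finds the first rule whose suffix ends `word` and
/// returns the remaining stem together with the replacement text.
/// It does not check the measure condition (m > 0) that step 3 also requires.
fn match_rule<'a>(
    word: &'a str,
    rules: &[(&'static str, &'static str)],
) -> Option<(&'a str, &'static str)> {
    rules
        .iter()
        .find_map(|&(suffix, repl)| word.strip_suffix(suffix).map(|stem| (stem, repl)))
}
```

With a slice, adding or removing a rule does not require updating a length in the type.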
src/stemmers/en/porter.rs
Outdated
static ref STEP1A1: Regex = Regex::new(r"^(.+?)(ss|i)es$").unwrap();
static ref STEP1A2: Regex = Regex::new(r"^(.+?)([^s])s$").unwrap();
static ref STEP1B1: Regex = Regex::new(r"^(.+?)eed$").unwrap();
static ref STEP1B2: Regex = Regex::new(r"(ed|ing)$").unwrap();
Use a non-capturing group where the capture is not needed: (?:
src/stemmers/en/porter.rs
Outdated
}

// Step 1a
if STEP1A1.find(&word).is_some() {
Use is_match instead of find(...).is_some().
src/stemmers/en/porter.rs
Outdated
// Step 1b
if STEP1B1.find(&word).is_some() {
let stem = word[..word.len() - 1].to_string();
if compute_m(&stem) > 0{
compute_m(&word[..word.len() - 1])
.chars().last().unwrap().len()
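One way to read the two comments above together, as a hedged sketch: pass the slice to compute_m directly instead of allocating a String, and cut by the byte length of the last character rather than a fixed 1. Note that char has no len method in the standard library; len_utf8 is the call that returns a character's encoded length in bytes. The helper name drop_last_char is hypothetical.

```rust
/// Returns `word` without its final character, or `word` unchanged when it
/// is empty. Slicing by the last character's UTF-8 length keeps the cut on a
/// character boundary, so non-ASCII input does not panic.
fn drop_last_char(word: &str) -> &str {
    match word.chars().last() {
        Some(c) => &word[..word.len() - c.len_utf8()],
        None => word,
    }
}
```

Under these assumptions the step 1b check could become compute_m(drop_last_char(&word)) > 0, with no intermediate String.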
No description provided.