# snowballstem

This repository contains the Go stemmers generated by the [Snowball](https://github.com/snowballstem/snowball) project. They are maintained outside of the core bleve package so that they can be more easily reused in other contexts.

## Usage

Each stemmer package exports a single `Stem()` function, which operates on a snowball `Env` structure. The `Env` structure maintains all state for the stemmer. A new `Env` is created to point at an initial string. After stemming, the result of the `Stem()` operation can be retrieved using the `Current()` method. The `Env` structure can be reused for subsequent calls by setting the next string with the `SetCurrent()` method.

## Example

```
package main

import (
	"fmt"

	"github.com/blevesearch/snowballstem"
	"github.com/blevesearch/snowballstem/english"
)

func main() {

	// words to stem
	words := []string{
		"running",
		"jumping",
	}

	// build new environment
	env := snowballstem.NewEnv("")

	for _, word := range words {
		// set up environment for word
		env.SetCurrent(word)
		// invoke stemmer
		english.Stem(env)
		// print results
		fmt.Printf("%s stemmed to %s\n", word, env.Current())
	}
}
```

This produces the following output:

```
$ ./snowtest
running stemmed to run
jumping stemmed to jump
```
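
The reuse pattern from the Usage section (one `Env`, reset with `SetCurrent()` before each call) can also be wrapped in a small helper. The sketch below is illustrative only and uses just the standard library: `env`, `newEnv`, `stemFunc`, `toyStem`, and `stemAll` are hypothetical stand-ins, not names from this repository, and `toyStem` is not a Snowball algorithm (it only strips a trailing "ing"). With the real packages, the stand-ins would presumably be replaced by `snowballstem.Env` and a generated stemmer such as `english.Stem`.

```go
package main

import (
	"fmt"
	"strings"
)

// env is a hypothetical stand-in for the snowball Env structure:
// it holds the string currently being stemmed.
type env struct {
	current string
}

func newEnv(s string) *env         { return &env{current: s} }
func (e *env) SetCurrent(s string) { e.current = s }
func (e *env) Current() string     { return e.current }

// stemFunc is the assumed shape of a stemmer entry point:
// it rewrites the env's current string in place.
type stemFunc func(*env)

// toyStem is NOT a Snowball stemmer; it only strips a trailing "ing"
// so that the example runs without external dependencies.
func toyStem(e *env) {
	e.SetCurrent(strings.TrimSuffix(e.Current(), "ing"))
}

// stemAll stems every word with a single env that is reset per word,
// instead of allocating a new env for each input.
func stemAll(stem stemFunc, words []string) []string {
	e := newEnv("")
	out := make([]string, 0, len(words))
	for _, w := range words {
		e.SetCurrent(w)
		stem(e)
		out = append(out, e.Current())
	}
	return out
}

func main() {
	fmt.Println(stemAll(toyStem, []string{"jumping", "walking", "cat"}))
	// prints: [jump walk cat]
}
```

Passing the stemmer as a function value keeps the helper independent of any one language package, so the same loop could serve several stemmers.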

## Testing

The test harness for these stemmers is hosted in the main [Snowball](https://github.com/snowballstem/snowball) repository. The functional tests are built around the separate [snowball-data](https://github.com/snowballstem/snowball-data) repository, and fuzz-testing of the stemmers is supported there as well.

## Generating the Stemmers

```
$ export SNOWBALL=/path/to/github.com/snowballstem/snowball/after/snowball/built
$ go generate
```

## Updating the Go Generate Commands

A simple tool, `gengen.go`, is provided to automate updating these commands from the snowball algorithms directory:

```
$ go run gengen.go /path/to/github.com/snowballstem/snowball/algorithms
```