Snowball is a small string-processing programming language designed for creating stemming algorithms for use in information retrieval.
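To make the idea of a stemming algorithm concrete, the following Python sketch strips a few common English suffixes so that related word forms map to the same index term. It is only an illustration: the rule list, the minimum stem length, and the names `toy_stem` and `index_terms` are assumptions made for this example, and it is neither written in Snowball nor a reimplementation of the Porter stemmer.

```python
# Toy suffix-stripping stemmer. The rules below are illustrative assumptions,
# not the Porter algorithm or any stemmer distributed with Snowball.
SUFFIX_RULES = [
    ("sses", "ss"),
    ("ies", "i"),
    ("ing", ""),
    ("ed", ""),
    ("s", ""),
]

# Hypothetical guard: never reduce a word to fewer than this many characters.
MIN_STEM_LENGTH = 3


def toy_stem(word):
    """Return a crude stem by applying the first suffix rule that fits."""
    word = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            stem = word[: len(word) - len(suffix)] + replacement
            if len(stem) >= MIN_STEM_LENGTH:
                return stem
    return word


def index_terms(text):
    """Map whitespace-separated tokens to the set of their stems."""
    return {toy_stem(token) for token in text.split()}
```

In a retrieval setting, a query for "connecting" and a document containing "connected" would both be reduced to the same stem by `toy_stem`, which is what lets them match.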
The name Snowball was chosen as a tribute to the SNOBOL programming language, "with which it shares the concept of string patterns delivering signals that are used to control the flow of the program." When targeting ANSI C, the Snowball compiler turns each script into a program file and a corresponding header file (with .c and .h extensions).
The basic datatypes handled by Snowball are strings of characters, signed integers, and boolean truth values, or more simply strings, integers and booleans. Snowball's characters are either 8 or 16 bits wide, depending on the mode of use, so both ASCII and 16-bit Unicode are supported.
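The difference between the two character widths can be illustrated with Python's standard codecs. This is a sketch only: the helper name `bytes_per_character` is hypothetical, and the `latin-1` and `utf-16-le` codecs are used here as stand-ins for 8-bit and 16-bit characters rather than as a description of how the Snowball runtime stores strings.

```python
def bytes_per_character(text, encoding):
    """Return the encoded size of each character of text, in bytes."""
    return [len(ch.encode(encoding)) for ch in text]


word = "stem"

# 8-bit mode analogue: every character fits in a single byte.
eight_bit_widths = bytes_per_character(word, "latin-1")

# 16-bit mode analogue: every character occupies two bytes.
sixteen_bit_widths = bytes_per_character(word, "utf-16-le")
```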
Although the original Snowball website, maintained by Dr. Martin Porter and his colleague Richard Boulton, has been closed since 2014 following Dr. Porter's retirement, the site remains accessible, and the language continues to be developed as a community project on GitHub.
External links
- Snowball Stemming language and algorithms project on GitHub
- Porter Stemmer in Snowball
