Go machine learning & NLP libraries
The Go programming language is becoming increasingly popular for natural language processing. At the moment there seems to be no central directory of useful libraries, so I decided to publish a list of the packages that I am using or have stumbled upon.
Natural Language Processing
- go-stem: Go implementation of the Porter stemming algorithm
- snowball: Cgo wrapper for the snowball stemmer
- paicehusk: Implementation of the Paice/Husk Stemmer
- go-porterstemmer: A native Go clean room implementation of the Porter Stemming algorithm
- stemmer: English and German stemmers in native Go
- snowball: Native Go snowball stemmers for English, Spanish, French and Russian
- porter: Another Go port of the Porter stemmer
- golibstemmer: Go bindings for libstemmer
- snowball: Snowball stemmer for Go (cgo)
- icu: Cgo binding for the detection and conversion functions of the icu4c C library
- libtextcat: Cgo binding for the libtextcat C library
- textcat: A Go package for n-gram based text categorization, with support for UTF-8 and raw text
- go-eco: Similarity, dissimilarity and distance matrices; diversity, equitability and inequality measures; species richness estimators; coenocline models
- MMSEGO: Go implementation of MMSEG (a Chinese word splitting algorithm)
- unidecode: Unicode transliterator (also known as unidecode) for Go
- GNLP: A few structures for doing NLP analysis / experiments
- (Compact Language Detector: see my article about writing a Go binding)
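
To give an idea of what n-gram based text categorization (as offered by packages such as textcat) involves, here is a minimal sketch using only the standard library. It does not use the API of any package listed above. The function names `profile`, `distance` and `classify`, the profile size and the sample sentences are all assumptions made for illustration. The approach is the classic one: rank the most frequent character n-grams of a text, then pick the category whose ranked profile differs least from it.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// profile returns the character n-grams (n = 1..3) of text, ranked by
// descending frequency and truncated to at most size entries.
// Ties are broken alphabetically so the result is deterministic.
func profile(text string, size int) []string {
	counts := map[string]int{}
	for _, word := range strings.Fields(strings.ToLower(text)) {
		runes := []rune("_" + word + "_") // pad so word boundaries count too
		for n := 1; n <= 3; n++ {
			for i := 0; i+n <= len(runes); i++ {
				counts[string(runes[i:i+n])]++
			}
		}
	}
	grams := make([]string, 0, len(counts))
	for g := range counts {
		grams = append(grams, g)
	}
	sort.Slice(grams, func(i, j int) bool {
		if counts[grams[i]] != counts[grams[j]] {
			return counts[grams[i]] > counts[grams[j]]
		}
		return grams[i] < grams[j]
	})
	if len(grams) > size {
		grams = grams[:size]
	}
	return grams
}

// distance is an "out-of-place" measure: the sum of rank differences
// between the two profiles. N-grams missing from the category profile
// receive the maximum penalty, len(category).
func distance(doc, category []string) int {
	rank := make(map[string]int, len(category))
	for i, g := range category {
		rank[g] = i
	}
	total := 0
	for i, g := range doc {
		r, ok := rank[g]
		switch {
		case !ok:
			total += len(category)
		case r > i:
			total += r - i
		default:
			total += i - r
		}
	}
	return total
}

// classify returns the label of the category profile closest to text.
func classify(text string, categories map[string][]string, size int) string {
	doc := profile(text, size)
	labels := make([]string, 0, len(categories))
	for label := range categories {
		labels = append(labels, label)
	}
	sort.Strings(labels) // fixed order so ties resolve deterministically
	best, bestDist := "", -1
	for _, label := range labels {
		if d := distance(doc, categories[label]); bestDist < 0 || d < bestDist {
			best, bestDist = label, d
		}
	}
	return best
}

func main() {
	const size = 300 // assumed profile size; real systems tune this value
	categories := map[string][]string{
		"en": profile("the quick brown fox jumps over the lazy dog", size),
		"de": profile("der schnelle braune fuchs springt über den faulen hund", size),
	}
	fmt.Println(classify("the dog jumps over the fox", categories, size))
}
```

With training texts this small the result is only indicative; real profiles are built from much larger corpora.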
Machine Learning
- mlgo: Various “minimalistic” machine learning algorithms
- go-fann: Go bindings for the Fast Artificial Neural Networks (FANN) library
- neural-go: Implements a simple multilayer perceptron network
- bayesian: Bayesian classifier
- shield: Bayesian text classifier with flexible tokeniser and backend store support
- probab: Probability distribution functions - Bayesian inference
- libsvm: libSVM implementation in Go
- golinear: liblinear bindings for Go
- go-pr: A Gaussian classifier pattern recognition package
- go-galib: Genetic Algorithms library written in Go
- goml: On-line Machine Learning in Go
- quantized: ML Quantized classifier in Go
- nlp: Selected Machine Learning algorithms for basic natural language processing
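
Several of the entries above are Bayesian classifiers. As a rough illustration of the underlying idea, the following is a minimal sketch of a multinomial naive Bayes text classifier with add-one (Laplace) smoothing, written with the standard library only. It is not the API of bayesian, shield or any other package in this list. The type `Classifier`, its methods `Learn` and `Classify`, the whitespace tokenizer and the training sentences are assumptions made for this example.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"strings"
)

// Classifier is a hypothetical multinomial naive Bayes model over words.
type Classifier struct {
	wordCounts map[string]map[string]int // class -> word -> occurrences
	totalWords map[string]int            // class -> total number of tokens
	docCounts  map[string]int            // class -> number of training documents
	vocab      map[string]bool           // every word seen during training
	docs       int                       // total number of training documents
}

// NewClassifier returns an empty, ready-to-train classifier.
func NewClassifier() *Classifier {
	return &Classifier{
		wordCounts: map[string]map[string]int{},
		totalWords: map[string]int{},
		docCounts:  map[string]int{},
		vocab:      map[string]bool{},
	}
}

// tokenize lowercases the text and splits it on whitespace.
func tokenize(text string) []string {
	return strings.Fields(strings.ToLower(text))
}

// Learn adds one training document for the given class.
func (c *Classifier) Learn(class, text string) {
	if c.wordCounts[class] == nil {
		c.wordCounts[class] = map[string]int{}
	}
	for _, w := range tokenize(text) {
		c.wordCounts[class][w]++
		c.totalWords[class]++
		c.vocab[w] = true
	}
	c.docCounts[class]++
	c.docs++
}

// Classify returns the class with the highest log posterior,
// log P(class) + sum over words of log P(word | class),
// or "" if nothing has been learned yet.
func (c *Classifier) Classify(text string) string {
	classes := make([]string, 0, len(c.docCounts))
	for class := range c.docCounts {
		classes = append(classes, class)
	}
	sort.Strings(classes) // fixed order so ties resolve deterministically
	words := tokenize(text)
	best, bestScore := "", math.Inf(-1)
	for _, class := range classes {
		score := math.Log(float64(c.docCounts[class]) / float64(c.docs))
		// Add-one smoothing: unseen words get a small non-zero probability.
		denom := float64(c.totalWords[class] + len(c.vocab))
		for _, w := range words {
			score += math.Log(float64(c.wordCounts[class][w]+1) / denom)
		}
		if score > bestScore {
			best, bestScore = class, score
		}
	}
	return best
}

func main() {
	c := NewClassifier()
	c.Learn("spam", "buy cheap pills now")
	c.Learn("ham", "meeting at noon tomorrow")
	fmt.Println(c.Classify("cheap pills"))
}
```

Working with log probabilities avoids numeric underflow on longer texts, which is why the scores are summed rather than multiplied.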
I have not personally tested all of the packages above, so if you have any corrections or want to add to the list, please contact me. I will update this collection from time to time.
First published on March 19, 2013