Hunspell is a free spell checker and morphological analyzer library
and command-line tool, licensed under the LGPL/GPL/MPL tri-license.
Hunspell is used by the LibreOffice office suite, by free browsers such
as Mozilla Firefox and Google Chrome, and by other tools and operating
systems, such as Linux distributions and macOS. It is also a
command-line tool for Linux, Unix-like and other OSes.
It is designed for fast, high-quality spell checking and correction
for languages with word-level writing systems, including languages
with rich morphology, complex word compounding and character encoding.
Hunspell interfaces: Ispell-like terminal interface using Curses
library, Ispell pipe interface, C++/C APIs and shared library, also
with existing language bindings for other programming languages.
Hunspell’s code base comes from OpenOffice.org’s MySpell library,
developed by Kevin Hendricks (originally a C++ reimplementation of
spell checking and affixation of Geoff Kuenning’s International
Ispell from scratch, later extended with e.g. n-gram suggestions),
see http://lingucomponent.openoffice.org/MySpell-3.zip, and
its README, CONTRIBUTORS and license.readme (here: license.myspell) files.
Main features of Hunspell library, developed by László Németh:
Unicode support
Highly customizable suggestions: word-part replacement tables and
  stem-level phonetic and other alternative transcriptions to recognize
  and fix all typical misspellings, avoid suggesting offensive words, etc.
Complex morphology: dictionary and affix homonyms; twofold affix
  stripping to handle inflectional and derivational morpheme groups for
  agglutinative languages, like Azeri, Basque, Estonian, Finnish, Hungarian,
  Turkish; 64 thousand affix classes with an arbitrary number of affixes;
  conditional affixes, circumfixes, fogemorphemes, zero morphemes,
  virtual dictionary stems, forbidden words to avoid overgeneration, etc.
Handling complex compounds (for example, for Finno-Ugric, German and
  Indo-Aryan languages): recognizing compounds made of an arbitrary
  number of words, handling affixation within compounds, etc.
Custom dictionaries with affixation
Stemming
Morphological analysis (in custom item and arrangement style)
Morphological generation
SPELLML XML API over plain spell() API function for easier integration
  of stemming, morphological generation and custom dictionaries with affixation
Language-specific algorithms, like special casing of Azeri or Turkish
  dotted i and German sharp s, and special compound rules of Hungarian.
Main features of Hunspell command line tool, developed by László Németh:
Reimplementation of quick interactive interface of Geoff Kuenning’s Ispell
Custom dictionaries with optional affixation, specified by a model word
Multiple dictionary usage (for example hunspell -d en_US,de_DE,de_medical)
Various filtering options (bad or good words/lines)
Morphological analysis (option -m)
Stemming (option -s)
See man hunspell, man 3 hunspell and man 5 hunspell for the complete manual.
Translations: Hunspell has been translated into several languages already. If your language is missing or incomplete, please use Weblate to help translate Hunspell.
Dependencies
Build only dependencies:
g++ make autoconf automake autopoint libtool
Runtime dependencies:
               Mandatory          Optional
libhunspell    -                  -
hunspell tool  libiconv gettext   ncurses readline
Compiling on GNU/Linux and Unixes
We first need to download the dependencies. On Linux, gettext and
libiconv are part of the standard library. On other Unixes we
need to manually install them.
Open Mingw-w64 Win64 prompt and compile the same way as on Linux, see
above. Without mingw-w64-x86_64-libiconv the build still succeeds but
the hunspell tool cannot convert between dictionary encodings, so any
test or dictionary that declares a non-UTF-8 SET (e.g. ISO8859-1/2/15)
will fail at runtime.
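To see in advance whether a dictionary is affected, it can help to look at the SET line of its .aff file. The following is a minimal sketch and not part of Hunspell: the function name aff_encoding is made up for illustration, and it assumes the encoding is declared on a line of the form "SET <encoding>", as the paragraph above implies.

```shell
# Hypothetical helper (not shipped with Hunspell): print the encoding that
# an affix file declares on its "SET <encoding>" line.
# Exits with status 1 when the file has no SET line.
aff_encoding() {
  awk '
    $1 == "SET" {
      sub(/\r$/, "", $2)   # tolerate CRLF line endings
      print $2
      found = 1
      exit
    }
    END { exit !found }
  ' "$1"
}
```

A possible use, with en_US.aff standing in for any affix file: enc=$(aff_encoding en_US.aff) stores the declared encoding, and a value other than UTF-8 indicates a dictionary that needs the encoding conversion described above.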
Compiling in Cygwin environment
Download and install Cygwin environment for Windows with the following
extra packages:
make
automake
autoconf
libtool
gcc-g++ development package
ncurses, readline (for user interface)
iconv (character conversion)
Then compile the same way as on Linux. Cygwin builds depend on
Cygwin1.dll.
Debugging
It is recommended to install a debug build of the standard library:
libstdc++6-6-dbg
For debugging we need to create a debug build and then we need to start
gdb.
./configure CXXFLAGS='-g -O0 -Wall -Wextra'
make
./libtool --mode=execute gdb src/tools/hunspell
You can also pass the CXXFLAGS directly to make without calling
./configure, but we don’t recommend this way during long development
sessions.
After compiling and installing (see INSTALL) you can run the Hunspell
spell checker (compiled with user interface) with a Hunspell or MySpell
dictionary:
hunspell -d en_US text.txt
or without interface:
hunspell
hunspell -d en_GB -l <text.txt
Dictionaries consist of an affix (.aff) and dictionary (.dic) file, for
example, download American English dictionary files of LibreOffice
(older version, but with stemming and morphological generation) with
and with command line input and output, it’s possible to check its work quickly,
for example with the input words “example”, “examples”, “teached” and
“verybaaaaaaaaaaaaaaaaaaaaaad”:
Where in the output, * and + mean correct (accepted) words (* = dictionary stem,
+ = affixed forms of the following dictionary stem), and
& and # mean bad (rejected) words (& = with suggestions, # = without suggestions)
(see man hunspell).
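For scripting, the marker can be read from the first character of each output line. The following is a minimal sketch in POSIX shell that assumes only the four markers described above; the function name is illustrative and not part of Hunspell.

```shell
# Hypothetical helper: classify one line of hunspell output by its
# leading marker character (*, +, & or #).
classify_hunspell_line() {
  case "$1" in
    '*'*) echo "correct (dictionary stem)" ;;
    '+'*) echo "correct (affixed form)" ;;
    '&'*) echo "rejected (with suggestions)" ;;
    '#'*) echo "rejected (no suggestions)" ;;
    *)    echo "other" ;;   # blank lines, banners and anything else
  esac
}
```

For example, classify_hunspell_line '+ example' prints "correct (affixed form)". The sample line is hand-written for illustration and is not captured Hunspell output.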
Example for stemming:
$ hunspell -d en_US -s
mice
mice mouse
Example for morphological analysis (very limited with this English dictionary):
Note: morphological generation, stemming and analysis only work with
dictionaries whose entries carry morphological description fields
(po:, st:, is:, ts:, al:, ds:, dp: etc.; see man hunspell.5).
The example above relies on the older en_US dictionary linked earlier in
this README, which still ships these fields. Most current distributions
of en_US, fr, nl and hu_HU do not, and analyze, stem and generate
will return empty results for them. This is a dictionary property, not
a library bug. See tests/morph.aff and tests/morph.dic for a minimal
example.
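A quick way to tell whether a given dictionary is likely to support these features is to search its .dic file for the field prefixes. The following is a rough heuristic sketch, not a Hunspell tool: the function name has_morph_fields is made up, it checks only the seven prefixes listed above (the "etc." is not covered), and it does not look at fields attached to affix rules in the .aff file.

```shell
# Hypothetical helper (not part of Hunspell): succeed if a .dic file appears
# to contain at least one of the morphological fields named above.
# LC_ALL=C makes grep work byte-wise, so non-UTF-8 dictionaries are fine.
has_morph_fields() {
  LC_ALL=C grep -Eq '[[:space:]](po|st|is|ts|al|ds|dp):' "$1"
}
```

For instance, "if has_morph_fields en_US.dic; then echo yes; else echo no; fi" reports whether the file named en_US.dic (a placeholder path) carries any of these fields.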
Using Hunspell library with GCC
Including in your program:
#include <hunspell.hxx>
Linking with Hunspell static library:
g++ example.cxx -lhunspell-1.7
# or better, use pkg-config
g++ example.cxx $(pkg-config --cflags --libs hunspell)
Installing Hunspell (vcpkg)
Alternatively, you can build and install hunspell using vcpkg dependency manager:
The hunspell port in vcpkg is kept up to date by Microsoft team members and community contributors. If the version is out of date, please create an issue or pull request on the vcpkg repository.
Compiling on GNU/Linux and Unixes
We first need to download the dependencies. On Linux, gettext and libiconv are part of the standard library. On other Unixes we need to manually install them.
For Ubuntu:
Then run the following commands:
For a non-root install, use DESTDIR so libtool skips ldconfig:
For dictionary development, use the --with-warnings option of configure.
For interactive user interface of Hunspell executable, use the --with-ui option.
Optional developer packages:
In Ubuntu, the packages are:
Compiling on OSX and macOS
On macOS, always use clang as the compiler and not g++, because Homebrew dependencies are built with it.
Then run:
Compiling on Windows
Compiling with Mingw64 and MSYS2
Download Msys2, update everything and install the following packages:
Open Mingw-w64 Win64 prompt and compile the same way as on Linux, see above. Without mingw-w64-x86_64-libiconv the build still succeeds but the hunspell tool cannot convert between dictionary encodings, so any test or dictionary that declares a non-UTF-8 SET (e.g. ISO8859-1/2/15) will fail at runtime.
Compiling in Cygwin environment
Download and install Cygwin environment for Windows with the following extra packages:
Then compile the same way as on Linux. Cygwin builds depend on Cygwin1.dll.
Debugging
It is recommended to install a debug build of the standard library:
For debugging we need to create a debug build and then we need to start gdb.
You can also pass the CXXFLAGS directly to make without calling ./configure, but we don’t recommend this way during long development sessions.
If you like to develop and debug with an IDE, see documentation at https://github.com/hunspell/hunspell/wiki/IDE-Setup
Testing
Testing Hunspell (see tests in tests/ subdirectory):
or with Valgrind debugger:
For example:
Documentation
features and dictionary format:
http://hunspell.github.io/
Usage
After compiling and installing (see INSTALL) you can run the Hunspell spell checker (compiled with user interface) with a Hunspell or Myspell dictionary:
or without interface:
Dictionaries consist of an affix (.aff) and dictionary (.dic) file, for example, download American English dictionary files of LibreOffice (older version, but with stemming and morphological generation) with
and with command line input and output, it’s possible to check its work quickly, for example with the input words “example”, “examples”, “teached” and “verybaaaaaaaaaaaaaaaaaaaaaad”:
Where in the output, * and + mean correct (accepted) words (* = dictionary stem, + = affixed forms of the following dictionary stem), and & and # mean bad (rejected) words (& = with suggestions, # = without suggestions) (see man hunspell).
Example for stemming:
Example for morphological analysis (very limited with this English dictionary):
Other executables
The src/tools directory contains the following executables after compiling.
Example for morphological generation:
Note: morphological generation, stemming and analysis only work with dictionaries whose entries carry morphological description fields (po:, st:, is:, ts:, al:, ds:, dp: etc.; see man hunspell.5). The example above relies on the older en_US dictionary linked earlier in this README, which still ships these fields. Most current distributions of en_US, fr, nl and hu_HU do not, and analyze, stem and generate will return empty results for them. This is a dictionary property, not a library bug. See tests/morph.aff and tests/morph.dic for a minimal example.
Using Hunspell library with GCC
Including in your program:
Linking with Hunspell static library:
Dictionaries
Hunspell (MySpell) dictionaries:
Aspell dictionaries (conversion: man 5 hunspell):
László Németh, nemeth at numbertext org