It is often useful to compare one checklist to another. This project uses
pretty much the same algorithms as GNverifier, but does not require an
external database and can be used offline.
Prepare two files with names. There are 3 possible file formats:
A simple list of scientific names, one name per line.
Comma-separated (CSV) file with a ScientificName field.
Tab-separated (TSV) file with a ScientificName field.
For both CSV and TSV files, the fields TaxonID and Family are also ingested
if given. Any capitalization of either field name is accepted.
The Family field indicates the family a particular species is assigned to
according to the dataset. Normally this field is not needed, but in case of
tricky homonyms it helps to distinguish taxa from each other.
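As a rough illustration of the three input formats, the sketch below writes one small file of each kind into a scratch directory. The file names, the bird names, and the TaxonID values are made up for the example and are not part of the GNdiff project. The final comparison only runs if gndiff happens to be installed.

```shell
# Work in a scratch directory so that no existing file is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Format 1: a plain list, one scientific name per line.
cat > query.txt <<'EOF'
Bubo bubo
Pica pica
EOF

# Format 2: CSV with a ScientificName field; TaxonID and Family are optional.
cat > reference.csv <<'EOF'
TaxonID,ScientificName,Family
1,Bubo bubo,Strigidae
2,Pica pica,Corvidae
EOF

# Format 3: TSV with the same fields; TaxonID and Family use a different
# capitalization here, which the text above says is accepted.
printf 'taxonid\tScientificName\tFAMILY\n1\tBubo bubo\tStrigidae\n' > reference.tsv

# Run the comparison only if gndiff is available on this machine.
if command -v gndiff >/dev/null 2>&1; then
  gndiff query.txt reference.csv
fi
```

The plain list is the quickest format to prepare; the CSV and TSV forms are worth the extra columns only when identifiers or families need to be carried along.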
Run command:
gndiff query.csv reference.csv
The first of the two files should contain names that need to be matched.
The second file should contain reference names.
To see how it works, you can use test files from the GNdiff project, for
example ebird.csv and ioc-bird.csv. These files contain avian names from the
eBird and IOC Bird checklists, respectively.
Open each link and use Ctrl-S on Windows/Linux or ⌘-S on Mac to save
these files on your computer. You can then run:
gndiff path/to/ebird.csv path/to/ioc-bird.csv
# or
gndiff path/to/ioc-bird.csv path/to/ebird.csv
Options and flags
Flags and options can be given either before or after the file-path
arguments.
help
gndiff -h
# or
gndiff --help
# or
gndiff
version
gndiff -V
# or
gndiff --version
format
Sets the format of the comparison result and can take the following values:
csv: Comma-separated format
tsv: Tab-separated format
compact: JSON as one line
pretty: JSON in a human-readable format with indentation and line breaks.
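The sketch below loops over the four format values and saves one result file per format. It assumes the option is spelled -f, which this text does not show, so check gndiff --help before relying on it. The names query.txt, ref.txt and result-*.out are placeholders. When gndiff or the input files are missing, the loop only prints the command it would have run.

```shell
# Format values come from the list above; the -f spelling is an assumption.
formats="csv tsv compact pretty"

for fmt in $formats; do
  cmd="gndiff query.txt ref.txt -f $fmt"
  if command -v gndiff >/dev/null 2>&1 && [ -f query.txt ] && [ -f ref.txt ]; then
    # Save each result under a name that records the format used.
    $cmd > "result-$fmt.out"
  else
    # gndiff or the input files are not available: only show the command.
    echo "would run: $cmd"
  fi
done
```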
port (integer)
When port is set, GNdiff works as a web server with its RESTful API
exposed at the given port.
gndiff -p 8080
# or
gndiff --port 8080
quiet
This flag suppresses the warnings log, showing only the matching results.
gndiff query.txt ref.txt -q
# or
gndiff query.txt ref.txt --quiet
Please note that the matching result goes to STDOUT, while the log goes to
STDERR, so a similar result can be achieved by redirecting STDERR to /dev/null:
gndiff query.txt ref.txt 2> /dev/null
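To make the STDOUT/STDERR split concrete, here is a small sketch that uses a stand-in shell function instead of the real program. fake_gndiff is hypothetical and only imitates the behavior described above: results on STDOUT, log messages on STDERR.

```shell
# Stand-in for gndiff: results go to STDOUT, log messages go to STDERR.
fake_gndiff() {
  echo "match result"
  echo "WARNING: something worth logging" >&2
}

# Keep only the results; the log is discarded.
results=$(fake_gndiff 2>/dev/null)

# Keep only the log: STDERR is pointed at the captured stream first,
# then STDOUT is discarded.
log=$(fake_gndiff 2>&1 >/dev/null)

echo "results: $results"
echo "log: $log"
```

The same redirections apply to the real command. For example, sending STDOUT to one file and STDERR to another keeps both streams without mixing them.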
Family names as a disambiguation tool
Family sometimes help to distinquish homonyms in names lists. For example,
there are homonyms within one nomenclatural code, and homonyms in between
two nomenclatural codes.
The same nomenclatural code homonyms (senior, junior homonyms)
For example, in zoology a genus name Echidna has 3 homonyms:
Moray eel
Family Muraenidae -> Genus Echidna J. R. Forster
Egg-laying mammal Echidna
Family Tachyglossidae -> Echidna Cuvier, 1797 (junior homonym)
Currently genus Tachyglossus Illiger, 1811
Snake
Family Viperidae -> Echidna Merrem, 1820 (junior homonym)
Currently genus Bitis Gray, 1842
Such homonyms are not allowed within the same code and eventually they
get corrected. However historical records still contain them and
have to be disambiguated.
Homonyms from different nomenclatural codes (hemihomonyms)
There are no rules how to deal with homonyms that are treated by different (for
example Botanical and Zoological) nomenclatural codes.
GNdiff app takes two files with scientific names and compares them.
Introduction
GNdiff is a complementary tool to GNverifier. It is made to compare a checklist with checklists that are not in the GNverifier database. If you need to compare a list of names that are already in GNverifier, either use the GNverifier web app or install it locally and run it with the -s option, which provides the id or ids of the selected GNverifier data-sources, and the -o flag (--only-preferred), which limits results to the data-sources set by the -s option.
If both checklists of scientific names are local, use GNdiff.
GNdiff Installation
Using Homebrew on Mac OS X, Linux, and Linux on Windows (WSL2)
Homebrew is a popular package manager for Open Source software originally developed for Mac OS X. Now it is also available on Linux, and can easily be used on MS Windows 10 or 11, if Windows Subsystem for Linux (WSL) is installed.
Note that Homebrew requires some other programs to be installed, such as Curl, Git, and a compiler (GCC on Linux, Xcode on Mac). If that is too much, go to the Linux and Mac without Homebrew section.
To use GNdiff with Homebrew:
Install Homebrew
Open terminal and run the following commands:
Linux and Mac without Homebrew
Download the latest GNdiff release, untar it, and install the binary somewhere in your PATH.
MS Windows
Download the latest GNdiff release and unzip it.
One possible way would be to create a default folder for executables and place GNdiff there.
Press the Windows+R key combination and type "cmd". In the terminal window that appears, type:
Add the C:\Users\your_username\bin directory to your PATH environment variable.
Another, simpler way is to use the cd C:\Users\your_username\bin command in the cmd terminal window. Windows will then find the GNdiff program automatically when you run its commands from that directory.
Compile from source
Install Go according to installation instructions and run:
Usage
The two files being compared can use any combination of the 3 supported file formats. The default format of the comparison result is CSV.
Family names as a disambiguation tool
Family sometimes helps to distinguish homonyms in lists of names. For example, there are homonyms within one nomenclatural code, and homonyms between two nomenclatural codes.
The same nomenclatural code homonyms (senior, junior homonyms)
For example, in zoology the genus name Echidna has 3 homonyms:
Moray eel
Family Muraenidae -> Genus Echidna J. R. Forster
Egg-laying mammal Echidna
Family Tachyglossidae -> Echidna Cuvier, 1797 (junior homonym)
Currently genus Tachyglossus Illiger, 1811
Snake
Family Viperidae -> Echidna Merrem, 1820 (junior homonym)
Currently genus Bitis Gray, 1842
Such homonyms are not allowed within the same code and are eventually corrected. However, historical records still contain them, and they have to be disambiguated.
Homonyms from different nomenclatural codes (hemihomonyms)
There are no rules on how to deal with homonyms that are governed by different (for example, Botanical and Zoological) nomenclatural codes.
Sea Snail (Zoological Nomenclatural Code)
Family Ficidae -> Ficus variegata Röding, 1798
Red fig (Botanical Nomenclatural Code)
Family Moraceae -> Ficus variegata Blume
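The sketch below shows why a Family field helps, using the names from the examples above in a CSV file with ScientificName and Family fields. It does not reproduce GNdiff's matching logic; the awk filters are only a stand-in to show that the name alone is ambiguous while name plus family is not. The temporary file is created just for this illustration.

```shell
# A small checklist in which the same name string appears under two families.
names=$(mktemp)
cat > "$names" <<'EOF'
ScientificName,Family
Ficus variegata,Ficidae
Ficus variegata,Moraceae
Echidna,Muraenidae
Echidna,Viperidae
EOF

# By name alone, the sea snail and the fig cannot be told apart.
ambiguous=$(awk -F, 'NR > 1 && $1 == "Ficus variegata"' "$names" | wc -l | tr -d ' ')

# With the family added, exactly one record remains.
plant=$(awk -F, 'NR > 1 && $1 == "Ficus variegata" && $2 == "Moraceae"' "$names")

echo "rows named Ficus variegata: $ambiguous"
echo "botanical record: $plant"
```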