Diagtool automates the data collection required for troubleshooting. It gathers Fluentd configuration and log files together with diagnostic information from the operating system, such as process information and network status. Configuration and log files can contain security-sensitive information, such as IP addresses and hostnames. Diagtool can therefore mask IP addresses, hostnames (in FQDN style), and user-defined keywords in the collected files.
Diagtool has been developed primarily for Fluentd (td-agent, fluent-package) and Fluent Bit (td-agent-bit) running on Linux.
On Windows, it currently supports only collecting the list of installed td-agent-gem gems for Fluentd (since v1.0.3).
Diagtool is written in Ruby, and installation requires a Ruby version higher than 2.3.
The supported Linux operating systems are listed on the following page: https://docs.fluentd.org/quickstart/td-agent-v2-vs-v3-vs-v4
Diagtool Installation
If you are using td-agent, you can install Diagtool with the “/usr/sbin/td-agent-gem” command.
# /usr/sbin/td-agent-gem install fluent-diagtool
Successfully installed fluent-diagtool-1.0.4
Parsing documentation for fluent-diagtool-1.0.4
Installing ri documentation for fluent-diagtool-1.0.4
Done installing documentation for fluent-diagtool after 0 seconds
1 gem installed
When installed with the /usr/sbin/td-agent-gem command, fluent-diagtool is placed under the “/opt/td-agent/embedded/lib/ruby/gems/2.4.0/bin/” directory. You can add that directory to $PATH in .bash_profile.
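The PATH change described above can be sketched as follows. This is a minimal example that assumes a Bash login shell and the gem directory quoted above; the variable name DIAGTOOL_BIN is only illustrative, and the 2.4.0 path component may differ with the Ruby version bundled in your td-agent.

```shell
# Directory quoted above for td-agent-gem installs (adjust to your environment).
DIAGTOOL_BIN="/opt/td-agent/embedded/lib/ruby/gems/2.4.0/bin"

# Make fluent-diagtool reachable in the current shell session.
export PATH="$PATH:$DIAGTOOL_BIN"

# To make the change permanent, add the same export line to ~/.bash_profile:
# echo 'export PATH="$PATH:/opt/td-agent/embedded/lib/ruby/gems/2.4.0/bin"' >> ~/.bash_profile
```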
If you are not using td-agent, you can install Diagtool with the standard gem command. In this case, a Ruby version higher than 2.3 may be required.
# gem install fluent-diagtool
Successfully installed fluent-diagtool-1.0.4
Parsing documentation for fluent-diagtool-1.0.4
Installing ri documentation for fluent-diagtool-1.0.4
Done installing documentation for fluent-diagtool after 0 seconds
1 gem installed
Usage
# fluent-diagtool --help
Usage: fluent-diagtool -o OUTPUT_DIR -m {yes | no} -w {word1,[word2...]} -f {listfile} -s {hash seed}
--precheck Run Precheck (Optional)
-t, --type fluentd|fluentbit Select the type of Fluentd (Mandatory)
-o, --output DIR Output directory (Mandatory)
-m, --mask yes|no Enable mask function (Optional : Default=no)
-w, --word-list word1,word2 Provide a list of user-defined words which will to be masked (Optional : Default=None)
-f, --word-file list_file provide a file which describes a List of user-defined words (Optional : Default=None)
-s, --hash-seed seed provide a word which will be used when generate the mask (Optional : Default=None)
-c, --conf config_file provide a full path of td-agent configuration file (Optional : Default=None)
-l, --log log_file provide a full path of td-agent log file (Optional : Default=None)
On Windows, only the -o, --output DIR option is supported.
Precheck
(Not supported on Windows)
To run correctly, Diagtool must be able to obtain basic information about Fluentd. Diagtool normally parses the required information automatically from the running Fluentd processes. The precheck option lets you confirm that Diagtool collects the information as expected.
The following output example shows the case where Diagtool properly collects the required information.
In some cases, for example with custom command line options, Diagtool may fail to identify the paths of the Fluentd configuration and log files. You then need to specify this information manually with the “-c” and “-l” options.
The following example shows a precheck failure in which Diagtool cannot extract the paths of the td-agent configuration and log files.
# fluent-diagtool --precheck -t fluentd
2020-05-28 05:45:14 +0000: [Diagtool] [INFO] [Precheck] Check OS parameters...
2020-05-28 05:45:14 +0000: [Diagtool] [INFO] [Precheck] operating system = CentOS Linux 8 (Core)
2020-05-28 05:45:14 +0000: [Diagtool] [INFO] [Precheck] kernel version = Linux 4.18.0-147.5.1.el8_1.x86_64
2020-05-28 05:45:14 +0000: [Diagtool] [INFO] [Precheck] Check td-agent parameters...
2020-05-28 05:45:14 +0000: [Diagtool] [INFO] [Precheck] td-agent conf path =
2020-05-28 05:45:14 +0000: [Diagtool] [INFO] [Precheck] td-agent conf file =
2020-05-28 05:45:14 +0000: [Diagtool] [INFO] [Precheck] td-agent log path =
2020-05-28 05:45:14 +0000: [Diagtool] [INFO] [Precheck] td-agent log =
2020-05-28 05:45:14 +0000: [Diagtool] [WARN] [Precheck] can not find td-agent conf path: please run diagtool command with -c /path/to/<td-agent conf file>
2020-05-28 05:45:14 +0000: [Diagtool] [WARN] [Precheck] can not find td-agent log path: please run diagtool command with -l /path/to/<td-agent log file>
Run diagtool
Once the precheck is complete, you are ready to run the tool. Of the provided options, “-o” is mandatory; the output is generated as a compressed file under the directory specified by the “-o” option.
(*) If the precheck reports that it cannot find the “td-agent conf path” or the “td-agent log path”, use “-c” and “-l” respectively to specify the file paths manually.
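As an illustration of a run that supplies the paths manually, the sketch below uses only options listed in the usage text. The output directory and the configuration and log file locations are assumptions (typical td-agent defaults) and should be replaced with the paths on your system.

```shell
# Hypothetical paths; replace them with the locations on your system.
OUTPUT_DIR="/tmp/diagtool-output"
CONF_FILE="/etc/td-agent/td-agent.conf"
LOG_FILE="/var/log/td-agent/td-agent.log"

# Create the output directory up front (harmless if it already exists).
mkdir -p "$OUTPUT_DIR"

# Run the tool only if it is installed, passing the paths that precheck could not find.
if command -v fluent-diagtool >/dev/null 2>&1; then
  fluent-diagtool -t fluentd -o "$OUTPUT_DIR" -c "$CONF_FILE" -l "$LOG_FILE"
else
  echo "fluent-diagtool not found in PATH" >&2
fi
```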
User-defined words can be specified with both the -w option and the -f option; when both are given, the words are merged.
The word list file specified with the -f option should use the following format.
# cat word_list_sample
centos8101
centos8102
NOTE: Only words that exactly match a specified keyword are masked. For instance, to mask “nginx1” and “nginx2”, specify “nginx1” and “nginx2” individually; a wildcard such as “nginx*” does not work.
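Because wildcards are not expanded, each keyword has to be listed explicitly. The following sketch generates such a word list file with a loop; the file location and the "nginx" keywords are only examples taken from the note above.

```shell
# Hypothetical location for the word list file.
WORD_FILE="${TMPDIR:-/tmp}/word_list_sample"

# Start with an empty file, then write one keyword per line.
: > "$WORD_FILE"
for i in 1 2; do
  echo "nginx$i" >> "$WORD_FILE"
done

# Show the resulting list.
cat "$WORD_FILE"
```

The resulting file can then be passed to Diagtool with the -f option.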
Mask Function
When Diagtool runs with the mask option, a mask log is also created in a ‘mask_{timestamp}.json’ file, so users can confirm how the mask was generated for each file. Diagtool also provides a hash-seed option, ‘-s’. When a hash seed is specified, the mask is generated from the original word and the hash seed, so that users can obtain a unique mask value.
Fluentd Diagnostic Tool
The scope of data collection:
Run diagtool
Command sample:
fluent-package (td-agent) on Windows: Fluent Package Command Prompt (Td-agent Command Prompt) with Administrator privilege
The “@include” directive in td-agent configuration file
The “@include” directive reuses configuration defined in other configuration files. Diagtool reads the Fluentd configuration and also gathers the files referenced by “@include” directives. The directive is described in detail on the following page:
https://docs.fluentd.org/configuration/config-file#6-re-use-your-config-the-include-directive
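To see in advance which files an “@include” directive pulls in, you can list the include targets of a configuration file yourself. The sketch below works on a throwaway file with hypothetical contents; it is not how Diagtool itself resolves includes, only a quick manual check.

```shell
# Create a throwaway configuration for illustration; the contents are hypothetical.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
<source>
  @type forward
</source>
@include conf.d/*.conf
@include /etc/td-agent/extra.conf
EOF

# Collect the target of every @include line (the second whitespace-separated field).
INCLUDES=$(awk '$1 == "@include" { print $2 }' "$CONF")
echo "$INCLUDES"

# Clean up the temporary file.
rm -f "$CONF"
```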
Mask Function
Mask sample - IP address: IPv4_{md5hash}
Mask sample - Hostname address: FQDN_{md5hash}
Mask sample - User defined keywords: Word_{md5hash}
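The mask formats above can be illustrated with a small sketch. How Diagtool actually combines the original word and the hash seed is not documented here, so the simple concatenation below is an assumption, as are the function name mask_value and the sample values. It relies on the md5sum command from GNU coreutils.

```shell
# Sketch only: concatenating word and seed before hashing is an assumption,
# not necessarily what Diagtool does internally.
mask_value() {
  prefix="$1"   # IPv4, FQDN or Word
  word="$2"     # original value to hide
  seed="$3"     # optional hash seed (may be empty)
  hash=$(printf '%s%s' "$word" "$seed" | md5sum | cut -d' ' -f1)
  printf '%s_%s\n' "$prefix" "$hash"
}

# The same word yields a different mask once a seed is supplied.
mask_value IPv4 192.168.0.1 ""
mask_value IPv4 192.168.0.1 mysecret
```

With a seed, someone who knows the original value but not the seed cannot reproduce the mask by hashing the value alone.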
Tested Environment