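XPath JSON input data format

XPath parser plugin

Use the xpath_json input data format, provided by the XPath parser plugin, to parse JSON data into Telegraf metrics using XPath expressions.

For information about the supported XPath functions, see the underlying XPath library.

Note: The type of a field is specified using XPath functions. The only exception is integer fields, which must be specified in the fields_int section.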
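The remaining examples on this page use XML input. As a minimal sketch of the same approach applied to JSON, assume a hypothetical file example.json containing {"Gateway": {"Name": "Main Gateway", "Sequence": 12, "Status": "ok"}}. The queries assume that JSON object keys can be addressed like XML element names. Setting xpath_print_document = true in debug mode shows the document the parser actually builds.

```toml
[[inputs.file]]
  ## Hypothetical JSON file, see the assumed content above
  files = ["example.json"]
  data_format = "xpath_json"

  [[inputs.file.xpath]]
    [inputs.file.xpath.tags]
      gateway = "substring-before(/Gateway/Name, ' ')"

    ## Integer fields go into fields_int
    [inputs.file.xpath.fields_int]
      seqnr = "/Gateway/Sequence"

    ## Other field types are chosen through XPath expressions
    [inputs.file.xpath.fields]
      ok = "/Gateway/Status = 'ok'"
```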

Supported data formats

Name	`data_format` setting	Comment
Extensible Markup Language (XML)	`"xml"`
JSON	`"xpath_json"`
MessagePack	`"xpath_msgpack"`
Protocol buffers	`"xpath_protobuf"`	See additional parameters

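Protocol buffers additional settings

To use the protocol buffers format, you must specify additional (mandatory) properties for the parser. These options are described below.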

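`xpath_protobuf_file` (mandatory)

Use this option to specify the name of the protocol buffers definition file (.proto).

`xpath_protobuf_type` (mandatory)

This option specifies the top-level message type used to deserialize the data to be parsed. Usually, it is constructed from the package name and the message name in the protocol buffers definition file, in the form <package name>.<message name>.

`xpath_protobuf_import_paths` (optional)

If your .proto file imports other protocol buffers definitions (that is, it uses import statements), use this option to specify the paths that are searched for the imported definition files. By default, imports are searched only in ., the current working directory, which is usually the directory you start telegraf from.

Suppose a directory (e.g. /data/my_proto_files) contains multiple protocol buffers definitions (e.g. A.proto, B.proto and C.proto), and your top-level file (e.g. A.proto) imports at least one other definition: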

syntax = "proto3";

package foo;

import "B.proto";

message Measurement {
    ...
}

In that case, use the following settings:

[[inputs.file]]
  files = ["example.dat"]

  data_format = "xpath_protobuf"
  xpath_protobuf_file = "A.proto"
  xpath_protobuf_type = "foo.Measurement"
  xpath_protobuf_import_paths = [".", "/data/my_proto_files"]

  ...

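`xpath_protobuf_skip_bytes` (optional)

This option skips a given number of bytes before the parser attempts to parse the protocol buffers message. This is useful when the raw data carries a header, e.g. a message length, or in the case of gRPC messages.

The following list shows known headers and the corresponding values for xpath_protobuf_skip_bytes:

Name	Setting	Comment
gRPC protocol	5	gRPC adds a 5-byte header for length-prefixed messages
PowerDNS logging	2	Sent messages contain a 2-byte header containing the message length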
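As a minimal sketch of the gRPC row above, the following configuration reuses the A.proto definition and the foo.Measurement type from the import-paths example. The file name grpc_message.bin is a hypothetical placeholder for a captured length-prefixed message.

```toml
[[inputs.file]]
  ## Hypothetical file holding one gRPC length-prefixed message
  files = ["grpc_message.bin"]

  data_format = "xpath_protobuf"
  xpath_protobuf_file = "A.proto"
  xpath_protobuf_type = "foo.Measurement"

  ## Skip the 5-byte gRPC header before decoding the message
  xpath_protobuf_skip_bytes = 5

  ## Print the decoded document in debug mode to help write the XPath queries
  xpath_print_document = true

  ## Add one or more [[inputs.file.xpath]] sections here to define the metrics
```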

Configuration

[[inputs.file]]
  files = ["example.xml"]

  ## Data format to consume.
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "xml"

  ## PROTOCOL-BUFFER definitions
  ## Protocol-buffer definition file
  # xpath_protobuf_file = "sparkplug_b.proto"
  ## Name of the protocol-buffer message type to use in a fully qualified form.
  # xpath_protobuf_type = "org.eclipse.tahu.protobuf.Payload"
  ## List of paths to use when looking up imported protocol-buffer definition files.
  # xpath_protobuf_import_paths = ["."]
  ## Number of (header) bytes to ignore before parsing the message.
  # xpath_protobuf_skip_bytes = 0

  ## Print the internal XML document when in debug logging mode.
  ## This is especially useful when using the parser with non-XML formats like protocol-buffers
  ## to get an idea on the expression necessary to derive fields etc.
  # xpath_print_document = false

  ## Allow the results of one of the parsing sections to be empty.
  ## Useful when not all selected files have the exact same structure.
  # xpath_allow_empty_selection = false

  ## Get native data-types for all data-format that contain type information.
  ## Currently, protobuf, msgpack and JSON support native data-types
  # xpath_native_types = false

  ## Multiple parsing sections are allowed
  [[inputs.file.xpath]]
    ## Optional: XPath-query to select a subset of nodes from the XML document.
    # metric_selection = "/Bus/child::Sensor"

    ## Optional: XPath-query to set the metric (measurement) name.
    # metric_name = "string('example')"

    ## Optional: Query to extract metric timestamp.
    ## If not specified the time of execution is used.
    # timestamp = "/Gateway/Timestamp"
    ## Optional: Format of the timestamp determined by the query above.
    ## This can be any of "unix", "unix_ms", "unix_us", "unix_ns" or a valid Golang
    ## time format. If not specified, a "unix" timestamp (in seconds) is expected.
    # timestamp_format = "2006-01-02T15:04:05Z"
    ## Optional: Timezone of the parsed time
    ## This will locate the parsed time to the given timezone. Please note that
    ## for times with timezone-offsets (e.g. RFC3339) the timestamp is unchanged.
    ## This is ignored for all (unix) timestamp formats.
    # timezone = "UTC"

    ## Optional: List of fields to convert to hex-strings if they are
    ## containing byte-arrays. This might be the case for e.g. protocol-buffer
    ## messages encoding data as byte-arrays. Wildcard patterns are allowed.
    ## By default, all byte-array-fields are converted to string.
    # fields_bytes_as_hex = []

    ## Tag definitions using the given XPath queries.
    [inputs.file.xpath.tags]
      name   = "substring-after(Sensor/@name, ' ')"
      device = "string('the ultimate sensor')"

    ## Integer field definitions using XPath queries.
    [inputs.file.xpath.fields_int]
      consumers = "Variable/@consumers"

    ## Non-integer field definitions using XPath queries.
    ## The field type is defined using XPath expressions such as number(), boolean() or string(). If no conversion is performed the field will be of type string.
    [inputs.file.xpath.fields]
      temperature = "number(Variable/@temperature)"
      power       = "number(Variable/@power)"
      frequency   = "number(Variable/@frequency)"
      ok          = "Mode != 'ok'"

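In this configuration mode, you explicitly specify the fields and tags you want to extract from your data.

A configuration can contain multiple xpath subsections, e.g. to have the file plugin process the XML string multiple times. For details and help on XPath queries, consult the XPath syntax and the functions of the underlying library. Consider using an XPath tester such as xpather.com or Code Beautify's XPath Tester to help develop and debug your queries.

Configuration (batch)

As an alternative to the configuration above, fields can also be specified in a batch way. Instead of specifying each field in a section, you define name and value selectors that determine the name and value of the fields in the metric.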

[[inputs.file]]
  files = ["example.xml"]

  ## Data format to consume.
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "xml"

  ## PROTOCOL-BUFFER definitions
  ## Protocol-buffer definition file
  # xpath_protobuf_file = "sparkplug_b.proto"
  ## Name of the protocol-buffer message type to use in a fully qualified form.
  # xpath_protobuf_type = "org.eclipse.tahu.protobuf.Payload"
  ## List of paths to use when looking up imported protocol-buffer definition files.
  # xpath_protobuf_import_paths = ["."]

  ## Print the internal XML document when in debug logging mode.
  ## This is especially useful when using the parser with non-XML formats like protocol-buffers
  ## to get an idea on the expression necessary to derive fields etc.
  # xpath_print_document = false

  ## Allow the results of one of the parsing sections to be empty.
  ## Useful when not all selected files have the exact same structure.
  # xpath_allow_empty_selection = false

  ## Get native data-types for all data-format that contain type information.
  ## Currently, protobuf, msgpack and JSON support native data-types
  # xpath_native_types = false

  ## Multiple parsing sections are allowed
  [[inputs.file.xpath]]
    ## Optional: XPath-query to select a subset of nodes from the XML document.
    metric_selection = "/Bus/child::Sensor"

    ## Optional: XPath-query to set the metric (measurement) name.
    # metric_name = "string('example')"

    ## Optional: Query to extract metric timestamp.
    ## If not specified the time of execution is used.
    # timestamp = "/Gateway/Timestamp"
    ## Optional: Format of the timestamp determined by the query above.
    ## This can be any of "unix", "unix_ms", "unix_us", "unix_ns" or a valid Golang
    ## time format. If not specified, a "unix" timestamp (in seconds) is expected.
    # timestamp_format = "2006-01-02T15:04:05Z"

    ## Field specifications using a selector.
    field_selection = "child::*"
    ## Optional: Queries to specify field name and value.
    ## These options are only to be used in combination with 'field_selection'!
    ## By default the node name and node content is used if a field-selection
    ## is specified.
    # field_name  = "name()"
    # field_value = "."

    ## Optional: Expand field names relative to the selected node
    ## This allows to flatten out nodes with non-unique names in the subtree
    # field_name_expansion = false

    ## Tag specifications using a selector.
    # tag_selection = "child::*"
    ## Optional: Queries to specify tag name and value.
    ## These options are only to be used in combination with 'tag_selection'!
    ## By default the node name and node content is used if a tag-selection
    ## is specified.
    # tag_name  = "name()"
    # tag_value = "."

    ## Optional: Expand tag names relative to the selected node
    ## This allows to flatten out nodes with non-unique names in the subtree
    # tag_name_expansion = false

    ## Tag definitions using the given XPath queries.
    [inputs.file.xpath.tags]
      name   = "substring-after(Sensor/@name, ' ')"
      device = "string('the ultimate sensor')"

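Note: The resulting fields are always of type string.

It is also possible to mix both ways of specifying fields. In that case, explicitly defined tags and fields take precedence over the batch instances if both use the same tag or field name.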
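A minimal sketch of mixing both styles, assuming the example.xml document shown in the Examples section. The batch selector collects every Variable attribute, while an explicit fields_int entry reuses the name consumers so that, per the precedence rule above, the explicit integer definition should win.

```toml
[[inputs.file]]
  files = ["example.xml"]
  data_format = "xml"

  [[inputs.file.xpath]]
    metric_selection = "/Bus/child::Sensor"

    ## Batch definition: one field per Variable node, named after its first attribute
    field_selection = "child::Variable"
    field_name      = "name(@*[1])"
    field_value     = "number(@*[1])"

    ## Explicit definition with the same name as one of the batch fields;
    ## it is expected to take precedence and yield an integer field
    [inputs.file.xpath.fields_int]
      consumers = "Variable/@consumers"
```

The batch options are placed before the fields_int table because, in TOML, keys that follow a table header belong to that table.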

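metric_selection (optional)

You can specify an XPath query to select a subset of nodes from the XML document, each of which is used to generate a new metric with the specified fields, tags, etc.

Relative queries in subsequent options are evaluated relative to the metric_selection. To specify an absolute path, start the query with a slash (/).

Specifying metric_selection is optional. If it is not specified, all relative queries are evaluated relative to the root node of the XML document.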

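metric_name (optional)

By specifying metric_name, you can override the metric/measurement name with the result of the given XPath query. If it is not specified, the default metric name is used.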

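timestamp, timestamp_format, timezone (optional)

By default, the current time is used for all created metrics. To set the time from values in the XML document, specify an XPath query in timestamp and set the format in timestamp_format.

timestamp_format can be set to unix, unix_ms, unix_us, unix_ns, or an accepted Go "reference time" layout. For details and further examples of how to set the time format, see the Go time package. If timestamp_format is omitted, the result of the timestamp query is assumed to be in unix format.

The timezone setting locates the parsed time in the given timezone. This is helpful when the time contains no timezone information (e.g. 2023-03-09 14:04:40) and is not in UTC, which is the default. You can also set timezone to Local, which uses the configured timezone of the host.

For time formats that contain timezone information, e.g. RFC3339, the resulting timestamp is unchanged. The timezone setting is ignored for all unix timestamp formats.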
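As a hedged sketch of the timezone setting, suppose a document stores local wall-clock times such as 2023-03-09 14:04:40 in a hypothetical /Gateway/LocalTime node (this node is not part of example.xml). The layout string follows Go's reference time, and Europe/Berlin is an assumed example of an IANA timezone name.

```toml
[[inputs.file]]
  files = ["example.xml"]
  data_format = "xml"

  [[inputs.file.xpath]]
    ## Hypothetical node holding a time without timezone information
    timestamp = "/Gateway/LocalTime"
    ## Go reference-time layout matching "2023-03-09 14:04:40"
    timestamp_format = "2006-01-02 15:04:05"
    ## Interpret the parsed time in this timezone instead of UTC
    timezone = "Europe/Berlin"

    [inputs.file.xpath.fields]
      ok = "/Gateway/Status = 'ok'"
```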

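tags subsection

XPath queries in the form tag name = query, used to add tags to the metrics. The specified path can be absolute (starting with /) or relative. Relative paths use the currently selected node as reference.

Note: Results of tag queries are always converted to strings.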

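fields_int subsection

XPath queries in the form field name = query, used to add integer-typed fields to the metrics. The specified path can be absolute (starting with /) or relative. Relative paths use the currently selected node as reference.

Note: Results of fields_int queries are always converted to int64. The conversion fails if the query result is not convertible.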

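fields subsection

XPath queries in the form field name = query, used to add non-integer fields to the metrics. The specified path can be absolute (starting with /) or relative. Relative paths use the currently selected node as reference.

The type of the field is specified in the XPath query using XPath type-conversion functions such as number(), boolean() or string(). If no conversion is performed in the query, the field is of type string.

Note: Type-conversion functions always succeed, even if you convert text to a float.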
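The following sketch illustrates how the query determines the field type, assuming the example.xml document from the Examples section. The field names are chosen for illustration only.

```toml
[[inputs.file]]
  files = ["example.xml"]
  data_format = "xml"

  [[inputs.file.xpath]]
    [inputs.file.xpath.fields]
      ## number() yields a float field
      sequence = "number(/Gateway/Sequence)"
      ## boolean() yields a boolean field; true if the node exists
      has_name = "boolean(/Gateway/Name)"
      ## string() yields a string field
      status   = "string(/Gateway/Status)"
      ## No conversion: the field is of type string as well
      name     = "/Gateway/Name"
```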

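field_selection, field_name, field_value (optional)

You can specify an XPath query to select a set of nodes that form the fields of the metric. The specified path can be absolute (starting with /) or relative to the currently selected node. Each node selected by field_selection forms a new field within the metric.

The name and the value of each field can be specified using the optional field_name and field_value queries. Queries that do not start with / are evaluated relative to the selected field. If not specified, the field's name defaults to the node name and the field's value defaults to the content of the selected field node.

Note: field_name and field_value queries are only evaluated if a field_selection is specified.

Specifying field_selection is optional. It is an alternative way to specify fields, especially for documents where the node names are not known in advance or where a large number of fields need to be specified. These options can also be combined with the field specifications above.

Note: Type-conversion functions always succeed, even if you convert text to a float.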

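field_name_expansion (optional)

When true, field names selected with field_selection are expanded to a path relative to the selected node. This is necessary if, for example, all leaf nodes are selected as fields and those leaf nodes do not have unique names. That is, set this option to true if the selected fields contain duplicate names.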

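tag_selection, tag_name, tag_value (optional)

You can specify an XPath query to select a set of nodes that form the tags of the metric. The specified path can be absolute (starting with /) or relative to the currently selected node. Each node selected by tag_selection forms a new tag within the metric.

The name and the value of each tag can be specified using the optional tag_name and tag_value queries. Queries that do not start with / are evaluated relative to the selected tag. If not specified, the tag's name defaults to the node name and the tag's value defaults to the content of the selected tag node.

Note: tag_name and tag_value queries are only evaluated if a tag_selection is specified.

Specifying tag_selection is optional. It is an alternative way to specify tags, especially for documents where the node names are not known in advance or where a large number of tags need to be specified. These options can also be combined with the tag specifications above.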
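A minimal sketch of batch tags, assuming the example.xml document from the Examples section. With the default tag_name and tag_value, each selected Mode node should produce a tag named after the node with the node content as its value, e.g. Mode=busy for the first Sensor.

```toml
[[inputs.file]]
  files = ["example.xml"]
  data_format = "xml"

  [[inputs.file.xpath]]
    metric_selection = "/Bus/child::Sensor"

    ## Batch tags: every Mode child of the selected Sensor becomes a tag
    tag_selection = "child::Mode"

    ## Batch fields taken from the first attribute of each Variable node
    field_selection = "child::Variable"
    field_name      = "name(@*[1])"
    field_value     = "number(@*[1])"
```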

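tag_name_expansion (optional)

When true, tag names selected with tag_selection are expanded to a path relative to the selected node. This is necessary if, for example, all leaf nodes are selected as tags and those leaf nodes do not have unique names. That is, set this option to true if the selected tags contain duplicate names.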

Examples

The following example.xml file is used in the configuration examples below:

<?xml version="1.0"?>
<Gateway>
  <Name>Main Gateway</Name>
  <Timestamp>2020-08-01T15:04:03Z</Timestamp>
  <Sequence>12</Sequence>
  <Status>ok</Status>
</Gateway>

<Bus>
  <Sensor name="Sensor Facility A">
    <Variable temperature="20.0"/>
    <Variable power="123.4"/>
    <Variable frequency="49.78"/>
    <Variable consumers="3"/>
    <Mode>busy</Mode>
  </Sensor>
  <Sensor name="Sensor Facility B">
    <Variable temperature="23.1"/>
    <Variable power="14.3"/>
    <Variable frequency="49.78"/>
    <Variable consumers="1"/>
    <Mode>standby</Mode>
  </Sensor>
  <Sensor name="Sensor Facility C">
    <Variable temperature="19.7"/>
    <Variable power="0.02"/>
    <Variable frequency="49.78"/>
    <Variable consumers="0"/>
    <Mode>error</Mode>
  </Sensor>
</Bus>

Basic parsing

This example shows the basic usage of the xml parser.

Configuration

[[inputs.file]]
  files = ["example.xml"]
  data_format = "xml"

  [[inputs.file.xpath]]
    [inputs.file.xpath.tags]
      gateway = "substring-before(/Gateway/Name, ' ')"

    [inputs.file.xpath.fields_int]
      seqnr = "/Gateway/Sequence"

    [inputs.file.xpath.fields]
      ok = "/Gateway/Status = 'ok'"

Output

file,gateway=Main,host=Hugin seqnr=12i,ok=true 1598610830000000000

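In the tags definition, the XPath function substring-before() is used to extract only the substring before the space. To get the integer value of /Gateway/Sequence, we have to use the fields_int section, because there is no XPath expression that converts node values to integers (only to floats).

The ok field is filled with a boolean by specifying a query that compares the query result of /Gateway/Status with the string ok. Use the type conversions available in the XPath syntax to specify field types.

Time and metric names

This example uses the time and the metric name from the XML document itself.

Configuration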

[[inputs.file]]
  files = ["example.xml"]
  data_format = "xml"

  [[inputs.file.xpath]]
    metric_name = "name(/Gateway/Status)"

    timestamp = "/Gateway/Timestamp"
    timestamp_format = "2006-01-02T15:04:05Z"

    [inputs.file.xpath.tags]
      gateway = "substring-before(/Gateway/Name, ' ')"

    [inputs.file.xpath.fields]
      ok = "/Gateway/Status = 'ok'"

Output

Status,gateway=Main,host=Hugin ok=true 1596294243000000000

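In addition to the basic parsing example, the metric name is defined as the name of the /Gateway/Status node, and the timestamp is derived from the XML document instead of using the execution time.

Multi-node selection

For XML documents that contain metrics for multiple devices (like the Sensor nodes in example.xml), multiple metrics can be generated using node selection. This example shows how to generate a metric for each Sensor in the example.

Configuration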

[[inputs.file]]
  files = ["example.xml"]
  data_format = "xml"

  [[inputs.file.xpath]]
    metric_selection = "/Bus/child::Sensor"

    metric_name = "string('sensors')"

    timestamp = "/Gateway/Timestamp"
    timestamp_format = "2006-01-02T15:04:05Z"

    [inputs.file.xpath.tags]
      name = "substring-after(@name, ' ')"

    [inputs.file.xpath.fields_int]
      consumers = "Variable/@consumers"

    [inputs.file.xpath.fields]
      temperature = "number(Variable/@temperature)"
      power       = "number(Variable/@power)"
      frequency   = "number(Variable/@frequency)"
      ok          = "Mode != 'error'"

Output

sensors,host=Hugin,name=Facility\ A consumers=3i,frequency=49.78,ok=true,power=123.4,temperature=20 1596294243000000000
sensors,host=Hugin,name=Facility\ B consumers=1i,frequency=49.78,ok=true,power=14.3,temperature=23.1 1596294243000000000
sensors,host=Hugin,name=Facility\ C consumers=0i,frequency=49.78,ok=false,power=0.02,temperature=19.7 1596294243000000000

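Using the metric_selection option, we select all Sensor nodes in the XML document. Note that all field and tag definitions are relative to these selected nodes. An exception is the timestamp definition, which uses an absolute path and is therefore relative to the root node of the XML document.

Batch field processing with multi-node selection

For XML documents that contain metrics with a large number of fields, or with fields that are not known in advance (like an unknown set of Variable nodes in example.xml), field selectors can be used. This example shows how to generate a metric for each Sensor in the example, with fields derived from the Variable nodes.

Configuration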

[[inputs.file]]
  files = ["example.xml"]
  data_format = "xml"

  [[inputs.file.xpath]]
    metric_selection = "/Bus/child::Sensor"
    metric_name = "string('sensors')"

    timestamp = "/Gateway/Timestamp"
    timestamp_format = "2006-01-02T15:04:05Z"

    field_selection = "child::Variable"
    field_name = "name(@*[1])"
    field_value = "number(@*[1])"

    [inputs.file.xpath.tags]
      name = "substring-after(@name, ' ')"

Output

sensors,host=Hugin,name=Facility\ A consumers=3,frequency=49.78,power=123.4,temperature=20 1596294243000000000
sensors,host=Hugin,name=Facility\ B consumers=1,frequency=49.78,power=14.3,temperature=23.1 1596294243000000000
sensors,host=Hugin,name=Facility\ C consumers=0,frequency=49.78,power=0.02,temperature=19.7 1596294243000000000

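Using the metric_selection option, we select all Sensor nodes in the XML document. For each Sensor, we then use field_selection to select all Variable child nodes of the sensor as field nodes. Note that the field selection is relative to the selected nodes. For each selected field node, we use field_name and field_value to determine the field's name and value, respectively. field_name derives the name of the node's first attribute, while field_value derives the value of the first attribute and converts the result to a number.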
