Handle duplicate data points

InfluxDB identifies unique data points by their measurement, tag set, and timestamp, each of which is part of the line protocol used to write data to InfluxDB.

web,host=host2,region=us_west firstByte=15.0 1559260800000000000
--- -------------------------                -------------------
 |               |                                    |
Measurement   Tag set                             Timestamp
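
As a rough illustration of the identity rule above, the following Python sketch extracts the measurement, tag set, and timestamp from a line of line protocol and compares two points. The `point_key` helper is hypothetical and not part of any InfluxDB client; it assumes an explicit timestamp and no escaped spaces or commas.

```python
def point_key(line):
    """Return (measurement, tag set, timestamp) for one line of line protocol.

    Simplified assumption: no escaped spaces or commas, timestamp present.
    """
    head, _fields, timestamp = line.split(" ")
    measurement, *tag_pairs = head.split(",")
    # Sort the tag pairs so the key is compared as a set of key=value pairs.
    tags = tuple(sorted(tuple(pair.split("=", 1)) for pair in tag_pairs))
    return measurement, tags, int(timestamp)


existing = "web,host=host2,region=us_west firstByte=24.0,dnsLookup=7.0 1559260800000000000"
new = "web,host=host2,region=us_west firstByte=15.0 1559260800000000000"

# The field sets differ, but the identifying parts are the same.
print(point_key(existing) == point_key(new))  # True
```

The field set is deliberately left out of the key: two points with different fields can still count as the same point.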

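Duplicate data points

For points that have the same measurement name, tag set, and timestamp, InfluxDB creates a union of the old and new field sets. For any matching field keys, InfluxDB uses the field value of the new point. For example: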

# Existing data point
web,host=host2,region=us_west firstByte=24.0,dnsLookup=7.0 1559260800000000000

# New data point
web,host=host2,region=us_west firstByte=15.0 1559260800000000000

After you submit the new data point, InfluxDB overwrites firstByte with the new field value and leaves the field dnsLookup unchanged:

# Resulting data point
web,host=host2,region=us_west firstByte=15.0,dnsLookup=7.0 1559260800000000000

from(bucket: "example-bucket")
  |> range(start: 2019-05-31T00:00:00Z, stop: 2019-05-31T12:00:00Z)
  |> filter(fn: (r) => r._measurement == "web")

Table: keys: [_measurement, host, region]
               _time  _measurement   host   region  dnsLookup  firstByte
--------------------  ------------  -----  -------  ---------  ---------
2019-05-31T00:00:00Z           web  host2  us_west          7         15
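
The merge behavior described in this section can be modeled with a plain dictionary union. This is only a sketch of the semantics, not how InfluxDB stores data internally; `merge_fields` is a hypothetical name chosen for illustration.

```python
def merge_fields(existing, new):
    """Union of both field sets; values from `new` win on matching keys."""
    return {**existing, **new}


existing_fields = {"firstByte": 24.0, "dnsLookup": 7.0}
new_fields = {"firstByte": 15.0}

print(merge_fields(existing_fields, new_fields))
# {'firstByte': 15.0, 'dnsLookup': 7.0}
```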

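Preserve duplicate points

To preserve both old and new field values in duplicate points, use one of the following strategies:

Add an arbitrary tag
Increment the timestamp

Add an arbitrary tag

Add an arbitrary tag with a unique value so that InfluxDB reads the duplicate points as unique.

For example, add a uniq tag to each data point: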

# Existing point
web,host=host2,region=us_west,uniq=1 firstByte=24.0,dnsLookup=7.0 1559260800000000000

# New point
web,host=host2,region=us_west,uniq=2 firstByte=15.0 1559260800000000000

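You don't need to retroactively add the unique tag to existing data points. The tag set is evaluated as a whole. The arbitrary uniq tag on the new point allows InfluxDB to recognize it as a unique point. However, this causes the schemas of the two points to differ and may lead to challenges when querying the data.

After writing the new point to InfluxDB: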

from(bucket: "example-bucket")
  |> range(start: 2019-05-31T00:00:00Z, stop: 2019-05-31T12:00:00Z)
  |> filter(fn: (r) => r._measurement == "web")

Table: keys: [_measurement, host, region, uniq]
               _time  _measurement   host   region  uniq  firstByte  dnsLookup
--------------------  ------------  -----  -------  ----  ---------  ---------
2019-05-31T00:00:00Z           web  host2  us_west     1         24          7

Table: keys: [_measurement, host, region, uniq]
               _time  _measurement   host   region  uniq  firstByte
--------------------  ------------  -----  -------  ----  ---------
2019-05-31T00:00:00Z           web  host2  us_west     2         15
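
One possible way to apply this strategy on the client side, before writing, is sketched below in Python. The `add_tag` helper is hypothetical, and it assumes the measurement and tag portion of each line contains no escaped spaces.

```python
def add_tag(line, key, value):
    """Append key=value to the tag set of one line of line protocol."""
    # The first space separates "measurement,tags" from the fields and timestamp.
    head, rest = line.split(" ", 1)
    return f"{head},{key}={value} {rest}"


points = [
    "web,host=host2,region=us_west firstByte=24.0,dnsLookup=7.0 1559260800000000000",
    "web,host=host2,region=us_west firstByte=15.0 1559260800000000000",
]

# Number the points 1, 2, ... so that each tag set becomes distinct.
tagged = [add_tag(p, "uniq", i) for i, p in enumerate(points, start=1)]
for line in tagged:
    print(line)
```

A counter is used here only because it matches the uniq=1 and uniq=2 values in the example; any value that is unique per point would serve the same purpose.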

Increment the timestamp

Increment the timestamp by one nanosecond to enforce the uniqueness of each point.

# Old data point
web,host=host2,region=us_west firstByte=24.0,dnsLookup=7.0 1559260800000000000

# New data point
web,host=host2,region=us_west firstByte=15.0 1559260800000000001

After writing the new point to InfluxDB:

from(bucket: "example-bucket")
  |> range(start: 2019-05-31T00:00:00Z, stop: 2019-05-31T12:00:00Z)
  |> filter(fn: (r) => r._measurement == "web")

Table: keys: [_measurement, host, region]
                         _time  _measurement   host   region  firstByte  dnsLookup
------------------------------  ------------  -----  -------  ---------  ---------
2019-05-31T00:00:00.000000000Z           web  host2  us_west         24          7
2019-05-31T00:00:00.000000001Z           web  host2  us_west         15
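
A minimal Python sketch of this strategy, applied to a batch before it is written, might look like the following. The `bump_colliding_timestamps` helper is hypothetical; it assumes nanosecond timestamps, no escaped spaces, and that tags are written in the same order on every line, since it compares the measurement and tag portion as a raw string.

```python
def bump_colliding_timestamps(lines):
    """Shift a timestamp forward by 1 ns steps until its series/time pair is unused."""
    seen = set()
    result = []
    for line in lines:
        series, fields, timestamp = line.split(" ")
        ts = int(timestamp)
        # Keep incrementing while another point in the batch already holds this slot.
        while (series, ts) in seen:
            ts += 1
        seen.add((series, ts))
        result.append(f"{series} {fields} {ts}")
    return result


batch = [
    "web,host=host2,region=us_west firstByte=24.0,dnsLookup=7.0 1559260800000000000",
    "web,host=host2,region=us_west firstByte=15.0 1559260800000000000",
]

for line in bump_colliding_timestamps(batch):
    print(line)
```

Note that this only detects collisions within the batch itself, not with points already stored in InfluxDB, and the shifted points no longer carry their original timestamps.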

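The output of the example queries in this article has been modified to clearly show the different approaches to handling duplicate data and their results.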
