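How to get the distinct values of a column by finding the max timestamp for each one, and then get the rest of the columns

I have a large Oracle (Oracle Database 12c Enterprise Edition Release 12.1.0.2.0) table, say table_name, which is updated every 15 seconds. It has many columns, but the ones I am concerned with are: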

    Name            Null?    Type
    --------------- -------- ---------------------------------
    ID_1            NOT NULL NUMBER(38)
    UTC_TIMESTAMP   NOT NULL TIMESTAMP(6) WITH TIME ZONE
    ID_2                     VARCHAR2(8)
    SERVER_NAME              VARCHAR2(256)
    ID_3                     NUMBER(38)
    COUNT_1                  NUMBER(38)
    COUNT_2                  NUMBER(38)

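What I want to do is:

1) Fetch the rows where UTC_TIMESTAMP <= current_date and UTC_TIMESTAMP > current_date - 5 minutes (roughly 125K-150K rows).

2) This data will contain duplicate ID_1 values, so I only want the record with max(UTC_TIMESTAMP) for each ID_1. That leaves us with distinct ID_1 values.

I have tried the following SQL: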

    with temp_1 as (
      select m.ID_2, m.ID_1, max(utc_timestamp) max_utc_timestamp
      from commsdesk.table_name m
      where m.ID_2 = 'TWC'
      group by m.ID_2, m.ID_1
    )
    select f.utc_timestamp
    from commsdesk.table_name f
    join temp_1 t
      on t.max_utc_timestamp = f.utc_timestamp
      and t.ID_2 = f.ID_2
      and t.ID_1 = f.ID_1;

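Problem: I can only get ID_2, ID_1 and UTC_TIMESTAMP, but I want all the other columns too. Can this be done in SQL?

Within a 5-minute window there are about 2,200 distinct ID_1 values and about 125K-150K records. So copying 125K-150K records into an Excel sheet and filtering on each of the 2,200 ID_1 values to find the max UTC_TIMESTAMP for each one is impractical. I could do it that way too, though, if there is a quick method using macros.

Sample dummy data: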

    ID_2 SERVER_NAME  ID_3 ID_1 UTC_TIMESTAMP                          COUNT_1 COUNT_2
    ABC  PQRS.ABC.TPO    2  303 24-JUL-17 03.41.55.000000000 PM +00:00       4       0
    ABC  PQRS.ABC.TPO    2 1461 24-JUL-17 03.42.48.000000000 PM +00:00       1       7
    ABC  PQRS.ABC.TPO    2    1 24-JUL-17 03.41.36.000000000 PM +00:00       2       3
    ABC  PQRS.ABC.TPO    2 1461 24-JUL-17 03.41.16.000000000 PM +00:00       0       8
    ABC  PQRS.ABC.TPO    1    1 24-JUL-17 03.41.11.000000000 PM +00:00       5       0
    ABC  SRP.ROP.MTP     1    1 24-JUL-17 03.41.23.000000000 PM +00:00       0       0
    ABC  SRP.ROP.MTP     2  303 24-JUL-17 03.41.34.000000000 PM +00:00       0       0
    ABC  SRP.ROP.MTP     2 1461 24-JUL-17 03.41.31.000000000 PM +00:00       0       0
    ABC  SRP.ROP.MTP     4  303 24-JUL-17 03.41.26.000000000 PM +00:00       4       8
    ABC  SRP.ROP.MTP     2  303 24-JUL-17 03.41.20.000000000 PM +00:00       0       0
    ABC  SRP.ROP.MTP     1 1461 24-JUL-17 03.41.01.000000000 PM +00:00       3       8
    ABC  SRP.ROP.MTP     4    1 24-JUL-17 03.41.18.000000000 PM +00:00       9       1

Expected output:

    ID_1 UTC_TIMESTAMP                          COUNT_1 COUNT_2
       1 24-JUL-17 03.41.36.000000000 PM +00:00       2       3
     303 24-JUL-17 03.41.55.000000000 PM +00:00       4       0
    1461 24-JUL-17 03.42.48.000000000 PM +00:00       1       7

You can use the keep (dense_rank last ...) version of the max() aggregate function (or first/min, if you prefer), e.g.:

    select id_1,
      max(utc_timestamp),
      max(id_2) keep (dense_rank last order by utc_timestamp) as id_2,
      max(server_name) keep (dense_rank last order by utc_timestamp) as server_name,
      max(id_3) keep (dense_rank last order by utc_timestamp) as id_3,
      max(count_1) keep (dense_rank last order by utc_timestamp) as count_1,
      max(count_2) keep (dense_rank last order by utc_timestamp) as count_2
    from table_name
    where utc_timestamp > current_timestamp - interval '5' minute
    and utc_timestamp <= current_timestamp
    group by id_1
    order by id_1;

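The query groups by id_1, and max(utc_timestamp) is a "normal" aggregate, since you want the most recent timestamp. Each of the other columns keeps the value associated with the row that has that maximum timestamp, for that id_1.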
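To make the difference between a plain max() and the keep version concrete, here is a minimal sketch. It is an illustration only, not taken from the table above: the inline demo data set and its column names ts and val are made up, with values borrowed from three of the id_1 = 1 rows in the question's sample data.

```sql
-- Hypothetical inline data: three rows for a single id_1.
with demo (id_1, ts, val) as (
  select 1, timestamp '2017-07-24 15:41:11 +00:00', 5 from dual union all
  select 1, timestamp '2017-07-24 15:41:36 +00:00', 2 from dual union all
  select 1, timestamp '2017-07-24 15:41:18 +00:00', 9 from dual
)
select id_1,
  max(ts) as max_ts,                -- latest timestamp: 15:41:36
  max(val) as plain_max,            -- largest val over all rows: 9
  max(val) keep (dense_rank last order by ts) as val_at_max_ts
                                    -- val on the row with the latest ts: 2
from demo
group by id_1;
```

The outer max() in the keep version only matters when several rows share the latest timestamp; in that case it picks the largest value among the tied rows, separately for each column.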

With some dummy data:

    insert into table_name (id_1, utc_timestamp, id_2, server_name, id_3, count_1, count_2)
    values (1, systimestamp at time zone 'UTC' - interval '30' second, 'TWC', 'test1', 301, 1, 1);
    insert into table_name (id_1, utc_timestamp, id_2, server_name, id_3, count_1, count_2)
    values (1, systimestamp at time zone 'UTC' - interval '60' second, 'TWC', 'test2', 302, 2, 2);
    insert into table_name (id_1, utc_timestamp, id_2, server_name, id_3, count_1, count_2)
    values (1, systimestamp at time zone 'UTC' - interval '90' second, 'TWC', 'test3', 303, 3, 3);
    insert into table_name (id_1, utc_timestamp, id_2, server_name, id_3, count_1, count_2)
    values (2, systimestamp at time zone 'UTC' - interval '45' second, 'TWC', 'test4', 304, 4, 4);
    insert into table_name (id_1, utc_timestamp, id_2, server_name, id_3, count_1, count_2)
    values (2, systimestamp at time zone 'UTC' - interval '15' second, 'TWC', 'test5', 305, 5, 5);

the keep (dense_rank last ...) query above gets the result:

          ID_1 MAX(UTC_TIMESTAMP)          ID_2     SERVE       ID_3    COUNT_1    COUNT_2
    ---------- --------------------------- -------- ----- ---------- ---------- ----------
             1 2017-07-21 18:38:22.944 UTC TWC      test1        301          1          1
             2 2017-07-21 18:38:38.399 UTC TWC      test5        305          5          5

You can get the same result with something closer to your attempt:

    with cte as (
      select id_1, max(utc_timestamp) max_utc_timestamp
      from table_name m
      where utc_timestamp > current_timestamp - interval '5' minute
      and utc_timestamp <= current_timestamp
      group by id_1
    )
    select t.id_1, t.utc_timestamp, t.id_2, t.server_name, t.id_3, t.count_1, t.count_2
    from cte
    join table_name t
      on t.id_1 = cte.id_1
      and t.utc_timestamp = cte.max_utc_timestamp
    order by t.id_1;

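... assuming the combination of id_1 and utc_timestamp is unique (I am not sure why you were joining on id_2 as well; maybe it is needed for uniqueness?). But this is less efficient, because it has to hit the real table twice: once to find the maximum timestamp for each id_1, and then again in the join. It may be worth running both versions to compare the results, the timings and the execution plans.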
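One way to look at the plans is EXPLAIN PLAN together with DBMS_XPLAN.DISPLAY. The following is only a sketch, assuming your session is allowed to write to the plan table; it shows the optimizer's estimated plan, not the one actually used at run time. Repeat it with the join version of the query and compare the two outputs.

```sql
-- Ask the optimizer for its estimated plan for the keep version
-- (only two of the kept columns are shown here to keep the sketch short).
explain plan for
select id_1,
  max(utc_timestamp),
  max(count_1) keep (dense_rank last order by utc_timestamp) as count_1,
  max(count_2) keep (dense_rank last order by utc_timestamp) as count_2
from table_name
where utc_timestamp > current_timestamp - interval '5' minute
and utc_timestamp <= current_timestamp
group by id_1;

-- Display the plan that was just explained.
select * from table(dbms_xplan.display);
```

In the join version you would expect to see table_name accessed twice in the plan, against a single access here. For elapsed times, SQL*Plus can report them with set timing on.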


With your sample data (updated on 2017-07-24), the first query above, modified to use a fixed timestamp range that matches the data, gets:

          ID_1 MAX(UTC_TIMESTAMP)                ID_ SERVER_NAME        ID_3    COUNT_1    COUNT_2
    ---------- --------------------------------- --- ------------ ---------- ---------- ----------
             1 2017-07-24 15:41:36.000000 +00:00 ABC PQRS.ABC.TPO          2          2          3
           303 2017-07-24 15:41:55.000000 +00:00 ABC PQRS.ABC.TPO          2          4          0
          1461 2017-07-24 15:42:48.000000 +00:00 ABC PQRS.ABC.TPO          2          1          7

Or, leaving out the columns you are not interested in:

    select id_1,
      max(utc_timestamp),
      max(count_1) keep (dense_rank last order by utc_timestamp) as count_1,
      max(count_2) keep (dense_rank last order by utc_timestamp) as count_2
    from table_name
    where utc_timestamp > timestamp '2017-07-24 16:40:00 Europe/London' -- current_timestamp - interval '5' minute
    and utc_timestamp <= timestamp '2017-07-24 16:45:00 Europe/London' -- current_timestamp
    group by id_1
    order by id_1;

          ID_1 MAX(UTC_TIMESTAMP)                   COUNT_1    COUNT_2
    ---------- --------------------------------- ---------- ----------
             1 2017-07-24 15:41:36.000000 +00:00          2          3
           303 2017-07-24 15:41:55.000000 +00:00          4          0
          1461 2017-07-24 15:42:48.000000 +00:00          1          7

Then, for your next step:

    select max(max_utc_timestamp) as max_utc_timestamp,
      sum(count_1) as sum_count_1,
      sum(count_2) as sum_count_2
    from (
      select max(utc_timestamp) as max_utc_timestamp,
        max(count_1) keep (dense_rank last order by utc_timestamp) as count_1,
        max(count_2) keep (dense_rank last order by utc_timestamp) as count_2
      from table_name
      where utc_timestamp > timestamp '2017-07-24 16:40:00 Europe/London' -- current_timestamp - interval '5' minute
      and utc_timestamp <= timestamp '2017-07-24 16:45:00 Europe/London' -- current_timestamp
      group by id_1
    );

    MAX_UTC_TIMESTAMP                 SUM_COUNT_1 SUM_COUNT_2
    --------------------------------- ----------- -----------
    2017-07-24 15:42:48.000000 +00:00           7          10
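For comparison, another common way to pick the latest row per id_1 is the analytic row_number() function. This is a sketch of an alternative technique, not part of the approach shown above; the alias rn is made up, and the fixed timestamp range is reused from the earlier examples. It reads the table once and returns whole rows, so you do not need one keep expression per column.

```sql
-- Number the rows of each id_1 from newest to oldest, then keep the newest.
select id_1, utc_timestamp, id_2, server_name, id_3, count_1, count_2
from (
  select t.*,
    row_number() over (partition by id_1 order by utc_timestamp desc) as rn
  from table_name t
  where utc_timestamp > timestamp '2017-07-24 16:40:00 Europe/London'
  and utc_timestamp <= timestamp '2017-07-24 16:45:00 Europe/London'
)
where rn = 1
order by id_1;
```

The two approaches differ when several rows share the same id_1 and the same latest timestamp: row_number() returns one of those rows, chosen arbitrarily unless you add a tie-breaker to the order by, while the keep version takes the max() of each column across the tied rows and so can combine values from different rows. The sample data has no such ties, so both should return the same three rows for it.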