在清洗数据的时候,经常会遇到去重的需求,特地在这里记录一下。

查询重复的数据

select * , row_number() over (partition by 字段 order by  字段) as rowid from 表名
where  字段 in (select  字段 from 表名
group by  字段 having count(字段)>1)

删除重复的数据,保留一条

delete t from (select * , row_number() over (partition by 字段 order by  字段) as rowid from 表名
where  字段 in (select  字段 from 表名
group by  字段 having count(字段)>1)) t where t.rowid != 1