问题描述
- MySQL 删除重复数据(两个字段)
-
表full中userID和movieID两个字段确定一条唯一的数据。
现在表中有少量重复数据和大量非重复数据。
重复数据中,preference列和timestamp列不为空的这些,需要保留;preference列和timestamp列为空的这些需要删除,应该怎么办?
初学者,希望答案详细:) 感激不尽
+------------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------+---------+------+-----+---------+-------+
| userID | int(11) | NO | | 0 | |
| movieID | int(11) | NO | | 0 | |
| preference | int(11) | YES | | NULL | |
| timestamp | int(11) | YES | | NULL | |
+------------+---------+------+-----+---------+-------+
解决方案
先建一个表,用distinct查询不重复的数据放到新建的表里面,然后把原表数据清空,再把刚才新表的数据插到原表里
然后把数据导入到新表中:
INSERT INTO table2 SELECT distinct * FROM table1;
你试一下,看可以不?
解决方案二:
step 1: 复制table为table_copy
step 2: delete from table where UserID is NULL and MovieID is NULL and (UserID,MovieID) in (select t1.UserID,t1.MovieID from table_copy as t1,table_copy as t2 where t1.UserID = t2.UserID and t1.MovieID = t2.MovieID and not (t1.preference = t2.preference and t1.timestamp = t2.timestamp));
时间: 2024-11-08 19:51:12