删除重复记录方法一:
1. 新建一个临时表
代码如下 | 复制代码 |
create table tmp as select * from youtable group by name(name为不希望有重复的列) |
2. 删除原来的表
代码如下 | 复制代码 |
drop table youtable |
3. 重命名表
代码如下 | 复制代码 |
alter table tmp rename youtable |
但是这个方法有个问题,由临时表转变过来的最终表,其表结构会和原来的不一致,需要手工更改。这个问题,待解决。
删除重复记录方法二:
1. 新建一个临时表
代码如下 | 复制代码 |
CREATE TABLE tmp AS SELECT * FROM youtable GROUP BY name(name为不希望有重复的列) |
2. 清空原来的表
代码如下 | 复制代码 |
TRUNCATE TABLE youtable |
3. 把临时表插入到youtable
代码如下 | 复制代码 |
INSERT INTO tablename SELECT * FROM temp |
4. 删除临时表
代码如下 | 复制代码 |
DROP TABLE temp |
删除重复记录方法三:
代码如下 | 复制代码 |
delete table where ID not in(select min(ID) from table group by name(name:重复的字段)) |
删除重复记录方法四:
具体实现如下:
代码如下 | 复制代码 |
Table Create Table ------------ -------------------------------------------------------- users_groups CREATE TABLE `users_groups` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `uid` int(11) NOT NULL, `gid` int(11) NOT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=15 DEFAULT CHARSET=utf8 users_groups.txt内容: 1,11,502 mysql> load data infile 'c:\users_groups.txt' into table users_groups fields mysql> select * from users_groups; query result(14 records) id uid gid |
根据一位兄弟的建议修改。
代码如下 | 复制代码 |
mysql> create temporary table tmp_wrap select * from users_groups group by uid having count(1) >= 1; Query OK, 7 rows affected (0.11 sec) Records: 7 Duplicates: 0 Warnings: 0 mysql> truncate table users_groups; mysql> insert into users_groups select * from tmp_wrap; mysql> select * from users_groups; query result(7 records) mysql> drop table tmp_wrap; |
2、还有一个很精简的办法。
查找重复的,并且除掉最小的那个。
代码如下 | 复制代码 |
delete users_groups as a from users_groups as a, ( select *,min(id) from users_groups group by uid having count(1) > 1 ) as b where a.uid = b.uid and a.id > b.id; (7 row(s)affected) (0 ms taken) query result(7 records) id uid gid 1 11 502 2 107 502 3 100 503 4 110 501 5 112 501 6 104 502 9 102 501 |
3、现在来看一下这两个办法的效率。
运行一下以下SQL 语句
代码如下 | 复制代码 |
create index f_uid on users_groups(uid); explain select * from users_groups group by uid having count(1) > 1 union all select * from users_groups group by uid having count(1) = 1; explain select * from users_groups as a, ( select *,min(id) from users_groups group by uid having count(1) > 1 ) as b where a.uid = b.uid and a.id > b.id; query result(3 records) id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY users_groups index (NULL) f_uid 4 (NULL) 14 2 UNION users_groups index (NULL) f_uid 4 (NULL) 14 (NULL) UNION RESULT <union1,2> ALL (NULL) (NULL) (NULL) (NULL) (NULL) |
很明显的第二个比第一个扫描的函数要少。
当没有创建表或创建索引权限的时候
创建一个新表,然后将原表中不重复的数据插入新表:
代码如下 | 复制代码 |
mysql> create table demo_new as select * from demo group by site; Query OK, 3 rows affected (0.19 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql> show tables; +----------------+ | Tables_in_test | +----------------+ | demo | | demo_new | +----------------+ 2 rows in set (0.00 sec) mysql> select * from demo order by id; +----+------------------------+ | id | site | +----+------------------------+ | 1 | http://www.111cn.net | | 2 | http://111cn.net | | 3 | http://www.111cn.net | | 4 | http://www.111cn.net | | 5 | http://www.111cn.net | +----+------------------------+ 5 rows in set (0.00 sec) mysql> select * from demo_new order by id; +----+------------------------+ | id | site | +----+------------------------+ | 1 | http://www.111cn.net | | 2 | http://111cn.net | | 3 | http://www.111cn.net | +----+------------------------+ 3 rows in set (0.00 sec) |
然后将原表备份,将新表重命名为当前表:
代码如下 | 复制代码 |
mysql> rename table demo to demo_old, demo_new to demo; Query OK, 0 rows affected (0.04 sec) mysql> show tables; +----------------+ | Tables_in_test | +----------------+ | demo | | demo_old | +----------------+ 2 rows in set (0.00 sec) mysql> select * from demo order by id; +----+------------------------+ | id | site | +----+------------------------+ | 1 | http://www.111cn.net | | 2 | http://111cn.net | | 3 | http://www.111cn.net | +----+------------------------+ 3 rows in set (0.00 sec) |