How to Dedupe a table in MySQL

A quick and dirty way of stripping duplicate records out of a MySQL table!, if your table has no indexes or constraints:

Assuming the name of the offending table is customers:

CREATE TABLE customer_dedupe AS SELECT DISTINCT * FROM customers;
RENAME TABLE customers TO customers_dupe;
RENAME TABLE customers_dedupe TO customers;

Done!

But what if your original table had indexes?

You need to manually recreate those indexes based on those of the original table of duplicates, before you rename and drop tables. Have a look at the indexes on the original table (no renamed):

mysql> SHOW INDEXES FROM customers_dupe;
| Table     | Non_unique | Key_name           | Seq_in_index | Column_name      | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
| customers |          1 | i_primary_category |            1 | primary_category | A         |         287 |      255 | NULL   | YES  | BTREE      |         |
| customers |          1 | i_product_id       |            1 | product_id       | A         |      700391 |     NULL | NULL   |      | BTREE      |         |
| customers |          1 | title              |            1 | title            | NULL      |      140078 |     NULL | NULL   |      | FULLTEXT   |         |
| customers |          1 | brand              |            1 | brand            | NULL      |        7146 |     NULL | NULL   | YES  | FULLTEXT   |         |

Create the same indexes on the new deduped table, for example:

CREATE fulltext INDEX ix_data_brand ON customer(brand);
CREATE fulltext INDEX ix_data_title ON customer(title);
CREATE INDEX ix_data_id ON customer(product_id);
CREATE INDEX ix_data_pri_cat ON customer(primary_category(255));

Confirm that the indexes are same on the original and th dedupd table with another command:

SHOW INDEXES FROM customers;

 

You can now go ahead and drop the table of duplicated, e.g.:

DROP TABLE customers_dupe;