やったこと
テストデータで、1000万件のデータを用意する必要があったので、
MySQL だけで完結できるようにやってみます。
確認環境
$ mysql --version
mysql Ver 14.14 Distrib 5.6.25, for Linux (x86_64) using EditLine wrapper
調査
テーブル準備
CREATE TABLE IF NOT EXISTS `bulk_test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`text_data` varchar(20) NOT NULL,
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
;
INSERT 実行
1件目
INSERT INTO bulk_test values (null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW());
これを欲しい件数まで繰り返します。
INSERT bulk_test (id, text_data, created_at, updated_at)
SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
1 -> 2 -> 4 -> 8 … とどんどん投入データが増えていきます。
実行結果
mysql> INSERT bulk_test (id, text_data, created_at, updated_at)
-> SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 1 row affected (0.01 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 2 rows affected (0.00 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 8 rows affected (0.00 sec)
Records: 8 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 16 rows affected (0.00 sec)
Records: 16 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 32 rows affected (0.00 sec)
Records: 32 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 64 rows affected (0.00 sec)
Records: 64 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 128 rows affected (0.00 sec)
Records: 128 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 256 rows affected (0.02 sec)
Records: 256 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 512 rows affected (0.00 sec)
Records: 512 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 1024 rows affected (0.01 sec)
Records: 1024 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 2048 rows affected (0.01 sec)
Records: 2048 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 4096 rows affected (0.03 sec)
Records: 4096 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 8192 rows affected (0.10 sec)
Records: 8192 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 16384 rows affected (0.06 sec)
Records: 16384 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 32768 rows affected (0.12 sec)
Records: 32768 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 65536 rows affected (0.22 sec)
Records: 65536 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 131072 rows affected (0.45 sec)
Records: 131072 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 262144 rows affected (1.09 sec)
Records: 262144 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 524288 rows affected (2.05 sec)
Records: 524288 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 1048576 rows affected (4.62 sec)
Records: 1048576 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 2097152 rows affected (9.64 sec)
Records: 2097152 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 4194304 rows affected (22.91 sec)
Records: 4194304 Duplicates: 0 Warnings: 0
mysql> INSERT bulk_test (id, text_data, created_at, updated_at) SELECT null, CONCAT('dummy text', CEIL(RAND() * 100)), NOW(), NOW() FROM bulk_test;
Query OK, 8388608 rows affected (44.87 sec)
Records: 8388608 Duplicates: 0 Warnings: 0
mysql> select count(*) from bulk_test;
+----------+
| count(*) |
+----------+
| 16777216 |
+----------+
1 row in set (6.51 sec)
16777216 件のレコードが出来上がりました。
unique キー制約、外部キー制約があると、もう少し複雑になるので、
ruby, python などでプログラム書いた方が速いかもしれません。