ClickHouse 由MergeTree转换为ReplicatedMergeTree(副本集) 测试环境准备 创建测试数据库
1 2 3 create database test ; use test ;
创建测试表
1 2 3 4 5 6 7 8 9 CREATE TABLE foo ( `A` Int64, `D` Date, `S` String ) ENGINE = MergeTree PARTITION BY toYYYYMM(D) ORDER BY A
插入测试数据
1 2 3 4 5 6 7 8 9 10 11 INSERT INTO foo SELECT number, today(), '' FROM numbers(100); INSERT INTO foo SELECT number, today() - 60, '' FROM numbers(100)
确认测试数据插入OK
1 2 3 4 5 6 7 8 9 10 11 12 localhost :) select count() from foo; SELECT count() FROM foo Query id: 05e4bb12-e307-4613-8e29-5721edbfa1a8 ┌─count()─┐ │ 200 │ └─────────┘ 1 rows in set . Elapsed: 0.005 sec.
方案 方案一:创建复制表后,直接将原表数据插入,然后再重命名表名
适合小数据量
需保证存储可用,此方案保留”旧”数据
创建复制表
1 2 3 4 5 6 7 8 9 CREATE TABLE foo_replicated ( `A` Int64, `D` Date, `S` String ) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/{database}/{table}' , '{replica}' ) PARTITION BY toYYYYMM(D) ORDER BY A
将原表中的数据插入新表
1 2 INSERT INTO foo_replicated SELECT * FROM foo
分别在复制集节点确认数据
1 2 3 4 5 6 7 8 9 10 11 12 localhost :) select count() from foo_replicated; SELECT count() FROM foo_replicated Query id: 1467d6e7-8e42-4bdd-9fbc-ad2d54f6d39b ┌─count()─┐ │ 200 │ └─────────┘ 1 rows in set . Elapsed: 0.005 sec.
按需重命名表名
1 2 3 RENAME TABLE foo TO foo_old; RENAME TABLE foo_replicated TO foo;
show create foo 可以看到重命名并不会更改ZK中的配置,新表复用原表名的话,建议一开始就规划好
1 ENGINE = ReplicatedMergeTree('/ clickhouse / tables / {shard }/ test / foo_replicated ', '{replica }')
方案二:将原表数据目录中的数据”拷贝或移动”到新表的数据目录”detached”中,然后执行”ALTER TABLE ATTACH PARTITION”
建好复制集表后,拷贝数据
为了保持数据一致性,可以先将数据DETACH(或应用层面暂停数据变更操作)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 SELECT database, table, partition FROM system.parts WHERE table = 'foo' ; ALTER TABLE foo DETACH PARTITION '202212' ALTER TABLE foo DETACH PARTITION '202302'
拷贝数据
1 2 3 4 5 cd /var/lib/clickhouse/data/test cp -r foo/detached/* foo_replicated/detached/ chown -R clickhouse:clickhouse foo_replicated/detached/*
复制集表ATTACH数据
1 2 3 4 5 ALTER TABLE foo_replicated ATTACH PARTITION '202212' ; ALTER TABLE foo_replicated ATTACH PARTITION '202302'
校验数据
1 2 3 4 5 6 7 8 9 10 11 12 localhost :) select count() from foo_replicated SELECT count() FROM foo_replicated Query id: aa1ceb41-42bd-45fa-b891-48efcb430578 ┌─count()─┐ │ 200 │ └─────────┘ 1 rows in set . Elapsed: 0.010 sec.
PARTITION 过多时操作起来很不优雅,并且容易遗漏
提前生成DETACH | ATTACH 语句
DETACH
1 2 3 4 5 6 7 8 9 10 11 12 SELECT DISTINCT concat('ALTER TABLE foo DETACH PARTITION \' ', partition_id, ' \';' ) FROM system.parts WHERE (table = 'foo' ) AND active Query id: ae34e9b9-e278-4af1-a95c-50d21de7161e ┌─concat('ALTER TABLE foo DETACH PARTITION \' ', partition_id, ' \';' )─┐ │ ALTER TABLE foo DETACH PARTITION '202212' ; │ │ ALTER TABLE foo DETACH PARTITION '202302' ; │ └────────────────────────────────────────────────────────────────────┘ 2 rows in set . Elapsed: 0.013 sec.
ATTACH
1 2 3 4 5 6 7 8 9 10 SELECT DISTINCT concat('ALTER TABLE foo_replicated ATTACH PARTITION \' ', partition_id, ' \';' ) FROM system.parts WHERE (table = 'foo' ) AND active Query id: e380689b-4532-430e-904c-c68eba49ce94 ┌─concat('ALTER TABLE foo_replicated ATTACH PARTITION \' ', partition_id, ' \';' )─┐ │ ALTER TABLE foo_replicated ATTACH PARTITION '202212' ; │ │ ALTER TABLE foo_replicated ATTACH PARTITION '202302' ; │ └───────────────────────────────────────────────────────────────────────────────┘
方案三:最便捷、高效的方式ALTER TABLE table ATTACH PARTITION ID partition_id FROM table 生成相关命令
1 2 3 4 5 6 7 8 9 10 SELECT DISTINCT concat('ALTER TABLE foo_replicated ATTACH PARTITION ID \' ', partition_id, ' \' FROM foo;' ) FROM system.parts WHERE (table = 'foo' ) AND active Query id: ff5bf6c5-cd96-4e24-9069-d58b0691928b ┌─concat('ALTER TABLE foo_replicated ATTACH PARTITION ID \' ', partition_id, ' \' FROM foo;' )─┐ │ ALTER TABLE foo_replicated ATTACH PARTITION ID '202212' FROM foo; │ │ ALTER TABLE foo_replicated ATTACH PARTITION ID '202302' FROM foo; │ └───────────────────────────────────────────────────────────────────────────────────────────┘
执行
1 2 ALTER TABLE foo_replicated ATTACH PARTITION ID '202212' FROM foo; ... ...
之所以高效是因为其基于文件系统的硬链接(不额外占用存储),可以对比下Links
1 2 3 4 root@localhost:/var/lib/clickhouse/data/test /foo/202212_2_2_0 File: data.bin Size: 503 Blocks: 8 IO Block: 4096 regular file Device: fd01h/64769d Inode: 1731864 Links: 1
1 2 3 4 root@localhost:/var/lib/clickhouse/data/test /foo/202212_2_2_0 File: data.bin Size: 503 Blocks: 8 IO Block: 4096 regular file Device: fd01h/64769d Inode: 1731864 Links: 2