Hi colleagues. I need help. We have a cluster of 1 shard

Question

Hi colleagues. I need help. We have a cluster of 1 shard with 2 nodes. Five days ago the 1st node started lagging behind the 2nd, and the gap is now about 570 million rows. Data can no longer be inserted into the 2nd node because of this error: Code: 252. DB::Exception: Too many parts (2011). Merges are processing significantly slower than inserts. (TOO_MANY_PARTS) (version 22.6.3.35 (official build)). Where should I dig, and what should I do? The servers are bare metal, and the data sits on a RAID 5 array of 3 NVMe cards.
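
As a starting point for the TOO_MANY_PARTS error, the queries below are a rough sketch of how one might check which partitions hold the most active parts and what the insert thresholds are set to. The names xxx and yyy are the anonymized database and table that appear later in this thread and are only placeholders here; system.parts and system.merge_tree_settings are standard ClickHouse system tables.

```sql
-- Active parts per partition. This form of the error is normally raised
-- when one partition holds more active parts than parts_to_throw_insert.
SELECT partition, count() AS active_parts
FROM system.parts
WHERE active AND database = 'xxx' AND table = 'yyy'
GROUP BY partition
ORDER BY active_parts DESC
LIMIT 10;

-- Server-wide values of the thresholds. A table can override them in its
-- own SETTINGS clause, which this query does not show.
SELECT name, value, changed
FROM system.merge_tree_settings
WHERE name IN ('parts_to_delay_insert', 'parts_to_throw_insert');
```

Running the first query on both replicas would show whether the part count is concentrated in the current month's partition, as the part names in the error context suggest.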

#backend #clickhouse #database #devops #programming #russian

10.05.2023

16 answers

53 views

Игельшнойцхен (question author)

Denny [Altinity]
select * from system.replication_queue there is a ...

That field is empty. The table has more than 3 million rows. There is a postpone_reason field, and it contains the following: Not executing fetch of part 202305_16536780_16536780_0 because 8 fetches already executing, max 8.
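
The postpone reason says the replica is already running as many fetches as it is allowed to. A hedged way to look at this from the server side is sketched below. It assumes that the "max 8" in the message corresponds to the background_fetches_pool_size setting and that this setting is visible in system.settings on this version; neither point is confirmed in the thread.

```sql
-- Fetches that are running right now on this replica.
SELECT database, table, result_part_name, elapsed, progress, source_replica_hostname
FROM system.replicated_fetches;

-- Number of busy tasks in the background fetch pool.
SELECT metric, value
FROM system.metrics
WHERE metric = 'BackgroundFetchesPoolTask';

-- Assumed source of the "max 8" limit (an assumption, see above).
SELECT name, value, changed
FROM system.settings
WHERE name = 'background_fetches_pool_size';
```

If the fetch list stays full while progress barely moves, the bottleneck is more likely network or disk throughput than the pool size itself.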

10.05.2023

Denny [Altinity]

Run SYSTEM RESTART REPLICA <your table>, then immediately run this query:

SELECT
    database,
    table,
    type,
    max(last_exception),
    max(postpone_reason),
    min(create_time),
    max(last_attempt_time),
    max(last_postpone_time),
    max(num_postponed) AS max_postponed,
    max(num_tries) AS max_tries,
    min(num_tries) AS min_tries,
    countIf(last_exception != '') AS count_err,
    countIf(num_postponed > 0) AS count_postponed,
    countIf(is_currently_executing) AS count_executing,
    count() AS count_all
FROM system.replication_queue
GROUP BY database, table, type
ORDER BY count_all DESC

10.05.2023

Игельшнойцхен (question author)

Denny [Altinity]
run system restart replica <your table>, then...

How long does SYSTEM RESTART REPLICA take on average?

10.05.2023

Denny [Altinity]

Игельшнойцхен
How long, on average, does system restart r...

Anywhere from 0 seconds to an hour; on average probably about 30 minutes.

10.05.2023

Игельшнойцхен (question author)

Denny [Altinity]
Anywhere from 0 seconds to an hour; on average probably about 30 minutes.

Thanks. Meanwhile the err log is being flooded at a hellish rate with this: 2023.05.10 19:11:32.935842 [ 36658 ] {175C55DF8287C4D0} <Warning> ClusterProxy::SelectStreamFactory: Local replica of shard 1 is stale (delay: 1683735092s.)
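
One detail in that warning is worth decoding. The delay of 1683735092 seconds is about 53 years, and read as a Unix timestamp it is exactly the time of the log line (assuming the server runs at UTC+3, which the 1970-01-01 03:00:00 value in the queue output in this thread is consistent with). That suggests the delay was computed against a zero timestamp rather than a real measurement, but this interpretation is an assumption, not something confirmed here. The sketch below shows the conversion and a query against system.replicas for the replica's own view of its lag.

```sql
-- Interpret the reported delay as a Unix timestamp.
SELECT toDateTime(1683735092, 'UTC') AS delay_as_timestamp;
-- 2023-05-10 16:11:32, i.e. 19:11:32 at UTC+3, the time of the log line

-- Replica-side view of the lag and the queue.
SELECT
    database,
    table,
    is_readonly,
    absolute_delay,
    queue_size,
    inserts_in_queue,
    merges_in_queue,
    log_max_index - log_pointer AS log_entries_behind
FROM system.replicas
ORDER BY absolute_delay DESC;
```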

10.05.2023

Игельшнойцхен (question author)

Denny [Altinity]
run system restart replica <your table>, then...

Row 1:
──────
database:                xxx
table:                   yyy
type:                    GET_PART
max(last_exception):
max(postpone_reason):    Not executing log entry queue-0070763642 for part 202305_17587216_17587216_0 because it is covered by part 202305_17587197_17587216_2 that is currently executing.
min(create_time):        2023-05-05 20:12:17
max(last_attempt_time):  2023-05-10 19:16:54
max(last_postpone_time): 2023-05-10 19:16:54
max_postponed:           149
max_tries:               1
min_tries:               0
count_err:               0
count_postponed:         935095
count_executing:         1
count_all:               2932804

Row 2:
──────
database:                xxx
table:                   yyy
type:                    MERGE_PARTS
max(last_exception):
max(postpone_reason):    Not executing log entry queue-0070762391 of type MERGE_PARTS for part 202209_36619_36632_2 because part 202209_36619_36626_1 is not ready yet (log entry for that part is being processed).
min(create_time):        2023-05-06 06:31:07
max(last_attempt_time):  1970-01-01 03:00:00
max(last_postpone_time): 2023-05-10 19:16:54
max_postponed:           2
max_tries:               0
min_tries:               0
count_err:               0
count_postponed:         1
count_executing:         0
count_all:               183285
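
The GET_PART postpone reason relies on how MergeTree part names encode block ranges: partition, minimum block, maximum block, and merge level, joined by underscores. A part covers another when they share a partition and its block range contains the other's. The sketch below checks this for the two part names from the output; it assumes the partition ID itself contains no underscore, which holds for these names but not for every partitioning scheme.

```sql
WITH
    splitByChar('_', '202305_17587216_17587216_0') AS small,
    splitByChar('_', '202305_17587197_17587216_2') AS big
SELECT
    small[1] = big[1] AS same_partition,
    toUInt64(big[2]) <= toUInt64(small[2])
        AND toUInt64(small[3]) <= toUInt64(big[3]) AS block_range_contained,
    toUInt32(big[4]) >= toUInt32(small[4]) AS level_not_lower;
-- All three columns are 1 for these names: the single-block part is
-- inside the already merged range 17587197..17587216.
```

In other words, the replica skips fetching the small part because it is already fetching the merged part that contains it.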

10.05.2023

Denny [Altinity]

Игельшнойцхен
┌─database─┬─table──────────┬─type────────┬─max(la...

What does this query return?

select count()
from
(
    select zoo.p_path as part_zoo, zoo.ctime, zoo.mtime, disk.p_path as part_disk
    from
    (
        select concat(path,'/',name) as p_path, ctime, mtime
        from system.zookeeper
        where path in (select concat(replica_path,'/parts') from system.replicas)
    ) zoo
    left join
    (
        select concat(replica_path,'/parts/',name) as p_path
        from system.parts
        inner join system.replicas using (database, table)
    ) disk on zoo.p_path = disk.p_path
    where part_disk='' and zoo.mtime <= now() - interval 1 day
)

10.05.2023

Игельшнойцхен (question author)

Denny [Altinity]
What does this query return? select count() fro...

0 rows

10.05.2023

Denny [Altinity]

Игельшнойцхен
0 rows

Is that count = 0, or 0 rows?
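
The distinction matters because, with default settings, a count() without GROUP BY always returns exactly one row, even when the subquery is empty. So a genuinely empty result ("0 rows") would point to something else, such as the query not having run as written. A minimal illustration using only the built-in numbers table function:

```sql
-- The inner query matches nothing, yet the outer query still returns
-- one row containing the value 0.
SELECT count()
FROM
(
    SELECT number
    FROM numbers(10)
    WHERE number > 100
);
```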

10.05.2023