...file to work in ClickHouse with the Kafka engine? There is a message with another message defined inside it, and yet another one inside that, so the structure looks roughly like this:
message X {
  message Y {
    message Z {}
  }
}
For the mapping I need type Z, so I specify it like this: kafka_schema = 'file.proto:X_Y_Z'
In response I get an error saying there is no such message in this proto file.
In order to return a response in Protobuf format, we first need to import the proto files into ClickHouse. It can be done like this:

cd /var/lib/clickhouse/format_schemas/
wget https://raw.githubusercontent.com/open-telemetry/opentelemetry-proto/main/opentelemetry/proto/trace/v1/trace.proto
wget https://raw.githubusercontent.com/open-telemetry/opentelemetry-proto/main/opentelemetry/proto/common/v1/common.proto
wget https://raw.githubusercontent.com/open-telemetry/opentelemetry-proto/main/opentelemetry/proto/resource/v1/resource.proto

Let's connect to ClickHouse and try to use this proto file:

SELECT 1 FORMAT ProtobufSingle SETTINGS format_schema = '/var/lib/clickhouse/format_schemas/trace.proto:TracesData'

Query id: 9755e685-5f3b-4b76-b794-e690b86a28c3

Ok.

Error on processing query: Code: 434. DB::Exception: Code: 434. DB::Exception: Cannot parse 'trace.proto' file, found an error at line -1, column 0, File recursively imports itself: trace.proto -> trace.proto. (CANNOT_PARSE_PROTOBUF_SCHEMA) (version 22.12.1.1115 (official build)). (CANNOT_PARSE_PROTOBUF_SCHEMA) (version 22.12.1.1115 (official build))

It gives us a rather cryptic error message about recursion. What it really means is that we need to fix the import paths in our .proto files (in trace.proto and resource.proto):

import "opentelemetry/proto/common/v1/common.proto"; -> import "common.proto";
import "opentelemetry/proto/resource/v1/resource.proto"; -> import "resource.proto";

Let's test our query again:

SELECT 1 FORMAT ProtobufSingle SETTINGS format_schema = '/var/lib/clickhouse/format_schemas/trace.proto:TracesData'

Query id: 0a920df8-8bbd-4525-95cc-f5906f52b678

Ok.

Error on processing query: Code: 443. DB::Exception: Code: 443. DB::Exception: Not found matches between the names of the columns {1} and the fields {resource_spans} of the message 'opentelemetry.proto.trace.v1.TracesData' in the protobuf schema. (NO_COLUMNS_SERIALIZED_TO_PROTOBUF_FIELDS) (version 22.12.1.1115 (official build)). (NO_COLUMNS_SERIALIZED_TO_PROTOBUF_FIELDS) (version 22.12.1.1115 (official build))

We see a different error, which means that ClickHouse successfully parsed our proto file.

Nested structures

What we need to do next is build a similar nested structure using Tuples and Arrays. We will simplify our example quite a bit, but the idea is the same even for more complex objects (or messages). Let's take a look at this proto definition:

message KeyValue {
  string key = 1;
  AnyValue value = 2;
}
message Resource {
  repeated KeyValue attributes = 1;
  ...
}
message ResourceSpans {
  Resource resource = 1;
  ...
}

Or, if we inline it into a single nested message (illustrative notation, not valid proto syntax):

message ResourceSpans {
  message Resource resource = 1 {
    repeated message KeyValue attributes = 1 {
      string key = 1;
      message AnyValue value = 2 {
        oneof value {
          string string_value = 1;
          ...
        };
      };
    };
    ...
  };
  ...
}

Our task is to turn each message into a Tuple and each repeated field into an Array. For example, a repeated message becomes Array(Tuple(..)).

So the proto example can be represented as:

Tuple(
  resource Tuple (
    attributes Array (
      Tuple (
        key String,
        value Tuple (
          string_value String
        )
      )
    )
  )
)

Let's turn it into a query:

SELECT CAST(tuple(tuple([('key', tuple('value'))])), 'Tuple(resource Tuple (attributes Array (Tuple (key String, value Tuple (string_value String)))))') AS resource_spans
FORMAT ProtobufSingle
SETTINGS format_schema = '/var/lib/clickhouse/format_schemas/trace.proto:TracesData'

key value

1 row in set. Elapsed: 0.001 sec.

It works. We only need to keep adding new fields to the tuple to match the proto specification.
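Writing these Tuple/Array type strings by hand gets error-prone as more fields are added. Below is a minimal Python sketch of the message-to-Tuple, repeated-to-Array rule described above. The helper name to_clickhouse_type and the dict/list way of describing a message are assumptions made for illustration; they are not part of ClickHouse or of the protobuf tooling.

```python
def to_clickhouse_type(schema):
    """Build a ClickHouse type string from a nested description.

    Hypothetical convention used only in this sketch:
      str  -> scalar ClickHouse type name, used as-is
      list -> repeated field, must hold exactly one element description
      dict -> message, maps field name -> field description
    """
    if isinstance(schema, str):
        return schema
    if isinstance(schema, list):
        if len(schema) != 1:
            raise ValueError("a repeated field needs exactly one element description")
        return f"Array({to_clickhouse_type(schema[0])})"
    if isinstance(schema, dict):
        fields = ", ".join(
            f"{name} {to_clickhouse_type(sub)}" for name, sub in schema.items()
        )
        return f"Tuple({fields})"
    raise TypeError(f"unsupported schema node: {schema!r}")


# The simplified ResourceSpans message from the example above.
resource_spans = {
    "resource": {
        "attributes": [
            {"key": "String", "value": {"string_value": "String"}}
        ]
    }
}

resource_spans_type = to_clickhouse_type(resource_spans)
print(resource_spans_type)
```

The resulting string can be pasted as the second argument of CAST in the query; adding another proto field then only means adding one more key to the dict.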
WITH (
    reinterpretAsFixedString(toUUID(trace_id)) AS traceId,
    reinterpretAsFixedString(parent_span_id) AS parentSpanId,
    reinterpretAsFixedString(span_id) AS spanId,
    operation_name AS name,
    start_time_us * 1000 AS startTimeUnixNano,
    finish_time_us * 1000 AS endTimeUnixNano,
    CAST(mapApply((k, v) -> (k, tuple(v)), attribute), 'Array(Tuple(String, Tuple(String)))') AS attributes
) AS span
SELECT CAST(
    [(tuple([('service.name', tuple('clickhouse')), ('hostname', tuple(hostName()))]), [tuple(groupArray(span))])],
    'Array(Tuple(resource Tuple(attributes Array(Tuple(key String, value Tuple(string_value String)))), scope_spans Array(Tuple(spans Array(Tuple(trace_id FixedString(16), parent_span_id FixedString(8), span_id FixedString(8), name String, start_time_unix_nano UInt64, end_time_unix_nano UInt64, attributes Array(Tuple(key String, value Tuple(string_value String)))))))))'
) AS resource_spans
FROM system.opentelemetry_span_log
WHERE trace_id = reinterpretAsUUID(base64Decode({trace_id:String}))
FORMAT ProtobufSingle
SETTINGS format_schema = 'trace.proto:TracesData'
Take a look at this example with nested messages.
Thanks, I'll take a look :)
I tried to repeat this sequence of steps on 22.8.9.24 and can't get past the first error. This is the command I run: SELECT 1 FORMAT ProtobufSingle SETTINGS format_schema = 'trace.proto:TracesData'; Maybe there is something else I need to do?
You need to edit the proto files so that the imports work.
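For reference, the import fix from the article (replacing each import path with just the file name, so that all schemas resolve from one flat directory) could be scripted roughly like this. This is only a sketch: the function name flatten_imports and the regular expression are assumptions for illustration, and the sample text is a shortened stand-in for the real trace.proto.

```python
import re
from pathlib import PurePosixPath

# Matches lines such as:  import "opentelemetry/proto/common/v1/common.proto";
# (optionally with the "public" or "weak" modifier).
IMPORT_RE = re.compile(
    r'^(\s*import\s+(?:public\s+|weak\s+)?)"([^"]+)"(\s*;)', re.MULTILINE
)


def flatten_imports(proto_text: str) -> str:
    """Rewrite every import path to its bare file name."""
    def replace(match: re.Match) -> str:
        file_name = PurePosixPath(match.group(2)).name
        return f'{match.group(1)}"{file_name}"{match.group(3)}'

    return IMPORT_RE.sub(replace, proto_text)


# Shortened stand-in for the header of trace.proto.
sample = (
    'syntax = "proto3";\n'
    'import "opentelemetry/proto/common/v1/common.proto";\n'
    'import "opentelemetry/proto/resource/v1/resource.proto";\n'
)

fixed = flatten_imports(sample)
print(fixed)
```

To apply it to real files, read each .proto file in the format_schemas directory, pass its text through flatten_imports, and write the result back. Keep a backup copy first, since the rewrite is done in place.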
I fixed them as described. Or is something needed besides the import changes? In fact, I also made my own proto file and still get the same error, even with a simple proto file that has a single field.