Automatically infer schema of a stream from Avro schema #912

Open
zliang-min opened this issue Feb 22, 2025 · 1 comment

Comments

@zliang-min
Collaborator

Currently, we can attach an Avro schema to an external stream so that the stream uses that schema to read and write data, like this:

CREATE FORMAT SCHEMA my_avro_schema AS
$$
{
  "type": "record",
  "name": "Data",
  "fields" : [
    {"name": "a", "type": "int"},
    {"name": "b", "type": "string"},
    {"name": "c", "type": "float"}
  ]
}
$$
TYPE = Avro;

CREATE EXTERNAL STREAM example (
a int,
b string,
c float32
) SETTINGS type = 'kafka', brokers = 'some-broker:9092', topic = 'my-topic', data_format = 'Avro', data_schema = 'my_avro_schema';

This works well, except that we have to define the schema twice: once in the format schema and once in the external stream. This is cumbersome and error-prone, especially when the schema is complex.

What we want to achieve is that, when an external stream is attached to an Avro schema, it automatically infers the stream's schema from the Avro schema. The above example would then become:

CREATE EXTERNAL STREAM example SETTINGS type = 'kafka', brokers = 'some-broker:9092', topic = 'my-topic', data_format = 'Avro', data_schema = 'my_avro_schema';

And then, users can run DESC example to see the stream's schema which was inferred from the Avro schema.
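To make the intended behavior concrete, here is a minimal Python sketch of the inference step, not the actual implementation. The function name `infer_columns` and the `AVRO_TO_COLUMN_TYPE` table are hypothetical; the only type pairs included are the three that appear in the example above (`int` → `int`, `string` → `string`, `float` → `float32`), and anything else (other primitives, unions, nested records, logical types) is deliberately rejected rather than guessed.

```python
import json

# Avro primitive -> stream column type.
# Assumption: only the three pairs shown in the example in this issue are
# listed; a real implementation would need to cover the rest of Avro.
AVRO_TO_COLUMN_TYPE = {
    "int": "int",
    "string": "string",
    "float": "float32",
}


def infer_columns(avro_schema_json):
    """Return [(column_name, column_type), ...] for a flat Avro record schema."""
    schema = json.loads(avro_schema_json)
    if schema.get("type") != "record":
        raise ValueError("this sketch only handles a top-level Avro record")

    columns = []
    for field in schema["fields"]:
        avro_type = field["type"]
        # Complex types (unions, nested records, arrays) arrive as lists or
        # dicts; this sketch refuses them instead of guessing a mapping.
        if not isinstance(avro_type, str) or avro_type not in AVRO_TO_COLUMN_TYPE:
            raise ValueError(
                f"unsupported Avro type for field {field['name']!r}: {avro_type!r}"
            )
        columns.append((field["name"], AVRO_TO_COLUMN_TYPE[avro_type]))
    return columns


schema_text = """
{
  "type": "record",
  "name": "Data",
  "fields": [
    {"name": "a", "type": "int"},
    {"name": "b", "type": "string"},
    {"name": "c", "type": "float"}
  ]
}
"""

for name, column_type in infer_columns(schema_text):
    print(name, column_type)
# prints:
# a int
# b string
# c float32
```

Failing loudly on unmapped types is a design choice for the sketch: a silently wrong column type would reintroduce the kind of mismatch this issue is trying to eliminate.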

This is similar to how the ClickHouse and Timeplus external tables generate their schemas automatically, without the user specifying them explicitly.

@KathrynLin

I have submitted a PR to fix this issue: #913.
Please review and let me know if any further modifications are needed. Thanks!
