This repository was archived by the owner on Jun 22, 2025. It is now read-only.

[Q] How to BulkLoad FixedString #637

@Stan-RED

Question

In ClickHouse, FixedString can act as a fixed-size byte array (e.g. 16-byte values such as UUIDs or IPv6 addresses can be stored as FixedString(16)). How can I bulk-copy char(N) from another source into FixedString(N) in ClickHouse? When the input is char(N), I get an error:

Unhandled exception. ClickHouse.Client.Copy.ClickHouseBulkCopySerializationException: Error when serializing data
 ---> System.ArgumentException: The output byte buffer is too small to contain the encoded data, encoding codepage '65001' and fallback 'System.Text.EncoderReplacementFallback'. (Parameter 'bytes')
   at System.Text.Encoding.ThrowBytesOverflow()
   at System.Text.Encoding.ThrowBytesOverflow(EncoderNLS encoder, Boolean nothingEncoded)
   at System.Text.Encoding.GetBytesWithFallback(ReadOnlySpan`1 chars, Int32 originalCharsLength, Span`1 bytes, Int32 originalBytesLength, EncoderNLS encoder, Boolean throwForDestinationOverflow)
   at System.Text.Encoding.GetBytesWithFallback(Char* pOriginalChars, Int32 originalCharCount, Byte* pOriginalBytes, Int32 originalByteCount, Int32 charsConsumedSoFar, Int32 bytesWrittenSoFar, Boolean throwForDestinationOverflow)
   at System.Text.UTF8Encoding.GetBytes(String s, Int32 charIndex, Int32 charCount, Byte[] bytes, Int32 byteIndex)
   at ClickHouse.Client.Types.FixedStringType.Write(ExtendedBinaryWriter writer, Object value)
   at ClickHouse.Client.Copy.Serializer.RowBinarySerializer.Serialize(Object[] row, ClickHouseType[] types, ExtendedBinaryWriter writer)
   at ClickHouse.Client.Copy.Serializer.BatchSerializer.Serialize(Batch batch, Stream stream)

The binary data is also UTF-8 encoded on the way in, so it ends up corrupted and larger than N bytes.
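
To illustrate why the size grows, here is a minimal Python sketch (not ClickHouse.Client code). It assumes the char(N) value reaches the client as a text string with one character per source byte, modelled here with Latin-1; the actual mapping depends on the source database and driver. Every byte at or above 0x80 becomes two bytes once the string is UTF-8 encoded, which is enough to overflow an N-byte buffer.

```python
# Assumption: binary data stored in char(N) is surfaced as text, one character
# per source byte (Latin-1 is used here only as a stand-in for that mapping).
raw = bytes([0x00, 0x7F, 0x80, 0xFF])   # 4 bytes of binary data
as_text = raw.decode("latin-1")         # 4 characters
utf8 = as_text.encode("utf-8")          # what a UTF-8 writer would emit

print(len(raw), len(utf8))              # 4 6
print(utf8 == raw)                      # False: bytes >= 0x80 were expanded
```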

When the input is byte[N], e.g. binary(12) (a SQL Server data type), the literal string "System.Byte[]" is inserted instead. Perhaps byte-array input could be copied to ClickHouse as a raw binary string, or an option could be added to skip UTF-8 encoding.
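
As a sketch of what the requested behaviour amounts to at the byte level: in ClickHouse's RowBinary format, a FixedString(N) value is written as exactly N raw bytes with no length prefix. The Python helper below is hypothetical (the name `encode_fixed_string` is not part of ClickHouse.Client or any ClickHouse library) and only shows the idea of passing bytes through untouched and padding short values with null bytes.

```python
def encode_fixed_string(value: bytes, n: int) -> bytes:
    """Hypothetical helper: return exactly n raw bytes for a FixedString(n) value."""
    if len(value) > n:
        raise ValueError(
            f"value has {len(value)} bytes, FixedString({n}) holds at most {n}"
        )
    # No text encoding step: the bytes are kept as-is and padded with 0x00.
    return value.ljust(n, b"\x00")


payload = bytes(range(0xF4, 0x100))          # 12 bytes, like a binary(12) value
encoded = encode_fixed_string(payload, 16)   # padded for a FixedString(16) column
print(len(encoded))                          # 16
```

Rejecting oversized values instead of truncating them is a design choice made for this sketch, so that data loss is never silent.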
