Commit f7f1bcc

Add CBOR ignoreUnknownKeys option (#947)
Closes #935

How it works: When an unknown element is encountered, decoder.skipElement() is invoked until a known element is encountered. Skipping of an element begins at the first byte of the element's value. Bytes are processed to determine the element type (and corresponding length), and ultimately how many bytes can be skipped.

The general process is demonstrated by the following pseudo-code: lengthStack represents a stack of the "number of elements" at each depth.
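The pseudo-code from the commit message is not reproduced on this page. As a rough illustration only, and not the library's actual decoder, the skipping process described above could be sketched as follows. It assumes standard CBOR headers (major type in the top 3 bits of the initial byte, additional information in the low 5 bits). The names `skipItem`, `completeItem`, `hexToBytes`, and `INDEFINITE` are hypothetical; only `lengthStack` comes from the description above.

```kotlin
// Hypothetical marker for an indefinite-length container (closed by a 0xFF "break" byte).
val INDEFINITE = -1L

// Hypothetical helper: converts a hex string such as "bf64..." into bytes.
fun hexToBytes(hex: String): ByteArray =
    hex.chunked(2).map { it.toInt(16).toByte() }.toByteArray()

// Marks one data item as finished and closes every definite-length
// container that this item completes.
fun completeItem(lengthStack: MutableList<Long>) {
    while (lengthStack.isNotEmpty()) {
        val remaining = lengthStack.last()
        if (remaining == INDEFINITE) return // only a break byte closes this level
        lengthStack.removeAt(lengthStack.lastIndex)
        if (remaining > 1) {
            lengthStack.add(remaining - 1)
            return
        }
        // remaining == 1: the container is done and counts as one item of its parent
    }
}

// Returns the index just past the CBOR data item that starts at `start`.
fun skipItem(bytes: ByteArray, start: Int): Int {
    var pos = start
    // Number of items still to skip at each nesting depth; one item at depth 0.
    val lengthStack = mutableListOf(1L)
    while (lengthStack.isNotEmpty()) {
        val initial = bytes[pos++].toInt() and 0xFF
        if (initial == 0xFF) { // break: ends the innermost indefinite-length container
            check(lengthStack.last() == INDEFINITE) { "Unexpected break byte" }
            lengthStack.removeAt(lengthStack.lastIndex)
            completeItem(lengthStack)
            continue
        }
        val majorType = initial ushr 5
        val info = initial and 0x1F
        if (info == 31) { // indefinite-length string, array, or map
            require(majorType in 2..5) { "Invalid indefinite-length header" }
            lengthStack.add(INDEFINITE)
            continue
        }
        val argumentSize = when (info) {
            in 0..23 -> 0
            24 -> 1
            25 -> 2
            26 -> 4
            27 -> 8
            else -> error("Reserved additional information: $info")
        }
        var argument = if (argumentSize == 0) info.toLong() else 0L
        repeat(argumentSize) { argument = (argument shl 8) or (bytes[pos++].toLong() and 0xFFL) }
        when (majorType) {
            2, 3 -> { // byte or text string: the argument is the payload length
                pos += argument.toInt()
                completeItem(lengthStack)
            }
            4 -> { // array: the argument is the number of elements
                if (argument == 0L) completeItem(lengthStack) else lengthStack.add(argument)
            }
            5 -> { // map: the argument is the number of pairs, so twice as many items
                if (argument == 0L) completeItem(lengthStack) else lengthStack.add(argument * 2)
            }
            6 -> { // tag: the tagged item that follows completes this item
            }
            else -> completeItem(lengthStack) // integers, simple values, floats
        }
    }
    return pos
}
```

For the sample input used in docs/formats.md, the value of the unknown `language` key starts at byte offset 37, and `skipItem(bytes, 37)` returns 44, the offset of the closing break byte.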
1 parent d3d2dca commit f7f1bcc

24 files changed, with 931 additions and 264 deletions

docs/formats.md

Lines changed: 64 additions & 13 deletions
@@ -11,6 +11,7 @@ stable, these are currently experimental features of Kotlin serialization.
 <!--- TOC -->

 * [CBOR (experimental)](#cbor-experimental)
+* [Ignoring unknown keys](#ignoring-unknown-keys)
 * [Byte arrays and CBOR data types](#byte-arrays-and-cbor-data-types)
 * [ProtoBuf (experimental)](#protobuf-experimental)
 * [Field numbers](#field-numbers)
@@ -94,6 +95,54 @@ BF # map(*)
 > (see the [Allowing structured map keys](json.md#allowing-structured-map-keys) section for JSON workarounds),
 > and Kotlin maps are serialized as CBOR maps, but some parsers (like `jackson-dataformat-cbor`) don't support this.

+### Ignoring unknown keys
+
+CBOR format is often used to communicate with [IoT] devices where new properties could be added as a part of a device's
+API evolution. By default, unknown keys encountered during deserialization produce an error.
+This behavior can be configured with the [ignoreUnknownKeys][CborBuilder.ignoreUnknownKeys] property.
+
+<!--- INCLUDE
+import kotlinx.serialization.*
+import kotlinx.serialization.cbor.*
+-->
+
+```kotlin
+val format = Cbor { ignoreUnknownKeys = true }
+
+@Serializable
+data class Project(val name: String)
+
+fun main() {
+    val data = format.decodeFromHexString<Project>(
+        "bf646e616d65756b6f746c696e782e73657269616c697a6174696f6e686c616e6775616765664b6f746c696eff"
+    )
+    println(data)
+}
+```
+
+> You can get the full code [here](../guide/example/example-formats-02.kt).
+
+It decodes the object, despite the fact that `Project` is missing the `language` property.
+
+```text
+Project(name=kotlinx.serialization)
+```
+
+<!--- TEST -->
+
+In [CBOR hex notation](http://cbor.me/), the input is equivalent to the following:
+```
+BF # map(*)
+64 # text(4)
+6E616D65 # "name"
+75 # text(21)
+6B6F746C696E782E73657269616C697A6174696F6E # "kotlinx.serialization"
+68 # text(8)
+6C616E6775616765 # "language"
+66 # text(6)
+4B6F746C696E # "Kotlin"
+FF # primitive(*)
+```

 ### Byte arrays and CBOR data types

@@ -138,7 +187,7 @@ fun main() {
 }
 ```

-> You can get the full code [here](../guide/example/example-formats-02.kt).
+> You can get the full code [here](../guide/example/example-formats-03.kt).

 As we see, the CBOR byte that precedes the data is different for different type of encoding.

@@ -203,7 +252,7 @@ fun main() {
 }
 ```

-> You can get the full code [here](../guide/example/example-formats-03.kt).
+> You can get the full code [here](../guide/example/example-formats-04.kt).

 ```text
 {0A}{15}kotlinx.serialization{12}{06}Kotlin
@@ -253,7 +302,7 @@ fun main() {
 }
 ```

-> You can get the full code [here](../guide/example/example-formats-04.kt).
+> You can get the full code [here](../guide/example/example-formats-05.kt).

 We see in the output that the number for the first property `name` did not change (as it is numbered from one by default),
 but it did change for the `language` property.
@@ -304,7 +353,7 @@ fun main() {
 }
 ```

-> You can get the full code [here](../guide/example/example-formats-05.kt).
+> You can get the full code [here](../guide/example/example-formats-06.kt).

 * The [default][ProtoIntegerType.DEFAULT] is a varint encoding (`intXX`) that is optimized for
 small non-negative numbers. The value of `1` is encoded in one byte `01`.
@@ -361,7 +410,7 @@ fun main() {
 }
 ```

-> You can get the full code [here](../guide/example/example-formats-06.kt).
+> You can get the full code [here](../guide/example/example-formats-07.kt).

 ```text
 {08}{01}{08}{02}{08}{03}
@@ -407,7 +456,7 @@ fun main() {
 }
 ```

-> You can get the full code [here](../guide/example/example-formats-07.kt).
+> You can get the full code [here](../guide/example/example-formats-08.kt).

 The resulting map has dot-separated keys representing keys of the nested objects.

@@ -487,7 +536,7 @@ fun main() {
 }
 ```

-> You can get the full code [here](../guide/example/example-formats-08.kt).
+> You can get the full code [here](../guide/example/example-formats-09.kt).

 As a result, we got all the primitives values in our object graph visited and put into a list
 in a _serial_ order.
@@ -589,7 +638,7 @@ fun main() {
 }
 ```

-> You can get the full code [here](../guide/example/example-formats-09.kt).
+> You can get the full code [here](../guide/example/example-formats-10.kt).

 Now can convert a list of primitives back to an object tree.

@@ -680,7 +729,7 @@ fun main() {
 }
 -->

-> You can get the full code [here](../guide/example/example-formats-10.kt).
+> You can get the full code [here](../guide/example/example-formats-11.kt).

 <!--- TEST
 [kotlinx.serialization, kotlin, 9000]
@@ -787,7 +836,7 @@ fun main() {
 }
 ```

-> You can get the full code [here](../guide/example/example-formats-11.kt).
+> You can get the full code [here](../guide/example/example-formats-12.kt).

 We see the size of the list added to the result, letting decoder know where to stop.

@@ -899,7 +948,7 @@ fun main() {

 ```

-> You can get the full code [here](../guide/example/example-formats-12.kt).
+> You can get the full code [here](../guide/example/example-formats-13.kt).

 In the output we see how not-null`!!` and `NULL` marks are used.

@@ -1027,7 +1076,7 @@ fun main() {
 }
 ```

-> You can get the full code [here](../guide/example/example-formats-13.kt).
+> You can get the full code [here](../guide/example/example-formats-14.kt).

 As we can see, the result is the dense binary format that only contains the data that is being serialized.
 It can be easily tweaked for any kind of domain-specific compact encoding.
@@ -1221,7 +1270,7 @@ fun main() {
 }
 ```

-> You can get the full code [here](../guide/example/example-formats-14.kt).
+> You can get the full code [here](../guide/example/example-formats-15.kt).

 As we can see, our custom byte array format is being used, with compact encoding of its size in one byte.

@@ -1239,6 +1288,7 @@ This chapter concludes [Kotlin Serialization Guide](serialization-guide.md).

 <!-- references -->
 [RFC 7049]: https://tools.ietf.org/html/rfc7049
+[IoT]: https://en.wikipedia.org/wiki/Internet_of_things
 [RFC 7049 Major Types]: https://tools.ietf.org/html/rfc7049#section-2.1

 <!-- Java references -->
@@ -1286,5 +1336,6 @@ This chapter concludes [Kotlin Serialization Guide](serialization-guide.md).
 [Cbor]: https://kotlin.github.io/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor/index.html
 [Cbor.encodeToByteArray]: https://kotlin.github.io/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor/encode-to-byte-array.html
 [Cbor.decodeFromByteArray]: https://kotlin.github.io/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor/decode-from-byte-array.html
+[CborBuilder.ignoreUnknownKeys]: https://kotlin.github.io/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-builder/index.html#kotlinx.serialization.cbor%2FCborBuilder%2FignoreUnknownKeys%2F%23%2FPointingToDeclaration%2F
 [ByteString]: https://kotlin.github.io/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-byte-string/index.html
 <!--- END -->

docs/json.md

Lines changed: 1 addition & 1 deletion
@@ -123,7 +123,7 @@ Project(name=kotlinx.serialization, status=SUPPORTED, votes=9000)
 ### Ignoring unknown keys

 JSON format is often used to read the output of 3rd-party services or in otherwise highly-dynamic environment where
-new properties could be added as a part of API evolution. By default, unknown keys encountered during deserialization produces an error.
+new properties could be added as a part of API evolution. By default, unknown keys encountered during deserialization produce an error.
 This behavior can be configured with
 the [ignoreUnknownKeys][JsonBuilder.ignoreUnknownKeys] property.

docs/serialization-guide.md

Lines changed: 1 addition & 0 deletions
@@ -129,6 +129,7 @@ Once the project is set up, we can start serializing some classes.

 <!--- TOC_REF formats.md -->
 * <a name='cbor-experimental'></a>[CBOR (experimental)](formats.md#cbor-experimental)
+* <a name='ignoring-unknown-keys'></a>[Ignoring unknown keys](formats.md#ignoring-unknown-keys)
 * <a name='byte-arrays-and-cbor-data-types'></a>[Byte arrays and CBOR data types](formats.md#byte-arrays-and-cbor-data-types)
 * <a name='protobuf-experimental'></a>[ProtoBuf (experimental)](formats.md#protobuf-experimental)
 * <a name='field-numbers'></a>[Field numbers](formats.md#field-numbers)

formats/cbor/api/kotlinx-serialization-cbor.api

Lines changed: 3 additions & 1 deletion
@@ -7,7 +7,7 @@ public final class kotlinx/serialization/cbor/ByteString$Impl : kotlinx/serializ

 public abstract class kotlinx/serialization/cbor/Cbor : kotlinx/serialization/BinaryFormat {
 public static final field Default Lkotlinx/serialization/cbor/Cbor$Default;
-public synthetic fun <init> (ZLkotlinx/serialization/modules/SerializersModule;Ljava/lang/Void;Lkotlin/jvm/internal/DefaultConstructorMarker;)V
+public synthetic fun <init> (ZZLkotlinx/serialization/modules/SerializersModule;Ljava/lang/Void;Lkotlin/jvm/internal/DefaultConstructorMarker;)V
 public fun decodeFromByteArray (Lkotlinx/serialization/DeserializationStrategy;[B)Ljava/lang/Object;
 public fun encodeToByteArray (Lkotlinx/serialization/SerializationStrategy;Ljava/lang/Object;)[B
 public fun getSerializersModule ()Lkotlinx/serialization/modules/SerializersModule;
@@ -18,8 +18,10 @@ public final class kotlinx/serialization/cbor/Cbor$Default : kotlinx/serializati

 public final class kotlinx/serialization/cbor/CborBuilder {
 public final fun getEncodeDefaults ()Z
+public final fun getIgnoreUnknownKeys ()Z
 public final fun getSerializersModule ()Lkotlinx/serialization/modules/SerializersModule;
 public final fun setEncodeDefaults (Z)V
+public final fun setIgnoreUnknownKeys (Z)V
 public final fun setSerializersModule (Lkotlinx/serialization/modules/SerializersModule;)V
 }

formats/cbor/commonMain/src/kotlinx/serialization/cbor/Cbor.kt

Lines changed: 13 additions & 4 deletions
@@ -26,18 +26,20 @@ import kotlinx.serialization.modules.*
  *
  * @param encodeDefaults specifies whether default values of Kotlin properties are encoded.
  * False by default; meaning that properties with values equal to defaults will be elided.
+ * @param ignoreUnknownKeys specifies if unknown CBOR elements should be ignored (skipped) when decoding.
  */
 @ExperimentalSerializationApi
 public sealed class Cbor(
     internal val encodeDefaults: Boolean,
+    internal val ignoreUnknownKeys: Boolean,
     override val serializersModule: SerializersModule,
     ctorMarker: Nothing? // Marker for the temporary migration
 ) : BinaryFormat {

     /**
      * The default instance of [Cbor]
      */
-    public companion object Default : Cbor(false, EmptySerializersModule, null)
+    public companion object Default : Cbor(false, false, EmptySerializersModule, null)

     override fun <T> encodeToByteArray(serializer: SerializationStrategy<T>, value: T): ByteArray {
         val output = ByteArrayOutput()
@@ -54,8 +56,8 @@ public sealed class Cbor(
 }

 @OptIn(ExperimentalSerializationApi::class)
-private class CborImpl(encodeDefaults: Boolean, serializersModule: SerializersModule) :
-    Cbor(encodeDefaults, serializersModule, null)
+private class CborImpl(encodeDefaults: Boolean, ignoreUnknownKeys: Boolean, serializersModule: SerializersModule) :
+    Cbor(encodeDefaults, ignoreUnknownKeys, serializersModule, null)

 /**
  * Creates an instance of [Cbor] configured from the optionally given [Cbor instance][from]
@@ -65,7 +67,7 @@ private class CborImpl(encodeDefaults: Boolean, serializersModule: SerializersMo
 public fun Cbor(from: Cbor = Cbor, builderAction: CborBuilder.() -> Unit): Cbor {
     val builder = CborBuilder(from)
     builder.builderAction()
-    return CborImpl(builder.encodeDefaults, builder.serializersModule)
+    return CborImpl(builder.encodeDefaults, builder.ignoreUnknownKeys, builder.serializersModule)
 }

 /**
@@ -79,6 +81,13 @@ public class CborBuilder internal constructor(cbor: Cbor) {
      */
     public var encodeDefaults: Boolean = cbor.encodeDefaults

+    /**
+     * Specifies whether encounters of unknown properties in the input CBOR
+     * should be ignored instead of throwing [SerializationException].
+     * `false` by default.
+     */
+    public var ignoreUnknownKeys: Boolean = cbor.ignoreUnknownKeys
+
     /**
      * Module with contextual and polymorphic serializers to be used in the resulting [Cbor] instance.
      */
