[Avro] Add logicalType
support for some java.time
types; add AvroJavaTimeModule
for native ser/deser
#283
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -111,6 +111,36 @@ byte[] avroData = mapper.writer(schema) | |
|
||
and that's about it, for now. | ||
|
||
## Java Time Support | ||
Serialization and deserialization support for a limited set of `java.time` classes, mapped to Avro [logical types](http://avro.apache.org/docs/current/spec.html#Logical+Types), is provided by `AvroJavaTimeModule`. | ||
|
||
```java | ||
AvroMapper mapper = AvroMapper.builder() | ||
.addModule(new AvroJavaTimeModule()) | ||
.build(); | ||
``` | ||
|
||
#### Note | ||
Please note that time zone information is at serialization. Serialized values represent point in time, | ||
Should "is at serialization" instead be "is not included at serialization" (or something like that)?

Right, |
||
independent of a particular time zone or calendar. Upon reading a value back, the time instant is reconstructed, but not the original time zone. | ||
In the case of

Yes it is correct.

I think it would be clearer to say something like, "Note that time zone & offset information is not serialized -- the serialized representation is only a point in time. For local time types (

Having written that, the behavior feels weird to me. Would it be possible to store the offset/zoneId for the

To me, zoned things absolutely should retain time zone and/or offset, and changing that to something else feels very much Wrong. For local variants it may be necessary to do interim binding if (but only if) representation uses a fixed timepoint (like

Avro specification does not aim to preserve time zone for non For correct deserialization into Support of
For local variants, contextual time zone is not used at all.

@MichalFoksa Ok, I think I better go over the changes once again & try to find what Avro specification says. Although handling of local/zoned types has expected semantics in Java 8, I vaguely recall Avro prescribing behavior that seemed to differ... and I think for interoperability the letter of Avro spec should usually have precedence (even if I disagreed with how it was defined).

Even after reading and re-reading what Avro spec says about timestamp, regular and local, I have no idea what those are supposed to mean -- it seems nonsense to be blunt. Not the part about physical storage itself (although why on earth are there separate milli- vs micro-second types?) but ... well, if NEITHER stores timezone information NOR is there ANY WAY to sync send/receiver zones, then... there seems to be no actual reason for 2 types. At all. I mean, timestamp in this sense can NEITHER be local (it is concrete physical time offset) NOR non-local (no time offset or time zone!). It is much like

But to try to untangle the mess I guess there is only the one question of how would read and write operations handle these differently.

On writing side there probably cannot be any difference: physical timestamp is what it is. Whatever "local" timezone could be thought to be does not matter; change of zone/offset would not change that value.

On reading side timezone/offset is sort of arbitrary as well: value itself is concrete, although we can use whatever zone we might want.

@MichalFoksa WDYT?

Apologies for this taking long -- but I have some time now and will get this merged during this week :)

"local" variants do not contain time zone.
I would leave it on user. Yeah, Avro is Avro ... :) But it is not bad. |
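
The behavior discussed in this thread can be reproduced with plain `java.time`, without Jackson or Avro on the classpath. The following is a minimal sketch, not the module's code: the class `TimeZoneLossSketch` and its two helper methods are hypothetical names that only mimic what the serializer and deserializer in this change do with a `ZonedDateTime` (reduce it to epoch milliseconds, then rebuild it in whatever zone the reader supplies).

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical illustration class; not part of the pull request.
final class TimeZoneLossSketch {

    /** Mimics the write side: only the epoch-millisecond instant is kept. */
    static long toEpochMillis(ZonedDateTime value) {
        return value.toInstant().toEpochMilli();
    }

    /** Mimics the read side: the instant is rebuilt in a zone chosen by the reader. */
    static ZonedDateTime fromEpochMillis(long epochMillis, ZoneId readerZone) {
        return ZonedDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), readerZone);
    }

    public static void main(String[] args) {
        ZonedDateTime original =
                ZonedDateTime.of(2021, 6, 1, 12, 0, 0, 0, ZoneId.of("Europe/Paris"));

        long wireValue = toEpochMillis(original);
        ZonedDateTime restored = fromEpochMillis(wireValue, ZoneOffset.UTC);

        // Same point on the timeline, but the original zone is gone.
        System.out.println("same instant: " + original.toInstant().equals(restored.toInstant()));
        System.out.println("same zone:    " + original.getZone().equals(restored.getZone()));
    }
}
```

The instant survives the round trip, while the zone of the restored value is whatever the reading side passes in, which is the point the reviewers are debating.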
||
|
||
#### Supported java.time types: | ||
|
||
Supported java.time types with Avro schema. | ||
|
||
| Type | Avro schema | ||
| ------------------------------ | ------------- | ||
| `java.time.OffsetDateTime` | `{"type": "long", "logicalType": "timestamp-millis"}` | ||
| `java.time.ZonedDateTime` | `{"type": "long", "logicalType": "timestamp-millis"}` | ||
| `java.time.Instant` | `{"type": "long", "logicalType": "timestamp-millis"}` | ||
| `java.time.LocalDate` | `{"type": "int", "logicalType": "date"}` | ||
| `java.time.LocalTime` | `{"type": "int", "logicalType": "time-millis"}` | ||
| `java.time.LocalDateTime` | `{"type": "long", "logicalType": "local-timestamp-millis"}` | ||
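
To make the table concrete, here is a small sketch of the numeric values each local type would map to. It is an assumption-labeled illustration: `LogicalTypeValuesSketch` and its methods are hypothetical, and they are written as the inverse of the deserializers in this change (`LocalDate.ofEpochDay`, `LocalTime.ofNanoOfDay`, and `LocalDateTime.ofInstant` with a zero offset), since the matching serializers are not shown here.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.ZoneOffset;

// Hypothetical illustration class; not part of the pull request.
final class LogicalTypeValuesSketch {

    /** "date": number of days since 1970-01-01, held in an int. */
    static int dateValue(LocalDate date) {
        return Math.toIntExact(date.toEpochDay());
    }

    /** "time-millis": number of milliseconds after midnight, held in an int. */
    static int timeMillisValue(LocalTime time) {
        return Math.toIntExact(time.toNanoOfDay() / 1_000_000L);
    }

    /** "local-timestamp-millis": milliseconds since 1970-01-01T00:00, computed with a zero offset. */
    static long localTimestampMillisValue(LocalDateTime dateTime) {
        return dateTime.toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    public static void main(String[] args) {
        System.out.println(dateValue(LocalDate.of(1970, 1, 2)));
        System.out.println(timeMillisValue(LocalTime.of(0, 0, 1)));
        System.out.println(localTimestampMillisValue(LocalDateTime.of(1970, 1, 1, 0, 0, 1)));
    }
}
```

None of these conversions consult a time zone other than the fixed zero offset, which matches the statement in the review thread that the contextual time zone is not used for local variants.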
|
||
#### Precision | ||
|
||
Avro supports milliseconds and microseconds precision for date and time related LogicalTypes, but this module only supports millisecond precision. | ||
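
One practical consequence of millisecond precision is that any finer-grained part of a value is dropped on a round trip. The sketch below shows this with plain `java.time`; `MillisPrecisionSketch` and `roundTrip` are hypothetical names that stand in for the write/read pair and assume the value passes through `Instant.toEpochMilli()` and `Instant.ofEpochMilli(long)`, as the instant serializer and deserializer in this change do.

```java
import java.time.Instant;

// Hypothetical illustration class; not part of the pull request.
final class MillisPrecisionSketch {

    /** Reduces an instant to epoch milliseconds and rebuilds it, as a millisecond-precision format would. */
    static Instant roundTrip(Instant value) {
        return Instant.ofEpochMilli(value.toEpochMilli());
    }

    public static void main(String[] args) {
        // 123 ms, 456 us and 789 ns past the second.
        Instant precise = Instant.ofEpochSecond(1_600_000_000L, 123_456_789L);
        Instant restored = roundTrip(precise);

        System.out.println("before: " + precise.getNano() + " ns");
        System.out.println("after:  " + restored.getNano() + " ns");
    }
}
```

Callers that need values to compare equal after a round trip may want to truncate to milliseconds themselves before writing, for example with `Instant.truncatedTo(ChronoUnit.MILLIS)`.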
|
||
## Generating Avro Schema from POJO definition | ||
|
||
Ok but wait -- you do not have to START with an Avro Schema. This module can | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
package com.fasterxml.jackson.dataformat.avro.jsr310; | ||
|
||
import com.fasterxml.jackson.databind.module.SimpleModule; | ||
import com.fasterxml.jackson.dataformat.avro.PackageVersion; | ||
import com.fasterxml.jackson.dataformat.avro.jsr310.deser.AvroInstantDeserializer; | ||
import com.fasterxml.jackson.dataformat.avro.jsr310.deser.AvroLocalDateDeserializer; | ||
import com.fasterxml.jackson.dataformat.avro.jsr310.deser.AvroLocalDateTimeDeserializer; | ||
import com.fasterxml.jackson.dataformat.avro.jsr310.deser.AvroLocalTimeDeserializer; | ||
import com.fasterxml.jackson.dataformat.avro.jsr310.ser.AvroInstantSerializer; | ||
import com.fasterxml.jackson.dataformat.avro.jsr310.ser.AvroLocalDateSerializer; | ||
import com.fasterxml.jackson.dataformat.avro.jsr310.ser.AvroLocalDateTimeSerializer; | ||
import com.fasterxml.jackson.dataformat.avro.jsr310.ser.AvroLocalTimeSerializer; | ||
|
||
import java.time.Instant; | ||
import java.time.LocalDate; | ||
import java.time.LocalDateTime; | ||
import java.time.LocalTime; | ||
import java.time.OffsetDateTime; | ||
import java.time.ZonedDateTime; | ||
|
||
/** | ||
* A module that installs a collection of serializers and deserializers for java.time classes. | ||
*/ | ||
public class AvroJavaTimeModule extends SimpleModule { | ||
|
||
private static final long serialVersionUID = 1L; | ||
|
||
public AvroJavaTimeModule() { | ||
super(AvroJavaTimeModule.class.getName(), PackageVersion.VERSION); | ||
|
||
addSerializer(Instant.class, AvroInstantSerializer.INSTANT); | ||
addSerializer(OffsetDateTime.class, AvroInstantSerializer.OFFSET_DATE_TIME); | ||
addSerializer(ZonedDateTime.class, AvroInstantSerializer.ZONED_DATE_TIME); | ||
addSerializer(LocalDateTime.class, AvroLocalDateTimeSerializer.INSTANCE); | ||
addSerializer(LocalDate.class, AvroLocalDateSerializer.INSTANCE); | ||
addSerializer(LocalTime.class, AvroLocalTimeSerializer.INSTANCE); | ||
|
||
addDeserializer(Instant.class, AvroInstantDeserializer.INSTANT); | ||
addDeserializer(OffsetDateTime.class, AvroInstantDeserializer.OFFSET_DATE_TIME); | ||
addDeserializer(ZonedDateTime.class, AvroInstantDeserializer.ZONED_DATE_TIME); | ||
addDeserializer(LocalDateTime.class, AvroLocalDateTimeDeserializer.INSTANCE); | ||
addDeserializer(LocalDate.class, AvroLocalDateDeserializer.INSTANCE); | ||
addDeserializer(LocalTime.class, AvroLocalTimeDeserializer.INSTANCE); | ||
} | ||
|
||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
package com.fasterxml.jackson.dataformat.avro.jsr310.deser; | ||
|
||
import java.time.Instant; | ||
import java.time.OffsetDateTime; | ||
import java.time.ZoneId; | ||
import java.time.ZonedDateTime; | ||
import java.time.temporal.Temporal; | ||
import java.util.function.BiFunction; | ||
|
||
/** | ||
* Deserializer for variants of java.time classes (Instant, OffsetDateTime, ZonedDateTime) from an integer value. | ||
* | ||
* Deserialized value represents an instant on the global timeline, independent of a particular time zone or | ||
* calendar, with a precision of one millisecond from the unix epoch, 1 January 1970 00:00:00.000 UTC. | ||
* Time zone information is lost at serialization. Types that carry a time zone receive it from the deserialization context. | ||
* | ||
* Deserialization from string is not supported. | ||
* | ||
* @param <T> The type of an instant class that can be deserialized. | ||
*/ | ||
public class AvroInstantDeserializer<T extends Temporal> extends AvroJavaTimeDeserializerBase<T> { | ||
|
||
private static final long serialVersionUID = 1L; | ||
|
||
public static final AvroInstantDeserializer<Instant> INSTANT = | ||
new AvroInstantDeserializer<>(Instant.class, (instant, zoneID) -> instant); | ||
|
||
public static final AvroInstantDeserializer<OffsetDateTime> OFFSET_DATE_TIME = | ||
new AvroInstantDeserializer<>(OffsetDateTime.class, OffsetDateTime::ofInstant); | ||
|
||
public static final AvroInstantDeserializer<ZonedDateTime> ZONED_DATE_TIME = | ||
new AvroInstantDeserializer<>(ZonedDateTime.class, ZonedDateTime::ofInstant); | ||
|
||
protected final BiFunction<Instant, ZoneId, T> fromInstant; | ||
|
||
protected AvroInstantDeserializer(Class<T> supportedType, BiFunction<Instant, ZoneId, T> fromInstant) { | ||
super(supportedType); | ||
this.fromInstant = fromInstant; | ||
} | ||
|
||
@Override | ||
protected T fromLong(long longValue, ZoneId defaultZoneId) { | ||
/** | ||
* Number of milliseconds, independent of a particular time zone or calendar, | ||
* from 1 January 1970 00:00:00.000 UTC. | ||
*/ | ||
return fromInstant.apply(Instant.ofEpochMilli(longValue), defaultZoneId); | ||
} | ||
|
||
} |
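
The three constants above differ only in the `BiFunction<Instant, ZoneId, T>` they pass to the constructor. The following standalone sketch shows that shape without any Jackson types; `FromInstantSketch` and its members are hypothetical names, and the static `fromLong` helper only imitates the method of the same name in the class above.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.temporal.Temporal;
import java.util.function.BiFunction;

// Hypothetical illustration class; not part of the pull request.
final class FromInstantSketch {

    // Instant has no zone, so the supplied zone is simply ignored.
    static final BiFunction<Instant, ZoneId, Instant> INSTANT = (instant, zone) -> instant;

    // Both factory methods already have the (Instant, ZoneId) signature.
    static final BiFunction<Instant, ZoneId, OffsetDateTime> OFFSET_DATE_TIME = OffsetDateTime::ofInstant;
    static final BiFunction<Instant, ZoneId, ZonedDateTime> ZONED_DATE_TIME = ZonedDateTime::ofInstant;

    /** Converts epoch milliseconds to the target type using the supplied factory. */
    static <T extends Temporal> T fromLong(long epochMillis, ZoneId zone,
                                           BiFunction<Instant, ZoneId, T> fromInstant) {
        return fromInstant.apply(Instant.ofEpochMilli(epochMillis), zone);
    }

    public static void main(String[] args) {
        ZoneId readerZone = ZoneOffset.ofHours(2);
        System.out.println(fromLong(0L, readerZone, INSTANT));
        System.out.println(fromLong(0L, readerZone, OFFSET_DATE_TIME));
        System.out.println(fromLong(0L, readerZone, ZONED_DATE_TIME));
    }
}
```

Passing the conversion as a function keeps a single deserializer class for all three types, at the cost of the zoned results depending on the zone the caller provides.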
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
package com.fasterxml.jackson.dataformat.avro.jsr310.deser; | ||
|
||
import com.fasterxml.jackson.core.JsonParser; | ||
import com.fasterxml.jackson.databind.DeserializationContext; | ||
import com.fasterxml.jackson.databind.deser.std.StdScalarDeserializer; | ||
import com.fasterxml.jackson.databind.type.LogicalType; | ||
|
||
import java.io.IOException; | ||
import java.time.ZoneId; | ||
|
||
import static com.fasterxml.jackson.core.JsonToken.VALUE_NUMBER_INT; | ||
|
||
public abstract class AvroJavaTimeDeserializerBase<T> extends StdScalarDeserializer<T> { | ||
|
||
protected AvroJavaTimeDeserializerBase(Class<T> supportedType) { | ||
super(supportedType); | ||
} | ||
|
||
@Override | ||
public LogicalType logicalType() { | ||
return LogicalType.DateTime; | ||
} | ||
|
||
@SuppressWarnings("unchecked") | ||
@Override | ||
public T deserialize(JsonParser p, DeserializationContext context) throws IOException { | ||
if (p.getCurrentToken() == VALUE_NUMBER_INT) { | ||
final ZoneId defaultZoneId = context.getTimeZone().toZoneId().normalized(); | ||
return fromLong(p.getLongValue(), defaultZoneId); | ||
} else { | ||
return (T) context.handleUnexpectedToken(_valueClass, p); | ||
} | ||
} | ||
|
||
protected abstract T fromLong(long longValue, ZoneId defaultZoneId); | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
package com.fasterxml.jackson.dataformat.avro.jsr310.deser; | ||
|
||
import java.time.LocalDate; | ||
import java.time.ZoneId; | ||
|
||
/** | ||
* Deserializer for {@link LocalDate} from an integer value. | ||
* | ||
* Deserialized value represents number of days from the unix epoch, 1 January 1970. | ||
* | ||
* Deserialization from string is not supported. | ||
*/ | ||
public class AvroLocalDateDeserializer extends AvroJavaTimeDeserializerBase<LocalDate> { | ||
|
||
private static final long serialVersionUID = 1L; | ||
|
||
public static final AvroLocalDateDeserializer INSTANCE = new AvroLocalDateDeserializer(); | ||
|
||
protected AvroLocalDateDeserializer() { | ||
super(LocalDate.class); | ||
} | ||
|
||
@Override | ||
protected LocalDate fromLong(long longValue, ZoneId defaultZoneId) { | ||
/** | ||
* Number of days from the unix epoch, 1 January 1970. | ||
*/ | ||
return LocalDate.ofEpochDay(longValue); | ||
} | ||
|
||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
package com.fasterxml.jackson.dataformat.avro.jsr310.deser; | ||
|
||
import java.time.Instant; | ||
import java.time.LocalDateTime; | ||
import java.time.ZoneId; | ||
import java.time.ZoneOffset; | ||
|
||
/** | ||
* Deserializer for {@link LocalDateTime} from an integer value. | ||
* | ||
* Deserialized value represents timestamp in a local timezone, regardless of what specific time zone | ||
* is considered local, with a precision of one millisecond from 1 January 1970 00:00:00.000. | ||
* | ||
* Deserialization from string is not supported. | ||
*/ | ||
public class AvroLocalDateTimeDeserializer extends AvroJavaTimeDeserializerBase<LocalDateTime> { | ||
|
||
private static final long serialVersionUID = 1L; | ||
|
||
public static final AvroLocalDateTimeDeserializer INSTANCE = new AvroLocalDateTimeDeserializer(); | ||
|
||
protected AvroLocalDateTimeDeserializer() { | ||
super(LocalDateTime.class); | ||
} | ||
|
||
@Override | ||
protected LocalDateTime fromLong(long longValue, ZoneId defaultZoneId) { | ||
/** | ||
* Number of milliseconds in a local timezone, regardless of what specific time zone is considered local, | ||
* from 1 January 1970 00:00:00.000. | ||
*/ | ||
return LocalDateTime.ofInstant(Instant.ofEpochMilli(longValue), ZoneOffset.UTC); | ||
} | ||
|
||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
package com.fasterxml.jackson.dataformat.avro.jsr310.deser; | ||
|
||
import java.time.LocalTime; | ||
import java.time.ZoneId; | ||
|
||
/** | ||
* Deserializer for {@link LocalTime} from an integer value. | ||
* | ||
* Deserialized value represents time of day, with no reference to a particular calendar, | ||
* time zone or date, where the int stores the number of milliseconds after midnight, 00:00:00.000. | ||
* | ||
* Deserialization from string is not supported. | ||
*/ | ||
public class AvroLocalTimeDeserializer extends AvroJavaTimeDeserializerBase<LocalTime> { | ||
|
||
private static final long serialVersionUID = 1L; | ||
|
||
public static final AvroLocalTimeDeserializer INSTANCE = new AvroLocalTimeDeserializer(); | ||
|
||
protected AvroLocalTimeDeserializer() { | ||
super(LocalTime.class); | ||
} | ||
|
||
@Override | ||
protected LocalTime fromLong(long longValue, ZoneId defaultZoneId) { | ||
/** | ||
* Number of milliseconds, with no reference to a particular calendar, time zone or date, after | ||
* midnight, 00:00:00.000. | ||
*/ | ||
return LocalTime.ofNanoOfDay(longValue * 1_000_000L); | ||
} | ||
|
||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
package com.fasterxml.jackson.dataformat.avro.jsr310.ser; | ||
|
||
import com.fasterxml.jackson.core.JsonGenerator; | ||
import com.fasterxml.jackson.core.JsonParser; | ||
import com.fasterxml.jackson.databind.JavaType; | ||
import com.fasterxml.jackson.databind.JsonMappingException; | ||
import com.fasterxml.jackson.databind.SerializerProvider; | ||
import com.fasterxml.jackson.databind.jsonFormatVisitors.JsonFormatVisitorWrapper; | ||
import com.fasterxml.jackson.databind.jsonFormatVisitors.JsonIntegerFormatVisitor; | ||
import com.fasterxml.jackson.databind.ser.std.StdScalarSerializer; | ||
|
||
import java.io.IOException; | ||
import java.time.Instant; | ||
import java.time.OffsetDateTime; | ||
import java.time.ZonedDateTime; | ||
import java.time.temporal.Temporal; | ||
import java.util.function.Function; | ||
|
||
/** | ||
* Serializer for variants of java.time classes (Instant, OffsetDateTime, ZonedDateTime) into long value. | ||
* | ||
* Serialized value represents an instant on the global timeline, independent of a particular time zone or | ||
* calendar, with a precision of one millisecond from the unix epoch, 1 January 1970 00:00:00.000 UTC. | ||
* Please note that time zone information gets lost in this process. Upon reading a value back, we can only | ||
* reconstruct the instant, but not the original representation. | ||
* | ||
* Note: In combination with {@link com.fasterxml.jackson.dataformat.avro.schema.DateTimeVisitor} it aims to produce | ||
* Avro schema with type long and logicalType timestamp-millis: | ||
* { | ||
* "type" : "long", | ||
* "logicalType" : "timestamp-millis" | ||
* } | ||
* | ||
* {@link AvroInstantSerializer} does not support serialization to string. | ||
* | ||
* @param <T> The type of an instant class that can be serialized. | ||
*/ | ||
public class AvroInstantSerializer<T extends Temporal> extends StdScalarSerializer<T> { | ||
|
||
private static final long serialVersionUID = 1L; | ||
|
||
public static final AvroInstantSerializer<Instant> INSTANT = | ||
new AvroInstantSerializer<>(Instant.class, Function.identity()); | ||
|
||
public static final AvroInstantSerializer<OffsetDateTime> OFFSET_DATE_TIME = | ||
new AvroInstantSerializer<>(OffsetDateTime.class, OffsetDateTime::toInstant); | ||
|
||
public static final AvroInstantSerializer<ZonedDateTime> ZONED_DATE_TIME = | ||
new AvroInstantSerializer<>(ZonedDateTime.class, ZonedDateTime::toInstant); | ||
|
||
private final Function<T, Instant> getInstant; | ||
|
||
protected AvroInstantSerializer(Class<T> t, Function<T, Instant> getInstant) { | ||
super(t); | ||
this.getInstant = getInstant; | ||
} | ||
|
||
@Override | ||
public void serialize(T value, JsonGenerator gen, SerializerProvider provider) throws IOException { | ||
/** | ||
* Number of milliseconds, independent of a particular time zone or calendar, | ||
* from 1 January 1970 00:00:00.000 UTC. | ||
*/ | ||
final Instant instant = getInstant.apply(value); | ||
gen.writeNumber(instant.toEpochMilli()); | ||
} | ||
|
||
@Override | ||
public void acceptJsonFormatVisitor(JsonFormatVisitorWrapper visitor, JavaType typeHint) throws JsonMappingException { | ||
JsonIntegerFormatVisitor v2 = visitor.expectIntegerFormat(typeHint); | ||
if (v2 != null) { | ||
v2.numberType(JsonParser.NumberType.LONG); | ||
} | ||
} | ||
|
||
} |