3436: AVRO-1759: [java] Automatic union types for sealed classes#13
martin-augment wants to merge 3 commits into main
Walkthrough
Avro's reflection-based serialization gains support for sealed classes by introducing a new union resolver that checks both Union annotations and sealed class permits. Internal union handling refactors to use
Changes
Sequence Diagram
sequenceDiagram
participant Client
participant ReflectData
participant AnnotationCheck
participant SealedCheck
participant SpecificData
Client->>ReflectData: createSchema(Animal.class)
ReflectData->>ReflectData: getUnion(Animal.class)
ReflectData->>AnnotationCheck: Check for @Union annotation
alt Union annotation present
AnnotationCheck-->>ReflectData: Return Union.labelled types
else No annotation
ReflectData->>SealedCheck: Check if class is sealed
SealedCheck->>SpecificData: isSealed(Animal.class)
SpecificData-->>SealedCheck: true
SealedCheck->>SpecificData: getPermittedSubclasses(Animal.class)
SpecificData-->>SealedCheck: [Cat.class, Dog.class]
SealedCheck-->>ReflectData: Return permitted subclasses
end
ReflectData-->>Client: Schema with Cat/Dog union types
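The resolution order in the diagram (annotation first, sealed-class permits as the fallback) can be sketched in plain Java. This is a minimal sketch, not the PR's code: `UnionSketch` is a hypothetical stand-in for Avro's `@Union` annotation so the example compiles without Avro, and the sealed-class lookup is passed in as a function so the ordering logic can be shown on any JDK.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.function.Function;

class UnionResolutionSketch {

  /** Hypothetical stand-in for Avro's @Union annotation (assumption, not Avro API). */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  @interface UnionSketch {
    Class<?>[] value();
  }

  /**
   * Returns the union branches for a class, or null when the class is not a union.
   * An explicit annotation always wins; the permitted-subclass lookup is consulted
   * only when no annotation is present.
   */
  static Class<?>[] getUnion(Class<?> c, Function<Class<?>, Class<?>[]> permittedLookup) {
    UnionSketch annotation = c.getAnnotation(UnionSketch.class);
    if (annotation != null) {
      return annotation.value();
    }
    // Fallback: the lookup returns null for classes that are not sealed.
    return permittedLookup.apply(c);
  }

  // Demo types, hypothetical and unrelated to the PR's test classes.
  @UnionSketch({ Cat.class, Dog.class })
  static class Animal {}
  static class Cat extends Animal {}
  static class Dog extends Animal {}
  static class Plain {}
}
```

Passing the lookup as a parameter keeps the precedence rule testable in isolation; in the PR the fallback is a reflective call to `Class.isSealed()` and `Class.getPermittedSubclasses()`.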
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~60 minutes
The changes involve core reflection logic modifications in ReflectData.java, with union resolution refactoring affecting multiple methods, combined with Java 17 sealed-class feature integration requiring careful verification of backward compatibility and reflection correctness.
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
PR Review: AVRO-1759 - Automatic union types for sealed classes
Summary
This PR adds automatic union type detection for Java sealed classes, eliminating the need for explicit
Code Quality & Best Practices
✅ Strengths
Major Issue
Actionable comments posted: 2
🧹 Nitpick comments (4)
lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectData.java (3)
920-923: Unify union resolution at field level.
Prefer calling getUnion(field) to centralize logic and enable sealed-class detection consistently at field sites.
Suggested diff:
-    Union union = field.getAnnotation(Union.class);
-    if (union != null)
-      return getAnnotatedUnion(union.value(), names);
+    Class[] union = getUnion(field);
+    if (union != null)
+      return getAnnotatedUnion(union, names);
966-973: Apply centralized union derivation to parameters.
To keep behavior consistent (annotations first, sealed fallback), use getUnion(parameter) rather than inspecting annotations manually.
Minimal change:
-        else if (annotation instanceof Union) // union
-          paramSchema = getAnnotatedUnion(((Union) annotation).value(), names);
+        else if (annotation instanceof Union) // union
+          paramSchema = getAnnotatedUnion(((Union) annotation).value(), names);
 ...
+      Class[] paramUnion = getUnion(parameter);
+      if (paramUnion != null) {
+        paramSchema = getAnnotatedUnion(paramUnion, names);
+      }
833-851: Add defensive null-check for GET_PERMITTED_SUBCLASSES_METHOD; sorting and flattening lack test evidence.
The null-check for GET_PERMITTED_SUBCLASSES_METHOD (line 843) is a valid defensive improvement, even though both methods are initialized together in the same try-catch block. Adding it is recommended for clarity and robustness.
However, the sorting-by-name suggestion lacks evidence of necessity: no deterministic union ordering requirement is documented or tested in the codebase. Similarly, the flattening of nested sealed hierarchies is speculative; there are no tests for sealed class unions at all, making this an unverified edge-case optimization rather than a confirmed issue.
The core null-check should be applied. The sorting and flattening suggestions should be treated as optional enhancements pending concrete use cases or test failures.
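If deterministic ordering is adopted later, it could look like the sketch below. The Javadoc for `Class.getPermittedSubclasses()` appears to leave the element order unspecified, which is the motivation for sorting; whether Avro needs this is not established by the PR. The class and method names here are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

class StableUnionOrderSketch {

  /**
   * Returns a copy of the given branches sorted by fully qualified class name.
   * The input array is left untouched.
   */
  static Class<?>[] sortedByName(Class<?>[] branches) {
    Class<?>[] copy = branches.clone();
    Arrays.sort(copy, Comparator.comparing((Class<?> k) -> k.getName()));
    return copy;
  }
}
```

Sorting a copy avoids mutating an array that the caller (or the JDK) may share, at the cost of one extra allocation per schema lookup.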
lang/java/java17-test/src/test/java/org/apache/avro/reflect/TestPolymorphicEncoding.java (1)
40-47: Optional: Validate schema shape too.
Add an assertion that ReflectData.get().getSchema(Animal.class) is a UNION with Cat and Dog branches to guard against regressions.
Example (non-diff):
Schema s = ReflectData.get().getSchema(Animal.class);
assertEquals(Schema.Type.UNION, s.getType());
assertEquals(
    Arrays.asList(Cat.class.getName(), Dog.class.getName()),
    s.getTypes().stream().map(t -> t.getFullName()).sorted().toList()
);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectData.java (7 hunks)
lang/java/java17-test/src/test/java/org/apache/avro/reflect/TestPolymorphicEncoding.java (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
lang/java/java17-test/src/test/java/org/apache/avro/reflect/TestPolymorphicEncoding.java (2)
lang/java/avro/src/main/java/org/apache/avro/file/DataFileStream.java (1)
DataFileStream (50-400)
lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectData.java (1)
ReflectData(69-1118)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: maven4
- GitHub Check: Java Interop (ubuntu-latest)
- GitHub Check: Java Interop (ubuntu-24.04-arm)
- GitHub Check: Java Test (ubuntu-24.04-arm)
- GitHub Check: Java Test (ubuntu-latest)
- GitHub Check: Analyze (java)
- GitHub Check: claude-review
- GitHub Check: codex
🔇 Additional comments (4)
lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectData.java (4)
41-41: No issues; import is appropriate.
727-731: OK: class-level union derivation is now shared. This refactor reads cleanly. Ensure branches are deterministically ordered to keep the schema stable across runs. The ordering concern is addressed if you sort inside getUnion for sealed-class paths (see separate comment).
853-858: LGTM: updated utility for union schema creation. Signature change to Class[] is consistent and reads well.
981-982: OK: method return union handling updated. Adapting to Class[] via getAnnotatedUnion is consistent with the refactor.
  static {
    Class<? extends Class> classClass = SpecificData.class.getClass();
    Method isSealed;
    Method getPermittedSubclasses;
    try {
      isSealed = classClass.getMethod("isSealed");
      getPermittedSubclasses = classClass.getMethod("getPermittedSubclasses");
    } catch (NoSuchMethodException e) {
      isSealed = null;
      getPermittedSubclasses = null;
    }
    IS_SEALED_METHOD = isSealed;
    GET_PERMITTED_SUBCLASSES_METHOD = getPermittedSubclasses;
  }
Safer reflective lookup and future-proofing.
- Use Class.class directly; current SpecificData.class.getClass() is indirect.
- Guard both methods being non-null to avoid NPEs.
- Consider catching SecurityException too.
Apply this diff:
-  static {
-    Class<? extends Class> classClass = SpecificData.class.getClass();
+  static {
+    Class<?> classClass = Class.class;
     Method isSealed;
     Method getPermittedSubclasses;
     try {
       isSealed = classClass.getMethod("isSealed");
       getPermittedSubclasses = classClass.getMethod("getPermittedSubclasses");
-    } catch (NoSuchMethodException e) {
+    } catch (NoSuchMethodException | SecurityException e) {
       isSealed = null;
       getPermittedSubclasses = null;
     }
     IS_SEALED_METHOD = isSealed;
     GET_PERMITTED_SUBCLASSES_METHOD = getPermittedSubclasses;
   }
🤖 Prompt for AI Agents
In lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectData.java around
lines 76 to 89, replace the indirect reflective lookup using
SpecificData.class.getClass() with Class.class directly, wrap getMethod calls in
a try/catch that catches both NoSuchMethodException and SecurityException, and
ensure both IS_SEALED_METHOD and GET_PERMITTED_SUBCLASSES_METHOD are assigned
safely (set to null if either lookup fails) so downstream code does not get an
NPE; i.e., obtain Method references from Class.class, catch
NoSuchMethodException and SecurityException, and only assign non-null Methods
(or null) to the static fields.
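A standalone version of this lookup, with the null guard applied at the call site, might look like the following. It is a sketch under assumptions: the class name `SealedSupportSketch` and the helper `permittedSubclassesOrNull` are hypothetical and do not appear in ReflectData.java; only `Class.isSealed()` and `Class.getPermittedSubclasses()` (Java 17) are real JDK methods.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

final class SealedSupportSketch {
  // Both fields are null on JDKs that predate sealed classes.
  private static final Method IS_SEALED_METHOD;
  private static final Method GET_PERMITTED_SUBCLASSES_METHOD;

  static {
    Method isSealed;
    Method getPermittedSubclasses;
    try {
      isSealed = Class.class.getMethod("isSealed");
      getPermittedSubclasses = Class.class.getMethod("getPermittedSubclasses");
    } catch (NoSuchMethodException | SecurityException e) {
      // Treat a partial lookup as "no sealed-class support at all".
      isSealed = null;
      getPermittedSubclasses = null;
    }
    IS_SEALED_METHOD = isSealed;
    GET_PERMITTED_SUBCLASSES_METHOD = getPermittedSubclasses;
  }

  /** Returns the permitted subclasses of a sealed class, or null otherwise. */
  static Class<?>[] permittedSubclassesOrNull(Class<?> c) {
    if (IS_SEALED_METHOD == null || GET_PERMITTED_SUBCLASSES_METHOD == null) {
      return null; // running on a JDK without sealed classes
    }
    try {
      if (!Boolean.TRUE.equals(IS_SEALED_METHOD.invoke(c))) {
        return null; // an ordinary, non-sealed class
      }
      return (Class<?>[]) GET_PERMITTED_SUBCLASSES_METHOD.invoke(c);
    } catch (IllegalAccessException | InvocationTargetException e) {
      throw new IllegalStateException("Could not inspect " + c.getName(), e);
    }
  }
}
```

Checking both fields in one place means the rest of the code never has to reason about which of the two lookups failed.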
    try (DataFileStream<T> dataFileReader = new DataFileStream<>(new ByteArrayInputStream(toDecode, 0, toDecode.length),
        datumReader);) {
      List<T> toReturn = new ArrayList<>();
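Remove the redundant trailing semicolon in try-with-resources.
The extra semicolon inside the resource list is legal Java (the grammar allows an optional semicolon after the last resource), so it does not break compilation. Removing it is a style cleanup.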
Apply this diff:
- try (DataFileStream<T> dataFileReader = new DataFileStream<>(new ByteArrayInputStream(toDecode, 0, toDecode.length),
- datumReader);) {
+ try (DataFileStream<T> dataFileReader = new DataFileStream<>(
+ new ByteArrayInputStream(toDecode, 0, toDecode.length), datumReader)) {
🤖 Prompt for AI Agents
In
lang/java/java17-test/src/test/java/org/apache/avro/reflect/TestPolymorphicEncoding.java
around lines 51 to 53, the try-with-resources resource list contains a redundant
trailing semicolon. Java's grammar permits it, so it is not a compile error, but
it can be removed as a style cleanup: drop the semicolon after the
DataFileStream resource declaration so the try declaration has no semicolon
inside the parentheses, leaving the rest of the try block and resource
initialization unchanged.
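For reference, the Java grammar for try-with-resources (JLS 14.20.3) marks the semicolon after the last resource as optional. A minimal sketch showing that such code compiles and runs; the class `TrailingSemicolonSketch` and helper `firstByte` are hypothetical and unrelated to the test under review.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class TrailingSemicolonSketch {

  /** Reads the first byte of the array, or returns -1 if the array is empty. */
  static int firstByte(byte[] data) {
    // Note the semicolon before the closing parenthesis: it is accepted by the compiler.
    try (InputStream in = new ByteArrayInputStream(data, 0, data.length);) {
      return in.read();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The semicolon is a separator that is also tolerated as a terminator, so whether to keep it is a formatting decision for the project's style rules.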
3436: To review by AI
Summary by CodeRabbit
New Features
Tests