Adding _source and schema merging to index_mappings #1101
Please remove spark-warehouse. I think it's generated by the Spark tests and should normally be removed automatically after the tests complete.
Thanks for the changes!
I recall the current schema merge logic only supports a limited set of field names, right? Could you create a follow-up issue for improvements, or update the doc with this limitation?
@ahkcs please check the failing e2e test.
* Fix antlr4 parser issues (#1094)
  * Fix antlr4 parser issues
  * Case insensitive lexer
  * revert useless change
  * remove tokens file
* adding _source to index_mappings
* syntax fix
* Apply scalafmt
* Added index_mapping as an option in index.md, applied scalafmtAll
* improve readability
* Removed index_mappings from FlintMetaData.scala, Modified index.md
* removed indexMappingsSourceEnabled from FlintMetadata.scala
* removed indexMappingsSourceEnabled from FlintMetadata.scala
* removed indexMappingsSourceEnabled from FlintMetadata.scala
* Removed indexMappingsSourceEnabled from FlintMetadata.scala and removed unnecessary code
* Added some test cases to test serialzie() and fixed some formatting issues
* Added some test cases for FlintOpenSearchIndexMetadataServiceSuite.scala
* Added schema merging to index_mappings, added some test cases
* updated test cases
* Minor format fix
* minor fixes
* added nested schema merging logic, moved mergeSchema to serialize, updated test cases, fixed some minor issues
* updated some comments
* fixed some formatting issues based on the comments
* fixed syntax issue
* syntax issue
* syntax issue
* fixed the FlintSparkSkippingIndexITSuite
* fixing schema merging limitation
* less scala/java conversion
* style fix
* fix unnecessary casting

Signed-off-by: Lantao Jin <ltjin@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Co-authored-by: Lantao Jin <ltjin@amazon.com>
(cherry picked from commit 76d35e2)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Description
This PR adds _source support and schema merging to index_mappings so that customers can customize the index mapping of the OpenSearch index that stores the Flint index data created via Spark SQL.
The _source option lets users choose whether _source is enabled in index_mappings.
For example:
index_mappings: '{ "_source": { "enabled": false } }'
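To make the effect of the `_source` option concrete, here is a minimal sketch of how the flag could be read from an already-parsed index_mappings value. The object `SourceOptionSketch` and the method `sourceEnabled` are hypothetical names used only for illustration and are not taken from the PR. The sketch assumes the JSON has already been parsed into nested Scala maps, and it falls back to `true` because OpenSearch stores `_source` unless told otherwise.

```scala
object SourceOptionSketch {
  // Hypothetical helper, not the PR's code: reads "_source.enabled" from
  // index_mappings that has already been parsed into nested maps.
  // Falls back to true, the OpenSearch default, when the flag is absent
  // or is not a boolean.
  def sourceEnabled(indexMappings: Map[String, Any]): Boolean =
    indexMappings.get("_source") match {
      case Some(source: Map[_, _]) =>
        source.asInstanceOf[Map[String, Any]].get("enabled") match {
          case Some(flag: Boolean) => flag
          case _ => true
        }
      case _ => true
    }
}
```

Calling `SourceOptionSketch.sourceEnabled(Map("_source" -> Map("enabled" -> false)))` yields `false`, while an empty map yields `true`.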
Schema merging lets users override mapping parameters for specific fields, for example disabling doc_values or indexing on a field to save space.
For example:
index_mappings: '{ "properties": { "test_field": {"index": false} } }'
Test Results
Before disabling _source:
Input:
Output:
After disabling _source:
Input:
Output:
Before adding index_mappings: '{ "_source": { "enabled": false }, "properties": { "cnt": {"index": false} } }'
Input:
Output:
After adding index_mappings:
Input:
Output:
Related Issues
Resolves #772
Pending implementations for #772 (Support index mapping option in create index statement)