diff --git a/docs/bigquery-cdcTarget.md b/docs/bigquery-cdcTarget.md
index eef981c..8911c41 100644
--- a/docs/bigquery-cdcTarget.md
+++ b/docs/bigquery-cdcTarget.md
@@ -1,4 +1,4 @@
-# Google BigQuery Delta Target
+# Google BigQuery Replication Target
 
 Description
 -----------
@@ -9,7 +9,7 @@ table using a BigQuery merge query.
 
 The final target tables will include all the original columns from the source table plus one
 additional _sequence_num column. The sequence number is used to ensure that data is not duplicated or missed in
-replicator failure scenarios.
+replication job failure scenarios.
 
 Credentials
 -----------
@@ -46,9 +46,9 @@ https://cloud.google.com/bigquery/docs/locations. This value is ignored if an ex
 staging bucket and the BigQuery dataset will be created in the same location as that bucket.
 
 **Staging Bucket**: GCS bucket to write change events to before loading them into staging tables.
-Changes are written to a directory that contains the replicator name and namespace. It is safe to use
-the same bucket across multiple replicators within the same instance. If it is shared by replicators across
-multiple instances, ensure that the namespace and name are unique, otherwise the behavior is undefined.
+Changes are written to a directory that contains the replication job name and namespace. It is safe to use
+the same bucket across multiple replication jobs within the same instance. If it is shared by replication jobs
+across multiple instances, ensure that the namespace and name are unique; otherwise, the behavior is undefined.
 The bucket must be in the same location as the BigQuery dataset. If not provided, new bucket will be created
 for each pipeline named as 'df-rbq---'. Note that user will have to explicitly delete the bucket once
 the pipeline is deleted.
@@ -63,7 +63,7 @@ of the cluster.
 Staging tables names are generated by prepending this prefix to the target table name.
 
 **Require Manual Drop Intervention**: Whether to require manual administrative action to drop tables and
-datasets when a drop table or drop database event is encountered. When set to true, the replicator will
+datasets when a drop table or drop database event is encountered. When set to true, the replication job will
 not delete a table or dataset. Instead, it will fail and retry until the table or dataset does not exist.
 If the dataset or table does not already exist, no manual intervention is required. The event will be
 skipped as normal.
diff --git a/src/main/java/io/cdap/delta/bigquery/BigQueryTarget.java b/src/main/java/io/cdap/delta/bigquery/BigQueryTarget.java
index c4db226..748bfec 100644
--- a/src/main/java/io/cdap/delta/bigquery/BigQueryTarget.java
+++ b/src/main/java/io/cdap/delta/bigquery/BigQueryTarget.java
@@ -185,8 +185,8 @@ private static String stringifyPipelineId(DeltaPipelineId pipelineId) {
   public static class Conf extends PluginConfig {
 
     @Nullable
-    @Description("Project of the BigQuery dataset. When running on a Google Cloud VM, this can be set to "
-      + "'auto-detect', which will use the project of the VM.")
+    @Description("Project of the BigQuery dataset. When running on a Dataproc cluster, this can be set to "
+      + "'auto-detect', which will use the project of the cluster.")
     private String project;
 
     @Macro
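
Reviewer note (not part of the patch): the doc change above keeps the statement that change events are applied with a BigQuery merge query and that _sequence_num prevents duplicated or missed data on replication job failure. A minimal Java sketch of that idea, assuming a staging/target table pair with a single key column; the MergeQuerySketch class, table names, and key column are hypothetical, and the plugin's actual generated SQL may differ:

// Hypothetical sketch of the deduplication idea behind the _sequence_num column:
// each change event carries a monotonically increasing sequence number, and the
// merge from the staging table only applies rows that are strictly newer than
// what the target table already holds, so re-applying a batch after a
// replication job failure is idempotent.
public final class MergeQuerySketch {
  private MergeQuerySketch() {
  }

  public static String mergeQuery(String targetTable, String stagingTable, String keyColumn) {
    return "MERGE `" + targetTable + "` AS T\n"
      + "USING `" + stagingTable + "` AS S\n"
      + "ON T." + keyColumn + " = S." + keyColumn + "\n"
      // A row replayed with an already-applied _sequence_num fails this
      // condition and is skipped, which prevents duplicates.
      + "WHEN MATCHED AND S._sequence_num > T._sequence_num THEN\n"
      + "  UPDATE SET _sequence_num = S._sequence_num\n"
      + "WHEN NOT MATCHED THEN\n"
      + "  INSERT (" + keyColumn + ", _sequence_num)\n"
      + "  VALUES (S." + keyColumn + ", S._sequence_num)";
  }
}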
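
Similarly, for the 'auto-detect' value referenced in the updated @Description: a minimal sketch of how such project resolution can be implemented with google-cloud-core's ServiceOptions. The ProjectResolver class and resolveProject method are hypothetical illustrations, not the plugin's actual code:

import com.google.cloud.ServiceOptions;

import javax.annotation.Nullable;

// Hypothetical helper illustrating the 'auto-detect' semantics described in the
// updated @Description: when the configured project is null or "auto-detect",
// fall back to the project of the environment the code runs in (on a Dataproc
// cluster, that is the cluster's project).
public final class ProjectResolver {
  private static final String AUTO_DETECT = "auto-detect";

  private ProjectResolver() {
  }

  public static String resolveProject(@Nullable String configured) {
    if (configured == null || AUTO_DETECT.equals(configured)) {
      // ServiceOptions.getDefaultProjectId() consults, among other sources, the
      // GOOGLE_CLOUD_PROJECT environment variable, gcloud configuration, and the
      // GCE metadata server, which is what makes detection work on cluster VMs.
      return ServiceOptions.getDefaultProjectId();
    }
    return configured;
  }
}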