I feel like there should be a deployment option to disable the Cloud Dataflow use.
Pretty much everything else used by this tool feels pay-as-you-go/serverless.
However, it seems like Dataflow is provisioning an n1-standard-4 instance ($97/mo). This is simply not going to be within my budget.
I'd love to see an option to disable Dataflow in the scripts, and also an explanation of what Dataflow does.
Note: I'm not at all familiar with Dataflow, so I'm not sure where it's currently utilized in this tool. (It feels like the Cloud Run service could write directly to Stackdriver and/or BigQuery.)
Agree, this is the one bit that's unlike the others in this stack. Cloud Run could insert the events directly into BigQuery, but that's an anti-pattern given the low quota on individual inserts. Will consider an alternative way of streaming events from Pub/Sub to BigQuery.
I've modified the Pub/Sub to BigQuery pipeline to use a maximum of 1 worker, so that should significantly reduce the cost (to ~$30/mo). I still need to test it, but you should be able to use this in setup:
gcloud dataflow jobs run $SERVICE_NAME \
--gcs-location gs://cloudylabs-public/cloudylabs-pipelines/pubsub-to-bigquery.json \
--region $SERVICE_REGION \
--parameters "inputTopic=projects/${PROJECT}/topics/${SERVICE_NAME},outputTableSpec=${PROJECT}:${SERVICE_NAME}.events"
$30 is still significant for something that runs maybe a few times a day.
I see the quota point. What if we published to Pub/Sub, drained it once a day, and did a batch insert?
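To make the daily-drain idea concrete, here is a minimal sketch of the control flow, not anything this tool ships. The `pull`, `load_rows`, and `ack` callables are hypothetical stand-ins for a Pub/Sub pull, a BigQuery batch load, and a Pub/Sub acknowledge; wiring them to the real client libraries is left out on purpose. It also assumes that an empty pull means the subscription is drained, which a real Pub/Sub pull does not strictly guarantee.

```python
from typing import Callable, List, Tuple

# Hypothetical message shape for this sketch: (ack_id, decoded event row).
Message = Tuple[str, dict]


def drain_and_load(
    pull: Callable[[int], List[Message]],
    load_rows: Callable[[List[dict]], None],
    ack: Callable[[List[str]], None],
    max_messages: int = 1000,
) -> int:
    """Drain a subscription in chunks and batch-load each chunk.

    A chunk is acknowledged only after load_rows returns, so a failed
    load leaves its messages unacknowledged for the next run.
    Returns the number of rows loaded.
    """
    total = 0
    while True:
        messages = pull(max_messages)
        if not messages:
            # Assumption: an empty pull means nothing is left to drain.
            break
        rows = [row for _, row in messages]
        load_rows(rows)  # expected to raise on failure, skipping the ack
        ack([ack_id for ack_id, _ in messages])
        total += len(rows)
    return total
```

Something like this could run once a day from a scheduled job. The main trade-off is latency: events would show up in BigQuery up to a day late, and the subscription's message retention has to cover the gap between runs.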