Changes from all commits (58 commits)
93c3ec4
Temporarily skip syncing of the full signatures table.
strangeways Jan 13, 2020
330392a
Implement an intermediate Lambda that is called by the API Gateway. T…
strangeways Jan 15, 2020
d9b52cc
Merge pull request #3 from MoveOnOrg/invoke-second-lambda
strangeways May 6, 2020
5fbf607
Allow for multiple instances of loader_lambda_role
sjwmoveon Jun 5, 2020
b8247b2
Truncate strings that are too large to fit
sjwmoveon Jun 8, 2020
7641d24
Node 8.10 environment has been disabled (deprecated)
sjwmoveon Jun 16, 2020
de4423c
Merge pull request #5 from MoveOnOrg/sjwmoveon-patch-2
sjwmoveon Nov 16, 2020
8e3a248
Merge pull request #4 from MoveOnOrg/sjwmoveon-patch-1
sjwmoveon Nov 16, 2020
efba6ba
Merge branch 'master' into main
sjwmoveon Nov 16, 2020
d09cc0a
Send email clicks and opens to (optional) Kinesis streams
sjwmoveon Nov 16, 2020
8e461b5
Merge branch 'main' of https://github.com/MoveOnOrg/terraform-aws-con…
sjwmoveon Nov 16, 2020
904ad17
Remove references to deprecated invoker function
sjwmoveon Nov 16, 2020
3fdee05
Manually fix all remaining discrepancies with latest CSL code
sjwmoveon Nov 17, 2020
97475d5
Use Firehose stream (which is different from base Kinesis stream)
sjwmoveon Nov 17, 2020
dfd7d50
Merge pull request #8 from controlshift/main
sjwmoveon Feb 23, 2021
b51c15c
Load gzipped files
sjwmoveon Feb 23, 2021
fbbfeda
Undo gzip for now
sjwmoveon Mar 1, 2021
f9535a2
Merge branch 'main' of https://github.com/MoveOnOrg/terraform-aws-con…
sjwmoveon Mar 1, 2021
cb1a933
Last commit was actually backwards
sjwmoveon Mar 1, 2021
e1ce7e7
Add external_ids field to sync
sjwmoveon Apr 7, 2021
5520fbf
Add Kinesis Firehose permissions to receiver role
sjwmoveon Apr 15, 2021
c2f2cec
Typo fix
sjwmoveon Apr 15, 2021
1c56701
Don't need jid
sjwmoveon Apr 16, 2021
ae83169
Merge branch 'main' of https://github.com/controlshift/terraform-aws-…
sjwmoveon Nov 18, 2021
7d62bde
Merge upstream (from separate branch to resolve conflicts)
sjwmoveon Nov 18, 2021
2e6caa3
Merge branch 'controlshift:main' into main
sjwmoveon Nov 30, 2021
f686d45
Changing config to match CSL dev noticing discrepancy
ibrand Feb 10, 2022
00a4e9b
Merge pull request #10 from MoveOnOrg/ibrand-patch-1
ibrand Feb 10, 2022
882b5a4
Reverting to see if this will speed the terraform builds up again. Th…
ibrand Feb 11, 2022
ba96075
This is a revert of a revert. We are going to try the compression cha…
ibrand Feb 21, 2022
5d2625c
Change the daisy_chain_id_used field from a string to bigint because …
ibrand May 24, 2022
e3731c2
fix comment
ibrand May 24, 2022
ef8b40b
Merge pull request #11 from MoveOnOrg/daisy-chain-id-bigint-change
ibrand May 24, 2022
19be449
add in the lifecycle rule we currently have manually entered in the s…
ibrand Feb 20, 2023
7fa1de1
put that code in the wrong s3 section
ibrand Feb 20, 2023
4a2e1c3
add in other fields that are turning up as conflicting when we try to…
ibrand Feb 20, 2023
0a5f8f4
Merge pull request #12 from MoveOnOrg/2023feb--terraform-updates
ibrand Feb 21, 2023
f0d1ad4
Update AWS version
crayolakat May 31, 2023
86f80de
Update AWS to version 4
crayolakat May 31, 2023
2769394
Remove region attribute
crayolakat May 31, 2023
11a0045
Merge pull request #13 from MoveOnOrg/kathy-upgrade-aws-version
crayolakat Jun 6, 2023
dc96685
ilona schema changes from 11-7
ibrand Nov 14, 2023
f41c9cf
Merge pull request #15 from MoveOnOrg/glue-job-nov-14
ibrand Nov 14, 2023
513f862
Update signatures mappings
sjwmoveon Jan 11, 2024
1b3d27a
Merge pull request #17 from MoveOnOrg/sjwmoveon-patch-2
sjwmoveon Jan 11, 2024
37fef96
Update Node version to 20
sjwmoveon Jan 31, 2024
4f27f61
Back out receiver lambda changes
sjwmoveon Feb 2, 2024
82f2ad0
Merge pull request #18 from MoveOnOrg/node
sjwmoveon Feb 5, 2024
622e1c6
Update AWS version to one with support for nodejs20.x
sjwmoveon Feb 5, 2024
07f14fe
Merge pull request #19 from MoveOnOrg/awsversion
sjwmoveon Feb 5, 2024
0af0560
Downgrade to Nodejs 16.x
sjwmoveon Feb 13, 2024
ba68233
Merge pull request #20 from MoveOnOrg/awsversion
sjwmoveon Feb 13, 2024
6241db0
Update Redshift loader version
sjwmoveon Feb 15, 2024
60ae62d
Merge pull request #21 from MoveOnOrg/awsversion
sjwmoveon Feb 15, 2024
4f44b25
Update minimum required Terraform version and resolve various depreca…
strangeways Jan 6, 2026
96401a9
Reorder sections to more closely match Controlshift's version.
strangeways Jan 14, 2026
c43d3c9
Fix typo
strangeways Jan 14, 2026
31280bc
Merge pull request #22 from MoveOnOrg/use-templatefile-function
strangeways Feb 2, 2026
2 changes: 1 addition & 1 deletion config_item.json
@@ -23,5 +23,5 @@
"failureTopicARN": {"S": "${failure_topic_arn}"},
"batchSize": {"N": "1"},
"currentBatch": {"S": "${current_batch}"},
"compress": {"S": "${compress}"}
"compression": {"S": "${compress}"}
}
108 changes: 53 additions & 55 deletions config_table.tf
@@ -13,12 +13,12 @@ resource "aws_dynamodb_table" "loader_config" {
}

resource "aws_dynamodb_table_item" "load_config_full_items" {
for_each = toset([for table in local.parsed_bulk_data_schemas["tables"] : table["table"]["name"]])
for_each = toset(local.table_names)

table_name = aws_dynamodb_table.loader_config.name
hash_key = aws_dynamodb_table.loader_config.hash_key

item = data.template_file.loader_config_full_item[each.key].rendered
item = local.loader_config_full_items[each.key]

lifecycle {
ignore_changes = [
@@ -33,39 +33,13 @@ resource "aws_dynamodb_table_item" "load_config_full_items" {
}
}

data "template_file" "loader_config_full_item" {
for_each = toset([for table in local.parsed_bulk_data_schemas["tables"] : table["table"]["name"]])

template = "${file("${path.module}/config_item.json")}"
vars = {
kind = "full"
bulk_data_table = each.key
redshift_endpoint = data.aws_redshift_cluster.sync_data_target.endpoint
redshift_database_name: var.redshift_database_name
redshift_port = data.aws_redshift_cluster.sync_data_target.port
redshift_username = var.redshift_username
redshift_password = aws_kms_ciphertext.redshift_password.ciphertext_blob
schema = var.redshift_schema
s3_bucket = "agra-data-exports-${var.controlshift_environment}"
manifest_bucket = aws_s3_bucket.manifest.bucket
manifest_prefix = var.manifest_prefix
failed_manifest_prefix = var.failed_manifest_prefix
success_topic_arn = aws_sns_topic.success_sns_topic.arn
failure_topic_arn = aws_sns_topic.failure_sns_topic.arn
current_batch = random_id.current_batch.b64_url
column_list = data.http.column_list[each.key].body
truncate_target = true
compress = try(local.parsed_bulk_data_schemas["settings"]["compression_format"], "")
}
}

resource "aws_dynamodb_table_item" "load_config_incremental_items" {
for_each = toset([for table in local.parsed_bulk_data_schemas["tables"] : table["table"]["name"]])
for_each = toset(local.table_names)

table_name = aws_dynamodb_table.loader_config.name
hash_key = aws_dynamodb_table.loader_config.hash_key

item = data.template_file.loader_config_incremental_item[each.key].rendered
item = local.loader_config_incremental_items[each.key]

lifecycle {
ignore_changes = [
@@ -80,29 +54,53 @@ resource "aws_dynamodb_table_item" "load_config_incremental_items" {
}
}

data "template_file" "loader_config_incremental_item" {
for_each = toset([for table in local.parsed_bulk_data_schemas["tables"] : table["table"]["name"]])

template = "${file("${path.module}/config_item.json")}"
vars = {
kind = "incremental"
bulk_data_table = each.key
redshift_endpoint = data.aws_redshift_cluster.sync_data_target.endpoint
redshift_database_name: var.redshift_database_name
redshift_port = data.aws_redshift_cluster.sync_data_target.port
redshift_username = var.redshift_username
redshift_password = aws_kms_ciphertext.redshift_password.ciphertext_blob
schema = var.redshift_schema
s3_bucket = "agra-data-exports-${var.controlshift_environment}"
manifest_bucket = aws_s3_bucket.manifest.bucket
manifest_prefix = var.manifest_prefix
failed_manifest_prefix = var.failed_manifest_prefix
success_topic_arn = aws_sns_topic.success_sns_topic.arn
failure_topic_arn = aws_sns_topic.failure_sns_topic.arn
current_batch = random_id.current_batch.b64_url
column_list = data.http.column_list[each.key].body
truncate_target = false
compress = try(local.parsed_bulk_data_schemas["settings"]["compression_format"], "")
locals {
table_names = [for table in local.parsed_bulk_data_schemas["tables"] : table["table"]["name"]]

loader_config_full_items = {
for name in local.table_names : name => templatefile("${path.module}/config_item.json", {
kind = "full"
bulk_data_table = name
redshift_endpoint = data.aws_redshift_cluster.sync_data_target.endpoint
redshift_database_name = var.redshift_database_name
redshift_port = data.aws_redshift_cluster.sync_data_target.port
redshift_username = var.redshift_username
redshift_password = aws_kms_ciphertext.redshift_password.ciphertext_blob
schema = var.redshift_schema
s3_bucket = "agra-data-exports-${var.controlshift_environment}"
manifest_bucket = aws_s3_bucket.manifest.bucket
manifest_prefix = var.manifest_prefix
failed_manifest_prefix = var.failed_manifest_prefix
success_topic_arn = aws_sns_topic.success_sns_topic.arn
failure_topic_arn = aws_sns_topic.failure_sns_topic.arn
current_batch = random_id.current_batch.b64_url
column_list = data.http.column_list[name].body
truncate_target = true
compress = try(local.parsed_bulk_data_schemas["settings"]["compression_format"], "")
})
}

loader_config_incremental_items = {
for name in local.table_names : name => templatefile("${path.module}/config_item.json", {
kind = "incremental"
bulk_data_table = name
redshift_endpoint = data.aws_redshift_cluster.sync_data_target.endpoint
redshift_database_name = var.redshift_database_name
redshift_port = data.aws_redshift_cluster.sync_data_target.port
redshift_username = var.redshift_username
redshift_password = aws_kms_ciphertext.redshift_password.ciphertext_blob
schema = var.redshift_schema
s3_bucket = "agra-data-exports-${var.controlshift_environment}"
manifest_bucket = aws_s3_bucket.manifest.bucket
manifest_prefix = var.manifest_prefix
failed_manifest_prefix = var.failed_manifest_prefix
success_topic_arn = aws_sns_topic.success_sns_topic.arn
failure_topic_arn = aws_sns_topic.failure_sns_topic.arn
current_batch = random_id.current_batch.b64_url
column_list = data.http.column_list[name].body
truncate_target = false
compress = try(local.parsed_bulk_data_schemas["settings"]["compression_format"], "")
})
}
}

@@ -134,11 +132,11 @@ data "http" "bulk_data_schemas" {
}

locals {
parsed_bulk_data_schemas = jsondecode(data.http.bulk_data_schemas.body)
parsed_bulk_data_schemas = jsondecode(data.http.bulk_data_schemas.response_body)
}

data "http" "column_list" {
for_each = toset([for table in local.parsed_bulk_data_schemas["tables"] : table["table"]["name"]])
for_each = toset(local.table_names)

url = "https://${var.controlshift_hostname}/api/bulk_data/schema/columns?table=${each.key}"
}
69 changes: 57 additions & 12 deletions glue_job.tf
@@ -10,6 +10,14 @@ resource "aws_glue_crawler" "signatures_crawler" {
database_name = aws_glue_catalog_database.catalog_db.name
name = "${var.controlshift_environment}_full_signatures"
role = aws_iam_role.glue_service_role.arn
configuration = jsonencode(
{
Grouping = {
TableGroupingPolicy = "CombineCompatibleSchemas"
}
Version = 1
}
)

s3_target {
path = local.signatures_s3_path
@@ -18,34 +26,69 @@

resource "aws_s3_bucket" "glue_resources" {
bucket = var.glue_scripts_bucket_name
region = var.aws_region
}

# Ownership controls block is required to support ACLs.
resource "aws_s3_bucket_ownership_controls" "glue_resources" {
bucket = aws_s3_bucket.glue_resources.id
rule {
object_ownership = "ObjectWriter"
}
}

resource "aws_s3_bucket_acl" "glue_resources" {
depends_on = [aws_s3_bucket_ownership_controls.glue_resources]

bucket = aws_s3_bucket.glue_resources.id
acl = "private"
server_side_encryption_configuration {
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "AES256"
}
}

resource "aws_s3_bucket_server_side_encryption_configuration" "glue_resources" {
bucket = aws_s3_bucket.glue_resources.id

rule {
apply_server_side_encryption_by_default {
sse_algorithm = "AES256"
}
}
}

data "template_file" "signatures_script" {
template = file("${path.module}/templates/signatures_job.py.tpl")
vars = {
resource "aws_s3_bucket_lifecycle_configuration" "glue_resources" {
bucket = aws_s3_bucket.glue_resources.id

rule {
id = "Remove temp files over a week old"
status = "Enabled"

filter {
prefix = "production/temp/"
}

expiration {
days = 7
}

abort_incomplete_multipart_upload {
days_after_initiation = 7 # Note: must be greater than 0
}
}
}

locals {
signatures_script = templatefile("${path.module}/templates/signatures_job.py.tpl", {
catalog_database_name = aws_glue_catalog_database.catalog_db.name
redshift_database_name = var.redshift_database_name
redshift_schema = var.redshift_schema
redshift_connection_name = aws_glue_connection.redshift_connection.name
}
})
}

resource "aws_s3_bucket_object" "signatures_script" {
resource "aws_s3_object" "signatures_script" {
bucket = aws_s3_bucket.glue_resources.id
key = "${var.controlshift_environment}/signatures_job.py"
acl = "private"

content = data.template_file.signatures_script.rendered
content = local.signatures_script
}

resource "aws_iam_role" "glue_service_role" {
@@ -134,6 +177,8 @@ resource "aws_glue_job" "signatures_full" {
name = "cs-${var.controlshift_environment}-signatures-full"
connections = [ aws_glue_connection.redshift_connection.name ]
glue_version = "3.0"
number_of_workers = 9
worker_type = "G.1X"
default_arguments = {
"--TempDir": "s3://${aws_s3_bucket.glue_resources.bucket}/${var.controlshift_environment}/temp",
"--job-bookmark-option": "job-bookmark-disable",
16 changes: 16 additions & 0 deletions iam.tf
@@ -34,6 +34,22 @@ data "aws_iam_policy_document" "receiver_execution_policy" {
resources = ["arn:aws:sqs:${var.aws_region}:*:${aws_sqs_queue.receiver_queue.name}",
"arn:aws:sqs:${var.aws_region}:*:${aws_sqs_queue.receiver_queue_glue.name}"]
}

# allow the receiver lambda to send messages to Firehose streams
statement {
effect = "Allow"
actions = [
"firehose:DeleteDeliveryStream",
"firehose:PutRecord",
"firehose:StartDeliveryStreamEncryption",
"firehose:CreateDeliveryStream",
"firehose:PutRecordBatch",
"firehose:StopDeliveryStreamEncryption",
"firehose:UpdateDestination"
]
resources = ["arn:aws:firehose:${var.aws_region}:*:deliverystream/${var.email_open_firehose_stream}",
"arn:aws:firehose:${var.aws_region}:*:deliverystream/${var.email_click_firehose_stream}"]
}
}

resource "aws_iam_role_policy" "lambda_receiver" {
25 changes: 23 additions & 2 deletions lambdas/receiver.js
@@ -1,13 +1,28 @@
'use strict';

const AWS = require('aws-sdk');

// Set the region
AWS.config.update({region: process.env.AWS_REGION});

// Create an SQS service object
const sqs = new AWS.SQS();

// Create a Firehose service object
const firehose = new AWS.Firehose();

function putFirehose(data, stream) {
let params = {
DeliveryStreamName: stream,
Record:{
Data: data
}
};
firehose.putRecord(params, function(err, data) {
if (err) console.log(err, err.stack);
else console.log('Record added:',data);
});
}

async function enqueueTask(receivedData, kind) {
console.log("Processing: " + receivedData.url);

@@ -21,7 +36,7 @@ async function enqueueTask(receivedData, kind) {

messageBody['kind'] = kind;

const jsonMessageBody = JSON.stringify(messageBody)
const jsonMessageBody = JSON.stringify(messageBody);

const loaderQueueParams = {
MessageBody: jsonMessageBody,
@@ -69,6 +84,12 @@ exports.handler = async (event) => {
} else if(receivedJSON.type === 'data.incremental_table_exported'){
await enqueueTask(receivedJSON.data, 'incremental');
return sendResponse({"status": "processed"});
} else if(receivedJSON.type === 'email.open' && process.env.EMAIL_OPEN_FIREHOSE_STREAM !== null && process.env.EMAIL_OPEN_FIREHOSE_STREAM !== ''){
await putFirehose(JSON.stringify(receivedJSON.data), process.env.EMAIL_OPEN_FIREHOSE_STREAM);
return sendResponse({"status": "processed"});
} else if(receivedJSON.type === 'email.click' && process.env.EMAIL_CLICK_FIREHOSE_STREAM !== null && process.env.EMAIL_CLICK_FIREHOSE_STREAM !== ''){
await putFirehose(JSON.stringify(receivedJSON.data), process.env.EMAIL_CLICK_FIREHOSE_STREAM);
return sendResponse({"status": "processed"});
} else {
return Promise.resolve(sendResponse({"status": "skipped", "payload": receivedJSON}));
}
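Note on the new putFirehose helper above: firehose.putRecord is invoked with a callback and returns nothing awaitable, so the handler's await putFirehose(...) does not actually wait for the record to be written before the Lambda returns. A minimal sketch of a promise-returning variant, assuming the .promise() helper that aws-sdk v2 request objects expose (not part of this diff):

function putFirehose(data, stream) {
  const params = {
    DeliveryStreamName: stream,
    Record: {
      Data: data
    }
  };
  // Returning the promise lets the caller's await actually wait for delivery (sketch only).
  return firehose.putRecord(params).promise()
    .then((result) => console.log('Record added:', result))
    .catch((err) => console.log(err, err.stack));
}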
4 changes: 2 additions & 2 deletions loader.tf
@@ -1,10 +1,10 @@
resource "aws_lambda_function" "loader" {
s3_bucket = local.lambda_buckets[var.aws_region]
s3_key = "LambdaRedshiftLoader/AWSLambdaRedshiftLoader-2.7.8.zip"
s3_key = "LambdaRedshiftLoader/AWSLambdaRedshiftLoader-2.8.3.zip"
function_name = "controlshift-redshift-loader"
role = aws_iam_role.loader_lambda_role.arn
handler = "index.handler"
runtime = "nodejs12.x"
runtime = "nodejs16.x"
timeout = 900

vpc_config {
9 changes: 8 additions & 1 deletion receiver.tf
@@ -9,16 +9,23 @@ resource "aws_lambda_function" "receiver_lambda" {
function_name = "controlshift-webhook-handler"
role = aws_iam_role.receiver_lambda_role.arn
handler = "receiver.handler"
runtime = "nodejs12.x"
runtime = "nodejs16.x"
timeout = var.receiver_timeout
source_code_hash = data.archive_file.receiver_zip.output_base64sha256

environment {
variables = {
SQS_QUEUE_URL = aws_sqs_queue.receiver_queue.id
GLUE_SQS_QUEUE_URL = aws_sqs_queue.receiver_queue_glue.id
EMAIL_OPEN_FIREHOSE_STREAM = var.email_open_firehose_stream
EMAIL_CLICK_FIREHOSE_STREAM = var.email_click_firehose_stream
}
}

// This prevents noisy logs from cluttering up datadog
tags = {
datadog = "exclude"
}
}

resource "aws_api_gateway_rest_api" "receiver" {
2 changes: 1 addition & 1 deletion run_glue_crawler.tf
@@ -10,7 +10,7 @@ resource "aws_lambda_function" "glue_crawler_lambda" {
function_name = "controlshift-run-glue-crawler"
role = aws_iam_role.run_glue_crawler_lambda_role.arn
handler = "run-glue-crawler.handler"
runtime = "nodejs12.x"
runtime = "nodejs16.x"
timeout = 60
source_code_hash = data.archive_file.run_glue_crawler_zip.output_base64sha256

2 changes: 1 addition & 1 deletion run_glue_job.tf
@@ -10,7 +10,7 @@ resource "aws_lambda_function" "glue_job_lambda" {
function_name = "controlshift-run-glue-job"
role = aws_iam_role.run_glue_job_lambda_role.arn
handler = "run-glue-job.handler"
runtime = "nodejs12.x"
runtime = "nodejs16.x"
timeout = 60
source_code_hash = data.archive_file.run_glue_job_zip.output_base64sha256
