# azure-datalake

Terraform Registry: `datarootsio/azure-datalake/module`

Terraform module for an Azure Data Lake.
This is a Terraform module that deploys a complete and opinionated data lake network on Microsoft Azure.

## Components

- Azure Data Factory for data ingestion from various sources
- Azure Data Lake Storage Gen2 containers to store data for the data lake layers
- Azure Databricks to clean and transform the data
- Azure Synapse Analytics to store presentation data
- Azure Cosmos DB to store metadata
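A minimal usage sketch follows; the module source string is the registry path above, all values are illustrative placeholders, and the referenced `var.*` variables are assumptions you would declare yourself:

```hcl
module "datalake" {
  source = "datarootsio/azure-datalake/module"

  # Required inputs (illustrative values – adjust to your environment)
  data_lake_name      = "examplelake" # has to be globally unique
  resource_group_name = "rg-datalake"
  region              = "westeurope"
  storage_replication = "LRS"
  data_warehouse_dtu  = "DW100c"

  sql_server_admin_username = "sqladmin"
  sql_server_admin_password = var.sql_admin_password

  # Existing service principal used by the module for communication
  service_principal_client_id     = var.sp_client_id
  service_principal_client_secret = var.sp_client_secret
  service_principal_object_id     = var.sp_object_id
}
```

The full set of inputs is listed below.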
## Inputs

| Name | Type | Description | Default |
|---|---|---|---|
| key_vault_depends_on | string | Optionally set to a dependency for the Key Vault secrets (e.g. access policy) | required |
| data_warehouse_dtu | string | Service objective (DTU) for the created data warehouse (e.g. DW100c) | required |
| storage_replication | string | Type of replication for the storage accounts. See https://www.terraform.io/docs/ | required |
| sql_server_admin_password | string | Password of the administrator of the SQL server | required |
| data_factory_vsts_tenant_id | string | Optional tenant ID for the VSTS back-end for the created Azure Data Factory. | required |
| data_lake_name | string | Name of the data lake (has to be globally unique) | required |
| service_principal_client_id | string | Client ID of the existing service principal that will be used for communication | required |
| service_principal_client_secret | string | Client secret of the existing service principal that will be used for communication | required |
| service_principal_object_id | string | Object ID of the existing service principal that will be used for communication | required |
| resource_group_name | string | Name of the resource group where the resources should be created | required |
| sql_server_admin_username | string | Username of the administrator of the SQL server | required |
| region | string | Region in which to create the resources | required |
| use_key_vault | bool | Set this to true to enable the usage of your existing Key Vault | false |
| data_factory_vsts_project_name | string | Optional project name for the VSTS back-end for the created Azure Data Factory. | "" |
| data_factory_vsts_repository_name | string | Optional repository name for the VSTS back-end for the created Azure Data Factory. | "" |
| data_factory_github_repository_name | string | Optional repository name for the GitHub back-end for the created Azure Data Factory. | "" |
| dl_acl | map(string) | Optional set of ACLs to set on the filesystem roots inside the data lake. | {} |
| provision_synapse | bool | Set this to false to disable the creation of the Synapse Analytics instance. | true |
| databricks_workspace_name | string | Name of an existing Databricks workspace to use, needed due to changes in how Terraform modules can use provider configurations. | "" |
| provision_databricks_resources | bool | Set this to true to provision all Databricks related resources. | false |
| data_factory_github_branch_name | string | Optional branch name for the GitHub back-end for the created Azure Data Factory. | "" |
| databricks_max_workers | number | Maximum amount of workers in an active cluster | 4 |
| data_factory_vsts_account_name | string | Optional account name for the VSTS back-end for the created Azure Data Factory. | "" |
| data_factory_vsts_root_folder | string | Optional root folder for the VSTS back-end for the created Azure Data Factory. | "" |
| dl_directories | map(map(string)) | Optional root directories to be created inside the data lake. | {} |
| data_lake_filesystems | list(string) | A list of filesystems to create inside the storage account | ["raw", "clean", "curated", …] |
| log_analytics_workspace_id | string | Optional Log Analytics Workspace ID where logs are stored | "" |
| databricks_cluster_node_type | string | Node type of the Databricks cluster machines | "Standard_F4s" |
| key_vault_id | string | ID of the optional Key Vault. The module will store all relevant secrets inside it. | "" |
| data_factory_github_git_url | string | Optional Git URL (either https://github.mycompany.com or https://github.com) for the GitHub back-end for the created Azure Data Factory. | "" |
| databricks_cluster_version | string | Runtime version of the Databricks cluster | "7.2.x-scala2.12" |
| databricks_workspace_resource_group_name | string | Resource group name of an existing Databricks workspace to use, needed due to changes in how Terraform modules can use provider configurations. | "" |
*…and 3 more inputs.*
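The optional toggles in the table compose with the required inputs; for example, a sketch enabling the Key Vault integration and the Databricks resources (the `azurerm_key_vault` and access-policy resources referenced here are assumptions for illustration, not created by this module):

```hcl
module "datalake" {
  source = "datarootsio/azure-datalake/module"
  # required inputs omitted for brevity

  # Store generated secrets in an existing Key Vault; wait for the
  # access policy so secret writes do not race the policy creation
  use_key_vault        = true
  key_vault_id         = azurerm_key_vault.main.id
  key_vault_depends_on = azurerm_key_vault_access_policy.terraform.id

  # Provision the Databricks-related resources
  provision_databricks_resources = true
  databricks_cluster_node_type   = "Standard_F4s"
  databricks_cluster_version     = "7.2.x-scala2.12"
  databricks_max_workers         = 8
}
```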
## Outputs

- `storage_dfs_endpoint` — Primary DFS endpoint of the created storage account
- `storage_account_name` — Name of the created storage account for ADLS
- `data_factory_name` — Name of the created Data Factory
- `data_factory_id` — Resource ID of the Data Factory
- `sql_dw_server_hostname` — Name of the SQL server that hosts the Azure Synapse Analytics instance
- `sql_dw_server_database` — Name of the Azure Synapse Analytics instance
- `data_factory_identity` — Object ID of the managed identity of the created Data Factory
- `name` — Name of the data lake
- `created_key_vault_secrets` — Secrets that have been created inside the optional Key Vault, with their versions
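These outputs can be re-exported or consumed by the surrounding configuration; a brief sketch (the `module.datalake` label is an assumption matching the usage example):

```hcl
output "dfs_endpoint" {
  description = "Primary DFS endpoint of the data lake storage account"
  value       = module.datalake.storage_dfs_endpoint
}

output "adf_identity" {
  description = "Object ID of the Data Factory managed identity"
  value       = module.datalake.data_factory_identity
}
```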