feat(ingest): enable connection string for all sqlalchemy datasources (#4508)

* feat(ingest): enable connection string for all sqlalchemy datasources

* Update sql_common.py

* fix types

* update docs

* rename variable to sqlalchemy_uri

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Marcin Szymański 2022-04-08 04:11:52 +01:00 committed by GitHub
parent 45e09ca824
commit 7c3ad3d293
9 changed files with 17 additions and 5 deletions

@ -87,6 +87,7 @@ Note that a `.` is used to denote nested fields in the YAML recipe.
| `password` | | | ClickHouse password. |
| `host_port` | ✅ | | ClickHouse host URL. |
| `database` | | | ClickHouse database to connect. |
| `sqlalchemy_uri` | | | URI of database to connect to. See https://docs.sqlalchemy.org/en/14/core/engines.html#database-urls. Takes precedence over other connection parameters. |
| `env` | | `"PROD"` | Environment to use in namespace when constructing URNs. |
| `platform_instance` | | None | The Platform instance to use while constructing URNs. |
| `options.<option>` | | | Any options specified here will be passed to SQLAlchemy's `create_engine` as kwargs.<br />See https://docs.sqlalchemy.org/en/14/core/engines.html#sqlalchemy.create_engine for details. |

@ -54,6 +54,7 @@ As a SQL-based service, the Athena integration is also supported by our SQL prof
| `password` | | | Database password. |
| `host_port` | ✅ | | Host URL and port to connect to. |
| `database` | | | Database to ingest. |
| `sqlalchemy_uri` | | | URI of database to connect to. See https://docs.sqlalchemy.org/en/14/core/engines.html#database-urls. Takes precedence over other connection parameters. |
| `database_alias` | | | Alias to apply to database when ingesting. |
| `env` | | `"PROD"` | Environment to use in namespace when constructing URNs. |
| `platform_instance` | | None | The Platform instance to use while constructing URNs. |

@ -57,6 +57,7 @@ Note that a `.` is used to denote nested fields in the YAML recipe.
| `password` | | | MySQL password. |
| `host_port` | | `"localhost:3306"` | MySQL host URL. |
| `database` | | | MySQL database. |
| `sqlalchemy_uri` | | | URI of database to connect to. See https://docs.sqlalchemy.org/en/14/core/engines.html#database-urls. Takes precedence over other connection parameters. |
| `database_alias` | | | Alias to apply to database when ingesting. A table with urn `urn:li:dataset:(urn:li:dataPlatform:mysql,<database>.<table>,PROD)` will turn into `urn:li:dataset:(urn:li:dataPlatform:mysql,<database_alias>.<table>,PROD)`. |
| `env` | | `"PROD"` | Environment to use in namespace when constructing URNs. |
| `platform_instance` | | None | The Platform instance to use while constructing URNs. |

@ -61,6 +61,7 @@ As a SQL-based service, the Athena integration is also supported by our SQL prof
| `host_port` | | | Oracle host URL. |
| `database` | If `service_name` is not set | | If using, omit `service_name`. |
| `service_name` | If `database_alias` is not set | | Oracle service name. If using, omit `database`. |
| `sqlalchemy_uri` | | | URI of database to connect to. See https://docs.sqlalchemy.org/en/14/core/engines.html#database-urls. Takes precedence over other connection parameters. |
| `database_alias` | | | Alias to apply to database when ingesting. |
| `env` | | `"PROD"` | Environment to use in namespace when constructing URNs. |
| `options.<option>` | | | Any options specified here will be passed to SQLAlchemy's `create_engine` as kwargs.<br />See https://docs.sqlalchemy.org/en/14/core/engines.html#sqlalchemy.create_engine for details. |

@ -59,6 +59,7 @@ As a SQL-based service, the Athena integration is also supported by our SQL prof
| `password` | | | PostgreSQL password. |
| `host_port` | ✅ | | PostgreSQL host URL. |
| `database` | | | PostgreSQL database. |
| `sqlalchemy_uri` | | | URI of database to connect to. See https://docs.sqlalchemy.org/en/14/core/engines.html#database-urls. Takes precedence over other connection parameters. |
| `database_alias` | | | Alias to apply to database when ingesting. |
| `env` | | `"PROD"` | Environment to use in namespace when constructing URNs. |
| `platform_instance` | | None | The Platform instance to use while constructing URNs. |

@ -111,6 +111,7 @@ Note that a `.` is used to denote nested fields in the YAML recipe.
| `password` | | | Redshift password. |
| `host_port` | ✅ | | Redshift host URL. |
| `database` | | | Redshift database. |
| `sqlalchemy_uri` | | | URI of database to connect to. See https://docs.sqlalchemy.org/en/14/core/engines.html#database-urls. Takes precedence over other connection parameters. |
| `database_alias` | | | Alias to apply to database when ingesting. |
| `env` | | `"PROD"` | Environment to use in namespace when constructing URNs. |
| `platform_instance` | | None | The Platform instance to use while constructing URNs. |

@ -168,6 +168,7 @@ Note that a `.` is used to denote nested fields in the YAML recipe.
| `host_port` | ✅ | | Snowflake host URL. |
| `warehouse` | | | Snowflake warehouse. |
| `role` | | | Snowflake role. |
| `sqlalchemy_uri` | | | URI of database to connect to. See https://docs.sqlalchemy.org/en/14/core/engines.html#database-urls. Takes precedence over other connection parameters. |
| `env` | | `"PROD"` | Environment to use in namespace when constructing URNs. |
| `platform_instance` | | None | The Platform instance to use while constructing URNs. |
| `options.<option>` | | | Any options specified here will be passed to SQLAlchemy's `create_engine` as kwargs.<br />See https://docs.sqlalchemy.org/en/14/core/engines.html#sqlalchemy.create_engine for details. |

@ -53,6 +53,7 @@ As a SQL-based service, the Trino integration is also supported by our SQL profi
| `password` | | | Trino password. |
| `host_port` | ✅ | | Trino host URL. |
| `database` | ✅ | | Trino database (catalog). |
| `sqlalchemy_uri` | | | URI of database to connect to. See https://docs.sqlalchemy.org/en/14/core/engines.html#database-urls. Takes precedence over other connection parameters. |
| `database_alias` | | | Alias to apply to database when ingesting. |
| `env` | | `"PROD"` | Environment to use in namespace when constructing URNs. |
| `options.<option>` | | | Any options specified here will be passed to SQLAlchemy's `create_engine` as kwargs.<br />See https://docs.sqlalchemy.org/en/14/core/engines.html#sqlalchemy.create_engine for details. |
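
The `sqlalchemy_uri` rows added above let a single database URL stand in for the separate `username`, `password`, `host_port`, and `database` fields. As a rough illustration only, the sketch below splits a made-up URL of the form `dialect+driver://username:password@host:port/database` into those pieces using the Python standard library. The URL, the credentials, and the `fields` dictionary are assumptions for the example and are not part of this commit; the ingestion code itself passes the URI through unparsed.

```python
from urllib.parse import urlsplit

# Hypothetical connection URL; every value here is a placeholder.
uri = "postgresql+psycopg2://datahub:secret@localhost:5432/analytics"

parts = urlsplit(uri)

# Map the URL components onto the recipe fields they would replace.
fields = {
    "scheme": parts.scheme,
    "username": parts.username,
    "password": parts.password,
    "host_port": f"{parts.hostname}:{parts.port}",
    "database": parts.path.lstrip("/"),
}

for name, value in fields.items():
    print(f"{name}: {value}")
```

Because the URI takes precedence, any of these fields that are also set separately in the recipe would be ignored when building the connection URL.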

View File

@ -249,17 +249,21 @@ class SQLAlchemyConfig(StatefulIngestionConfigBase):
 class BasicSQLAlchemyConfig(SQLAlchemyConfig):
     username: Optional[str] = None
     password: Optional[pydantic.SecretStr] = None
-    host_port: str
+    host_port: Optional[str] = None
     database: Optional[str] = None
     database_alias: Optional[str] = None
-    scheme: str
+    scheme: Optional[str] = None
+    sqlalchemy_uri: Optional[str] = None

     def get_sql_alchemy_url(self, uri_opts=None):
-        return make_sqlalchemy_uri(
-            self.scheme,
+        if not ((self.host_port and self.scheme) or self.sqlalchemy_uri):
+            raise ValueError("host_port and schema or connect_uri required.")
+
+        return self.sqlalchemy_uri or make_sqlalchemy_uri(
+            self.scheme,  # type: ignore
             self.username,
             self.password.get_secret_value() if self.password else None,
-            self.host_port,
+            self.host_port,  # type: ignore
             self.database,
             uri_opts=uri_opts,
         )
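
The precedence rule in `get_sql_alchemy_url` can be tried in isolation with a minimal sketch like the one below. It assumes a plain dataclass in place of the pydantic config, and `make_uri` is a hypothetical stand-in for `make_sqlalchemy_uri`, whose real implementation is not shown in this diff. The error message in the sketch uses the actual field names `scheme` and `sqlalchemy_uri`, whereas the committed message refers to `schema` and `connect_uri`.

```python
from dataclasses import dataclass
from typing import Optional


def make_uri(scheme, username, password, host_port, database):
    """Hypothetical stand-in for make_sqlalchemy_uri; not the real helper."""
    url = f"{scheme}://"
    if username is not None:
        url += username
        if password is not None:
            url += f":{password}"
        url += "@"
    url += host_port
    if database:
        url += f"/{database}"
    return url


@dataclass
class ConnectionSketch:
    """Simplified model of the connection fields in BasicSQLAlchemyConfig."""

    username: Optional[str] = None
    password: Optional[str] = None
    host_port: Optional[str] = None
    database: Optional[str] = None
    scheme: Optional[str] = None
    sqlalchemy_uri: Optional[str] = None

    def get_url(self) -> str:
        # Either an explicit URI or both host_port and scheme must be set.
        if not ((self.host_port and self.scheme) or self.sqlalchemy_uri):
            raise ValueError("host_port and scheme, or sqlalchemy_uri, required.")
        # An explicit URI wins over the individually specified fields.
        return self.sqlalchemy_uri or make_uri(
            self.scheme,
            self.username,
            self.password,
            self.host_port,
            self.database,
        )


from_fields = ConnectionSketch(
    scheme="mysql+pymysql",
    username="user",
    password="pass",
    host_port="localhost:3306",
    database="db",
)
from_uri = ConnectionSketch(
    scheme="mysql+pymysql",
    host_port="localhost:3306",
    sqlalchemy_uri="mysql+pymysql://other:pw@db.example.com:3306/prod",
)
print(from_fields.get_url())
print(from_uri.get_url())
```

With both styles set, as in `from_uri`, the explicit URI is returned unchanged and `host_port` is ignored. A config with neither a URI nor a `host_port` and `scheme` pair raises `ValueError` only when the URL is requested, since both fields are now optional at validation time.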