Supported engines
- Spark
- Flink
- SeaTunnel Zeta
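Key features
- Batch processing
- Exactly-once processing
- Column projection
- Parallel processing
- Support for user-defined splits
- Support for SQL queries, which can also be used to achieve projection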
Description
Read data from an external data source through JDBC.
Supported data source info
Datasource | Supported versions | Driver | Url | Maven |
---|---|---|---|---|
Vertica | Different dependency versions have different driver classes. | com.vertica.jdbc.Driver | jdbc:vertica://localhost:5433/vertica | Download |
Database dependency
Please download the driver listed under 'Maven' and copy it to the '$SEATUNNEL_HOME/plugins/jdbc/lib/' working directory.
For example, for the Vertica data source: cp vertica-jdbc-xxx.jar $SEATUNNEL_HOME/plugins/jdbc/lib/
Data type mapping
Vertica Data type | SeaTunnel Data type |
---|---|
BIT | BOOLEAN |
TINYINT, TINYINT UNSIGNED, SMALLINT, SMALLINT UNSIGNED, MEDIUMINT, MEDIUMINT UNSIGNED, INT, INTEGER, YEAR | INT |
INT UNSIGNED, INTEGER UNSIGNED, BIGINT | LONG |
BIGINT UNSIGNED | DECIMAL(20,0) |
DECIMAL(x,y) (the designated column's specified column size < 38) | DECIMAL(x,y) |
DECIMAL(x,y) (the designated column's specified column size > 38) | DECIMAL(38,18) |
DECIMAL UNSIGNED | DECIMAL((the designated column's specified column size) + 1, (the designated column's number of digits to the right of the decimal point)) |
FLOAT, FLOAT UNSIGNED | FLOAT |
DOUBLE, DOUBLE UNSIGNED | DOUBLE |
CHAR, VARCHAR, TINYTEXT, MEDIUMTEXT, TEXT, LONGTEXT, JSON | STRING |
DATE | DATE |
TIME | TIME |
DATETIME, TIMESTAMP | TIMESTAMP |
TINYBLOB, MEDIUMBLOB, BLOB, LONGBLOB, BINARY, VARBINARY, BIT(n) | BYTES |
GEOMETRY, UNKNOWN | Not supported yet |
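The DECIMAL rows in the table above can be read as a small decision rule. The following is a minimal sketch of that rule only, not the connector's actual implementation. The function name `map_decimal` is hypothetical, and the handling of a column size of exactly 38 (passed through unchanged) is an assumption, since the table does not cover that case.

```python
def map_decimal(precision: int, scale: int, unsigned: bool = False) -> str:
    """Return the SeaTunnel type name for a source DECIMAL column.

    Hypothetical helper that mirrors the DECIMAL rows of the mapping table.
    """
    if unsigned:
        # DECIMAL UNSIGNED: column size + 1, scale unchanged
        return f"DECIMAL({precision + 1},{scale})"
    if precision > 38:
        # Column size above 38 falls back to DECIMAL(38,18)
        return "DECIMAL(38,18)"
    # Column size below 38 is passed through; treating exactly 38
    # the same way is an assumption, the table does not specify it.
    return f"DECIMAL({precision},{scale})"


if __name__ == "__main__":
    for args in [(10, 2), (40, 5), (10, 2, True)]:
        print(args, "->", map_decimal(*args))
```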
Source options
Name | Type | Required | Default | Description |
---|---|---|---|---|
url | String | Yes | - | The URL of the JDBC connection, for example: jdbc:vertica://localhost:5433/vertica |
driver | String | Yes | - | The JDBC class name used to connect to the remote data source. For Vertica, the value is com.vertica.jdbc.Driver. |
user | String | No | - | Connection instance user name |
password | String | No | - | Connection instance password |
query | String | Yes | - | Query statement |
connection_check_timeout_sec | Int | No | 30 | The time in seconds to wait for the database operation used to validate the connection to complete |
partition_column | String | No | - | The column used to partition reads for parallelism. Only a numeric primary-key column is supported, and only one column can be configured. |
partition_lower_bound | Long | No | - | The minimum value of partition_column for the scan. If not set, SeaTunnel queries the database to get the minimum value. |
partition_upper_bound | Long | No | - | The maximum value of partition_column for the scan. If not set, SeaTunnel queries the database to get the maximum value. |
partition_num | Int | No | job parallelism | The number of partitions; only positive integers are supported. The default value is the job parallelism. |
fetch_size | Int | No | 0 | For queries that return a large number of objects, you can configure the row fetch size used in the query to improve performance by reducing the number of database hits required to satisfy the selection criteria. Zero means use the JDBC default value. |
common-options | | No | - | Source plugin common parameters; please refer to Source Common Options for details. |
Tips
If partition_column is not set, the job runs with a single concurrency; if partition_column is set, the read is executed in parallel according to the concurrency of the tasks.
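One way to read this tip together with the partition_num default in the options table is sketched below. The function `effective_splits` is a hypothetical illustration of that reading, not code from SeaTunnel, and the connector's real behaviour may differ in detail.

```python
from typing import Optional


def effective_splits(partition_column: Optional[str],
                     partition_num: Optional[int],
                     job_parallelism: int) -> int:
    """Return how many partitions a read would be divided into.

    Hypothetical helper based only on the tip and the options table.
    """
    if partition_column is None:
        # No partition column: the read runs with a single concurrency
        return 1
    # partition_num defaults to the job parallelism when it is not set
    count = job_parallelism if partition_num is None else partition_num
    if count <= 0:
        raise ValueError("partition_num must be a positive integer")
    return count


if __name__ == "__main__":
    print(effective_splits(None, 10, 2))
    print(effective_splits("id", None, 2))
    print(effective_splits("id", 10, 2))
```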
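Task example
Simple:
This example queries 16 rows from the type_bin table in your test database with a single parallel task and reads all of its fields. You can also specify which fields to query. The final output is printed to the console.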
env {
# You can set Flink configuration here
execution.parallelism = 2
job.mode = "BATCH"
}
source{
Jdbc {
url = "jdbc:vertica://localhost:5433/vertica"
driver = "com.vertica.jdbc.Driver"
connection_check_timeout_sec = 100
user = "root"
password = "123456"
query = "select * from type_bin limit 16"
}
}
transform {
# If you would like more information about how to configure SeaTunnel and see the full list of transform plugins,
# please go to https://seatunnel.apache.org/docs/transform-v2/sql
}
sink {
Console {}
}
Parallel:
Read the queried table in parallel, using the shard field and the number of shards you configure. Use this approach if you want to read the whole table.
source {
Jdbc {
url = "jdbc:vertica://localhost:5433/vertica"
driver = "com.vertica.jdbc.Driver"
connection_check_timeout_sec = 100
user = "root"
password = "123456"
# Define the query logic as required
query = "select * from type_bin"
# Column used to shard the parallel read
partition_column = "id"
# Number of shards
partition_num = 10
}
}
Parallel boundary:
Specifying upper and lower bounds for the query is more efficient: the data source is read only within the bounds you configure.
source {
Jdbc {
url = "jdbc:vertica://localhost:5433/vertica"
driver = "com.vertica.jdbc.Driver"
connection_check_timeout_sec = 100
user = "root"
password = "123456"
# Define the query logic as required
query = "select * from type_bin"
partition_column = "id"
# Lower bound of the read
partition_lower_bound = 1
# Upper bound of the read
partition_upper_bound = 500
partition_num = 10
}
}
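To make the effect of the bounds concrete, here is a minimal sketch of how a range such as 1 to 500 could be divided into 10 contiguous slices, one per partition. It is an illustration under stated assumptions, not SeaTunnel's actual splitting algorithm: the helpers `split_range` and `partition_queries`, the even-split strategy, and the shape of the generated SQL are all hypothetical.

```python
from typing import List, Tuple


def split_range(lower: int, upper: int, num: int) -> List[Tuple[int, int]]:
    """Split the inclusive range [lower, upper] into at most `num`
    contiguous inclusive sub-ranges of near-equal size."""
    if num <= 0:
        raise ValueError("num must be a positive integer")
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    total = upper - lower + 1
    num = min(num, total)  # never create empty sub-ranges
    base, extra = divmod(total, num)
    ranges = []
    start = lower
    for i in range(num):
        # The first `extra` sub-ranges take one additional value
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def partition_queries(query: str, column: str,
                      lower: int, upper: int, num: int) -> List[str]:
    """Wrap `query` once per sub-range (hypothetical SQL shape)."""
    return [
        f"SELECT * FROM ({query}) AS t WHERE {column} BETWEEN {lo} AND {hi}"
        for lo, hi in split_range(lower, upper, num)
    ]


if __name__ == "__main__":
    # Values taken from the parallel boundary example above
    for sql in partition_queries("select * from type_bin", "id", 1, 500, 10):
        print(sql)
```

With the values from the example, each of the 10 slices covers 50 consecutive id values, so no slice has to scan outside the configured bounds.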
This article was published with support from 白鯨開源.