airbyte
- airbytehq/airbyte
- MIT+ELv2, Java+Python+TypeScript
- ETL
- Uses Temporal for scheduling
- Source
- APIs - GitHub, GitLab, Notion
- Database - PostgreSQL, MySQL, ClickHouse, CockroachDB, MongoDB
- Files
- Format - CSV, JSON, JSONL, Excel, Parquet, Feather
- Provider - HTTP, S3, SFTP, SSH SCP
- Queue - Kafka
- RESTful
- Destination
- Kafka, MQTT, RabbitMQ, Pulsar, Redis
- S3, CSV, JSON, SFTP
- MySQL, PostgreSQL, Cassandra, Elasticsearch, MongoDB, Meilisearch
- CDC
- PostgreSQL - wal2json, pgoutput
- MySQL
- MSSQL
- References
caution
- The API in the open-source version currently has no authentication
Docker
- docker-compose.yaml
- airbyte/scheduler
- airbyte/webapp
- airbyte/server
- airbyte/temporal
- Based on the official Temporal scripts
- Rebuilt to add M1 support
- temporal/dynamicconfig/development.yaml
- limit.blobSize.warn=10MB - default 512KB
- limit.blobSize.error=15MB - default 2MB
- airbyte/worker
- airbyte/db - PostgreSQL
- airbyte/bootloader
- airbyte/init
- docker-compose.debug.yaml
- temporalio/web
Kubernetes
- Known Issues
- Helm charts/airbyte
- charts
- bitnami-common
- postgresql
- minio
- temporal - built-in
- charts
- Kustomization resources
Notes
- Scheduler
- API -> Temporal
- References
field | desc |
---|---|
_airbyte_ab_id | uuid |
_airbyte_emitted_at | timestamp-millis |
_airbyte_additional_properties | map of string |
_airbyte_data | the actual record data |
_airbyte_normalized_at | |
_ab_cdc_updated_at | |
_ab_cdc_deleted_at | |
_ab_cdc_lsn | PostgreSQL, MSSQL CDC |
_ab_cdc_log_file | MySQL CDC |
_ab_cdc_log_pos | MySQL CDC |
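
To make the raw columns above concrete, here is a minimal Python sketch of unwrapping one raw row into its payload plus metadata. It assumes `_airbyte_emitted_at` arrives as an integer of epoch milliseconds and `_airbyte_data` as a JSON string or an already-parsed dict; the function name `unwrap_raw_record` and the sample row are hypothetical, not part of Airbyte.

```python
import json
from datetime import datetime, timezone


def unwrap_raw_record(row: dict) -> dict:
    """Flatten one raw row: payload fields plus Airbyte metadata columns."""
    data = row["_airbyte_data"]
    # Assumption: the payload is either a JSON string or an already-parsed dict.
    if isinstance(data, str):
        data = json.loads(data)
    # Assumption: emitted_at is epoch milliseconds (the table says timestamp-millis).
    emitted_at = datetime.fromtimestamp(
        row["_airbyte_emitted_at"] / 1000, tz=timezone.utc
    )
    return {
        **data,
        "_airbyte_ab_id": row["_airbyte_ab_id"],
        "_airbyte_emitted_at": emitted_at,
    }


# Hypothetical sample row, for illustration only.
raw_row = {
    "_airbyte_ab_id": "7f1c2b1e-0000-4000-8000-000000000001",
    "_airbyte_emitted_at": 1700000000000,
    "_airbyte_data": '{"id": 1, "name": "alice"}',
}
record = unwrap_raw_record(raw_row)
```

In CDC mode the `_ab_cdc_*` columns would presumably travel inside the payload the same way, so they come out of the `**data` spread without special handling.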
Integrations
FAQ
When to use CDC
- You need to capture deletes
- The data volume is large
- The table has no column usable for incremental sync, for example updated_at
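
The first point can be illustrated with a small simulation. This is a sketch of generic cursor-based incremental sync, not Airbyte's implementation; `incremental_sync` and the sample rows are hypothetical. It shows that a sync keyed on `updated_at` picks up updates but never learns that a row was deleted, which is the gap CDC closes.

```python
def incremental_sync(source_rows, cursor):
    """Return rows with updated_at > cursor, plus the advanced cursor."""
    changed = [r for r in source_rows if r["updated_at"] > cursor]
    new_cursor = max((r["updated_at"] for r in changed), default=cursor)
    return changed, new_cursor


# First sync: both rows are copied to the replica.
source = [
    {"id": 1, "updated_at": 10},
    {"id": 2, "updated_at": 20},
]
replica = {}
rows, cursor = incremental_sync(source, cursor=0)
replica.update({r["id"]: r for r in rows})

# Upstream change: row 1 is updated, row 2 is deleted.
source = [{"id": 1, "updated_at": 30}]
rows, cursor = incremental_sync(source, cursor)
replica.update({r["id"]: r for r in rows})

# The deleted row is still present in the replica.
stale_ids = sorted(set(replica) - {r["id"] for r in source})
```

After the second sync `stale_ids` contains row 2: the cursor query only sees rows that still exist, so the delete never reaches the replica.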