
airbyte

  • airbytehq/airbyte
    • MIT+ELv2, Java+Python+TypeScript
    • ETL
    • Uses Temporal for scheduling
    • Source
      • APIs - Github, Gitlab, Notion
      • Database - PostgreSQL, MySQL, ClickHouse, CockroachDB, MongoDB
      • Files
        • Format - CSV, JSON, jsonl, excel, parquet, feather
        • Provider - HTTP, S3, SFTP, SSH SCP
      • Queue - Kafka
      • RESTful
    • Destination
      • Kafka, MQTT, RabbitMQ, Pulsar, Redis
      • S3, CSV, JSON, SFTP
      • MySQL, PostgreSQL, Cassandra, ElasticSearch, MongoDB, MeiliSearch
  • CDC
    • PostgreSQL - wal2json, pgoutput
    • MySQL
    • MSSQL
  • References
caution
  • The API of the open-source version currently has no authentication

Docker

Kubernetes

Notes

field                          | desc
_airbyte_ab_id                 | uuid
_airbyte_emitted_at            | timestamp-millis
_airbyte_additional_properties | map of string
_airbyte_data                  | actual data
_airbyte_normalized_at         |
_ab_cdc_updated_at             |
_ab_cdc_deleted_at             |
_ab_cdc_lsn                    | PostgreSQL, MSSQL CDC
_ab_cdc_log_file               | MySQL CDC
_ab_cdc_log_pos                | MySQL CDC
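
As a rough illustration of how the raw metadata fields above fit together, the following sketch unwraps one raw row into its payload plus sync metadata. The function name `unwrap_raw_record` and the sample values are hypothetical, and it assumes `_airbyte_emitted_at` holds epoch milliseconds and that `_airbyte_data` may arrive either as a dict or as a JSON string, depending on the destination.

```python
import json
from datetime import datetime, timezone


def unwrap_raw_record(raw: dict) -> dict:
    """Split one raw row into id, emit time, and payload (hypothetical helper)."""
    payload = raw["_airbyte_data"]
    # Assumption: some destinations store the payload as a JSON string.
    if isinstance(payload, str):
        payload = json.loads(payload)
    # Assumption: _airbyte_emitted_at is epoch milliseconds (timestamp-millis).
    emitted_at = datetime.fromtimestamp(
        raw["_airbyte_emitted_at"] / 1000, tz=timezone.utc
    )
    return {"id": raw["_airbyte_ab_id"], "emitted_at": emitted_at, "data": payload}


# Made-up sample row, for illustration only.
sample = {
    "_airbyte_ab_id": "3f2b8c1e-7a4d-4e6f-9b0a-1c2d3e4f5a6b",
    "_airbyte_emitted_at": 1700000000000,
    "_airbyte_data": {"id": 1, "name": "alice"},
}
row = unwrap_raw_record(sample)
print(row["emitted_at"].isoformat(), row["data"])
```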

Integrations

FAQ

When to use CDC

  • You need to capture deletions
  • The data volume is large
  • The table has no column usable for incremental sync, e.g. updated_at
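
To make the deletion point concrete, here is a minimal sketch of replaying CDC records into the current table state. It assumes the records are already ordered by log position, that a non-null `_ab_cdc_deleted_at` marks a delete, and that each row carries a primary key named `id`; the helper `apply_cdc_records` and the sample rows are hypothetical, not part of Airbyte.

```python
def apply_cdc_records(records, key="id"):
    """Replay ordered CDC records into a dict keyed by primary key."""
    state = {}
    for rec in records:
        if rec.get("_ab_cdc_deleted_at") is not None:
            # Assumption: a non-null _ab_cdc_deleted_at means the row was deleted.
            state.pop(rec[key], None)
        else:
            # Keep only business columns; drop the _ab_cdc_* metadata.
            state[rec[key]] = {
                k: v for k, v in rec.items() if not k.startswith("_ab_cdc_")
            }
    return state


# Made-up change stream: two inserts, one update, one delete.
changes = [
    {"id": 1, "name": "alice", "_ab_cdc_deleted_at": None},
    {"id": 2, "name": "bob", "_ab_cdc_deleted_at": None},
    {"id": 1, "name": "alice2", "_ab_cdc_deleted_at": None},
    {"id": 2, "name": "bob", "_ab_cdc_deleted_at": "2024-01-01T00:00:00Z"},
]
current = apply_cdc_records(changes)
print(current)
```

A cursor-based incremental sync on a column such as updated_at would never see the last record, because a deleted row no longer exists to be selected.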