Amazon S3

Connect Galaxy to an Amazon S3 bucket to observe its objects, prefixes (folders), and organizational hierarchy.

How to Connect

  1. Select Amazon S3 as the Source type
  2. Provide AWS credentials:
    • Access Key ID: AWS access key ID
    • Secret Access Key: AWS secret access key
    • Region: AWS region where your buckets are located
    • Optional: Role ARN for IAM role authentication
    • Optional: Custom endpoint URL
  3. Configure bucket: Provide the bucket name to observe
  4. Configure streams: Set up file-based stream configurations for different object types
  5. Optional start date: Set a start date for syncing historical data
Once connected, Galaxy validates the connection and begins observing the bucket structure.
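
As a rough illustration of the values gathered in the steps above, the sketch below groups them into a single Python object and checks that the required ones are filled in. The class name `S3SourceConfig`, its field names, and the `missing_fields` helper are hypothetical and are not part of Galaxy; the example also assumes that the access key, secret key, region, and bucket are always required, while the role ARN, endpoint, and start date are optional.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class S3SourceConfig:
    """Hypothetical container for the connection values (not a Galaxy API)."""

    access_key_id: str
    secret_access_key: str
    region: str
    bucket: str
    role_arn: Optional[str] = None    # optional IAM role ARN
    endpoint: Optional[str] = None    # optional custom endpoint URL
    start_date: Optional[str] = None  # optional, format assumed here
    streams: List[dict] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the names of required fields that are empty or blank."""
        required = ("access_key_id", "secret_access_key", "region", "bucket")
        return [name for name in required if not getattr(self, name).strip()]


# Placeholder values only; never hard-code real credentials.
config = S3SourceConfig(
    access_key_id="AKIAEXAMPLE",
    secret_access_key="example-secret",
    region="us-east-1",
    bucket="my-example-bucket",
)
print(config.missing_fields())
```

Running a check like this before submitting the form can catch an empty field early, although Galaxy performs its own validation once the source is connected.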

Configuration Options

AWS Credentials

  • Access Key ID: AWS access key ID
  • Secret Access Key: AWS secret access key
  • Region: AWS region where your buckets are located
  • Role ARN: Optional IAM role ARN for role-based authentication
  • Endpoint: Optional custom endpoint URL (for S3-compatible services)
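Ensure the credentials (or the assumed role, if you provide a Role ARN) have permission to list the bucket and read its objects. In AWS IAM terms this typically means s3:ListBucket on the bucket and s3:GetObject on its objects.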
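
A minimal sketch of a read-only IAM policy for those permissions is shown below, built as a Python dictionary and printed as JSON. The actions `s3:ListBucket` and `s3:GetObject` and the ARN formats are standard AWS IAM, but the helper name `read_only_policy` and the bucket name are illustrative assumptions, and Galaxy may need additional permissions that this page does not list.

```python
import json


def read_only_policy(bucket: str) -> dict:
    """Build an IAM policy that allows listing a bucket and reading its objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Reading applies to the objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


policy = read_only_policy("my-example-bucket")
print(json.dumps(policy, indent=2))
```

The two statements are kept separate because the list action targets the bucket ARN while the read action targets the object ARNs under it.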

Bucket Configuration

  • Bucket: S3 bucket name to observe

Stream Configuration

Configure file-based streams to process different object types:
  • Structured formats: CSV, JSONL, Parquet, Avro, Excel
  • Unstructured documents: PDF, DOCX, Markdown, and other text formats
  • Processing options: Configure parsing strategies, validation policies, schema discovery, and glob patterns
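
To illustrate how glob patterns can route objects to different streams, the sketch below matches object keys against one pattern per stream using Python's standard `fnmatch` module. The stream names, patterns, and the `stream_for_key` helper are hypothetical, and Galaxy's own glob semantics may differ. In particular, `fnmatch` lets `*` match `/`, which many glob implementations do not.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Hypothetical stream names mapped to glob patterns over object keys.
STREAM_GLOBS = {
    "orders": "exports/orders/*.csv",
    "events": "logs/*.jsonl",
    "documents": "docs/*.pdf",
}


def stream_for_key(key: str) -> Optional[str]:
    """Return the first stream whose pattern matches the key, else None."""
    for name, pattern in STREAM_GLOBS.items():
        # fnmatchcase is case-sensitive, and its "*" also matches "/".
        if fnmatchcase(key, pattern):
            return name
    return None


for key in ("exports/orders/jan.csv", "logs/app.jsonl", "images/logo.png"):
    print(key, "->", stream_for_key(key))
```

Keys that match no pattern, such as the image in this example, return `None` here; how Galaxy treats unmatched objects is not stated on this page.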

Delivery Method

Choose how to deliver data:
  • Replicate Records: Deliver as structured records
  • Copy Raw Files: Copy files as-is, optionally preserving directory structure
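
The effect of preserving directory structure when copying raw files can be sketched as a mapping from a source object key to a destination path. The `destination_key` function and its parameters are hypothetical and only illustrate the idea; the page does not describe how Galaxy names copied files.

```python
import posixpath


def destination_key(source_key: str, dest_prefix: str,
                    preserve_structure: bool = True) -> str:
    """Map a source object key to a destination path under dest_prefix."""
    key = source_key.lstrip("/")  # guard against a leading slash
    if not preserve_structure:
        # Keep only the file name and drop the prefix ("folder") part.
        key = posixpath.basename(key)
    return posixpath.join(dest_prefix, key)


print(destination_key("reports/2024/q1.pdf", "raw"))
print(destination_key("reports/2024/q1.pdf", "raw", preserve_structure=False))
```

Flattening keeps the destination simple, but two objects with the same file name under different prefixes would then collide, which is one reason to preserve the structure.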
