
Sources

Sources are connections to external data systems. When you connect a Source, Galaxy observes its structure and replicates data according to your configuration.

[Image: Sources page showing list of connected Sources]

What Sources Are

Sources answer one question: What exists? They are the factual foundation that Projects use to build semantic understanding: hard evidence of what exists in your data systems.

When you connect a Source, Galaxy observes the structure of the connected system: schemas, tables, columns, files, folders, and other organizational elements. Galaxy also manages the data infrastructure, built on Apache Iceberg, that replicates and stores your data. You can connect to any of the main storage providers, and Galaxy handles the infrastructure for you.

Galaxy supports both structured and unstructured data:
  • Structured sources: Databases, spreadsheets, and other structured formats expose schemas, tables, columns, and relationships
  • Unstructured sources: Documents, files, and text content are processed to extract structure, text, entities, and semantic information
Sources provide the observational layer—they record what exists without interpretation. This factual foundation enables Projects to build semantic understanding on top of verified, observable reality.
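
As an illustration of what "recording what exists without interpretation" could look like, here is a minimal sketch of an observed-structure record. The class and field names (`Column`, `Table`, `ObservedStructure`, `element_paths`) and the sample data are hypothetical and chosen for this example only; they are not Galaxy's internal model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    data_type: str


@dataclass(frozen=True)
class Table:
    name: str
    columns: tuple[Column, ...]


@dataclass(frozen=True)
class ObservedStructure:
    """Hypothetical record of what a Source reports: names and shapes only."""
    source_name: str
    tables: tuple[Table, ...] = ()   # structured elements
    files: tuple[str, ...] = ()      # unstructured elements, as paths

    def element_paths(self) -> list[str]:
        """Flatten every observed element into a single list of paths."""
        paths = [f"{t.name}.{c.name}" for t in self.tables for c in t.columns]
        paths.extend(self.files)
        return paths


# Sample data invented for illustration.
orders = Table("orders", (Column("id", "bigint"), Column("cust_id", "bigint")))
observed = ObservedStructure("warehouse", tables=(orders,), files=("contracts/msa.pdf",))
print(observed.element_paths())
```

Nothing in the record says what `cust_id` means. In this sketch, that meaning would be added later by a Project.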

Supported Source Types

Galaxy supports several Source types, and each has specific configuration requirements. See the individual Source pages for detailed information about connecting and configuring each type.

Connecting a Source

Connecting a Source involves three steps:
  1. Select Source Type: Choose from available Source types
  2. Configure Connection: Provide connection details, authentication credentials, and configuration options specific to the Source type
  3. Select Data: Choose which data to replicate (schemas, folders, channels, etc.)
Once connected, Galaxy validates the connection and begins observing structure and replicating data.

[Image: Create Source modal showing source type selection]
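
The three steps above can be pictured as filling in one request before submitting it. The sketch below is a rough illustration only: `SourceRequest`, its fields, the `problems` check, and the sample values (including the `postgres` type name) are assumptions made for this example, not Galaxy's API or its actual validation logic.

```python
from dataclasses import dataclass, field


@dataclass
class SourceRequest:
    # All field names are hypothetical, not Galaxy's API.
    source_type: str                                         # step 1: Source type
    connection: dict[str, str]                               # step 2: connection details
    selected_data: list[str] = field(default_factory=list)   # step 3: data to replicate

    def problems(self) -> list[str]:
        """Return the steps that are still incomplete; empty means ready."""
        found = []
        if not self.source_type:
            found.append("no Source type selected")
        if not self.connection:
            found.append("no connection details provided")
        if not self.selected_data:
            found.append("no data selected for replication")
        return found


request = SourceRequest(
    source_type="postgres",  # placeholder type name
    connection={"host": "db.example.com", "user": "readonly"},
)
print(request.problems())          # step 3 has not been done yet
request.selected_data.append("public")
print(request.problems())          # all three steps are complete
```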

Managing Sources

After connecting a Source, you can:
  • View structure: Browse schemas, tables, columns, files, and other structural elements that Galaxy has observed
  • Refresh structure: Update Galaxy’s understanding if the source system has changed
  • Configure connection: Update connection details or authentication
  • Remove connection: Disconnect a Source if it is no longer needed
A Source can be attached to multiple Projects. Removing a Source does not delete Projects, but Projects that were using that Source will no longer have access to its structure.
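
The removal behavior described above can be modeled as a many-to-many attachment between Projects and Sources. This is a toy sketch under that assumption. `Workspace` and its methods are invented names, and the real product may track attachments differently.

```python
class Workspace:
    """Toy registry of Sources and the Projects attached to them."""

    def __init__(self) -> None:
        self.sources: set[str] = set()
        self.attachments: dict[str, set[str]] = {}  # Project -> attached Sources

    def connect(self, source: str) -> None:
        self.sources.add(source)

    def attach(self, project: str, source: str) -> None:
        if source not in self.sources:
            raise KeyError(f"unknown Source: {source}")
        self.attachments.setdefault(project, set()).add(source)

    def remove(self, source: str) -> None:
        # The Source goes away; Projects stay but lose access to it.
        self.sources.discard(source)
        for attached in self.attachments.values():
            attached.discard(source)


ws = Workspace()
ws.connect("warehouse")
ws.attach("finance", "warehouse")
ws.attach("support", "warehouse")
ws.remove("warehouse")
print(sorted(ws.attachments))     # both Projects still exist
print(ws.attachments["finance"])  # but no Source remains attached
```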

Sources and Projects

Galaxy intentionally separates observation from interpretation:
  • Sources observe reality: Galaxy connects to systems and records what exists, without opinion
  • Projects interpret reality: Teams define what those structures mean in context
  • Graphs emerge naturally: As entities and relationships are defined, a navigable graph forms
  • Understanding evolves: When systems change, the ontology can evolve with them
Multiple Projects can attach to the same Source. This allows different teams or use cases to interpret the same Source structure differently, depending on their needs. This separation prevents brittle assumptions while preserving shared context. Sources provide the factual foundation. Projects build semantic understanding on top.
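
To make the split between observation and interpretation concrete, the sketch below has two Projects assign different meanings to the same observed columns. The column paths, Project names, and the `interpret` helper are all hypothetical and exist only for this illustration.

```python
# Observed once by the Source: element paths only, no meaning attached.
observed_columns = ("orders.cust_id", "orders.total")

# Each Project keeps its own mapping from observed element to meaning.
finance_project = {"orders.cust_id": "Billing Account", "orders.total": "Invoice Amount"}
support_project = {"orders.cust_id": "Customer"}


def interpret(project: dict[str, str], column: str) -> str:
    """Return the Project's meaning for a column, or mark it uninterpreted."""
    return project.get(column, "(uninterpreted)")


for column in observed_columns:
    print(column, "|", interpret(finance_project, column), "|", interpret(support_project, column))
```

Because the mappings live in the Projects, changing one team's interpretation leaves both the observed structure and the other team's view untouched.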
