Amazon S3

Overview

You can create an Amazon S3 Destination Connector to write CSV files (by default) or JSONL files to an S3 bucket or folder using an Access Key ID and Secret Access Key.

Supported file formats: CSV and JSONL
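
For intuition, here is a minimal sketch of what this Connector does on your behalf: writing a CSV file into a bucket using the values listed in the next section. It uses Python's boto3 library; the bucket name, region, folder, and credential values are placeholders.

    import csv
    import io
    import boto3

    # Placeholder values for the four items Osmos asks for
    s3 = boto3.client(
        "s3",
        region_name="us-east-1",           # Region
        aws_access_key_id="AKIA...",       # Access Key ID
        aws_secret_access_key="<secret>",  # Secret Access Key
    )

    # Build a small CSV in memory
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["id", "name"])
    writer.writerow(["1", "Ada"])

    # Write the file into a folder within the bucket
    s3.put_object(
        Bucket="my-bucket",            # Bucket Name
        Key="exports/output.csv",      # folder + file name
        Body=buf.getvalue().encode("utf-8"),
    )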

Prerequisites

Required information:

  • Bucket Name

  • Region

  • Access Key ID

  • Secret Access Key

Creating an Amazon S3 Destination Connector

Step 1: After selecting + New Connector, click Amazon S3 when prompted to choose a Connector type.

Step 2: Provide a Connector Name.

Step 3: Select Destination Connector.

S3 Bucket Information

Step 4: Provide the name of the Amazon S3 bucket in the Bucket Name field.

Step 5: Provide the folder name, including the trailing “/”. Subfolders within the folder provided will be ignored.

If this field is left blank, files will be written to the root of the Amazon S3 bucket.

Step 6: Provide the region for the S3 bucket.

Access Key

Step 7: Provide the Access Key ID for the S3 bucket. To learn more about creating an access key, visit https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html.

Step 8: Provide the Secret Access Key for the S3 bucket.

Destination Schema

Design the output schema in one of two ways: import a schema file, or build the schema within Osmos.

Option 1: Schema Import

Upload or drag & drop the schema file.

Import a file containing the headers and one row of sample data; the sample data is used only for schema creation.

Option 2: Building the Schema for the Destination Connector

Use the schema designer to build the output schema for this Destination Connector.

The schema designer offers the following parameters:

  • Field Name: Provide a name for each output field. These names will be used as the column headers or field names in the output file you are writing to.

  • Type: Define the type of each field. The field types will be used to enforce rules when you send data to this Connector.

  • Nullable: Check this box if the field is nullable. If a field is not nullable, you will be required to provide a value for it when sending data to this Connector.

  • Delete: Deletes the field.

  • Add Field: Adds another field to the schema.

Step 1: Click Add Field for each additional field required in the schema.

Step 2: Select Create Schema once you have built the schema.
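
For intuition, the schema built above can be pictured as a list of field definitions like the following. This is only a hypothetical sketch of the schema's shape; the structure and type names are assumptions, not Osmos's actual format.

    # Hypothetical sketch of a destination schema; the structure and
    # type names are assumptions, not Osmos's actual format.
    schema = [
        {"name": "order_id",   "type": "string", "nullable": False},
        {"name": "order_date", "type": "date",   "nullable": False},
        {"name": "notes",      "type": "string", "nullable": True},
    ]

    # A non-nullable field must receive a value on every record sent to
    # this Connector; nullable fields may be left empty.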

Advanced Options

Output File Format

By default, this Connector writes CSV files, and each Pipeline run produces a new file. If preferred, you can change the output format to JSONL instead.
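
To illustrate the difference between the two formats, the sketch below writes the same two made-up records as CSV and as JSONL (one JSON object per line).

    import csv
    import json

    records = [
        {"id": "1", "name": "Ada"},
        {"id": "2", "name": "Grace"},
    ]

    # CSV: a header row, then one comma-separated row per record
    with open("output.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "name"])
        writer.writeheader()
        writer.writerows(records)

    # JSONL: no header; each line is a standalone JSON object
    with open("output.jsonl", "w") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")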

File Prefix Format String

We support file prefixes to make the output of this Connector easier to manage. The contents of this field will be written into the filenames of the data this Connector writes. If a prefix is specified, a UUID will be appended to it to prevent filename conflicts. For additional configuration, see Additional Configuration for File Prefix Format below.

Limit Records Per File

By default, we do not limit the number of records written to a single destination file by a single job (i.e., a single run of a Pipeline or Uploader). If this box is checked, the data written to the destination will be "chunked" into separate files containing at most the number of records designated here. These "chunked" files will be suffixed with their position in the sequence, e.g., filename_part_1.csv, filename_part_2.csv, etc.
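
The chunking behavior can be sketched as follows, assuming an in-memory list of records and a made-up limit of 2:

    # Split records into files of at most `limit` records each; files
    # are suffixed with their position in the sequence.
    records = [{"id": str(i)} for i in range(5)]
    limit = 2  # made-up value for illustration

    for part, start in enumerate(range(0, len(records), limit), start=1):
        chunk = records[start:start + limit]
        filename = f"filename_part_{part}.csv"  # filename_part_1.csv, filename_part_2.csv, ...
        print(filename, len(chunk), "records")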

Validation Webhook

We support the use of Validation Webhooks to prevent bad data from being written to your systems, adding another layer of protection on top of the built-in validations Osmos provides. The Webhook URL can be provided here. For more information on Validation Webhook configuration, see Server Side Validation Webhooks.
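
By way of illustration, a Validation Webhook is an HTTP endpoint you host that receives records and reports which ones fail your rules. The payload and response shapes in the sketch below are assumptions, not the actual Osmos webhook contract; see Server Side Validation Webhooks for the real specification.

    from flask import Flask, jsonify, request

    app = Flask(__name__)

    # Hypothetical endpoint; the request and response shapes are
    # assumptions for illustration, not the Osmos webhook contract.
    @app.post("/validate")
    def validate():
        rows = request.get_json().get("rows", [])
        errors = []
        for i, row in enumerate(rows):
            if not row.get("email"):  # example rule: email is required
                errors.append({"row": i, "message": "email is required"})
        return jsonify({"errors": errors})

    if __name__ == "__main__":
        app.run(port=8080)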

Overwrite Output Column with Raw Input Data

Enter the name of the destination column where you'd like to store the entire raw source record data. The raw source record data will be stored as a JSON string in the provided destination column.
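
For example, if the destination column were named raw_record (a made-up name), each output row would carry the full source record serialized as a JSON string:

    import json

    # Hypothetical source record and destination column name
    source_record = {"First Name": "Ada", "Email": "ada@example.com"}

    output_row = {
        "email": "ada@example.com",
        "raw_record": json.dumps(source_record),  # entire raw input as a JSON string
    }
    print(output_row["raw_record"])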

Additional Configuration for File Prefix Format

Organizing File Structure

You can chunk files and route output to different folder paths based on job ID. Osmos supports a case-sensitive magic string, {jobId}, in the file prefix and output filenames for these file-based Destination Connectors. To set a file prefix, open the Destination Connector, select Show Advanced Options, and populate the File Prefix Format String field.

File Output

As noted above, the contents of the File Prefix Format String field are written into the filenames this Connector produces, and a UUID is appended to the filename to prevent write conflicts. Osmos supports two magic string identifiers for including additional information in your file prefix:

Job_Id: You can include an identifier corresponding to each individual Job (a run of an Osmos Uploader or Pipeline) by including {jobId} in your prefix format string. See examples 3 & 4.

DateTime: You can include datetime values in your file output using strftime (string-from-time) format specifiers. The time values correspond to Osmos's internal system time at the moment the job was started. See example 4 and the sketch following the scenarios below.

Output Scenarios:

  1. No file prefix
     Output: <user base path>/chunk-<chunk num>-<UUID>.<file extension>

  2. File includes a description in the prefix
     Sample prefix: my_osmos_output_
     Output: <user base path>/my_osmos_output_chunk-<chunk num>-<UUID>.<file extension>

  3. File includes a description and job_id in the prefix
     Sample prefix: my_osmos_output/{jobId}/
     Output: <user base path>/my_osmos_output/<ACTUAL JOB_ID HERE>/chunk-<chunk num>-<UUID>.<file extension>

  4. File includes datetime specifiers and job_id in the prefix
     Sample prefix: {jobId}/%F_%T/
     Output: <user base path>/<ACTUAL JOB_ID HERE>/<YYYY-MM-DD>_<HH:mm:ss>/chunk-<chunk num>-<UUID>.<file extension>
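
As a sketch of how a prefix like the one in example 4 expands (assuming a made-up job ID; note that %F and %T are strftime shorthands for %Y-%m-%d and %H:%M:%S, and their availability in Python's strftime is platform-dependent):

    import uuid
    from datetime import datetime

    prefix_format = "{jobId}/%F_%T/"
    job_id = "job-1234"  # made-up job ID for illustration

    # Substitute the {jobId} magic string, then expand the strftime specifiers
    prefix = datetime.now().strftime(prefix_format.replace("{jobId}", job_id))

    # A chunk number and a UUID are appended to avoid filename conflicts
    filename = f"{prefix}chunk-1-{uuid.uuid4()}.csv"
    print(filename)  # e.g. job-1234/2025-05-01_13:45:30/chunk-1-<uuid>.csv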

Connector Options

The Connector can be deleted, edited, or duplicated.

Duplication

To save time, you can duplicate the Connector. The new Connector must be given a name and can be edited as needed.
