Email

Overview

Note: This connector is only available for Premium and Enterprise plans.
This Source Connector can receive data via emails sent to an established recipient email address. The Osmos Pipeline will run automatically as new emails are received, rather than on a schedule.
The schema for this Source Connector is defined by the newest file received at the recipient email address. All files must have the same schema (number and order of columns). Any files not matching the original schema will be ignored, and the schema cannot be changed after saving the Connector.
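For illustration, here is a minimal Python sketch of the schema rule described above, assuming CSV attachments; the helper names are hypothetical and this is not Osmos' internal implementation.

```python
# Illustrative sketch of the schema rule: a new file is accepted only if its
# column layout matches the schema defined by the file the connector saved.
import csv

def read_header(path: str) -> list[str]:
    """Return the first row of a CSV file (its column layout)."""
    with open(path, newline="") as f:
        return next(csv.reader(f))

def matches_schema(reference_path: str, new_path: str) -> bool:
    """A file matches only if the column count and order are identical
    (compared here via the header row)."""
    return read_header(reference_path) == read_header(new_path)

# Files whose columns differ from the connector's saved schema are ignored.
```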

Prerequisites

Information needed:
  • Recipient Email ID (note: you must reach out to [email protected] to set up a Recipient Email ID).
  • Emails with data sent to the Recipient Email ID

Creating an Email Source Connector

Step 1: Navigate to the Connectors list page, then click + New Connector.
Step 2: Under the System prompt, click Email.
Step 3: Enter a Connector Name.
Step 4: Select Source Connector.
Step 5: Input the Recipient Email ID (please reach out to [email protected] to set up a Recipient Email ID).
Tip: The recipient email address must be the only address on the To line; all other addresses must be on the Cc or Bcc line. The connector reads both forwarded and new emails.
Step 6: Select the Ingestion Method (at this time, we only support Email Attachments as an Ingestion Method).
Tip: The body of the email can have content, but the connector will only pick up data from the attachment.
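For illustration, here is a minimal Python sketch of sending a data file to the Recipient Email ID from a script, assuming an SMTP server you control; every address, host, credential, and file name below is a placeholder, not an Osmos-provided value.

```python
# Hedged example: one way to email a data attachment to the connector.
import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "reports@example.com"
msg["To"] = "your-recipient-id@example.com"   # Recipient Email ID, alone on the To line
msg["Cc"] = "teammate@example.com"            # any other addresses go on Cc or Bcc
msg["Subject"] = "Daily export"
msg.set_content("Data attached.")             # body text is ignored by the connector

with open("daily_export.csv", "rb") as f:
    msg.add_attachment(f.read(), maintype="text", subtype="csv",
                       filename="daily_export.csv")

with smtplib.SMTP("smtp.example.com", 587) as smtp:
    smtp.starttls()
    smtp.login("reports@example.com", "app-password")
    smtp.send_message(msg)
```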

Advanced Options

File Filtering
You can choose to process all source files, or filter the files based on the file name (see the sketch after this list).
  1. Include all files: If this option is chosen, all of the files in the folder will be processed in chronological order.
  2. Only include files that: If you choose this option, you can filter which files to process from the source folder based on three options:
    • File names starting with,
    • File names containing, or
    • File names ending with.
    Any files that do not meet the filter criteria will be ignored.
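For illustration, a minimal Python sketch of the three file-name filters, assuming plain string matching; the mode names are hypothetical labels, not connector settings.

```python
# Sketch of the file filtering options described above (not Osmos source code).
def passes_filter(file_name: str, mode: str, value: str) -> bool:
    """Return True if the file should be processed under the chosen filter."""
    if mode == "starting_with":
        return file_name.startswith(value)
    if mode == "containing":
        return value in file_name
    if mode == "ending_with":
        return file_name.endswith(value)
    return True  # "Include all files"

# Example: only process files whose names end with ".csv"
files = ["orders_jan.csv", "orders_feb.csv", "readme.txt"]
to_process = [f for f in files if passes_filter(f, "ending_with", ".csv")]
# -> ["orders_jan.csv", "orders_feb.csv"]; "readme.txt" is ignored
```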
File Headers
Within the source folder, all files can contain column header names or none of the files can contain column header names. Please select one of the options (see the sketch after this list):
  1. Files contain column header names: If this option is selected, we will use the first row as column header names to label the schema within Osmos. Rows two and up will be read as data records.
  2. Files do not contain column header names: If this option is selected, we auto-generate column names for the schema within Osmos. All rows, including the first row, will be read as data records.
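For illustration, a minimal Python sketch of the two header options, assuming CSV attachments; the generated column names shown are examples, not necessarily the names Osmos assigns.

```python
# Sketch of the two File Headers options (illustrative only).
import csv

def read_with_headers(path: str):
    """Option 1: the first row labels the schema, remaining rows are data."""
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    return rows[0], rows[1:]

def read_without_headers(path: str):
    """Option 2: column names are auto-generated, every row is data."""
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    headers = [f"Column {i + 1}" for i in range(len(rows[0]))]
    return headers, rows
```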
Delimiter for TXT Files
The delimiter to use when reading files. By default, this is only applied to .TXT files. Turning on the override mode forces the delimiter regardless of file extension.
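For illustration, a minimal Python sketch of the delimiter behavior, assuming CSV-style parsing; the parameter names are hypothetical, not the connector's configuration keys.

```python
# Sketch: the chosen delimiter applies only to .txt files by default,
# while the override flag forces it for every file extension.
import csv
from pathlib import Path

def read_delimited(path: str, delimiter: str = "|", override: bool = False):
    ext = Path(path).suffix.lower()
    use_delim = delimiter if (override or ext == ".txt") else ","
    with open(path, newline="") as f:
        return list(csv.reader(f, delimiter=use_delim))
```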
Handle Invalid Characters
The source file may contain characters that are not valid (see the sketch after this list).
  1. Keep all characters from source: If this option is selected, we will retain all characters from the source file, replacing characters we cannot decode with the Unicode undefined character.
  2. Strip null characters: If this option is selected, we filter out all characters that are equal to 0. This is useful when dealing with null-terminated strings.
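For illustration, a minimal Python sketch of the two options, assuming UTF-8 input; this is not Osmos' internal implementation.

```python
# Sketch of the invalid-character handling options (illustrative only).
def keep_all_characters(raw: bytes) -> str:
    # Undecodable bytes are replaced rather than dropped
    # (Python substitutes the replacement character U+FFFD).
    return raw.decode("utf-8", errors="replace")

def strip_null_characters(raw: bytes) -> bytes:
    # Drop every byte equal to 0, e.g. for null-terminated strings.
    return raw.replace(b"\x00", b"")
```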
Deduplication Method
We support three different deduplication methods. You can choose to deduplicate at the file level or the record level. Select one of the following options (see the sketch after this list):
  1. File level Deduplication: If this option is selected, deduplication will be performed at a file level only. If a file name is changed, or the file itself is changed, the entire file will be processed in subsequent runs.
  2. Record level Deduplication across all historical data: When this is selected, in addition to file-level deduplication, deduplication will be performed at a record level across all the files processed by this pipeline. An identical record that was already processed in a previous pipeline run will not be processed in the current file, nor will duplicated records within the same file.
    • For example, say a pipeline has previously processed 5 files, and a 6th file (File 6) is added to the folder. Only the net new records in File 6 will be processed. Any records in File 6 that already existed in one of the 5 previously processed files will be ignored.
  3. Record level Deduplication within individual files: When this is selected, in addition to file-level deduplication, deduplication will be performed at a record level, but only within the same file. If the file being processed has the same record appearing multiple times, the record will be processed only once.
    • For example, say there are 5 files in the folder already. 2 more files (File 6 and File 7) are added to the folder, and a run is triggered. File 6 will be checked for duplicate records, and any record appearing more than once within File 6 will be processed only once. The same applies to File 7. However, if there are any records in File 7 that were already processed as part of File 6 or File 4, those records will be processed again as part of File 7.
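For illustration, a minimal Python sketch contrasting the deduplication levels, assuming records are compared as whole rows; this is not Osmos' internal bookkeeping.

```python
# Sketch of the three deduplication levels described above (illustrative only).
import hashlib

def file_fingerprint(name: str, content: bytes) -> str:
    # File-level dedup: a file counts as new if its name or content changed.
    return hashlib.sha256(name.encode() + content).hexdigest()

def dedup_within_file(records: list[str]) -> list[str]:
    # Record-level dedup within an individual file: keep the first occurrence.
    seen, out = set(), []
    for r in records:
        if r not in seen:
            seen.add(r)
            out.append(r)
    return out

def dedup_across_history(records: list[str], history: set[str]) -> list[str]:
    # Record-level dedup across all historical data: drop records already
    # processed in earlier files, then dedup within the current file.
    fresh = dedup_within_file([r for r in records if r not in history])
    history.update(fresh)
    return fresh
```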
Starting Cell
Spreadsheet-type data (CSV, XLS, XLSX, etc.) ingested with this connector will be cropped to begin at the specified starting cell.
Any rows of input that are completely empty and void of commas will be omitted before the starting cell is applied.
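For illustration, a minimal Python sketch of how a starting cell crops spreadsheet-type input, assuming zero-based row and column indices; the function and parameter names are hypothetical.

```python
# Sketch: completely empty rows are dropped first, then the remaining rows
# are cropped so ingestion begins at the starting cell (illustrative only).
def crop_at_starting_cell(rows: list[list[str]], start_row: int, start_col: int):
    non_empty = [r for r in rows if any(cell.strip() for cell in r)]
    return [r[start_col:] for r in non_empty[start_row:]]

# Example: starting cell B2 -> start_row=1, start_col=1 (zero-based)
data = [
    ["", "", ""],                      # empty row, dropped first
    ["Report", "", ""],                # title row above the table
    ["", "Name", "Amount"],
    ["", "Acme", "120"],
]
print(crop_at_starting_cell(data, start_row=1, start_col=1))
# -> [["Name", "Amount"], ["Acme", "120"]]
```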