IMAP Email Contents and Attachments

This extractor allows you to automatically retrieve email contents and/or attachments via the IMAP protocol using basic authentication. It supports incremental loads and IMAP queries to define specific criteria.

The extractor offers the following features:

Feature | Note
Generic UI form | Dynamic UI form adapting to various configurations.
Row based configuration | Execute each row in parallel.
Incremental loading | Fetch new data in increments.
IMAP query syntax | Filter emails using standardized IMAP query syntax.
Download email contents | Download the full email body into the Storage column.
Download email attachments | All attachments are downloaded by default into File Storage.
Filter email attachments | Download only attachments that match a specified regular expression.
Processors support | Use processors to modify outputs before saving to Storage; e.g., process attachments to be stored in Tabular Storage.

Enable the IMAP service on your email account. You will need your IMAP credentials (username and password), as well as the hostname and port of your IMAP server. Check with your email provider if you need more details.

  1. Enable and create an app password specifically for this integration. Name it, for instance, Keboola extractor.
  2. Enter your email address in the Username field.
  3. Enter the generated app password in the Password field.
  4. In the IMAP host field, enter the Gmail IMAP address: imap.gmail.com.
  5. Use port 993.

Fill in the Username, Password, Hostname, and Port of your provider’s IMAP server. See the Gmail example for guidance.
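
For orientation, the four fields above correspond to a plain IMAP login over TLS. The following is a minimal Python sketch using the standard imaplib module; it is not the extractor's actual implementation. The function name open_mailbox and the client_factory parameter are hypothetical and exist only so the sketch can be exercised without a live server.

```python
import imaplib


def open_mailbox(host, username, password, port=993,
                 client_factory=imaplib.IMAP4_SSL):
    """Open an IMAP-over-TLS connection and log in with basic authentication.

    `client_factory` is a hypothetical hook for this sketch. The default,
    imaplib.IMAP4_SSL, connects to the server as soon as it is constructed.
    """
    client = client_factory(host, port)
    # imaplib raises imaplib.IMAP4.error if the server rejects the login.
    client.login(username, password)
    return client


# Example usage (placeholder values; for Gmail, pass the app password):
# client = open_mailbox("imap.gmail.com", "you@example.com", "app-password")
# client.logout()
```

Running a check like this outside Keboola can help confirm that the hostname, port, and app password are correct before configuring the component.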

Screenshot - Auth configuration

Click Add Row and name the row accordingly.

Enter a Search query to filter the emails you want. By default, all emails are downloaded. A common use case is to filter by Subject and Sender, e.g., (FROM "sender-email@example.com" SUBJECT "the subject"). More complex queries are also supported; refer to the query syntax for examples.
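
If you generate search queries programmatically, a small helper can assemble the FROM and SUBJECT criteria shown above. This is an illustrative Python sketch; build_search_query is a hypothetical helper, and returning the standard IMAP key ALL when no criteria are given is an assumption about how "all emails" would be expressed, not a statement about the extractor's internals.

```python
def build_search_query(sender=None, subject=None):
    """Combine optional FROM and SUBJECT criteria into one IMAP search query."""

    def quote(value):
        # IMAP quoted strings escape backslashes and double quotes.
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    parts = []
    if sender:
        parts.append(f"FROM {quote(sender)}")
    if subject:
        parts.append(f"SUBJECT {quote(subject)}")
    # Assumption: with no criteria, fall back to ALL (matches every message).
    return "(" + " ".join(parts) + ")" if parts else "ALL"


print(build_search_query("sender-email@example.com", "the subject"))
# prints: (FROM "sender-email@example.com" SUBJECT "the subject")
```

Criteria listed side by side are combined with a logical AND in IMAP search syntax, so the query above matches only emails that satisfy both conditions.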

Screenshot - Row configuration

Specify the folder to fetch emails from. Defaults to the root folder INBOX. For example, in Gmail, a label can function as a folder.

When selected, emails that have been extracted will be marked as “seen” in the inbox.
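
One way to reproduce the folder and "seen" behaviour with Python's imaplib is to open the folder read-only whenever messages should stay unread. The sketch below is an assumption about how such an option could be implemented, not the component's code; select_folder and mark_seen are hypothetical names, and client is any object with an imaplib-style select method.

```python
def select_folder(client, folder="INBOX", mark_seen=False):
    """Select `folder` on an imaplib-style client and return its message count.

    With readonly=True, imaplib issues EXAMINE instead of SELECT, so fetching
    messages cannot change their flags (including the Seen flag).
    """
    status, data = client.select(folder, readonly=not mark_seen)
    if status != "OK":
        raise RuntimeError(f"Cannot open folder {folder!r}: {data}")
    # On success the first data item is the message count, e.g. b"12".
    return int(data[0])


# Example usage with a logged-in imaplib.IMAP4_SSL client:
# count = select_folder(client, "INBOX", mark_seen=False)
```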

Use this field to filter emails received since a specific date. The field supports fixed dates in the format YYYY-MM-DD as well as relative dates like yesterday, 1 month ago, 2 days ago, etc. To avoid missing data, set this to cover a buffer period, e.g., 2 days ago when running daily. The data is always incrementally upserted, so duplicates won’t appear in the resulting table.
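
To illustrate how a relative value such as 2 days ago translates into a concrete cut-off date, here is a small Python sketch. It handles only the formats named above (fixed YYYY-MM-DD dates, yesterday, and N days/weeks/months ago); the extractor may accept more. The helper names resolve_since and imap_since are hypothetical, and mapping the result to an IMAP SINCE criterion is an assumption.

```python
import calendar
import re
from datetime import date, timedelta

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def resolve_since(value, today=None):
    """Turn 'YYYY-MM-DD', 'yesterday', or 'N days/weeks/months ago' into a date."""
    today = today or date.today()
    text = value.strip().lower()
    if text == "yesterday":
        return today - timedelta(days=1)
    match = re.fullmatch(r"(\d+)\s+(day|week|month)s?\s+ago", text)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        if unit == "day":
            return today - timedelta(days=amount)
        if unit == "week":
            return today - timedelta(weeks=amount)
        # Step back whole months and clamp the day, so that
        # 31 March minus 1 month becomes the last day of February.
        index = today.year * 12 + (today.month - 1) - amount
        year, month = divmod(index, 12)
        month += 1
        day = min(today.day, calendar.monthrange(year, month)[1])
        return date(year, month, day)
    # Anything else is treated as a fixed date; raises ValueError if malformed.
    return date.fromisoformat(text)


def imap_since(value, today=None):
    """Format the resolved date as an IMAP SINCE criterion (DD-Mon-YYYY)."""
    d = resolve_since(value, today)
    return f"SINCE {d.day:02d}-{MONTHS[d.month - 1]}-{d.year}"


print(imap_since("2 days ago", today=date(2024, 3, 15)))
# prints: SINCE 13-Mar-2024
```

With a daily schedule and a 2 days ago window, consecutive runs overlap by one day. The overlap is harmless because the rows are upserted.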

Select this option to download the email content.

When enabled, attachments are also downloaded. You may use a regex pattern to filter for attachments that match your definition.

For example, to match only PDF files, use the pattern .+\.pdf. If left empty, all attachments are downloaded by default.
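
You can try a pattern locally before saving it in the configuration. The Python sketch below assumes that the whole file name must match and that matching is case-sensitive; the component's exact matching rules may differ, and filter_attachments is a hypothetical helper.

```python
import re


def filter_attachments(filenames, pattern=""):
    """Return the file names matching `pattern`; an empty pattern keeps all."""
    if not pattern:
        return list(filenames)
    regex = re.compile(pattern)
    # Assumption: the entire file name has to match, case-sensitively.
    return [name for name in filenames if regex.fullmatch(name)]


names = ["invoice.pdf", "report.PDF", "data.csv"]
print(filter_attachments(names, r".+\.pdf"))
# prints: ['invoice.pdf']
```

Under these assumptions, report.PDF is skipped because of its upper-case extension. In Python's regex dialect, the pattern (?i).+\.pdf would match it as well.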

The files are saved in File Storage by default. Use processors to control the behaviour.

If your attachments are in CSV format, you can use this combination of processors to store them in Table Storage:

  1. Set the folder parameter in the first processor to match the resulting table name.
  2. Use the second processor to specify that the result always replaces the destination table and that the CSV file is expected to contain a header.
  3. The third processor skips the first line of each file, so the header row is not loaded as data.
  4. Note: In this setup, all attachments will be stored in the same table, so they must share the same structure.
{
  "before": [],
  "after": [
    {
      "definition": {
        "component": "keboola.processor-move-files"
      },
      "parameters": {
        "direction": "tables",
        "folder": "result_table"
      }
    },
    {
      "definition": {
        "component": "keboola.processor-create-manifest"
      },
      "parameters": {
        "delimiter": ",",
        "enclosure": "\"",
        "incremental": false,
        "primary_key": [],
        "columns_from": "header"
      }
    },
    {
      "definition": {
        "component": "keboola.processor-skip-lines"
      },
      "parameters": {
        "lines": 1
      }
    }
  ]
}

If your attachments are in XLSX format, you can use this combination of processors to store them in Table Storage:

  • The first processor converts each XLSX sheet into a separate table.
  • The second processor moves the converted files to the tables folder for output staging.
{
  "before": [],
  "after": [
    {
      "definition": {
        "component": "kds-team.processor-xlsx2csv"
      },
      "parameters": {
        "addFileName": true,
        "selectSheets": [],
        "ignoreSheets": []
      }
    },
    {
      "definition": {
        "component": "keboola.processor-move-files"
      },
      "parameters": {
        "direction": "tables"
      }
    }
  ]
}

Example 3 - Storing attachments in File Storage and adding custom tags

Use this processor to store attachments in File Storage with custom tags. It adds custom tags to the resulting files and offers additional options to create tags based on the file name.

{
  "before": [],
  "after": [
    {
      "definition": {
        "component": "kds-team.processor-create-file-manifest"
      },
      "parameters": {
        "tags": [
          "SOME_TAG"
        ],
        "is_permanent": false,
        "tag_functions": []
      }
    }
  ]
}

A single table named emails contains the email contents.

Results are inserted incrementally to avoid duplicates.

Columns: 'pk', 'uid', 'mail_box', 'date', 'from', 'to', 'body', 'headers', 'number_of_attachments', 'size'
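
The effect of incremental loading with a primary key can be pictured as an upsert keyed on the pk column. The Python sketch below is a conceptual illustration only: the in-memory dictionary stands in for the Storage table, and upsert is a hypothetical helper.

```python
def upsert(table, new_rows, key="pk"):
    """Insert rows into `table` (a dict keyed by primary key), replacing matches."""
    for row in new_rows:
        table[row[key]] = row
    return table


emails = {}
# First run fetches two emails.
upsert(emails, [{"pk": "a1", "uid": "10"}, {"pk": "b2", "uid": "11"}])
# The second run overlaps by one email because of the buffer period.
upsert(emails, [{"pk": "b2", "uid": "11"}, {"pk": "c3", "uid": "12"}])

print(sorted(emails))
# prints: ['a1', 'b2', 'c3']
```

The email fetched twice ends up as a single row, which is why an overlapping date window does not produce duplicates.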

Attachments are stored by default in File Storage, with filenames prefixed by the generated message primary key, e.g., bb41793268d4a8710fb5ebd94eaed6bc_some_file.pdf.
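
Because each file name starts with the message primary key, an attachment can be linked back to its row in the emails table. The Python sketch below assumes the key itself never contains an underscore, as in the hexadecimal example above; split_attachment_name is a hypothetical helper.

```python
def split_attachment_name(stored_name):
    """Split '<pk>_<original file name>' at the first underscore."""
    # Assumption: the primary key contains no underscore.
    pk, _, original = stored_name.partition("_")
    return pk, original


pk, original = split_attachment_name(
    "bb41793268d4a8710fb5ebd94eaed6bc_some_file.pdf"
)
print(pk)        # prints: bb41793268d4a8710fb5ebd94eaed6bc
print(original)  # prints: some_file.pdf
```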

Files include tags to distinguish their source:

Screenshot - Tags

Additional tags can be specified with the Create File Manifest processor. Attachments can also be further processed and stored in Table Storage using other processors.
