Collibra DQ User Guide
2022.10

About remote file connections

We've moved! To improve customer experience, the Collibra Data Quality User Guide has moved to the Collibra Documentation Center as part of the Collibra Data Quality 2022.11 release. To ensure a seamless transition, dq-docs.collibra.com will remain accessible, but the DQ User Guide is now maintained exclusively in the Documentation Center.
This section is an overview of the supported data file formats and the limitations of connecting to a remote file.

Supported file types

File formats differ in structure, so you might need to prepare your data before establishing a connection.
Type                        File structure    Notes
--------------------------  ----------------  --------------------------------------------------
Delimited (CSV, TSV, etc.)  Structured        The default delimiter is comma (for example, CSV).
Parquet                     Structured
Avro                        Structured
JSON                        Semi-structured
ORC                         Semi-structured
XML                         Semi-structured
Delta                       Semi-structured
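
As an illustration of what preparing semi-structured data might involve, the following minimal sketch flattens nested JSON records into comma-delimited text. It uses only the Python standard library and is not part of Collibra DQ. The function names `flatten` and `json_records_to_csv`, the dotted column naming, and the sample records are all assumptions made for this example.

```python
import csv
import io
import json


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the parent key as a prefix.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def json_records_to_csv(json_text):
    """Convert a JSON array of objects into comma-delimited text."""
    records = [flatten(r) for r in json.loads(json_text)]
    # Build the header from the union of keys, in first-seen order.
    fieldnames = []
    for rec in records:
        for key in rec:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    # Keys missing from a record are written as empty fields.
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical sample data: the second record has no customer.country value.
sample = (
    '[{"id": 1, "customer": {"name": "Ana", "country": "PT"}},'
    ' {"id": 2, "customer": {"name": "Bo"}}]'
)
csv_text = json_records_to_csv(sample)
```

Arrays nested inside records are left as-is by this sketch. Whether flattening is needed at all depends on the file and on how you intend to query it.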

Supported delimiters

The following table lists the supported delimiters available in the Delimiter dropdown menu.
Type          Format  Description
------------  ------  ---------------------------------------------------------------------------
Comma         CSV     A comma (,) separates values in the file. This is the default delimiter.
Tab           TSV     A tab character separates values in the file.
Semicolon     CSV     A semicolon (;) separates values in the file.
Double Quote  CSV     A double quote (") separates values in the file.
Single Quote  CSV     A single quote (') separates values in the file.
Pipe          TXT     A pipe (|) separates values in the text file.
SOH           TXT     The invisible Unicode control character START OF HEADING (U+0001) separates
                      values in the text file.
Custom        N/A     Add a custom delimiter. Support for custom delimiters may vary.
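
To check which delimiter a file uses before you connect to it, you can parse a small sample locally. The sketch below uses Python's standard `csv` module and does not reflect how Collibra DQ parses files internally. The `DELIMITERS` mapping and the `parse_delimited` helper are hypothetical names for this example. The quote delimiters are left out because Python's `csv` module treats the double quote as a quoting character by default.

```python
import csv
import io

# Hypothetical mapping from dropdown names to the characters they stand for.
DELIMITERS = {
    "Comma": ",",
    "Tab": "\t",
    "Semicolon": ";",
    "Pipe": "|",
    "SOH": "\x01",  # START OF HEADING (U+0001), an invisible control character
}


def parse_delimited(text, delimiter_name="Comma"):
    """Split delimited text into rows using the named delimiter."""
    delimiter = DELIMITERS[delimiter_name]
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return list(reader)


# Sample content is made up for illustration.
soh_rows = parse_delimited("id\x01name\n1\x01Ana\n", "SOH")
pipe_rows = parse_delimited("id|name\n2|Bo\n", "Pipe")
```

If every row comes back as a single field, the chosen delimiter probably does not match the file.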