Connecting to Hadoop Distributed File System (HDFS)
We've moved! To improve customer experience, the Collibra Data Quality User Guide has moved to the Collibra Documentation Center as part of the Collibra Data Quality 2022.11 release. To ensure a seamless transition, dq-docs.collibra.com will remain accessible, but the DQ User Guide is now maintained exclusively in the Documentation Center.
To configure the HDFS connector, you need:
- Admin privileges in your Collibra Data Quality instance.
- Access to an HDFS cluster.
1. In the main menu, hover over the gear icon and click **Connection**.
   The Connections page opens.
2. Scroll down to the HDFS card.
3. Click **Add** to add a new HDFS connection.
   The New Remote File Connection (HDFS) modal opens.
4. Enter the values for each property.
| Property | Description |
|---|---|
| Name | The unique name of your HDFS connector. |
| Connection URL | The HDFS URL used for your connection. |
| Target Agent | The agent that runs jobs for this connection. |
| Auth Type | The method used to authenticate your connection. Note: If you select an Unsecured Auth Type, no other authentication fields are required. This is not recommended. |
| Principal | The Kerberos service principal that allows Collibra Data Quality to access your connection. |
| Keytab | The keytab used to authenticate your connection. Note: Only applicable when you select Keytab as the Auth Type. |
| TGT | The Ticket Granting Ticket used to authenticate your connection. Note: Only applicable when you select TGT Cache as the Auth Type. |
| Driver Properties | The configurable driver properties for your connection. Note: This is an optional configuration. |
5. Click Save to establish your connection.
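As a sketch, a Kerberos keytab connection might be filled in with values like the following. All names, hosts, and paths here are hypothetical examples, not defaults:

```
Name:           hdfs-prod
Connection URL: hdfs://namenode.example.com:8020
Target Agent:   dq-agent-1
Auth Type:      Keytab
Principal:      collibra-dq@EXAMPLE.COM
Keytab:         /etc/security/keytabs/collibra-dq.keytab
```

The Connection URL points at the HDFS NameNode, and the keytab path must be readable by the target agent.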
Once you save your HDFS connection, a confirmation message tells you that your connection is saved and valid.
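If the connection fails to validate, a common cause is a malformed Connection URL. A minimal sanity check of the URL format can be sketched in Python (the host and port below are hypothetical):

```python
from urllib.parse import urlparse

def check_hdfs_url(url: str) -> bool:
    """Basic format check: the URL uses the hdfs scheme and names a host."""
    parsed = urlparse(url)
    return parsed.scheme == "hdfs" and bool(parsed.hostname)

# A well-formed NameNode URL passes; one missing the hdfs:// scheme does not.
print(check_hdfs_url("hdfs://namenode.example.com:8020"))  # True
print(check_hdfs_url("namenode.example.com:8020"))         # False
```

This checks format only; a passing URL can still fail validation if the NameNode is unreachable or Kerberos credentials are wrong.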