Bank Loans
We've moved! To improve customer experience, the Collibra Data Quality User Guide has moved to the Collibra Documentation Center as part of the Collibra Data Quality 2022.11 release. To ensure a seamless transition, dq-docs.collibra.com will remain accessible, but the DQ User Guide is now maintained exclusively in the Documentation Center.
It is common for banks to lend money in return for monthly payments with interest. Before underwriting and approval can begin, however, a bank must make sure that loan applications are valid and well formed. Below we apply some basic lending concepts to OwlDQ.
1. Credit Score Validation
2. SSN Validation
3. Loan to Value Validation
4. Interest Rate Validation
5. Duplicate Loan Applications
6. Loan Amount Validation
7. Loan Completeness Validation


Credit Score Validation

Business Check | OwlDQ Feature | Manual vs Auto |
---|---|---|
Is the credit score a whole number | BEHAVIOR | AUTO |
Is the credit score within the valid range (300 to 850) | RULE | credit_score between 300 and 850 |
Is the credit score NULL or missing | BEHAVIOR | AUTO |
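
The three credit score checks above can be sketched outside OwlDQ as a plain Python function. This is only an illustration of the business logic, not how OwlDQ implements BEHAVIOR or RULE checks; the function name and the failure labels are hypothetical.

```python
from typing import List, Optional


def check_credit_score(value: Optional[object]) -> List[str]:
    """Return the names of the checks that one credit_score value fails.

    Hypothetical helper: the failure labels are illustrative only.
    """
    if value is None or value == "":
        return ["missing"]
    try:
        number = float(value)
    except (TypeError, ValueError):
        return ["not_a_number"]

    failures = []
    if not number.is_integer():
        failures.append("not_whole_number")
    # Same bounds as the rule "credit_score between 300 and 850" (inclusive).
    if not 300 <= number <= 850:
        failures.append("out_of_range")
    return failures


print(check_credit_score(720))    # passes every check
print(check_credit_score(712.5))  # fails the whole-number check
print(check_credit_score(None))   # fails the missing check
```

An empty list means the value passed every check, which makes it straightforward to count failures per check across a dataset.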

SSN Validation

Business Check | OwlDQ Feature | Manual vs Auto |
---|---|---|
Is the SSN validly formatted | RULE | AUTO-SSN detection |
SSN is PII | SENSITIVITY | AUTO-SSN labeled |
Is the SSN NULL or missing | BEHAVIOR | AUTO |
Does the SSN belong to the applicant | PATTERN | SSN -> first_name, last_name |
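
As a rough illustration of the format and pattern checks above, the sketch below validates the `NNN-NN-NNNN` layout and flags any SSN that appears with more than one applicant name. It is not OwlDQ's AUTO-SSN detection or PATTERN feature, whose internals are not described here; the regular expression, function names, and lowercase dictionary keys are assumptions made for this example.

```python
import re
from collections import defaultdict

# Assumed format: NNN-NN-NNNN, excluding area numbers 000, 666 and 900-999,
# group 00, and serial 0000, none of which are issued.
SSN_RE = re.compile(r"^(?!000|666|9\d\d)\d{3}-(?!00)\d{2}-(?!0000)\d{4}$")


def is_valid_ssn_format(ssn):
    """True when the value is a string in a plausible SSN format."""
    return isinstance(ssn, str) and SSN_RE.match(ssn) is not None


def ssn_name_conflicts(rows):
    """Map each SSN to its applicant names when more than one name uses it."""
    names_by_ssn = defaultdict(set)
    for row in rows:
        name = (row["first_name"].strip().lower(), row["last_name"].strip().lower())
        names_by_ssn[row["ssn"]].add(name)
    return {ssn: names for ssn, names in names_by_ssn.items() if len(names) > 1}
```

A non-empty result from `ssn_name_conflicts` points at applications where the SSN and the applicant name disagree with earlier records and deserve a manual look.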

Loan to Value Validation

Business Check | OwlDQ Feature | Manual vs Auto |
---|---|---|
Are the loan amount and asset value (home or auto) valid numbers | BEHAVIOR | AUTO |
Loan-to-value ratio must be below 95% to approve | RULE | loan / asset_value < .95 |
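
The loan-to-value rule is simple arithmetic: divide the loan amount by the asset value and compare the result with 0.95. A minimal sketch follows; the function name is hypothetical, and the decision to reject a missing or non-positive asset value is an assumption rather than documented OwlDQ behavior.

```python
def passes_ltv(loan, asset_value, max_ltv=0.95):
    """Apply the rule `loan / asset_value < .95` to one application."""
    # Assumption: a missing or non-positive asset value fails the check
    # instead of raising a division error.
    if loan is None or asset_value is None or asset_value <= 0:
        return False
    return loan / asset_value < max_ltv


print(passes_ltv(180_000, 200_000))  # ratio 0.90
print(passes_ltv(190_000, 200_000))  # ratio 0.95, not strictly below the limit
```

Because the rule uses a strict `<`, an application at exactly 95% loan to value does not pass.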

Interest Rate Validation

Business Check | OwlDQ Feature | Manual vs Auto |
---|---|---|
Interest rate is between the minimum and maximum allowable rates for the loan's credit rating | RULE COMPLEX | loan l join rates r on l.credit_rating = r.credit_rating where l.rate between r.min_rate and r.max_rate |
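
To see what the join-based rule does, the sketch below runs it against an in-memory SQLite database. The table and column names (`loan`, `rates`, `credit_rating`, `rate`, `min_rate`, `max_rate`) come from the rule text; the `loan_id` column and all sample rows are made up for illustration. The query is inverted to return the loans that break the rule, and treating a loan whose credit rating has no entry in `rates` as a violation is an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rates (credit_rating TEXT PRIMARY KEY, min_rate REAL, max_rate REAL);
CREATE TABLE loan (loan_id INTEGER PRIMARY KEY, credit_rating TEXT, rate REAL);
-- Illustrative sample data only.
INSERT INTO rates VALUES ('A', 3.0, 6.0), ('B', 5.0, 9.0);
INSERT INTO loan VALUES (1, 'A', 4.5), (2, 'A', 7.25), (3, 'B', 8.0), (4, 'C', 10.0);
""")

# Loans whose rate falls outside the allowed band for their credit rating,
# plus loans whose credit rating has no band at all.
violations = conn.execute("""
SELECT l.loan_id
FROM loan l
LEFT JOIN rates r ON l.credit_rating = r.credit_rating
WHERE r.credit_rating IS NULL
   OR l.rate NOT BETWEEN r.min_rate AND r.max_rate
ORDER BY l.loan_id
""").fetchall()

print(violations)  # [(2,), (4,)]
```

Loan 2 is priced above the band for rating A, and loan 4 carries a rating that has no band in `rates`.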

Duplicate Loan Applications

Can't give someone the same loan twice!

Business Check | OwlDQ Feature | Manual vs Auto |
---|---|---|
Ensure we don't issue the same loan twice | DUPE | first_n, last_n, SSN, Address |
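
Duplicate applications rarely match character for character, so a duplicate check usually scores similarity rather than testing equality. The sketch below uses `difflib.SequenceMatcher` from the Python standard library as a stand-in technique; it is not OwlDQ's DUPE algorithm. The lowercase field names, the 0 to 100 score scale, and the cutoff of 90 are assumptions chosen for this example.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Fields compared, following the DUPE row above (lowercased for this sketch).
FIELDS = ("first_n", "last_n", "ssn", "address")


def similarity(a, b):
    """Score two applications from 0 to 100 on their concatenated fields."""
    key_a = "|".join(str(a[f]).strip().lower() for f in FIELDS)
    key_b = "|".join(str(b[f]).strip().lower() for f in FIELDS)
    return 100 * SequenceMatcher(None, key_a, key_b).ratio()


def find_duplicates(rows, cutoff=90.0):
    """Return index pairs of applications scoring at or above the cutoff."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(rows), 2)
        if similarity(a, b) >= cutoff
    ]


applications = [
    {"first_n": "Jon", "last_n": "Smith", "ssn": "123-45-6789", "address": "12 Main St"},
    {"first_n": "John", "last_n": "Smith", "ssn": "123-45-6789", "address": "12 Main St."},
    {"first_n": "Maria", "last_n": "Lopez", "ssn": "987-65-4321", "address": "400 Oak Avenue"},
]
print(find_duplicates(applications))  # [(0, 1)]
```

Comparing every pair is quadratic in the number of rows, so this approach only suits small samples; a production check would first group candidates by a blocking key such as SSN.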

Loan Amount Validation

Business Check | OwlDQ Feature | Manual vs Auto |
---|---|---|
Loan amount is not unusual compared with other loans | OUTLIER | AUTO |
Loan amount within lendable range (only lend between 50K and 3M) | RULE | loan_amount between 50000 and 3000000 |
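
The two loan amount checks catch different problems: the rule enforces fixed lending limits, while outlier detection flags amounts that are unusual relative to the rest of the data. The sketch below contrasts them, using a simple interquartile-range test as a stand-in for outlier detection. OwlDQ's OUTLIER feature is not described in detail here, so the IQR method, the function names, and the sample amounts are all assumptions.

```python
import statistics


def out_of_range(amounts, low=50_000, high=3_000_000):
    """Amounts violating the rule `loan_amount between 50000 and 3000000`."""
    return [a for a in amounts if not low <= a <= high]


def iqr_outliers(amounts, k=1.5):
    """Amounts more than k interquartile ranges outside the middle 50%."""
    q1, _, q3 = statistics.quantiles(amounts, n=4)
    spread = q3 - q1
    return [a for a in amounts if a < q1 - k * spread or a > q3 + k * spread]


# Illustrative sample data only.
amounts = [120_000, 150_000, 175_000, 200_000, 210_000, 250_000, 2_900_000, 40_000]
print(out_of_range(amounts))  # [40000]
print(iqr_outliers(amounts))  # [2900000]
```

In this sample the 40,000 loan breaks the lending limits without being a statistical outlier, while the 2,900,000 loan is within the limits but far from every other amount, which is why the two checks are useful together.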

```
-lib "/home/opt/owl/drivers/postgres" \
-cxn postgres-gcp \
-q "select * from public.loan_risk_grade where last_pymnt_d = '2019-04-01'" \
-key member_id -alias loan_risk \
-ds public.loan \
-rd "2019-04-01" \
-dl -loglevel INFO \
-h 10.142.0.29:5432/owltrunk \
-numexecutors 10 -executormemory 1g -drivermemory 4g \
-master yarn -deploymode cluster \
-sparkprinc [email protected] \
-sparkkeytab /tmp/user2.keytab -tbin MONTH \
-dupe -dupeinc purpose -fpgon -fpgkey grade \
-fpginc grade,sub_grade -fpglb 365 -fpgdc last_pymnt_d \
-record member_id -dupecutoff 60 -dupepermatchupperlimit 99
```

We made use of Profiles, Duplicates, Outliers, and Rules in this example. The experiments were automatically cataloged and put on a job scheduler. The next time a loan issue arises, we will be able to take remediation action using the workflow queue. Over time, we can see how the bank loan program is running via the report section.

Attachments:
- interest_rates.csv (109B): Interest Rates CSV
- Owl Dataset (2).csv (10KB): Loan Data CSV