Collibra DQ User Guide
2022.10
Search
⌃K

Bank Loans

We've moved! To improve customer experience, the Collibra Data Quality User Guide has moved to the Collibra Documentation Center as part of the Collibra Data Quality 2022.11 release. To ensure a seamless transition, dq-docs.collibra.com will remain accessible, but the DQ User Guide is now maintained exclusively in the Documentation Center.
It is common for banks to lend money in return for monthly payments with interest. However to do so a bank must make sure that the applications are valid and wellformed to begin the underwriting and approval process. Below we will apply some basic lending concepts to OwlDQ.
  1. 1.
    Credit Score Validation
  2. 2.
    SSN Validation
  3. 3.
    Loan to Value Validation
  4. 4.
    Interest Rate Validation
  5. 5.
    Duplicate Loan Applications
  6. 6.
    Loan Amount Validation
  7. 7.
    Loan Completeness Validation

1. Credit Score

Business Check
OwlDQ Feature
Manual vs Auto
Is the credit score a whole number
BEHAVIOR
AUTO
Is the credit score within a valid range
(between 300 - 850)
RULE
credit_score between 300 and 850
Is the credit score NULL or Missing
BEHAVIOR
AUTO

2. SSN Validation

Business Check
OwlDQ Feature
Text
Is a valid formatted SSN
RULE
AUTO-SSN detection
SSN is PII
SENSITIVITY
AUTO-SSN labeled
Is the SSN NULL or Missing
BEHAVIOR
AUTO
Does the SSN belong to the Applicant
PATTERN
SSN -> first_name, last_name

3. Loan to Value

Business Check
OwlDQ Feature
Text
Is Loan amount and
asset value (home or auto) valid numbers
BEHAVIOR
AUTO
95% loan to value ratio to approve
RULE
loan / asset_value < .95

4. Interest Rate

Business Check
OwlDQ Feature
Text
Interest rate between
min and max allowable range
for the loans credit rating.
RULE COMPLEX
loan l join rates r on l.credit_rating = r.credit_rating
where l.rate between r.min_rate and r.max_rate

5. Duplicate Loan Applications

Can't give someone the same loan twice!
Business Check
OwlDQ Feature
Manual vs Auto
Ensure we don't issue the same loan twice
DUPE
first_n, last_n, SSN, Address

6. Loan Amount

Business Check
OwlDQ Feature
Manual vs Auto
Loan Amount within lendable range
OUTLIER
AUTO
Loan Amount within lendable range
only lend money between 50K and 3M
RULE
loan_amount between 50000 and 3000000

Resulting OwlCheck

-lib "/home/opt/owl/drivers/postgres" \
-cxn postgres-gcp \
-q "select * from public.loan_risk_grade where last_pymnt_d = '2019-04-01'" \
-key member_id -alias loan_risk \
-ds public.loan \
-rd "2019-04-01" \
-dl -loglevel INFO \
-h 10.142.0.29:5432/owltrunk \
-numexecutors 10 -executormemory 1g -drivermemory 4g \
-master yarn -deploymode cluster \
-sparkprinc [email protected] \
-sparkkeytab /tmp/user2.keytab -tbin MONTH \
-dupe -dupeinc purpose -fpgon -fpgkey grade \
-fpginc grade,sub_grade -fpglb 365 -fpgdc last_pymnt_d \
-record member_id -dupecutoff 60 -dupepermatchupperlimit 99

Which components did we use?

We made use of Profiles, Duplicates, Outliers and Rules in this example. The experiments were automatically cataloged and put on a job scheduler. The next time a loan issue arises we will be able to take remediation action using the workflow Q. Over time we can see how the bank loan program is running via the report section.

Files that can be used to replicate this example

interest_rates.csv
109B
Binary
Interest Rates CSV
Owl Dataset (2).csv
10KB
Binary
Loan Data CSV