2021.10
Collibra DIC Integration
Powered By GitBook
Export and Import API
Promoting and moving datasets across environments
The database needs the stored procedure (function) defined in order to use the Export/Import API.

Step 1 - Get-Exports

Best practice is to use the get-exports endpoint for most scenarios. You can pass in several dataset names and several tables at once. This endpoint will create a JSON payload
The three most common tables are:
    owl_rule
    job_schedule
    owl_check_repo
1
http://<url>/v2/get-exports?dataset=public.dataset_scan_2,public.dataset_scan_1&schema=public&tables=owl_rule,job_schedule,owl_check_repo
Copied!

Use Swagger to build this for you

This is located under controller-scala (internal API)

Click Try it out to input the details

Step 2 - Run-Import

You will want to perform a find/replace on the import payload to check for differences in connections, agents, spark and environment configurations. Migrating to different environments typically requires the payload to be modified.
Run import on the desired environment, passing the output of the previous statement to the body of the request
1
http://<url>/v2/run-import
Copied!

Use Swagger to try it out

This is under controller-catalog
This would be the body of the POST.

Stored Procedure

The following function needs to be created in the Owl metastore before this can run.
1
CREATE OR REPLACE FUNCTION public.dump(p_schema text, p_table text, p_where text)
2
RETURNS SETOF text
3
LANGUAGE plpgsql
4
AS $function$
5
DECLARE
6
dumpquery_0 text;
7
dumpquery_1 text;
8
selquery text;
9
selvalue text;
10
valrec record;
11
colrec record;
12
BEGIN
13
14
-- ------ --
15
-- GLOBAL --
16
-- build base INSERT
17
-- build SELECT array[ ... ]
18
dumpquery_0 := 'INSERT INTO ' || quote_ident(p_schema) || '.' || quote_ident(p_table) || '(';
19
selquery := 'SELECT array[';
20
21
<<label0>>
22
FOR colrec IN SELECT table_schema, table_name, column_name, data_type
23
FROM information_schema.columns
24
WHERE table_name = p_table and table_schema = p_schema
25
ORDER BY ordinal_position
26
LOOP
27
dumpquery_0 := dumpquery_0 || quote_ident(colrec.column_name) || ',';
28
selquery := selquery || 'CAST(' || quote_ident(colrec.column_name) || ' AS TEXT),';
29
END LOOP label0;
30
31
dumpquery_0 := substring(dumpquery_0 ,1,length(dumpquery_0)-1) || ')';
32
dumpquery_0 := dumpquery_0 || ' VALUES (';
33
selquery := substring(selquery ,1,length(selquery)-1) || '] AS MYARRAY';
34
selquery := selquery || ' FROM ' ||quote_ident(p_schema)||'.'||quote_ident(p_table);
35
selquery := selquery || ' WHERE '||p_where;
36
-- GLOBAL --
37
-- ------ --
38
39
-- ----------- --
40
-- SELECT LOOP --
41
-- execute SELECT built and loop on each row
42
<<label1>>
43
FOR valrec IN EXECUTE selquery
44
LOOP
45
dumpquery_1 := '';
46
IF not found THEN
47
EXIT ;
48
END IF;
49
50
-- ----------- --
51
-- LOOP ARRAY (EACH FIELDS) --
52
<<label2>>
53
FOREACH selvalue in ARRAY valrec.MYARRAY
54
LOOP
55
IF selvalue IS NULL
56
THEN selvalue := 'NULL';
57
ELSE selvalue := quote_literal(selvalue);
58
END IF;
59
dumpquery_1 := dumpquery_1 || selvalue || ',';
60
END LOOP label2;
61
dumpquery_1 := substring(dumpquery_1 ,1,length(dumpquery_1)-1) || ');';
62
-- LOOP ARRAY (EACH FIELD) --
63
-- ----------- --
64
65
-- debug: RETURN NEXT dumpquery_0 || dumpquery_1 || ' --' || selquery;
66
-- debug: RETURN NEXT selquery;
67
RETURN NEXT dumpquery_0 || dumpquery_1;
68
69
END LOOP label1 ;
70
-- SELECT LOOP --
71
-- ----------- --
72
73
RETURN ;
74
END
75
$function$
76
;
Copied!
This assignment needs added.
1
alter function dump(text, text, text) owner to ownername;
Copied!
Last modified 14d ago