Simulate UNION with BLOBs
TL;DR: If you have BLOBs, you need to use UNION ALL. If you use UNION ALL, you need to partition off a criterion and select the first row.
The other day I had to figure out how UNION works in Oracle, because we're retrofitting an application with extra features. Assuming I have 2 tables(table_1 and table_2) with identical structures:
"PROP_ID" NVARCHAR2(25),
"SEQ_NUM" NUMBER(*,0),
"FILE_NAME" NVARCHAR2(256),
"DESCRIPTION" NVARCHAR2(256),
"FILE_CONTENT" BLOB,
PRIMARY KEY ("PROP_ID", "SEQ_NUM")
I need to perform an UNION to access both tables'fields:
SELECT prop_id, seq_num, description, file_name, file_content, 'W' src
FROM table_1
UNION
SELECT prop_id, seq_num, description, file_name, file_content, 'S' src
FROM table_2
Unfortunately, I've found out that you can't just do UNION when you have BLOB data because a simple UNION would eliminate duplicates. So... I needed a way to fake that. The best I could come up with is:
/*
Eliminate duplicates SELECT (3)
*/
SELECT prop_id, seq_num, description, file_name, file_content,
length(file_content) AS content_length, src, RW FROM (
/*
Partition SELECT (2)
*/
SELECT prop_id, seq_num, description, file_name, file_content, src,
row_number() OVER (PARTITION BY file_name ORDER BY src ) RW FROM (
/*
Union SELECT (1)
*/
SELECT prop_id, seq_num, description, file_name, file_content, 'W' src
FROM we_proposal_attachments
UNION ALL
SELECT prop_id, seq_num, description, file_name, file_content, 'S' src
FROM we_sap_proposal_attachments
)
) WHERE RW = 1
Here's what happens:
- Using
UNION ALLwill get us duplicated rows (i.e. duplicated file names) - In Union SELECT (1) I'm adding a column src to identify the source (W from
table_1and S fromtable_2respectively) - In Partition SELECT (2) I'm partitioning by
file_name(the key we're interested in) and create yet another column RW - In Eliminate duplicates SELECT (3) I'm picking up the first row of the duplicates.
The main effects are:
- we can perform an
UNION-like statement with tables containingBLOBfields, where our uniqueness criteria don't imply theBLOBitself - We can prioritise results from either table. E.g. I know
table_2has newer versiond of the file and I'd prefer having the newer version available.
Bonus
You can apply the prioritisation also with simple UNION statements, particularly if you have odd fields like length(file_content) which would render the duplicate removal useless:
SELECT prop_id, seq_num, description, file_name, content_length, src, RW
FROM (
SELECT prop_id, seq_num, description, file_name, content_length, src,
row_number() OVER (PARTITION BY file_name ORDER BY src) RW
FROM (
SELECT prop_id, seq_num, description, file_name,
length(file_content) AS content_length, 'W' src
FROM table_1
UNION
SELECT prop_id, seq_num, description, file_name,
length(file_content) AS content_length, 'S' src
FROM table_2
)
) WHERE RW = 1
HTH,