TL;DR: If you have BLOB
s, you need to use UNION ALL
. If you use UNION ALL
, you need to partition off a criterion and select the first row.
The other day I had to figure out how UNION
works in Oracle, because we're retrofitting an application with extra features. Assuming I have 2 tables(table_1
and table_2
) with identical structures:
"PROP_ID" NVARCHAR2(25),
"SEQ_NUM" NUMBER(*,0),
"FILE_NAME" NVARCHAR2(256),
"DESCRIPTION" NVARCHAR2(256),
"FILE_CONTENT" BLOB,
PRIMARY KEY ("PROP_ID", "SEQ_NUM")
I need to perform an UNION
to access both tables'fields:
SELECT prop_id, seq_num, description, file_name, file_content, 'W' src
FROM table_1
UNION
SELECT prop_id, seq_num, description, file_name, file_content, 'S' src
FROM table_2
Unfortunately, I've found out that you can't just do UNION
when you have BLOB
data because a simple UNION
would eliminate duplicates. So... I needed a way to fake that. The best I could come up with is:
/*
Eliminate duplicates SELECT (3)
*/
SELECT prop_id, seq_num, description, file_name, file_content,
length(file_content) AS content_length, src, RW FROM (
/*
Partition SELECT (2)
*/
SELECT prop_id, seq_num, description, file_name, file_content, src,
row_number() OVER (PARTITION BY file_name ORDER BY src ) RW FROM (
/*
Union SELECT (1)
*/
SELECT prop_id, seq_num, description, file_name, file_content, 'W' src
FROM we_proposal_attachments
UNION ALL
SELECT prop_id, seq_num, description, file_name, file_content, 'S' src
FROM we_sap_proposal_attachments
)
) WHERE RW = 1
Here's what happens:
- Using
UNION ALL
will get us duplicated rows (i.e. duplicated file names) - In Union SELECT (1) I'm adding a column src to identify the source (W from
table_1
and S fromtable_2
respectively) - In Partition SELECT (2) I'm partitioning by
file_name
(the key we're interested in) and create yet another column RW - In Eliminate duplicates SELECT (3) I'm picking up the first row of the duplicates.
The main effects are:
- we can perform an
UNION
-like statement with tables containingBLOB
fields, where our uniqueness criteria don't imply theBLOB
itself - We can prioritise results from either table. E.g. I know
table_2
has newer versiond of the file and I'd prefer having the newer version available.
Bonus
You can apply the prioritisation also with simple UNION
statements, particularly if you have odd fields like length(file_content)
which would render the duplicate removal useless:
SELECT prop_id, seq_num, description, file_name, content_length, src, RW
FROM (
SELECT prop_id, seq_num, description, file_name, content_length, src,
row_number() OVER (PARTITION BY file_name ORDER BY src) RW
FROM (
SELECT prop_id, seq_num, description, file_name,
length(file_content) AS content_length, 'W' src
FROM table_1
UNION
SELECT prop_id, seq_num, description, file_name,
length(file_content) AS content_length, 'S' src
FROM table_2
)
) WHERE RW = 1
HTH,
Member discussion: