TL;DR: Here I present a SQL example from creating the table to performing a SELECT.
Table of Contents
- Part 1 - Prerequisites
- Part 2 - Create a table and perform a SELECT
- Part 3 - User Python to insert data
- Part 4 - references
Table and XMLTYPE
Once we've seen that we have XMLTYPE available, we can proceed to create a simple table.
The table structure
My problem required for a table with the following structure:
- An identifier (key) - 20 character long string
- An article field (nullable) - XML content
- A field with citations of the article (nullable) - XML content
The SQL statement to create the table is as follows:
CREATE TABLE ARTICLES_XML_DATA (
id VARCHAR(20) NOT NULL PRIMARY KEY,
article XMLTYPE,
citedby XMLTYPE
);
This is straight-forward. However, as you can see, the table is quite dumb and it doesn't have any XSDs attached to its contents.
Adding a record
Adding a record can be done via a simple SQL insert. However you need to make sure you escape all quotes... A simple example which adds a record with the ID and CITEDBY fields is:
insert into ARTICLES_XML_DATA(ID, CITEDBY)
values (
'00000000000000000003',
'<cited-by>
<count>2</count>
<citing-doc>
<eid>00000000000000000002</eid>
</citing-doc>
<citing-doc>
<eid>00000000000000000001</eid>
</citing-doc>
</cited-by>'
);
An XML structure
Once we have added several records, we can perform queries on them. As you can see from the example above, the CITEDBY field has a very simple structure:
- A count of citations and
- A list of document IDs that refer the current document
The XML from the example above is:
<cited-by>
<count>2</count>
<citing-doc>
<eid>00000000000000000002</eid>
</citing-doc>
<citing-doc>
<eid>00000000000000000001</eid>
</citing-doc>
</cited-by>
We can therefore perform some interesting queries. For example, we can display a table containing the article ID and the number of citations for that article. The query would be:
SELECT articles.ID, citations.NUMBER
FROM ARTICLES_XML_DATA articles,
XMLTABLE(
'cited-by/count'
PASSING articles.CITEDBY
COLUMNS NUMBER VARCHAR2(30)
PATH '.'
) citations;
This requires some explanations:
- I want to query the ARTICLESXML_DATA with an alias of _articles. I will need the field ID from it.
- For every record, I want to PASS articles.CITEDBY,
- Get the content from XPATH of cited-by/count,
- Place it in a VARCHAR2(30) as a column named NUMBER and
- Refer to this result (table with a single column) as citations.
As you can see, the result looks very muck like a select on an inner join.
The result is something like:
ID NUMBER
-------------------- ------
00000000000000000001 2
Glorious!
Member discussion: