-
Notifications
You must be signed in to change notification settings - Fork 0
Using XML in Oracle DB for Google Books
I demonstrate how to upload and extract information from an XML document that was exported from a Google Books bookshelf. Some data qaulity issues were detected with this bookshelf export. The final data model is a minor change from the model provided by Google Books: individual contributors are split out into a separate table; and the ISBN is used to identify books (where available).
- About Google Books
- Download Bookshelf
- Analysis of XML Document
- Load Bookshelf Into XE Database
- Examine Bookshelf Data
- Logical Database Model of DB_BOOKS Schema
- Create Materialised Views
- References
Google Books is a free product from Google that, according to their about page, allows you to:
- Search: Book Search works just like web search. Try a search on Google Books or on Google.com. When we find a book with content that contains a match for your search terms, we'll link to it in your search results.
- Browse books online: If the book is out of copyright, or the publisher has given us permission, you'll be able to see a preview of the book, and in some cases the entire text. If it's in the public domain, you're free to download a PDF copy.
- Buy books or borrow from the library: If you find a book you like, click on the "Buy this book" and "Borrow this book" links to see where you can buy or borrow the print book. You can now also buy the ebook from the Google Play Store.
- Learn more fast: We've created reference pages for every book so you can quickly find all kinds of relevant information: book reviews, web references, maps and more.
However, there is another feature that is not mentioned in the about page: a private library. Once you signs in with a Google account, you can access your library.
Within your private library, you can create multiple bookshelves. A bookshelf is created through the New Shelf button on the main page to Your Library (see below).
For example, I have created several bookshelves, one of which is called Database
. Into that bookshelf, I had added references to various books about databases. References to books can be added either in bulk through a list of ISBNs, or through searching within Google Books. See screenshot below for location of tool to Add by ISBN or ISSN within the Database bookshelf:
I exported my Database bookshelf from Google Books (using the Export as XML tool from the same menu to Add by ISBN or ISSN as shown above) into Database.xml
, which I renamed to Database_20240309.xml
. The first few lines of this file are (some details are blurred):
<?xml version="1.0" encoding="UTF-8" ?>
<library>
<list_title>Database</list_title>
<blurb></blurb>
<url>https://books.google.com.au/books?uid=*******************&as_coll=****</url>
<books>
<book>
<id>G9AJA91PL54C</id>
<url>https://books.google.com.au/books?id=G9AJA91PL54C</url>
<title>Oracle Core: Essential Internals for DBAs and Developers</title>
<contributor>Jonathan Lewis</contributor>
<identifier>
<type>ISBN</type>
<value>1430239549</value>
</identifier>
<labels />
</book>
<book>
<id>45fR_OeonfMC</id>
<url>https://books.google.com.au/books?id=45fR_OeonfMC</url>
<title>Effective Oracle by Design</title>
<contributor>Thomas Kyte</contributor>
<identifier>
<type>ISBN</type>
<value>0072230657</value>
</identifier>
<labels>
<label>Oracle</label>
<label>Database</label>
</labels>
</book>
...
</book>
</books>
</library>
Note that there is no schema published for this downloaded file.
Note I append the ISO date to the file name to keep versions of the exported Database bookshelf separate.
I examine the XML document to determine its overall size and structure. I devise a bare bones logical ER diagram.
The size of this file is determined using the following command (an alternative is to use the ls -l
command):
wc --bytes Database_20240309.xml
The result is, with the size of the file in bytes as the first argument:
11895 Database_20240309.xml
To see what tags are used in the XML document, I ran the following code to create a Markdown table skeleton:
sed --quiet \
--regexp-extended \
--expression='s!\s*<([^</> ]+).*!\1!p' Database_20240309.xml | \
sort | \
uniq --count | \
awk '{ print $2, "|", $1, "|" }'
The results are presented as follows:
Tag | Num Occurrences | Comment |
---|---|---|
blurb | 1 | Mandatory attribute of the root element, <library> . |
book | 29 | Element that describes a book. |
books | 1 | Mandatory attribute of the root element, <library> . It is a sequence of one or more <book> elements. |
contributor | 29 | This is a mandatory attribute of <book> , and contains a comma-separated list of authors. |
id | 29 | This is the primary key for the <book> element. |
identifier | 25 | This is an optional structured type consisting of a pair of <type> and <value> tags. |
label | 38 | Classification of the book. A book can have zero or more such labels. |
labels | 29 | Mandatory element of the <book> element, and contains a sequence of zero or more <label> elements. |
library | 1 | Root element of entire XML document. |
list_title | 1 | Mandatory attribute of root element, <library> . It is the name of the bookshelf that is displayed by Google Books. |
title | 29 | Mandatory attribute of <book> element, and is the title of the book. |
type | 25 | Part of the <identifier> element. Seems to always to be 'ISBN'. Not all books have an ISBN for some reason. |
url | 30 | Tag is used in both the <library> and <book> elements. This tag is mandatory for both elements. |
value | 25 | Part of the <identifier> element. It is the ISBN of the book. Not all books have an ISBN for some reason. |
There are 29 books currently in this bookshelf. I used this frequency and the frequency of each tags to classify the above tags, after inspection of the XML document, as follows:
- A frequency of one (1) indicates that the tag belongs to the root element,
<library>
. These are identified as:blurb
books
list_title
- A frequency of 29 indicates that the tag is a mandatory attribute of the
<book>
element. These are identified as:contributor
id
labels
title
- A frequency of 30 indicates that the tag is a mandatory attribute of book the
<book>
and<library>
elements. This is identified as:url
- A frequency less than 29 indicates a possible optional tag for the
<book>
element. These are identified as:identifier
type
value
- Other tags which have a frequency gather than 30. This is identified as:
label
I used the following commands to confirm that all book identifiers are ISBNs. The use of the [hxselect](https://www.w3.org/Tools/HTML-XML-utils/man1/hxselect.html)
command allows me to query the path book//identifier//type
and return each occurrence on a separate line through the -s '\n'
parameter:
hxselect -s '\n' book identifier type < /opt/oracle/xml/Database_20240309.xml | \
sort | \
uniq --count
The result is:
25 <type>ISBN</type>
This means that are four (4) books that do not have an ISBN. This highlights another data quality issue as every book should have an ISBN.
From this distribution of extracted tags, I infer that the logical model for this Google Books bookshelf is:
In other words, a library consists of three (3) mandatory attributes (blurb, list title, and URL). Within a library, there are one or more books. Each book has four (4) mandatory attributes (contributpr, ID, title, and URL), and two (2) optional attributes (a structured type called 'identifier' and a list of labels). The structured type, 'identifier', descibes how the book is identified. Usually, this is an ISBN. Each label is a single word.
I created a user and database directory for demonstrating the load and access to an XML document.
I created a user using the following SQL using a user with DBA access:
-- USER SQL
CREATE USER "DB_BOOKS" IDENTIFIED BY "&pw."
DEFAULT TABLESPACE "USER"
TEMPORARY TABLESPACE "TEMP";
-- QUOTAS
ALTER USER "DB_BOOKS" QUOTA UNLIMITED ON "DB_BOOKS";
-- SYSTEM PRIVILEGES
GRANT CREATE MATERIALIZED VIEW TO "DB_BOOKS" ;
GRANT CREATE VIEW TO "DB_BOOKS" ;
GRANT CREATE SESSION TO "DB_BOOKS" ;
GRANT CREATE TABLE TO "DB_BOOKS" ;
I created a directory for use by SQL*Loader to hold the source file and SQL*Loader logs.
mkdir --parents /opt/oracle/xml
I used the following SQL to register this directory to the XE database, and to give the user, BOOKS
, the necessary privileges to use this directory with SQL*Loader.
CREATE OR REPLACE DIRECTORY xml_dir AS '/opt/oracle/xml';
GRANT READ, WRITE ON DIRECTORY xml_dir TO "DB_BOOKS";
The WRITE
privilege on the directory is needed because the log files for SQL*Loader (aka ORACLE_LOADER
) are written to the same directory, by default.
The downloaded bookself was copied to the /opt/oracle/xml
directory.
I used the following SQL as the DB_BOOKS
database user to create an external table, called DATABASE_BOOKSHELF_EXT
, that loads the XML from the file /opt/oracle/xml/Database_20240309.xml
starting at the <books>
tag:
CREATE TABLE "DB_BOOKS"."DATABASE_BOOKSHELF_EXT"
("OBJECT_VALUE" CLOB
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY "XML_DIR"
ACCESS PARAMETERS
( records
xmltag ("books")
FIELDS NOTRIM MISSING FIELD VALUES ARE NULL (
object_value char(1000000)
)
)
LOCATION
( 'Database_20240309.xml' )
)
REJECT LIMIT UNLIMITED ;
The directory, XML_DIR
, points to /opt/oracle/xml
, as described above in Create Directory.
I am extracting the XML document starting at the <books>
tag instead of the real root element, <library>
, as I am only instead in the books, rather than the whole bookshelf. This is done through the use of the [XMLTAG](https://docs.oracle.com/en/database/oracle/oracle-database/21/sutil/oracle_loader-access-driver.html#GUID-04134666-4940-484E-BD7C-0E5BAF057DB9)
parameter.
The specified size of 1,000,000 characters for the OBJECT_VALUE
column is more than enough to load the XML document. (Size of XML Documents says that the size is under 13,000 characters.)
The LOCATION
parameter specifies the name of the XML document to be loaded from the /opt/oracle/xml
directory.
I used the following SQL to confirm that the XML is accessible and that the loaded XML document has the expected root element.
SELECT
xmltype(e.object_value).getrootelement()
FROM
DATABASE_BOOKSHELF_EXT e;
Note: I had to cast the CLOB column, OBJECT_VALUE
, to XMLTYPE
in order to use the XML function, GETROOTELEMENT
.
The result is:
XMLTYPE(E.OBJECT_VALUE).GETROOTELEMENT()
----------------------------------------
books
To see what data had been loaded, I ran the following SQL:
SELECT
b.*
FROM
database_bookshelf_ext e,
XMLTABLE ( '/books/book'
PASSING xmltype(e.object_value)
COLUMNS
"ID" VARCHAR2(16) PATH 'id',
url VARCHAR2(512) PATH 'url',
title VARCHAR2(1024) PATH 'title',
contributer VARCHAR2(1024) PATH 'contributor',
isbn VARCHAR2(16) PATH 'identifier/value'
) b;
The first five (5) rows are:
ID | URL | TITLE | CONTRIBUTER | ISBN |
---|---|---|---|---|
G9AJA91PL54C | https://books.google.com.au/books?id=G9AJA91PL54C |
Oracle Core: Essential Internals for DBAs and Developers | Jonathan Lewis | 1430239549 |
45fR_OeonfMC | https://books.google.com.au/books?id=45fR_OeonfMC |
Effective Oracle by Design | Thomas Kyte | 0072230657 |
1Aqw6mrxz5AC | https://books.google.com.au/books?id=1Aqw6mrxz5AC |
Expert Oracle Exadata | Kerry Osborne, Randy Johnson, Tanel Põder | 1430233923 |
lA3QygAACAAJ | https://books.google.com.au/books?id=lA3QygAACAAJ |
Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement | Eric Redmond, Jim R. Wilson | 1934356921 |
-0QPlh0stXAC | https://books.google.com.au/books?id=-0QPlh0stXAC |
Data Architecture: From Zen to Reality | Charles Tupper | 0123851270 |
When I try:
SELECT
l.*
FROM
database_bookshelf_ext e,
XMLTABLE ( '/books/book'
PASSING xmltype(e.object_value)
COLUMNS
"ID" VARCHAR2(16) PATH 'id',
label VARCHAR2(16) PATH 'labels/label'
) l;
I get the following error message:
2 3 4 5 6 7 8 9 10 ERROR:
ORA-19279: XPTY0004 - XQuery dynamic type mismatch: expected singleton sequence
- got multi-item sequence
no rows selected
There is a hit from Oracle Support matching this error message: ORA-19279 XQuery dynamic type mismatch expected singleton sequence - got multi-item sequence (Doc ID 1505158.1). Based on the example provided in that support note:
SELECT
l.*
FROM
database_bookshelf_ext e,
XMLTABLE (
'for $i in /books/book/labels/label
return <row>
{
$i/../../id,
$i
}
</row>'
PASSING xmltype(e.object_value)
COLUMNS
"ID" VARCHAR2(16) PATH 'id',
label VARCHAR2(16) PATH 'label'
) l;
This SQL takes each row from the external table (DATABASE_BOOKSHELF_EXT
) and converts the CLOB in the OBJECT_VALUE
into XML using the XMLTABLE
function. For each occurrence on the LABEL
tag that is found by the XPATH /books/book/labels/label
, I get the contents of the LABEL
tag (which is pointed to by the $i
loop variable), then I go up two (2) levels (to XPATH /books/book
) and get the contents of the ID
tag. These two (2) values are combined into a row for the XMLTABLE
function.
The first few rows returned are:
ID LABEL
---------------- --------------------------------
45fR_OeonfMC Oracle
45fR_OeonfMC Database
suhSAAAAMAAJ Database
suhSAAAAMAAJ Security
OhNBJ8_wMLwC Oracle
OhNBJ8_wMLwC Database
OhNBJ8_wMLwC Security
2egNzTk871wC Database
2egNzTk871wC Relational Theory
2egNzTk871wC Logic
2ImPFP6Yk64C Database
Following on from my success in extracting the labels from the uploaded XML document, I tried the following SQL to extract individual contributors:
SELECT
c.*
FROM
database_bookshelf_ext e,
XMLTABLE (
'for $b in /books/book/contributor[text()]
let $list := ora:tokenize($b, ","),
$cnt := count($list)
for $c at $r in $list where $r < $cnt
return $c'
PASSING xmltype(e.object_value)
COLUMNS
contributor VARCHAR2(512) PATH '.'
) c;
This query failed with:
SELECT
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-19176: FORX0003: regular expression matches zero-length string
The problem is caused by books with a single contributor. That is, there is no comma for contributor elements as shown by the following command:
grep --extended-regexp '<contributor>[^,]+</contributor>' /opt/oracle/xml/Database_20240309.xml
The contributor elements with only one (1) person are:
<contributor>Jonathan Lewis</contributor>
<contributor>Thomas Kyte</contributor>
<contributor>Charles Tupper</contributor>
<contributor>Fabian Pascal</contributor>
<contributor>S.M. Deen</contributor>
<contributor>John G. Hughes</contributor>
<contributor>Clare Churcher</contributor>
<contributor>C.J. Date</contributor>
<contributor>David Litchfield</contributor>
<contributor>C. J. Date</contributor>
<contributor>Edward Whalen</contributor>
<contributor>Ron Ben-Natan</contributor>
<contributor>James Luetkehoelter</contributor>
<contributor>Christopher Date</contributor>
<contributor>C.J. Date</contributor>
<contributor>Dan Tow</contributor>
<contributor>Sajal Dam</contributor>
<contributor>Sam R. Alapati</contributor>
<contributor>Thomas Kyte</contributor>
Note: There are two (2) C.J. Dates (the other is 'C. J. Date'). There are some data quality issues with Google Books.
My XQUERY knowledge is too limited to get around this issue. All of the examples that I have encountered so far getting around this issue by appending a comma to the contributor element text, such as shown by Splitting Strings using XMLTable and FLWOR expressions. I consider this to be a kludge. I will have to do a two-step process:
- Extract the text of the contributor element into a SQL view
- Use SQL to extract the individual contributors.
According to Chris Saxon in his post, How to split comma separated value strings into rows in Oracle Database, the secret is to use regexp_substr
looping through a CONNECT BY
. However, this solution only works for a single row table.
Based on the example from 13.5.3 Pipelined Table Functions as Transformation Functions
First, I create the package definition. A package helps to hide the three (3) types used as these are used only within the package:
-
CONTRIBUTOR_LIST_TYPE
for the input row. -
CONTRIBUTOR_TYPE
for the output row. -
CONTRIBUTOR_TAB
for the return value of the function. That is, it has to be a table of the output row type.
I use a weak cursor, SYS_REFCURSOR
. (See 14.18 Cursor Variable Declaration for more details.) With a weak cursor, there is no checking that of columns by the RDBMS engine. Since I have complete control in this schema, I am willing to accept the risk of suppressing type checking for cursors.
CREATE OR REPLACE PACKAGE contributor_pkg AUTHID current_user IS
TYPE contributor_list_type IS RECORD (
"ID" VARCHAR2(32),
contributor_list VARCHAR2(1024)
);
TYPE contributor_type IS RECORD (
"ID" VARCHAR2(32),
contributor VARCHAR2(64)
);
TYPE contributor_tab IS
TABLE OF contributor_type;
FUNCTION get_contributor (
p_csr sys_refcursor
) RETURN contributor_tab
PIPELINED;
END contributor_pkg;
/
The actual implementation that uses Chris Saxon's idea but within a PL/SQL FOR loop:
CREATE OR REPLACE PACKAGE BODY contributor_pkg IS
FUNCTION get_contributor (
p_csr SYS_REFCURSOR
) RETURN contributor_tab
PIPELINED
IS
out_rec contributor_type;
in_rec contributor_list_type;
l_num_contributor NUMBER := 0;
l_idx_contributor NUMBER := 0;
BEGIN
LOOP
FETCH p_csr INTO in_rec; -- input row
EXIT WHEN p_csr%notfound;
out_rec.id := in_rec.id;
l_num_contributor := regexp_count(in_rec.contributor_list, '[^,]+');
FOR l_idx_contributor IN 1..l_num_contributor LOOP
out_rec.contributor := ltrim(regexp_substr(in_rec.contributor_list, '[^,]+', 1, l_idx_contributor));
PIPE ROW ( out_rec );
END LOOP;
END LOOP;
CLOSE p_csr;
RETURN;
END get_contributor;
END contributor_pkg;
/
NOTE: The LTRIM
function is needed to remove spaces after commas.
For each CONTRIBUTOR
tag encountered, I use the REGEXP_SUBSTR
to extract the nth occurrence of the required regular expression (which is a string of at least one (1) character that is not a comma, i.e. '[^,]+'
). The REGEXP_COUNT
gives the number of such substrings, and is used to drive the FOR
loop.
For each substring that I find, I create a row and pass that to the calling procedure or SQL statement using the PIPE ROW
statement.
This worked as was verified by the following SQL. This SQL has three (3) steps:
- Create a data source using the CURSOR expression on the innermost SQL statement that extracts
ID
andCONTRIBUTOR
elements from eachBOOK
element of the uploaded XML document. - Transforms this data source using the
contributor_pkg.get_contributor
function by splitting individual contributors from the comma-separated list in theCONTRIBUTOR
element. - Converts the output from the transformation function back into an SQL table using the TABLE collection expression.
SELECT
*
FROM
TABLE ( contributor_pkg.get_contributor(CURSOR(
SELECT
c.*
FROM
database_bookshelf_ext e, XMLTABLE('/books/book'
PASSING xmltype(e.object_value)
COLUMNS
"ID" VARCHAR2(32) PATH 'id',
contributors VARCHAR2(1024) PATH 'contributor'
) c
)) );
The results are:
ID CONTRIBUTOR
G9AJA91PL54C Jonathan Lewis
45fR_OeonfMC Thomas Kyte
1Aqw6mrxz5AC Kerry Osborne
1Aqw6mrxz5AC Randy Johnson
1Aqw6mrxz5AC Tanel Põder
lA3QygAACAAJ Eric Redmond
lA3QygAACAAJ Jim R. Wilson
-0QPlh0stXAC Charles Tupper
t9ZQAAAAMAAJ Fabian Pascal
2dsUS-64JmIC S.M. Deen
g5lLCV2mCnwC John G. Hughes
r1OpkDVDDmwC Clare Churcher
grTubz0fjSEC C.J. Date
grTubz0fjSEC Hugh Darwen
grTubz0fjSEC Nikos A. Lorentzos
grTubz0fjSEC Nikos Lorentzos
FU7uuHc3oNcC C.J. Date
X85QAAAAMAAJ C. J. Date
X85QAAAAMAAJ Hugh Darwen
rdBQAAAAMAAJ C. J. Date
rdBQAAAAMAAJ Hugh Darwen
suhSAAAAMAAJ David Litchfield
suhSAAAAMAAJ Chris Anley
suhSAAAAMAAJ John Heasman
suhSAAAAMAAJ Bill Grindlay
OhNBJ8_wMLwC David Litchfield
2egNzTk871wC C. J. Date
2ImPFP6Yk64C Edward Whalen
-uYZcnEt4uQC Lex deHaan
-uYZcnEt4uQC Toon Koppelaars
TD5eibhYyd8C Ron Ben-Natan
v43VNrxWo70C James Luetkehoelter
y_eVBB5qdwMC Christopher Date
406_pJtiJ6sC C.J. Date
dOYPtfVqS_MC Dan Tow
yK9wRfYX9nkC Sajal Dam
RRsHc7iXcyAC Sam R. Alapati
GjYTbJTIr54C Steve Shaw
GjYTbJTIr54C Julian Dyke
EmvWP1vgTsYC Rick Greenwald
EmvWP1vgTsYC Robert Stackowiak
EmvWP1vgTsYC Jonathan Stern
3aEIIqRHkY8C Matthew Hart
3aEIIqRHkY8C Robert G. Freeman
TmPoYfpeJAUC Thomas Kyte
To complete the transformation of the uploaded extract from the Google Books bookshelf, I want to create three (3) materialised views:
BOOKS
BOOK_LABELS
BOOK_CONTRIBUTORS
Another advantage of using materialised views is that SQL*Loader is invoked every time the external table is accessed. This makes sense as the Oracle RDBMS engine has no control over external files, and has to assume that the contents of the file can be changed since the last access, thereby invalidating any possible caching. With materialised views, I can have a consistent snapshot of the loaded data.
I used the following SQL to create a materialised view, called BOOKS
, with four (4) columns:
-
ID
for that assigned by the Google Books to uniquely identify an entry -
TITLE
for the book title -
URL
for the Google Books page about the book -
ISBN
of the book
CREATE MATERIALIZED VIEW books AS
SELECT
b.ID,
b.title,
b.url,
b.isbn
FROM
database_bookshelf_ext e,
XMLTABLE ( '/books/book'
PASSING xmltype(e.object_value)
COLUMNS
"ID" VARCHAR2(16) PATH 'id',
url VARCHAR2(512) PATH 'url',
title VARCHAR2(1024) PATH 'title',
contributer VARCHAR2(1024) PATH 'contributor',
isbn VARCHAR2(16) PATH 'identifier/value'
) b;
I ran the following SQL to find books without an ISBN:
SELECT title FROM books WHERE isbn IS NULL;
There are five (5) books that do not have an ISBN:
TITLE
--------------------------------------------------------------------------------
Practical Issues in Database Management: A Reference for the Thinking Practitioner
Databases, Types and the Relational Model: The Third Manifesto
Foundation for Object/relational Databases: The Third Manifesto : a Detailed Study of the Impact of Objects and Type Theory on the Relational Model of Data Including a Comprehensive Proposal for Type Inheritance
The Database Hacker's Handbook: Defending Database Servers
This is strange as the book, Practical Issues in Database Management: A Reference for the Thinking Practitioner, has the following ISBNs:
- 9780201485554
- 0201485559
Just another data quality issue from Google Books!
I used the following SQL to create the materialised view, BOOK_LABELS
, with the columns:
-
ID
for that assigned by the Google Books to uniquely identify an entry -
LABEL
describing a book
CREATE MATERIALIZED VIEW book_labels AS
SELECT
l.ID,
l.label
FROM
database_bookshelf_ext e,
XMLTABLE (
'for $i in /books/book/labels/label
return <row>
{
$i/../../id,
$i
}
</row>'
PASSING xmltype(e.object_value)
COLUMNS
"ID" VARCHAR2(16) PATH 'id',
label VARCHAR2(32) PATH 'label'
) l;
I used the following SQL to validate the materialised view, BOOK_LABELS
, by getting the frequency distribution for each label:
SELECT
label,
count(*) AS freq
FROM
book_labels
GROUP BY
label
ORDER BY
label;
There are nine (9) different labels with the following frequencies (FREQ
):
LABEL FREQ
-------------------------------- ----------
Database 15
Exam Guide 1
Linux 2
Logic 1
Oracle 9
Relational Theory 3
SQL 2
SQL Server 2
Security 3
9 rows selected.
I used the following SQL to create the materialised view, BOOK_LABELS
, with the columns:
-
ID
for that assigned by the Google Books to uniquely identify an entry -
CONTRIBUTOR
for each author of a book
CREATE MATERIALIZED VIEW book_contributors AS
SELECT
"ID",
contributor
FROM
TABLE (
contributor_pkg.get_contributor(
CURSOR(
SELECT
c.*
FROM
database_bookshelf_ext e,
XMLTABLE(
'/books/book'
PASSING xmltype(e.object_value)
COLUMNS
"ID" VARCHAR2(32) PATH 'id',
contributors VARCHAR2(1024) PATH 'contributor'
) c
)
)
);
I verified the materialised view by running the following SQL:
COLUMN contributor FOR A30
SELECT "ID", contributor FROM book_contributors
The first few rows returned are:
ID CONTRIBUTOR
------------------ --------------
G9AJA91PL54C Jonathan Lewis
45fR_OeonfMC Thomas Kyte
1Aqw6mrxz5AC Kerry Osborne
1Aqw6mrxz5AC Randy Johnson
1Aqw6mrxz5AC Tanel P?der
lA3QygAACAAJ Eric Redmond
lA3QygAACAAJ Jim R. Wilson
-0QPlh0stXAC Charles Tupper
t9ZQAAAAMAAJ Fabian Pascal
2dsUS-64JmIC S.M. Deen
g5lLCV2mCnwC John G. Hughes
With the completion of materialised views, I composed a new logical database model for the DB_BOOKS
schema:
There are three (3) entities:
-
BOOKS
with attributes:-
ID
is primary key -
TITLE
is mandatory -
URL
is mandatory
-
-
BOOKS LABELS
with attributes:-
ID
is mandatory and refers toBOOKS.ID
-
LABEL
is mandatory -
ID
andLABEL
form the primary key
-
-
BOOK CONTRIBUTORS
-
ID
is mandatory and refers toBOOKS.ID
-
CONTRIBUTOR
is mandatory -
ID
andCONTRIBUTOR
form the primary key
-
- Database PL/SQL Language Reference 21c
- PL/SQL Packages and Types Reference 21c
- SQL Language Reference 21c
- 18 SQL Statements: DROP TABLE to LOCK TABLE
- 19 SQL Statements: MERGE to UPDATE
- Utilities 21c