Querying External Data with Autonomous Database on Dedicated Exadata Infrastructure
Describes packages and tools to query and validate data with Autonomous Database on Dedicated Exadata Infrastructure.
External data is not managed by the database; however, you can use DBMS_CLOUD
procedures to query your external data. Although queries on external data will not be as fast as queries on database tables, you can use this approach
to quickly start running queries on your external source files and external data.
You can use DBMS_CLOUD
procedures to validate the data in the external source
files for an external table so that you can identify problems and either correct the data in the external table or exclude invalid data before you use
the data.
If you are not using
ADMIN
user, ensure the user has the necessary privileges for the operations the user needs to
perform. See Manage Database User
Privileges for more information.
- Query External Data
To query data in files in the Cloud, you need to first store your object storage credentials in your Autonomous Database, and then create an external table using the PL/SQL procedureDBMS_CLOUD.CREATE_EXTERNAL_TABLE
. - Validate External Data
To validate an external table, you use the procedureDBMS_CLOUD.VALIDATE_EXTERNAL_TABLE
. - View Logs for Data Validation
Parent topic: Integrate
Query External Data
To query data in files in the Cloud, you need to first store your object storage credentials in your Autonomous Database, and then create an external table using the PL/SQL procedure DBMS_CLOUD.CREATE_EXTERNAL_TABLE
.
The source file in this example, channels.txt
, has the following
data:
S,Direct Sales,Direct
T,Tele Sales,Direct
C,Catalog,Indirect
I,Internet,Indirect
P,Partners,Others
Validate External Data
To validate an external table, you use the procedure DBMS_CLOUD.VALIDATE_EXTERNAL_TABLE
.
Before validating an external table you need to create the external table using
the DBMS_CLOUD.CREATE_EXTERNAL_TABLE
procedure. Then use the DBMS_CLOUD.VALIDATE_EXTERNAL_TABLE
procedure to
validate it. For
example:
BEGIN
DBMS_CLOUD.VALIDATE_EXTERNAL_TABLE
(
table_name => 'CHANNELS_EXT' );
END;
/
This procedure scans your source files and validates them using the format options specified when you created the external table.
The validate operation, by default, scans all the rows in your source files and stops when a row is rejected. If you want to validate only
a subset of the rows, use the rowcount
parameter. When the rowcount
parameter is set, the validate operation scans
rows and stops either when a row is rejected or when the specified number of rows are validated without errors.
For example, the following validate operation scans 100 rows and stops when a row is rejected or when 100 rows are validated without errors:
BEGIN
DBMS_CLOUD.VALIDATE_EXTERNAL_TABLE (
table_name => 'CHANNELS_EXT',
rowcount => 100 );
END;
/
If you do not want the validate to stop when a row is rejected and you want to see all rejected rows, set the
stop_on_error
parameter to FALSE
. In this case VALIDATE_EXTERNAL_TABLE
scans all rows and reports
all rejected rows.
If you want to validate only a subset of rows use the rowcount
parameter. When
rowcount
is set and stop_on_error
is set to FALSE
, the validate operation scans rows and stops
either when the specified number of rows are rejected or when the specified number of rows are validated without errors. For example, the following
example scans 100 rows and stops when 100 rows are rejected or when 100 rows are validated without
errors:
BEGIN
DBMS_CLOUD.VALIDATE_EXTERNAL_TABLE (
table_name => 'CHANNELS_EXT',
rowcount => 100,
stop_on_error => FALSE );
END;
/
See VALIDATE_EXTERNAL_TABLE Procedure for detailed information about DBMS_CLOUD.VALIDATE_EXTERNAL_TABLE
parameters.
See View Logs for Data Validation to see the
results of validate operations in the tables dba_load_operations
and
user_load_operations
.
View Logs for Data Validation
After validating an external table, you can see the result of the validate operation by querying a load operations table:
-
dba_load_operations
: shows all validate operations. -
user_load_operations
: shows the validate operations in your schema.
You can use these tables to view load validation information. For example use this SELECT statement to query user_load_operations
:
SELECT table_name,owner_name,type,status,start_time,update_time,logfile_table,badfile_table
FROM user_load_operations
WHERE type = 'VALIDATE';
TABLE_NAME OWNER_NAME TYPE STATUS START_TIME UPDATE_TIME LOGFILE_TABLE BADFILE_TABLE
------------- ---------- ---------- ---------- -------------- ------------- --------------- ---------------
CHANNELS_EXT SH VALIDATE COMPLETED 13-NOV-17... 13-NOV-17... VALIDATE$21_LOG VALIDATE$21_BAD
Using this SQL statement with the WHERE
clause on the TYPE
column displays all of the load operations
with type VALIDATE
.
The LOGFILE_TABLE
column shows the name of the table you can query to look at
the log of a validate operation. For example, the following query shows the log for this validate operation:
SELECT * FROM VALIDATE$21_LOG;
The column BADFILE_TABLE
shows the name of the table you can query to look at the rows where there were errors during validation. For example, the following query shows the
rejected records for the above validate operation:
SELECT * FROM VALIDATE$21_BAD;
Depending on the errors shown in the log and the rows shown in the BADFILE_TABLE
, you can correct the error by dropping the external table using the DROP TABLE
command and recreating it by specifying the correct format options in DBMS_CLOUD.CREATE_EXTERNAL_TABLE
.
The
LOGFILE_TABLE
and BADFILE_TABLE
tables are stored for two days for each validate operation and then removed
automatically.