Query External Data with Autonomous Database
Describes packages and tools to query and validate data with Autonomous Database.
External data is not managed by the database; however, you can use DBMS_CLOUD
procedures to query
your external data. Although queries on external data will not be as fast as queries on
database tables, you can use this approach to quickly start running queries on your
external source files and external data. Depending on the type of external table, you
can validate external data using the DBMS_CLOUD
validation procedures. The data validation
procedures let you validate the source files for an external table so that you can
identify problems and either correct the data in the external table or exclude invalid
data before you use the data.
If you are not using
ADMIN
user, ensure the user has the necessary privileges for the operations the user needs to perform. See Manage User Privileges on Autonomous Database - Connecting with a Client Tool for more information.
- Query External Data
To query data in files in the Cloud, you need to first store your object storage credentials in your Autonomous Database, and then create an external table using the PL/SQL procedureDBMS_CLOUD.CREATE_EXTERNAL_TABLE
. - Query External Data with ORC, Parquet, or Avro Source Files
Autonomous Database makes it easy to access ORC, Parquet, or Avro data stored in object store using external tables. ORC, Parquet, and Avro sources have metadata embedded in them and theDBMS_CLOUD.CREATE_EXTERNAL_TABLE
procedure can utilize this metadata to simplify the creation of external tables. - Query External Tables with Implicit Partitioning
On Autonomous Database you can create implicit partitioned external tables from Hive style partitioned data or from simple folder partitioned data stored on your Cloud Object Store. - Query External Tables with Partitioning Specified in Source Files
If you want to query multiple data files in the Object Store as a single external table and the files can be represented as multiple logical partitions, it is highly recommended to use an external partitioned table. Using an external partitioned table preserves the logical partitioning of your data files for query access. - Query External Partitioned Data (with Partitioning Clause)
If you want to query multiple data files in the Object Store as a single external table and the files can be represented as multiple logical partitions, then it is highly recommended to use an external partitioned table. Using an external partitioned table preserves the logical partitioning of your data files for query access. Use the procedureDBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE
to create an external partitioned table. - Query Hybrid Partitioned Data
If you want to query internal data and multiple data files in the Object Store as single logical table you can use a hybrid partitioned table to represent the data as single object. Use the procedureDBMS_CLOUD.CREATE_HYBRID_PART_TABLE
to create a hybrid partitioned table. - Query External Data Pump Dump Files
You can also query Oracle Data Pump dump files in the Cloud by creating an external table usingDBMS_CLOUD.CREATE_EXTERNAL_TABLE
. - Query Big Data Service Hadoop (HDFS) Data from Autonomous Database
You can create database links to Oracle Big Data Service from Autonomous Database. - Query External Data with Data Catalog
Oracle Cloud Infrastructure Data Catalog is the metadata management service for Oracle Cloud that helps you discover data and support data governance. It provides an inventory of assets, a business glossary, and a common metastore for data lakes. - Query External Data with AWS Glue Data Catalog
Autonomous Database supports a system for synchronizing with an Amazon AWS Glue Data Catalog instance. - Query Apache Iceberg Tables
Autonomous Database supports querying Apache Iceberg tables. - Validate External Data
To validate any external table, you can use the procedureDBMS_CLOUD.VALIDATE_EXTERNAL_TABLE
. - Validate External Partitioned Data
To validate an external partitioned table, you can use the procedureDBMS_CLOUD.VALIDATE_EXTERNAL_PART_TABLE
. This procedure includes a parameter that lets you specify a specific partition to validate. - Validate Hybrid Partitioned Data
To validate a hybrid partitioned table, you can use the procedureDBMS_CLOUD.VALIDATE_HYBRID_PART_TABLE
. This procedure includes a parameter that lets you specify a specific partition to validate. - View Logs for Data Validation
To validate an external table, use the proceduresDBMS_CLOUD.VALIDATE_EXTERNAL_TABLE
,DBMS_CLOUD.VALIDATE_EXTERNAL_PART_TABLE
, andDBMS_CLOUD.VALIDATE_HYBRID_PART_TABLE
.
Parent topic: Link Data