File Naming for Text Output (CSV, JSON, Parquet, or XML)
Describes the output file naming using DBMS_CLOUD.EXPORT_DATA
with CSV, JSON, Parquet, or XML text file output.
DBMS_CLOUD.EXPORT_DATA
performs the query specified with the query
parameter and sends the results to text files either in the Cloud Object Store bucket or to a directory. The output format depends on the format
parameter type
you specify (one of CSV, JSON, Parquet, or XML).
To speed up the procedure and to generate the output as fast as possible, DBMS_CLOUD.EXPORT_DATA
divides its work. This means that, depending on system resources, when you run DBMS_CLOUD.EXPORT_DATA
the procedure creates multiple output files either in the Cloud Object Store bucket or in the directory.
The format for each generated file is:
[FileNamePrefix | client_info_module_action]_sequenceNum_timestamp.format_extension.[compression_extension]
-
FileNamePrefix: (optional) If a FileNamePrefix is supplied,
DBMS_CLOUD.EXPORT_DATA
uses the file name prefix to generate file names for the results. The FileNamePrefix is specified using the text supplied after the bucket or directory name in thefile_uri_list
parameter value.You cannot provide multiple values for the FileNamePrefix in the
file_uri_list
. -
client_info_module_action: If a file name prefix is not supplied with the
file_uri_list
parameter,DBMS_CLOUD.EXPORT_DATA
uses the combination of client_info, application module and action as the file name prefix (when this information is available). The procedure obtains these names from the application information for the database session that runs the query. See DBMS_APPLICATION_INFO for information on client_info, module name, and action name.If a file name prefix is not supplied with the
file_uri_list
and the database session attributes are not available,DBMS_CLOUD.EXPORT_DATA
uses the file name prefix "data
". -
sequenceNum: The sequence number associated with the
DBMS_CLOUD.EXPORT_DATA
query. Depending on the query, the database service, and the number of ECPUs (OCPUs if your database uses OCPUs) there are one or more sequenceNums. Also, depending on the size of the results, there are one or more output files for each sequenceNum.See Manage Concurrency and Priorities on Autonomous Database for information on database services.
-
timestamp: Timestamp when the file is uploaded.
-
format_extension: The default value depends on the
format
type
value:- CSV format:
.csv
- JSON format:
.json
- PARQUET format
.parquet
- XML format:
.xml
For more information, see the description for
format
optionfileextension
in DBMS_CLOUD Package Format Options for EXPORT_DATA. - CSV format:
-
compression_extension: When you include the
format
parameter with thecompression
option with the valuegzip
, this is"gz"
.When the
format
type
isparquet
, thecompression
valuesnappy
is also supported and is the default.
For example, the file name prefix in the following DBMS_CLOUD.EXPORT_DATA
procedure is specified in the file_uri_list
parameter, as dept_export
. The example generates the output to the provided Object Store in the specified format.
BEGIN
DBMS_CLOUD.EXPORT_DATA
(
credential_name => 'DEF_CRED_NAME',
file_uri_list => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o/dept_export',
query => 'SELECT * FROM DEPT',
format => JSON_OBJECT('type' value 'json'));
END;
/
When you specify a file name prefix the generated output files include the file name prefix, similar to the following:
dept_export_1_20210809T173033Z.json
dept_export_2_20210809T173034Z.json
dept_export_3_20210809T173041Z.json
dept_export_4_20210809T173035Z.json
The number of generated output files depends on the size of the results, the database service, and the number of ECPUs (OCPUs if your database uses OCPUs) in the Autonomous Database instance.
In the following example the file_uri_list
parameter does not
include a file name prefix and the compression
parameter is supplied, with
value gzip
:
BEGIN
DBMS_CLOUD.EXPORT_DATA
(
credential_name => 'DEF_CRED_NAME',
file_uri_list => 'https://objectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/o/',
query => 'SELECT * FROM DEPT',
format => json_object('type' value 'json', 'compression' value 'gzip'));
END;
/
When a file name prefix is not in the file_uri_list
parameter, DBMS_CLOUD.EXPORT_DATA
uses a file
name prefix of the form: client_info_module_action. For
this example the generated output files include the file name prefix that DBMS_CLOUD.EXPORT_DATA
supplies and the files are
compressed with gzip and the file extension .gz
is added, as follows:
Client1_Module1_Action1_1_20210809T173033Z.json.gz
Client1_Module1_Action1_2_20210809T173034Z.json.gz
Client1_Module1_Action1_3_20210809T173041Z.json.gz
Client1_Module1_Action1_4_20210809T173035Z.json.gz
If the client_info_module_action session information is
not available when you run DBMS_CLOUD.EXPORT_DATA
, the file name prefix is set to
data
. For example:
data_1_20210809T173033Z.json.gz
data_2_20210809T173034Z.json.gz
data_3_20210809T173041Z.json.gz
data_4_20210809T173035Z.json.gz
For example, the file name prefix in the following DBMS_CLOUD.EXPORT_DATA
procedure is specified in the file_uri_list
parameter, as dept_export
. The example generates the output to the provided directory in the specified format.
BEGIN
DBMS_CLOUD.EXPORT_DATA
(
file_uri_list => 'DATA_PUMP_DIR:sales.json',
query => 'SELECT * FROM SALES',
format => JSON_OBJECT('type' value 'json'));
END;
/
When you specify a file name prefix the generated output file include the file name prefix, similar to the following:
sales_1_20230705T124523275915Z.csv
Notes for file naming with DBMS_CLOUD.EXPORT_DATA
:
-
DBMS_CLOUD.EXPORT_DATA
does not create buckets or directories. -
The number of files that
DBMS_CLOUD.EXPORT_DATA
generates is determined by the number of ECPUs (OCPUs if your database uses OCPUs), the database service, and the size of the result data. -
The following applies when providing a directory object name in the
file_uri_list
parameter:-
The provided directory must exist and you must have
WRITE
access to the directory. -
The directory name is case-sensitive when enclosed in double quotes.
-
The credential name parameter must not be provided.
-
-
For CSV, JSON, or XML output, by default when a generated file contains 10MB of data a new output file is created. However, if you have less than 10MB of result data you may have multiple output files, depending on the database service and the number of ECPUs (OCPUs if your database uses OCPUs) for the Autonomous Database instance.
The default output file chunk size is 10MB for CSV, JSON, or XML. You can change this value with the
format
parametermaxfilesize
option. See DBMS_CLOUD Package Format Options for EXPORT_DATA for more information. -
For Parquet output, each generated file is less than 128MB and multiple output files may be generated. However, if you have less than 128MB of result data, you may have multiple output files depending on the database service and the number of ECPUs (OCPUs if your database uses OCPUs) for the Autonomous Database instance.
The
format
parametermaxfilesize
option does not apply for Parquet files.
Parent topic: Export Data to Object Store as Text