Reading data from OCI Autonomous Data Warehouse(ADW)¶
Overview¶
ML Monitoring Application provides ADWApplicationDataReader
reader to read data from OCI ADW tables using user-provided SQL queries.
To use ADW Reader, fulfil the prerequisites mentioned in ADW Prerequisites
ADW Reader Authentication¶
ADW Readers connect to an OCI ADW instance using a mTLS connection. This requires the use of wallet files and database credentials which contain the required authentication information.
ADWApplicationDataReader
reads wallet files and database credentials from OCI Vault. The steps are as follows:
Upload each wallet file to OCI Vault. This creates a secret for each file.
Write the appropriate policy to allow the principal (user/resource) to read secrets from OCI Vault.
Add the following details in ML Monitoring Application Configuration file:
{ "type": "ADWApplicationDataReader", "params": { "auth": { "vault_ocid": "<vault_ocid>", "vault_tenancy_ocid": "<tanancy_ocid>" }, } }
vault_ocid: The OCID of the vault that contains the secret.
vault_tenancy_ocid: The OCID of the tenancy where vault resides.
By default, The ADW Reader expects the following wallet files to be present in OCI Vault in base 64 encoded format:
[
"cwallet.sso",
"ewallet.pem",
"ewallet.p12",
"keystore.jks",
"ojdbc.properties",
"tnsnames.ora",
"truststore.jks",
"sqlnet.ora",
"db_username",
"db_password"
]
6. If the secret names in OCI Vault are not same as default file names, provide the names explicitly in the configuration. As shown below, the cwallet.sso is uploaded file as custom-cwallet.sso in OCI Vault. In this case, the ADW Reader is configured to read the custom wallet file names from OCI Vault.
"baseline_reader": {
"type": "ADWApplicationDataReader",
"params": {
"auth": {
"vault_ocid": "<vault_ocid>",
"vault_tenancy_ocid": "<tenancy_ocid>",
"vault_secret_names": {
"db_username": "custom_db_username",
"db_password": "custom_db_password",
"cwallet.sso": "custom-cwallet.sso",
"ewallet.pem": "custom-ewallet.pem",
"ewallet.p12": "custom-ewallet.p12",
"keystore.jks": "custom-keystore.jks",
"ojdbc.properties": "custom-ojdbc.properties",
"tnsnames.ora": "custom-tnsnames.ora",
"truststore.jks": "custom-truststore.jks",
"sqlnet.ora": "custom-sqlnet.ora"
}
}
}
}
Read data using ADW Application Reader¶
ADW readers read data from OCI cloud hosted Autonomous Data Warehouse using SQL query as mandatory input, which in return, outputs the data into the dataframe on which you can run the data analytics specifying the metrics and test results.
sample_config.json
{
"input_schema": {
"salary": {
"data_type": "INTEGER",
"variable_type": "CONTINUOUS",
"column_type": "INPUT"
},
"first_name": {
"data_type": "STRING",
"variable_type": "NOMINAL",
"column_type": "INPUT"
}
},
"dataset_metrics": [
{
"type": "RowCount"
}
],
"feature_metrics": {
"salary": [
{
"type": "Count"
},
{
"type": "Quartiles"
},
{
"type": "FrequencyDistribution"
},
{
"type": "ProbabilityDistribution"
},
{
"type": "DistinctCount"
},
{
"type": "Skewness"
},
{
"type": "Variance"
},
{
"type": "TypeMetric"
},
{
"type": "KolmogorovSmirnov"
}
],
"first_name": [
{
"type": "Count"
},
{
"type": "TopKFrequentElements"
},
{
"type": "TypeMetric"
},
{
"type": "ChiSquare"
}
]
},
"prediction_reader": {
"type": "ADWApplicationDataReader",
"params": {
"auth": {
"vault_ocid": "<vault_ocid>",
"vault_tenancy_ocid": "<tenancy_ocid>",
"vault_secret_names": {
"db_username": "mlm-int2-db_username",
"db_password": "mlm-int2-db_password",
"cwallet.sso": "mlm-int2-cwallet.sso",
"ewallet.pem": "mlm-int2-ewallet.pem",
"ewallet.p12": "mlm-int2-ewallet.p12",
"keystore.jks": "mlm-int2-keystore.jks",
"ojdbc.properties": "mlm-int2-ojdbc.properties",
"tnsnames.ora": "mlm-int2-tnsnames.ora",
"truststore.jks": "mlm-int2-truststore.jks",
"sqlnet.ora": "mlm-int2-sqlnet.ora"
}
},
"sql_query": "select * from employees where hire_date > TO_DATE(:start,'DD-MM-YY') and appt_date < TO_DATE(:end,'DD-MM-YY')",
"index_col": "employee_id",
"bind_variables": {
"start": "",
"end": ""
}
}
},
"baseline_reader": {
"type": "ADWApplicationDataReader",
"params": {
"auth": {
"vault_ocid": "<vault_ocid>",
"vault_tenancy_ocid": "<tenancy_ocid>",
"vault_secret_names": {
"db_username": "mlm-int2-db_username",
"db_password": "mlm-int2-db_password",
"cwallet.sso": "mlm-int2-cwallet.sso",
"ewallet.pem": "mlm-int2-ewallet.pem",
"ewallet.p12": "mlm-int2-ewallet.p12",
"keystore.jks": "mlm-int2-keystore.jks",
"ojdbc.properties": "mlm-int2-ojdbc.properties",
"tnsnames.ora": "mlm-int2-tnsnames.ora",
"truststore.jks": "mlm-int2-truststore.jks",
"sqlnet.ora": "mlm-int2-sqlnet.ora"
}
},
"sql_query": "select * from employees where hire_date > TO_DATE('12-03-10','DD-MM-YY')",
"index_col": "employee_id"
}
}
}
Override bind parameters using Runtime Parameters¶
When doing a prediction run, you can override bind_parameters
to be used in sql_query
using the date_range
parameters for the prediction reader.
date_range
start and end values should have date values compatible with date format specified in thesql_query
select * from employees where hire_date > TO_DATE(:start,'DD-MM-YY') and appt_date < TO_DATE(:end,'DD-MM-YY');
- Runtime parameters are specified as follows:
{"ACTION_TYPE":"RUN_PREDICTION", "DATE_RANGE": {"start": "15-03-23", "end": "17-03-23"}}
- Bind parameters are specified as follows:
"bind_variables": {
"start": "",
"end": ""
}
The user API parameters of ADWApplicationDataReader
are defined in the following table:
Parameter |
Description |
Examples |
---|---|---|
|
|
query example
|
|
|
bind variables example
|
|
|
|
|
|
index column example
|
|
|
number of partition example
|
|
|
partition size example
|
Note
If the
partition_size
andno_of_partitions
are not provided, the number of partitions are calculated using
the default value of partition_size
which is 256 MiB
.
If the
partition_size
andno_of_partitions
are both provided,no_of_partitions
is honoured.