Overview of Vision

Oracle Cloud Infrastructure Vision is a serverless, multi-tenant service, accessible using the Console, REST APIs, SDK, or CLI.

You can upload images to detect and classify objects in them. If you have lots of images, you can process them in batch using asynchronous API endpoints. Vision's features are thematically split between Document AI for document-centric images, and Image Analysis for object and scene-based images. Pretrained models and custom models are supported.

Image Analysis Pretrained Models

Object detection
Vision detects the location of objects in an image, such as a person, a car, a tree, or a dog. The output includes the bounding box coordinates for each object found.
Image classification
Vision categorizes scene-based features and objects in an image.
Face detection
Face detection lets you pass an image or a batch of images to Vision to detect faces, their locations, their features, and their visual quality.
Optical Character Recognition (OCR)
Vision can find and digitize text in an image.

Image Analysis Custom Models

Custom Object Detection
Build a model to detect the location of custom objects in an image. The output includes the bounding box coordinates for each object found.
Custom Image Classification
Build a model to identify objects and scene-based features in an image.

Document AI

Important

The AnalyzeDocument and DocumentJob capabilities in Vision are moving to a new service, Document Understanding. The following features are impacted:
  • Table detection
  • Document classification
  • Receipt key-value extraction
  • Document OCR
These features are available in Vision until January 1, 2024. After then, they're available only in Document Understanding.
Pretrained models
  • Optical Character Recognition (OCR): Vision can detect and recognize text in a document.
  • Document classification: Vision can classify a document, for example, whether the document is a tax form, an invoice, or a receipt.
  • Language classification: Vision classifies the language of a document based on its visual features.
  • Table extraction: Vision extracts content in tabular format, maintaining the row and column relationships of cells.
  • Key-value extraction: Vision identifies values for common fields in receipts.
  • Optical Character Recognition (OCR) PDF: Vision generates a searchable PDF file in your Object Storage.

Supported File Formats

Vision supports the following file types.

  • JPG and PNG,files are supported for Image Analysis.
  • JPG, PNG, PDF and TIFF files are supported for Document AI.

Regions and Availability Domains

Oracle Cloud Infrastructure services are hosted in regions and availablility domains. A region is a localized geographic area, and an availability domain is one or more data centers located in that region.

Vision is hosted in these regions and availability domains:
  • Australia East (Sydney)
  • Australia Southeast (Melbourne)
  • Brazil East (Sao Paulo)
  • Brazil Southeast (Vinhedo)
  • Canada Southeast (Montreal)
  • Canada Southeast (Toronto)
  • Chile (Santiago)
  • France South (Marseille)
  • Germany Central (Frankfurt)
  • India South (Hyderabad)
  • India West (Mumbai)
  • Israel Central (Jerusalem)
  • Italy Northwest (Milan)
  • Japan Central (Osaka)
  • Japan East (Tokyo)
  • Netherlands Northwest (Amsterdam)
  • Saudi Arabia West (Jeddah)
  • Singapore (Singapore)
  • South Africa Central (Johannesburg)
  • South Korea Central (Seoul)
  • South Korea North (Chuncheon)
  • Sweden Central (Stockholm)
  • Switzerland North (Zurich)
  • UAE Central (Abu Dhabi)
  • UAE East (Dubai)
  • UK South (London)
  • UK West (Newport)
  • US East (Ashburn)
  • US West (Phoenix)
  • US West (San Jose)
Note

Vision is not available in government regions.

How to Access Vision

You access Vision using the Console, REST API, SDKs, or CLI.

Use any of the following options, based on your preference and its suitability, for the task you want to complete:
  • The OCI Console is an easy-to-use, browser-based interface. To access the Console, you must use a supported browser.
  • The REST APIs provide the most functionality, but require programming expertise. API reference and endpoints provide endpoint details and links to the available API reference documents including the Artificial Intelligence Services REST API.
  • Oracle Cloud Infrastructure provides SDKs that interact with Language without the need to create a framework.
  • The CLI provides both quick access and full functionality without the need for programming.
Note

Vision is not supported in the Oracle Always Free tier.