Key Phrase Extraction
Key phrase extraction is the automated process of extracting the most relevant words and expressions from input text. It helps summarize the content and identify the main topics.
The key phrase extraction model uses natural language processing (NLP) and machine learning (ML) to find insights related to the main points of the text. It processes unstructured input text and returns keywords and key phrases (KPs).
The KPs consist of the subjects and objects discussed in the document. Any modifiers associated with these subjects and objects, such as adjectives, are also included in the output. Each key phrase is returned with a confidence score, a value from 0 to 1 that indicates how confident the model is in that key phrase.
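Because every key phrase carries a confidence score between 0 and 1, a client can keep only the phrases above a chosen threshold. The following is a minimal sketch, not part of the service: the function name `filter_by_confidence`, the default threshold of 0.5, and the sample values are illustrative assumptions, while the `text` and `score` field names mirror the sample response in the Examples section.

```python
from typing import Dict, List, Union

KeyPhrase = Dict[str, Union[str, float]]


def filter_by_confidence(key_phrases: List[KeyPhrase], min_score: float = 0.5) -> List[KeyPhrase]:
    """Keep key phrases whose score is at least min_score, highest score first."""
    if not 0.0 <= min_score <= 1.0:
        raise ValueError("min_score must be between 0 and 1")
    kept = [kp for kp in key_phrases if kp["score"] >= min_score]
    return sorted(kept, key=lambda kp: kp["score"], reverse=True)


# Illustrative values only, not real model output.
sample = [
    {"text": "infrastructure partner", "score": 0.62},
    {"text": "team", "score": 0.31},
    {"text": "oracle cloud infrastructure", "score": 0.98},
]

confident = filter_by_confidence(sample, min_score=0.6)
print([kp["text"] for kp in confident])
```

The right threshold depends on the use case: a higher value favors precision, a lower value favors coverage.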
Use Cases
Some business use cases are:
- Brand monitoring
- Monitoring market research
- Competitive market analysis
- Customer support tickets
- Employee feedback analysis
- Customer reviews
- Email analysis
Supported Features
- Key phrases
- Confidence scores
- Single-record and multi-record batch requests
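Single-record and multi-record requests can share one body shape, differing only in how many entries the `documents` array holds. The sketch below assumes the `documents`, `key`, and `text` field names shown in the sample request in the Examples section; the helper name `build_batch_request` is hypothetical, and any service limits on batch size are not modeled here.

```python
import json
from typing import Dict


def build_batch_request(texts: Dict[str, str]) -> dict:
    """Build a request body with one document entry per (key, text) pair."""
    if not texts:
        raise ValueError("at least one document is required")
    return {"documents": [{"key": key, "text": text} for key, text in texts.items()]}


# A single-record request is a batch with exactly one document.
single = build_batch_request({"doc1": "First record."})

# A multi-record batch uses a distinct key per document so that
# results can be matched back to their inputs.
batch = build_batch_request({"doc1": "First record.", "doc2": "Second record."})

print(json.dumps(batch, indent=2))
```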
Supported Languages for Input Text
- English
- Spanish
Examples
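Input Text | Key Phrases |
---|---|
Red Bull Racing Honda, the four-time Formula-1 World Champion team, has chosen Oracle Cloud Infrastructure (OCI) as their infrastructure partner. | red bull racing honda, oracle cloud infrastructure, infrastructure partner, oci |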
The JSON for the first example is:
- Sample request:

    POST https://<region-url>/20210101/actions/batchDetectLanguageKeyPhrases

- API request format:

    {
      "documents": [
        {
          "key": "doc1",
          "text": "Red Bull Racing Honda, the four-time Formula-1 World Champion team, has chosen Oracle Cloud Infrastructure (OCI) as their infrastructure partner."
        }
      ]
    }

- Response JSON:

    {
      "documents": [
        {
          "key": "doc1",
          "keyPhrases": [
            { "text": "red bull racing honda", "score": 0.9997546563973576 },
            { "text": "oracle cloud infrastructure", "score": 0.9997546563973576 },
            { "text": "infrastructure partner", "score": 0.9997546563973576 },
            { "text": "oci", "score": 0.9979336625058923 }
          ],
          "languageCode": "en"
        }
      ],
      "errors": []
    }
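A client typically needs to match each result to its input document and check the `errors` array before using the key phrases. The following sketch parses a trimmed copy of the sample response with the Python standard library. It assumes that the response key echoes the request key (`doc1`); the helper name `phrases_by_key` is hypothetical, and because the structure of individual error entries is not shown here, the code only checks whether any are present.

```python
import json

# Trimmed copy of the sample response. The key is assumed to echo
# the "doc1" key sent in the request.
RESPONSE_JSON = """
{
  "documents": [
    {
      "key": "doc1",
      "keyPhrases": [
        {"text": "red bull racing honda", "score": 0.9997546563973576},
        {"text": "oci", "score": 0.9979336625058923}
      ],
      "languageCode": "en"
    }
  ],
  "errors": []
}
"""


def phrases_by_key(response: dict) -> dict:
    """Map each document key to the list of its key phrase texts."""
    result = {}
    for doc in response.get("documents", []):
        result[doc["key"]] = [kp["text"] for kp in doc.get("keyPhrases", [])]
    return result


response = json.loads(RESPONSE_JSON)
phrases = phrases_by_key(response)
failed = response.get("errors", [])

if failed:
    # Error entries are printed as-is; their fields are not assumed here.
    print("Some documents were not processed:", failed)
print(phrases)
```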
Limitations
- Only noun phrases, together with their adjective modifiers, are identified as key phrases, so words that don't meet this criterion might be ignored.
- The model is case-insensitive.
- Text that contains multiple punctuation marks between words might be flagged as a key phrase.
- Well-formed URLs (those that begin with http, https, or www) are identified.
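Two of the limitations above can be anticipated on the client side. The sketch below is an illustration only and does not reproduce the model's internal logic: it interprets "begin with http, https, or www" as the prefixes `http://`, `https://`, and `www.` (an assumption), and it normalizes case and whitespace so that phrases can be compared consistently with a case-insensitive model. All function names are hypothetical.

```python
from typing import List

# Assumed interpretation of "begin with http, https, or www".
URL_PREFIXES = ("http://", "https://", "www.")


def looks_like_well_formed_url(token: str) -> bool:
    """Return True if the token starts with one of the assumed URL prefixes."""
    return token.lower().startswith(URL_PREFIXES)


def find_url_tokens(text: str) -> List[str]:
    """Return whitespace-separated tokens that look like well-formed URLs."""
    tokens = (t.strip(".,;:!?()") for t in text.split())
    return [t for t in tokens if looks_like_well_formed_url(t)]


def normalize(phrase: str) -> str:
    """Fold case and collapse whitespace for case-insensitive comparison."""
    return " ".join(phrase.casefold().split())


urls = find_url_tokens("See https://example.com or www.example.org, not example.net")
print(urls)
print(normalize("Oracle  Cloud Infrastructure") == normalize("oracle cloud infrastructure"))
```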