azure.cognitiveservices.vision.computervision package

Module contents

class azure.cognitiveservices.vision.computervision.ComputerVisionAPI(azure_region, credentials)[source]

Bases: msrest.service_client.SDKClient

The Computer Vision API provides state-of-the-art algorithms to process images and return information. For example, it can determine whether an image contains mature content or find all the faces in an image. Other features include estimating dominant and accent colors, categorizing the content of images, and describing an image in complete English sentences. It can also intelligently generate image thumbnails for displaying large images effectively.

Variables:

config (ComputerVisionAPIConfiguration) – Configuration for client.

Parameters:
  • azure_region (str or AzureRegions) – Supported Azure regions for Cognitive Services endpoints. Possible values include: ‘westus’, ‘westeurope’, ‘southeastasia’, ‘eastus2’, ‘westcentralus’, ‘westus2’, ‘eastus’, ‘southcentralus’, ‘northeurope’, ‘eastasia’, ‘australiaeast’, ‘brazilsouth’
  • credentials (None) – Subscription credentials which uniquely identify client subscription.
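
A minimal sketch of constructing the client from the two parameters above. It assumes the SDK and `msrest` are installed and that `CognitiveServicesCredentials` from `msrest.authentication` is a suitable credentials object; the parameter list does not name a credentials type, so treat that choice as an assumption. `SUPPORTED_REGIONS` and `make_client` are hypothetical helper names, and the SDK imports are deferred into the function so the region check can be used on its own.

```python
# Region names copied from the azure_region parameter description.
SUPPORTED_REGIONS = frozenset([
    "westus", "westeurope", "southeastasia", "eastus2", "westcentralus",
    "westus2", "eastus", "southcentralus", "northeurope", "eastasia",
    "australiaeast", "brazilsouth",
])


def make_client(subscription_key, azure_region="westus"):
    """Build a ComputerVisionAPI client (hypothetical helper, not part of the SDK)."""
    if azure_region not in SUPPORTED_REGIONS:
        raise ValueError("unsupported Azure region: %r" % (azure_region,))
    # Imported lazily so this module still loads when the SDK is not installed.
    from azure.cognitiveservices.vision.computervision import ComputerVisionAPI
    # Assumption: this msrest credentials class fits the `credentials` parameter.
    from msrest.authentication import CognitiveServicesCredentials

    credentials = CognitiveServicesCredentials(subscription_key)
    return ComputerVisionAPI(azure_region, credentials)
```

Checking the region before building the client turns a typo into an immediate `ValueError` rather than a failed request later.
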
analyze_image(url, visual_features=None, details=None, language='en', custom_headers=None, raw=False, **operation_config)[source]

This operation extracts a rich set of visual features based on the image content. Two input methods are supported – (1) Uploading an image or (2) specifying an image URL. Within your request, there is an optional parameter to allow you to choose which features to return. By default, image categories are returned in the response.

Parameters:
  • url (str) – Publicly reachable URL of an image
  • visual_features (list[str or VisualFeatureTypes]) – The visual feature types to return. Multiple values should be comma-separated. Valid visual feature types include: Categories - categorizes image content according to a taxonomy defined in documentation. Tags - tags the image with a detailed list of words related to the image content. Description - describes the image content with a complete English sentence. Faces - detects if faces are present. If present, generates coordinates, gender and age. ImageType - detects if image is clipart or a line drawing. Color - determines the accent color, dominant color, and whether an image is black & white. Adult - detects if the image is pornographic in nature (depicts nudity or a sex act). Sexually suggestive content is also detected.
  • details (list[str or Details]) – The domain-specific details to return. Multiple values should be comma-separated. Valid domain-specific details include: Celebrities - identifies celebrities if detected in the image.
  • language (str) – The desired language for output generation. If this parameter is not specified, the default value is "en". Supported languages: en - English (default), es - Spanish, ja - Japanese, pt - Portuguese, zh - Simplified Chinese. Possible values include: ‘en’, ‘es’, ‘ja’, ‘pt’, ‘zh’
  • custom_headers (dict) – headers that will be added to the request
  • raw (bool) – returns the direct response alongside the deserialized response
  • operation_config – Operation configuration overrides.
Returns:

ImageAnalysis or ClientRawResponse if raw=true

Return type:

ImageAnalysis or ClientRawResponse

Raises:

ComputerVisionErrorException
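
As an illustration of the parameters above, the following sketch requests three feature types for an image URL. The function name `analyze_remote_image` is hypothetical, and `client` is assumed to be an already constructed `ComputerVisionAPI` instance. Because the documented type of `visual_features` is a list, the feature names are passed as list items rather than as one comma-separated string.

```python
def analyze_remote_image(client, image_url):
    """Request categories, tags and a description for a publicly reachable image.

    `client` is assumed to be a ComputerVisionAPI instance.
    """
    # Feature names taken from the visual_features description.
    features = ["Categories", "Tags", "Description"]
    return client.analyze_image(
        image_url,
        visual_features=features,
        language="en",
    )
```
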

analyze_image_by_domain(model, url, language='en', custom_headers=None, raw=False, **operation_config)[source]

This operation recognizes content within an image by applying a domain-specific model. The list of domain-specific models that are supported by the Computer Vision API can be retrieved using the /models GET request. Currently, the API only provides a single domain-specific model: celebrities. Two input methods are supported – (1) Uploading an image or (2) specifying an image URL. A successful response will be returned in JSON. If the request failed, the response will contain an error code and a message to help understand what went wrong.

Parameters:
  • model (str) – The domain-specific content to recognize.
  • url (str) – Publicly reachable URL of an image
  • language (str) – The desired language for output generation. If this parameter is not specified, the default value is "en". Supported languages: en - English (default), es - Spanish, ja - Japanese, pt - Portuguese, zh - Simplified Chinese. Possible values include: ‘en’, ‘es’, ‘ja’, ‘pt’, ‘zh’
  • custom_headers (dict) – headers that will be added to the request
  • raw (bool) – returns the direct response alongside the deserialized response
  • operation_config – Operation configuration overrides.
Returns:

DomainModelResults or ClientRawResponse if raw=true

Return type:

DomainModelResults or ClientRawResponse

Raises:

ComputerVisionErrorException
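
The `raw` flag can be sketched with this method. The example assumes that the `ClientRawResponse` returned when `raw=True` exposes the deserialized result as `.output` and the underlying HTTP response as `.response` (this is how `msrest` behaves to my knowledge, but the page above does not state it). The function name is hypothetical.

```python
def analyze_by_domain_with_status(client, image_url, model="celebrities"):
    """Return (deserialized result, HTTP status code) for a domain-specific analysis.

    `client` is assumed to be a ComputerVisionAPI instance. "celebrities" is the
    only domain-specific model the description mentions.
    """
    raw = client.analyze_image_by_domain(model, image_url, raw=True)
    # Assumption: ClientRawResponse carries .output and .response attributes.
    return raw.output, raw.response.status_code
```
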

analyze_image_by_domain_in_stream(model, image, language='en', custom_headers=None, raw=False, callback=None, **operation_config)[source]

This operation recognizes content within an image by applying a domain-specific model. The list of domain-specific models that are supported by the Computer Vision API can be retrieved using the /models GET request. Currently, the API only provides a single domain-specific model: celebrities. Two input methods are supported – (1) Uploading an image or (2) specifying an image URL. A successful response will be returned in JSON. If the request failed, the response will contain an error code and a message to help understand what went wrong.

Parameters:
  • model (str) – The domain-specific content to recognize.
  • image (Generator) – An image stream.
  • language (str) – The desired language for output generation. If this parameter is not specified, the default value is "en". Supported languages: en - English (default), es - Spanish, ja - Japanese, pt - Portuguese, zh - Simplified Chinese. Possible values include: ‘en’, ‘es’, ‘ja’, ‘pt’, ‘zh’
  • custom_headers (dict) – headers that will be added to the request
  • raw (bool) – returns the direct response alongside the deserialized response
  • callback (Callable[Bytes, response=None]) – When specified, will be called with each chunk of data that is streamed. The callback should take two arguments, the bytes of the current chunk of data and the response object. If the data is uploading, response will be None.
  • operation_config – Operation configuration overrides.
Returns:

DomainModelResults or ClientRawResponse if raw=true

Return type:

DomainModelResults or ClientRawResponse

Raises:

ComputerVisionErrorException

analyze_image_in_stream(image, visual_features=None, details=None, language='en', custom_headers=None, raw=False, callback=None, **operation_config)[source]

This operation extracts a rich set of visual features based on the image content.

Parameters:
  • image (Generator) – An image stream.
  • visual_features (list[str or VisualFeatureTypes]) – The visual feature types to return. Multiple values should be comma-separated. Valid visual feature types include: Categories - categorizes image content according to a taxonomy defined in documentation. Tags - tags the image with a detailed list of words related to the image content. Description - describes the image content with a complete English sentence. Faces - detects if faces are present. If present, generates coordinates, gender and age. ImageType - detects if image is clipart or a line drawing. Color - determines the accent color, dominant color, and whether an image is black & white. Adult - detects if the image is pornographic in nature (depicts nudity or a sex act). Sexually suggestive content is also detected.
  • details (str) – A string indicating which domain-specific details to return. Multiple values should be comma-separated. Valid domain-specific details include: Celebrities - identifies celebrities if detected in the image. Possible values include: ‘Celebrities’, ‘Landmarks’
  • language (str) – The desired language for output generation. If this parameter is not specified, the default value is "en". Supported languages: en - English (default), es - Spanish, ja - Japanese, pt - Portuguese, zh - Simplified Chinese. Possible values include: ‘en’, ‘es’, ‘ja’, ‘pt’, ‘zh’
  • custom_headers (dict) – headers that will be added to the request
  • raw (bool) – returns the direct response alongside the deserialized response
  • callback (Callable[Bytes, response=None]) – When specified, will be called with each chunk of data that is streamed. The callback should take two arguments, the bytes of the current chunk of data and the response object. If the data is uploading, response will be None.
  • operation_config – Operation configuration overrides.
Returns:

ImageAnalysis or ClientRawResponse if raw=true

Return type:

ImageAnalysis or ClientRawResponse

Raises:

ComputerVisionErrorException
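
The stream variant and its `callback` parameter can be sketched as follows. The sketch assumes that a file object opened in binary mode is acceptable where the parameter list says "Generator"; that is an assumption, not something this page confirms. `analyze_local_image` and `print_progress` are hypothetical names, and `client` is assumed to be a `ComputerVisionAPI` instance.

```python
def print_progress(chunk, response=None):
    """Example callback: receives the current chunk and the response object."""
    # During upload the response argument is documented to be None.
    print("transferred %d bytes" % len(chunk))


def analyze_local_image(client, path, on_chunk=None):
    """Send a local image file to analyze_image_in_stream and return the result."""
    # Assumption: a binary file object can be passed as the image stream.
    with open(path, "rb") as image_stream:
        return client.analyze_image_in_stream(
            image_stream,
            visual_features=["Tags"],
            callback=on_chunk,
        )
```

Keeping the call inside the `with` block ensures the file stays open while the request body is being read.
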

describe_image(url, max_candidates='1', language='en', custom_headers=None, raw=False, **operation_config)[source]

This operation generates a description of an image in human-readable language with complete sentences. The description is based on a collection of content tags, which are also returned by the operation. More than one description can be generated for each image. Descriptions are ordered by their confidence score. All descriptions are in English. Two input methods are supported – (1) Uploading an image or (2) specifying an image URL. A successful response will be returned in JSON. If the request failed, the response will contain an error code and a message to help understand what went wrong.

Parameters:
  • url (str) – Publicly reachable URL of an image
  • max_candidates (str) – Maximum number of candidate descriptions to be returned. The default is 1.
  • language (str) – The desired language for output generation. If this parameter is not specified, the default value is "en". Supported languages: en - English (default), es - Spanish, ja - Japanese, pt - Portuguese, zh - Simplified Chinese. Possible values include: ‘en’, ‘es’, ‘ja’, ‘pt’, ‘zh’
  • custom_headers (dict) – headers that will be added to the request
  • raw (bool) – returns the direct response alongside the deserialized response
  • operation_config – Operation configuration overrides.
Returns:

ImageDescription or ClientRawResponse if raw=true

Return type:

ImageDescription or ClientRawResponse

Raises:

ComputerVisionErrorException
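
A short sketch of asking for several candidate descriptions and picking the most confident one. It assumes that the returned `ImageDescription` has a `captions` list whose items carry `text` and `confidence` attributes; those attribute names are not listed on this page. `best_caption` is a hypothetical helper and `client` is assumed to be a `ComputerVisionAPI` instance.

```python
def best_caption(client, image_url, max_candidates=3):
    """Return (text, confidence) of the most confident caption, or None."""
    # The signature defaults max_candidates to the string '1', so a string is passed.
    description = client.describe_image(image_url, max_candidates=str(max_candidates))
    # Assumption: ImageDescription.captions is a list of objects with
    # `text` and `confidence` attributes.
    captions = list(description.captions or [])
    if not captions:
        return None
    top = max(captions, key=lambda caption: caption.confidence)
    return top.text, top.confidence
```
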

describe_image_in_stream(image, max_candidates='1', language='en', custom_headers=None, raw=False, callback=None, **operation_config)[source]

This operation generates a description of an image in human-readable language with complete sentences. The description is based on a collection of content tags, which are also returned by the operation. More than one description can be generated for each image. Descriptions are ordered by their confidence score. All descriptions are in English. Two input methods are supported – (1) Uploading an image or (2) specifying an image URL. A successful response will be returned in JSON. If the request failed, the response will contain an error code and a message to help understand what went wrong.

Parameters:
  • image (Generator) – An image stream.
  • max_candidates (str) – Maximum number of candidate descriptions to be returned. The default is 1.
  • language (str) – The desired language for output generation. If this parameter is not specified, the default value is "en". Supported languages: en - English (default), es - Spanish, ja - Japanese, pt - Portuguese, zh - Simplified Chinese. Possible values include: ‘en’, ‘es’, ‘ja’, ‘pt’, ‘zh’
  • custom_headers (dict) – headers that will be added to the request
  • raw (bool) – returns the direct response alongside the deserialized response
  • callback (Callable[Bytes, response=None]) – When specified, will be called with each chunk of data that is streamed. The callback should take two arguments, the bytes of the current chunk of data and the response object. If the data is uploading, response will be None.
  • operation_config – Operation configuration overrides.
Returns:

ImageDescription or ClientRawResponse if raw=true

Return type:

ImageDescription or ClientRawResponse

Raises:

ComputerVisionErrorException

generate_thumbnail(width, height, url, smart_cropping=False, custom_headers=None, raw=False, callback=None, **operation_config)[source]

This operation generates a thumbnail image with the user-specified width and height. By default, the service analyzes the image, identifies the region of interest (ROI), and generates smart cropping coordinates based on the ROI. Smart cropping helps when you specify an aspect ratio that differs from that of the input image. A successful response contains the thumbnail image binary. If the request failed, the response contains an error code and a message to help determine what went wrong.

Parameters:
  • width (int) – Width of the thumbnail. It must be between 1 and 1024. Recommended minimum of 50.
  • height (int) – Height of the thumbnail. It must be between 1 and 1024. Recommended minimum of 50.
  • url (str) – Publicly reachable URL of an image
  • smart_cropping (bool) – Boolean flag for enabling smart cropping.
  • custom_headers (dict) – headers that will be added to the request
  • raw (bool) – returns the direct response alongside the deserialized response
  • callback (Callable[Bytes, response=None]) – When specified, will be called with each chunk of data that is streamed. The callback should take two arguments, the bytes of the current chunk of data and the response object. If the data is uploading, response will be None.
  • operation_config – Operation configuration overrides.
Returns:

object or ClientRawResponse if raw=true

Return type:

Generator or ClientRawResponse

Raises:

HttpOperationError
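
Since the return type is a generator, the thumbnail bytes have to be consumed chunk by chunk. The sketch below writes them to a file; `save_thumbnail` is a hypothetical helper, `client` is assumed to be a `ComputerVisionAPI` instance, and the size check simply mirrors the 1 to 1024 range given in the parameter list.

```python
def save_thumbnail(client, image_url, out_path, width=100, height=100):
    """Generate a thumbnail for image_url and write it to out_path.

    Returns the number of bytes written.
    """
    for name, value in (("width", width), ("height", height)):
        if not 1 <= value <= 1024:
            raise ValueError("%s must be between 1 and 1024" % name)

    chunks = client.generate_thumbnail(width, height, image_url, smart_cropping=True)
    written = 0
    with open(out_path, "wb") as out_file:
        # The documented return type is a generator of byte chunks.
        for chunk in chunks:
            out_file.write(chunk)
            written += len(chunk)
    return written
```
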

generate_thumbnail_in_stream(width, height, image, smart_cropping=False, custom_headers=None, raw=False, callback=None, **operation_config)[source]

This operation generates a thumbnail image with the user-specified width and height. By default, the service analyzes the image, identifies the region of interest (ROI), and generates smart cropping coordinates based on the ROI. Smart cropping helps when you specify an aspect ratio that differs from that of the input image. A successful response contains the thumbnail image binary. If the request failed, the response contains an error code and a message to help determine what went wrong.

Parameters:
  • width (int) – Width of the thumbnail. It must be between 1 and 1024. Recommended minimum of 50.
  • height (int) – Height of the thumbnail. It must be between 1 and 1024. Recommended minimum of 50.
  • image (Generator) – An image stream.
  • smart_cropping (bool) – Boolean flag for enabling smart cropping.
  • custom_headers (dict) – headers that will be added to the request
  • raw (bool) – returns the direct response alongside the deserialized response
  • callback (Callable[Bytes, response=None]) – When specified, will be called with each chunk of data that is streamed. The callback should take two arguments, the bytes of the current chunk of data and the response object. If the data is uploading, response will be None.
  • operation_config – Operation configuration overrides.
Returns:

object or ClientRawResponse if raw=true

Return type:

Generator or ClientRawResponse

Raises:

HttpOperationError

get_text_operation_result(operation_id, custom_headers=None, raw=False, **operation_config)[source]

This interface retrieves the result of a text operation. The URL for this interface should be taken from the ‘Operation-Location’ field returned by the Recognize Text interface.

Parameters:
  • operation_id (str) – Id of the text operation returned in the response of the ‘Recognize Text’
  • custom_headers (dict) – headers that will be added to the request
  • raw (bool) – returns the direct response alongside the deserialized response
  • operation_config – Operation configuration overrides.
Returns:

TextOperationResult or ClientRawResponse if raw=true

Return type:

TextOperationResult or ClientRawResponse

Raises:

ComputerVisionErrorException
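
Text recognition is asynchronous, so the result typically has to be polled. The following is a sketch under stated assumptions: it assumes `TextOperationResult` has a `status` attribute whose value (or whose enum `.value`) is one of 'Not Started', 'Running', 'Failed' or 'Succeeded'. Neither the attribute nor those strings are documented on this page. `wait_for_text_result` is a hypothetical helper.

```python
import time


def wait_for_text_result(client, operation_id, interval=1.0, max_polls=30):
    """Poll get_text_operation_result until the operation finishes.

    `client` is assumed to be a ComputerVisionAPI instance.
    """
    for _ in range(max_polls):
        result = client.get_text_operation_result(operation_id)
        # Assumption: `status` is a string or an Enum member wrapping one.
        status = getattr(result.status, "value", result.status)
        if status in ("Succeeded", "Failed"):
            return result
        time.sleep(interval)
    raise RuntimeError("text operation %s did not finish in time" % operation_id)
```

The bounded loop avoids polling forever if the operation never leaves the running state.
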

list_models(custom_headers=None, raw=False, **operation_config)[source]

This operation returns the list of domain-specific models that are supported by the Computer Vision API. Currently, the API only supports one domain-specific model: a celebrity recognizer. A successful response will be returned in JSON. If the request failed, the response will contain an error code and a message to help understand what went wrong.

Parameters:
  • custom_headers (dict) – headers that will be added to the request
  • raw (bool) – returns the direct response alongside the deserialized response
  • operation_config – Operation configuration overrides.
Returns:

ListModelsResult or ClientRawResponse if raw=true

Return type:

ListModelsResult or ClientRawResponse

Raises:

ComputerVisionErrorException

recognize_printed_text(url, detect_orientation=True, language='unk', custom_headers=None, raw=False, **operation_config)[source]

Optical Character Recognition (OCR) detects printed text in an image and extracts the recognized characters into a machine-usable character stream. Upon success, the OCR results will be returned. Upon failure, the error code together with an error message will be returned. The error code can be one of InvalidImageUrl, InvalidImageFormat, InvalidImageSize, NotSupportedImage, NotSupportedLanguage, or InternalServerError.

Parameters:
  • detect_orientation (bool) – Whether to detect the text orientation in the image. With detect_orientation=True, the OCR service tries to detect the image orientation and correct it before further processing (e.g. if it’s upside-down).
  • url (str) – Publicly reachable URL of an image
  • language (str or OcrLanguages) – The BCP-47 language code of the text to be detected in the image. The default value is ‘unk’. Possible values include: ‘unk’, ‘zh-Hans’, ‘zh-Hant’, ‘cs’, ‘da’, ‘nl’, ‘en’, ‘fi’, ‘fr’, ‘de’, ‘el’, ‘hu’, ‘it’, ‘ja’, ‘ko’, ‘nb’, ‘pl’, ‘pt’, ‘ru’, ‘es’, ‘sv’, ‘tr’, ‘ar’, ‘ro’, ‘sr-Cyrl’, ‘sr-Latn’, ‘sk’
  • custom_headers (dict) – headers that will be added to the request
  • raw (bool) – returns the direct response alongside the deserialized response
  • operation_config – Operation configuration overrides.
Returns:

OcrResult or ClientRawResponse if raw=true

Return type:

OcrResult or ClientRawResponse

Raises:

ComputerVisionErrorException
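
To turn an OCR result into plain text, one possible approach is shown below. It assumes `OcrResult` nests `regions`, each with `lines`, each with `words` that have a `text` attribute; that structure is not spelled out on this page. `ocr_lines` is a hypothetical helper, and joining words with a space is a simplification that will not suit every language.

```python
def ocr_lines(client, image_url, language="unk"):
    """Return the recognized text as a list of strings, one per detected line.

    `client` is assumed to be a ComputerVisionAPI instance.
    """
    result = client.recognize_printed_text(
        image_url,
        detect_orientation=True,
        language=language,
    )
    lines = []
    # Assumption: OcrResult -> regions -> lines -> words -> text.
    for region in result.regions or []:
        for line in region.lines or []:
            lines.append(" ".join(word.text for word in line.words or []))
    return lines
```
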

recognize_printed_text_in_stream(image, detect_orientation=True, language='unk', custom_headers=None, raw=False, callback=None, **operation_config)[source]

Optical Character Recognition (OCR) detects printed text in an image and extracts the recognized characters into a machine-usable character stream. Upon success, the OCR results will be returned. Upon failure, the error code together with an error message will be returned. The error code can be one of InvalidImageUrl, InvalidImageFormat, InvalidImageSize, NotSupportedImage, NotSupportedLanguage, or InternalServerError.

Parameters:
  • detect_orientation (bool) – Whether to detect the text orientation in the image. With detect_orientation=True, the OCR service tries to detect the image orientation and correct it before further processing (e.g. if it’s upside-down).
  • image (Generator) – An image stream.
  • language (str or OcrLanguages) – The BCP-47 language code of the text to be detected in the image. The default value is ‘unk’. Possible values include: ‘unk’, ‘zh-Hans’, ‘zh-Hant’, ‘cs’, ‘da’, ‘nl’, ‘en’, ‘fi’, ‘fr’, ‘de’, ‘el’, ‘hu’, ‘it’, ‘ja’, ‘ko’, ‘nb’, ‘pl’, ‘pt’, ‘ru’, ‘es’, ‘sv’, ‘tr’, ‘ar’, ‘ro’, ‘sr-Cyrl’, ‘sr-Latn’, ‘sk’
  • custom_headers (dict) – headers that will be added to the request
  • raw (bool) – returns the direct response alongside the deserialized response
  • callback (Callable[Bytes, response=None]) – When specified, will be called with each chunk of data that is streamed. The callback should take two arguments, the bytes of the current chunk of data and the response object. If the data is uploading, response will be None.
  • operation_config – Operation configuration overrides.
Returns:

OcrResult or ClientRawResponse if raw=true

Return type:

OcrResult or ClientRawResponse

Raises:

ComputerVisionErrorException

recognize_text(url, mode, custom_headers=None, raw=False, **operation_config)[source]

Recognize Text operation. When you use the Recognize Text interface, the response contains a field called ‘Operation-Location’. The ‘Operation-Location’ field contains the URL that you must use for your Get Recognize Text Operation Result operation.

Parameters:
  • mode (str or TextRecognitionMode) – Type of text to recognize. Possible values include: ‘Handwritten’, ‘Printed’
  • url (str) – Publicly reachable URL of an image
  • custom_headers (dict) – headers that will be added to the request
  • raw (bool) – returns the direct response alongside the deserialized response
  • operation_config – Operation configuration overrides.
Returns:

None or ClientRawResponse if raw=true

Return type:

None or ClientRawResponse

Raises:

ComputerVisionErrorException
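
Because this method returns None unless `raw` is set, reading the ‘Operation-Location’ field appears to require `raw=True`. The sketch below assumes that the returned `ClientRawResponse` exposes response headers through a `headers` mapping, and that the operation ID is the last path segment of the URL; both are assumptions rather than facts stated on this page. `start_text_recognition` is a hypothetical helper.

```python
def start_text_recognition(client, image_url, mode="Printed"):
    """Start a Recognize Text operation and return its operation ID.

    `client` is assumed to be a ComputerVisionAPI instance.
    """
    raw = client.recognize_text(image_url, mode, raw=True)
    # Assumption: ClientRawResponse.headers contains 'Operation-Location'.
    operation_location = raw.headers["Operation-Location"]
    # Assumption: the operation ID is the final path segment of that URL.
    return operation_location.rstrip("/").split("/")[-1]
```

The returned ID is the value expected by the `operation_id` parameter of `get_text_operation_result`.
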

recognize_text_in_stream(image, mode, custom_headers=None, raw=False, callback=None, **operation_config)[source]

Recognize Text operation. When you use the Recognize Text interface, the response contains a field called ‘Operation-Location’. The ‘Operation-Location’ field contains the URL that you must use for your Get Recognize Text Operation Result operation.

Parameters:
  • image (Generator) – An image stream.
  • mode (str or TextRecognitionMode) – Type of text to recognize. Possible values include: ‘Handwritten’, ‘Printed’
  • custom_headers (dict) – headers that will be added to the request
  • raw (bool) – returns the direct response alongside the deserialized response
  • callback (Callable[Bytes, response=None]) – When specified, will be called with each chunk of data that is streamed. The callback should take two arguments, the bytes of the current chunk of data and the response object. If the data is uploading, response will be None.
  • operation_config – Operation configuration overrides.
Returns:

None or ClientRawResponse if raw=true

Return type:

None or ClientRawResponse

Raises:

ComputerVisionErrorException

tag_image(url, language='en', custom_headers=None, raw=False, **operation_config)[source]

This operation generates a list of words, or tags, that are relevant to the content of the supplied image. The Computer Vision API can return tags based on objects, living beings, scenery or actions found in images. Unlike categories, tags are not organized according to a hierarchical classification system, but correspond to image content. Tags may contain hints to avoid ambiguity or provide context, for example the tag ‘cello’ may be accompanied by the hint ‘musical instrument’. All tags are in English.

Parameters:
  • url (str) – Publicly reachable URL of an image
  • language (str) – The desired language for output generation. If this parameter is not specified, the default value is "en". Supported languages: en - English (default), es - Spanish, ja - Japanese, pt - Portuguese, zh - Simplified Chinese. Possible values include: ‘en’, ‘es’, ‘ja’, ‘pt’, ‘zh’
  • custom_headers (dict) – headers that will be added to the request
  • raw (bool) – returns the direct response alongside the deserialized response
  • operation_config – Operation configuration overrides.
Returns:

TagResult or ClientRawResponse if raw=true

Return type:

TagResult or ClientRawResponse

Raises:

ComputerVisionErrorException
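
A brief sketch of filtering the returned tags by confidence. It assumes `TagResult` has a `tags` list whose items carry `name` and `confidence` attributes, which this page does not list. `confident_tags` and the 0.5 threshold are illustrative choices only.

```python
def confident_tags(client, image_url, min_confidence=0.5):
    """Return the names of tags whose confidence is at least min_confidence.

    `client` is assumed to be a ComputerVisionAPI instance.
    """
    result = client.tag_image(image_url)
    # Assumption: TagResult.tags is a list of objects with name/confidence.
    return [
        tag.name
        for tag in result.tags or []
        if tag.confidence >= min_confidence
    ]
```
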

tag_image_in_stream(image, language='en', custom_headers=None, raw=False, callback=None, **operation_config)[source]

This operation generates a list of words, or tags, that are relevant to the content of the supplied image. The Computer Vision API can return tags based on objects, living beings, scenery or actions found in images. Unlike categories, tags are not organized according to a hierarchical classification system, but correspond to image content. Tags may contain hints to avoid ambiguity or provide context, for example the tag ‘cello’ may be accompanied by the hint ‘musical instrument’. All tags are in English.

Parameters:
  • image (Generator) – An image stream.
  • language (str) – The desired language for output generation. If this parameter is not specified, the default value is "en". Supported languages: en - English (default), es - Spanish, ja - Japanese, pt - Portuguese, zh - Simplified Chinese. Possible values include: ‘en’, ‘es’, ‘ja’, ‘pt’, ‘zh’
  • custom_headers (dict) – headers that will be added to the request
  • raw (bool) – returns the direct response alongside the deserialized response
  • callback (Callable[Bytes, response=None]) – When specified, will be called with each chunk of data that is streamed. The callback should take two arguments, the bytes of the current chunk of data and the response object. If the data is uploading, response will be None.
  • operation_config – Operation configuration overrides.
Returns:

TagResult or ClientRawResponse if raw=true

Return type:

TagResult or ClientRawResponse

Raises:

ComputerVisionErrorException