azure.cognitiveservices.vision.computervision package
Submodules
Module contents
class azure.cognitiveservices.vision.computervision.ComputerVisionAPI(azure_region, credentials)
Bases: msrest.service_client.SDKClient
The Computer Vision API provides state-of-the-art algorithms to process images and return information. For example, it can determine whether an image contains mature content, or find all the faces in an image. Other features include estimating dominant and accent colors, categorizing the content of images, and describing an image with complete English sentences. It can also intelligently generate image thumbnails for displaying large images effectively.
Variables: config (ComputerVisionAPIConfiguration) – Configuration for client.
Parameters: - azure_region (str or AzureRegions) – Supported Azure regions for Cognitive Services endpoints. Possible values include: ‘westus’, ‘westeurope’, ‘southeastasia’, ‘eastus2’, ‘westcentralus’, ‘westus2’, ‘eastus’, ‘southcentralus’, ‘northeurope’, ‘eastasia’, ‘australiaeast’, ‘brazilsouth’
- credentials (None) – Subscription credentials which uniquely identify client subscription.
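As a minimal sketch of constructing the client described above: the helper below checks a region string against the values listed for azure_region and then builds the client. The names validate_region and make_client are hypothetical, and wrapping the subscription key in msrest.authentication.CognitiveServicesCredentials is an assumption, since this page does not state the credentials type.

```python
# Regions copied from the azure_region parameter description above.
SUPPORTED_REGIONS = frozenset([
    "westus", "westeurope", "southeastasia", "eastus2", "westcentralus",
    "westus2", "eastus", "southcentralus", "northeurope", "eastasia",
    "australiaeast", "brazilsouth",
])


def validate_region(azure_region):
    """Return the region unchanged, or raise ValueError if it is not listed."""
    if azure_region not in SUPPORTED_REGIONS:
        raise ValueError("unsupported azure_region: %r" % (azure_region,))
    return azure_region


def make_client(azure_region, subscription_key):
    """Build a ComputerVisionAPI client (hypothetical helper)."""
    # Imported lazily so validate_region works without the SDK installed.
    from azure.cognitiveservices.vision.computervision import ComputerVisionAPI
    # Assumption: the subscription key is wrapped in CognitiveServicesCredentials.
    from msrest.authentication import CognitiveServicesCredentials

    credentials = CognitiveServicesCredentials(subscription_key)
    return ComputerVisionAPI(validate_region(azure_region), credentials)
```

Validating the region locally is a design choice for earlier, clearer errors; the service itself remains the authority on which regions are accepted.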
analyze_image(url, visual_features=None, details=None, language='en', custom_headers=None, raw=False, **operation_config)
This operation extracts a rich set of visual features based on the image content. Two input methods are supported: (1) uploading an image or (2) specifying an image URL. An optional request parameter lets you choose which features to return. By default, image categories are returned in the response.
Parameters: - url (str) – Publicly reachable URL of an image
- visual_features (list[str or VisualFeatureTypes]) – Which visual feature types to return. Multiple values should be passed as list items (comma-separated in the underlying request). Valid visual feature types include: Categories - categorizes image content according to a taxonomy defined in documentation. Tags - tags the image with a detailed list of words related to the image content. Description - describes the image content with a complete English sentence. Faces - detects if faces are present. If present, generates coordinates, gender and age. ImageType - detects if image is clipart or a line drawing. Color - determines the accent color, dominant color, and whether an image is black and white. Adult - detects if the image is pornographic in nature (depicts nudity or a sex act). Sexually suggestive content is also detected.
- details (list[str or Details]) – Which domain-specific details to return. Multiple values should be passed as list items (comma-separated in the underlying request). Valid domain-specific details include: Celebrities - identifies celebrities if detected in the image.
- language (str) – The desired language for output generation. If this parameter is not specified, the default value is "en". Supported languages: en - English (default), es - Spanish, ja - Japanese, pt - Portuguese, zh - Simplified Chinese. Possible values include: ‘en’, ‘es’, ‘ja’, ‘pt’, ‘zh’
- custom_headers (dict) – headers that will be added to the request
- raw (bool) – returns the direct response alongside the deserialized response
- operation_config – Operation configuration overrides.
Returns: ImageAnalysis or ClientRawResponse if raw=True
Return type: ImageAnalysis or ClientRawResponse
Raises:
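The call described above can be wrapped as follows. This is a sketch rather than part of the SDK: analyze_url is a hypothetical helper, and client is assumed to be a ComputerVisionAPI instance. It rejects feature names that are not in the list given for visual_features before making the request.

```python
# Feature names copied from the visual_features description above.
VALID_FEATURES = ("Categories", "Tags", "Description", "Faces",
                  "ImageType", "Color", "Adult")


def analyze_url(client, url, features=("Categories",), language="en"):
    """Call client.analyze_image after checking the requested feature names.

    `client` is expected to be a ComputerVisionAPI instance.
    """
    unknown = [name for name in features if name not in VALID_FEATURES]
    if unknown:
        raise ValueError("unknown visual features: %s" % ", ".join(unknown))
    return client.analyze_image(
        url, visual_features=list(features), language=language)
```

A call such as analyze_url(client, image_url, ["Tags", "Color"]) would then request only those two feature types.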
analyze_image_by_domain(model, url, language='en', custom_headers=None, raw=False, **operation_config)
This operation recognizes content within an image by applying a domain-specific model. The list of domain-specific models supported by the Computer Vision API can be retrieved using the /models GET request. Currently, the API only provides a single domain-specific model: celebrities. Two input methods are supported: (1) uploading an image or (2) specifying an image URL. A successful response will be returned in JSON. If the request failed, the response will contain an error code and a message to help understand what went wrong.
Parameters: - model (str) – The domain-specific content to recognize.
- url (str) – Publicly reachable URL of an image
- language (str) – The desired language for output generation. If this parameter is not specified, the default value is "en". Supported languages: en - English (default), es - Spanish, ja - Japanese, pt - Portuguese, zh - Simplified Chinese. Possible values include: ‘en’, ‘es’, ‘ja’, ‘pt’, ‘zh’
- custom_headers (dict) – headers that will be added to the request
- raw (bool) – returns the direct response alongside the deserialized response
- operation_config – Operation configuration overrides.
Returns: DomainModelResults or ClientRawResponse if raw=True
Return type: DomainModelResults or ClientRawResponse
Raises:
analyze_image_by_domain_in_stream(model, image, language='en', custom_headers=None, raw=False, callback=None, **operation_config)
This operation recognizes content within an image by applying a domain-specific model. The list of domain-specific models supported by the Computer Vision API can be retrieved using the /models GET request. Currently, the API only provides a single domain-specific model: celebrities. Two input methods are supported: (1) uploading an image or (2) specifying an image URL. A successful response will be returned in JSON. If the request failed, the response will contain an error code and a message to help understand what went wrong.
Parameters: - model (str) – The domain-specific content to recognize.
- image (Generator) – An image stream.
- language (str) – The desired language for output generation. If this parameter is not specified, the default value is "en". Supported languages: en - English (default), es - Spanish, ja - Japanese, pt - Portuguese, zh - Simplified Chinese. Possible values include: ‘en’, ‘es’, ‘ja’, ‘pt’, ‘zh’
- custom_headers (dict) – headers that will be added to the request
- raw (bool) – returns the direct response alongside the deserialized response
- callback (Callable[Bytes, response=None]) – When specified, will be called with each chunk of data that is streamed. The callback should take two arguments, the bytes of the current chunk of data and the response object. If the data is uploading, response will be None.
- operation_config – Operation configuration overrides.
Returns: DomainModelResults or ClientRawResponse if raw=True
Return type: DomainModelResults or ClientRawResponse
Raises:
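The callback parameter described above takes the current chunk and the response object. A small callable that tracks upload progress might look like the sketch below; ProgressTracker is a hypothetical name, not an SDK class.

```python
class ProgressTracker(object):
    """Callable that counts the bytes seen across streamed chunks."""

    def __init__(self):
        self.total_bytes = 0
        self.chunks = 0

    def __call__(self, chunk, response=None):
        # Per the callback description, `response` is None while the
        # data is being uploaded.
        self.total_bytes += len(chunk)
        self.chunks += 1
```

An instance could be passed as callback=ProgressTracker() to any of the *_in_stream methods and inspected once the call returns.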
analyze_image_in_stream(image, visual_features=None, details=None, language='en', custom_headers=None, raw=False, callback=None, **operation_config)
This operation extracts a rich set of visual features based on the image content.
Parameters: - image (Generator) – An image stream.
- visual_features (list[str or VisualFeatureTypes]) – Which visual feature types to return. Multiple values should be passed as list items (comma-separated in the underlying request). Valid visual feature types include: Categories - categorizes image content according to a taxonomy defined in documentation. Tags - tags the image with a detailed list of words related to the image content. Description - describes the image content with a complete English sentence. Faces - detects if faces are present. If present, generates coordinates, gender and age. ImageType - detects if image is clipart or a line drawing. Color - determines the accent color, dominant color, and whether an image is black and white. Adult - detects if the image is pornographic in nature (depicts nudity or a sex act). Sexually suggestive content is also detected.
- details (str) – A string indicating which domain-specific details to return. Multiple values should be comma-separated. Valid domain-specific details include: Celebrities - identifies celebrities if detected in the image. Possible values include: ‘Celebrities’, ‘Landmarks’
- language (str) – The desired language for output generation. If this parameter is not specified, the default value is "en". Supported languages: en - English (default), es - Spanish, ja - Japanese, pt - Portuguese, zh - Simplified Chinese. Possible values include: ‘en’, ‘es’, ‘ja’, ‘pt’, ‘zh’
- custom_headers (dict) – headers that will be added to the request
- raw (bool) – returns the direct response alongside the deserialized response
- callback (Callable[Bytes, response=None]) – When specified, will be called with each chunk of data that is streamed. The callback should take two arguments, the bytes of the current chunk of data and the response object. If the data is uploading, response will be None.
- operation_config – Operation configuration overrides.
Returns: ImageAnalysis or ClientRawResponse if raw=True
Return type: ImageAnalysis or ClientRawResponse
Raises:
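Because the image parameter is documented as a Generator, one way to supply a local file is to yield it in fixed-size chunks. The sketch below uses only the standard library; read_in_chunks and the 64 KiB default are illustrative choices, not SDK requirements.

```python
def read_in_chunks(path, chunk_size=64 * 1024):
    """Yield the contents of the file at `path` as successive byte chunks."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

Assuming client is a ComputerVisionAPI instance, the generator could be passed as client.analyze_image_in_stream(read_in_chunks("photo.jpg"), visual_features=["Tags"]), where photo.jpg is a placeholder path.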
describe_image(url, max_candidates='1', language='en', custom_headers=None, raw=False, **operation_config)
This operation generates a description of an image in human readable language with complete sentences. The description is based on a collection of content tags, which are also returned by the operation. More than one description can be generated for each image. Descriptions are ordered by their confidence score. All descriptions are in English. Two input methods are supported: (1) uploading an image or (2) specifying an image URL. A successful response will be returned in JSON. If the request failed, the response will contain an error code and a message to help understand what went wrong.
Parameters: - url (str) – Publicly reachable URL of an image
- max_candidates (str) – Maximum number of candidate descriptions to be returned. The default is 1.
- language (str) – The desired language for output generation. If this parameter is not specified, the default value is "en". Supported languages: en - English (default), es - Spanish, ja - Japanese, pt - Portuguese, zh - Simplified Chinese. Possible values include: ‘en’, ‘es’, ‘ja’, ‘pt’, ‘zh’
- custom_headers (dict) – headers that will be added to the request
- raw (bool) – returns the direct response alongside the deserialized response
- operation_config – Operation configuration overrides.
Returns: ImageDescription or ClientRawResponse if raw=True
Return type: ImageDescription or ClientRawResponse
Raises:
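Note that max_candidates is typed as str. The sketch below converts an integer count before calling describe_image and returns the captions ordered by confidence. best_captions is a hypothetical helper, and the captions, text and confidence attribute names on the returned ImageDescription are assumptions not stated on this page.

```python
def best_captions(client, url, max_candidates=3):
    """Return (text, confidence) pairs, highest confidence first."""
    # max_candidates is documented as a str, so convert the int here.
    description = client.describe_image(
        url, max_candidates=str(max_candidates))
    # Assumption: ImageDescription exposes `captions`, each with
    # `text` and `confidence` attributes.
    pairs = [(c.text, c.confidence) for c in description.captions]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)
```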
describe_image_in_stream(image, max_candidates='1', language='en', custom_headers=None, raw=False, callback=None, **operation_config)
This operation generates a description of an image in human readable language with complete sentences. The description is based on a collection of content tags, which are also returned by the operation. More than one description can be generated for each image. Descriptions are ordered by their confidence score. All descriptions are in English. Two input methods are supported: (1) uploading an image or (2) specifying an image URL. A successful response will be returned in JSON. If the request failed, the response will contain an error code and a message to help understand what went wrong.
Parameters: - image (Generator) – An image stream.
- max_candidates (str) – Maximum number of candidate descriptions to be returned. The default is 1.
- language (str) – The desired language for output generation. If this parameter is not specified, the default value is "en". Supported languages: en - English (default), es - Spanish, ja - Japanese, pt - Portuguese, zh - Simplified Chinese. Possible values include: ‘en’, ‘es’, ‘ja’, ‘pt’, ‘zh’
- custom_headers (dict) – headers that will be added to the request
- raw (bool) – returns the direct response alongside the deserialized response
- callback (Callable[Bytes, response=None]) – When specified, will be called with each chunk of data that is streamed. The callback should take two arguments, the bytes of the current chunk of data and the response object. If the data is uploading, response will be None.
- operation_config – Operation configuration overrides.
Returns: ImageDescription or ClientRawResponse if raw=True
Return type: ImageDescription or ClientRawResponse
Raises:
generate_thumbnail(width, height, url, smart_cropping=False, custom_headers=None, raw=False, callback=None, **operation_config)
This operation generates a thumbnail image with the user-specified width and height. By default, the service analyzes the image, identifies the region of interest (ROI), and generates smart cropping coordinates based on the ROI. Smart cropping helps when you specify an aspect ratio that differs from that of the input image. A successful response contains the thumbnail image binary. If the request failed, the response contains an error code and a message to help determine what went wrong.
Parameters: - width (int) – Width of the thumbnail. It must be between 1 and 1024. Recommended minimum of 50.
- height (int) – Height of the thumbnail. It must be between 1 and 1024. Recommended minimum of 50.
- url (str) – Publicly reachable URL of an image
- smart_cropping (bool) – Boolean flag for enabling smart cropping.
- custom_headers (dict) – headers that will be added to the request
- raw (bool) – returns the direct response alongside the deserialized response
- callback (Callable[Bytes, response=None]) – When specified, will be called with each chunk of data that is streamed. The callback should take two arguments, the bytes of the current chunk of data and the response object. If the data is uploading, response will be None.
- operation_config – Operation configuration overrides.
Returns: object or ClientRawResponse if raw=True
Return type: Generator or ClientRawResponse
Raises: HttpOperationError
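Since the documented return type is a Generator, the thumbnail can be written to disk chunk by chunk. The following is a sketch under the assumption that the generator yields byte chunks; save_thumbnail is a hypothetical helper, and it enforces the 1 to 1024 bounds given for width and height before calling the service.

```python
def save_thumbnail(client, url, path, width=100, height=100,
                   smart_cropping=True):
    """Write the thumbnail streamed by generate_thumbnail to `path`.

    Returns the number of bytes written.
    """
    for name, value in (("width", width), ("height", height)):
        # Both dimensions are documented as being between 1 and 1024.
        if not 1 <= value <= 1024:
            raise ValueError("%s must be between 1 and 1024" % name)
    chunks = client.generate_thumbnail(
        width, height, url, smart_cropping=smart_cropping)
    written = 0
    with open(path, "wb") as handle:
        # Assumption: the returned generator yields byte chunks.
        for chunk in chunks:
            handle.write(chunk)
            written += len(chunk)
    return written
```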
generate_thumbnail_in_stream(width, height, image, smart_cropping=False, custom_headers=None, raw=False, callback=None, **operation_config)
This operation generates a thumbnail image with the user-specified width and height. By default, the service analyzes the image, identifies the region of interest (ROI), and generates smart cropping coordinates based on the ROI. Smart cropping helps when you specify an aspect ratio that differs from that of the input image. A successful response contains the thumbnail image binary. If the request failed, the response contains an error code and a message to help determine what went wrong.
Parameters: - width (int) – Width of the thumbnail. It must be between 1 and 1024. Recommended minimum of 50.
- height (int) – Height of the thumbnail. It must be between 1 and 1024. Recommended minimum of 50.
- image (Generator) – An image stream.
- smart_cropping (bool) – Boolean flag for enabling smart cropping.
- custom_headers (dict) – headers that will be added to the request
- raw (bool) – returns the direct response alongside the deserialized response
- callback (Callable[Bytes, response=None]) – When specified, will be called with each chunk of data that is streamed. The callback should take two arguments, the bytes of the current chunk of data and the response object. If the data is uploading, response will be None.
- operation_config – Operation configuration overrides.
Returns: object or ClientRawResponse if raw=True
Return type: Generator or ClientRawResponse
Raises: HttpOperationError
get_text_operation_result(operation_id, custom_headers=None, raw=False, **operation_config)
This interface returns the result of a text operation. The URL to this interface should be retrieved from the ‘Operation-Location’ field returned by the Recognize Text interface.
Parameters: - operation_id (str) – Id of the text operation returned in the response of the ‘Recognize Text’ interface.
- custom_headers (dict) – headers that will be added to the request
- raw (bool) – returns the direct response alongside the deserialized response
- operation_config – Operation configuration overrides.
Returns: TextOperationResult or ClientRawResponse if raw=True
Return type: TextOperationResult or ClientRawResponse
Raises:
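Text recognition is asynchronous, so the result usually has to be polled. The loop below is a sketch of one way to do that. wait_for_text_result is a hypothetical helper, and it assumes that TextOperationResult has a status attribute whose terminal values compare equal to "Succeeded" and "Failed"; neither detail is stated on this page.

```python
import time


def wait_for_text_result(client, operation_id, interval=1.0, max_attempts=30):
    """Poll get_text_operation_result until the operation finishes."""
    for _ in range(max_attempts):
        result = client.get_text_operation_result(operation_id)
        # Assumption: `status` compares equal to "Succeeded" or "Failed"
        # once the operation has reached a terminal state.
        if result.status in ("Succeeded", "Failed"):
            return result
        time.sleep(interval)
    raise RuntimeError("text operation %s did not finish" % operation_id)
```

Bounding the number of attempts keeps a stalled operation from blocking the caller indefinitely.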
list_models(custom_headers=None, raw=False, **operation_config)
This operation returns the list of domain-specific models that are supported by the Computer Vision API. Currently, the API only supports one domain-specific model: a celebrity recognizer. A successful response will be returned in JSON. If the request failed, the response will contain an error code and a message to help understand what went wrong.
Parameters: - custom_headers (dict) – headers that will be added to the request
- raw (bool) – returns the direct response alongside the deserialized response
- operation_config – Operation configuration overrides.
Returns: ListModelsResult or ClientRawResponse if raw=True
Return type: ListModelsResult or ClientRawResponse
Raises:
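To check which models are available before calling analyze_image_by_domain, the result can be reduced to a list of names. This sketch assumes that ListModelsResult exposes its entries as models_property and that each entry has a name attribute; both are assumptions about the model classes, which this page does not document. model_names is a hypothetical helper.

```python
def model_names(client):
    """Return the names of the domain-specific models the service reports."""
    result = client.list_models()
    # Assumption: the entries live in `models_property`, each with `name`.
    return [model.name for model in result.models_property]
```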
recognize_printed_text(url, detect_orientation=True, language='unk', custom_headers=None, raw=False, **operation_config)
Optical Character Recognition (OCR) detects printed text in an image and extracts the recognized characters into a machine-usable character stream. Upon success, the OCR results will be returned. Upon failure, the error code together with an error message will be returned. The error code can be one of InvalidImageUrl, InvalidImageFormat, InvalidImageSize, NotSupportedImage, NotSupportedLanguage, or InternalServerError.
Parameters: - detect_orientation (bool) – Whether to detect the text orientation in the image. With detect_orientation=True, the OCR service tries to detect the image orientation and correct it before further processing (e.g. if it’s upside-down).
- url (str) – Publicly reachable URL of an image
- language (str or OcrLanguages) – The BCP-47 language code of the text to be detected in the image. The default value is ‘unk’. Possible values include: ‘unk’, ‘zh-Hans’, ‘zh-Hant’, ‘cs’, ‘da’, ‘nl’, ‘en’, ‘fi’, ‘fr’, ‘de’, ‘el’, ‘hu’, ‘it’, ‘ja’, ‘ko’, ‘nb’, ‘pl’, ‘pt’, ‘ru’, ‘es’, ‘sv’, ‘tr’, ‘ar’, ‘ro’, ‘sr-Cyrl’, ‘sr-Latn’, ‘sk’
- custom_headers (dict) – headers that will be added to the request
- raw (bool) – returns the direct response alongside the deserialized response
- operation_config – Operation configuration overrides.
Returns: OcrResult or ClientRawResponse if raw=True
Return type: OcrResult or ClientRawResponse
Raises:
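The OCR result is hierarchical, so a common follow-up step is flattening it into plain lines of text. The sketch below assumes that OcrResult nests regions, lines and words, with each word holding its characters in a text attribute; these attribute names are assumptions, and ocr_lines is a hypothetical helper.

```python
def ocr_lines(ocr_result):
    """Join the words of each recognized line into one string per line."""
    lines = []
    # Assumption: OcrResult nests regions -> lines -> words, and each
    # word carries its characters in a `text` attribute.
    for region in ocr_result.regions:
        for line in region.lines:
            lines.append(" ".join(word.text for word in line.words))
    return lines
```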
recognize_printed_text_in_stream(image, detect_orientation=True, language='unk', custom_headers=None, raw=False, callback=None, **operation_config)
Optical Character Recognition (OCR) detects printed text in an image and extracts the recognized characters into a machine-usable character stream. Upon success, the OCR results will be returned. Upon failure, the error code together with an error message will be returned. The error code can be one of InvalidImageUrl, InvalidImageFormat, InvalidImageSize, NotSupportedImage, NotSupportedLanguage, or InternalServerError.
Parameters: - detect_orientation (bool) – Whether to detect the text orientation in the image. With detect_orientation=True, the OCR service tries to detect the image orientation and correct it before further processing (e.g. if it’s upside-down).
- image (Generator) – An image stream.
- language (str or OcrLanguages) – The BCP-47 language code of the text to be detected in the image. The default value is ‘unk’. Possible values include: ‘unk’, ‘zh-Hans’, ‘zh-Hant’, ‘cs’, ‘da’, ‘nl’, ‘en’, ‘fi’, ‘fr’, ‘de’, ‘el’, ‘hu’, ‘it’, ‘ja’, ‘ko’, ‘nb’, ‘pl’, ‘pt’, ‘ru’, ‘es’, ‘sv’, ‘tr’, ‘ar’, ‘ro’, ‘sr-Cyrl’, ‘sr-Latn’, ‘sk’
- custom_headers (dict) – headers that will be added to the request
- raw (bool) – returns the direct response alongside the deserialized response
- callback (Callable[Bytes, response=None]) – When specified, will be called with each chunk of data that is streamed. The callback should take two arguments, the bytes of the current chunk of data and the response object. If the data is uploading, response will be None.
- operation_config – Operation configuration overrides.
Returns: OcrResult or ClientRawResponse if raw=True
Return type: OcrResult or ClientRawResponse
Raises:
recognize_text(url, mode, custom_headers=None, raw=False, **operation_config)
Recognize Text operation. When you use the Recognize Text interface, the response contains a field called ‘Operation-Location’. The ‘Operation-Location’ field contains the URL that you must use for your Get Recognize Text Operation Result operation.
Parameters: - mode (str or TextRecognitionMode) – Type of text to recognize. Possible values include: ‘Handwritten’, ‘Printed’
- url (str) – Publicly reachable URL of an image
- custom_headers (dict) – headers that will be added to the request
- raw (bool) – returns the direct response alongside the deserialized response
- operation_config – Operation configuration overrides.
Returns: None or ClientRawResponse if raw=True
Return type: None or ClientRawResponse
Raises:
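Because the deserialized return value is None, reading the ‘Operation-Location’ field requires raw=True. The sketch below shows one way to obtain an operation id for get_text_operation_result. It rests on two assumptions not confirmed by this page: that the raw response exposes HTTP headers as raw.response.headers, and that the operation id is the last path segment of the Operation-Location URL. Both helper names are hypothetical.

```python
def operation_id_from_location(operation_location):
    """Return the last path segment of an Operation-Location URL."""
    return operation_location.rstrip("/").rsplit("/", 1)[-1]


def start_text_recognition(client, url, mode="Printed"):
    """Start a Recognize Text operation and return its operation id."""
    # raw=True is needed because the deserialized return value is None.
    raw = client.recognize_text(url, mode, raw=True)
    # Assumption: the raw response exposes the HTTP headers as
    # `raw.response.headers`.
    location = raw.response.headers["Operation-Location"]
    return operation_id_from_location(location)
```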
recognize_text_in_stream(image, mode, custom_headers=None, raw=False, callback=None, **operation_config)
Recognize Text operation. When you use the Recognize Text interface, the response contains a field called ‘Operation-Location’. The ‘Operation-Location’ field contains the URL that you must use for your Get Recognize Text Operation Result operation.
Parameters: - image (Generator) – An image stream.
- mode (str or TextRecognitionMode) – Type of text to recognize. Possible values include: ‘Handwritten’, ‘Printed’
- custom_headers (dict) – headers that will be added to the request
- raw (bool) – returns the direct response alongside the deserialized response
- callback (Callable[Bytes, response=None]) – When specified, will be called with each chunk of data that is streamed. The callback should take two arguments, the bytes of the current chunk of data and the response object. If the data is uploading, response will be None.
- operation_config – Operation configuration overrides.
Returns: None or ClientRawResponse if raw=True
Return type: None or ClientRawResponse
Raises:
tag_image(url, language='en', custom_headers=None, raw=False, **operation_config)
This operation generates a list of words, or tags, that are relevant to the content of the supplied image. The Computer Vision API can return tags based on objects, living beings, scenery or actions found in images. Unlike categories, tags are not organized according to a hierarchical classification system, but correspond to image content. Tags may contain hints to avoid ambiguity or provide context, for example the tag ‘cello’ may be accompanied by the hint ‘musical instrument’. All tags are in English.
Parameters: - url (str) – Publicly reachable URL of an image
- language (str) – The desired language for output generation. If this parameter is not specified, the default value is "en". Supported languages: en - English (default), es - Spanish, ja - Japanese, pt - Portuguese, zh - Simplified Chinese. Possible values include: ‘en’, ‘es’, ‘ja’, ‘pt’, ‘zh’
- custom_headers (dict) – headers that will be added to the request
- raw (bool) – returns the direct response alongside the deserialized response
- operation_config – Operation configuration overrides.
Returns: TagResult or ClientRawResponse if raw=True
Return type: TagResult or ClientRawResponse
Raises:
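Tags are often filtered by confidence before use. The sketch below assumes that TagResult exposes a tags list whose items have name and confidence attributes; those names are assumptions, and both confident_tags and the 0.5 default threshold are illustrative.

```python
def confident_tags(tag_result, threshold=0.5):
    """Return the names of tags whose confidence is at least `threshold`."""
    # Assumption: TagResult exposes `tags`, each with `name` and `confidence`.
    return [tag.name for tag in tag_result.tags
            if tag.confidence >= threshold]
```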
tag_image_in_stream(image, language='en', custom_headers=None, raw=False, callback=None, **operation_config)
This operation generates a list of words, or tags, that are relevant to the content of the supplied image. The Computer Vision API can return tags based on objects, living beings, scenery or actions found in images. Unlike categories, tags are not organized according to a hierarchical classification system, but correspond to image content. Tags may contain hints to avoid ambiguity or provide context, for example the tag ‘cello’ may be accompanied by the hint ‘musical instrument’. All tags are in English.
Parameters: - image (Generator) – An image stream.
- language (str) – The desired language for output generation. If this parameter is not specified, the default value is "en". Supported languages: en - English (default), es - Spanish, ja - Japanese, pt - Portuguese, zh - Simplified Chinese. Possible values include: ‘en’, ‘es’, ‘ja’, ‘pt’, ‘zh’
- custom_headers (dict) – headers that will be added to the request
- raw (bool) – returns the direct response alongside the deserialized response
- callback (Callable[Bytes, response=None]) – When specified, will be called with each chunk of data that is streamed. The callback should take two arguments, the bytes of the current chunk of data and the response object. If the data is uploading, response will be None.
- operation_config – Operation configuration overrides.
Returns: TagResult or ClientRawResponse if raw=True
Return type: TagResult or ClientRawResponse
Raises: