Friday, 2 December 2022

Putting Content-Aware Encoding to work with AWS Elemental MediaConvert "Automated ABR"

Thanks to faster internet, undersea fiber-optic cables and better compute power, most internet users now take streaming video for granted.

However, it is actually a very complicated process, and a very expensive one. Let's assume I am an OTT provider and I get a 4K video as input. I need to convert it to several resolutions to support different connection speeds, and I also need to make sure it looks good on different devices (PCs, TVs, mobile phones...). So I need a lot of compute power to create all these versions of the video, and I consume a lot of traffic to stream them. All of this costs a lot of money, and I am always looking for places to save.

One of the parameters directly related to the cost of video streaming is the bitrate. Generally speaking, bitrate is how much data my video contains per second. Obviously, if my video shows Malevich's Black Square, I need a very low bitrate, since all I need to display is a black screen. A crowded street, however, requires a very high bitrate. A lower bitrate produces smaller files, and smaller files require less storage and less internet traffic.
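To make the relation between bitrate and file size concrete, here is a small back-of-the-envelope sketch. The function name and the bitrates are made-up illustrative values, not measurements from my test, and the formula assumes a constant bitrate.

```python
def stream_size_mb(bitrate_kbps: float, duration_seconds: float) -> float:
    """Approximate size of a constant-bitrate stream in megabytes (10^6 bytes)."""
    bits = bitrate_kbps * 1000 * duration_seconds
    return bits / 8 / 1_000_000


if __name__ == "__main__":
    # Illustrative bitrates only; real values depend on codec and content.
    examples = {"black screen": 300, "talking head": 2500, "crowded street": 8000}
    for scene, kbps in examples.items():
        print(f"{scene}: {stream_size_mb(kbps, 600):.1f} MB for 10 minutes")
```

The same ten minutes of video differ in size by more than an order of magnitude between the two extremes, which is exactly where the saving comes from.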

OK, so now we know that adjusting the bitrate to the actual picture can potentially save a lot of money. But how exactly do I know whether my video contains a crowded street or a black screen? Can you put an army of cheap workers to go over all your videos and set the bitrate manually? Sounds like nonsense.

Another approach is to develop a sophisticated ML model that can determine the bitrate needed for specific pieces of the video. Yes, such models exist, and Netflix implemented this approach several years ago.

And like every good idea, it was adopted by other vendors, including AWS.

The service that performs the transcoding is AWS Elemental MediaConvert, and the feature is called "Automated ABR". While an enormous amount of work probably went into training such a model, from the end user's perspective all you need to do is enable this option in the MediaConvert job definition.
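For reference, this is roughly what enabling the option looks like in the job settings. The sketch is based on my reading of the MediaConvert job settings schema, where an output group carries an `AutomatedEncodingSettings` block with `AbrSettings` inside it. The helper name `enable_automated_abr` and the bitrate and rendition values are placeholders for illustration, not the values from my job, so check them against the API reference before using them.

```python
import copy


def enable_automated_abr(job_settings, max_bitrate=8_000_000,
                         min_bitrate=600_000, max_renditions=6):
    """Return a copy of job_settings with Automated ABR on the first output group."""
    settings = copy.deepcopy(job_settings)
    group = settings["OutputGroups"][0]
    # Assumed field names, taken from the MediaConvert job settings schema.
    group["AutomatedEncodingSettings"] = {
        "AbrSettings": {
            "MaxAbrBitrate": max_bitrate,    # bits per second
            "MinAbrBitrate": min_bitrate,    # bits per second
            "MaxRenditions": max_renditions,
        }
    }
    return settings


if __name__ == "__main__":
    # Skeleton settings only; a real job needs inputs and outputs as well.
    skeleton = {"OutputGroups": [{"Name": "Apple HLS"}]}
    print(enable_automated_abr(skeleton, max_renditions=4))
```

The helper works on a copy, so the original settings dictionary can still be reused for a job without Automated ABR.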



So I decided to do a little hands-on to see the actual result.

Since no one likes to reinvent the wheel, I used an AWS sample to transcode the video with MediaConvert.

The idea is simple: upload the media file to S3, a Lambda function picks up the file and uses MediaConvert to perform the encoding, and the result is stored back in an S3 bucket.
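The Lambda function learns about the upload through an S3 event notification. A minimal sketch of how the source location can be read from such an event; the function name and the sample event are made up for illustration, and a real notification carries many more fields.

```python
from urllib.parse import unquote_plus


def source_uri_from_event(event):
    """Build the s3:// URI of the uploaded object from an S3 event notification."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 events (a space becomes '+').
    key = unquote_plus(record["s3"]["object"]["key"])
    return f"s3://{bucket}/{key}"


if __name__ == "__main__":
    # Trimmed-down, hypothetical event with only the fields used above.
    sample = {"Records": [{"s3": {"bucket": {"name": "my-source-bucket"},
                                  "object": {"key": "inputs/my+video.mp4"}}}]}
    print(source_uri_from_event(sample))
```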

The guide is not bad. The only thing missing from the explanation is the creation of "MediaConvertRole".

I created it manually. It needs to have S3 full access and the following trust relationship:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "s3.amazonaws.com",
                    "mediaconvert.amazonaws.com"
                ]
            },
            "Action": "sts:AssumeRole"
        }
    ]
}


The guide contains a CloudFormation stack, and this role is one of its parameters.
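I created the role in the console, but the same thing can be scripted. A minimal sketch, assuming a boto3 IAM client is passed in; the function name `create_mediaconvert_role` is my own, and the trust policy simply mirrors the one shown above.

```python
import json

# Same trust relationship as in the policy document above.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": ["s3.amazonaws.com", "mediaconvert.amazonaws.com"]
            },
            "Action": "sts:AssumeRole",
        }
    ],
}

# AWS managed policy that grants full access to S3.
S3_FULL_ACCESS_ARN = "arn:aws:iam::aws:policy/AmazonS3FullAccess"


def create_mediaconvert_role(iam, role_name="MediaConvertRole"):
    """Create the role, attach S3 full access and return the role ARN.

    `iam` is expected to be a boto3 IAM client, e.g. boto3.client("iam").
    """
    response = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(TRUST_POLICY),
    )
    iam.attach_role_policy(RoleName=role_name, PolicyArn=S3_FULL_ACCESS_ARN)
    return response["Role"]["Arn"]
```

The returned ARN is the value to pass as the role parameter of the CloudFormation stack.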

I also created a separate JSON file with the job definition that has the "ABR" options enabled.
You can download it here.
And lastly, I modified the Lambda code a little bit to be more specific to our use case:
 
#!/usr/bin/env python

import json
import os
import traceback
import uuid
from urllib.parse import unquote_plus

import boto3


def handler(event, context):

    assetID = str(uuid.uuid4())
    sourceS3Bucket = event['Records'][0]['s3']['bucket']['name']
    # Object keys in S3 event notifications are URL-encoded (a space arrives as '+')
    sourceS3Key = unquote_plus(event['Records'][0]['s3']['object']['key'])
    sourceS3 = 's3://' + sourceS3Bucket + '/' + sourceS3Key
    sourceS3Basename = os.path.splitext(os.path.basename(sourceS3))[0]
    destinationS3 = 's3://' + os.environ['DestinationBucket']
    mediaConvertRole = os.environ['MediaConvertRole']
    region = os.environ['AWS_DEFAULT_REGION']
    statusCode = 200
    body = {}

    # Use MediaConvert SDK UserMetadata to tag jobs with the assetID
    # Events from MediaConvert will have the assetID in UserMetadata
    jobMetadata = {'assetID': assetID}

    print(json.dumps(event))

    try:
        # Job settings are in the lambda zip file in the current working directory
        with open('job_abr.json') as json_data:
            jobSettings = json.load(json_data)
            print(jobSettings)

        # get the account-specific mediaconvert endpoint for this region
        mc_client = boto3.client('mediaconvert', region_name=region)
        endpoints = mc_client.describe_endpoints()

        # add the account-specific endpoint to the client session
        # (TLS certificate verification stays at its default, i.e. enabled)
        client = boto3.client('mediaconvert', region_name=region,
                              endpoint_url=endpoints['Endpoints'][0]['Url'])

        # Update the job settings with the source video from the S3 event and destination
        # paths for converted videos
        jobSettings['Inputs'][0]['FileInput'] = sourceS3

        S3KeyHLS = 'assets_abr/' + assetID + '/HLS/' + sourceS3Basename
        jobSettings['OutputGroups'][0]['OutputGroupSettings']['HlsGroupSettings']['Destination'] \
            = destinationS3 + '/' + S3KeyHLS

        print('jobSettings:')
        print(json.dumps(jobSettings))

        # Convert the video using AWS Elemental MediaConvert.
        # The queue ARN is specific to my account and region; replace it with your own.
        job = client.create_job(
            Role=mediaConvertRole,
            UserMetadata=jobMetadata,
            Settings=jobSettings,
            Queue='arn:aws:mediaconvert:us-west-2:621094298987:queues/ABR',
        )
        print(json.dumps(job, default=str))

    except Exception as e:
        # Log the failure and report it through the status code of the response
        print('Exception: %s' % e)
        traceback.print_exc()
        statusCode = 500

    return {
        'statusCode': statusCode,
        'body': json.dumps(body),
        'headers': {'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*'}
    }
I used the same mp4 file as the one used in the original guide.

A few seconds after I uploaded the file to S3, MediaConvert picked up the job and started the transcoding process.
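The job status can also be followed from code rather than from the console. A minimal polling sketch, assuming a boto3 MediaConvert client and a job ID (for example the one returned by `create_job`); the function name, the polling interval and the injectable `sleep` parameter are my own choices.

```python
import time

# Statuses after which a MediaConvert job no longer changes.
TERMINAL_STATUSES = {"COMPLETE", "ERROR", "CANCELED"}


def wait_for_job(client, job_id, poll_seconds=15, sleep=time.sleep):
    """Poll MediaConvert until the job reaches a terminal status and return it.

    `client` is expected to be a boto3 MediaConvert client.
    """
    while True:
        status = client.get_job(Id=job_id)["Job"]["Status"]
        print(f"Job {job_id}: {status}")
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
```

Inside the Lambda function itself this kind of waiting is not a good idea, since transcoding can easily take longer than the function timeout; it is meant for a script run from a workstation.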




It took a while, but eventually the process completed and I saw all the files in the S3 bucket.
Since I didn't want to open my S3 bucket to public access, I used the AWS CLI "sync" command to copy the contents of the bucket to a local folder.
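For those who prefer to stay in Python, roughly the same copy can be done with boto3. This is only a sketch of a one-way download, not a replacement for "sync" (it does not skip files that already exist locally); the function name and its arguments are my own, and `s3` is assumed to be a boto3 S3 client.

```python
import os


def download_prefix(s3, bucket, prefix, local_dir):
    """Download every object under `prefix` into `local_dir`, keeping the key layout.

    `s3` is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    Returns the list of local file paths that were written.
    """
    downloaded = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                continue  # skip zero-byte "folder" placeholder objects
            target = os.path.join(local_dir, *key.split("/"))
            os.makedirs(os.path.dirname(target), exist_ok=True)
            s3.download_file(bucket, key, target)
            downloaded.append(target)
    return downloaded
```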



I used VLC to play the different resolutions.
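To see which renditions ended up in the HLS output without opening them one by one, the master playlist can be parsed. A minimal sketch; the function name is my own and the sample playlist is made up for illustration, it is not the output of my job.

```python
import re

# Matches KEY=value pairs, where the value may be a quoted string containing commas.
ATTRIBUTE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def list_renditions(master_playlist: str):
    """Return (bandwidth, resolution, uri) for each variant, lowest bandwidth first."""
    renditions = []
    lines = [line.strip() for line in master_playlist.splitlines() if line.strip()]
    for index, line in enumerate(lines):
        if not line.startswith("#EXT-X-STREAM-INF:"):
            continue
        pairs = ATTRIBUTE.findall(line.split(":", 1)[1])
        attrs = {key: value.strip('"') for key, value in pairs}
        # The URI of a variant is the line that follows its EXT-X-STREAM-INF tag.
        uri = lines[index + 1] if index + 1 < len(lines) else None
        renditions.append((int(attrs["BANDWIDTH"]), attrs.get("RESOLUTION"), uri))
    return sorted(renditions, key=lambda rendition: rendition[0])


if __name__ == "__main__":
    # Hypothetical playlist, for illustration only.
    sample = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-STREAM-INF:BANDWIDTH=2500000,AVERAGE-BANDWIDTH=2000000,RESOLUTION=1280x720,CODECS="avc1.64001f,mp4a.40.2"
video_720.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360,CODECS="avc1.4d401e,mp4a.40.2"
video_360.m3u8
"""
    for bandwidth, resolution, uri in list_renditions(sample):
        print(f"{resolution}: {bandwidth / 1000:.0f} kbps -> {uri}")
```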


And that's it. MediaConvert made it really easy to use a feature that would otherwise take a huge effort to implement yourself. One important note: the "ABR" feature is part of the Professional tier.
For the cost, you also need to consider S3 storage. A source file of several MB eventually produced about 300 MB of output to support the different resolutions and devices.
