AWS Machine Learning Blog

Build a work-from-home posture tracker with AWS DeepLens and GluonCV

April 2023 Update: Starting January 31, 2024, you will no longer be able to access AWS DeepLens through the AWS management console, manage DeepLens devices, or access any projects you have created. To learn more, refer to these frequently asked questions about AWS DeepLens end of life.

Working from home can be a big change to your ergonomic setup, which can make it hard for you to keep a healthy posture and take frequent breaks throughout the day. To help you maintain good posture and have fun with machine learning (ML) in the process, this post shows you how to build a posture tracker project with AWS DeepLens, the AWS programmable video camera for developers to learn ML. You will learn how to use the latest pose estimation ML models from GluonCV to map out body points from profile images of yourself working from home and send yourself text message alerts whenever your code detects bad posture. GluonCV is a computer vision library built on top of the Apache MXNet ML framework that provides off-the-shelf ML models from state-of-the-art deep learning research. With the ability to run GluonCV models on AWS DeepLens, engineers, researchers, and students can quickly prototype products, validate new ideas, and learn computer vision. In addition to detecting bad posture, you will learn to analyze your posture data over time with Amazon QuickSight, an AWS service that lets you easily create and publish interactive dashboards from your data.

This tutorial includes the following steps:

  1. Experiment with AWS DeepLens and GluonCV
  2. Classify postures with the GluonCV pose key points
  3. Deploy pre-trained GluonCV models to AWS DeepLens
  4. Send text message reminders to stretch when the tracker detects bad posture
  5. Visualize your posture data over time with Amazon QuickSight

The following diagram shows the architecture of our posture tracker solution.

Prerequisites

Before you begin this tutorial, make sure you have the following prerequisites:

Experimenting with AWS DeepLens and GluonCV

Normally, AWS developers use Jupyter notebooks hosted in Amazon SageMaker to experiment with GluonCV models. Jupyter notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations, and narrative text. In this tutorial you are going to create and run Jupyter notebooks directly on an AWS DeepLens device, just like any other Linux computer, in order to enable rapid experimentation.

Starting with AWS DeepLens software version 1.4.5, you can run GluonCV pretrained models directly on AWS DeepLens. To check the version number and update your software, go to the AWS DeepLens console, select your DeepLens device under Devices, and look at the Device status section. You should see a version number similar to the one in the following screenshot.

To start experimenting with GluonCV models on DeepLens, complete the following steps:

  1. SSH into your AWS DeepLens device.

To do so, you need the IP address of AWS DeepLens on the local network. To find the IP address, select your device on the AWS DeepLens console. Your IP address is listed in the Device Details section.

You also need to make sure that SSH is enabled for your device. For more information about enabling SSH on your device, see View or Update Your AWS DeepLens 2019 Edition Device Settings.

Open a terminal application on your computer. SSH into your DeepLens by entering the following code into your terminal application:

ssh aws_cam@<YOUR_DEEPLENS_IP>

When you see a password prompt, enter the SSH password you chose when you set up SSH on your device.

  1. Install Jupyter notebook and GluonCV on your DeepLens. Enter each of the following commands one at a time in the SSH terminal. Press Enter after each line entry.
    sudo python3 -m pip install --upgrade pip
    
    sudo python3 -m pip install notebook
    
    sudo python3.7 -m pip install ipykernel
    
    python3.7 -m ipykernel install  --name 'Python3.7' --user
    
    sudo python3.7 -m pip install gluoncv
    
  2. Generate a default configuration file for Jupyter notebook:
    jupyter notebook --generate-config
  3. Edit the Jupyter configuration file in your SSH session to allow access to the Jupyter notebook running on AWS DeepLens from your laptop.
    nano ~/.jupyter/jupyter_notebook_config.py
  4. Add the following lines to the top of the config file:
    c.NotebookApp.ip = '0.0.0.0'
    c.NotebookApp.open_browser = False
    
  5. Save the file (if you are using the nano editor, press Ctrl+X and then Y).
  6. Open up a port in the AWS DeepLens firewall to allow traffic to Jupyter notebook. See the following code:
    sudo ufw allow 8888
  7. Run the Jupyter notebook server with the following code:
    jupyter notebook

    You should see output like the following screenshot:

  8. Copy the link and replace the host portion (DeepLens or 127.0.0.1) with the IP address of your AWS DeepLens device. The link has the following form:
    http://(DeepLens or 127.0.0.1):8888/?token=sometoken

    For example, the URL based on the preceding screenshot is http://10.0.0.250:8888/?token=7adf9c523ba91f95cfc0ba3cacfc01cd7e7b68a271e870a8.

  9. Enter this link into your laptop web browser.
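If you prefer not to edit the link by hand, the host swap in step 8 can be sketched with Python's standard library. The function name `replace_host` is hypothetical, and 10.0.0.250 is only the example address from this post; use the IP address of your own device.

```python
from urllib.parse import urlsplit, urlunsplit


def replace_host(url, new_host):
    """Return url with its host swapped for new_host, keeping port, path, and token."""
    parts = urlsplit(url)
    if parts.port is None:
        netloc = new_host
    else:
        netloc = "{}:{}".format(new_host, parts.port)
    return urlunsplit(parts._replace(netloc=netloc))


# 10.0.0.250 is the example device address used in this post; substitute your own.
print(replace_host("http://127.0.0.1:8888/?token=sometoken", "10.0.0.250"))
```

The token and port are left untouched, so the resulting link can be pasted straight into the browser on your laptop.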

You should see something like the following screenshot.

  1. Choose New to create a new notebook.
  2. Choose Python3.7.

Capturing a frame from your camera

To capture a frame from the camera, first make sure you aren’t running any projects on AWS DeepLens.

  1. On the AWS DeepLens console, go to your device page.
  2. If a project is deployed, you should see a project name in the Current Project pane. Choose Remove Project if there is a project deployed to your AWS DeepLens.
  3. Go back to the Jupyter notebook running on your AWS DeepLens and enter the following code into your first code cell:
    import awscam
    import cv2
    
    ret,frame = awscam.getLastFrame()
    print(frame.shape)
    
  4. Press Shift+Enter to execute the code inside the cell.

Alternatively, you can press the Run button in the Jupyter toolbar as shown in the screenshot below:

You should see the size of the image captured by AWS DeepLens similar to the following text:

(1520, 2688, 3)

The three numbers show the height, width, and number of color channels (red, green, blue) of the image.
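As a small illustration of that layout, the sketch below unpacks the shape of a stand-in NumPy array with the same dimensions as the camera frame. It also shows how to reverse the channel axis: if the colors look swapped when you plot the frame, the channels may be stored in OpenCV's blue-green-red order. That is an assumption about the camera output, not something this post confirms.

```python
import numpy as np

# Stand-in for the frame returned by awscam.getLastFrame().
frame = np.zeros((1520, 2688, 3), dtype=np.uint8)
frame[:, :, 0] = 255  # fill the first channel so the reorder below is visible

height, width, channels = frame.shape
print("height:", height, "width:", width, "channels:", channels)

# Reverse the channel axis (for example BGR to RGB); this creates a view, not a copy.
reordered = frame[:, :, ::-1]
print(reordered.shape, reordered[0, 0].tolist())
```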

  1. To view the image, enter the following code in the next code cell:
    %matplotlib inline
    from matplotlib import pyplot as plt
    plt.imshow(frame)
    plt.show()
    

    You should see an image similar to the following screenshot:

Detecting people and poses

Now that you have an image, you can use GluonCV pre-trained models to detect people and poses. For more information, see Predict with pre-trained Simple Pose Estimation models from the GluonCV model zoo.

  1. In a new code cell, enter the following code to import the necessary dependencies:
    import mxnet as mx
    from gluoncv import model_zoo, data, utils
    from gluoncv.data.transforms.pose import detector_to_simple_pose, heatmap_to_coord
    
  2. You load two pre-trained models, one to detect people (yolo3_mobilenet1.0_coco) in the frame and one to detect the pose (simple_pose_resnet18_v1b) for each person detected. To load the pre-trained models, enter the following code in a new code cell:
    people_detector = model_zoo.get_model('yolo3_mobilenet1.0_coco', pretrained=True)
    pose_detector = model_zoo.get_model('simple_pose_resnet18_v1b', pretrained=True)
    
  3. Because the yolo3_mobilenet1.0_coco pre-trained model is trained to detect many types of objects in addition to people, the code below narrows down the detection criteria to just people so that the model runs faster. For more information about the other types of objects that the model can predict, see the GluonCV MSCoco Detection source code.
    people_detector.reset_class(["person"], reuse_weights=['person'])
  4. The following code shows how to use the people detector to detect people in the frame. The outputs of the people detector are the class_IDs (just “person” in this use case because we’ve limited the model’s search scope), the confidence scores, and a bounding box around each person detected in the frame.
    img = mx.nd.array(frame)
    x, img = data.transforms.presets.ssd.transform_test(img, short=256)
    class_IDs, scores, bounding_boxs = people_detector(x)
    
  5. Enter the following code to feed the results from the people detector into the pose detector for each person found. Normally you need to use the bounding boxes to crop out each person found in the frame by the people detector, then resize each cropped person image into appropriately sized inputs for the pose detector. Fortunately GluonCV comes with a detector_to_simple_pose function that takes care of cropping and resizing for you.
    pose_input, upscale_bbox = detector_to_simple_pose(img, class_IDs, scores, bounding_boxs)
    
    predicted_heatmap = pose_detector(pose_input)
    pred_coords, confidence = heatmap_to_coord(predicted_heatmap, upscale_bbox)
    
  6. The following code overlays the results of the pose detector onto the original image so you can visualize the result:
    ax = utils.viz.plot_keypoints(img, pred_coords, confidence,
                                  class_IDs, bounding_boxs,scores, box_thresh=0.5, keypoint_thresh=0.2)
    plt.show()

After completing steps 1-6, you should see an image similar to the following screenshot.

If you get an error similar to the ValueError output below, make sure you have at least one person in the camera’s view.

ValueError: In HybridBlock, there must be one NDArray or one Symbol in the input. Please check the type of the args

So far, you experimented with a pose detector on AWS DeepLens using Jupyter notebooks. You can now collect some data to figure out how to detect when someone is hunching, sitting, or standing. To collect data, you can save the image frame from the camera out to disk using the built-in OpenCV module. See the following code:

cv2.imwrite('output.jpg', frame)

Classifying postures with the GluonCV pose key points

After you have collected a few samples of different postures, you can start to detect bad posture by applying some rudimentary rules.

Understanding the GluonCV pose estimation key points

The GluonCV pose estimation model outputs 17 key points for each person detected. In this section, you see how those points are mapped to human body joints and how to apply simple rules to determine if a person is sitting, standing, or hunching.

This solution makes the following assumptions:

  • The camera sees your entire body from head to toe, regardless of whether you are sitting or standing
  • The camera sees a profile view of your body
  • No obstacles exist between camera and the subject

The following is an example input image. We’ve asked the actor in this image to face the camera instead of showing the profile view to illustrate the key body joints produced by the pose estimation model.

The following image is the output of the model drawn as lines and key points onto the input image. The cyan rectangle shows where the people detector thinks a person is in the image.

The following code shows the raw results of the pose detector. The code comments show how each entry maps to a point on the human body:

array([[142.96875,  84.96875],# Nose
       [152.34375,  75.59375],# Right Eye
       [128.90625,  75.59375],# Left Eye
       [175.78125,  89.65625],# Right Ear
       [114.84375,  99.03125],# Left Ear
       [217.96875, 164.65625],# Right Shoulder
       [ 91.40625, 178.71875],# Left Shoulder
       [316.40625, 197.46875],# Right Elbow
       [  9.375  , 232.625  ],# Left Elbow
       [414.84375, 192.78125],# Right Wrist
       [ 44.53125, 244.34375],# Left Wrist
       [199.21875, 366.21875],# Right Hip
       [128.90625, 366.21875],# Left Hip
       [208.59375, 506.84375],# Right Knee
       [124.21875, 506.84375],# Left Knee
       [215.625  , 570.125  ],# Right Ankle
       [121.875  , 570.125  ]],# Left Ankle
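To make the rows easier to work with, you can pair each one with its label. The sketch below copies the labels and coordinates from the listing above; the names `KEYPOINT_LABELS` and `label_keypoints` are hypothetical helpers, not part of GluonCV.

```python
# Labels in the same order as the comments in the listing above.
KEYPOINT_LABELS = [
    "Nose", "Right Eye", "Left Eye", "Right Ear", "Left Ear",
    "Right Shoulder", "Left Shoulder", "Right Elbow", "Left Elbow",
    "Right Wrist", "Left Wrist", "Right Hip", "Left Hip",
    "Right Knee", "Left Knee", "Right Ankle", "Left Ankle",
]

# Coordinates copied from the example output above.
sample_coords = [
    [142.96875, 84.96875], [152.34375, 75.59375], [128.90625, 75.59375],
    [175.78125, 89.65625], [114.84375, 99.03125], [217.96875, 164.65625],
    [91.40625, 178.71875], [316.40625, 197.46875], [9.375, 232.625],
    [414.84375, 192.78125], [44.53125, 244.34375], [199.21875, 366.21875],
    [128.90625, 366.21875], [208.59375, 506.84375], [124.21875, 506.84375],
    [215.625, 570.125], [121.875, 570.125],
]


def label_keypoints(coords):
    """Map each label to its (x, y) coordinate for one detected person."""
    if len(coords) != len(KEYPOINT_LABELS):
        raise ValueError("expected {} key points".format(len(KEYPOINT_LABELS)))
    return {label: (float(x), float(y)) for label, (x, y) in zip(KEYPOINT_LABELS, coords)}


labeled = label_keypoints(sample_coords)
print(labeled["Left Hip"], labeled["Left Knee"])
```

Looking points up by name instead of by row index makes the geometric rules later in this post easier to read and adjust.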

Deploying pre-trained GluonCV models to AWS DeepLens

In the following steps, you convert your code written in the Jupyter notebook to an AWS Lambda inference function to run on AWS DeepLens. The inference function optimizes the model to run on AWS DeepLens and feeds each camera frame into the model to get predictions.

This tutorial provides an example inference Lambda function for you to use. You can also copy and paste code sections directly from the Jupyter notebook you created earlier into the Lambda code editor.

Before creating the Lambda function, you need an Amazon Simple Storage Service (Amazon S3) bucket to save the results of your posture tracker for analysis in Amazon QuickSight. If you don’t have an Amazon S3 Bucket, see How to create an S3 bucket.

To create a Lambda function to deploy to AWS DeepLens, complete the following steps:

  1. Download aws-deeplens-posture-lambda.zip onto your computer.
  2. On the Lambda console, choose Create Function.
  3. Choose Author from scratch and choose the following options:
    1. For Runtime, choose Python 3.7.
    2. For Choose or create an execution role, choose Use an existing role.
    3. For Existing role, enter service-role/AWSDeepLensLambdaRole.
  4. After you create the function, go to the function’s detail page.
  5. For Code entry type, choose Upload zip.
  6. Upload the aws-deeplens-posture-lambda.zip you downloaded earlier.
  7. Choose Save.
  8. In the AWS Lambda code editor, select the lambda_function.py file and enter the name of the Amazon S3 bucket where you want to store the results.
    S3_BUCKET = '<YOUR_S3_BUCKET_NAME>'
  9. Choose Save.
  10. From the Actions drop-down menu, choose Publish new version.
  11. Enter a version number and choose Publish. Publishing the function makes it available on the AWS DeepLens console so you can add it to your custom project.
  12. Give your AWS DeepLens Lambda function permissions to put files in the Amazon S3 bucket. Inside your Lambda function editor, click on Permissions, then click on the AWSDeepLensLambdaRole role name.
  13. You will be directed to the IAM editor for the AWSDeepLensLambdaRole role. Inside the IAM role editor, click Attach Policies.
  14. Type in S3 to search for the AmazonS3 policy and check the AmazonS3FullAccess policy. Click Attach Policy.

Understanding the Lambda function

This section walks you through some important parts of the Lambda function.

You load the GluonCV model with the following code:

detector = model_zoo.get_model('yolo3_mobilenet1.0_coco', \
                pretrained=True, root='/opt/awscam/artifacts/')
pose_net = model_zoo.get_model('simple_pose_resnet18_v1b', \
                pretrained=True, root='/opt/awscam/artifacts/')

# Note that we can reset the classes of the detector to only include
# human, so that the NMS process is faster.

detector.reset_class(["person"], reuse_weights=['person'])

You run the model frame by frame over the images from the camera with the following code:

ret, frame = awscam.getLastFrame()
img = mx.nd.array(frame)
x, img = data.transforms.presets.ssd.transform_test(img, short=200)

class_IDs, scores, bounding_boxs = detector(x)
pose_input, upscale_bbox = detector_to_simple_pose(img, class_IDs, scores, bounding_boxs)

predicted_heatmap = pose_net(pose_input)
pred_coords, confidence = heatmap_to_coord(predicted_heatmap, upscale_bbox)

The following code shows you how to send the text prediction results back to the cloud. Viewing the text results in the cloud is a convenient way to make sure the model is working correctly. Each AWS DeepLens device has a dedicated iot_topic automatically created to receive the inference results.

# Send the detection and pose results to the IoT console via MQTT
cloud_output = {
        'boxes': bounding_boxs,
        'box_scores': scores,
        'coords': pred_coords,
        'coord_scores': confidence
    }
client.publish(topic=iot_topic, payload=json.dumps(cloud_output))
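Note that json.dumps cannot serialize array objects directly, so the values need to be plain Python lists and numbers before they are published. The following is a minimal sketch of one way to do that, assuming the values are NumPy arrays; for MXNet NDArrays you would call .asnumpy() on each value first. The helper name `to_jsonable` and the sample values are hypothetical.

```python
import json

import numpy as np


def to_jsonable(value):
    """Convert NumPy arrays (and nested containers of them) to plain Python types."""
    if isinstance(value, np.ndarray):
        return value.tolist()
    if isinstance(value, np.generic):
        return value.item()
    if isinstance(value, dict):
        return {key: to_jsonable(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(item) for item in value]
    return value


# Illustrative values only; real results come from the detectors.
sample_output = {
    "boxes": np.array([[[10.0, 20.0, 110.0, 220.0]]]),
    "box_scores": np.array([[[0.97]]]),
}
payload = json.dumps(to_jsonable(sample_output))
print(payload)
```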

Using the preceding key points, you can apply the geometric rules shown in the following sections to calculate angles between the body joints to determine if the person is sitting, standing, or hunching. You can change the geometric rules to suit your setup. As a follow-up activity to this tutorial, you can collect the pose data and train a simple ML model to more accurately predict when someone is standing or sitting.

Sitting vs. Standing

To determine if a person is standing or sitting, use the angle between the horizontal (ground) and the line connecting the hip and knee.
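The rule can be sketched in isolation as follows. The function names are hypothetical, and the 25 and 155 degree thresholds are taken from the Lambda code shown later in this post; adjust them for your own camera placement. Image coordinates are assumed to have y increasing downward, so a knee directly below the hip gives an angle near 90 degrees.

```python
from math import atan2, degrees


def hip_knee_angle(hip, knee):
    """Angle in degrees between the horizontal and the line from hip to knee."""
    return degrees(atan2(knee[1] - hip[1], knee[0] - hip[0]))


def is_sitting(hip, knee, low=25, high=155):
    """A thigh that is close to horizontal (in either direction) counts as sitting."""
    angle = hip_knee_angle(hip, knee)
    return angle < low or angle > high


# Thigh nearly horizontal: knee far to the side of the hip, at almost the same height.
print(is_sitting(hip=(100, 300), knee=(200, 310)))
# Thigh nearly vertical: knee almost directly below the hip.
print(is_sitting(hip=(100, 300), knee=(105, 450)))
```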

Hunching

When a person hunches, their head is typically looking down and their back is crooked. You can use the angles between the ear and shoulder and the shoulder and hip to determine if someone is hunching. Again, you can modify these geometric rules as you see fit. The following code inside the provided AWS DeepLens Lambda function determines if a person is hunching:

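from math import atan2, degrees


def hip_and_hunch_angle(left_array):
    '''
    Compute the hip angle and the hunch angle from one side's key points.

    :param left_array: key points for the left side of the person; in a profile view the left and right sides overlap, so one side is enough
    :return: tuple of (hip-to-knee angle, absolute hunch angle), both in degrees
    '''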
    # hip to knee angle
    hipX = left_array[-2][0] - left_array[-3][0]
    hipY = left_array[-2][1] - left_array[-3][1]

    # hunch angle = (hip to shoulder ) - (shoulder to ear )
    # (hip to shoulder )
    hunchX1 = left_array[-3][0] - left_array[-6][0]
    hunchY1 = left_array[-3][1] - left_array[-6][1]

    ang1 = degrees(atan2(hunchY1, hunchX1))

    # (shoulder to ear)
    hunchX2 = left_array[-6][0] - left_array[-7][0]
    hunchY2 = left_array[-6][1] - left_array[-7][1]
    ang2 = degrees(atan2(hunchY2, hunchX2))

    return degrees(atan2(hipY, hipX)), abs(ang1 - ang2)


def sitting_and_hunching(left_array):
    hip_ang, hunch_ang = hip_and_hunch_angle(left_array)
    if hip_ang < 25 or hip_ang > 155:
        print("sitting")
        hip = 0
    else:
        print("standing")
        hip = 1
    if hunch_ang < 3:
        print("no hunch")
        hunch = 0
    else:
        hunch = 1
    return hip, hunch
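The post does not show how left_array is built. Judging by the negative indices used above (-7 for the ear, -6 for the shoulder, -3 for the hip, -2 for the knee), one plausible layout is the eight rows labelled "Left" in the key point listing, that is rows 2, 4, ..., 16. The sketch below works under that assumption; `left_side_points` and `hunch_angle` are hypothetical names, and the hunch calculation simply repeats the arithmetic from the function above.

```python
from math import atan2, degrees


def left_side_points(keypoints):
    """Assumed layout: rows 2, 4, ..., 16 hold the points labelled 'Left'."""
    return [tuple(point) for point in keypoints[2::2]]


def hunch_angle(left_array):
    """Absolute difference between the shoulder-to-hip and ear-to-shoulder directions."""
    ear, shoulder, hip = left_array[-7], left_array[-6], left_array[-3]
    ang1 = degrees(atan2(hip[1] - shoulder[1], hip[0] - shoulder[0]))
    ang2 = degrees(atan2(shoulder[1] - ear[1], shoulder[0] - ear[0]))
    return abs(ang1 - ang2)


# Synthetic 17-row key point list; only the ear, shoulder, and hip rows matter here.
keypoints = [(0.0, 0.0)] * 17
keypoints[4] = (100.0, 100.0)   # Left Ear
keypoints[6] = (100.0, 200.0)   # Left Shoulder
keypoints[12] = (100.0, 400.0)  # Left Hip
upright = hunch_angle(left_side_points(keypoints))

keypoints[4] = (60.0, 120.0)    # ear moved forward and down
hunched = hunch_angle(left_side_points(keypoints))
print(upright, hunched)
```

With the ear, shoulder, and hip on one line the angle is zero, and it grows as the head moves forward, which is what the hunch_ang < 3 check in the Lambda function relies on.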

Deploying the Lambda inference function to your AWS DeepLens device

To deploy your Lambda inference function to your AWS DeepLens device, complete the following steps:

  1. On the AWS DeepLens console, under Projects, choose Create new project.
  2. Choose Create a new blank project.
  3. For Project name, enter posture-tracker.
  4. Choose Add model.

To deploy a project, AWS DeepLens requires you to select a model and a Lambda function. In this tutorial, you download the GluonCV models directly onto AWS DeepLens from inside your Lambda function, so you can choose any existing model on the AWS DeepLens console to be deployed. The model selected on the AWS DeepLens console only serves as a stub and isn’t used in the Lambda function. If you don’t have an existing model, deploy a sample project and select the sample model.

  1. Choose Add function.
  2. Choose the Lambda function you created earlier.
  3. Choose Create.
  4. Select your newly created project and choose Deploy to device.
  5. On the Target device page, select your device from the list.
  6. Choose Review.
  7. On the Review and deploy page, choose Deploy.

To verify that the project has deployed successfully, you can check the text prediction results sent back to the cloud via AWS IoT Greengrass. For instructions on how to view the text results, see Viewing text output of custom model in AWS IoT Greengrass.

In addition to the text results, you can view the pose detection results overlaid on top of your AWS DeepLens live video stream. For instructions on viewing the live video stream, see Viewing AWS DeepLens Output Streams.

The following screenshot shows what you will see in the project stream:

Sending text reminders to stand and stretch

In this section, you use Amazon Simple Notification Service (Amazon SNS) to send reminder text messages when your posture tracker determines that you have been sitting or hunching for an extended period of time.

  1. Register a new SNS topic to publish messages to.
  2. After you create the topic, copy and save the topic ARN, which you need to refer to in the AWS DeepLens Lambda inference code.
  3. Subscribe your phone number to receive messages posted to this topic.

Amazon SNS sends a confirmation text message before your phone number can receive messages.

You can now change the access policy for the SNS topic to allow AWS DeepLens to publish to the topic.

  1. On the Amazon SNS console, choose Topics.
  2. Choose your topic.
  3. Choose Edit.
  4. On the Access policy tab, enter the following code. Be sure to replace YOUR_AWS_ACCOUNT_ID with your AWS account ID and the Resource value with the ARN of your topic. See How to find your Account ID.

    {
      "Version": "2008-10-17",
      "Id": "lambda_only",
      "Statement": [
        {
          "Sid": "allow-lambda-publish",
          "Effect": "Allow",
          "Principal": {
            "Service": "lambda.amazonaws.com"
          },
          "Action": "sns:Publish",
          "Resource": "arn:aws:sns:us-east-1:your-account-no:your-topic-name",
          "Condition": {
            "StringEquals": {
              "AWS:SourceOwner": "YOUR_AWS_ACCOUNT_ID"
            }
          }
        }
      ]
    }
    
  5. Update the AWS DeepLens Lambda function with the ARN for the SNS topic. See the following code:
    def publishtoSNSTopic(SittingTime=None, hunchTime=None):
        sns = boto3.client('sns')

        # Publish a simple message to the specified SNS topic
        response = sns.publish(
            TopicArn='arn:aws:sns:us-east-1:xxxxxxxxxx:deeplenspose',  # update topic arn
            Message='Alert: You have been sitting for {}, Stand up and stretch, and you have hunched for {}'.format(
                SittingTime, hunchTime),
        )

        print(SittingTime, hunchTime)
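The post does not show how the Lambda function decides that you have been sitting or hunching for "an extended period of time". One simple approach is to track how long a bad posture has persisted and publish only when a limit is reached. The class below is a hypothetical sketch of that idea, not the code from the provided Lambda function; the 10-second limit in the example is only for illustration.

```python
class PostureAlarm:
    """Fires once every limit_seconds of uninterrupted bad posture."""

    def __init__(self, limit_seconds):
        self.limit_seconds = limit_seconds
        self.started_at = None  # time the current bad-posture streak began

    def update(self, is_bad, now):
        """Record one observation; return True when an alert should be sent."""
        if not is_bad:
            self.started_at = None
            return False
        if self.started_at is None:
            self.started_at = now
        if now - self.started_at >= self.limit_seconds:
            self.started_at = now  # restart the streak so alerts repeat at most once per limit
            return True
        return False


alarm = PostureAlarm(limit_seconds=10)
for second, sitting in [(0, True), (5, True), (10, True), (12, True), (13, False), (14, True)]:
    if alarm.update(sitting, second):
        print("send reminder at", second, "seconds")
```

In the Lambda function, the place where this sketch prints would be where you call publishtoSNSTopic, and now would come from time.time().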
    

Visualizing your posture data over time with Amazon QuickSight

This next section shows you how to visualize your posture data with Amazon QuickSight. You first need to store the posture data in Amazon S3.

Storing the posture data in Amazon S3

The following code example records posture data once every second; you can adjust this interval to suit your needs. The code writes the records to a CSV file every 60 seconds and uploads the results to the Amazon S3 bucket you created earlier.

if len(physicalList) > 60:
    try:
        with open('/tmp/temp2.csv', 'w') as f:
            writer = csv.writer(f)
            writer.writerows(physicalList)
        physicalList = []
        write_to_s3('/tmp/temp2.csv', S3_BUCKET,
                    "Deeplens-posent/gluoncvpose/physicalstate-" + datetime.datetime.now().strftime(
                        "%Y-%b-%d-%H-%M-%S") + ".csv")
    except Exception as e:
        print(e)
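The write_to_s3 helper is not shown in this post. The sketch below is one guess at what it might look like, built on the upload_file method of a boto3 S3 client. To keep the example runnable without AWS credentials, the client is passed in as a fourth argument (the snippet above passes three) and a small recording stand-in takes the place of boto3.client('s3'). The row values and bucket name are placeholders.

```python
import csv
import datetime
import os
import tempfile

KEY_PREFIX = "Deeplens-posent/gluoncvpose/physicalstate-"  # prefix used in the snippet above


def build_object_key(now):
    """Build a timestamped object key in the same format as the snippet above."""
    return KEY_PREFIX + now.strftime("%Y-%b-%d-%H-%M-%S") + ".csv"


def write_to_s3(local_path, bucket, key, s3_client):
    """Hypothetical helper: upload a local file with a boto3 S3 client."""
    s3_client.upload_file(local_path, bucket, key)


class RecordingClient:
    """Stand-in for boto3.client('s3') that only records what would be uploaded."""

    def __init__(self):
        self.uploads = []

    def upload_file(self, filename, bucket, key):
        self.uploads.append((filename, bucket, key))


rows = [[1, 0], [1, 1]]  # illustrative records; the real column layout is not shown in this post
path = os.path.join(tempfile.gettempdir(), "posture_sketch.csv")
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(rows)

client = RecordingClient()
key = build_object_key(datetime.datetime.now())
write_to_s3(path, "example-bucket", key, client)
print(client.uploads[0][2])
```

On the device you would pass boto3.client('s3') instead of the recording stand-in.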

Your Amazon S3 bucket now starts to fill up with CSV files containing posture data. See the following screenshot.

Using Amazon QuickSight

You can now use Amazon QuickSight to create an interactive dashboard to visualize your posture data. First, make sure that Amazon QuickSight has access to the S3 bucket with your pose data.

  1. On the Amazon QuickSight console, from the menu bar, choose Manage QuickSight.
  2. Choose Security & permissions.
  3. Choose Add or remove.
  4. Select Amazon S3.
  5. Choose Select S3 buckets.
  6. Select the bucket containing your pose data.
  7. Choose Update.
  8. On the Amazon QuickSight landing page, choose New analysis.
  9. Choose New data set.

You see a variety of options for data sources.

  1. Choose S3.

A pop-up window appears that asks for your data source name and manifest file. A manifest file tells Amazon QuickSight where to look for your data and how your dataset is structured.

  1. To build a manifest file for your posture data files in Amazon S3, open your preferred text editor and enter the following code:
    {
      "fileLocations": [
        {
          "URIPrefixes": ["s3://YOUR_BUCKET_NAME/FOLDER_OF_POSE_DATA"]
        }
      ],
      "globalUploadSettings": {
        "format": "CSV",
        "delimiter": ",",
        "textqualifier": "'",
        "containsHeader": "true"
      }
    }
  2. Save the text file with the name manifest.json.
  3. In the New S3 data source window, select Upload.
  4. Upload your manifest file.
  5. Choose Connect.
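If you would rather generate the manifest from step 1 than type it by hand, the following sketch builds the same structure in Python. The function name `build_manifest` is hypothetical; the keys and values are the ones shown in the manifest above, and the bucket and folder names are placeholders to replace with your own.

```python
import json


def build_manifest(bucket, prefix):
    """Return a manifest dictionary pointing at CSV files under s3://bucket/prefix."""
    return {
        "fileLocations": [
            {"URIPrefixes": ["s3://{}/{}".format(bucket, prefix)]}
        ],
        "globalUploadSettings": {
            "format": "CSV",
            "delimiter": ",",
            "textqualifier": "'",
            "containsHeader": "true",
        },
    }


manifest = build_manifest("YOUR_BUCKET_NAME", "FOLDER_OF_POSE_DATA")
manifest_text = json.dumps(manifest, indent=2)
print(manifest_text)
```

Save the printed text as manifest.json and upload it in the New S3 data source window as described above.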

If you set up the data source successfully, you see a confirmation window like the following screenshot.

To troubleshoot any access or permissions errors, see How do I allow Amazon QuickSight access to my S3 bucket when I have a deny policy?

  1. Choose Visualize.

You can now experiment with the data to build visualizations. See the following screenshot.

The following bar graphs show visualizations you can quickly make with the posture data.

For instructions on creating more complex visualizations, see Tutorial: Create an Analysis.

Conclusion

In this post, you learned how to use Jupyter notebooks to prototype with AWS DeepLens, deploy a pre-trained GluonCV pose detection model to AWS DeepLens, send text messages using Amazon SNS based on triggers from the pose model, and visualize the posture data with Amazon QuickSight. You can deploy other GluonCV pre-trained models to AWS DeepLens or replace the hard-coded rules for classifying standing and sitting positions with a robust machine learning model. You can also dive deeper with Amazon QuickSight to reveal posture patterns over time.

For a detailed walkthrough of this tutorial and other tutorials, sample code, and project ideas with AWS DeepLens, see AWS DeepLens Recipes.


About the Authors

Phu Nguyen is a Product Manager for AWS DeepLens. He builds products that give developers of any skill level an easy, hands-on introduction to machine learning.

Raj Kadiyala is an AI/ML Tech Business Development Manager in AWS WWPS Partner Organization. Raj has over 12 years of experience in Machine Learning and likes to spend his free time exploring machine learning for practical every day solutions and staying active in the great outdoors of Colorado.