How to use pandas in AWS Lambda

pandas is an open source library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. The pandas library is not available by default in AWS Lambda Python environments. If you try to import pandas in an AWS Lambda function, you will get the error below.

    
import json
import pandas

def lambda_handler(event, context):
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }

# Output 

Response
{
  "errorMessage": "Unable to import module 'lambda_function': No module named 'pandas'",
  "errorType": "Runtime.ImportModuleError",
  "requestId": "a9dfd983-8cd6-4fe2-85c3-5e107ac230a4",
  "stackTrace": []
}
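
To check whether pandas is importable in a given environment without triggering the error above, the module can be looked up before importing it. This is a minimal sketch using only the standard library; the helper name `module_available` is made up for illustration.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be found."""
    # find_spec returns None when the module is not on the import path.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Inside a Lambda function without a pandas layer this should print False.
    print("pandas available:", module_available("pandas"))
```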

    

To use the pandas library in a Lambda function, a Lambda layer needs to be attached to the function. This tutorial lists the steps required to create and attach a Lambda layer for the pandas module.

Note: Steps 1 to 6 need to be performed on an EC2 instance that uses the same Amazon Linux version as AWS Lambda, so that the installed dependencies match the Lambda environment. The steps in this tutorial were performed with Python 3.9.7; to follow them, make sure you are using Python 3.9.7.

Step 1: Create Python Virtual Environment


python3.9 -m venv test_venv


Step 2: Activate Virtual Environment

source test_venv/bin/activate

Step 3: Check Python Version


python --version  
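
The runtime chosen later in the console has to match this version. As a small sketch, the Lambda runtime name (for example python3.9) can be derived from the interpreter version; `lambda_runtime_id` is a hypothetical helper, not part of any AWS tool.

```python
import sys


def lambda_runtime_id(version_info=sys.version_info) -> str:
    """Map a Python version tuple to a Lambda runtime name such as 'python3.9'."""
    # Only major and minor matter; the patch level (the 7 in 3.9.7) is ignored.
    major, minor = version_info[0], version_info[1]
    return f"python{major}.{minor}"


if __name__ == "__main__":
    print(lambda_runtime_id())
```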


Step 4: Create a directory named python


mkdir python


Step 5: Install the pandas library into the python directory created in Step 4


pip install pandas -t python  


Step 6: Zip python directory


zip -r pandas.zip python
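
Steps 4 to 6 can also be scripted from Python. The sketch below assumes it is run with the virtual environment's interpreter on the same EC2 instance; `install_package` and `zip_layer` are illustrative names, and `shutil.make_archive` stands in for the zip command.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def install_package(package: str, target: Path) -> None:
    """Rough equivalent of `pip install <package> -t <target>`."""
    target.mkdir(parents=True, exist_ok=True)
    # Use the running interpreter's pip so the venv's Python version is used.
    subprocess.run(
        [sys.executable, "-m", "pip", "install", package, "--target", str(target)],
        check=True,
    )


def zip_layer(work_dir: Path, archive_name: str = "pandas") -> Path:
    """Zip work_dir/python so that every archive entry starts with 'python/'."""
    archive = shutil.make_archive(
        str(work_dir / archive_name),  # ".zip" is appended automatically
        "zip",
        root_dir=work_dir,
        base_dir="python",
    )
    return Path(archive)
```

Calling `install_package("pandas", Path("python"))` followed by `zip_layer(Path("."))` should leave a pandas.zip in the current directory, as in Step 6.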


Step 7: Log in to your AWS account and navigate to the AWS Lambda service.

Step 8: In AWS Lambda, select Layers from Additional resources.

Step 9: Click on Create layer and enter the required information.
  • Name: pandas_layer
  • Description: Lambda layer for pandas module
  • Select Upload a .zip file, click on Upload and choose the pandas.zip created in Step 6
  • Compatible architectures - optional: x86_64
  • Compatible runtimes: choose the runtime that matches the Python version from the output of Step 3

Step 10: Click on Create
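
Steps 9 and 10 have a programmatic counterpart in the boto3 `publish_layer_version` call. This is only a sketch, assuming boto3 is installed and AWS credentials and a region are already configured; the helper names are hypothetical, and the values mirror the ones entered in the console above.

```python
from pathlib import Path


def layer_request(zip_path: Path, runtime: str = "python3.9") -> dict:
    """Build the arguments for publishing pandas.zip as a layer version."""
    return {
        "LayerName": "pandas_layer",
        "Description": "Lambda layer for pandas module",
        "Content": {"ZipFile": zip_path.read_bytes()},
        "CompatibleRuntimes": [runtime],
        "CompatibleArchitectures": ["x86_64"],
    }


def publish_layer(zip_path: Path, runtime: str = "python3.9") -> str:
    """Publish the layer and return its version ARN (calls AWS)."""
    import boto3  # imported here so layer_request works without boto3

    client = boto3.client("lambda")
    response = client.publish_layer_version(**layer_request(zip_path, runtime))
    return response["LayerVersionArn"]
```

The returned ARN identifies the specific layer version, which is what gets attached to a function.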

Step 11: Navigate to AWS Lambda and select Functions

Step 12: Click on Create function

Step 13: Select Author from scratch

Step 14: Enter the details below in Basic information
  • Function name: test_lambda_function
  • Runtime: choose the runtime that matches the Python version from the output of Step 3
  • Architecture: x86_64

Step 15: Click on Create function

Step 16: In the Function overview pane, click on Layers, or scroll down to the Layers section

Step 17: Click on Add a layer

Step 18: Select Custom layers, choose the layer created in Step 9, select version 1 and click on Add.
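
Attaching the layer can likewise be sketched with boto3, assuming boto3 and AWS credentials are available. `update_function_configuration` replaces the function's whole layer list, so this sketch reads the current layers first and appends the new one; `merged_layers` and `attach_layer` are illustrative names.

```python
def merged_layers(existing_arns, new_arn):
    """Return the existing layer ARNs plus new_arn, without duplicates."""
    result = list(existing_arns)
    if new_arn not in result:
        result.append(new_arn)
    return result


def attach_layer(function_name: str, layer_version_arn: str) -> None:
    """Add a layer version to a function while keeping its other layers."""
    import boto3  # imported here so merged_layers works without boto3

    client = boto3.client("lambda")
    config = client.get_function_configuration(FunctionName=function_name)
    existing = [layer["Arn"] for layer in config.get("Layers", [])]
    client.update_function_configuration(
        FunctionName=function_name,
        Layers=merged_layers(existing, layer_version_arn),
    )
```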

Step 19: Write the code below in the Lambda function and click on Deploy


import logging
import pandas as pd

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def lambda_handler(event, context):
    logger.info(pd.__version__)
    


Step 20: Click on Test, enter any name for the test event under Configure test event and click on Create

Step 21: Click on Test again; you should see the pandas version in the output.
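
With the handler from Step 19, the version is written to the log output and the function itself returns nothing. As a variation, the version could be returned in the response body instead. This is a sketch, not part of the original steps; `build_response` is a hypothetical helper, and pandas is imported inside the handler only so the helper can be tried without the layer (a module-level import is more usual in a real function).

```python
import json


def build_response(version: str) -> dict:
    """Wrap a version string in an HTTP-style Lambda response."""
    return {
        "statusCode": 200,
        "body": json.dumps({"pandas_version": version}),
    }


def lambda_handler(event, context):
    # Resolved from the attached pandas layer when the function is invoked.
    import pandas as pd

    return build_response(pd.__version__)
```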

If you don't want to create your own layer, you can directly use Lambda layers from this link.
