
SageMaker Processing Job With PySpark And Step Functions

This is my problem: I have to run a SageMaker processing job that executes custom code written in PySpark. I've used the SageMaker SDK by running these commands: spark_processor = sagemake…
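The snippet in the question is cut off, but a processing job like this is typically created with the SDK's PySparkProcessor. The following is a minimal sketch of that pattern, assuming the PySparkProcessor class from sagemaker.spark.processing; the job name, role ARN, bucket paths, and instance settings are placeholders, not values from the question.

```python
from sagemaker.spark.processing import PySparkProcessor

# Hypothetical processor configuration; replace the role ARN, instance settings,
# and Spark version with values that match your account and workload.
spark_processor = PySparkProcessor(
    base_job_name="my-pyspark-job",
    framework_version="3.1",
    role="arn:aws:iam::123456789012:role/MySageMakerRole",  # placeholder role ARN
    instance_count=2,
    instance_type="ml.m5.xlarge",
    max_runtime_in_seconds=1200,
)

# Submit the PySpark script; the S3 locations below are placeholders.
spark_processor.run(
    submit_app="s3://my-bucket/code/preprocess.py",
    arguments=[
        "--input", "s3://my-bucket/input/",
        "--output", "s3://my-bucket/output/",
    ],
)
```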

Solution 1:

The SageMaker SDK is not installed by default in the Lambda runtime environment: you should include it in the Lambda deployment package (zip) that you upload to S3.

There are various ways to do this; one of the easiest is to deploy your Lambda with the Serverless Application Model (SAM) CLI. In that case it may be enough to add sagemaker to the requirements.txt in the folder that contains your Lambda code, and SAM will ensure that the dependency is included in the zip.

Alternatively, you can create the zip manually with pip install sagemaker -t lambda_folder, but you should run this command on an Amazon Linux OS, for example on an EC2 instance with the appropriate image or in a Docker container, so that any compiled dependencies match the Lambda runtime. Search for "python dependencies in aws lambda" for more info.
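Whichever packaging route you choose, the handler that ends up in the zip can then import sagemaker and start the processing job. Below is a minimal sketch of such a handler, assuming the same PySparkProcessor class as above; the function name, event keys, role ARN, and S3 paths are placeholders.

```python
from sagemaker.spark.processing import PySparkProcessor


def lambda_handler(event, context):
    # This import only succeeds if the sagemaker package was bundled into the
    # deployment zip; otherwise the function fails with "No module named 'sagemaker'".
    spark_processor = PySparkProcessor(
        base_job_name="my-pyspark-job",
        framework_version="3.1",
        role="arn:aws:iam::123456789012:role/MySageMakerRole",  # placeholder
        instance_count=2,
        instance_type="ml.m5.xlarge",
    )

    # wait=False returns immediately so the Lambda does not time out while the
    # processing job runs; logs must be disabled when not waiting. The Step
    # Functions workflow can poll the job status separately.
    spark_processor.run(
        submit_app=event.get("submit_app", "s3://my-bucket/code/preprocess.py"),
        wait=False,
        logs=False,
    )

    return {"processing_job_name": spark_processor.latest_job.job_name}
```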
