I have been playing around with AWS Batch, and I am having trouble understanding why everything works when I build a Docker image on my local Windows machine and push it to ECR, but not when I do the same from an Ubuntu EC2 instance.
What I show below is adapted from this tutorial.
The Dockerfile is very simple:
FROM python:3.6.10-alpine
RUN apk add --no-cache --upgrade bash
COPY ./ /usr/local/aws_batch_tutorial
RUN pip3 install -r /usr/local/aws_batch_tutorial/requirements.txt
WORKDIR /usr/local/aws_batch_tutorial
The local folder contains the following bash script (run_job.sh):
#!/bin/bash
error_exit () {
    echo "${BASENAME} - ${1}" >&2
    exit 1
}
################################################################################
###### Convert environment variables to command line arguments #########
pat="--([^ ]+).+"
arg_list=""
while IFS= read -r line; do
    # Check if line contains a command line argument
    if [[ $line =~ $pat ]]; then
        E=${BASH_REMATCH[1]}
        # Check that a matching environment variable is declared
        if [[ ! ${!E} == "" ]]; then
            # Make sure argument isn't already included in argument list
            if [[ ! ${arg_list} =~ "--${E}=" ]]; then
                # Add to argument list
                arg_list="${arg_list} --${E}=${!E}"
            fi
        fi
    fi
done < <(python3 script.py --help)
################################################################################
python3 -u script.py ${arg_list} | tee "${save_name}.txt"
aws s3 cp "./${save_name}.p" "s3://bucket/${save_name}.p" || error_exit "Failed to upload results to s3 bucket."
aws s3 cp "./${save_name}.txt" "s3://bucket/logs/${save_name}.txt" || error_exit "Failed to upload logs to s3 bucket."
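To make the conversion step concrete, here is a rough Python equivalent of the loop in run_job.sh. It is only a sketch: the function name `build_arg_list`, the sample help text, and the sample environment values are all made up for illustration and are not taken from the actual script.py.

```python
import re

# Same pattern as in run_job.sh: capture the option name that follows "--".
PATTERN = re.compile(r"--([^ ]+).+")


def build_arg_list(help_text, env):
    """Return the ' --name=value' string that the bash loop would build."""
    arg_list = ""
    for line in help_text.splitlines():
        match = PATTERN.search(line)
        if not match:
            continue
        name = match.group(1)
        value = env.get(name, "")
        # Skip options without a non-empty matching variable,
        # and options that were already added.
        if value and f"--{name}=" not in arg_list:
            arg_list += f" --{name}={value}"
    return arg_list


# Hypothetical --help output and environment, for illustration only.
sample_help = """usage: script.py [-h] [--bucket BUCKET] [--save_name SAVE_NAME]

optional arguments:
  -h, --help            show this help message and exit
  --bucket BUCKET       bucket to list
  --save_name SAVE_NAME
                        base name for the output files
"""
sample_env = {"bucket": "my-bucket", "save_name": "run1"}

print(build_arg_list(sample_help, sample_env))
```

With these sample inputs the result is ` --bucket=my-bucket --save_name=run1` (with a leading space, as in the bash version). In the Batch job the values would come from the environment variables set in the job definition.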
It also contains a requirements.txt file listing three packages (awscli, boto3, botocore) and a dummy Python script (script.py) that simply lists the files in an S3 bucket and saves the list to a file, which is then uploaded to S3.
Both in my local Windows environment and on the EC2 instance I have set up my AWS credentials with aws configure, and in both cases I can successfully build the image, tag it and push it to ECR.
The problem arises when I submit the job to AWS Batch, which should run the ECR image with the command ["./run_job.sh"]:
- if AWS Batch uses the ECR image pushed from Windows, everything works fine
- if it uses the image pushed from the Ubuntu EC2 instance, the job fails, and the only information I can get is this:
Status reason: Task failed to start
I was wondering if anyone has any idea of what might be causing the error.
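One difference between the two build hosts that may be worth ruling out: when a Linux image is built from a Windows host, Docker gives every file copied from the build context `-rwxr-xr-x` permissions, whereas a build on Linux keeps each file's original mode bits. If run_job.sh is not marked executable on the Ubuntu instance, the command ["./run_job.sh"] could fail to start there while working in the image built on Windows. This is a guess, not a confirmed diagnosis. A minimal sketch of the check, with hypothetical helper names and a throwaway file standing in for run_job.sh:

```python
import os
import stat
import tempfile

# Execute bits for user, group and other.
EXEC_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def is_executable(path):
    """Return True if any execute bit is set on path."""
    return bool(os.stat(path).st_mode & EXEC_BITS)


def make_executable(path):
    """Set the execute bits for user, group and other (similar to chmod a+x)."""
    os.chmod(path, os.stat(path).st_mode | EXEC_BITS)


# Demo on a temporary file; on the build host the path would be run_job.sh.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "run_job.sh")
    with open(script, "w") as fh:
        fh.write("#!/bin/bash\necho hello\n")
    os.chmod(script, 0o644)        # a common mode for a newly created file on Linux
    print(is_executable(script))   # False
    make_executable(script)
    print(is_executable(script))   # True
```

If the check returns False for run_job.sh on the Ubuntu instance, setting the execute bit before building (or running the script through bash explicitly in the job command) would be the thing to try.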