Chrome not reachable after the first execution in lambda #131
-
First off, thank you for the great article on how to run Selenium in AWS Lambda. I followed the instructions outlined here and was able to get the Lambda to run once.
If I try again after 20 minutes or so I get another successful run, which is then followed by unsuccessful runs. Here is my code -
I've tried increasing the memory size to 3008 MB, and that had no impact either. What's also weird is that it fails at different points on different runs. I suspect Chrome is crashing, but I have no clue why or how to get around this. Any suggestions would be greatly appreciated! Full error message -
Replies: 2 comments
-
Finally figured out what was going on. Apparently consecutive Lambda executions can share a single volume mounted at /tmp, with a default size of 512 MB, and it was getting filled up with the data produced by the crawler. Increasing the size to 3 GB fixed it for me. I also implemented a workaround to clean up the data after each execution run. In case this helps anyone, here is the code for the driver initialization -
And the cleanup code in the exit function -
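(The cleanup code itself didn't paste above, so here is a minimal sketch of what an exit-time /tmp cleanup can look like. The `clean_tmp` name and its `tmp_dir` parameter are illustrative, not the actual code from this thread.)

```python
import os
import shutil


def clean_tmp(tmp_dir="/tmp"):
    """Remove everything the crawler left behind in the shared /tmp volume.

    Lambda may reuse the same container (and its /tmp contents) across
    invocations, so leftover browser profiles and downloads accumulate
    until the volume fills up.
    """
    for entry in os.listdir(tmp_dir):
        path = os.path.join(tmp_dir, entry)
        try:
            if os.path.isdir(path) and not os.path.islink(path):
                shutil.rmtree(path)
            else:
                os.remove(path)
        except OSError:
            # Some entries (e.g. files still held open by Chrome)
            # may resist deletion; skip them rather than crash.
            pass
```

Calling something like this at the end of the handler (or in an exit hook) keeps warm containers from inheriting a full /tmp.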
-
Glad you were able to find a solution. You are correct: sometimes Lambda reuses the container it created for an invocation, so it's good practice to clean up /tmp between runs. **You can confirm this with the experiment below.**

**Testing multiple consecutive invocations**

I ran a simple Python program that writes the current timestamp to a file in /tmp:

```python
import json
import os
import time


def lambda_handler(event, context):
    time_stamp = time.strftime("%H%M%S")
    with open(f"/tmp/{time_stamp}.txt", "w") as f:
        f.write(f"time_stamp = {time_stamp}")
    with open("/tmp/timestamps.txt", "a") as f:
        f.write(time_stamp)
        f.write("\n")
    print("Contents of tmp folder")
    os.system("ls -la /tmp")
    print("Contents of file")
    os.system("cat /tmp/timestamps.txt")
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }
```

On warm starts, the timestamp files written by earlier invocations are still listed in /tmp.
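For the fix mentioned earlier in the thread, the /tmp volume size can be raised above the 512 MB default with the function's ephemeral-storage setting (the size is given in MB, up to 10,240). A hedged example using the AWS CLI; the function name is a placeholder:

```shell
# Raise the function's /tmp volume from the 512 MB default to 3 GB.
# "my-crawler-function" is a placeholder; substitute your function's name.
aws lambda update-function-configuration \
  --function-name my-crawler-function \
  --ephemeral-storage '{"Size": 3072}'
```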