Stagehand Todo Evaluator

A proof-of-concept evaluator that uses Stagehand to evaluate the todo example app. See the blog post for more information.

Usage

    export OPENAI_API_KEY=your-openai-api-key
    npm install
    npm run test
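The test run depends on `OPENAI_API_KEY` being exported, so a small preflight check can fail fast with a clear message when it is missing. The snippet below is a minimal sketch and not part of this repository; `check_api_key` is a hypothetical helper name chosen for illustration.

```shell
# Hypothetical helper (an assumption, not shipped with this project):
# returns non-zero and prints a hint when OPENAI_API_KEY is unset or empty.
check_api_key() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set; export it before running the tests." >&2
    return 1
  fi
  return 0
}
```

One way to use it is to chain it before the documented commands, for example `check_api_key && npm install && npm run test`, so nothing is installed or executed when the key is absent.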