Presentation: Presentation slides
During the workshop you will learn how to implement a web scraper using Scrapy, store its output in Azure Blob Storage, and use an Azure Function to generate a word cloud from the scraped text.
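At its core, generating a word cloud starts from word frequencies. A minimal stdlib sketch of that counting step (the workshop itself uses a rendering library for the actual image; the function and regex below are illustrative, not the workshop's code):

```python
import re
from collections import Counter

def word_frequencies(text: str, top_n: int = 10):
    """Lower-case the text, split it into words, and return the
    most common ones -- the kind of input a word-cloud renderer needs."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)
```

For example, `word_frequencies("Pasta pasta pizza", 2)` returns `[("pasta", 2), ("pizza", 1)]`.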
Python version: 3.8.5
You can check the Python libraries required to run this project here
- Source-code editor: Visual Studio Code
- To test our function: Microsoft Azure Storage Explorer
- Azure account: you can create one for free (with 12 months of free services) here
- The Azure Functions Core Tools version 3.x
- The Python extension for Visual Studio Code
- The Azure Functions extension for Visual Studio Code
- The Azurite extension for testing functions in Visual Studio Code
- Clone the repository
- Start Visual Studio Code and navigate to the solutions folder
To put our spider to work, go to the project's top-level directory and run:

```
scrapy crawl cuisines
```
Azure Blob storage trigger for Azure Functions

The `BlobTrigger` makes it incredibly easy to react to new blobs inside Azure Blob Storage. This sample demonstrates a simple use case of processing data from a given blob using Python.

For a `BlobTrigger` to work, you provide a path that dictates where the blobs are located inside your container; the path can also restrict the types of blobs the trigger fires on. For instance, setting the path to `samples/{name}.png` restricts the trigger to the `samples` path and to blobs whose names end in `.png`.
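The `{name}` token in the path acts as a capture: when the trigger fires, it is bound to the matching part of the blob's name. A rough stdlib illustration of that matching (this mimics, but is not, the Functions runtime's actual binding logic):

```python
import re

def bind(pattern: str, blob_path: str):
    """Turn an Azure-style binding pattern such as "samples/{name}.png"
    into a regex and return the captured parameters, or None on no match.
    Illustrative only -- the real matching is done by the Functions runtime."""
    # re.escape turns "{name}" into "\{name\}"; rewrite it as a named group
    regex = re.sub(r"\\\{(\w+)\\\}", r"(?P<\1>[^/]+)", re.escape(pattern))
    m = re.fullmatch(regex, blob_path)
    return m.groupdict() if m else None
```

For example, `bind("samples/{name}.png", "samples/logo.png")` returns `{"name": "logo"}`, while a blob outside `samples` or without the `.png` suffix returns `None`.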
Re-watch the YouTube stream here
This workshop was set up by @pyladiesams and @danielamiranda