Skip to content

Commit

Permalink
Adding demo video_clip sampling nb
Browse files Browse the repository at this point in the history
  • Loading branch information
DiogenesAnalytics committed Feb 2, 2024
1 parent 2ba8b59 commit 132ac5a
Showing 1 changed file with 156 additions and 0 deletions.
156 changes: 156 additions & 0 deletions notebooks/video_clip_sampling.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,156 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "4732842d-7bea-46ed-a397-6f3204d1992b",
"metadata": {},
"source": [
"# Video Clip Sampling\n",
"This notebook seeks to explore the *video clip sampling* strategy avaiable in the `frame_sampling` package."
]
},
{
"cell_type": "markdown",
"id": "a39e90a7-4d3f-4966-866b-82806f96742b",
"metadata": {},
"source": [
"## Setup\n",
"First we need to get everything setup ..."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9a10719a-a86e-4abd-b8ee-c44b0c2a911f",
"metadata": {},
"outputs": [],
"source": [
"# check for colab\n",
"if \"google.colab\" in str(get_ipython()):\n",
" # install colab dependencies\n",
" !pip install tqdm git+https://github.com/DiogenesAnalytics/frame-sampling"
]
},
{
"cell_type": "markdown",
"id": "489d687e-6e4e-4cae-8158-1eafc5b45f13",
"metadata": {},
"source": [
"## Get Data\n",
"Need to set up a few things in order to have data for the demo. We will download *10 videos* from the [WebVid-10M dataset](https://maxbain.com/webvid-dataset/) *validation subset*."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3e7f616c-e4c7-478c-a485-e20756f9aa00",
"metadata": {},
"outputs": [],
"source": [
"## load necessary libs\n",
"import pathlib\n",
"from urllib.request import urlretrieve\n",
"import pandas as pd\n",
"from tqdm.auto import tqdm\n",
"\n",
"# get csv data from WebVid dataset: https://maxbain.com/webvid-dataset/\n",
"webvid_csv = pd.read_csv(\"http://www.robots.ox.ac.uk/~maxbain/webvid/results_2M_val.csv\")\n",
"\n",
"# videos for testing purposes\n",
"DWNLD_LNKS = webvid_csv[\"contentUrl\"].to_list()[:10]\n",
"\n",
"# set path to data\n",
"WEBVID_VAL_DATA = pathlib.Path(\"/usr/local/src/frame-sampling/tests/data/video/demo\")\n",
"\n",
"# create directory\n",
"WEBVID_VAL_DATA.mkdir(parents=True, exist_ok=True)\n",
"\n",
"# notify of downloading\n",
"print(\"Downloading videos ...\")\n",
"\n",
"# download test videos\n",
"for video_url in tqdm(DWNLD_LNKS):\n",
" # get video file name\n",
" vid_file_name = pathlib.Path(video_url).name\n",
" \n",
" # create new output path for video file\n",
" video_output_path = WEBVID_VAL_DATA / vid_file_name\n",
" \n",
" # download to demo_data path\n",
" if not video_output_path.exists():\n",
" _ = urlretrieve(video_url, video_output_path)"
]
},
{
"cell_type": "markdown",
"id": "e0648368-7091-4cc9-99cd-d33fe01398c6",
"metadata": {},
"source": [
"## Frame Sampling\n",
"Now we are finally ready to start using the `VideoClipSampler` class to *sample frames* from the WebVid-10M video dataset. Here we will implement a function to get *frames* that have an *entropy* above a certain threshold."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "52606ab8-f8e9-4c30-a870-abd2be620e67",
"metadata": {},
"outputs": [],
"source": [
"# get necessary libs\n",
"import secrets\n",
"from PIL.Image import Image\n",
"from frame_sampling.dataset import VideoDataset\n",
"from frame_sampling.strategy import VideoClipSampler\n",
"\n",
"# get instance object\n",
"data = VideoDataset(WEBVID_VAL_DATA)\n",
"\n",
"# set frame sample output directory\n",
"FRAME_SAMPLE_OUTPUT = pathlib.Path(f\"./frame_samples/{secrets.token_hex(8)}\")\n",
"\n",
"# set chosen frame sample interval\n",
"FRAME_SAMPLE_INTERVAL = 30\n",
"\n",
"# custom function\n",
"def calculate_entropy_threshold(image: Image) -> bool:\n",
" \"\"\"Check PIL image entropy is above threshold.\"\"\"\n",
" # convert the image to grayscale\n",
" image_gray = image.convert('L')\n",
"\n",
" # calculate entropy\n",
" entropy_value = image_gray.entropy()\n",
"\n",
" # check threshold\n",
" return entropy_value > 7.0\n",
"\n",
"# get instance\n",
"frame_sampler = VideoClipSampler(FRAME_SAMPLE_INTERVAL, calculate_entropy_threshold) \n",
"\n",
"# ... now sample\n",
"frame_sampler.sample(data, FRAME_SAMPLE_OUTPUT)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

0 comments on commit 132ac5a

Please sign in to comment.