Cloudflare block when fetching for stream url with correct user agent #1041
Thank you for looking into this. I've been trying to get the RTSP stream back after the old user agent I was using was deprecated. I'll keep playing around with this and report back. The way I was looking at getting this to work was by adding an
I found the new user agent by inspecting the traffic from my app. I'm interested in your idea. If you can share more details or the obstacles you've hit, I can help test. I also tried recreating the scraper with the cookies and user agent like this (your code):
Furthermore, I find that using proxies is necessary for me, so every RTSP stream call I'm making goes through randomly rotating proxies.
Using the MPEG-DASH stream is easy enough. Here is some code I was using to test:
import os
from urllib.parse import parse_qs, urlparse

stream_url = camera.start_stream("mac")
print("stream-url={}".format(stream_url))
url = urlparse(stream_url)
egress_token = parse_qs(url.query)["egressToken"][0]
print('starting ffmpeg')
os.system(f"ffmpeg -v debug "
f"-headers 'Egress-Token: {egress_token}\r\n"
"Origin: https://my.arlo.com\r\n"
"Referer: https://my.arlo.com/\r\n"
"User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 14_7_3) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/18.3 Safari/605.1.15\r\n' "
f"-i '{stream_url}' "
"-c copy out.mp4")
I just need a mechanism to pass those extra headers into the stream component of Home Assistant to make it work. There is a mechanism to pass in options; I think I just need to expand that. I'm going to push a change so other people can test the user agent you found. We might be able to find a pattern in when it stops working.
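The same ffmpeg test can be written without shell quoting by building an argument list. This is only a sketch: `build_ffmpeg_command` and `record` are hypothetical helper names, and it assumes the stream URL carries the token in an `egressToken` query parameter, as in the test code above.

```python
import subprocess
from urllib.parse import parse_qs, urlparse

# Same Safari user agent string as in the test code above.
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_7_3) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/18.3 Safari/605.1.15"
)


def build_ffmpeg_command(stream_url, output="out.mp4"):
    """Build an ffmpeg argv list that sends the extra headers (hypothetical helper)."""
    # The egress token travels as a query parameter on the stream URL.
    egress_token = parse_qs(urlparse(stream_url).query)["egressToken"][0]
    # ffmpeg's -headers option expects CRLF-terminated header lines.
    headers = "".join(
        f"{name}: {value}\r\n"
        for name, value in (
            ("Egress-Token", egress_token),
            ("Origin", "https://my.arlo.com"),
            ("Referer", "https://my.arlo.com/"),
            ("User-Agent", USER_AGENT),
        )
    )
    return [
        "ffmpeg", "-v", "debug",
        "-headers", headers,
        "-i", stream_url,
        "-c", "copy", output,
    ]


def record(stream_url):
    """Run ffmpeg on the stream; no shell is involved, so no quoting issues."""
    subprocess.run(build_ffmpeg_command(stream_url), check=True)
```

Passing a list to `subprocess.run` avoids wrapping the URL and headers in shell quotes, which matters if the URL ever contains a quote character.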
Regarding the MPEG-DASH stream, I'm always getting:
That's why I have to fall back to RTSP. Any quick thoughts? I'd really appreciate it if we can get this HTTPS stream to work, because then we'll have two options.
This is what I showed you. We need to pass the headers from my previous message to the stream component. I'm trying to work out how to do it. ffmpeg needs to pass an
These diffs allow me to get MPEG-DASH streaming working. This diff applies to Home Assistant core:
diff --git a/homeassistant/components/stream/__init__.py b/homeassistant/components/stream/__init__.py
index 8fa4c69ac5a..51758f0ede8 100644
--- a/homeassistant/components/stream/__init__.py
+++ b/homeassistant/components/stream/__init__.py
@@ -44,6 +44,7 @@ from .const import (
ATTR_SETTINGS,
ATTR_STREAMS,
CONF_EXTRA_PART_WAIT_TIME,
+ CONF_HTTP_HEADERS,
CONF_LL_HLS,
CONF_PART_DURATION,
CONF_RTSP_TRANSPORT,
@@ -166,6 +167,8 @@ def _convert_stream_options(
pyav_options["rtsp_transport"] = rtsp_transport
if stream_options.get(CONF_USE_WALLCLOCK_AS_TIMESTAMPS):
pyav_options["use_wallclock_as_timestamps"] = "1"
+ if headers := stream_options.get(CONF_HTTP_HEADERS):
+ pyav_options[CONF_HTTP_HEADERS] = headers
# For RTSP streams, prefer TCP
if isinstance(stream_source, str) and stream_source[:7] == "rtsp://":
@@ -624,5 +627,6 @@ STREAM_OPTIONS_SCHEMA: Final = vol.Schema(
vol.Optional(CONF_RTSP_TRANSPORT): vol.In(RTSP_TRANSPORTS),
vol.Optional(CONF_USE_WALLCLOCK_AS_TIMESTAMPS): bool,
vol.Optional(CONF_EXTRA_PART_WAIT_TIME): cv.positive_float,
+ vol.Optional(CONF_HTTP_HEADERS): cv.string,
}
)
diff --git a/homeassistant/components/stream/const.py b/homeassistant/components/stream/const.py
index c81d2f6cb18..d6b96deef5c 100644
--- a/homeassistant/components/stream/const.py
+++ b/homeassistant/components/stream/const.py
@@ -60,6 +60,7 @@ RTSP_TRANSPORTS = {
}
CONF_USE_WALLCLOCK_AS_TIMESTAMPS = "use_wallclock_as_timestamps"
CONF_EXTRA_PART_WAIT_TIME = "extra_part_wait_time"
+CONF_HTTP_HEADERS = "headers"
class StreamClientError(IntEnum):
This is for the aarlo piece:
diff --git a/custom_components/aarlo/camera.py b/custom_components/aarlo/camera.py
index 9f4aa9e..0ef985d 100644
--- a/custom_components/aarlo/camera.py
+++ b/custom_components/aarlo/camera.py
@@ -13,6 +13,7 @@ import logging
import voluptuous as vol
from collections.abc import Callable
from haffmpeg.camera import CameraMjpeg
+from urllib.parse import urlparse, parse_qs
import homeassistant.helpers.config_validation as cv
from homeassistant.components import websocket_api
@@ -517,6 +518,23 @@ class ArloCam(Camera):
return attrs
+ def _stream_source(self, user_agent):
+ """Return the source of the stream.
+
+ This set stream_options if the stream is https so we can pass egress
+ token on.
+ """
+ self.stream_options = {}
+ stream_url = self._camera.get_stream(user_agent)
+ if stream_url is not None:
+ if stream_url.startswith("https"):
+ url = urlparse(stream_url)
+ egress_token = parse_qs(url.query)["egressToken"][0]
+ self.stream_options = {
+ "headers": f"Egress-Token: {egress_token}\r\n"
+ }
+ return stream_url
+
async def stream_source(self):
"""Return the source of the stream.
@@ -524,11 +542,11 @@ class ArloCam(Camera):
to the original Arlo one. This means we get a `rtsps` stream back which the stream
component can handle.
"""
- return await self.hass.async_add_executor_job(self._camera.get_stream, "arlo")
+ return await self.hass.async_add_executor_job(self._stream_source, "linux")
async def async_stream_source(self, user_agent=None):
return await self.hass.async_add_executor_job(
- self._camera.get_stream, user_agent
+ self._stream_source, user_agent
)
def camera_image(
Edit: removed the manifest changes.
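The token handling in the aarlo diff indexes `parse_qs(...)["egressToken"][0]` directly, which raises `KeyError` if an HTTPS URL ever arrives without that parameter. A minimal standalone sketch of the same logic with that case handled follows. `stream_options_for` is a hypothetical name, and the `"headers"` key is the `CONF_HTTP_HEADERS` value from the core diff.

```python
from urllib.parse import parse_qs, urlparse


def stream_options_for(stream_url):
    """Return stream options carrying the egress token, or {} when none apply."""
    # RTSP(S) URLs and failed lookups need no extra HTTP headers.
    if stream_url is None or not stream_url.startswith("https"):
        return {}
    # .get() avoids a KeyError when the query string has no egressToken.
    tokens = parse_qs(urlparse(stream_url).query).get("egressToken")
    if not tokens:
        return {}
    return {"headers": f"Egress-Token: {tokens[0]}\r\n"}
```

Keeping this as a pure function makes it easy to check outside Home Assistant before wiring it into `_stream_source`.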
Great, the MPEG-DASH streaming works. Thank you for the help. I also want to share what I found while investigating the Cloudflare issue. It doesn't seem to be related to the user agent. With any of linux, mac, or arlo, after requesting the stream URL 6-9 times for the same device, I start to get 403s. I tried making the request pattern more random (random wait times, random retries, etc.), but that doesn't help unless I also refresh the cloudscraper. Cloudflare is protecting the endpoint, so the block happens before we even get the stream URL.
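The refresh-on-403 workaround described in this thread can be sketched generically as below. Everything here is hypothetical: `Forbidden` stands in for whatever error signals the 403, `make_client` for whatever builds a fresh scraper or session, and `fetch` for the stream URL request. Going by the reports in this thread, this eases the problem for some accounts but is not a reliable fix.

```python
class Forbidden(Exception):
    """Hypothetical stand-in for the error raised on an HTTP 403."""


def fetch_with_refresh(make_client, fetch, max_refreshes=2):
    """Call fetch(client); on Forbidden, rebuild the client and try again.

    make_client: zero-argument callable returning a fresh client (assumption).
    fetch: callable taking the client and returning the stream URL (assumption).
    """
    client = make_client()
    for attempt in range(max_refreshes + 1):
        try:
            return fetch(client)
        except Forbidden:
            # Out of refreshes: let the caller see the failure.
            if attempt == max_refreshes:
                raise
            # Throw away the blocked client and start over with a new one.
            client = make_client()
```

Capping the number of refreshes keeps a persistently blocked account from looping forever.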
Background
A user agent issue was tracked in another thread, but none of the existing user agents returned an RTSP stream URL, so I reverse engineered the app traffic and grabbed the user agent that works on my phone.
The user agent looks like this:
(iPhone15,2 18_1_1) iOS Arlo 5.4.3
I verified that this user agent works and gives me the RTSP stream URL I want. So I started using it when fetching stream URLs whenever a motion event is triggered, which is a pretty normal thing to do.
However
It seems that, with the same user agent, after successfully fetching the stream URL a couple of times, I start to get
403 Unknown error occurred
I verified that the credentials are still working fine. When I restart the Pyaarlo object (meaning it reloads the session file and grabs a new scraper), most of the time it works again for a couple of tries and then runs into the same problem. I strongly suspect it's due to Cloudflare. So I tried refreshing the scraper upon failure, and that improves the situation. However, it doesn't seem to work for every account. Accounts with more devices seem more likely to fail.
Any ideas, suggestions, or experience with bypassing the Cloudflare issue, @twrecked? I'm running out of options for now. Much appreciated!
Please let me know if I should provide more information.