Commit 40675e9

Strip perex from Aktuality perex text
1 parent b397b98 commit 40675e9

2 files changed: +3 -1 lines changed


modules/rss.py

+2 -1
@@ -3,6 +3,7 @@
 from abc import ABC, abstractmethod
 from pprint import pprint
 from typing import Dict, List
+from bs4 import BeautifulSoup

 import feedparser

@@ -70,7 +71,7 @@ def parse(self) -> List[RssArticle]:
             title = entry.get('title', '')
             url = entry.get('link', '')
             keywords = entry.get('keywords', [])
-            perex = entry.get('description', '')
+            perex = BeautifulSoup(entry.get('description', '')).text.strip()
             res.append(RssArticle(title, url, keywords, perex, '', self.source))
         return res
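The changed line builds a `BeautifulSoup` object without naming a parser. In that case bs4 picks the best parser installed in the environment and emits a "No parser was explicitly specified" warning, so the extracted text can differ between machines. A minimal sketch of the same stripping step with the parser pinned to the standard library's `html.parser` follows; the helper name `strip_html` and the sample string are hypothetical and not part of this commit.

```python
from bs4 import BeautifulSoup


def strip_html(description: str) -> str:
    """Return the visible text of an HTML fragment without surrounding whitespace.

    Hypothetical helper for illustration; the commit inlines this logic in parse().
    """
    # Passing 'html.parser' explicitly keeps the result independent of
    # whichever optional parsers (lxml, html5lib) happen to be installed.
    soup = BeautifulSoup(description, 'html.parser')
    return soup.get_text().strip()


# Made-up feed description, used only to show the effect.
sample = '<p>Lead paragraph with <b>markup</b>.</p>\n'
print(strip_html(sample))  # prints: Lead paragraph with markup.
```

Inside `parse()`, the equivalent call would be `strip_html(entry.get('description', ''))`, which also returns an empty string when the feed entry has no description.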

requirements.txt

+1
@@ -1,3 +1,4 @@
+beautifulsoup4==4.6.0
 certifi==2017.4.17
 chardet==3.0.4
 feedparser==5.2.1
