Commit d015b8d (1 parent: 4e86f26). Showing 1 changed file with 101 additions and 10 deletions.
@@ -5,22 +5,113 @@

Django Feeds is an aggregator for RSS and Atom feeds.

## Features
The app contains a loader and models for loading and processing the entries
from feeds. A Source represents any web site that syndicates its content.
A Source can have one or more Feeds. For example, press sites often have a
single web site with feeds for different categories. Feeds are loaded
automatically on a defined schedule using Celery (you could also use cron).
You can also load feeds manually in the Django Admin. The default schedule
is to load Feeds every hour on the hour. However, you can also set the
schedule for individual Feeds. When a Feed is first loaded, an Article is
created for each entry. On subsequent loads the Articles are updated BUT
authors and tags are NOT updated. That allows you, for example, to set the
correct Author of a guest post or add new Tags without losing the changes
the next time the Feed is loaded.
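
The update rule described above can be sketched in plain Python. This is only an illustration of the behaviour, not the app's loader: the `Article` dataclass, the `load_entry` function and the dictionary-shaped entries are hypothetical stand-ins for the real models and feed entries.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    """Hypothetical stand-in for the app's Article model."""
    url: str
    title: str
    authors: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)


def load_entry(articles: dict[str, Article], entry: dict) -> Article:
    """Create an Article for a new entry, or refresh an existing one."""
    article = articles.get(entry["url"])
    if article is None:
        # First load: every field comes from the feed entry.
        article = Article(
            url=entry["url"],
            title=entry["title"],
            authors=list(entry.get("authors", [])),
            tags=list(entry.get("tags", [])),
        )
        articles[article.url] = article
    else:
        # Subsequent loads: refresh the title but leave authors and tags
        # alone, so corrections made in the admin are not overwritten.
        article.title = entry["title"]
    return article


articles: dict[str, Article] = {}
entry = {
    "url": "https://example.com/post",
    "title": "A post",
    "authors": ["admin"],
    "tags": ["news"],
}
first = load_entry(articles, entry)
first.authors = ["Jane Doe"]  # manual correction of a guest post's author

entry["title"] = "A post (updated)"
second = load_entry(articles, entry)
```

After the second load the title has changed, while the manually corrected author and the original tags are still in place.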
|
||
When a Feed is loaded, you can perform a limited amount of preprocessing
on each entry: the title can be normalized to remove surrounding quotes,
trailing spaces, ending periods, etc. That generally makes things look
better and more consistent. Titles that are almost mini articles (sometimes
several hundred words long) can be truncated to a given length. For
authors, the Alias table allows you to map usernames to real names. Each
Feed has a set of Categories which are added to the Article. That way you
can automatically identify Articles that are from YouTube. The Categories
use django-tagulous, which supports hierarchical tags. For example, you can
use this to add different types of metadata, e.g. "media/video" or
"label/repost", etc. The Feed model also has an auto_publish flag so that all
Articles added have their publish set to True. That way you can decide
which Articles appear on your site automatically and which might need to
be reviewed before being published. Similarly, the load_tags flag on Feed
controls whether the tags from the feed are automatically added to the
Article. That is useful as some authors diligently tag their posts while
others simply leave the default 'Uncategorized' tag in place. Incidentally,
there is a lookup table in the settings, FEEDS_FILTER_TAGS, that you can
configure to remove or rename tags.
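
As a rough sketch of how these Feed options might combine when an Article is created, assuming a simplified `Feed` dataclass and an in-memory `ALIASES` dictionary in place of the real models (both are hypothetical, as is the `article_fields` helper):

```python
from dataclasses import dataclass, field


@dataclass
class Feed:
    """Hypothetical stand-in for the Feed model, limited to the flags above."""
    categories: list[str] = field(default_factory=list)
    auto_publish: bool = False
    load_tags: bool = True


# Hypothetical contents of the Alias table: feed username -> real name.
ALIASES = {"jdoe": "Jane Doe"}


def article_fields(feed: Feed, entry: dict) -> dict:
    """Work out the values a new Article would get from a feed entry."""
    return {
        "title": entry["title"],
        # Map usernames to real names; unknown names pass through unchanged.
        "authors": [ALIASES.get(name, name) for name in entry.get("authors", [])],
        # Categories always come from the Feed.
        "categories": list(feed.categories),
        # Tags from the entry are only kept when load_tags is set.
        "tags": list(entry.get("tags", [])) if feed.load_tags else [],
        "publish": feed.auto_publish,
    }


feed = Feed(categories=["media/video"], auto_publish=True, load_tags=False)
fields = article_fields(
    feed,
    {"title": "New video", "authors": ["jdoe"], "tags": ["Uncategorized"]},
)
```

Here the Article would be published immediately, carry the "media/video" category and the aliased author, and ignore the entry's own tags.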
|
||
The app tries to be flexible. Django-tagulous can be used to add any kind
of metadata about a Feed or an Article. The Source and Article models
also have a JSON field which can be used to store pretty much anything.
You could, for example, run an additional task to scrape OpenGraph data
from the source to add support for thumbnails.
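
A minimal sketch of the OpenGraph idea, using only the standard library's `html.parser`. The `OpenGraphParser` class and `extract_opengraph` function are hypothetical names, fetching the page is left out, and how the result is written to the JSON field is up to you.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> tags from a page."""

    def __init__(self) -> None:
        super().__init__()
        self.properties: dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("property") or ""
        content = attributes.get("content")
        if name.startswith("og:") and content is not None:
            self.properties[name] = content


def extract_opengraph(html: str) -> dict[str, str]:
    """Return the OpenGraph properties found in an HTML document."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


page = """
<html><head>
<meta property="og:title" content="Example post">
<meta property="og:image" content="https://example.com/thumb.png">
</head><body></body></html>
"""
data = extract_opengraph(page)
thumbnail = data.get("og:image")
```

The resulting dictionary is plain JSON-compatible data, so it could be stored as-is alongside an Article.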
|
||
## Quick Start

Download and install the app:

```pip install django-rss-feeds```

Add the app to Django:

```python
INSTALLED_APPS = [
    ...,
    "tagulous",
    "feeds.apps.Config",
]
```

Run the migrations:

```python manage.py migrate```

Now log into the Django Admin. In the Feeds section, add a Source and a Feed for that Source.
Try https://news.ycombinator.com/rss, for example. Then, in the Feed changelist, select the Feed you just
added and run the 'Load selected feeds' action. Voila, you now have a set of Articles created
from the feed.
|
||
## Demo

If you clone or download the [django-feeds](https://github.com/StuartMacKay/django-feeds)
repository there is a demonstration Django application, with Celery, that lets
you see how it all works. The demo site aggregates the feeds and publishes the
Articles, grouped by date, with each page showing the Articles for the past 7
days. Links on each entry allow you to navigate to ListViews for each Source,
Author or Tag.

```shell
git clone git@github.com:StuartMacKay/django-feeds.git
cd django-feeds
docker-compose up
```

Next, run a shell on the web service so you can create an admin account, log in,
and add a Source and a Feed.

```shell
docker-compose exec web bash
./manage.py createsuperuser
```
|
||
## Settings

`FEEDS_TASK_SCHEDULE`, default "0 * * * *". A crontab string that sets when
a Celery task runs to check whether any Feeds are scheduled to load.

`FEEDS_LOAD_SCHEDULE`, default "0 * * * *". A crontab string that sets when
Feeds are scheduled to be loaded. This can be overridden on individual Feeds.
|
||
`FEEDS_USER_AGENT`, the User-Agent string that identifies who is requesting the
feed. Some sites won't work without this set. In any case it's always good
manners to identify yourself.
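
For illustration, a `settings.py` fragment using the three settings above might look like the following. The values are examples only: the 15-minute check interval assumes standard crontab step syntax is accepted, and the User-Agent string is a made-up placeholder.

```python
# settings.py (sketch; the values are illustrative, not recommendations)

# Check every 15 minutes whether any Feeds are due to be loaded.
# Assumes standard five-field crontab syntax, including "*/15" steps.
FEEDS_TASK_SCHEDULE = "*/15 * * * *"

# By default, load each Feed at the top of every hour.
FEEDS_LOAD_SCHEDULE = "0 * * * *"

# Hypothetical User-Agent; replace with your own site name and contact URL.
FEEDS_USER_AGENT = "ExampleAggregator/1.0 (+https://example.com/about)"
```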
|
||
`FEEDS_NORMALIZE_TITLES`, default True. Tidy up titles to remove surrounding quotes,
remove trailing periods, etc. That way titles from different Feeds have the same style.

`FEEDS_TRUNCATE_TITLES`, default None. Limit the length of titles. Some titles are
mini-posts all by themselves so you can use this to truncate them to a given number
of characters.
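
To make the effect of these two settings concrete, here is a minimal sketch of title normalization and truncation. The functions `normalize_title` and `truncate_title` are hypothetical and only approximate the behaviour described; in particular, appending an ellipsis when truncating is an assumption rather than something the app is known to do.

```python
from typing import Optional


def normalize_title(title: str) -> str:
    """Strip whitespace, one pair of surrounding quotes and a trailing period."""
    title = title.strip()
    # Remove one pair of matching surrounding quotes (straight or curly).
    for opening, closing in (('"', '"'), ("'", "'"), ("\u201c", "\u201d")):
        if len(title) >= 2 and title.startswith(opening) and title.endswith(closing):
            title = title[1:-1].strip()
            break
    # Drop a single trailing period but leave an ellipsis ("...") alone.
    if title.endswith(".") and not title.endswith(".."):
        title = title[:-1].rstrip()
    return title


def truncate_title(title: str, limit: Optional[int]) -> str:
    """Cut a title to at most `limit` characters (a positive number).

    A limit of None means the title is returned unchanged.
    """
    if limit is None or len(title) <= limit:
        return title
    # Reserve one character for the ellipsis and avoid a dangling space.
    return title[: limit - 1].rstrip() + "\u2026"


clean = normalize_title('  "Hello world."  ')  # "Hello world"
short = truncate_title("A very long title indeed", 10)  # "A very lo…"
```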
|
||
`FEEDS_FILTER_TAGS`, default {"uncategorized": None}. Use this to rename or delete
tags from the Feed. The default allows you to remove the default "Uncategorized" tag
that often appears in WordPress feeds. There is a `load_tags` flag on Feed that controls
whether tags are added to Articles. That allows you to selectively load the tags from
conscientious blogs and skip the lazy ones.
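
A lookup table of this shape could be applied roughly as follows. This is a sketch, not the app's code: the `filter_tags` function is hypothetical, the case-insensitive lookup and de-duplication are assumptions, and the `"py"` entry is an invented example of a rename.

```python
def filter_tags(tags, mapping):
    """Rename or drop tags using a FEEDS_FILTER_TAGS-style mapping.

    Keys are matched against the lower-cased tag (an assumption). A value
    of None drops the tag; any other value replaces it.
    """
    result = []
    for tag in tags:
        key = tag.lower()
        if key in mapping:
            replacement = mapping[key]
            if replacement is None:
                continue  # the tag is filtered out entirely
            tag = replacement
        if tag not in result:  # avoid duplicates created by renaming
            result.append(tag)
    return result


# The default entry plus one hypothetical rename.
FEEDS_FILTER_TAGS = {"uncategorized": None, "py": "Python"}

tags = filter_tags(["Uncategorized", "py", "Django", "Python"], FEEDS_FILTER_TAGS)
```

With this mapping the "Uncategorized" tag disappears, "py" becomes "Python", and the duplicate "Python" from the feed is not added twice.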