Update introductory documentation
StuartMacKay committed Sep 29, 2023
1 parent 4e86f26 commit d015b8d
Showing 1 changed file with 101 additions and 10 deletions.
111 changes: 101 additions & 10 deletions README.md
@@ -5,22 +5,113 @@

Django Feeds is an aggregator for RSS and Atom feeds.

## Features
The app contains a loader and models for loading and processing the entries
from feeds. A Source represents any web site that syndicates its content.
A Source can have one or more Feeds. For example, press sites often have a
single web site with feeds for different categories. Feeds are loaded
automatically on a defined schedule using Celery (you could also use cron).
You can also load feeds manually in the Django Admin. The default schedule
is to load Feeds every hour on the hour. However, you can also set the
schedule for individual Feeds. When a Feed is first loaded, an Article is
created for each entry. On subsequent loads the Articles are updated BUT
authors and tags are NOT updated. That allows you, for example, to set the
correct Author of a guest post or add new Tags without losing the changes
the next time the Feed is loaded.
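
The update rule described above can be sketched in plain Python. This is only an illustration, not the app's loader: `update_article` and the dictionary field names are hypothetical stand-ins for the real models.

```python
# Hypothetical sketch: fields that are never overwritten on a reload,
# so manual edits to them survive.
PROTECTED_FIELDS = {"authors", "tags"}


def update_article(article: dict, entry: dict) -> dict:
    """Return a copy of `article` refreshed from `entry`, skipping protected fields."""
    updated = dict(article)
    for field, value in entry.items():
        if field not in PROTECTED_FIELDS:
            updated[field] = value
    return updated


stored = {"title": "Old title", "authors": ["Guest Author"], "tags": ["django"]}
reloaded = {"title": "New title", "authors": ["admin"], "tags": ["Uncategorized"]}
article = update_article(stored, reloaded)
```

Here the title is refreshed from the feed while the hand-corrected author and tags are left alone.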

When a Feed is loaded, you can perform a limited amount of preprocessing
on each entry: the title can be normalized to remove surrounding quotes,
trailing spaces, ending periods, etc. That generally makes things look
better and more consistent. Titles that are almost mini articles (sometimes
several hundred words long) can be truncated to a given length. For
authors, the Alias table allows you to map usernames to real names. Each
Feed has a set of Categories which are added to the Article. That way you
can automatically identify Articles that are from YouTube. The Categories
use django-tagulous, which supports hierarchical tags. For example, you can
use this to add different types of meta-data, e.g. "media/video" or
"label/repost", etc. The Feed model also has an auto_publish flag; when it
is set, all Articles added have their publish flag set to True. That way
you can decide which Articles appear on your site automatically and which
might need to be reviewed before being published. Similarly, the load_tags
flag on Feed controls whether the tags from the feed are automatically
added to the Article. That is useful as some authors diligently tag their
posts while others simply leave the default 'Uncategorized' tag in place.
Incidentally, there is a lookup table in the settings, FEEDS_FILTER_TAGS,
that you can configure to remove or rename tags.
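
As a rough illustration of the preprocessing described above, here is a minimal sketch in plain Python. The helpers `normalize_title` and `filter_tags` are hypothetical names, not the app's API, and the assumption that the lookup is keyed on lower-cased tag names (with `None` meaning "remove") is inferred from the default value of `FEEDS_FILTER_TAGS`.

```python
from typing import Optional

# Hypothetical helpers for illustration; not the app's actual implementation.


def normalize_title(title: str, max_length: Optional[int] = None) -> str:
    """Strip surrounding quotes, whitespace and ending periods, then truncate."""
    quotes = "\"'\u201c\u201d\u2018\u2019"
    cleaned = title.strip()
    # Remove quotes only when they wrap the whole title.
    if len(cleaned) >= 2 and cleaned[0] in quotes and cleaned[-1] in quotes:
        cleaned = cleaned[1:-1].strip()
    cleaned = cleaned.rstrip(".").rstrip()
    if max_length is not None and len(cleaned) > max_length:
        cleaned = cleaned[:max_length].rstrip()
    return cleaned


def filter_tags(tags: list, lookup: dict) -> list:
    """Rename or drop tags using a lookup keyed on the lower-cased tag name."""
    result = []
    for tag in tags:
        key = tag.lower()
        if key in lookup:
            replacement = lookup[key]
            if replacement is None:
                continue  # assumed meaning: None removes the tag
            tag = replacement  # assumed meaning: a string renames the tag
        result.append(tag)
    return result
```

For example, `normalize_title('  "Hello world."  ')` yields `Hello world`, and `filter_tags(["Uncategorized", "Django"], {"uncategorized": None})` keeps only `Django`.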

The app tries to be flexible. Django-tagulous can be used to add any kind
of meta-data about a Feed or an Article. The Source and Article models
also have a JSON field which can be used to store pretty much anything.
You could, for example, run an additional task to scrape OpenGraph data
from the source to add support for thumbnails.
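
One way such a task might collect OpenGraph data is sketched below, using only the standard library. The `OpenGraphParser` class and `extract_opengraph` function are hypothetical; fetching the page and saving the result to the JSON field are left out, since neither is specified here.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..."> values from an HTML document."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("property") or ""
        content = attributes.get("content")
        if name.startswith("og:") and content is not None:
            # Keep the first value seen for each property.
            self.properties.setdefault(name, content)


def extract_opengraph(html: str) -> dict:
    """Return a dict of OpenGraph properties found in `html`."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


page = (
    '<html><head>'
    '<meta property="og:image" content="https://example.com/thumb.png">'
    '<meta name="description" content="ignored">'
    '</head></html>'
)
opengraph = extract_opengraph(page)
```

The resulting dictionary is JSON-serializable, so it could be stored as-is and the `og:image` value used as a thumbnail.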

## Quick Start

Download and install the app:

```pip install django-rss-feeds```

Add the app to Django:

```python
INSTALLED_APPS = [
...,
"tagulous",
"feeds.apps.Config",
]
```

Run the migrations:

```python manage.py migrate```

Now log into the Django Admin. In the Feeds section, add a Source and a Feed for that Source.
Try https://news.ycombinator.com/rss. Then, in the Feed changelist, select the Feed you just
added and run the 'Load selected feeds' action. Voila, you now have a set of Articles created
from the feed.

## Demo

If you clone or download the [django-feeds](https://github.com/StuartMacKay/django-feeds)
repository there is a demonstration Django application, with Celery, that lets
you see how it all works. The demo site aggregates the feeds and publishes the
Articles, grouped by date, with each page showing the Articles for the past 7
days. Links on each entry allow you to navigate to ListViews for each Source,
Author or Tag.

```shell
git clone git@github.com:StuartMacKay/django-feeds.git
cd django-feeds
docker-compose up
```

Next run a shell on the web service, so you can create an admin account, log in
and add a Source and a Feed.

```shell
docker-compose exec web bash
./manage.py createsuperuser
```

## Settings

`FEEDS_TASK_SCHEDULE`, default "0 * * * *". A crontab string that sets when
a Celery task runs to check whether any Feeds are scheduled to load.

`FEEDS_LOAD_SCHEDULE`, default "0 * * * *". A crontab string that sets when
Feeds are scheduled to be loaded. This can be overridden on individual Feeds.

`FEEDS_USER_AGENT`, the User-Agent string that identifies who is requesting the
feed. Some sites won't work without this set. In any case it's always good
manners to identify yourself.

`FEEDS_NORMALIZE_TITLES`, default True. Tidy up titles to remove surrounding quotes,
remove trailing periods, etc. That way titles from different Feeds have the same style.

`FEEDS_TRUNCATE_TITLES`, default None. Limit the length of titles. Some titles are
mini-posts all by themselves so you can use this to truncate them to a given number
of characters.

`FEEDS_FILTER_TAGS`, default {"uncategorized": None}. Use this to rename or delete
tags from the Feed. The default allows you to remove the default "Uncategorized" tag
that often appears in WordPress feeds. There is a `load_tags` flag on Feed that controls
whether tags are added to Articles. That allows you to selectively load the tags from
conscientious blogs and skip the lazy ones.
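
Put together, a `settings.py` fragment using these options might look like the sketch below. The setting names and defaults come from the list above; the User-Agent string, the 200-character limit and the "misc" rename are made-up example values, and the assumption that a string value renames a tag follows from the "rename or delete" description rather than from the code.

```python
# Example values only; adjust for your own site.

# Check every hour, on the hour, whether any Feeds are due (the default).
FEEDS_TASK_SCHEDULE = "0 * * * *"

# Default schedule for loading Feeds; can be overridden per Feed.
FEEDS_LOAD_SCHEDULE = "0 * * * *"

# Hypothetical User-Agent identifying your aggregator.
FEEDS_USER_AGENT = "ExampleAggregator/1.0 (+https://example.com/about)"

# Tidy up titles so every Feed has the same style (the default).
FEEDS_NORMALIZE_TITLES = True

# Truncate titles to 200 characters (the default, None, disables truncation).
FEEDS_TRUNCATE_TITLES = 200

# None removes a tag; a string is assumed to rename it.
FEEDS_FILTER_TAGS = {
    "uncategorized": None,
    "misc": "miscellaneous",
}
```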
