Releases: brunoamaral/gregory-ai
New API endpoints to fetch content per team
What's Changed
- Add API endpoints to filter content by teams by @brunoamaral in #396
- Remove deprecated ML fields from the articles model by @brunoamaral in #399
- Add new API endpoints per team by @brunoamaral in #401
Full Changelog: v19...v20
v19 - Gregory Teams
We are going through some changes and will be breaking some things soon.
- Teams can be configured in the backend
- Each team can set their own categories, subjects, and sources
- Different subjects (topics) can use different Machine Learning Models
Right now some deprecated fields are still present in the database: relevant, categories, and ml_prediction_*, for example.
Upgrading
Run manage.py makemigrations and manage.py migrate; then run manage.py migrate_categories.
What's Changed
- Send digest emails to the whole team in charge of that subject by @brunoamaral in #388
- removes metabase from docker compose and config file by @brunoamaral in #391
- Add support for multiple ml models per subject by @brunoamaral in #389
- Allow teams to configure their own categories by @brunoamaral in #393
- remove node-red from docker-compose by @brunoamaral in #394
Full Changelog: v18...v19
Multiple Research Subjects (topics)
- We reviewed the pipeline that fetches and processes articles and clinical trials. It now runs with Django base commands:
docker exec -it admin python manage.py
- We added the option to create a Team using django-organizations
- Each team can now have one or more research subjects and set sources to fetch articles for that subject.
Word of warning about the source field: the source field is no longer needed and was replaced with a ManyToMany field called sources. It is there for the time being, so that you can migrate your data using something like this:
from django.db import migrations


def copy_source_to_sources(apps, schema_editor):
    # Replace 'gregory' with the actual app name
    Article = apps.get_model('gregory', 'Articles')
    for article in Article.objects.all():
        if article.source:  # Check if the old source field is not None
            # Add the old source to the new ManyToMany sources field
            article.sources.add(article.source)


class Migration(migrations.Migration):
    dependencies = [
        ('gregory', '0077_articles_sources_alter_articles_source'),
    ]
    operations = [
        migrations.RunPython(copy_source_to_sources),
    ]
I suggest you create a team and subject and use the following to assign your articles to them:
def add_teams_and_subjects_to_articles():
    for article in Articles.objects.all():
        article.teams.add(Team.objects.first())
        article.subjects.add(Subject.objects.first())
    for trial in Trial.objects.all():
        trial.teams.add(Team.objects.first())
        trial.subjects.add(Subject.objects.first())
Right now, all the data is public and anyone can see what each team is researching. We want to make the API show a segregated view of the articles and clinical trials based on the user making the API request, with an option to set subjects as public or private. But I don't know how to do that.
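One possible starting point, offered as a sketch rather than a decision: treat a subject as visible when it is marked public or when the requesting user belongs to its team. The `Subject` dataclass, its `is_public` flag, and `visible_subjects` below are hypothetical stand-ins for illustration; they are not Gregory's models or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subject:
    """Hypothetical stand-in for a research subject; not Gregory's model."""
    name: str
    team: str
    is_public: bool = False  # assumed flag, not part of the current schema


def visible_subjects(subjects, user_teams):
    """Return the subjects a user may see: public ones plus their teams' own."""
    teams = set(user_teams)
    return [s for s in subjects if s.is_public or s.team in teams]


subjects = [
    Subject("Multiple Sclerosis", team="ms-team", is_public=True),
    Subject("Private study", team="ms-team"),
    Subject("Other topic", team="other-team"),
]

# An anonymous request (no teams) only sees public subjects.
print([s.name for s in visible_subjects(subjects, [])])
# A member of ms-team also sees that team's private subject.
print([s.name for s in visible_subjects(subjects, ["ms-team"])])
```

In Django REST Framework the same rule would usually live in a viewset's `get_queryset`, combining two `Q` objects (one for the public flag, one for the user's teams). The field names to filter on are an assumption until the models define them.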
Helping the Multiple Sclerosis Project
I have never asked for donations, but I am having a hard time keeping the MS project running. If you would like to help, you can donate to cover expenses through the Human Singularity Network, a non-profit we created to manage resources and partnerships.
https://donate.stripe.com/6oEeVmf1tdHIdOw7ss
Or if you can't, please share the project with everyone you feel might benefit from it.
Thank you!
What's Changed
- move db management command to the gregory app by @brunoamaral in #385
- New pipeline to process articles and clinical trials by @brunoamaral in #386
- Adding organisations to the user and article models by @brunoamaral in #387
Full Changelog: v17...v18
clean up release
I've lost the energy for witty texts.
We removed django-cron and updated the Python packages, adding a new pipeline command that fetches and processes articles and clinical trials.
We also made it possible to import data from the WHO clinical trial registry.
What's Changed
- Improve who import by @brunoamaral in #364
- minor addition to the admin page by @brunoamaral in #365
- Add article list to authors' API endpoint by @brunoamaral in #366
- Improve feedreader to update trial if it exists by @brunoamaral in #367
- avoid error if api access expired. WIP by @brunoamaral in #363
- Add last updated date to trials by @brunoamaral in #368
- remove field for real_time_notification by @brunoamaral in #369
- Removes Django Cron and move all jobs to Django BaseCommands and adds track changes with django simple-history by @brunoamaral in #371
- Improve the workflow to import Articles and Trials and Records trial change history by @brunoamaral in #373
- upgrade packages by @brunoamaral in #377
- Create a single django command to run the full pipeline by @brunoamaral in #378
- update pipeline by @brunoamaral in #379
- Add error handling to orcid update, change the timeframe to query orcid by @brunoamaral in #380
- add missing import by @brunoamaral in #381
- remove redundant setting of variables by @brunoamaral in #382
- Update Documentation for Database Fields by @brunoamaral in #383
- add db diagram by @brunoamaral in #384
Full Changelog: v16...v17
v16
I was going to say this was mostly a maintenance release, but there are 3 big changes.
- We have changed the license for Gregory. It is now under a Creative Commons Attribution 4.0 International license.
  This change was made because some people are working on amazing add-ons for Gregory and we want to do all we can to ensure their recognition.
- Added fields for saving Machine Learning predictions made by the Linear SVC Support Vector Classifier.
- Added a django command to import data from the World Health Organisation (WHO).
  The Clinical Trials table was changed to include the same fields used by the WHO, so remember to run ./manage.py makemigrations && ./manage.py migrate.
  Afterwards, you can go to https://trialsearch.who.int/AdvSearch.aspx, find the clinical trials you want, and import them into Gregory with ./manage.py importWHOXML FILE
What's Changed
- Update LICENSE to require attribution by @brunoamaral in #351
- 356 add to the trials table the same columns we have from the who ictrp by @brunoamaral in #357
- add identifier to value field to avoid ambiguity by @brunoamaral in #358
- 139 implement lsvc machine learning model by @brunoamaral in #361
- add lsvc prediction to api frontend by @brunoamaral in #362
Full Changelog: v15.5...v16
v15.5
Rest assured, this project is still very much alive. We have been doing a lot of backoffice work that we hope will give us more ways to develop Gregory.
For now, a quick maintenance release
What's Changed
- upgrade Django by @brunoamaral in #346
- add article count by @brunoamaral in #347
- Unique authors by @brunoamaral in #348
- Adds Country information to authors by @brunoamaral in #349
- deletes the twitter feed by @brunoamaral in #354
- Improve urls py by @brunoamaral in #355
Full Changelog: v15...v16
The Doc Brown Edition
Great Scott, it's been a while!
The team has been busy with other tasks and opening new doors. No fancy text this time; here are the changes since April.
- Improvements to the sources: the fetch method is now a list of options instead of a text field, and there is a new description field in case you need to keep notes.
- A few API changes and new endpoints to help build the charts that we now show on the observatory page for Multiple Sclerosis research.
- Check if a clinical trial is duplicate based on the ID number
- Keep a history of changes to a clinical trial where possible. This needs more testing, if you can help.
- We have included an authentication option using Django Rest Framework and djangorestframework-simplejwt
Remember to run manage.py makemigrations and manage.py migrate after pulling this new release.
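The duplicate check and change history for clinical trials could look roughly like the sketch below. It uses an in-memory dict in place of the database, and `save_trial`, the `identifier` key, and the history tuple format are assumptions made for illustration, not the code merged in #332.

```python
def save_trial(store, history, trial):
    """Insert a trial, or update the existing one with the same identifier.

    `store` maps identifier -> trial dict; `history` collects
    (identifier, field, old_value, new_value) tuples for every change.
    Returns "created", "updated" or "unchanged".
    """
    key = trial["identifier"]
    existing = store.get(key)
    if existing is None:
        store[key] = dict(trial)
        return "created"

    changed = False
    for field, new_value in trial.items():
        old_value = existing.get(field)
        if old_value != new_value:
            # Record the change before overwriting the stored value.
            history.append((key, field, old_value, new_value))
            existing[field] = new_value
            changed = True
    return "updated" if changed else "unchanged"


# The identifier below is a made-up placeholder, not a real trial.
store, history = {}, []
first = save_trial(store, history, {"identifier": "NCT00000000", "status": "Recruiting"})
second = save_trial(store, history, {"identifier": "NCT00000000", "status": "Completed"})
third = save_trial(store, history, {"identifier": "NCT00000000", "status": "Completed"})
print(first, second, third)
print(history)
```

Keying on the registry identifier rather than the title means a trial that is renamed upstream is updated in place instead of being saved twice.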
What's Changed
- For sources, change fetch method to a list of options by @brunoamaral in #327
- Add a Category slug by @brunoamaral in #330
- Adds category information to the API endpoints for Articles and Trials by @brunoamaral in #331
- Check if clinical trial is duplicate before saving, and update fields if needed. Also keep track of changes to clinical trials by @brunoamaral in #332
- Add get_monthly_article_counts and get_monthly_trial_counts by @brunoamaral in #333
- New api endpoint /categories/category_slug/monthly-counts/ by @brunoamaral in #334
- minor fix by @brunoamaral in #335
- include category name and slug in api response by @brunoamaral in #336
- add article count to categories by @brunoamaral in #338
- Authentication with jwt for the django rest framework by @brunoamaral in #341
- add missing files by @brunoamaral in #342
- 343 include an optional description field to the sources by @brunoamaral in #344
Full Changelog: v14...v15
The Virginia Lacy Jones Edition
Just because this one is short and sweet doesn't make it a small feat. We upgraded python under the hood, so rebuild your container when you're in an upgrading mood.
Size doesn't matter, quality does. We changed to a new summariser, much better than it was.
And for your favourite author, we now provide an RSS feed with all their articles. Find it at feed/articles/author/<author_id>/
What's Changed
- add timezone with dateutil.parse by @brunoamaral in #316
- upgrade python to v3.11 by @brunoamaral in #317
- avoids error if DOI is an empty string by @brunoamaral in #319
- Implement new summariser by @brunoamaral in #321
- Minor updates by @brunoamaral in #323
- cleaned up code by @brunoamaral in #322
- minor fixes in setup.py by @brunoamaral in #324
- adds an rss feed to every author, available at /feed/articles/author/<author_id>/ by @brunoamaral in #325
Full Changelog: v13.12...v14
The Bunny Edition
Not just bugs in this one, we've been working on trials. Now, we try to fetch the EudraCT and NCT numbers for clinical trials, working our way to make sure we don't have duplicate content on that table. In the future, we want to move forward with tracking the progress of every clinical trial until completion.
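As a rough illustration of what fetching these identifiers can involve: NCT numbers are the letters NCT followed by eight digits, and EudraCT numbers follow the pattern YYYY-NNNNNN-CC. The sketch below pulls both out of free text with regular expressions. `extract_identifiers` and the sample text are hypothetical and not the function Gregory uses.

```python
import re

NCT_RE = re.compile(r"\bNCT\d{8}\b")
EUDRACT_RE = re.compile(r"\b\d{4}-\d{6}-\d{2}\b")


def extract_identifiers(text):
    """Return the first NCT and EudraCT numbers found in `text`, or None."""
    nct = NCT_RE.search(text)
    eudract = EUDRACT_RE.search(text)
    return {
        "nct": nct.group(0) if nct else None,
        "eudract": eudract.group(0) if eudract else None,
    }


# Made-up identifiers, used only to exercise the patterns.
sample = "A Phase 3 study (NCT01234567, EudraCT 2020-001234-56) in adults"
print(extract_identifiers(sample))
```

With both numbers stored per trial, either one can serve as the key for spotting a duplicate entry that arrives from a different registry feed.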
What's Changed
- Add identifier numbers to table of clinical trials by @brunoamaral in #304
- add source object when saving clinical trials from rss feed by @brunoamaral in #306
- Add identifiers to public API by @brunoamaral in #307
- implements ClinicalTrial class and moves code to functions.py by @brunoamaral in #313
- create a clinicaltrial class for processing new information by @brunoamaral in #310
- 291 docker compose fails to build django container by @brunoamaral in #292
- add a new api endpoint to view relevant articles in the given week by @brunoamaral in #294
- some fixes for the newsletter api by @brunoamaral in #295
- fixes queries in the newsletter api by @brunoamaral in #296
- fixes the error in running makemigrations on a fresh install by @brunoamaral in #299
- upgrade packages by @brunoamaral in #303
- 311 move remove utm to functionspy by @brunoamaral in #312
Full Changelog: v13...v14
The António Lopes Edition
Hope you're ready, what we have is heavy. Let's hear it for @antoniolopes who shines from the shadows and gave Gregory an AI upgrade. Let's get to it before your patience starts to fade.
António has been helping Gregory since the early stage, with the relevancy algorithm, and advice worthy of a sage. This time he brought a new summariser for the abstracts that can process the database through Django's management commands.
./manage.py get_takeaways will populate the "takeaways" column with the key points within the abstract of each article.
In future releases we may use this to improve the newsletters and automatic tweets.
And his magic didn't stop here. There is a new API endpoint that allows you to add new articles via http POST requests.
There is also a new SciencePaper class to make sure we have all the required information when saving an article. This is also used to clean up the abstracts of any weird characters or HTML.
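For a sense of what cleaning an abstract can mean in practice, here is a minimal standalone sketch using only the standard library. It is not the `clean_abstract()` method of the SciencePaper class; the function below is an assumption about the general approach (strip tags, decode entities, collapse whitespace).

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")


def clean_abstract(raw):
    """Strip markup, decode HTML entities and collapse whitespace."""
    # Remove tags first so that an escaped "&lt;" is never mistaken for markup.
    text = TAG_RE.sub(" ", raw)
    text = html.unescape(text)
    # \s also matches the non-breaking space produced by "&nbsp;".
    return re.sub(r"\s+", " ", text).strip()


print(clean_abstract("<p>Background:&nbsp;MS is a <b>chronic</b> disease.</p>"))
```

A regex is enough for the light markup typically found in feed abstracts; heavily nested or malformed HTML would be better served by a real parser.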
To save on CPU, and be gentle with the crossref API, we now stop trying to fetch missing data after trying for 30 days.
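The 30-day cutoff can be expressed as a simple date comparison. This is a sketch under assumptions: `should_retry` and its `first_seen` argument are hypothetical names, and the actual check in Gregory may be implemented as a database filter instead.

```python
from datetime import datetime, timedelta, timezone

RETRY_WINDOW = timedelta(days=30)


def should_retry(first_seen, now=None):
    """True while the article is still inside the 30-day retry window."""
    now = now or datetime.now(timezone.utc)
    return now - first_seen <= RETRY_WINDOW


now = datetime(2023, 3, 1, tzinfo=timezone.utc)
print(should_retry(datetime(2023, 2, 20, tzinfo=timezone.utc), now))  # 9 days old
print(should_retry(datetime(2023, 1, 1, tzinfo=timezone.utc), now))   # 59 days old
```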
A special word of appreciation goes out to @codeZenon for taking the time to help us improve the documentation.
Development of new features and improvements has been 3x faster than documentation, and I don't expect it to improve. Our time is scarce. Which isn't the same as saying we don't care.
If you have any questions, please reach out by posting an issue or adding a thread in the discussion page.
Final note, remember to run ./manage.py migrate and pip install -r requirements.txt in the admin container when upgrading.
What's Changed
- quick fix by @brunoamaral in #258
- remove hardcoded information from crossref script by @brunoamaral in #259
- removes utm parameters from urls in feedreader by @brunoamaral in #261
- API returns authors as an object with first name, family name, and ORCID url by @brunoamaral in #264
- add information from crossref.org upon fetching articles from the rss feeds by @brunoamaral in #269
- Improves the way we fetch authors by using Django's ORM by @brunoamaral in #267
- Refactor pipeline by @brunoamaral in #270
- Apply method to avoid excessive queries to crossreforg by @brunoamaral in #273
- partial fix for naive datetime warning by @brunoamaral in #275
- fixed some grammar issues by @codeZenon in #276
- Auth API by @antoniolopes by @brunoamaral in #271
- Added two methods to SciencePaper class refresh() and clean_abstract() by @brunoamaral in #280
- Make DOI optional when adding content from the API by @brunoamaral in #281
- adds a new endpoint to list articles by journal name by @brunoamaral in #283
- Adds a script to calculate the summary of abstracts by @brunoamaral in #284
- fix variable name by @brunoamaral in #286
- Clean up debug prints, limit results to 100 rows, fix #285 by @brunoamaral in #287
- create shell command to process takeaways by @brunoamaral in #288
- include takeaways in article json output by @brunoamaral in #289
New Contributors
- @codeZenon made their first contribution in #276
Full Changelog: v12...v13