Implement `extract_from_text` to get neutral citations for `pasuperct` #1251

grossir · 2024-11-21T00:53:41Z

Neutral citations are present in the document's text, but we are not collecting them. Once this is implemented and freelawproject/courtlistener#4520 is merged, we can collect those citations.

Example

Related to #858 (comment)
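For illustration only, a minimal sketch of what the extraction could look like. It is not the code from the linked commit. The citation pattern (`<year> PA Super <number>`), the 1000-character search window, the sample text, and the shape of the returned dictionary are all assumptions; the real scraper implements this as a method on its `Site` class and may differ in every one of these details.

```python
import re

# Assumed shape of a Pa. Superior Court neutral citation, e.g. "2024 PA Super 215".
# The real pattern used by the scraper may be stricter or looser than this.
NEUTRAL_CITE_RE = re.compile(
    r"(?P<volume>\d{4})\s+PA\s+Super\.?\s+(?P<page>\d+)",
    re.IGNORECASE,
)


def extract_from_text(scraped_text: str) -> dict:
    """Return citation metadata found near the top of an opinion's text.

    Returns an empty dict when no neutral citation is found, so callers
    can treat "nothing extracted" as a no-op.
    """
    # Assumption: the neutral citation is printed in the opinion header,
    # so only the first 1000 characters are searched.
    match = NEUTRAL_CITE_RE.search(scraped_text[:1000])
    if not match:
        return {}

    # Hypothetical return shape, keyed by the object the values belong to.
    return {
        "Citation": {
            "volume": match.group("volume"),
            "reporter": "PA Super",
            "page": match.group("page"),
        }
    }


# Usage with made-up opinion text:
sample = "J-A00000-24\n\n2024 PA Super 215\n\nCOMMONWEALTH OF PENNSYLVANIA v. DOE"
result = extract_from_text(sample)
```

Restricting the search to the header is a guess at a way to avoid picking up neutral citations of other cases cited in the body of the opinion.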

grossir · 2024-11-21T15:05:08Z

@flooie is this a neutral citation? I was checking reporters-db, and it's listed as a variation of a state citation:
https://github.com/freelawproject/reporters-db/blob/3eab4222612154224fb5e513741b5bf7b8cfa252/reporters_db/data/reporters.json#L21493

The Indigo Book lists it as a "public domain" citation.

Helps solve ##1251

grossir · 2025-01-13T18:26:30Z

This is working as far as parsing goes, but there is a validation bug in CourtListener that makes us unable to ingest the citations. We have PRs addressing the problem. After that, it's a matter of re-running `update_from_text`.

(Not so sure about the start date)

./manage.py update_from_text --courts juriscraper.opinions.united_states.state.pasuperct --cluster-status Published --date-filed-gte 2019-08-01 --date-filed-lte 2025-01-01 --verbosity 3

Output:
INFO Modified objects counts: {'Docket': 0, 'OpinionCluster': 0, 'Opinion': 0, 'Citation': 1567, 'No text to extract from': 68, 'No metadata extracted': 0, 'Error': 0}

grossir · 2025-01-16T22:39:39Z

We gained around 1638 citations from this run. There may be more that were missed due to a bug in the first backscrape (missing pagination), which I will collect in another issue.

grossir mentioned this issue Nov 21, 2024

Collect regional citations for pasuperct from API #1252

Closed

grossir mentioned this issue Nov 21, 2024

Add new neutral edition to Pa Super reporter freelawproject/reporters-db#186

Closed

grossir added a commit that referenced this issue Nov 21, 2024

feat(pasuperct): get neutral citations using extract_from_text

944ab2f

Helps solve ##1251

grossir mentioned this issue Nov 21, 2024

feat(pa): collect neutral citations and regional citations; also paginate results #1255

Merged

grossir added this to Case Law Sprint Nov 21, 2024

grossir moved this to In progress in Case Law Sprint Nov 21, 2024

grossir self-assigned this Jan 13, 2025

flooie moved this to General Backlog in Case Law Sprint Jan 14, 2025

grossir closed this as completed Jan 16, 2025

github-project-automation bot moved this from General Backlog to Done in Case Law Sprint Jan 16, 2025
