Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog 1.0.0.

v29.1.0 (2025-01-08)

Feat

  • FCL-568: add new class for Press Summary identifiers

Fix

  • deps: update dependency mypy-boto3-s3 to v1.35.92
  • deps: update dependency boto3 to v1.35.91
  • deps: update dependency boto3 to v1.35.88
  • deps: update dependency charset-normalizer to v3.4.1
  • deps: update dependency boto3 to v1.35.87
  • deps: update dependency boto3 to v1.35.85

v29.0.1 (2024-12-20)

Fix

  • Identifiers: preferred identifier now correctly handles the case where there are no identifiers of the requested type
  • Identifiers: fix case where unpacking unknown identifier type would raise an exception
  • deps: update dependency mypy-boto3-s3 to v1.35.81
  • deps: update dependency boto3 to v1.35.82

v29.0.0 (2024-12-18)

BREAKING CHANGE

  • Methods which were previously guaranteed to return a Neutral Citation may now return None.
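
Callers that assumed a citation was always present may need a fallback. The following is a minimal sketch of defensive handling, not code from this library: `display_citation` is a hypothetical helper, and the placeholder text is an arbitrary choice.

```python
from typing import Optional


def display_citation(neutral_citation: Optional[str]) -> str:
    """Return a printable citation, tolerating a missing value.

    Hypothetical helper for illustration only; it is not part of
    this library's API.
    """
    if neutral_citation is None:
        # The value may now legitimately be absent, so fall back
        # to a placeholder instead of failing later.
        return "(no neutral citation)"
    return neutral_citation
```

Type checkers such as mypy will flag call sites that still treat the return value as a plain `str`, which can help locate code affected by this change.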

Feat

  • FCL-533: getting scored or preferred identifiers can now be done by type
  • FCL-533: modify human identifier to rely on identifiers framework
  • FCL-533: add scoring to Identifiers

Fix

  • IdentifierSchema: use hasattr instead of getattr with a default when testing required attributes
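
The changelog does not say which values triggered this fix, but the general difference between the two checks can be sketched as follows. Both schema classes and both helper functions are hypothetical and exist only for illustration.

```python
class SchemaWithNoneValue:
    # Hypothetical schema: the attribute is defined, but its value is None.
    name = None


class SchemaWithoutAttribute:
    # Hypothetical schema: the attribute is not defined at all.
    pass


def is_defined_via_getattr(cls: type, attr: str) -> bool:
    # Conflates "attribute missing" with "attribute defined as None".
    return getattr(cls, attr, None) is not None


def is_defined_via_hasattr(cls: type, attr: str) -> bool:
    # True whenever the attribute exists, whatever its value.
    return hasattr(cls, attr)
```

With `getattr` and a default, a class that deliberately sets an attribute to `None` is indistinguishable from one that omits it; `hasattr` tests only for presence.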

v28.2.0 (2024-12-17)

Feat

  • FCL-532: assign FCLIDs on document publication
  • FCL-532: add ability to retrieve identifiers by type
  • FCL-499: add new FCLID identifier class
  • FCL-499: add method to get next sequence number from MarkLogic

Fix

  • deps: update dependency certifi to >=2024.12.14,<2024.13.0
  • deps: update dependency boto3 to v1.35.80

v28.1.0 (2024-12-12)

Feat

  • FCL-309: identifier UUIDs are now prefixed with 'id-'
  • FCL-309: identifiers can compile URL slugs
  • FCL-309: identifiers can now be saved to and retrieved from MarkLogic
  • FCL-309: add functionality for packing and unpacking XML representations of identifiers
  • FCL-309: add stub for defining identifier schemas, and a Neutral Citation schema
  • FCL-309: add ability to add, delete and update identifiers
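
As a rough illustration of the 'id-' prefix described above, the sketch below builds such a value with the standard library. The function name is hypothetical, and the changelog does not state the motivation; one plausible reason (an assumption) is that XML ID values may not begin with a digit, which a bare UUID often does.

```python
from uuid import uuid4


def new_prefixed_uuid() -> str:
    """Return a random UUID string prefixed with 'id-'.

    Hypothetical helper for illustration; not this library's API.
    """
    return f"id-{uuid4()}"
```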

Fix

  • deps: update boto packages to v1.35.69
  • deps: update dependency ds-caselaw-utils to v2.0.1
  • deps: update dependency mypy-boto3-sns to v1.35.68
  • deps: update boto packages to v1.35.67
  • deps: update dependency boto3 to v1.35.64
  • deps: update boto packages to v1.35.61
  • deps: update dependency boto3 to v1.35.77
  • deps: update dependency mypy-boto3-s3 to v1.35.76
  • deps: update dependency boto3 to v1.35.75
  • deps: update boto packages to v1.35.72

v28.0.0 (2024-11-14)

BREAKING CHANGE

  • Code which provided unsanitised URIs when initialising DocumentURIStrings will now cause InvalidDocumentURIExceptions to be raised.
  • Document can no longer be initialised with a string as the uri; it must be a DocumentURIString.
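
The pattern of validating a string type at construction can be sketched as below. This is not the library's implementation: `SketchURIString` and `InvalidSketchURIError` are hypothetical stand-ins, and the rule shown (no leading or trailing slash) is an assumed example rather than the actual validation.

```python
class InvalidSketchURIError(ValueError):
    """Hypothetical stand-in for the library's exception."""


class SketchURIString(str):
    """Hypothetical str subclass that validates on construction."""

    def __new__(cls, content: str) -> "SketchURIString":
        # Assumed rule for illustration: no leading or trailing slash.
        if content.startswith("/") or content.endswith("/"):
            raise InvalidSketchURIError(
                f"{content!r} has a leading or trailing slash"
            )
        return super().__new__(cls, content)
```

Because validation happens in `__new__`, any function that accepts the subclass rather than a plain `str` can rely on the value having been checked already.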

Feat

  • Validate strings when creating a new DocumentURIString

Fix

  • deps: update dependency boto3 to v1.35.58
  • deps: update dependency boto3 to v1.35.56

Refactor

  • Document: initialising a Document now requires a DocumentURIString, not a str
  • tests: simpler test changes to pass type checking

v27.4.0 (2024-11-07)

Change of behaviour

  • Require documents to be published before bulk enrichment will enrich them

Feature

  • Add logging of xquery commands and values passed to them if DEBUG environment set

v27.3.0 (2024-10-30)

Feat

  • FCL-386: search query can now be passed to get_document_by_uri

v27.2.0 (2024-10-28)

Feat

  • FCL-396: query highlighting is now done as a function of requesting the Document

Fix

  • deps: update dependency boto3 to v1.35.48
  • deps: update dependency mypy-boto3-s3 to v1.35.45
  • deps: update dependency boto3 to v1.35.45

Refactor

  • FCL-396: tidy up API implementation for search query highlighting change

v27.1.0 (2024-10-23)

  • Feature: Add native XSLT transformations to the API
  • Allow things on doc.body to be called from doc with a warning
  • client.checkout_judgment now accepts a timeout_seconds parameter
  • Allow test failures for Python 3.13/3.14
  • Ensure Judgment- and PressSummaryFactory have working NCNs

v27.0.1 (2024-10-17)

  • Fix .content_as_html on Document Factory

v27.0.0 (2024-10-08)

BREAKING CHANGE

  • Remove Document.overwrite and MarkLogicApiClient.overwrite
  • The models.documents.body.CourtIdentifierString type has been replaced with the more specific courts.CourtCode type from ds-caselaw-utils.

Feat

  • NeutralCitationMixin: use ABC to flag abstract methods properly

Fix

  • deps: update dependency boto3 to v1.35.33
  • deps: update dependency mypy-boto3-s3 to v1.35.32
  • deps: update dependency boto3 to v1.35.30
  • SearchResponse: total now returns an int, not a str
  • SearchResult: update behaviour to meet type checking
  • deps: update dependency ds-caselaw-utils to v1.7.0
  • deps: update dependency boto3 to v1.35.28
  • deps: update dependency ds-caselaw-utils to v1.5.7

Refactor

  • FCL-331: move api_client, xml and html params to build method signature instead of kwargs
  • types: typing improvements around NeutralCitationString
  • Document: remove unused overwrite method
  • DocumentBody: replace CourtIdentifierString with utils.courts.CourtCode

v26.0.0 (2024-09-25)

BREAKING CHANGE

  • Multiple methods which used to be within Document are now in Document.body

Feat

  • FCL-268: break functions which rely on the document body into their own subclass

Fix

  • FCL-268: update factory behaviour to match new document body model
  • FCL-268: use real date when testing if document date should be sent in reparse payload
  • deps: update dependency boto3 to v1.35.23

Refactor

  • FCL-268: move document statuses to their own submodule
  • FCL-268: move document exceptions into their own submodule
  • FCL-268: move XML manipulation into its own file
  • FCL-268: move the documents module in readiness for better code separation
  • Breaking: Remove xml_tools
  • Multiple stylistic improvements, and enabling ruff to allow us to keep standards up in future
  • Truncate reparse references to avoid overlong step function names in TRE
  • Always set last sent date to parser, even on failed parses
  • [FCL-176] Tooling configuration audit
  • [FCL-195] Skip pre-commit branch check in CI
  • Make enrichment date maths not care about timezones
  • Remove explicit urllib3 v1 dependency, rely on implicit dependency only
  • Remove fclex_id prefix from UUID of reparse execution ID
  • Implement handling of facets received from MarkLogic search results
  • Add an enriched_recently property
  • Add a validates_against_schema property
  • Add a can_enrich property
  • Only enrich if not recently enriched and valid against current schema
  • Allow fetching linked documents for Judgments and PressSummarys
  • Add function to check if the docx exists for a judgment

[Release 22.0.2]

  • Add a method to allow fetching press summaries for a given document
  • Ensure that we log a warning and do not error when a judgment has an unrecognised jurisdiction
  • Expose court jurisdictions in search results
  • Breaking: Client.get_pending_enrichment_for_version now requires both a target enrichment version and a target parser version, and will not include documents which have not been parsed with the target version.
  • Feature: Add accessors for judgment jurisdiction
  • Feature: New Client.get_pending_parse_for_version and Client.get_highest_parser_version methods to help find documents in need of re-parsing.
  • Breaking: Client.get_pending_enrichment_for_version now accepts a tuple of (major_version, minor_version) rather than a single major version.
  • Add support for quoted phrase prioritisation in result snippets
  • Breaking: Client.set_published no longer has a default argument; you must always be explicit.
  • Feature: New Client.get_pending_enrichment_for_version method finds documents which are not yet enriched with a given version, and which haven't recently been sent for enrichment.
  • Breaking: Fully remove the deprecated caselawclient.api_client instance.
  • Breaking: Remove top-level methods for interacting with a document's XML representation. These are now all encapsulated in document.xml, which is an instance of Document.XML.
  • Feature: New Document.xml_root_element function to replace get_judgment_root
  • Feature: Documents which are not valid XML are now identified by the raising of a new Document.NonXMLDocumentError exception
  • Feature: Add method to return document's lock status and message.
  • Feature: Document.enrich() method will send a message to the announce SNS, requesting that a document be enriched.
  • document.content_as_html now takes an optional query= string parameter, which, when supplied, highlights instances of the query within the document with <mark> tags, each of which has a numbered id indicating its sequence in the document.
  • document.number_of_mentions method which takes a query= string parameter, and returns the number of highlighted mentions in the html.
  • New Client.get_combined_stats_table method to run a combined statistics query against MarkLogic.
  • BREAKING: VersionAnnotation now requires a statement of whether the action is automated or not
  • VersionAnnotation can now accept an optional dict of structured payload data
  • VersionAnnotation can now record a user agent string
  • New versions of a document created with insert_document_xml can now be annotated
  • BREAKING: Renamed save_judgment_xml to update_document_xml
  • BREAKING: All annotations for versions are now mandatory instances of the new VersionAnnotation class
  • Expose the creation date of a version
  • Get version annotation for a single document
  • Expose the type of the latest manifestation date of a document
  • Search results for press summaries now include NCNs
  • Search results now correctly include document status information
  • Latest manifestation datetime is available for documents (including versions)
  • Bugfix: document_date_as_date shouldn't fail hard if we can't parse it.
  • Changed is_failure to rely on failed_to_parse, rather than failure in the URI.

  • Added transformation_datetime to Document

  • Added enrichment_datetime to Document

  • Added get_manifestation_datetimes to Document

  • Added get_latest_manifestation_datetime to Document

  • Added versions_as_documents to Document

  • Added is_version to Document

  • Added version_number to Document

  • Add default user agent string
  • Add functions for overwriting and moving judgments
  • Fixed neutral_citation property to look within preface tag rather than mainBody for press summaries, due to updated parsing resulting in updated press summary xml structure.
  • Added python-dotenv as a poetry dev dependency to be able to run the new smoketest.py file that connects to a MarkLogic instance.
  • Fixed Client.set_document_court method
  • Fixed Client.get_document_type_from_uri method
  • Breaking: Removed document.is_editable in favour of the more descriptive and better-tested document.failed_to_parse.
  • Add new Document.delete() method.
  • Generalised the set judgment metadata methods to set document metadata methods specifically for name, court and date.
  • Fix issues blocking push to PyPI
  • Add a "Best human identifier" to Documents
  • Added get_judgment_xml_bytestring and content_as_xml_bytestring to Client
  • Fixed content_as_xml_tree by making it use content_as_xml_bytestring
  • Made Document class' name, court, document_date_as_string and document_date_as_date work for Press Summaries also.
  • Added neutral_citation property and validation to PressSummary class.
  • Significant improvements to inline documentation of the code.
  • Deprecated: The caselawclient.api_client instance should be considered deprecated. Projects should instead initialise their own instance.

Breaking changes

  • supplemental/anonymous/sensitive getters/setters removed

  • XQueries which return multiple responses will raise an error

  • Refactored Document class' name, court, document_date_as_string and document_date_as_date (previously judgmentdate...) on Document class and neutral_citation on Judgment class making use of the new cached content_as_xml_tree property.

  • Renamed judgment_date_as_string and judgment_date_as_date to document_date_as_string and document_date_as_date respectively.

  • Added content_as_xml_tree cached property to Document class

  • Changed the Document class' content_as_xml to be a cached_property also. [Note: this changelog line previously mistakenly referred to content_as_html.]

  • Removed get_judgment_name, get_judgment_citation, get_judgment_court, get_judgment_work_date from the Client class and associated .xqy files.

  • Add a new MarklogicApiClient.get_document_by_uri method to retrieve a document (of any type) by URI.

  • New get_document_by_uri method on API client returning unique types for Judgments and PressSummarys.

  • New Document.enrich() method to trigger enrichment

  • Breaking: Renamed Judgment to Document
  • Breaking: Document.judgment_exists is now Document.document_exists
  • Check for a valid court, rather than a present one
  • Trim whitespace when trying to set an NCN
  • Breaking: Renamed copy_judgment to copy_document
  • copy_document now adds the document to the appropriate collection based on the uri.
  • Judgment.validation_failure_messages method for retrieving a list of strings with reasons a judgment cannot be published.
  • Fixed insert_document_xml to pattern match uris with, and add documents to, press-summary, not press_summary.
  • BREAKING: Renamed insert_judgment_xml to insert_document_xml and enhanced it to place a document in the appropriate collection (press_summary or judgment)
  • BREAKING: Changed SearchParameters dataclass field from q to query
  • Added search_helpers module to allow clients to search and process document search responses in one go.
  • Added SearchParameters dataclass for use with search functions using the legacy kwargs from Client.advanced_search and new collections field for filtering by collections
  • BREAKING: Changed Client.advanced_search interface to take in SearchParameters as opposed to the legacy kwargs.
  • Added search_and_decode_response and search_judgments_and_decode_response methods to Client
  • Added SearchResponse, SearchResult, SearchResultMetadata classes to encapsulate and process document search responses.
  • BREAKING: Instantiating a Judgment object will now raise a caselawclient.errors.JudgmentNotFoundError if the uri passed in does not correspond to a valid Judgment, rather than attempting (and failing) to return a MarklogicResourceNotFoundError
  • Added judgment_exists method to Client class
  • Make version_uri optional in Judgment.content_as_html
  • Ensure XSLT_IMAGE_LOCATION existing doesn't break tests
  • Improve detection of when a judgment doesn't exist
  • Unlock judgment on Judgment.unpublish() so editors can unpublish immediately after a publish
  • Judgment.publish method will now reject publication in more invalid states (must have a name, must have a valid NCN, must have a court code).
  • Less strict version pinning of dependencies to give downstream package users more flexibility in resolving.
  • Significantly more type annotations on Client and Judgment methods, including some which are stricter than before. This is potentially a breaking change if implementations have been relying on duck typing.
  • Automatic generation of strict typing for XQuery files which run against MarkLogic.
  • Improvements to the methods used in content hashing, which will be breaking changes if these are used downstream.

[Release 5.3.2]

  • Correct import location used in Judgment model, so it's usable when packaged
  • Fix broken build process

Release 5.3.0 (Yanked)

  • Dependabot now updates dependencies for all new versions, not just security updates
  • Use Poetry for dependency management, to improve robustness
  • Add a Judgment class (copied from Editor Interface) to begin the process of harmonising how various services interface with the data
  • Add code coverage reporting to CI
  • Make a PEP-561 declaration of typing
  • Re-add the code that was pointing the XSLT to the assets
  • This release had a bug, fixed by 5.2.5
  • HTML view: Do not default to current version if the version doesn't exist (cause an error instead)
  • Add content hash validation when we save a locked judgment
  • Bug fix: setting court was not valid XQuery in eval context
  • Improvements to code linting
  • Expose hash of judgment content
  • Unset the court tag where the court is an empty string
  • Clarify release process documentation
  • Add pypi version badge and libraries.io dependency shield
  • Expose MarklogicValidationFailed exception
  • Validate against a schema when priv API document is uploaded
  • Add CodeQL configuration
  • Add a check for secrets
  • Bump certifi from 2021.10.8 to 2022.12.7
  • Don't crash if multipart data is actually an empty bytestring.

** This release had a bug where the Editor UI was unusable. **

  • Ensure Work Date and Court values are returned as text
  • Get properties for a range of URIs for use in search results
  • Remove a debug print() statement that was missed
  • Admin users can't read unpublished judgments
  • Deprecate XMLTools methods
  • Fix TypeError: 'type' object is not subscriptable
  • Breaking change: passes a list of zero-or-more courts, rather than a string that might be empty.
  • Search queries: pages less than one are treated as one
  • Add linting
  • Ensure only people who are allowed to view unpublished judgments can view them
  • Refactor tests
  • Break judgment checkout
  • Methods & XQueries to get & set all metadata
  • DRY up some aspects of the API Client
  • Support renaming of the XSL Transformation files
  • Speed up privilege checking
  • Add user_has_privilege method & XQuery to check if a user has a privilege
  • Use user_has_privilege to check if a user can see unpublished documents
  • Move error message codes and messages into this client
  • New errors handled from Marklogic
  • New function to save XML for a locked judgment
  • Fix: add external declaration to XQuery parameter
  • Bump version of requests to 2.28.1
  • Raise error if unpublished document is not returned
  • Use -1 as value meaning 'lock forever' in checkout_judgment
  • Return none if the judgment is not locked, rather than an empty string
  • Add optional annotation parameter to checkout_judgment method
  • Add method to get the lock/checkout status of a judgment
  • Judgment checkout may optionally expire at midnight
  • Gracefully handle a null, empty or unexpected error response from Marklogic
  • Rename set_judgment_date to set_judgment_work_expression_date
  • Update the FRBRWork and FRBRExpression dates and @name attributes
  • Fix a typo in setting the internal URI of a judgment
  • Change the XQuery delete method from xdmp:document-delete to dls:document-delete
  • Change the behaviour of 'last-modified' dates to use prop:last-modified rather than xdmp:document-timestamp
  • Set the judgment's internal URI (FRBRthis and FRBRuri nodes)
  • Allow the xsl filename used in the judgment transformation to vary. We have two xsls available in Marklogic - judgment2 (the accessible version) and judgment0 (the "as handed down" version). Add two helper methods accessible_judgment_transformation() and original_judgment_transformation() to call these transformations without specifying the xsl filename.
  • Copy judgment from URI to URI
  • Adds a new delete_judgment endpoint, for deleting a judgment from marklogic
  • Create a new akn:FRBRdate, uk:cite and uk:court nodes for the judgment metadata, if they do not exist
  • Create a new akn:FRBRname node for the judgment metadata name, if one does not exist
  • Patch release to update setup.cfg, which was missed from v4.5.0
  • Allow metadata elements (name, date, court and citation) to be edited in the XML via XQuery, not by deserialising and serialising the XML in the implementing client code.
  • If an element doesn't exist in a document, xml_tools.get_element tries to return an empty element with the same name as the desired element.
  • Add function to retrieve the last time a document or its properties were updated
  • Add neutral citation and specific keyword search parameters to advanced search
  • Use error code from eval response body to throw MarklogicResourceNotFound errors
  • Parameterize the location of images in the XSLT transformation
  • Use the invoke endpoint, and the search.xqy stored on Marklogic, to search
  • Remove the database parameter in eval, it's not required; the db associated with the REST server is used
  • Add flag to advanced_search to enable filtering out published documents from the search
  • Add function to get and set text properties on a document
  • Fix insert_document xquery to call the document-insert-and-manage function
  • Replace LXML with standard library xml for wider compatibility and reduced build times
  • Add a new anonymised content flag
  • Fix intermittently failing XSLT transforms
  • Refactor save_judgment_xml to use the eval endpoint, so that we can introduce versioning via an XQuery.
  • List all versions of a managed judgment
  • Get a version of a managed judgment
  • Restrict search to managed judgments only
  • Set properties on a judgment using the dls namespace, not xdmp
  • Insert & manage a new document
  • Check in and check out a document for editing
  • Use document properties on the "original" version of the judgment, not its version, to see if a judgment is published
  • Minor bugfixes
  • Refactored property accessor methods
  • BREAKING CHANGE is_document_published changed to get_published and publish_document changed to set_published.
  • Initial tagged release