Docs example fix requested - Finding all the patents for a given company (or organization) #176

Open
MatthewLoffredo opened this issue Aug 14, 2024 · 1 comment

Comments

@MatthewLoffredo

Hi, I'm trying to follow the page here (https://patent-client.readthedocs.io/en/latest/examples/3%20-%20Company%20Ownership.html) to get a company's patent portfolio, but the first step doesn't seem to be working:

Calling the first step:

applicant_apps = USApplication.objects.filter(first_named_applicant='University of California, Berkeley').values_list('appl_id', flat=True).to_list()

results in:

/usr/local/lib/python3.10/dist-packages/pydantic/main.py in model_validate(cls, obj, strict, from_attributes, context)
    566         # `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks
    567         __tracebackhide__ = True
--> 568         return cls.__pydantic_validator__.validate_python(
    569             obj, strict=strict, from_attributes=from_attributes, context=context
    570         )

ValidationError: 1 validation error for PedsPage
queryResults.searchResponse.response.docs.1.appFilingDate
  Field required [type=missing, input_value={'corrAddrCountryName': '... 02818 (UNITED STATES)'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.8/v/missing

Additionally, I was wondering if you had any tricks for getting all the different potential assignee names for a company? Some companies file under several different names, so this method would potentially miss anything filed under a name variation. Any advice on how to handle that? The end goal is to get a company's complete patent portfolio.

@Hobly

Hobly commented Sep 3, 2024

On the first point, it's worth noting that the USPTO will be deprecating PEDS very soon, so if you want to use patent_client for landscaping, your best bet is the Open Data Portal module (patent_client.odp).

Secondly, if you only want to retrieve a list of applications and don't need all the patent_client bells and whistles, it's pretty trivial to make a Python requests call to the ODP search API directly; see https://beta-data.uspto.gov/apis/getting-started. Construct a suitable query string (e.g. something like q = "applicationMetaData.applicantBag.applicantNameText: Google").
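A rough sketch of building such a query in Python, in case it helps. Only the field name comes from the example above; the `field:value` / `OR` syntax, the `limit` and `offset` parameter names, and the placeholder base URL are assumptions to check against the getting-started page. It only builds the URL and does not send anything, so you would still need to add the real endpoint and whatever authentication the API requires.

```python
from urllib.parse import urlencode

# Field name taken from the example query string in this comment.
APPLICANT_FIELD = "applicationMetaData.applicantBag.applicantNameText"


def build_applicant_query(name_variants, field=APPLICANT_FIELD):
    """OR together one `field:value` clause per applicant name variant.

    The query syntax is an assumption, not confirmed from the API docs.
    A variant may contain a wildcard such as "Googl*".
    """
    clauses = []
    for name in name_variants:
        name = name.strip()
        if not name:
            continue  # skip blank entries
        # Quote multi-word names so they are searched as a phrase (assumed).
        value = f'"{name}"' if " " in name else name
        clauses.append(f"{field}:{value}")
    return " OR ".join(clauses)


def build_search_url(base_url, query, limit=25, offset=0):
    """Attach the query to a search endpoint as URL-encoded parameters.

    `base_url` is whatever search endpoint the docs give; `limit` and
    `offset` are hypothetical paging parameter names.
    """
    params = {"q": query, "limit": limit, "offset": offset}
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    query = build_applicant_query(["Google", "Google LLC"])
    # Placeholder URL: replace with the real search endpoint.
    print(build_search_url("https://example.invalid/search", query))
```

The resulting URL can then be passed to requests.get (or urllib) along with any headers the API needs.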

On normalising different company names, there is no universal solution to this problem, so you'll almost always have to do it manually. Patent applicant name data is notoriously fragmented because attorneys type these names in slightly differently, so you'll need to build a custom query to capture as many variants as you can. Calling the USPTO ODP API directly also lets you use wildcards in your search query, which helps with that. It's pretty interesting to do a broad subject-matter search and then list all the unique ways a company has been named. I once did one where a well-known Korean research institute was named in 15 different ways...
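To illustrate the "list all the unique ways a company has been named" step, here is a minimal sketch that groups raw applicant name strings under a crude normalised key. The suffix list and the normalisation rule are hypothetical choices for illustration, not anything patent_client or the USPTO provides, and the sample names are made up. It is ASCII-only, so names in other scripts would need different handling, and the output still needs manual review.

```python
import re
from collections import defaultdict

# Hypothetical list of legal-form suffixes to ignore; extend as needed.
SUFFIXES = {"INC", "LLC", "LTD", "CORP", "CORPORATION", "CO", "COMPANY", "LIMITED"}


def normalise(name):
    """Uppercase, replace punctuation with spaces, drop trailing suffixes."""
    tokens = re.sub(r"[^A-Z0-9]+", " ", name.upper()).split()
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def group_variants(names):
    """Map each normalised key to the sorted raw spellings seen for it."""
    groups = defaultdict(set)
    for name in names:
        groups[normalise(name)].add(name)
    return {key: sorted(variants) for key, variants in groups.items()}


if __name__ == "__main__":
    # Made-up sample data standing in for applicant names from search results.
    sample = ["Google LLC", "GOOGLE, INC.", "Google Inc", "Example Widgets Co., Ltd."]
    for key, variants in group_variants(sample).items():
        print(key, "->", variants)
```

Each group's raw spellings can then be fed back into the search query as additional name variants.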
