codespell: workflow, config + some (quite a few) typos fixed (langchain-ai#6785)

Probably the most boring PR to review ;)

Individual commits might be easier to digest

---------

Co-authored-by: Bagatur <[email protected]>
Co-authored-by: Bagatur <[email protected]>
3 people authored Jul 12, 2023
1 parent 931e686 commit 0d92a7f
Showing 100 changed files with 213 additions and 127 deletions.
26 changes: 26 additions & 0 deletions .github/CONTRIBUTING.md
@@ -123,6 +123,32 @@ This can be very helpful when you've made changes to only certain parts of the p

We recognize linting can be annoying - if you do not want to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.

### Spellcheck

Spellchecking for this project is done via [codespell](https://github.com/codespell-project/codespell).
Note that `codespell` finds common typos, so could have false-positive (correctly spelled but rarely used) and false-negatives (not finding misspelled) words.

To check spelling for this project:

```bash
make spell_check
```

To fix spelling in place:

```bash
make spell_fix
```

If codespell is incorrectly flagging a word, you can skip spellcheck for that word by adding it to the codespell config in the `pyproject.toml` file.

```python
[tool.codespell]
...
# Add here:
ignore-words-list = 'momento,collison,ned,foor,reworkd,parth,whats,aapply,mysogyny,unsecure'
```
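To make the false-positive and false-negative behaviour described above concrete, here is a minimal toy sketch of a lookup-based checker with an ignore list. It is not codespell's implementation: the `COMMON_TYPOS` table, the `find_typos` helper, and the sample sentence are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical, tiny misspelling table (an assumption for this sketch);
# a real tool ships a far larger dictionary of common typos.
COMMON_TYPOS = {
    "seperated": "separated",
    "naturaly": "naturally",
    "momento": "memento",
}


def find_typos(text, ignore_words=()):
    """Return (word, suggestion) pairs for known typos that are not ignored."""
    ignored = {w.lower() for w in ignore_words}
    hits = []
    for word in re.findall(r"[A-Za-z']+", text):
        key = word.lower()
        if key in COMMON_TYPOS and key not in ignored:
            hits.append((word, COMMON_TYPOS[key]))
    return hits


sample = "It is seperated from the Momento cache, which is a raer case."

# "Momento" is a legitimate name here, so flagging it is a false positive;
# "raer" is a real typo but is not in the table, so it is a false negative.
print(find_typos(sample))
print(find_typos(sample, ignore_words=["momento"]))
```

Passing a word through `ignore_words` plays the same role as adding it to `ignore-words-list`: the lookup still matches, but the hit is suppressed.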

### Coverage

Code coverage (i.e. the amount of code that is covered by unit tests) helps identify areas of the code that are potentially more or less brittle.
22 changes: 22 additions & 0 deletions .github/workflows/codespell.yml
@@ -0,0 +1,22 @@
---
name: Codespell

on:
  push:
    branches: [master]
  pull_request:
    branches: [master]

permissions:
  contents: read

jobs:
  codespell:
    name: Check for spelling errors
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Codespell
        uses: codespell-project/actions-codespell@v2
6 changes: 6 additions & 0 deletions Makefile
@@ -81,6 +81,12 @@ format format_diff:
	poetry run black $(PYTHON_FILES)
	poetry run ruff --select I --fix $(PYTHON_FILES)

spell_check:
	poetry run codespell --toml pyproject.toml

spell_fix:
	poetry run codespell --toml pyproject.toml -w

######################
# HELP
######################
2 changes: 1 addition & 1 deletion docs/extras/ecosystem/integrations/grobid.mdx
@@ -1,7 +1,7 @@
# Grobid

This page covers how to use the Grobid to parse articles for LangChain.
-It is seperated into two parts: installation and running the server
+It is separated into two parts: installation and running the server

## Installation and Setup
#Ensure You have Java installed
12 changes: 6 additions & 6 deletions docs/extras/ecosystem/integrations/langchain_decorators.mdx
@@ -10,7 +10,7 @@ For Feedback, Issues, Contributions - please raise an issue here:
Main principles and benefits:

- more `pythonic` way of writing code
-- write multiline prompts that wont break your code flow with indentation
+- write multiline prompts that won't break your code flow with indentation
- making use of IDE in-built support for **hinting**, **type checking** and **popup with docs** to quickly peek in the function to see the prompt, parameters it consumes etc.
- leverage all the power of 🦜🔗 LangChain ecosystem
- adding support for **optional parameters**
@@ -31,7 +31,7 @@ def write_me_short_post(topic:str, platform:str="twitter", audience:str = "devel
"""
return

-# run it naturaly
+# run it naturally
write_me_short_post(topic="starwars")
# or
write_me_short_post(topic="starwars", platform="redit")
@@ -122,7 +122,7 @@ await write_me_short_post(topic="old movies")

# Simplified streaming

-If we wan't to leverage streaming:
+If we want to leverage streaming:
- we need to define prompt as async function
- turn on the streaming on the decorator, or we can define PromptType with streaming on
- capture the stream using StreamingContext
@@ -149,7 +149,7 @@ async def write_me_short_post(topic:str, platform:str="twitter", audience:str =



-# just an arbitrary function to demonstrate the streaming... wil be some websockets code in the real world
+# just an arbitrary function to demonstrate the streaming... will be some websockets code in the real world
tokens=[]
def capture_stream_func(new_token:str):
tokens.append(new_token)
@@ -250,7 +250,7 @@ the roles here are model native roles (assistant, user, system for chatGPT)

# Optional sections
- you can define a whole sections of your prompt that should be optional
-- if any input in the section is missing, the whole section wont be rendered
+- if any input in the section is missing, the whole section won't be rendered

the syntax for this is as follows:

@@ -273,7 +273,7 @@ def prompt_with_optional_partials():
# Output parsers

- llm_prompt decorator natively tries to detect the best output parser based on the output type. (if not set, it returns the raw string)
-- list, dict and pydantic outputs are also supported natively (automaticaly)
+- list, dict and pydantic outputs are also supported natively (automatically)

``` python
# this code example is complete and should run as it is
2 changes: 1 addition & 1 deletion docs/extras/ecosystem/integrations/myscale.mdx
@@ -18,7 +18,7 @@ We also deliver with live demo on huggingface! Please checkout our [huggingface
## Installation and Setup
- Install the Python SDK with `pip install clickhouse-connect`

-### Setting up envrionments
+### Setting up environments

There are two ways to set up parameters for myscale index.

2 changes: 1 addition & 1 deletion docs/extras/ecosystem/integrations/vectara/index.mdx
@@ -39,7 +39,7 @@ vectara = Vectara(
```
The customer_id, corpus_id and api_key are optional, and if they are not supplied will be read from the environment variables `VECTARA_CUSTOMER_ID`, `VECTARA_CORPUS_ID` and `VECTARA_API_KEY`, respectively.

-Afer you have the vectorstore, you can `add_texts` or `add_documents` as per the standard `VectorStore` interface, for example:
+After you have the vectorstore, you can `add_texts` or `add_documents` as per the standard `VectorStore` interface, for example:

```python
vectara.add_texts(["to be or not to be", "that is the question"])
@@ -1840,7 +1840,7 @@ This category contains articles that are incomplete and are tagged with the {{T|
<username>FANDOM</username>
<id>32769624</id>
</contributor>
-<comment>Created page with "{{LicenseBox|text=''This work is licensed under the [https://opensource.org/licenses/MIT MIT License].''}}{{#ifeq: {{NAMESPACENUMBER}} | 0 | &lt;includeonly&gt;Category:MIT licens..."</comment>
+<comment>Created page with "{{LicenseBox|text=''This work is licensed under the [https://opensource.org/licenses/MIT MIT License].''}}{{#ifeq: {{NAMESPACENUMBER}} | 0 | &lt;includeonly&gt;Category:MIT license..."</comment>
<origin>104</origin>
<model>wikitext</model>
<format>text/x-wiki</format>
2 changes: 1 addition & 1 deletion docs/extras/modules/paul_graham_essay.txt
@@ -142,7 +142,7 @@ There were three main parts to the software: the editor, which people used to bu

There were a lot of startups making ecommerce software in the second half of the 90s. We were determined to be the Microsoft Word, not the Interleaf. Which meant being easy to use and inexpensive. It was lucky for us that we were poor, because that caused us to make Viaweb even more inexpensive than we realized. We charged $100 a month for a small store and $300 a month for a big one. This low price was a big attraction, and a constant thorn in the sides of competitors, but it wasn't because of some clever insight that we set the price low. We had no idea what businesses paid for things. $300 a month seemed like a lot of money to us.

-We did a lot of things right by accident like that. For example, we did what's now called "doing things that don't scale," although at the time we would have described it as "being so lame that we're driven to the most desperate measures to get users." The most common of which was building stores for them. This seemed particularly humiliating, since the whole raison d'etre of our software was that people could use it to make their own stores. But anything to get users.
+We did a lot of things right by accident like that. For example, we did what's now called "doing things that don't scale," although at the time we would have described it as "being so lame that we're driven to the most desperate measures to get users." The most common of which was building stores for them. This seemed particularly humiliating, since the whole reason d'etre of our software was that people could use it to make their own stores. But anything to get users.

We learned a lot more about retail than we wanted to know. For example, that if you could only have a small image of a man's shirt (and all images were small then by present standards), it was better to have a closeup of the collar than a picture of the whole shirt. The reason I remember learning this was that it meant I had to rescan about 30 images of men's shirts. My first set of scans were so beautiful too.

4 changes: 2 additions & 2 deletions docs/extras/use_cases/question_answering/index.mdx
@@ -45,7 +45,7 @@ Let's load this [blog post](https://lilianweng.github.io/posts/2023-06-23-agent/

We have a QA app in a few lines of code.

-Set enviorment varaibles and get packages:
+Set environment variables and get packages:
```python
pip install openai
pip install chromadb
@@ -140,7 +140,7 @@ Here are the three pieces together:

#### 1.2.2 Retaining metadata

-`Context-aware splitters` keep the location ("context") of each split in the origional `Document`:
+`Context-aware splitters` keep the location ("context") of each split in the original `Document`:

* [Markdown files](https://python.langchain.com/docs/use_cases/question_answering/document-context-aware-QA)
* [Code (py or js)](https://python.langchain.com/docs/modules/data_connection/document_loaders/integrations/source_code)