Commit
Showing 46 changed files with 2,107 additions and 250 deletions.
32 changes: 32 additions & 0 deletions
docs/core_docs/docs/integrations/document_loaders/web_loaders/browserbase.mdx
@@ -0,0 +1,32 @@
# Browserbase Loader

## Description

[Browserbase](https://browserbase.com) is a serverless platform for running headless browsers. It offers advanced debugging, session recordings, stealth mode, integrated proxies, and captcha solving.

## Installation

- Get an API key from [browserbase.com](https://browserbase.com) and set it in the `BROWSERBASE_API_KEY` environment variable.
- Install the [Browserbase SDK](https://github.com/browserbase/js-sdk):

```bash npm2yarn
npm i @browserbasehq/sdk
```

## Example

Use the `BrowserbaseLoader` as follows to let your agent load websites:

import CodeBlock from "@theme/CodeBlock";
import Example from "@examples/document_loaders/browserbase.ts";

<CodeBlock language="typescript">{Example}</CodeBlock>

## Arguments

- `urls`: Required. List of URLs to load.

## Options

- `apiKey`: Optional. The Browserbase API key. Defaults to the `BROWSERBASE_API_KEY` environment variable.
- `textContent`: Optional. Load pages as readable text. Defaults to `false`.
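
Reviewer sketch: the defaults listed under Options could be resolved roughly as follows. This is an illustration only. `resolveLoaderOptions`, the `Env` type, and the thrown error are hypothetical names and behavior, not part of LangChain or the Browserbase SDK, and the SDK may treat a missing key differently.

```typescript
// Options as documented in the list above; both are optional.
interface LoaderOptions {
  apiKey?: string;
  textContent?: boolean;
}

// The environment is passed in so the sketch stays self-contained;
// in Node this would typically be `process.env`.
type Env = Record<string, string | undefined>;

// Hypothetical helper, not part of LangChain or the Browserbase SDK.
function resolveLoaderOptions(
  options: LoaderOptions,
  env: Env
): Required<LoaderOptions> {
  // An explicit apiKey wins; otherwise fall back to BROWSERBASE_API_KEY.
  const apiKey = options.apiKey ?? env["BROWSERBASE_API_KEY"];
  if (!apiKey) {
    // Assumption: fail fast. The real SDK's behavior is not shown in this commit.
    throw new Error("Missing Browserbase API key");
  }
  // textContent falls back to the documented default of false.
  return { apiKey, textContent: options.textContent ?? false };
}
```

Passing the environment as a parameter keeps the fallback easy to exercise in tests without touching real environment variables.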
@@ -0,0 +1,6 @@
import { BrowserbaseLoader } from "langchain/document_loaders/web/browserbase";

const loader = new BrowserbaseLoader(["https://example.com"], {
  textContent: true,
});
const docs = await loader.load();
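
Reviewer sketch: once `load()` resolves, each document carries the page text in `pageContent` and its source in `metadata.url`, going by the loader source in this commit. The snippet below shows one way to inspect that result. `LoadedDoc`, `summarize`, and the sample data are hypothetical stand-ins so the sketch runs without a Browserbase account.

```typescript
// Shape of each loaded document, as produced by the loader in this commit.
interface LoadedDoc {
  pageContent: string;
  metadata: { url: string };
}

// Hypothetical helper: one summary line per document.
function summarize(docs: LoadedDoc[]): string[] {
  return docs.map(
    (doc) => `${doc.metadata.url}: ${doc.pageContent.length} characters`
  );
}

// Made-up sample standing in for the result of `await loader.load()`.
const sampleDocs: LoadedDoc[] = [
  { pageContent: "Example Domain", metadata: { url: "https://example.com" } },
];

console.log(summarize(sampleDocs).join("\n"));
```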
@@ -0,0 +1,82 @@
import { Document, type DocumentInterface } from "@langchain/core/documents";
import Browserbase, { BrowserbaseLoadOptions } from "@browserbasehq/sdk";
import { BaseDocumentLoader } from "../base.js";
import type { DocumentLoader } from "../base.js";

interface BrowserbaseLoaderOptions extends BrowserbaseLoadOptions {
  apiKey?: string;
}

/**
 * Load pre-rendered web pages using a headless browser hosted on Browserbase.
 *
 * Depends on the `@browserbasehq/sdk` package.
 * Get your API key from https://browserbase.com
 *
 * @example
 * ```typescript
 * import { BrowserbaseLoader } from "langchain/document_loaders/web/browserbase";
 *
 * const loader = new BrowserbaseLoader(["https://example.com"], {
 *   apiKey: process.env.BROWSERBASE_API_KEY,
 *   textContent: true,
 * });
 *
 * const docs = await loader.load();
 * ```
 *
 * @param {string[]} urls - The URLs of the web pages to load.
 * @param {BrowserbaseLoaderOptions} [options] - Browserbase client options.
 */
export class BrowserbaseLoader
  extends BaseDocumentLoader
  implements DocumentLoader
{
  urls: string[];

  options: BrowserbaseLoaderOptions;

  browserbase: Browserbase;

  constructor(urls: string[], options: BrowserbaseLoaderOptions = {}) {
    super();
    this.urls = urls;
    this.options = options;
    this.browserbase = new Browserbase(options.apiKey);
  }

  /**
   * Load pages from URLs.
   *
   * @returns {Promise<DocumentInterface[]>} - A promise which resolves to a list of documents.
   */
  async load(): Promise<DocumentInterface[]> {
    const documents: DocumentInterface[] = [];
    for await (const doc of this.lazyLoad()) {
      documents.push(doc);
    }

    return documents;
  }

  /**
   * Lazily load pages from URLs.
   *
   * @returns {AsyncGenerator<DocumentInterface>} - An async generator that yields documents.
   */
  async *lazyLoad() {
    const pages = await this.browserbase.loadURLs(this.urls, this.options);

    let index = 0;
    for await (const page of pages) {
      yield new Document({
        pageContent: page,
        metadata: {
          url: this.urls[index],
        },
      });

      index += 1;
    }
  }
}
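
Reviewer sketch: `lazyLoad` attaches a URL to each page by position, which only works if the counter advances by exactly one per yielded page and the pages arrive in the same order as `urls`. The standalone sketch below isolates that pairing with a stubbed page stream. `fakePages`, `pairWithUrls`, `collect`, and `LoadedPage` are hypothetical names, and the in-order assumption about the SDK is not confirmed by this commit.

```typescript
// Hypothetical stand-in for the SDK's page stream; not part of @browserbasehq/sdk.
async function* fakePages(urls: string[]): AsyncGenerator<string> {
  for (const url of urls) {
    yield `<html>content of ${url}</html>`;
  }
}

interface LoadedPage {
  pageContent: string;
  metadata: { url: string };
}

// Pair each page with its source URL by position,
// advancing the counter by one per page.
async function* pairWithUrls(
  urls: string[],
  pages: AsyncIterable<string>
): AsyncGenerator<LoadedPage> {
  let index = 0;
  for await (const page of pages) {
    yield { pageContent: page, metadata: { url: urls[index] } };
    index += 1;
  }
}

// Eagerly drain a lazy stream, mirroring how load() drains lazyLoad().
async function collect<T>(source: AsyncIterable<T>): Promise<T[]> {
  const items: T[] = [];
  for await (const item of source) {
    items.push(item);
  }
  return items;
}
```

A caller would drain the stream with `await collect(pairWithUrls(urls, fakePages(urls)))`. Keeping the lazy generator separate from the eager collector lets consumers process pages one at a time when the URL list is long.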