Updates for v1.3 (#58)
* fixed test broken due to $vector projection change

* minor collection.distinct optimization

* minor error msg fix

* getSortVector (no documentation)

* added some documentation regarding sortVector

* updated vectorize test suite

* minor fix to vectorize tests when no params

* further refactoring of vectorize tests

* example vectorize_credentials.json file

* removed api-extractor from package.json

* added find-missing-licensing script for convenience

* tiny vectorize test + devguide updates

* dse db admin + fixed some tests (no docs)

* fixed typo in package.json script name

* minor update to list-embedding-providers script

* few internal refactors

* more minor refactors

* documentation for environments and such

* some new unit tests

* create namespace options

* fixed some documentation + code issues

* made modelName not fully optional, just allowably nullish

* added doc for vectorize service

* updated cursor.getSortVector() to return null if includeSortVector !== true

* fixed a couple tests

* minor internal refactors/fixes

* added dse tests

* db admin tests

* vectorize whitelist

* few test fixes + allowed null tokens in StaticTokenProvider

* deleteAll() => deleteMany({})

* updated many examples using vector[ize] params

* some documentation and such

* couple tiny internal fixes

* example for non-astra backends

* a

* fixed couple tiny internal things
toptobes authored Jun 22, 2024
1 parent b10da81 commit f2abb91
Showing 95 changed files with 2,886 additions and 1,057 deletions.
12 changes: 11 additions & 1 deletion .env.example
@@ -3,14 +3,24 @@
################################################################################

# Astra API endpoint
ASTRA_URI=https://<db_id>-<region>.apps.astra.datastax.com
APPLICATION_URI=https://<db_id>-<region>.apps.astra.datastax.com

# Application token, used to authenticate with the Astra API
APPLICATION_TOKEN=AstraCS:<rest_of_token>

# Backend for the Data API (astra | dse | hcd | cassandra | other). Defaults to 'astra'.
APPLICATION_ENVIRONMENT=astra

# Set this to some value to enable running tests that require a $vectorize enabled environment
ASTRA_RUN_VECTORIZE_TESTS=1

# Regex whitelist for vectorize tests to run (test names formatted as providerName@modelName@authType@dimension)
# - where dimension := 'specified' | 'default' | a specific number
# - where authType := 'header' | 'providerKey' | 'none'
# Only needs to match part of the test name to whitelist (use ^$ as necessary)
# VECTORIZE_WHITELIST=^.*@(header|none)@default
VECTORIZE_WHITELIST=.*

# Set this to some value to enable running long-running tests
ASTRA_RUN_LONG_TESTS=1
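
The whitelist comments above describe test names of the form `providerName@modelName@authType@dimension` that are selected by a partial regex match. A minimal sketch of that matching, assuming the whitelist is applied with plain `RegExp.test` semantics; `formatTestName` and `isWhitelisted` are hypothetical helpers written for illustration, not functions from the repository, and the provider/model pairs are only sample values:

```typescript
type AuthType = 'header' | 'providerKey' | 'none';
type Dimension = 'specified' | 'default' | number;

// Hypothetical helper: builds a test name in the documented format.
function formatTestName(providerName: string, modelName: string, authType: AuthType, dimension: Dimension): string {
  return `${providerName}@${modelName}@${authType}@${dimension}`;
}

// Hypothetical helper: a partial match is enough unless the pattern is anchored with ^ and $.
function isWhitelisted(testName: string, whitelist: string): boolean {
  return new RegExp(whitelist).test(testName);
}

const names = [
  formatTestName('openai', 'text-embedding-3-small', 'header', 'default'),
  formatTestName('openai', 'text-embedding-3-small', 'providerKey', 1024),
  formatTestName('nvidia', 'NV-Embed-QA', 'none', 'default'),
];

// The commented-out pattern from the file keeps only header/none auth with the default dimension.
const selected = names.filter((name) => isWhitelisted(name, '^.*@(header|none)@default'));
console.log(selected);
```

With `VECTORIZE_WHITELIST=.*`, every name in the list would be selected.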

2 changes: 1 addition & 1 deletion .eslintrc.cjs
@@ -1,6 +1,6 @@
/* eslint-env node */
module.exports = {
ignorePatterns: ['dist/*', 'scripts/*'],
ignorePatterns: ['dist/*', 'scripts/*', 'examples/*'],
extends: ['eslint:recommended', 'plugin:@typescript-eslint/recommended'],
parser: '@typescript-eslint/parser',
plugins: ['@typescript-eslint'],
37 changes: 28 additions & 9 deletions DEVGUIDE.md
@@ -24,12 +24,23 @@ npm run test -- -f 'integration.'
npm run test:types
```

### Running the tests on local Stargate
You can do `sh scripts/start-stargate-4-tests.sh` to spin up an ephemeral Data API on DSE instance which automatically
creates the required keyspaces and destroys itself on exit.

Then, be sure to set the following vars in `.env` exactly, then run the tests as usual.
```dotenv
APPLICATION_URI=http://localhost:8181
APPLICATION_TOKEN=Cassandra:Y2Fzc2FuZHJh:Y2Fzc2FuZHJh
APPLICATION_ENVIRONMENT=dse
```
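
The `APPLICATION_TOKEN` above decodes to the username and password `cassandra`/`cassandra`: it is the literal prefix `Cassandra:` followed by the base64 of each value. A minimal sketch of building such a token by hand, assuming that same `Cassandra:<base64 user>:<base64 pass>` layout holds for other credentials; `buildCassandraToken` is a hypothetical helper, not part of `astra-db-ts`:

```typescript
// Hypothetical helper: encodes a username/password pair in the same layout
// as the token shown in the .env snippet (an assumption beyond that one example).
function buildCassandraToken(username: string, password: string): string {
  const toBase64 = (value: string): string => Buffer.from(value, 'utf8').toString('base64');
  return `Cassandra:${toBase64(username)}:${toBase64(password)}`;
}

const token = buildCassandraToken('cassandra', 'cassandra');
console.log(token); // Cassandra:Y2Fzc2FuZHJh:Y2Fzc2FuZHJh
```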

### Running tagged tests
Tests can be given certain tags to allow for more granular control over which tests are run. These tags currently include:
- `[long]`/`'LONG'`: Longer running tests that take more than a few seconds to run
- `[admin]`/`'ADMIN'`: Tests that require admin permissions to run
- `[dev]`/`'DEV'`: Tests that require the dev environment to run
- `[prod]`/`'PROD'`: Tests that require the dev environment to run
- `[not-dev]`/`'NOT-DEV'`: Tests that require a non-dev environment to run
- `[vectorize]`/`'VECTORIZE'`: Tests that require a specific vectorize-enabled kube to run

To enable some of these tags, you can set the corresponding environment variables to some values. The env
@@ -77,28 +88,36 @@ To run vectorize tests, you need to have a vectorize-enabled kube running, with
You must create a file, `vectorize_tests.json`, in the root folder, with the following format:

```ts
interface Config {
interface VectorizeTestSpec {
[providerName: string]: {
apiKey?: string,
providerKey?: string,
dimension?: {
[modelNameRegex: string]: number,
},
parameters?: {
[modelName: string]: Record<string, string>
[modelNameRegex: string]: Record<string, string>
},
}
}
```

where:
- `providerName` is the name of the provider (e.g. `nvidia`, `openai`, etc.) as found in `findEmbeddingProviders`
- `apiKey` is the API key for the provider (which will be passed in through the header)
- optional if no header auth test wanted
- `providerKey` is the provider key for the provider (which will be passed in @ collection creation)
- optional if no KMS auth test wanted
- `parameters` is a mapping of model names to their corresponding parameters
- `providerName` is the name of the provider (e.g. `nvidia`, `openai`, etc.) as found in `findEmbeddingProviders`.
- `apiKey` is the API key for the provider (which will be passed in through the header).
- optional if no header auth test wanted.
- `providerKey` is the provider key for the provider (which will be passed in at collection creation).
- optional if no KMS auth test wanted.
- `parameters` is a mapping of model names to their corresponding parameters. The model name can be a regex that partially matches the full model name.
- `"text-embedding-3-small"`, `"3-small"`, and `".*"` will all match `"text-embedding-3-small"`.
- optional if not required. `azureOpenAI`, for example, will need this.
- `dimension` is also a mapping of model name regexes to their corresponding dimensions, like the `parameters` field.
- optional if not required. `huggingfaceDedicated`, for example, will need this.

This file is gitignored by default and will not be checked into VCS.

See `vectorize_credentials.example.json` for—guess what—an example.
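
The partial regex matching described for `parameters` and `dimension` can be sketched as follows. This is only an illustration: `lookupByModel` is a hypothetical helper, the "first matching pattern wins" rule is an assumption, and the spec values are placeholders rather than real credentials or settings.

```typescript
// Mirrors one provider entry of the VectorizeTestSpec shape shown above.
interface ProviderSpec {
  apiKey?: string;
  providerKey?: string;
  dimension?: Record<string, number>;
  parameters?: Record<string, Record<string, string>>;
}

// Hypothetical helper: returns the value of the first key that, read as a regex,
// matches any part of the full model name.
function lookupByModel<T>(table: Record<string, T> | undefined, modelName: string): T | undefined {
  if (!table) return undefined;
  for (const [pattern, value] of Object.entries(table)) {
    if (new RegExp(pattern).test(modelName)) return value;
  }
  return undefined;
}

const spec: ProviderSpec = {
  apiKey: '<api_key>',
  dimension: { '3-small': 512 },
  parameters: { '.*': { someParameter: '<value>' } },
};

console.log(lookupByModel(spec.dimension, 'text-embedding-3-small')); // 512
console.log(lookupByModel(spec.dimension, 'text-embedding-ada-002')); // undefined
```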

### Coverage testing

To run coverage testing, run the following command:
32 changes: 31 additions & 1 deletion README.md
@@ -11,6 +11,7 @@
- [Working with Dates](#working-with-dates)
- [Working with ObjectIds and UUIDs](#working-with-objectids-and-uuids)
- [Monitoring/logging](#monitoringlogging)
- [Non-astra support](#non-astra-support)
- [Non-standard environment support](#non-standard-environment-support)
- [HTTP/2 with minification](#http2-with-minification)
- [Browser support](#browser-support)
@@ -41,7 +42,7 @@ const db = client.db('*ENDPOINT*', { namespace: '*NAMESPACE*' });
const collection = await db.createCollection<Idea>('vector_5_collection', {
vector: {
dimension: 5,
metric: 'cosine'
metric: 'cosine',
},
checkExists: false,
});
@@ -334,6 +335,35 @@ client.on('commandFailed', (event) => {
})();
```

## Non-astra support

`astra-db-ts` officially supports Data API instances using non-Astra backends, such as Data API on DSE or HCD.

However, while support is native, detection is not; you will have to manually declare the environment at times.

```typescript
import { DataAPIClient, UsernamePasswordTokenProvider, DataAPIDbAdmin } from '@datastax/astra-db-ts';

// You'll need to pass in environment to the DataAPIClient when not using Astra
const tp = new UsernamePasswordTokenProvider('*USERNAME*', '*PASSWORD*');
const client = new DataAPIClient(tp, { environment: 'dse' });
const db = client.db('*ENDPOINT*');

// You'll also need to pass it to db.admin() when not using Astra for typing purposes
// If the environment does not match, an error will be thrown as a reminder
const dbAdmin: DataAPIDbAdmin = db.admin({ environment: 'dse' });
dbAdmin.createNamespace(...);
```

The `TokenProvider` class is an extensible concept to allow you to create or even refresh your tokens
as necessary, depending on the Data API backend. Tokens may even be omitted if necessary.

`astra-db-ts` provides two `TokenProvider` implementations by default:
- `StaticTokenProvider` - This unit provider simply regurgitates whatever token was passed into its constructor
- `UsernamePasswordTokenProvider` - Turns a user/pass pair into an appropriate token for DSE/HCD

(See `examples/non-astra-backends` for a full example of this in action.)
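
As a rough sketch of the "create or even refresh your tokens" idea, a custom provider might cache a token and fetch a new one once it expires. Everything here is an assumption made for illustration: `TokenProviderLike` is a local stand-in that only mirrors the `getToken(): Promise<string | nullish>` signature from the API report so the snippet runs on its own, and `RefreshingTokenProvider`, `fetchToken`, and `ttlMs` are hypothetical names, not part of `astra-db-ts`.

```typescript
type nullish = null | undefined;

// Local stand-in for the library's abstract TokenProvider (assumption: same method shape).
abstract class TokenProviderLike {
  abstract getToken(): Promise<string | nullish>;
}

// Hypothetical provider that re-fetches its token after a fixed time-to-live.
class RefreshingTokenProvider extends TokenProviderLike {
  private cached: { token: string; expiresAt: number } | null = null;
  private readonly fetchToken: () => Promise<string>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(fetchToken: () => Promise<string>, ttlMs: number, now: () => number = Date.now) {
    super();
    this.fetchToken = fetchToken;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  async getToken(): Promise<string> {
    const cached = this.cached;
    // Reuse the cached token while it is still fresh.
    if (cached && this.now() < cached.expiresAt) return cached.token;
    const token = await this.fetchToken();
    this.cached = { token, expiresAt: this.now() + this.ttlMs };
    return token;
  }
}
```

In real code the class would extend the library's `TokenProvider` instead of the stand-in, and `fetchToken` would call whatever service issues tokens for the backend in use.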

## Non-standard environment support

`astra-db-ts` is designed foremost to work in Node.js environments.
7 changes: 6 additions & 1 deletion api-extractor.jsonc
@@ -402,7 +402,12 @@
// "addToApiReportFile": false
},

"ae-internal-missing-underscore": {
// "ae-internal-missing-underscore": {
// "logLevel": "none",
// "addToApiReportFile": false
// }

"ae-unresolved-link": {
"logLevel": "none",
"addToApiReportFile": false
}
72 changes: 56 additions & 16 deletions etc/astra-db-ts.api.md
@@ -101,7 +101,7 @@ export class AstraAdmin {
// Warning: (ae-forgotten-export) The symbol "InternalRootClientOpts" needs to be exported by the entry point index.d.ts
//
// @internal
constructor(options: InternalRootClientOpts);
constructor(rootOpts: InternalRootClientOpts, adminOpts?: AdminSpawnOptions);
createDatabase(config: DatabaseConfig, options?: CreateDatabaseOptions): Promise<AstraDbAdmin>;
db(endpoint: string, options?: DbSpawnOptions): Db;
db(id: string, region: string, options?: DbSpawnOptions): Db;
@@ -115,7 +115,7 @@ export class AstraAdmin {
// @public
export class AstraDbAdmin extends DbAdmin {
// @internal
constructor(_db: Db, options: InternalRootClientOpts);
constructor(db: Db, rootOpts: InternalRootClientOpts, adminOpts?: AdminSpawnOptions);
createNamespace(namespace: string, options?: AdminBlockingOptions): Promise<void>;
db(): Db;
drop(options?: AdminBlockingOptions): Promise<void>;
@@ -177,6 +177,7 @@ export class Collection<Schema extends SomeDoc = SomeDoc> {
bulkWrite(operations: AnyBulkWriteOperation<Schema>[], options?: BulkWriteOptions): Promise<BulkWriteResult<Schema>>;
readonly collectionName: string;
countDocuments(filter: Filter<Schema>, upperBound: number, options?: WithTimeout): Promise<number>;
// @deprecated
deleteAll(options?: WithTimeout): Promise<void>;
deleteMany(filter?: Filter<Schema>, options?: WithTimeout): Promise<DeleteManyResult>;
deleteOne(filter?: Filter<Schema>, options?: DeleteOneOptions): Promise<DeleteOneResult>;
@@ -298,6 +299,11 @@ export type CreateDatabaseOptions = AdminBlockingOptions & {
dbOptions?: DbSpawnOptions;
};

// @public
export type CreateNamespaceOptions = AdminBlockingOptions & {
replication?: NamespaceReplicationOptions;
};

// @public
export abstract class CumulativeDataAPIError extends DataAPIResponseError {
readonly partialResult: unknown;
@@ -348,6 +354,7 @@ export interface DataAPIClientOptions {
adminOptions?: AdminSpawnOptions;
caller?: Caller | Caller[];
dbOptions?: DbSpawnOptions;
environment?: DataAPIEnvironment;
httpOptions?: DataAPIHttpOptions;
// @deprecated
preferHttp2?: boolean;
@@ -360,13 +367,29 @@ export type DataAPICommandEvents = {
commandFailed: (event: CommandFailedEvent) => void;
};

// @public
export class DataAPIDbAdmin extends DbAdmin {
// @internal
constructor(db: Db, httpClient: DataAPIHttpClient, adminOpts?: AdminSpawnOptions);
createNamespace(namespace: string, options?: CreateNamespaceOptions): Promise<void>;
db(): Db;
dropNamespace(namespace: string, options?: AdminBlockingOptions): Promise<void>;
listNamespaces(options?: WithTimeout): Promise<string[]>;
}

// @public
export interface DataAPIDetailedErrorDescriptor {
readonly command: Record<string, any>;
readonly errorDescriptors: DataAPIErrorDescriptor[];
readonly rawResponse: RawDataAPIResponse;
}

// @public
export type DataAPIEnvironment = typeof DataAPIEnvironments[number];

// @public
export const DataAPIEnvironments: readonly ["astra", "dse", "hcd", "cassandra", "other"];

// @public
export abstract class DataAPIError extends Error {
}
@@ -483,8 +506,13 @@ export type DateUpdate<Schema> = {
// @public
export class Db {
// @internal
constructor(endpoint: string, options: InternalRootClientOpts);
admin(options?: AdminSpawnOptions): AstraDbAdmin;
constructor(endpoint: string, rootOpts: InternalRootClientOpts, dbOpts: DbSpawnOptions | nullish);
admin(options?: AdminSpawnOptions & {
environment?: 'astra';
}): AstraDbAdmin;
admin(options: AdminSpawnOptions & {
environment: Exclude<DataAPIEnvironment, 'astra'>;
}): DataAPIDbAdmin;
collection<Schema extends SomeDoc = SomeDoc>(name: string, options?: CollectionSpawnOptions): Collection<Schema>;
collections(options?: WithNamespace & WithTimeout): Promise<Collection[]>;
command(command: Record<string, any>, options?: RunCommandOptions): Promise<RawDataAPIResponse>;
@@ -613,12 +641,6 @@ export class DevOpsUnexpectedStateError extends DevOpsAPIError {
export interface DropCollectionOptions extends WithTimeout, WithNamespace {
}

// @public
export class DSEUsernamePasswordTokenProvider extends TokenProvider {
constructor(username: string, password: string);
getTokenAsString(): Promise<string>;
}

// @public
export class FailedToLoadDefaultClientError extends Error {
// @internal
@@ -699,8 +721,10 @@ export class FindCursor<T, TRaw extends SomeDoc = SomeDoc> {
filter(filter: Filter<TRaw>): this;
// @deprecated
forEach(consumer: ((doc: T) => boolean) | ((doc: T) => void)): Promise<void>;
getSortVector(): Promise<number[] | null>;
hasNext(): Promise<boolean>;
includeSimilarity(includeSimilarity?: boolean): this;
includeSortVector(includeSortVector?: boolean): this;
limit(limit: number): this;
map<R>(mapping: (doc: T) => R): FindCursor<R, TRaw>;
get namespace(): string;
@@ -764,6 +788,7 @@ export interface FindOneOptions extends WithTimeout {
// @public
export interface FindOptions {
includeSimilarity?: boolean;
includeSortVector?: boolean;
limit?: number;
projection?: Projection;
skip?: number;
@@ -921,6 +946,15 @@ export interface ModifyResult<Schema extends SomeDoc> {
value: WithId<Schema> | null;
}

// @public
export type NamespaceReplicationOptions = {
class: 'SimpleStrategy';
replicationFactor: number;
} | {
class: 'NetworkTopologyStrategy';
[datacenter: string]: number | 'NetworkTopologyStrategy';
};

// @public
export interface NoBlockingOptions extends WithTimeout {
blocking: false;
@@ -1041,8 +1075,8 @@ export type SortDirection = 1 | -1 | 'asc' | 'desc' | 'ascending' | 'descending'

// @public
export class StaticTokenProvider extends TokenProvider {
constructor(token: string);
getTokenAsString(): Promise<string>;
constructor(token: string | nullish);
getToken(): Promise<string | nullish>;
}

// @public
@@ -1134,9 +1168,9 @@ export type ToDotNotation<Schema extends SomeDoc> = Merge<_ToDotNotation<Schema,

// @public
export abstract class TokenProvider {
abstract getTokenAsString(): Promise<string>;
abstract getToken(): Promise<string | nullish>;
// @internal
static parseToken(token: unknown): TokenProvider | nullish;
static parseToken(token: unknown): TokenProvider;
}

// @public
@@ -1219,6 +1253,12 @@ export interface UpsertedUpdateOptions<Schema extends SomeDoc> {
upsertedId: IdOf<Schema>;
}

// @public
export class UsernamePasswordTokenProvider extends TokenProvider {
constructor(username: string, password: string);
getToken(): Promise<string>;
}

// @public
export class UUID {
constructor(uuid: string, validate?: boolean);
@@ -1244,10 +1284,10 @@ export interface VectorizeDoc {
$vectorize: string;
}

// @alpha
// @public
export interface VectorizeServiceOptions {
authentication?: Record<string, string | undefined>;
modelName: string;
modelName: string | nullish;
parameters?: Record<string, unknown>;
provider: string;
}