ThejasNU/init command (#42)
* add init command

* move local init

* refactor command

* add db init

* add exceptions in other cmds

* fix cli tests

* add init local test

* update README files

* update docs
ThejasNU authored Jan 30, 2025
1 parent e49b888 commit c660cc2
Showing 16 changed files with 401 additions and 87 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -152,6 +152,7 @@ Commands:
execute Search and execute a specific tool.
find Find items from the catalog based on a natural language QUERY string or by name.
index Walk the source directory trees (SOURCE_DIRS) to index source files into the local catalog.
init Initialize the necessary files/collections for local/database catalog.
ls List all indexed tools and/or prompts in the catalog.
publish Upload the local catalog and/or logs to a Couchbase instance.
status Show the status of the local catalog.
22 changes: 17 additions & 5 deletions docs/source/guide.rst
@@ -73,7 +73,11 @@ Both tool builders and prompt builders (i.e., agent builders) will follow this w
Agent Catalog currently integrates with Git (using the working Git SHA) to version each item.
**You must be in a Git repository to use Agent Catalog.**

4. **Indexing**: Use the command below to index your tools/prompts:
4. **Indexing**: Use the commands below to first initialize the local catalog and then index your tools/prompts:

.. code-block:: bash
agentc init local catalog
.. code-block:: bash
@@ -112,16 +116,24 @@ Both tool builders and prompt builders (i.e., agent builders) will follow this w
export AGENT_CATALOG_USERNAME=Administrator
export AGENT_CATALOG_PASSWORD=password
3. Use the command to publish your items to your Couchbase instance.
3. Use the following command to initialize the database catalog.

.. code-block:: bash
agentc publish [[tool|prompt]] --bucket [BUCKET_NAME]
agentc init db catalog --bucket [BUCKET_NAME]
This will create a new scope in the specified bucket called ``agent_catalog``, which will contain all of your
items.
items. It also creates the necessary collections and indexes for the catalog.

4. Use the command to publish your items to your Couchbase instance.

.. code-block:: bash
agentc publish [[tool|prompt]] --bucket [BUCKET_NAME]
This will push all local catalog items to the scope ``agent_catalog`` in the specified bucket.

4. Note that Agent Catalog isn't meant for the "publish once and forget" case.
5. Note that Agent Catalog isn't meant for the "publish once and forget" case.
You are encouraged to run the :command:`agentc publish` command as often as you like to keep your items
up-to-date.
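The order of operations described in this guide (initialize the local catalog, index, initialize the database catalog, publish) can be sketched as a small helper that only assembles the command lines shown above. The helper name `catalog_workflow` and the bucket name `my-bucket` are illustrative assumptions, and the `agentc index` invocation is left out because its arguments are not shown in this excerpt.

```python
import shlex


def catalog_workflow(bucket: str, kind: str = "tool") -> list[list[str]]:
    """Return the agentc command lines from the guide, in the documented order.

    `catalog_workflow` is a hypothetical helper; it builds argument lists only
    and does not run anything.
    """
    if kind not in ("tool", "prompt"):
        raise ValueError(f"kind must be 'tool' or 'prompt', got {kind!r}")
    return [
        ["agentc", "init", "local", "catalog"],
        # Indexing with `agentc index` happens here; its arguments are not
        # shown in this excerpt, so the step is intentionally omitted.
        ["agentc", "init", "db", "catalog", "--bucket", bucket],
        ["agentc", "publish", kind, "--bucket", bucket],
    ]


# Print the commands as copy-pasteable shell lines ("my-bucket" is a placeholder).
for argv in catalog_workflow("my-bucket"):
    print(shlex.join(argv))
```

Keeping the commands as argument lists rather than one shell string avoids quoting problems if a bucket name contains unusual characters.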

1 change: 1 addition & 0 deletions docs/source/install.rst
@@ -138,6 +138,7 @@ some libraries like numpy need to be built, subsequent runs will be faster).
execute Search and execute a specific tool.
find Find items from the catalog based on a natural language QUERY string or by name.
index Walk the source directory trees (SOURCE_DIRS) to index source files into the local catalog.
init Initialize the necessary files/collections for local/database catalog.
ls List all indexed tools and/or prompts in the catalog.
publish Upload the local catalog and/or logs to a Couchbase instance.
status Show the status of the local catalog.
8 changes: 7 additions & 1 deletion libs/agentc/agentc/auditor.py
@@ -113,7 +113,13 @@ def _find_local_log(self) -> typing.Self:
# We have reached the root. We cannot find the catalog folder.
return self
working_path = working_path.parent
(working_path / DEFAULT_ACTIVITY_FOLDER).mkdir(exist_ok=True)

auditor_directory_path = working_path / DEFAULT_ACTIVITY_FOLDER
if not auditor_directory_path.exists():
raise ValueError(
f"Could not find the {DEFAULT_ACTIVITY_FOLDER} folder!\nPlease use 'agentc init' command first.\nExecute 'agentc init --help' for more information."
)

self.auditor_output = working_path / DEFAULT_ACTIVITY_FOLDER / DEFAULT_LLM_ACTIVITY_NAME

return self
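The lookup in this hunk (walk up the parent directories until the catalog folder is found, then require that the activity folder already exists) can be illustrated in isolation. This is a minimal sketch under assumptions: the function `find_activity_dir` is hypothetical, and the folder names `.agent-catalog` and `.agent-activity` are taken from comments elsewhere in this commit rather than from the actual default constants.

```python
from pathlib import Path
from typing import Optional

CATALOG_FOLDER = ".agent-catalog"    # assumed folder name
ACTIVITY_FOLDER = ".agent-activity"  # assumed folder name


def find_activity_dir(start: Path) -> Optional[Path]:
    """Walk up from `start` to the directory that holds the catalog folder.

    Returns None if the filesystem root is reached without finding the
    catalog folder. Raises ValueError if the catalog folder is found but the
    activity folder next to it was never initialized.
    """
    working_path = start.resolve()
    while not (working_path / CATALOG_FOLDER).exists():
        if working_path.parent == working_path:
            return None  # reached the root without finding the catalog folder
        working_path = working_path.parent

    activity_dir = working_path / ACTIVITY_FOLDER
    if not activity_dir.exists():
        # Fail fast instead of silently creating the folder.
        raise ValueError(
            f"Could not find the {ACTIVITY_FOLDER} folder; run 'agentc init' first."
        )
    return activity_dir
```

Raising instead of calling `mkdir` makes initialization an explicit step, which appears to be the intent of this change.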
2 changes: 2 additions & 0 deletions libs/agentc_cli/agentc_cli/cmds/__init__.py
@@ -4,6 +4,7 @@
from .execute import cmd_execute
from .find import cmd_find
from .index import cmd_index
from .init import cmd_init
from .ls import cmd_ls
from .publish import cmd_publish
from .status import cmd_status
@@ -22,4 +23,5 @@
"cmd_version",
"cmd_web",
"cmd_ls",
"cmd_init",
]
7 changes: 5 additions & 2 deletions libs/agentc_cli/agentc_cli/cmds/index.py
@@ -10,7 +10,6 @@
from ..models.context import Context
from .util import DASHES
from .util import KIND_COLORS
from .util import init_local
from agentc_core.catalog import __version__ as CATALOG_SCHEMA_VERSION
from agentc_core.catalog.index import MetaVersion
from agentc_core.catalog.index import index_catalog
@@ -43,7 +42,11 @@ def cmd_index(
# and on how to add .agent-activity/ to the .gitignore file? Or, should
# we instead preemptively generate a .agent-activity/.gitignore
# file during init_local()?
init_local(ctx)

if not os.path.exists(ctx.catalog):
raise RuntimeError(
"Local catalog directory does not exist!\nPlease use 'agentc init' command first.\nExecute 'agentc init --help' for more information."
)
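The behavioral change in this hunk is that indexing no longer creates the catalog directory implicitly; it refuses to run until an init step has created it. A minimal, self-contained sketch of that precondition follows. The names `require_local_catalog` and `init_then_index_demo` are hypothetical, and the directory name is an assumption.

```python
import os
import tempfile


def require_local_catalog(catalog_dir: str) -> None:
    """Raise if the local catalog directory has not been initialized."""
    if not os.path.exists(catalog_dir):
        raise RuntimeError(
            "Local catalog directory does not exist; run 'agentc init' first."
        )


def init_then_index_demo() -> list[str]:
    """Exercise the guard before and after the directory is created."""
    events = []
    with tempfile.TemporaryDirectory() as tmp:
        catalog_dir = os.path.join(tmp, ".agent-catalog")  # assumed name
        try:
            require_local_catalog(catalog_dir)
        except RuntimeError:
            events.append("index refused: not initialized")
        os.makedirs(catalog_dir, exist_ok=True)  # what an init step would do
        require_local_catalog(catalog_dir)       # now passes silently
        events.append("index allowed")
    return events
```

A test for the CLI could follow the same shape: assert that indexing fails in a fresh directory, run the init step, then assert that indexing proceeds.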

# TODO: One day, maybe allow users to choose a different branch instead of assuming
# the HEAD branch, as users currently would have to 'git checkout BRANCH_THEY_WANT'
45 changes: 45 additions & 0 deletions libs/agentc_cli/agentc_cli/cmds/init.py
@@ -0,0 +1,45 @@
import typing

from ..models import Context
from .util import init_db_auditor
from .util import init_db_catalog
from .util import init_local_activity
from .util import init_local_catalog
from agentc_core.util.models import CouchbaseConnect
from agentc_core.util.models import Keyspace
from agentc_core.util.publish import get_connection

func_mappings = {"local": {"catalog": init_local_catalog, "auditor": init_local_activity}}


def cmd_init(
    ctx: Context,
    # Where to initialize: the local filesystem, a Couchbase instance, or both.
    catalog_type: typing.List[typing.Literal["local", "db"]],
    # What to initialize: the catalog, the auditor, or both.
    type_metadata: typing.List[typing.Literal["catalog", "auditor"]],
    connection_details_env: typing.Optional[CouchbaseConnect] = None,
    keyspace_details: typing.Optional[Keyspace] = None,
):
    if ctx is None:
        ctx = Context()
    initialize_local = "local" in catalog_type
    initialize_db = "db" in catalog_type
    initialize_catalog = "catalog" in type_metadata
    initialize_auditor = "auditor" in type_metadata

    if initialize_local:
        if initialize_catalog:
            init_local_catalog(ctx)
        if initialize_auditor:
            init_local_activity(ctx)

    if initialize_db:
        # Get bucket ref
        err, cluster = get_connection(conn=connection_details_env)
        if err:
            raise ValueError(f"Unable to connect to Couchbase!\n{err}")

        if initialize_catalog:
            init_db_catalog(ctx, cluster, keyspace_details, connection_details_env)

        if initialize_auditor:
            init_db_auditor(ctx, cluster, keyspace_details)
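The module defines a `func_mappings` table but dispatches through nested `if` statements. One possible way to drive the same selection from a lookup table is sketched below. It is only an illustration: the initializers are placeholder lambdas that return a description instead of touching the filesystem or a database, and `make_registry` and `run_init` are hypothetical names.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Initializer = Callable[[], str]


def make_registry() -> Dict[Tuple[str, str], Initializer]:
    """Map (target, kind) pairs to placeholder initializers."""
    return {
        ("local", "catalog"): lambda: "created local catalog directory",
        ("local", "auditor"): lambda: "created local activity directory",
        ("db", "catalog"): lambda: "created catalog collections and indexes",
        ("db", "auditor"): lambda: "created auditor collection and UDFs",
    }


def run_init(targets: Iterable[str], kinds: Iterable[str]) -> List[str]:
    """Run every initializer selected by the cross product of targets and kinds."""
    registry = make_registry()
    kinds = list(kinds)  # iterated once per target, so materialize it
    results = []
    for target in targets:
        for kind in kinds:
            try:
                initializer = registry[(target, kind)]
            except KeyError:
                raise ValueError(f"Unsupported combination: {target} {kind}") from None
            results.append(initializer())
    return results
```

With targets ordered as `["local", "db"]`, the cross product visits the combinations in the same order as the nested conditionals: local catalog, local auditor, database catalog, database auditor. A real version would also need to open the Couchbase connection once before running any `db` initializer.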
45 changes: 9 additions & 36 deletions libs/agentc_cli/agentc_cli/cmds/publish.py
@@ -17,12 +17,10 @@
from agentc_core.defaults import DEFAULT_CATALOG_SCOPE
from agentc_core.defaults import DEFAULT_LLM_ACTIVITY_NAME
from agentc_core.defaults import DEFAULT_META_COLLECTION_NAME
from agentc_core.util.ddl import create_gsi_indexes
from agentc_core.util.ddl import create_vector_index
from agentc_core.util.ddl import check_if_scope_collection_exist
from agentc_core.util.models import CouchbaseConnect
from agentc_core.util.models import CustomPublishEncoder
from agentc_core.util.models import Keyspace
from agentc_core.util.publish import create_scope_and_collection
from agentc_core.util.publish import get_connection
from couchbase.exceptions import CouchbaseException
from pydantic import ValidationError
@@ -95,14 +93,12 @@ def cmd_publish(

logger.debug("%d logs found..\n", len(log_messages))

bucket_manager = cb.collections()

log_col = DEFAULT_AUDIT_COLLECTION
log_scope = DEFAULT_AUDIT_SCOPE
try:
(msg, err) = create_scope_and_collection(bucket_manager, scope=log_scope, collection=log_col)
except:
raise ValueError(msg) from err

bucket_manager = cb.collections()

check_if_scope_collection_exist(bucket_manager, log_scope, log_col, True)

# get collection ref
cb_coll = cb.scope(log_scope).collection(log_col)
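After this change, publishing only verifies that the target scope and collection exist instead of creating them. The implementation of `check_if_scope_collection_exist` is not shown in this excerpt, so the following is a hedged sketch of what such a check might look like. It assumes the manager exposes `get_all_scopes()` returning objects with a `name` and a `collections` iterable, as the Couchbase Python SDK collection manager does, and it assumes the fourth argument means "raise on failure". The function name `scope_collection_exists` and the collection name `tool_catalog` are made up for illustration.

```python
from types import SimpleNamespace


def scope_collection_exists(
    bucket_manager, scope: str, collection: str, raise_exception: bool = False
) -> bool:
    """Return True if `scope.collection` exists according to the bucket manager.

    `bucket_manager` only needs a get_all_scopes() method whose results have
    a `name` attribute and a `collections` iterable of objects with `name`.
    """
    for scope_spec in bucket_manager.get_all_scopes():
        if scope_spec.name != scope:
            continue
        if any(c.name == collection for c in scope_spec.collections):
            return True
    if raise_exception:
        raise ValueError(
            f"Collection {scope}.{collection} does not exist; run 'agentc init' first."
        )
    return False


# A fake manager for demonstration; a real one would come from bucket.collections().
fake_manager = SimpleNamespace(
    get_all_scopes=lambda: [
        SimpleNamespace(
            name="agent_catalog",
            collections=[SimpleNamespace(name="tool_catalog")],  # hypothetical name
        )
    ]
)
print(scope_collection_exists(fake_manager, "agent_catalog", "tool_catalog"))
```

Duck typing the manager keeps the sketch runnable without a Couchbase cluster and makes the check easy to unit test.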
@@ -149,9 +145,8 @@ def cmd_publish(
# ---------------------------------------------------------------------------------------- #
meta_col = k + DEFAULT_META_COLLECTION_NAME
meta_scope = scope
(msg, err) = create_scope_and_collection(bucket_manager, scope=meta_scope, collection=meta_col)
if err is not None:
raise ValueError(msg)

check_if_scope_collection_exist(bucket_manager, meta_scope, meta_col, True)

# get collection ref
cb_coll = cb.scope(meta_scope).collection(meta_col)
@@ -178,9 +173,8 @@ def cmd_publish(
# ---------------------------------------------------------------------------------------- #
catalog_col = k + DEFAULT_CATALOG_COLLECTION_NAME
catalog_scope = scope
(msg, err) = create_scope_and_collection(bucket_manager, scope=catalog_scope, collection=catalog_col)
if err is not None:
raise ValueError(msg)

check_if_scope_collection_exist(bucket_manager, catalog_scope, catalog_col, True)

# get collection ref
cb_coll = cb.scope(catalog_scope).collection(catalog_col)
@@ -206,24 +200,3 @@ def cmd_publish(
click.secho(f"Couldn't insert catalog items!\n{e.message}", fg="red")
return e
click.secho(f"{k.capitalize()} catalog items successfully uploaded to Couchbase!\n", fg="green")

# ---------------------------------------------------------------------------------------- #
# GSI and Vector Indexes #
# ---------------------------------------------------------------------------------------- #
click.secho(f"Now building the GSI indexes for the {k} catalog.", fg="yellow")
s, err = create_gsi_indexes(bucket, cluster, k, True)
if not s:
raise ValueError(f"GSI indexes could not be created \n{err}")
else:
click.secho(f"All GSI indexes for the {k} catalog have been successfully created!\n", fg="green")
logger.debug("Indexes created successfully!")

click.secho(f"Now building the vector index for the {k} catalog.", fg="yellow")
dims = len(catalog_desc.items[0].embedding)
_, err = create_vector_index(bucket, k, connection_details_env, dims)
if err is not None:
raise ValueError(f"Vector index could not be created \n{err}")
else:
click.secho(f"Vector index for the {k} catalog has been successfully created!", fg="green")
logger.debug("Vector index created successfully!")
click.secho(DASHES, fg=KIND_COLORS[k])
116 changes: 108 additions & 8 deletions libs/agentc_cli/agentc_cli/cmds/util.py
@@ -10,20 +10,34 @@
import typing

from ..models.context import Context
from agentc_core.analytics.create import create_analytics_udfs
from agentc_core.catalog import CatalogChain
from agentc_core.catalog import CatalogDB
from agentc_core.catalog import CatalogMem
from agentc_core.catalog import __version__ as CATALOG_SCHEMA_VERSION
from agentc_core.catalog.descriptor import CatalogDescriptor
from agentc_core.catalog.index import MetaVersion
from agentc_core.catalog.index import index_catalog
from agentc_core.catalog.version import lib_version
from agentc_core.defaults import DEFAULT_AUDIT_COLLECTION
from agentc_core.defaults import DEFAULT_AUDIT_SCOPE
from agentc_core.defaults import DEFAULT_CATALOG_COLLECTION_NAME
from agentc_core.defaults import DEFAULT_CATALOG_NAME
from agentc_core.defaults import DEFAULT_MAX_ERRS
from agentc_core.defaults import DEFAULT_META_COLLECTION_NAME
from agentc_core.defaults import DEFAULT_SCAN_DIRECTORY_OPTS
from agentc_core.learned.embedding import EmbeddingModel
from agentc_core.util.ddl import create_gsi_indexes
from agentc_core.util.ddl import create_vector_index
from agentc_core.util.models import CouchbaseConnect
from agentc_core.util.models import Keyspace
from agentc_core.util.publish import create_scope_and_collection
from agentc_core.version import VersionDescriptor
from couchbase.cluster import Cluster
from couchbase.exceptions import CouchbaseException

# The following are used for colorizing output.
CATALOG_KINDS = ["prompt", "tool"]
LEVEL_COLORS = {"good": "green", "warn": "yellow", "error": "red"}
KIND_COLORS = {"tool": "bright_magenta", "prompt": "blue", "log": "cyan"}
try:
@@ -35,14 +49,6 @@
logger = logging.getLogger(__name__)


def init_local(ctx: Context):
# Init directories.
os.makedirs(ctx.catalog, exist_ok=True)
os.makedirs(ctx.activity, exist_ok=True)

# (Note: the version checking logic has been moved into index).


def load_repository(top_dir: pathlib.Path = None):
# The repo is the user's application's repo and is NOT the repo
# of agentc_core. The agentc CLI / library should be run in
@@ -173,3 +179,97 @@ def logging_printer(content: str, *args, **kwargs):
# the pattern similar to repo_load()'s searching for a .git/ directory
# and scan up the parent directories to find the first .agent-catalog/
# subdirectory?


def init_local_catalog(ctx: Context):
    # Init directories.
    os.makedirs(ctx.catalog, exist_ok=True)


def init_local_activity(ctx: Context):
    # Init directories.
    os.makedirs(ctx.activity, exist_ok=True)


def init_db_catalog(
    ctx: Context, cluster: Cluster, keyspace_details: Keyspace, connection_details_env: CouchbaseConnect
):
    # Get the bucket manager
    cb = cluster.bucket(keyspace_details.bucket)
    bucket_manager = cb.collections()

    # ---------------------------------------------------------------------------------------- #
    #                               SCOPES and COLLECTIONS                                     #
    # ---------------------------------------------------------------------------------------- #
    for kind in CATALOG_KINDS:
        # Create the metadata collection if it does not exist
        click.secho(f"Now creating scope and collections for the {kind} catalog.", fg="yellow")
        meta_col = kind + DEFAULT_META_COLLECTION_NAME
        (msg, err) = create_scope_and_collection(bucket_manager, scope=keyspace_details.scope, collection=meta_col)
        if err is not None:
            raise ValueError(msg)
        else:
            click.secho(f"Metadata collection for the {kind} catalog has been successfully created!\n", fg="green")

        # Create the catalog collection if it does not exist
        click.secho(f"Now creating the catalog collection for the {kind} catalog.", fg="yellow")
        catalog_col = kind + DEFAULT_CATALOG_COLLECTION_NAME
        (msg, err) = create_scope_and_collection(bucket_manager, scope=keyspace_details.scope, collection=catalog_col)
        if err is not None:
            raise ValueError(msg)
        else:
            click.secho(f"Catalog collection for the {kind} catalog has been successfully created!\n", fg="green")

    # ---------------------------------------------------------------------------------------- #
    #                               GSI and Vector Indexes                                     #
    # ---------------------------------------------------------------------------------------- #
    for kind in CATALOG_KINDS:
        click.secho(f"Now building the GSI indexes for the {kind} catalog.", fg="yellow")
        completion_status, err = create_gsi_indexes(keyspace_details.bucket, cluster, kind, True)
        if not completion_status:
            raise ValueError(f"GSI indexes could not be created \n{err}")
        else:
            click.secho(f"All GSI indexes for the {kind} catalog have been successfully created!\n", fg="green")

        click.secho(f"Now building the vector index for the {kind} catalog.", fg="yellow")
        catalog_path = pathlib.Path(ctx.catalog) / (kind + DEFAULT_CATALOG_NAME)

        try:
            with catalog_path.open("r") as fp:
                catalog_desc = CatalogDescriptor.model_validate_json(fp.read())
        except FileNotFoundError:
            click.secho(
                f"Unable to create vector index for {kind} catalog because dimension of vector can't be determined!\nInitialize the local catalog first, index items and try initializing the db catalog again.\n",
                fg="red",
            )
            continue

        dims = len(catalog_desc.items[0].embedding)
        _, err = create_vector_index(keyspace_details.bucket, kind, connection_details_env, dims)
        if err is not None:
            raise ValueError(f"Vector index could not be created \n{err}")
        else:
            click.secho(f"Vector index for the {kind} catalog has been successfully created!\n", fg="green")
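The vector index needs the embedding dimension, which this function reads from the first item of the locally indexed catalog file. A minimal sketch of that lookup follows, using plain `json` instead of the project's `CatalogDescriptor` model. The JSON shape (`{"items": [{"embedding": [...]}]}`) is an assumption inferred from `catalog_desc.items[0].embedding`, and `embedding_dims` is a hypothetical helper. Unlike the code above, the sketch also treats a catalog with no items as "dimension unknown" rather than indexing into an empty list.

```python
import json
from pathlib import Path
from typing import Optional


def embedding_dims(catalog_path: Path) -> Optional[int]:
    """Return the embedding length of the first catalog item, or None.

    None means the dimension cannot be determined: either the catalog file is
    missing (nothing has been indexed yet) or it contains no items.
    """
    try:
        data = json.loads(catalog_path.read_text())
    except FileNotFoundError:
        return None
    items = data.get("items") or []
    if not items:
        return None
    # Assumes every item was embedded with the same model, so the first
    # item's vector length is representative.
    return len(items[0]["embedding"])
```

A caller could skip vector index creation and print a hint to run the init and index steps whenever this returns `None`.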


def init_db_auditor(ctx: Context, cluster: Cluster, keyspace_details: Keyspace):
    # Get the bucket manager
    cb = cluster.bucket(keyspace_details.bucket)
    bucket_manager = cb.collections()

    log_col = DEFAULT_AUDIT_COLLECTION
    log_scope = DEFAULT_AUDIT_SCOPE
    click.secho("Now creating scope and collections for the auditor.", fg="yellow")
    (msg, err) = create_scope_and_collection(bucket_manager, scope=log_scope, collection=log_col)
    if err is not None:
        raise ValueError(msg)
    else:
        click.secho("Scope and collection for the auditor have been successfully created!\n", fg="green")

    click.secho("Now creating the analytics UDFs for the auditor.", fg="yellow")
    try:
        create_analytics_udfs(cluster, keyspace_details.bucket)
        click.secho("All analytics UDFs for the auditor have been successfully created!\n", fg="green")
    except CouchbaseException as e:
        click.secho("Analytics views could not be created.", fg="red")
        logger.warning("Analytics views could not be created: %s", e)