0.3.0 #50

Merged
merged 25 commits into from
Nov 2, 2023
Commits
4ad48c0
allow scoring with features only (gradients deleted)
kristian-georgiev Oct 31, 2023
4c8d3d0
bump version
kristian-georgiev Oct 31, 2023
3141015
migrate to black codestyle
kristian-georgiev Oct 31, 2023
bf99c24
ignore codestyle changes for git blame
kristian-georgiev Oct 31, 2023
c2e52ef
update scores_finalized in JSON file
kristian-georgiev Oct 31, 2023
c788f84
--author=Alaa Khaddaj <[email protected]>
kristian-georgiev Nov 1, 2023
16e9d46
blockwise scoring (relevant when scoring large datasets, i.e. many ta…
kristian-georgiev Nov 1, 2023
33175ef
Merge branch '0.3.0' of github.com:MadryLab/trak into 0.3.0
kristian-georgiev Nov 1, 2023
9a8be10
move vectorize f-n to projectors; fast projector (incoming feature) w…
kristian-georgiev Nov 1, 2023
cbd3ecb
module docstrings
kristian-georgiev Nov 1, 2023
62426eb
save on I/O overhead by only writing once to disk when scoring
kristian-georgiev Nov 1, 2023
259f087
custom CudaProjector for large models to avoid overflow error in CUDA…
kristian-georgiev Nov 2, 2023
55cb9d1
allow computing TRAK with respect to specified parameter groups
kristian-georgiev Nov 2, 2023
575c838
no cudatoolkit in GH workflow (no disk space on new instances)
kristian-georgiev Nov 2, 2023
8f30687
update requirements for testing
kristian-georgiev Nov 2, 2023
72d73a2
fix cuda bug in unit test
kristian-georgiev Nov 2, 2023
be57466
fix cuda bug in unit test part 2
kristian-georgiev Nov 2, 2023
d9b292c
fix cuda bug in unit test part 3 oops
kristian-georgiev Nov 2, 2023
fa5bfc9
fix cuda bug in unit test part 4 oops
kristian-georgiev Nov 2, 2023
77b81fd
only pin memory if using cuda
kristian-georgiev Nov 2, 2023
1f415aa
make type hints compatible with python 3.8
kristian-georgiev Nov 2, 2023
102edbf
update docs
kristian-georgiev Nov 2, 2023
3c0696a
make type hints compatible with python 3.8
kristian-georgiev Nov 2, 2023
61d4f07
make type hints compatible with python 3.8
kristian-georgiev Nov 2, 2023
789e1cb
make type hints compatible with python 3.8
kristian-georgiev Nov 2, 2023
3 changes: 3 additions & 0 deletions .git-blame-ignore-revs
@@ -0,0 +1,3 @@
# https://black.readthedocs.io/en/stable/guides/introducing_black_to_your_project.html
# Migrate code style to Black
3141015f3687dc11c311f1270c7dff80f1299fe3
5 changes: 0 additions & 5 deletions .github/workflows/python-package.yml
@@ -33,11 +33,6 @@ jobs:
uses: actions/setup-python@v3
with:
python-version: ${{ matrix.python-version }}
- name: cuda-toolkit
uses: Jimver/[email protected]
id: cuda-toolkit
with:
cuda: '11.7.0'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
3 changes: 3 additions & 0 deletions README.md
@@ -1,6 +1,9 @@
[![arXiv](https://img.shields.io/badge/arXiv-2303.14186-b31b1b.svg?style=flat-square)](https://arxiv.org/abs/2303.14186)
[![PyPI version](https://badge.fury.io/py/traker.svg)](https://badge.fury.io/py/traker)
[![Documentation Status](https://readthedocs.org/projects/trak/badge/?version=latest)](https://trak.readthedocs.io/en/latest/?badge=latest)
[![Code style:
black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)


# TRAK: Attributing Model Behavior at Scale

87 changes: 57 additions & 30 deletions docs/source/bert.rst
@@ -63,7 +63,7 @@ to fit our API signatures.

We slightly redefine the :code:`forward` function so that we can pass in the inputs (:code:`input_ids`, etc.) as positional arguments instead of as keyword arguments.

For data loading, we adapt the code from Hugging Face example:
For data loading, we adapt the code from the HuggingFace example:

.. raw:: html

@@ -132,7 +132,7 @@ For data loading, we adapt the code from Hugging Face example:

# NOTE: CHANGE THIS IF YOU WANT TO RUN ON FULL DATASET
TRAIN_SET_SIZE = 5_000
VAL_SET_SIZE = 1_00
VAL_SET_SIZE = 10

def init_loaders(batch_size=16):
ds_train = get_dataset('train')
@@ -180,38 +180,59 @@ The model output function is implemented as follows:

.. code-block:: python

def get_output(func_model,
weights: Iterable[Tensor],
buffers: Iterable[Tensor],
input_id: Tensor,
token_type_id: Tensor,
attention_mask: Tensor,
label: Tensor,
) -> Tensor:
logits = func_model(weights, buffers, input_id.unsqueeze(0),
token_type_id.unsqueeze(0),
attention_mask.unsqueeze(0))
def get_output(
model,
weights: Iterable[Tensor],
buffers: Iterable[Tensor],
input_id: Tensor,
token_type_id: Tensor,
attention_mask: Tensor,
label: Tensor,
) -> Tensor:
kw_inputs = {
"input_ids": input_id.unsqueeze(0),
"token_type_ids": token_type_id.unsqueeze(0),
"attention_mask": attention_mask.unsqueeze(0),
}

logits = ch.func.functional_call(
model, (weights, buffers), args=(), kwargs=kw_inputs
)
bindex = ch.arange(logits.shape[0]).to(logits.device, non_blocking=False)
logits_correct = logits[bindex, label.unsqueeze(0)]

cloned_logits = logits.clone()
cloned_logits[bindex, label.unsqueeze(0)] = ch.tensor(-ch.inf).to(logits.device)
cloned_logits[bindex, label.unsqueeze(0)] = ch.tensor(
-ch.inf, device=logits.device, dtype=logits.dtype
)

margins = logits_correct - cloned_logits.logsumexp(dim=-1)
return margins.sum()

The implementation is identical to the standard classification example in :ref:`MODELOUTPUT tutorial`,
except here the signature of the method and the :code:`func_model` is slightly different
as the language model takes in three inputs instead of just one.
The implementation is identical to the standard classification example in
:ref:`MODELOUTPUT tutorial`, except here the signature of the method and the
:code:`func_model` is slightly different as the language model takes in three
inputs instead of just one.
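
As a rough illustration of the pattern above, the following sketch calls a small module through :code:`ch.func.functional_call` with keyword inputs and then computes the same margin quantity. :code:`TinyClassifier`, its layer sizes, and the helper name :code:`margin` are hypothetical stand-ins chosen for this example and are not part of TRAK or the tutorial; only the call pattern mirrors the code above.

```python
import torch as ch


class TinyClassifier(ch.nn.Module):
    """Hypothetical stand-in for a model whose forward takes keyword inputs."""

    def __init__(self):
        super().__init__()
        self.linear = ch.nn.Linear(4, 3)

    def forward(self, input_ids, attention_mask):
        # Mask the inputs, then map them to 3 class logits.
        return self.linear(input_ids * attention_mask)


def margin(logits, labels):
    """Correct-class logit minus the logsumexp over the remaining logits."""
    bindex = ch.arange(logits.shape[0])
    logits_correct = logits[bindex, labels]
    cloned_logits = logits.clone()
    # Exclude the correct class from the logsumexp by setting it to -inf.
    cloned_logits[bindex, labels] = float("-inf")
    return logits_correct - cloned_logits.logsumexp(dim=-1)


model = TinyClassifier()
weights = dict(model.named_parameters())
buffers = dict(model.named_buffers())
kw_inputs = {
    "input_ids": ch.ones(1, 4),
    "attention_mask": ch.ones(1, 4),
}

# Stateless call: parameters and buffers are passed in explicitly.
logits = ch.func.functional_call(
    model, (weights, buffers), args=(), kwargs=kw_inputs
)
margins = margin(logits, ch.tensor([0]))
```

Passing the inputs through :code:`kwargs` means the module's :code:`forward` does not need to be redefined to accept positional arguments, which is presumably why the updated code builds a :code:`kw_inputs` dictionary.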

Similarly, the gradient function is implemented as follows:

.. code-block:: python

def get_out_to_loss_grad(self, func_model, weights, buffers, batch: Iterable[Tensor]) -> Tensor:
def get_out_to_loss_grad(
self, model, weights, buffers, batch: Iterable[Tensor]
) -> Tensor:
input_ids, token_type_ids, attention_mask, labels = batch
logits = func_model(weights, buffers, input_ids, token_type_ids, attention_mask)
ps = self.softmax(logits / self.loss_temperature)[ch.arange(logits.size(0)), labels]
kw_inputs = {
"input_ids": input_ids,
"token_type_ids": token_type_ids,
"attention_mask": attention_mask,
}
logits = ch.func.functional_call(
model, (weights, buffers), args=(), kwargs=kw_inputs
)
ps = self.softmax(logits / self.loss_temperature)[
ch.arange(logits.size(0)), labels
]
return (1 - ps).clone().detach().unsqueeze(-1)
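
To make the returned quantity concrete, here is a minimal standalone sketch of the same computation: the softmax probability of the correct class is read off for each example, and the function returns one minus that probability. The free function :code:`out_to_loss_grad` and its :code:`temperature` argument are illustrative names for this sketch; the tutorial's version is a method that uses :code:`self.softmax` and :code:`self.loss_temperature`.

```python
import torch as ch


def out_to_loss_grad(logits, labels, temperature=1.0):
    """Return 1 - p_correct for each example, shaped (batch_size, 1)."""
    probs = ch.softmax(logits / temperature, dim=-1)
    # Pick out the probability assigned to the correct class per example.
    ps = probs[ch.arange(logits.size(0)), labels]
    # Detach so the result is treated as a constant, as in the method above.
    return (1 - ps).clone().detach().unsqueeze(-1)


# Uniform logits over 4 classes give p_correct = 0.25 for every example.
logits = ch.zeros(2, 4)
labels = ch.tensor([0, 3])
grads = out_to_loss_grad(logits, labels)
```

With uniform logits the result is 0.75 for both examples, and it shrinks toward zero as the model becomes more confident in the correct class.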

Putting it together
@@ -221,12 +242,14 @@ Using the above :code:`TextClassificationModelOutput` implementation, we can com

.. code-block:: python

traker = TRAKer(model=model,
task=TextClassificationModelOutput, # you can also just pass in "text_classification"
train_set_size=TRAIN_SET_SIZE,
save_dir=args.out,
device=device,
proj_dim=1024)
traker = TRAKer(
model=model,
task=TextClassificationModelOutput, # you can also just pass in "text_classification"
train_set_size=TRAIN_SET_SIZE,
save_dir=SAVE_DIR,
device=DEVICE,
proj_dim=1024,
)

def process_batch(batch):
return batch['input_ids'], batch['token_type_ids'], batch['attention_mask'], batch['labels']
@@ -235,18 +258,21 @@ Using the above :code:`TextClassificationModelOutput` implementation, we can com
for batch in tqdm(loader_train, desc='Featurizing..'):
# process batch into compatible form for TRAKer TextClassificationModelOutput
batch = process_batch(batch)
batch = [x.cuda() for x in batch]
batch = [x.to(DEVICE) for x in batch]
traker.featurize(batch=batch, num_samples=batch[0].shape[0])

traker.finalize_features()

traker.start_scoring_checkpoint(model.state_dict(), model_id=0, num_targets=VAL_SET_SIZE)
traker.start_scoring_checkpoint(exp_name='qnli',
checkpoint=model.state_dict(),
model_id=0,
num_targets=VAL_SET_SIZE)
for batch in tqdm(loader_val, desc='Scoring..'):
batch = process_batch(batch)
batch = [x.cuda() for x in batch]
traker.score(batch=batch, num_samples=batch[0].shape[0])

scores = traker.finalize_scores()
scores = traker.finalize_scores(exp_name='qnli')

We use :code:`process_batch` to transform the batch from dictionary (which is the form used by Hugging Face dataloaders) to a tuple.
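
A small self-contained sketch of that conversion follows, using plain Python lists in place of tensors so that it runs without any dataset. The :code:`BATCH_KEYS` constant and the toy batch contents are assumptions made for illustration only.

```python
# Order in which the model output functions above expect their inputs.
BATCH_KEYS = ("input_ids", "token_type_ids", "attention_mask", "labels")


def process_batch(batch):
    """Turn a dictionary-style batch into a tuple in a fixed order."""
    return tuple(batch[key] for key in BATCH_KEYS)


# Toy batch with two examples; real batches hold tensors instead of lists.
toy_batch = {
    "labels": [1, 0],
    "input_ids": [[101, 2054], [101, 2129]],
    "attention_mask": [[1, 1], [1, 1]],
    "token_type_ids": [[0, 0], [0, 1]],
}

input_ids, token_type_ids, attention_mask, labels = process_batch(toy_batch)
num_samples = len(input_ids)  # analogous to batch[0].shape[0] for tensors
```

Fixing the order in one place keeps the tuple aligned with the argument order of :code:`get_output` and :code:`get_out_to_loss_grad`, regardless of the key order in the incoming dictionary.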

@@ -256,4 +282,5 @@ That's all! You can find this tutorial as a complete script in `here <https://gi
Extending to other tasks
----------------------------------

For a more involved example that is *not* classification, see :ref:`CLIP tutorial`.
For a more involved example that is *not* classification, see :ref:`CLIP
tutorial`.
19 changes: 12 additions & 7 deletions docs/source/clip.rst
@@ -61,6 +61,9 @@ Now we are ready to implement :meth:`.CLIPModelOutput.get_output`:
buffers: Iterable[Tensor],
image: Tensor,
label: Tensor):
# tailored for open_clip
# https://github.com/mlfoundations/open_clip/blob/fb72f4db1b17133befd6c67c9cf32a533b85a321/src/open_clip/model.py#L242-L245
clip_inputs = {"image": image.unsqueeze(0), "text": label.unsqueeze(0)}
image_embeddings, text_embeddings, _ = ch.func.functional_call(model,
(weights, buffers),
args=(),
@@ -116,24 +119,26 @@ Using the above :code:`CLIPModelOutput` implementation, we can compute
device=device,
proj_dim=1024)

traker.task.get_embeddings(model, loader_train, batch_size=...,
traker.task.get_embeddings(model, ds_train, batch_size=1, size=600, embedding_dim=1024,
preprocess_fn_img=lambda x: preprocess(x).to(device).unsqueeze(0),
preprocess_fn_txt=lambda x: tokenizer(x[0]).to(device))

traker.load_checkpoint(model.state_dict(), model_id=0)
for batch in tqdm(loader_train, desc='Featurizing...'):
batch = [x.cuda() for x in batch]
traker.featurize(batch=batch, num_samples=batch[0].shape[0])
for (img, captions) in tqdm(loader_train, desc='Featurizing...'):
x = preprocess(img).to('cuda').unsqueeze(0)
y = tokenizer(captions).to('cuda')
traker.featurize(batch=(x, y), num_samples=x.shape[0])

traker.finalize_features()

traker.start_scoring_checkpoint(exp_name='clip_example',
checkpoint=model.state_dict(),
model_id=0,
num_targets=VAL_SET_SIZE)
for batch in tqdm(loader_val, desc='Scoring...'):
batch = [x.cuda() for x in batch]
traker.score(batch=batch, num_samples=batch[0].shape[0])
for (img, captions) in tqdm(loader_val, desc='Scoring...'):
x = preprocess(img).to('cuda').unsqueeze(0)
y = tokenizer(captions).to('cuda')
traker.score(batch=(x, y), num_samples=x.shape[0])

scores = traker.finalize_scores(exp_name='clip_example')

6 changes: 4 additions & 2 deletions docs/source/conf.py
@@ -22,8 +22,8 @@
author = 'Kristian Georgiev'

# The full version, including alpha/beta/rc tags
release = '0.2.2'
version = '0.2.2'
release = '0.3.0'
version = '0.3.0'


# -- General configuration ---------------------------------------------------
@@ -46,11 +46,13 @@
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']


def skip(app, what, name, obj, would_skip, options):
if name == "__init__":
return False
return would_skip


def setup(app):
app.connect("autodoc-skip-member", skip)

3 changes: 0 additions & 3 deletions docs/source/modeloutput.rst
@@ -80,9 +80,6 @@ the :code:`task` when instantiating :class:`.TRAKer`:
def get_output(...):
# Implement

def forward(...):
# Implement

def get_out_to_loss_grad(...):
# Implement

54 changes: 28 additions & 26 deletions setup.py
@@ -1,29 +1,31 @@
#!/usr/bin/env python
from setuptools import setup

setup(name="traker",
version="0.2.2",
description="TRAK: Attributing Model Behavior at Scale",
long_description="Check https://trak.csail.mit.edu/ to learn more about TRAK",
author="MadryLab",
author_email='[email protected]',
license_files=('LICENSE.txt', ),
packages=['trak'],
install_requires=[
"torch>=2.0.0",
"numpy",
"tqdm",
],
extras_require={
'tests':
["assertpy",
"torchvision",
"open_clip_torch",
"wget",
"scipy",
],
'fast':
["fast_jl"
]},
include_package_data=True,
)
setup(
name="traker",
version="0.3.0",
description="TRAK: Attributing Model Behavior at Scale",
long_description="Check https://trak.csail.mit.edu/ to learn more about TRAK",
author="MadryLab",
author_email="[email protected]",
license_files=("LICENSE.txt",),
packages=["trak"],
install_requires=[
"torch>=2.0.0",
"numpy",
"tqdm",
],
extras_require={
"tests": [
"assertpy",
"torchvision",
"open_clip_torch",
"wget",
"scipy",
"datasets",
"transformers",
],
"fast": ["fast_jl"],
},
include_package_data=True,
)
24 changes: 12 additions & 12 deletions tests/autocast.py
@@ -23,33 +23,33 @@ def compute_loss_autocast(params, inputs, targets):

print("1. Without autocast")
grads = ch.func.grad(compute_loss)(weights, inputs, targets)
print(f'grads are {grads}')
print(f"grads are {grads}")
print(f"grads dtype: {grads['weight'].dtype}")
print('='*50)
print("=" * 50)

inputs = inputs.half()
targets = targets.half()

print('2. With autocast for forward pass')
print("2. With autocast for forward pass")
grads = ch.func.grad(compute_loss_autocast)(weights, inputs, targets)
print(f'grads are {grads}')
print(f"grads are {grads}")
print(f"grads dtype: {grads['weight'].dtype}")
print('='*50)
print("=" * 50)

print('3. With autocast for forward pass and backward pass')
print("3. With autocast for forward pass and backward pass")
with autocast(device_type="cuda", dtype=ch.float16):
grads = ch.func.grad(compute_loss)(weights, inputs, targets)
print(f'inside grads are {grads}')
print(f"inside grads are {grads}")
print(f"inside grads dtype: {grads['weight'].dtype}")
print('exiting autocast')
print(f'grads are {grads}')
print("exiting autocast")
print(f"grads are {grads}")
print(f"grads dtype: {grads['weight'].dtype}")
print('='*50)
print("=" * 50)

print('4. .half() the model')
print("4. .half() the model")
model = model.half()
grads = ch.func.grad(compute_loss)(weights, inputs, targets)
print(f'grads are {grads}')
print(f"grads are {grads}")
print(f"grads dtype: {grads['weight'].dtype}")

"""