Commit

Blog text update and layouting for mobile
rand0musername committed Dec 20, 2024
1 parent 05efb7e commit 7588302
Showing 2 changed files with 52 additions and 24 deletions.
7 changes: 4 additions & 3 deletions _blogposts/synthid.md
@@ -2,14 +2,15 @@
layout: blogpost
category: other
title: "Probing the SynthID-Text Watermark"
- blogpost-authors: "Nikola Jovanović, Thibaud Gloaguen"
+ blogpost-authors: "Nikola Jovanović, Thibaud Gloaguen, Martin Vechev"
date: 2024-12-20
thumbnail: thumbnails/synthid.svg
image: assets/blog/fb_preview/synthid.jpg
usemathjax: true
draft: true
tldr: >
- Google DeepMind's SynthID-Text is the first large-scale LLM watermark deployment. We extend the original evaluation of this scheme by applying our recent work on adversarial scenarios in LLM watermarking. In our evaluation we find that (1) the presence of SynthID-Text can be easily detected using black-box queries; (2) it is more resistant to spoofing than SOTA schemes; (3) attempts to spoof it leave discoverable clues; and (4) it is much easier to scrub than SOTA schemes even for naive adversaries. We provide ablations and insights into individual components of SynthID-Text and identify a range of research questions that could be studied in future work.
+ Google DeepMind's SynthID-Text is the first large-scale LLM watermark deployment. In our evaluation we find that (1) the presence of SynthID-Text can be easily detected using black-box queries; (2) it is more resistant to spoofing than SOTA schemes; (3) attempts to spoof it leave discoverable clues; and (4) it is easier to scrub than SOTA schemes even for naive adversaries.
+ Leveraging our recent work, we provide ablations and insights into individual components of SynthID-Text from an adversarial perspective, and identify a range of research questions that could be studied in the future.
excerpt: >
We apply the techniques from our recent work to investigate how SynthID-Text, the first large-scale deployment of an LLM watermarking scheme, fares in several adversarial scenarios. We discuss a range of findings, provide novel insights into the properties of this scheme, and outline interesting future research directions.
tweet-id: TODO
@@ -85,7 +86,7 @@ We confirm this effectiveness of tournament sampling by directly adding it to **
Finally, adding <span class="cache">caching</span> further drops spoofing success to $$4\%$$! This is caused by the fixed dataset of watermarked text containing less useful signal, as the cache often disables the watermark.

In a simple attempt to improve spoofing success, we tried increasing the number of black-box queries, which indeed helps: tripling the number of queries from $$30k$$ to $$90k$$ brings spoofing success back to $$15\%$$, reversing the effect of the previous two modifications, and suggesting that spoofing may still be possible but much more costly.
- We finally notice that using the **Bayesian detector (BD)**, despite requiring training data and removing control of the FPR, further helps against spoofing, dropping the success rate to $$5\%$$.
+ We finally notice that using the **Bayesian detector (BD)**, despite requiring training data and loosening the statistical guarantee, further helps against spoofing, dropping the success rate to $$5\%$$.
As the SynthID-Text paper notes, to apply BD to some LLM, it should observe watermarked responses of _that LLM_, while unwatermarked examples are always taken from the human text distribution.
This makes BD effectively a hybrid between a watermark detector and a [post-hoc detector](https://arxiv.org/pdf/2310.15264#page=12.62), flagging the joint presence of the watermark _and_ the specific LLM.
This increases spoofing resistance, as the attacker will use a different model to produce spoofed texts.
69 changes: 48 additions & 21 deletions _sass/_blog.sass
@@ -6,7 +6,6 @@
.blog-logoimg
display: block
margin: 0
- width: 40%

.blog-list
margin-left: 0
@@ -18,7 +17,6 @@

.blog-list-box
border-bottom: 2px solid rgba(0,0,0,0.1)
- width: 90%
box-shadow: 0.2rem 0.4rem 0.8rem 0.2rem #00000033

.blog-list-thumbnail
@@ -36,7 +34,7 @@
margin-top: 2%

.blog-list-title
- font-size: 28px
+ font-size: 34px
font-weight: bold

.blog-list-authors
@@ -59,7 +57,7 @@
/* blogpost */

.blogpost
- font-size: 18px
+ font-size: 16px
table
margin-left: auto
margin-right: auto
@@ -87,24 +85,20 @@

.blogpost-thumbnail
display: block
- width: 15%
+ width: 30%
margin: auto

.blogpost-headerblock
margin-bottom: inherit
padding: 0 1.0rem
- max-width: 75%
+ max-width: 100%
display: block
margin: auto

- @media (max-width: $mobile-break)
- .blogpost-headerblock
- max-width: 100%

.blogpost-title
text-align: center
font-weight: bold
- font-size: 54px
+ font-size: 40px

.blogpost-subtitle
display: flex
@@ -116,6 +110,7 @@

.blogpost-subtitle-left
flex: 0.7
+ flex-grow: 1
margin-right: 3%

.blogpost-subtitle-right
@@ -157,11 +152,8 @@

.blogpost-col
margin-bottom: inherit
- padding: 0 1.0rem
- max-width: 75%
justify-content: center
text-align: justify
- font-size: 18px
font-family: Raleway
color: #404040

@@ -186,37 +178,51 @@

.blogpost-img10
@extend %blogpost-img
- width: 10%

.blogpost-img15
@extend %blogpost-img
- width: 15%

.blogpost-img20
@extend %blogpost-img
- width: 20%

.blogpost-img25
@extend %blogpost-img
- width: 25%

.blogpost-img30
@extend %blogpost-img
- width: 30%

.blogpost-img40
@extend %blogpost-img
- width: 40%

.blogpost-img50
@extend %blogpost-img
- width: 50%

.blogpost-img100
@extend %blogpost-img


@media (min-width: $mobile-break)

+ .blog-list-box
+ width: 90%
+
+ .blogpost-thumbnail
+ width: 15%
+
+ .blogpost-headerblock
+ max-width: 75%
+
+ .blogpost-title
+ font-size: 54px
+
+ .blogpost
+ font-size: 18px
+
+ .blogpost-col
+ margin-bottom: inherit
+ padding: 0 1.0rem
+ max-width: 75%

.blogpost-wrap
display: flex
width: 100%
@@ -235,6 +241,27 @@
flex-shrink: 0
align-self: flex-start

+ .blogpost-img10
+ width: 10%
+
+ .blogpost-img15
+ width: 15%
+
+ .blogpost-img20
+ width: 20%
+
+ .blogpost-img25
+ width: 25%
+
+ .blogpost-img30
+ width: 30%
+
+ .blogpost-img40
+ width: 40%
+
+ .blogpost-img50
+ width: 50%

.blogpost-wrap > .blogpost-img10
flex-basis: 10%
