[WIP]: Analysis Input - Simulates Rendered Text
Strips link and image URLs, table markup, and other non-visible content that distorts the analysis and produces false results. For example, URLs can be a source of many spurious completion suggestions for numbers.
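
To illustrate the problem described above, here is a minimal sketch that uses only Python's standard library rather than html2text. The names `VisibleTextExtractor`, `find_numbers`, and `SAMPLE_HTML` are hypothetical and do not appear in the commit, and the digit regex is only a stand-in for the real analysis. Unlike html2text, this sketch does not treat tables, scripts, or styles specially.

```python
import re
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects only text nodes; attribute values such as href/src are skipped."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # URLs live in tag attributes, so they never reach handle_data.
        self.chunks.append(data)

    def text(self):
        # Join the text nodes and collapse runs of whitespace.
        return " ".join(" ".join(self.chunks).split())


def find_numbers(text):
    # Hypothetical stand-in for the analysis: every digit run is a candidate.
    return re.findall(r"\d+(?:\.\d+)?", text)


# Made-up sample email body, for illustration only.
SAMPLE_HTML = (
    '<p>Your payment of $42.50 was received.</p>'
    '<a href="https://example.com/receipts/98765?user=1234">View receipt</a>'
    '<img src="https://example.com/pixel/555.gif" alt="">'
)

extractor = VisibleTextExtractor()
extractor.feed(SAMPLE_HTML)
extractor.close()
visible = extractor.text()

raw_numbers = find_numbers(SAMPLE_HTML)   # includes digits taken from URLs
visible_numbers = find_numbers(visible)   # only digits the reader would see

print(raw_numbers)
print(visible_numbers)
```

Running the candidate search on the raw HTML picks up digit runs from the link and image URLs, while the same search on the visible text finds only the amount in the sentence.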
seandenigris committed Feb 17, 2025
1 parent c5be2f5 commit 9b430fd
Showing 3 changed files with 34 additions and 0 deletions.
9 changes: 9 additions & 0 deletions src/VirtualStash-Core/MailMessage.extension.st
@@ -0,0 +1,9 @@
Extension { #name : #MailMessage }

{ #category : #'*VirtualStash-Core' }
MailMessage >> vsAnalysisInput [

	^ self bodyRlHtml
		ifNotNil: [ :html | html vsAnalysisInput ]
		ifNil: [ self bodyTextFormatted ]
]
6 changes: 6 additions & 0 deletions src/VirtualStash-Core/RlEmail.extension.st
@@ -23,6 +23,12 @@ RlEmail >> possibleCurrencyAmounts [
	^ matches asSet
]

{ #category : #'*VirtualStash-Core' }
RlEmail >> vsAnalysisInput [

	^ self mailMessage vsAnalysisInput
]

{ #category : #'*VirtualStash-Core' }
RlEmail >> vsCounterpartyGuess [

19 changes: 19 additions & 0 deletions src/VirtualStash-Core/RlHTML.extension.st
@@ -0,0 +1,19 @@
Extension { #name : #RlHTML }

{ #category : #'*VirtualStash-Core' }
RlHTML >> vsAnalysisInput [
	"Answer only the plain text a user would see if the HTML were rendered"

	^ PBApplication uniqueInstance newCommandStringFactory
		bindingAt: #htmlString put: self contents;
		script: 'import html2text
text_maker = html2text.HTML2Text()
text_maker.ignore_links = True
text_maker.ignore_images = True
text_maker.ignore_tables = True
text_maker.ignore_emphasis = True';
		resultExpression: 'text_maker.handle(htmlString)';
		sendAndWait

	"Reference: https://github.com/Alir3z4/html2text/blob/master/docs/usage.md#using-options"
]
