Simular API documentation.
Takes a screenshot of an element on the screen or the whole screen, and saves the screenshot as a PNG to a file.
func SaveScreenshot(
element: UIElement? = nil,
fileName: String? = nil,
directory: String? = nil
)
If provided, limit the screenshot to the frame of the element. Default is the whole screen.
Name of the image file. Default is "simularSavedImage.png".
Directory in which to save the image. Default is the desktop.
None
Examples:
- Instruction: Save screenshot of whole screen to screenshot.png
SaveScreenshot(fileName: "screenshot.png")
- Instruction: Screenshot the group element and save it with the default file name and location
SaveScreenshot(element: group)
- Instruction: Screenshot the table element and save to /Users/path/to/screenshot/
SaveScreenshot(element: table, directory: "/Users/path/to/screenshot/")
Takes a screenshot of an element or the whole screen and saves it to the system clipboard.
func ScreenshotToClipboard(element: UIElement? = nil)
If provided, limit the screenshot to the frame of the element. Default is the whole screen.
None
Examples:
- Instruction: take screenshot and save to clipboard
ScreenshotToClipboard()
- Instruction: get the table element on the page, save a screenshot of it to clipboard
var elements = GetElements(elementRoles: ["table"])
ScreenshotToClipboard(element: elements[0])
Open or switch to an application.
func Open(app: String? = nil,
url: String? = nil,
waitForLoadComplete: Bool = true,
waitTime: Int = 0)
Application name, e.g., Google Chrome, iMessage.
URL of a web page, e.g., google.com.
Whether or not to wait for the app or URL to complete loading. Default is true.
Duration in seconds to wait after opening the app or URL, in addition to the wait for load.
None
Examples:
- Instruction: open chrome
Open(app: "Google Chrome")
- Instruction: go to Zoom app
Open(app: "Zoom")
- Instruction: switch to Slack
Open(app: "Slack")
- Instruction: go to simular.ai
Open(url: "https://simular.ai")
Type using the keyboard.
func Type(
text: String,
withReturn: Bool = false,
waitTime: Int = 0,
waitForLoadComplete: Bool = false
)
A piece of text to type.
Whether or not to press the return (enter) key after typing.
Duration in seconds to wait after typing.
Whether or not to wait for a page to complete loading after typing.
None
Examples:
- Instruction: Enter John Smith with return
Type(text: "John Smith", withReturn: true)
- Instruction: Input John Smith
Type(text: "John Smith", withReturn: false)
- Instruction: Type John Smith
Type(text: "John Smith", withReturn: false)
- Instruction: Fill in John Smith
Type(text: "John Smith", withReturn: false)
- Instruction: type https://google.com and press enter
Type(text: "https://google.com", withReturn: true)
Performs a keyboard shortcut in the current application.
func Shortcut(key: String,
cmd: Bool,
ctrl: Bool,
option: Bool,
shift: Bool,
waitTime: Int)
The key to press.
Whether the command modifier should be pressed when tapping the key.
Whether the control modifier should be pressed when tapping the key.
Whether the option modifier should be pressed when tapping the key.
Whether the shift modifier should be pressed when tapping the key.
None
Examples:
- Instruction: press ctrl+a
Shortcut(key: "a", ctrl: true)
- Instruction: use shortcut to select all
Shortcut(key: "a", cmd: true)
- Instruction: press cmd+shift+t
Shortcut(key: "t", cmd: true, shift: true)
- Instruction: press option and down arrow
Shortcut(key: "downArrow", option: true)
- Instruction: press enter
Shortcut(key: "enter")
Click(_:element:clickType:withCommand:spatialRelation:anchorConcept:prior:position:includeInvisible:waitForLoadComplete:waitTime:)
Click on something, specified by either the at argument or the element argument.
Disambiguate by specifying the spatial relation between the target element and an anchor concept.
func Click(
_ at: String = "",
element: UIElement? = nil,
clickType: String = "left",
withCommand: Bool = false,
spatialRelation: String = "",
anchorConcept: String = "",
prior: String = "none",
position: String = "center",
includeInvisible: Bool = false,
waitForLoadComplete: Bool = false,
waitTime: Int = 0
)
Description of a target object to click. For higher accuracy, describe the target object by its role and value, e.g. "sign in button", "first name textfield".
A UIElement. If given, ignores the at argument and directly clicks on this element.
Type of click. Options are: "left" (default), "right", and "doubleClick" (two quick left clicks).
If true, the agent presses the command key during the click.
A comma-separated String of spatial relationships between the target object and the element(s) that best match the anchorConcept. Options are: "closest", "furthest", "above", "right", "below", "left", "contains", "containedIn".
Description of an object used as an anchor with spatialRelation.
An optional global spatial location prior used for disambiguation among elements that have similar descriptions. Options are: "left", "right", "top", "bottom". For example, "left" means to choose the left-most element among candidates.
Position within the frame of an element to click. Options are: topleft, topcenter, topright, middleleft, center, middleright, bottomleft, bottomcenter, bottomright, anywhere.
Whether or not to find the target from sections of the page that are not currently visible (i.e., needs scrolling).
If true, waits for page to finish loading after the click.
Integer number of seconds to wait after click.
None
Examples:
- Instruction: click new tab
Click(at: "new tab")
- Instruction: cmd+click verify insurance button
Click(at: "verify insurance button", withCommand: true)
- Instruction: double click Styles.zip
Click(at: "Styles.zip", clickType: "doubleClick")
- Instruction: click "slider" closest to and on the right of "pointer size"
Click(at: "slider", spatialRelation: "closest,right", anchorConcept: "pointer size")
Moves the cursor to an object. Disambiguate by specifying the spatial relation between the target element and an anchor concept.
All parameters other than the to parameter have the same definitions as in the Click action.
func Move(
_ to: String = "",
element: UIElement? = nil,
spatialRelation: String = "",
anchorConcept: String = "",
prior: String = "none",
includeInvisible: Bool = false,
waitForLoadComplete: Bool = false,
waitTime: Int = 0
)
Description of a target object to move the cursor to. For higher accuracy, describe the target object by its role and value, e.g. "sign in button", "first name textfield".
See definition in Click.
See definition in Click.
See definition in Click.
See definition in Click.
See definition in Click.
See definition in Click.
See definition in Click.
None
Examples:
- Instruction: Move to textarea below verify insurance
Move(to: "textarea", spatialRelation: "below", anchorConcept: "verify insurance")
GetElements(elementRoles:elementOverallDescription:threshold:root:spatialRelation:anchorRole:anchorOverallDescription:anchorElements:horizontalRank:verticalRank:sortBy:useNeighborForMissingDescription:returnType:)
Get elements that satisfy some conditions inside the current frontmost application or inside a root element (if given).
For disambiguation, one can constrain the search to elements that satisfy certain spatial relations to anchor elements.
This function supports multiple return types according to returnType.
func GetElements(
elementRoles: [String] = [],
elementOverallDescription: String = "",
threshold: Double = 0.75,
root: UIElement? = nil,
spatialRelation: String = "",
anchorRole: String = "",
anchorOverallDescription: String = "",
anchorElements: [UIElement] = [],
horizontalRank: Int? = nil,
verticalRank: Int? = nil,
sortBy: String = "",
useNeighborForMissingDescription: Bool = false,
returnType: String = "elementArray"
) -> Any?
Constrains the search to elements whose role is included in this array of roles. Required if elementOverallDescription is not given.
Description of elements to get from the page. Required if elementRoles are not provided.
If elementOverallDescription is given, then accept candidate elements whose normalized string similarity to elementOverallDescription is above this threshold value.
If given, then the search is limited to elements contained within this root element.
A comma-separated String of spatial relationships between the target elements and the anchor. Available options are: "closest", "furthest", "above", "right", "below", "left", "contains", "containedIn", "sameRow", "sameColumn"
Role of element(s) used as anchor for spatial relation.
Description of an object used as an anchor with spatialRelation.
Elements to use as anchor for spatial relation constraints. If anchorElements is provided, then anchorRole and anchorOverallDescription are ignored.
If given, sorts the elements by x-coordinate of frame midpoint and returns the element with this rank. Left-most element has rank 1.
If given, sorts the elements by y-coordinate of frame midpoint and returns the element with this rank. Top-most element has rank 1.
Returns the found elements in sorted order, by "x" (left to right) or "y" (top to bottom). Used only if returnType is "elementArray".
Whether or not to use the description of an element's neighbor as a substitute if the element's description is empty. This is only used if returnType involves returning element descriptions.
Options are: "elementArray" (default), "string", "stringArray", "strToElemDict".
- elementArray: returns an array of UIElements
- string: returns a semicolon-separated string of descriptions of found elements
- stringArray: returns an array of String descriptions of found elements
- strToElemDict: returns a [String: UIElement] dictionary with element descriptions as keys and the corresponding elements as values
Depending on returnType: [UIElement], String, [String], [String: UIElement]
Examples:
- Instruction: get all links from the page
var allLinks = GetElements(elementRoles: ["link"])
- Instruction: Get all comboBox elements to the right of text "state selection"
var comboBoxes = GetElements(elementRoles: ["comboBox"], spatialRelation: "right", anchorRole: "staticText", anchorOverallDescription: "state selection")
- Instruction: Get all cells in the table and return as dictionary
var cellDict = GetElements(elementRoles: ["cell"], spatialRelation: "containedIn", anchorRole: "table", returnType: "strToElemDict")
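The horizontalRank and verticalRank parameters described above can be read as a sort followed by a 1-based pick. The sketch below only illustrates that reading and is not Simular's implementation: the Frame struct and the pickByHorizontalRank function are hypothetical names invented for this example, and returning nil for an out-of-range rank is an assumption.

```swift
// Hypothetical stand-in for an element's on-screen frame (not a Simular type).
struct Frame {
    let x: Double
    let y: Double
    let width: Double
    let height: Double

    // x-coordinate of the frame midpoint, the sort key described above.
    var midX: Double { x + width / 2 }
}

// Sorts frames by the x-coordinate of their midpoint and returns the one
// at the given 1-based rank, or nil if the rank is out of range (assumed).
func pickByHorizontalRank(_ frames: [Frame], rank: Int) -> Frame? {
    let sorted = frames.sorted { $0.midX < $1.midX }
    guard rank >= 1, rank <= sorted.count else { return nil }
    return sorted[rank - 1]
}

let sampleFrames = [
    Frame(x: 300, y: 0, width: 40, height: 20),
    Frame(x: 10, y: 0, width: 40, height: 20),
    Frame(x: 120, y: 0, width: 40, height: 20),
]
// Rank 1 is the left-most frame.
let leftMost = pickByHorizontalRank(sampleFrames, rank: 1)
```

A vertical rank would work the same way with the y-coordinate of the midpoint as the sort key.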
GetAttributeOfElement(elementRole:elementOverallDescription:attribute:threshold:root:spatialRelation:anchorRole:anchorOverallDescription:anchorElements:horizontalRank:verticalRank:)
Searches for an element that matches the input criteria and gets the element's value for a specified attribute.
func GetAttributeOfElement(
elementRole: String = "",
elementOverallDescription: String = "",
attribute: String = "",
threshold: Double = 0.75,
root: UIElement? = nil,
spatialRelation: String = "",
anchorRole: String = "",
anchorOverallDescription: String = "",
anchorElements: [UIElement] = [],
horizontalRank: Int? = nil,
verticalRank: Int? = nil
) -> String
Role of the target element. Required if elementOverallDescription is not given.
Description of the element. Required if elementRole is not provided.
Valid options are: "role", "description", "title", "value". For example, use "value" to get the value of text elements.
If elementOverallDescription is given, then accept candidate elements whose normalized string similarity with elementOverallDescription is above this threshold value.
If given, then the search is limited to elements contained within this root element.
A comma-separated String of spatial relationships between the target elements and the anchor.
Role of element(s) used as anchor for spatial relation.
Description of an object used as an anchor with spatialRelation.
Elements to use as anchor for spatial relation constraints. If anchorElements is provided, then anchorRole and anchorOverallDescription are ignored.
If given, sorts the elements by x-coordinate of frame midpoint and returns the element with this rank. Left-most element has rank 1.
If given, sorts the elements by y-coordinate of frame midpoint and returns the element with this rank. Top-most element has rank 1.
String value of an attribute of an element
Examples:
- Instruction: Get the value of the radioButton element with description "radioButton tab point & click"
var value = GetAttributeOfElement(elementRole: "radioButton", elementOverallDescription: "radioButton tab point & click", attribute: "value")
- Instruction: Get the value of the valueIndicator element closest to and on the right of "statictext text double-click speed"
var value = GetAttributeOfElement(elementRole: "valueIndicator", attribute: "value", spatialRelation: "closest,right", anchorOverallDescription: "statictext text double-click speed")
Respond to the user with a message and optionally ask for user confirmation to proceed.
func Respond(_ message: String, requireConfirm: Bool = false)
A message to show to the user.
Whether or not user confirmation is required to proceed with the remaining actions.
None
Examples:
- Instruction: Ask the user for confirmation
Respond(message: "Could you confirm?", requireConfirm: true)
- Instruction: Output llmResult to the user
Respond(message: llmResult)
Runs a Large Language Model on the given input String.
func LLM(input: String, model: String? = nil) -> String
Input to LLM
Optional LLM model specification. Inquire for more details.
String
Examples:
- Instruction: run LLM on the prompt
var llmOutput = LLM(input: prompt)
- Instruction: get content of page and use LLM to summarize it into 50 words
var content = GetContent()
var summary = LLM(input: "Summarize the following into 50 words: \(content)")
Checks if all the concepts can be found on the current visible screen.
func ConceptsExist(_ concepts: [String]) -> Bool
An array of target concepts to find.
If all concepts can be found, returns true, otherwise false.
Examples:
- Instruction: if there is a play button, click it
if ConceptsExist(concepts: ["play button"]) {
Click(at: "play button")
}
- Instruction: if the page has a sign in button and a log in button, ask user to log in
if ConceptsExist(concepts: ["sign in button", "log in button"]) {
Respond(message: "Could you log in?", requireConfirm: true)
}
Waits until all concepts can be found in the current frontmost window. If not all concepts can be found within 10 seconds, the action returns failure.
func WaitForConcepts(concepts: [String])
None
Examples:
- Instruction: Wait for the log in button and sign up button to show up
WaitForConcepts(concepts: ["log in button", "sign up button"])
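The wait-with-timeout behavior described for WaitForConcepts can be pictured as a polling loop. The following is a minimal sketch under stated assumptions, not the actual implementation: waitUntil, its pollInterval parameter, and the check closure are hypothetical names, and only the 10-second default comes from the description above.

```swift
import Foundation

// Polls `check` until it returns true or `timeout` seconds elapse.
// Returns true on success and false on timeout.
func waitUntil(
    timeout: TimeInterval = 10,
    pollInterval: TimeInterval = 0.25,
    _ check: () -> Bool
) -> Bool {
    let deadline = Date().addingTimeInterval(timeout)
    while true {
        if check() { return true }
        if Date() >= deadline { return false }
        Thread.sleep(forTimeInterval: pollInterval)
    }
}

// Usage with a fake condition that becomes true on the third poll.
var polls = 0
let found = waitUntil(timeout: 1, pollInterval: 0.01) {
    polls += 1
    return polls >= 3
}
```

In a real script the closure would be replaced by a check such as ConceptsExist; here it is a counter so the sketch runs on its own.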
Get text content from the current frontmost window or a region corresponding to the provided concept or element.
func GetContent(
inConcept: String = "",
inElement: UIElement? = nil,
format: String = "flat"
) -> Any?
Gets content from the subtree rooted at elements that match this concept.
Get content from the subtree rooted at this element.
Format of the returned content. Options are: "flat", "json", "xml", "xmlSlim"
If the inElement argument is given or the frontmost window was used (because neither inConcept nor inElement was given), then returns a single String. Otherwise, returns a [String] array with one String per root element.
Examples:
- Instruction: get content from page in json format
var pageContent = GetContent(format: "json")
- Instruction: get content from the table element in xml slim format
var tableContent = GetContent(inElement: table, format: "xmlSlim")
- Instruction: get content inside group job post details
var groupContent = GetContent(inConcept: "group job post details")
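Because GetContent is declared as returning Any? and may yield either a String or a [String] array, a caller may want to normalize the result before using it. The helper below is a hypothetical sketch (contentStrings is not part of the API), and the sample values stand in for real GetContent results.

```swift
// Normalizes a result that may be a single String or a [String] array
// into a [String] array. Any other value, including nil, gives [].
func contentStrings(from result: Any?) -> [String] {
    if let single = result as? String {
        return [single]
    }
    if let many = result as? [String] {
        return many
    }
    return []
}

// Stand-in values; in a real script these would come from GetContent.
let singleResult: Any? = "page text"
let arrayResult: Any? = ["group 1 text", "group 2 text"]

let fromSingle = contentStrings(from: singleResult)
let fromArray = contentStrings(from: arrayResult)
```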
Summarizes the current page using an LLM and responds to the user.
func SummarizePage(prompt: String? = nil) -> String?
An optional prompt for an LLM.
String
Examples:
- Instruction: summarize the flight options on this page
var summary = SummarizePage(prompt: "summarize the flight options")
Gets the column ids of each header in a given array of column headers in a Google Sheet.
For example, if the sheet has column headers "website", "description", "date" in cells A1, B1, C1, respectively, then GetGoogleSheetColumns(headers: ["website", "description", "date"]) returns ["A", "B", "C"].
Note: This function currently assumes that the table headers are on row 1.
func GetGoogleSheetColumns(headers: [String]) -> [String]?
Array of column headers.
Array of column ids, each a capital letter from A to Z.
Examples:
- Instruction: get the columns for the headers "description" and "website"
var columns = GetGoogleSheetColumns(headers: ["description", "website"])
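The header-to-column mapping described above can be sketched as a lookup of each header's position in row 1. This is an illustration only: columnIDs and its inFirstRow parameter are hypothetical, the row-1 contents are passed in as a plain array instead of being read from a sheet, and returning nil when a header is missing is an assumption about the optional return type.

```swift
// Maps each requested header to a column letter based on its position in
// row 1. Returns nil if a header is missing or lies beyond column Z (assumed).
func columnIDs(for headers: [String], inFirstRow firstRow: [String]) -> [String]? {
    let letters = Array("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
    var ids: [String] = []
    for header in headers {
        guard let index = firstRow.firstIndex(of: header), index < letters.count else {
            return nil
        }
        ids.append(String(letters[index]))
    }
    return ids
}

// Row 1 of the example sheet: cells A1, B1, C1.
let sheetFirstRow = ["website", "description", "date"]
let requestedColumns = columnIDs(for: ["description", "website"], inFirstRow: sheetFirstRow)
```

The result follows the order of the requested headers, not the order of the columns in the sheet.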
Gets the value of a cell in a Google Sheet.
func GetGoogleSheetCellValue(cell: String) -> String?
Label of a cell. Column is indicated by a capital letter and row is indicated by a number. For example "B42" is the cell at column B row 42.
Value of the cell
Examples:
- Instruction: get the description of cell B42.
var cellValue = GetGoogleSheetCellValue(cell: "B42")
Sets the value of a Google Sheet cell.
func SetGoogleSheetCellValue(cell: String, value: String)
Label of a cell. Column is indicated by a capital letter and row is indicated by a number. For example "B42" is the cell at column B row 42.
Value to write to the cell.
None
Examples:
- Instruction: write "Simular" into cell B42
SetGoogleSheetCellValue(cell: "B42", value: "Simular")
Writes the textual representation of the given item to the debug console.
func Print(_ item: Any)
None
Examples:
- Instruction: print("Simular") in console
Print("Simular")
Copies a String to clipboard.
func CopyToClipboard(text: String)
Text to be copied to the clipboard.
None
Examples:
- Instruction: copy cell to the clipboard
CopyToClipboard(text: cell)
Given an array of elements, and an array of requested attributes, returns the attributes of all the elements.
func GetAttributesOfElements(
elements: [UIElement],
attributes: [String],
separator: String = " ",
attributeNameThreshold: Double = 0.2
) -> [String]
An Array of UIElements.
A [String] array of attributes. Valid options are: "role", "description", "title", "value". For example, use "value" to get the value of text elements.
For an element, its attribute values are concatenated using this separator between values.
A threshold in [0,1]. If a query attribute name does not exactly match a valid attribute name, then the attribute whose normalized Levenshtein distance to the query is the smallest and is lower than this threshold will be used.
A [String] array with length equal to the number of elements. The i-th entry is the concatenated attribute values for the i-th element.
Examples:
- Instruction: Get value from elements with role textArea and description text entry area source editor
var elements = GetElements(elementRoles: ["textArea"], elementOverallDescription: "text entry area source editor")
var value = GetAttributesOfElements(elements: elements, attributes: ["value"])
- Instruction: Get description and title from the text entry area source editor
var sourceEditor = GetElements(elementOverallDescription: "text entry area source editor")
var desc = GetAttributesOfElements(elements: sourceEditor, attributes: ["description", "title"])
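The attributeNameThreshold rule can be made concrete with a small sketch of fuzzy attribute-name matching. It is not Simular's code: levenshtein and resolveAttribute are hypothetical helpers, and normalizing the edit distance by the length of the longer string is an assumption, since the description above does not say how the distance is normalized.

```swift
// Classic dynamic-programming Levenshtein edit distance.
func levenshtein(_ a: String, _ b: String) -> Int {
    let s = Array(a)
    let t = Array(b)
    if s.isEmpty { return t.count }
    if t.isEmpty { return s.count }
    var previous = Array(0...t.count)
    for i in 1...s.count {
        var current = [i] + Array(repeating: 0, count: t.count)
        for j in 1...t.count {
            let cost = s[i - 1] == t[j - 1] ? 0 : 1
            current[j] = min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
        }
        previous = current
    }
    return previous[t.count]
}

// Resolves a query to a valid attribute name: exact match first, otherwise the
// closest name whose normalized distance is strictly below the threshold.
func resolveAttribute(_ query: String, threshold: Double = 0.2) -> String? {
    let valid = ["role", "description", "title", "value"]
    if valid.contains(query) { return query }
    var best: (name: String, distance: Double)? = nil
    for name in valid {
        // Assumption: normalize by the length of the longer string.
        let longest = max(query.count, name.count)
        let distance = Double(levenshtein(query, name)) / Double(longest)
        if distance < threshold, distance < (best?.distance ?? Double.infinity) {
            best = (name: name, distance: distance)
        }
    }
    return best?.name
}
```

Under these assumptions, "values" resolves to "value" (distance 1 over length 6), while "titel" resolves to nothing because its distance to "title" is 2 over length 5, which exceeds the default threshold of 0.2.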
Get all cells from a row or column element. Either row or column must be given.
func GetCells(row: UIElement? = nil, column: UIElement? = nil) -> [UIElement]?
A row element that contains one or more cells.
A column element that contains one or more cells.
An array of cell elements contained in the given row or column. If input is a row, the output array is sorted by increasing x-coordinate (left to right). If input is a column, the output array is sorted by increasing y-coordinate (top to bottom).
Examples:
- Instruction: get all cells in the first row
var rows = GetElements(elementRoles: ["row"], sortBy: "y")
var cells = GetCells(row: rows[0])
Get the value of a given cell element.
func GetCellValue(cell: UIElement) -> String
A cell element.
Value contained in the cell.
Examples:
- Instruction: Get the value of the cell
var cellValue = GetCellValue(cell: cell)
Gets a dictionary from the JSON-formatted part of a string. The input string must contain at most one pair of outermost curly braces { ... }. Returns an empty dictionary if the input does not contain a JSON-formatted substring.
func GetDictFromJson(jsonStr: String) -> [String: Any]?
A string that contains JSON-formatted data.
A dictionary representation of the JSON data in the input jsonStr
Examples:
- Instruction: get content from whole page, use LLM to convert it to JSON format enclosed in triple backticks, and get a dictionary from the JSON.
var wholePageContent = GetContent()
var llmInput = "Convert the following content to JSON format enclosed in triple backticks: \(wholePageContent)"
var llmOutput = LLM(input: llmInput)
var dict = GetDictFromJson(jsonStr: llmOutput)
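One plausible way to read "the JSON-formatted part of a string" is to take everything from the first opening brace to the last closing brace and parse it. The sketch below follows that reading using Foundation's JSONSerialization. It is an assumption about the behavior, not the actual implementation, and dictFromJsonSketch is a hypothetical name.

```swift
import Foundation

// Extracts the substring between the first "{" and the last "}" and parses
// it as a JSON object. Returns an empty dictionary if that fails.
func dictFromJsonSketch(_ text: String) -> [String: Any] {
    guard let start = text.firstIndex(of: "{"),
          let end = text.lastIndex(of: "}"),
          start < end else {
        return [:]
    }
    let candidate = String(text[start...end])
    guard let data = candidate.data(using: .utf8),
          let object = try? JSONSerialization.jsonObject(with: data),
          let dict = object as? [String: Any] else {
        return [:]
    }
    return dict
}

// A stand-in for LLM output that wraps JSON in triple backticks.
let llmOutputSample = "Here is the result: ```{\"name\": \"Simular\", \"count\": 2}``` Done."
let parsedSample = dictFromJsonSketch(llmOutputSample)
```

Text surrounding the braces, such as the backticks in the sample, is ignored by this approach, which is why the input is limited to one pair of outermost braces.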
Gets a JSON representation of a dictionary.
func GetJSONFromDict(dict: [String: Any]) -> String
A [String: Any] dictionary.
A JSON representation of the input dictionary.
Examples:
- Instruction: convert dict to json.
var json = GetJSONFromDict(dict: dict)
Get the content of the current clipboard.
func GetFromClipboard() -> String
Content of the current clipboard.
Examples:
- Instruction: get the clipboard content
var clipboardContent = GetFromClipboard()
Gets XML-formatted description of the contents in each element.
func GetStructuredDescription(fromElements: [UIElement]) -> [String]
An array of UIElements [u_1, ..., u_n].
An array of String [s_1, ..., s_n], where each s_i is an XML-formatted description of the contents rooted at u_i.
Examples:
- Instruction: get the elements with role "group" and description "group under plan progress", then get description of the elements in XML format
var elements = GetElements(elementRoles: ["group"], elementOverallDescription: "group under plan progress")
var description = GetStructuredDescription(fromElements: elements)
Set the value of a text field or text area without using keyboard.
func SetValue(
text: String,
withReturn: Bool = false,
waitTime: Int = 0,
waitForLoadComplete: Bool = false
)
The text to set as the value.
Whether or not to press the return (enter) key after setting the value.
Duration in seconds to wait after setting the value.
Whether or not to wait for a page to complete loading after setting the value.
None
Examples:
- Instruction: set value to John Smith with return
SetValue(text: "John Smith", withReturn: true)
- Instruction: set text field value to John Smith
SetValue(text: "John Smith", withReturn: false)
Looks up a value in the given dictionary using the query. A key counts as a match if it contains the query, with both lowercased.
func SoftDictLookup(dict: [String: Any], query: String) -> Any?
A dictionary to access.
A query used to search the dictionary.
Dictionary value corresponding to the key that contains the query.
Examples:
- Instruction: Get all text fields inside the webpage. If any text field description matches "first name", click it
var elemDict = GetElements(elementRoles: ["textField"], spatialRelation: "containedIn", anchorRole: "webArea", returnType: "strToElemDict")
if var elem = SoftDictLookup(dict: elemDict, query: "first name") {
Click(element: elem)
}
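The matching rule for SoftDictLookup (a key matches if it contains the query, ignoring case) can be sketched in a few lines. This is an illustrative guess, not the real implementation: softLookupSketch is a hypothetical name, and picking the first match in sorted key order when several keys match is an assumption made only to keep the result deterministic.

```swift
import Foundation

// Returns the value of the first key (in sorted key order, an assumption)
// whose lowercased form contains the lowercased query.
func softLookupSketch(dict: [String: Any], query: String) -> Any? {
    let needle = query.lowercased()
    for key in dict.keys.sorted() where key.lowercased().contains(needle) {
        return dict[key]
    }
    return nil
}

// Integer values stand in for UIElements in this self-contained example.
let fieldsSample: [String: Any] = [
    "textField First Name": 1,
    "textField Last Name": 2,
]
let firstNameHit = softLookupSketch(dict: fieldsSample, query: "first name")
```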
Puts the agent to sleep for a given amount of time.
func Wait(_ waitTime: Int, unit: String = "s")
Options are "s" for seconds (default) and "ms" for milliseconds.
None
Examples:
- Instruction: Wait for 3s
Wait(3)
- Instruction: Wait for 500ms
Wait(500, unit: "ms")
Given an array of table cell values, return a corresponding array of cell indices.
For example, if the table has value1 in cell A10, then GetCellIndices(cellValues: ["value1"]) returns ["A10"].
func GetCellIndices(cellValues: [String])
Values of the cells to look up.
[String] array of cell indices.
Examples:
- Instruction: Get the indices of the cells containing value1 and value2
var cellIndices = GetCellIndices(cellValues: ["value1", "value2"])
Get the label of the given cell element in Excel.
func GetCellLabel(cell: UIElement) -> String
A cell element
The cell's label as a String, e.g. "A1".
Examples:
- Instruction: Get the label of the cell
var label = GetCellLabel(cell: cell)
Given a header or an index String, returns the column under it as an [index: UIElement] dictionary. For example, if the table has a column with header "Website" in cell A1, and elements elem1 and elem2 under it, then this function returns ["A2": elem1, "A3": elem2].
func GetTableColumn(header: String? = nil, index: String? = nil)
Value of the column header.
Index of the column header, e.g. "B42".
Dictionary of [String: UIElement] pairs for all entries in the column under the header.
Examples:
- Instruction: Get the columns under Website
var websiteColumn = GetTableColumn(header: "Website")
- Instruction: Get the columns under index A1
var A1Column = GetTableColumn(index: "A1")
Read the contents of a file whose location is specified by path.
func ReadFile(path: String) -> String
Either an absolute path to a file, or a name of a file (assumed to be in the default app cache directory).
Contents of the file as a String
Examples:
- Instruction: read the file named results.json
var results = ReadFile(path: "results.json")
- Instruction: get contents of file /Users/somebody/Documents/project/links.txt
var links = ReadFile(path: "/Users/somebody/Documents/project/links.txt")
Writes the given text to a file. If the file already exists, the text is appended to it, unless overwrite is set, in which case the existing content is replaced. Unless a path is specified, writes to the default directory /Library/Caches/com.simular.SimularNote/SimularActionResult/. Throws an error if a non-folder file named SimularActionResult already exists where that directory is expected.
func WriteToFile(
text: String,
path: String? = "SimularActionResult.txt",
overwrite: Bool = false
)
Text to write to a file.
Path of the file. Defaults to /Library/Caches/com.simular.SimularNote/SimularActionResult/{path}.txt. If path contains "/", it is treated as a full path.
Whether or not to overwrite the contents if path points to an existing file.
None
Examples:
- Instruction: Append jsonResult to /Users/somebody/Documents/result.json
WriteToFile(text: jsonResult, path: "/Users/somebody/Documents/result.json")
- Instruction: Write jsonResult to /Users/someone/Desktop/result.json, overwrite existing file.
WriteToFile(text: jsonResult, path: "/Users/someone/Desktop/result.json", overwrite: true)
- Instruction: Write jsonResult to default action result file, overwrite existing file.
WriteToFile(text: jsonResult, overwrite: true)
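The path rule for WriteToFile (a path containing "/" is a full path, anything else is a file name in the default directory) might be sketched as follows. resolveOutputPath is a hypothetical helper written for this example. The default directory string is copied from the description above, and whether the real function appends a ".txt" extension to bare names is not clear from the text, so this sketch does not.

```swift
// Default directory taken from the description above.
let defaultOutputDirectory = "/Library/Caches/com.simular.SimularNote/SimularActionResult/"

// A path containing "/" is used as-is; otherwise it is treated as a file
// name inside the default directory. No extension is appended (assumption).
func resolveOutputPath(_ path: String = "SimularActionResult.txt") -> String {
    if path.contains("/") {
        return path
    }
    return defaultOutputDirectory + path
}

let defaultTarget = resolveOutputPath()
let explicitTarget = resolveOutputPath("/Users/someone/Desktop/result.json")
```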