Create a taint analysis engine #1160

Open
ccojocar opened this issue Jun 23, 2024 · 2 comments

Comments

@ccojocar
Member

gosec now supports SSA analyzers in addition to AST-based rules. The SSA code representation can be leveraged to build a taint analysis engine that can uncover more complex security issues.

More details about the SSA representation of Go code and the list of available instructions can be found in the Go docs.

The ssadump tool can be used to print the SSA representation of a program as follows:

go install golang.org/x/tools/cmd/ssadump@latest

cat > main.go << EOF
package main

import "fmt"

func main() {
  fmt.Println("Hello SSA!")
}
EOF

ssadump -build F main.go

# Name: command-line-arguments.init
# Package: command-line-arguments
# Synthetic: package initializer
func init():
0:                                                                entry P:0 S:2
        t0 = *init$guard                                                   bool
        if t0 goto 2 else 1
1:                                                           init.start P:1 S:1
        *init$guard = true:bool
        t1 = fmt.init()                                                      ()
        jump 2
2:                                                            init.done P:2 S:0
        return

# Name: command-line-arguments.main
# Package: command-line-arguments
# Location: /usr/local/google/home/ccojocar/go/src/github.com/securego/samples/ssa/main.go:5:6
func main():
0:                                                                entry P:0 S:0
        t0 = new [1]any (varargs)                                       *[1]any
        t1 = &t0[0:int]                                                    *any
        t2 = make any <- string ("Hello SSA!":string)                       any
        *t1 = t2
        t3 = slice t0[:]                                                  []any
        t4 = fmt.Println(t3...)                              (n int, err error)
        return

There are different options for -build available (consult ssadump -h).

The SSA representation is built per function: each function is its own unit.

The taint analysis engine should be able to track data (input arguments) through the call graph, from a predefined sink function back to a predefined list of source functions, variables, or types from which the data initially enters the program, and decide whether the data was sanitized along the way. In this manner, security issues such as command injection, SSRF, path traversal, and SQL injection can be detected more reliably.
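The backward tracking described above can be sketched on a toy data-flow graph. This is only an illustration under stated assumptions: Node, Kind, and IsTainted are hypothetical names, not gosec or golang.org/x/tools types, and validateURL stands in for whatever sanitizer a real rule would recognize.

```go
package main

import "fmt"

// Kind classifies a node in a toy data-flow graph (hypothetical model,
// not the real SSA types).
type Kind int

const (
	Plain     Kind = iota // ordinary value or call
	Source                // value that enters the program from outside
	Sanitizer             // call that validates or cleans its input
)

// Node is a value; Operands are the values it was computed from.
type Node struct {
	Name     string
	Kind     Kind
	Operands []*Node
}

// IsTainted walks backwards from n and reports whether some path reaches
// a Source without passing through a Sanitizer.
func IsTainted(n *Node) bool {
	return walk(n, map[*Node]bool{})
}

func walk(n *Node, seen map[*Node]bool) bool {
	if seen[n] {
		return false // already explored; also guards against cycles
	}
	seen[n] = true
	switch n.Kind {
	case Source:
		return true
	case Sanitizer:
		return false // assumption: data behind a sanitizer is clean
	}
	for _, op := range n.Operands {
		if walk(op, seen) {
			return true
		}
	}
	return false
}

func main() {
	arg := &Node{Name: "os.Args[1]", Kind: Source}
	url := &Node{Name: `"http://" + arg`, Operands: []*Node{arg}}
	checked := &Node{Name: "validateURL(url)", Kind: Sanitizer, Operands: []*Node{url}}

	rawSink := &Node{Name: "http.Get(url)", Operands: []*Node{url}}
	safeSink := &Node{Name: "http.Get(checked)", Operands: []*Node{checked}}

	fmt.Println(rawSink.Name, "tainted:", IsTainted(rawSink))
	fmt.Println(safeSink.Name, "tainted:", IsTainted(safeSink))
}
```

Treating every sanitizer as fully effective is a simplification; a real engine would likely need per-sink rules about which sanitizers count.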

A source can originate in different places, such as global variables, functions, or types. These are the places where data first enters the program. The following is a list of potential source candidates:

sources:
  # Sources that are defined in Go documentation as a "variable" (note: these variables will have an SSA type of "Global").
  variables:
    "os":
      - "Args"
  # Sources that are defined in Go documentation as a "function"
  functions:
    "flag":
      - "Arg"
      - "Args"
    "os":
      - "Environ"
      - "File"
    "crypto/tls":
      - "LoadX509KeyPair"
      - "X509KeyPair"
    "os/user":
      - "Lookup"
      - "LookupId"
      - "Current"
    "crypto/x509":
      - "Subjects"
    "io":
      - "ReadAtLeast"
      - "ReadFull"
    "database/sql":
      - "Query"
      - "QueryRow"
    "bytes":
      - "String"
      - "ReadBytes"
      - "ReadByte"
    "bufio":
      - "Text"
      - "Bytes"
      - "ReadString"
      - "ReadSlice"
      - "ReadRune"
      - "ReadLine"
      - "ReadBytes"
      - "ReadByte"
    "archive/tar":
      - "Next"
      - "FileInfo"
      - "Header"
    "net/url":
      - "ParseQuery"
      - "ParseRequestURI"
      - "Parse"
      - "Query"
  # Sources that are defined in Go documentation as a "type" (note: adding types will consider all functions that use that type to be tainted).
  types:
    "net/http":
      - "Request"
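One way the engine could hold this list in memory is a small lookup structure keyed by package path. The sketch below is an assumption about the shape, not an existing gosec type: Sources, IsSource, and defaultSources are hypothetical names, and only a few entries from the list are included.

```go
package main

import "fmt"

// Sources mirrors the three groups of the list above.
// Each map goes from package path to the names defined in that package.
type Sources struct {
	Variables map[string][]string
	Functions map[string][]string
	Types     map[string][]string
}

// defaultSources holds a small subset of the candidate list, for illustration.
var defaultSources = Sources{
	Variables: map[string][]string{
		"os": {"Args"},
	},
	Functions: map[string][]string{
		"flag":    {"Arg", "Args"},
		"os":      {"Environ"},
		"net/url": {"Parse", "ParseQuery"},
	},
	Types: map[string][]string{
		"net/http": {"Request"},
	},
}

// contains reports whether name is listed under pkg in the given group.
func contains(group map[string][]string, pkg, name string) bool {
	for _, n := range group[pkg] {
		if n == name {
			return true
		}
	}
	return false
}

// IsSource reports whether pkg.name appears in any of the three groups.
func (s Sources) IsSource(pkg, name string) bool {
	return contains(s.Variables, pkg, name) ||
		contains(s.Functions, pkg, name) ||
		contains(s.Types, pkg, name)
}

func main() {
	fmt.Println(defaultSources.IsSource("os", "Args"))
	fmt.Println(defaultSources.IsSource("net/http", "Request"))
	fmt.Println(defaultSources.IsSource("strings", "ToUpper"))
}
```

Keeping the three groups separate preserves the distinction the list makes, since a type source taints every function that uses the type, while a function source taints only its results.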

For example, to detect an SSRF issue, the taint analysis engine will track the data (e.g. the URL value) from a sink such as net/http.Get/Head/Post/PostForm or net/http.Client.Do/Get/Head/Post through the call graph back to one of the predefined sources from the list above (e.g. os.Args). If it discovers that the data comes from one of these sources and was not sanitized, it can report an SSRF issue with more confidence.
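To make the SSRF case concrete, here is a small sample program of the kind such an engine might analyze. It is a sketch only: sanitizeURL, fetchUnsafe, fetchSafe, and the allowedHosts allowlist are hypothetical, and whether an allowlist check like this would count as a sanitizer is a design decision for the engine.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"os"
)

// allowedHosts is a hypothetical allowlist used only for this illustration.
var allowedHosts = map[string]bool{"example.com": true}

// sanitizeURL returns the parsed URL as a string if it uses https and its
// host is allowlisted, and an error otherwise.
func sanitizeURL(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	if u.Scheme != "https" || !allowedHosts[u.Hostname()] {
		return "", fmt.Errorf("URL %q is not allowed", raw)
	}
	return u.String(), nil
}

// fetchUnsafe passes its argument straight to the sink. If a caller hands it
// os.Args[1], the source reaches http.Get unsanitized: a candidate SSRF.
func fetchUnsafe(raw string) (*http.Response, error) {
	return http.Get(raw)
}

// fetchSafe routes the same data through sanitizeURL before the sink.
func fetchSafe(raw string) (*http.Response, error) {
	clean, err := sanitizeURL(raw)
	if err != nil {
		return nil, err
	}
	return http.Get(clean)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: prog <url>")
		return
	}
	resp, err := fetchSafe(os.Args[1]) // source: os.Args, sink: http.Get
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status)
}
```

An AST-only rule sees two identical http.Get calls with a non-constant argument; telling fetchUnsafe apart from fetchSafe requires following the value back through the function boundary.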

One challenge is building the call graph from the SSA representation, starting at a sink function and working back to a source. SSA breaks the code down into function units, so the arguments of each function must be tracked from one SSA function unit to another in order to build the complete graph. The tracking must also cross package boundaries, since a program typically contains multiple packages. The analysis should stop when a call leaves the program's own packages; otherwise the complexity and analysis time will explode and the analysis will not complete in a reasonable time.
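The bounded backward walk described above might look roughly like the following. This is a toy model, not an implementation on top of the real SSA packages: Func, Inputs, traceToSource, and the example.com/app module path are all hypothetical, and "inputs" lumps together callers that pass arguments and callees that return values.

```go
package main

import (
	"fmt"
	"strings"
)

// Func is a toy stand-in for one SSA function unit (or a global such as os.Args).
type Func struct {
	Pkg      string
	Name     string
	IsSource bool
	Inputs   []*Func // units whose data flows into this one
}

func (f *Func) String() string { return f.Pkg + "." + f.Name }

// traceToSource walks backwards from fn and returns the chain of units from
// fn to the first source found, or nil. Units outside ownPrefix are checked
// for being a source but never expanded, which bounds the search.
func traceToSource(fn *Func, ownPrefix string, seen map[*Func]bool) []string {
	if seen[fn] {
		return nil // visited already; also guards against recursion cycles
	}
	seen[fn] = true
	if fn.IsSource {
		return []string{fn.String()}
	}
	if !strings.HasPrefix(fn.Pkg, ownPrefix) {
		return nil // left the program's own packages: stop here
	}
	for _, in := range fn.Inputs {
		if path := traceToSource(in, ownPrefix, seen); path != nil {
			return append([]string{fn.String()}, path...)
		}
	}
	return nil
}

func main() {
	osArgs := &Func{Pkg: "os", Name: "Args", IsSource: true}
	// Third-party function: it reads a source internally, but is not expanded.
	libLoad := &Func{Pkg: "github.com/other/lib", Name: "Load", Inputs: []*Func{osArgs}}
	run := &Func{Pkg: "example.com/app/cmd", Name: "run", Inputs: []*Func{libLoad, osArgs}}
	fetch := &Func{Pkg: "example.com/app/client", Name: "fetch", Inputs: []*Func{run}}

	path := traceToSource(fetch, "example.com/app", map[*Func]bool{})
	fmt.Println(strings.Join(path, " <- "))
}
```

Stopping at the boundary trades precision for speed: data that becomes tainted only inside a third-party package is missed unless that function is itself listed as a source.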

@audunmo
Contributor

audunmo commented Jul 8, 2024

Hey, just want to say that as a security practitioner in the Go development space, I both see and deeply appreciate the work you're putting into Gosec. I'm hoping to be able to contribute to this project again soon. Keep up the great work!

@ccojocar
Member Author

ccojocar commented Jul 8, 2024

Thanks @audunmo for your feedback! I am looking forward to your future contributions.
