diff --git a/README.Rmd b/README.Rmd
index 619ae09..fca9488 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -6,10 +6,10 @@ output: github_document
 
 ```{r, include = FALSE}
 knitr::opts_chunk$set(
- collapse = TRUE,
- comment = "#>",
- fig.path = "man/figures/README-",
- out.width = "100%"
+ collapse = TRUE,
+ comment = "#>",
+ fig.path = "man/figures/README-",
+ out.width = "100%"
 )
 ```
 
@@ -35,7 +35,7 @@ More general information on URL parsing can be found in the introductory vignett
 
 `adaR` is part of a series of R packages to analyse webtracking data:
 
-- [webtrackR](https://github.com/schochastics/webtrackR): preprocess raw webtracking data
+- [webtrackR](https://github.com/gesistsa/webtrackR): preprocess raw webtracking data
 - [domainator](https://github.com/schochastics/domainator): classify domains
 - [adaR](https://github.com/gesistsa/adaR): parse urls
 
@@ -92,9 +92,9 @@ practical circumstances.
 
 ```{r faster}
 bench::mark(
- ada = ada_url_parse("https://user_1:password_1@example.org:8080/dir/../api?q=1#frag", decode = FALSE),
- urltools = urltools::url_parse("https://user_1:password_1@example.org:8080/dir/../api?q=1#frag"),
- check = FALSE
+ ada = ada_url_parse("https://user_1:password_1@example.org:8080/dir/../api?q=1#frag", decode = FALSE),
+ urltools = urltools::url_parse("https://user_1:password_1@example.org:8080/dir/../api?q=1#frag"),
+ check = FALSE
 )
 ```
 
@@ -113,9 +113,9 @@ There are four more groups of functions available to work with url parsing:
 
 ```{r public_suffix}
 urls <- c(
- "https://subsub.sub.domain.co.uk",
- "https://domain.api.gov.uk",
- "https://thisisnotpart.butthisispartoftheps.kawasaki.jp"
+ "https://subsub.sub.domain.co.uk",
+ "https://domain.api.gov.uk",
+ "https://thisisnotpart.butthisispartoftheps.kawasaki.jp"
 )
 public_suffix(urls)
 ```
diff --git a/README.md b/README.md
index ba4be42..57b048d 100644
--- a/README.md
+++ b/README.md
@@ -31,7 +31,7 @@ vignette via `vignette("adaR")`.
 
 `adaR` is part of a series of R packages to analyse webtracking data:
 
-- [webtrackR](https://github.com/schochastics/webtrackR): preprocess raw
+- [webtrackR](https://github.com/gesistsa/webtrackR): preprocess raw
   webtracking data
 - [domainator](https://github.com/schochastics/domainator): classify
   domains
@@ -61,12 +61,10 @@ URL.
 ``` r
 library(adaR)
 ada_url_parse("https://user_1:password_1@example.org:8080/dir/../api?q=1#frag")
-#> href protocol
-#> 1 https://user_1:password_1@example.org:8080/api?q=1#frag https:
-#> username password host hostname port pathname
-#> 1 user_1 password_1 example.org:8080 example.org 8080 /api
-#> search hash
-#> 1 ?q=1 #frag
+#> href protocol username
+#> 1 https://user_1:password_1@example.org:8080/api?q=1#frag https: user_1
+#> password host hostname port pathname search hash
+#> 1 password_1 example.org:8080 example.org 8080 /api ?q=1 #frag
 ```
 
 ``` cpp
@@ -116,15 +114,15 @@ practical circumstances.
 
 ``` r
 bench::mark(
- ada = ada_url_parse("https://user_1:password_1@example.org:8080/dir/../api?q=1#frag", decode = FALSE),
- urltools = urltools::url_parse("https://user_1:password_1@example.org:8080/dir/../api?q=1#frag"),
- check = FALSE
+ ada = ada_url_parse("https://user_1:password_1@example.org:8080/dir/../api?q=1#frag", decode = FALSE),
+ urltools = urltools::url_parse("https://user_1:password_1@example.org:8080/dir/../api?q=1#frag"),
+ check = FALSE
 )
 #> # A tibble: 2 × 6
 #> expression min median `itr/sec` mem_alloc `gc/sec`
 #> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
-#> 1 ada 160µs 171µs 5757. 0B 13.3
-#> 2 urltools 103µs 109µs 9071. 0B 15.5
+#> 1 ada 158µs 165µs 5913. 0B 45.3
+#> 2 urltools 104µs 108µs 8488. 0B 42.6
 ```
 
 For further benchmark results, see `benchmark.md` in `data_raw`.
@@ -144,13 +142,12 @@ suffix list](https://publicsuffix.org/), **excluding** private domains.
 
 ``` r
 urls <- c(
- "https://subsub.sub.domain.co.uk",
- "https://domain.api.gov.uk",
- "https://thisisnotpart.butthisispartoftheps.kawasaki.jp"
+ "https://subsub.sub.domain.co.uk",
+ "https://domain.api.gov.uk",
+ "https://thisisnotpart.butthisispartoftheps.kawasaki.jp"
 )
 public_suffix(urls)
-#> [1] "co.uk"
-#> [2] "gov.uk"
+#> [1] "co.uk" "gov.uk"
 #> [3] "butthisispartoftheps.kawasaki.jp"
 ```