<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>jmrware - URL Linkification (HTTP/FTP)</title>
<meta name="author" content="Jeff Roberson" />
<meta name="version" content="20101010_1000" />
<meta name="License" content="http://www.opensource.org/licenses/mit-license.php" />
<script type="text/javascript" src="linkify.js"></script>
<style type="text/css" media="all">
body {margin: 2em; color:#333; background:#DDB; font-family: monospace;}
h1 {text-align: center;}
p {margin: 1em 0; padding: .5em;}
p:hover {border-color: #555;}
div {margin: 1em 0; padding: 1em;}
div {border: 2px solid #555;}
.linkify {border: 2px solid #FFF;}
.linkified {border: 2px solid #555; color: #080;}
.unbalanced {color: red;}
.balanced {color: #00F;}
pre {margin: 1em 0; padding: 1em; border: 2px solid #555; font-size: 1.2em; overflow: auto;}
.regex_err {color: #FFF; background-color: #F00;}
.regex_hl {color: #FFF; background-color: #060;}
</style>
<!-- To add dynamic regex syntax highlighting and colorizing to this page,
first get jsresyntaxhighlighter.js and jsresyntaxhighlighter.css from:
http://stevenlevithan.com/regex/syntaxhighlighter/.
Both: (copyright) 2010 Steven Levithan <http://stevenlevithan.com>, MIT License.
Also get DynamicRegexHighlighter.js and ColorizeRegexSyntax.js from:
http://github.com/jmrware/DynamicRegexHighlighter/
Both: (copyright) 2010 Jeff Roberson <http://jmrware.com>, MIT License.
Copy all four files to this folder then uncomment the following four lines: -->
<!--
<link rel="stylesheet" type="text/css" href="jsresyntaxhighlighter.css" />
<script type="text/javascript" src="jsresyntaxhighlighter.js"></script>
<script type="text/javascript" src="DynamicRegexHighlighter.js"></script>
<script type="text/javascript" src="ColorizeRegexSyntax.js"></script>
-->
</head>
<body>
<h1>URL Linkification (HTTP/FTP).</h1>
<p style="text-align: center; font: bold 1.2em monospace;"><a title="Download latest version from Github" href="http://github.com/jmrware/LinkifyURL/archives/master">Version 20101010_1000</a></p>
<p>PHP version: <a href="linkify.php" title="Page is pre-linkified by PHP script prior to page load.">linkify.php</a>. JavaScript version: <a href="linkify.html" title="Page is interactively linkified by JavaScript after page load.">linkify.html</a>.</p>
<p>Click on paragraphs below to apply JavaScript linkification to un-linkified URLs.</p>
<div><b>Well-formed URL syntax (required to match 100% correctly):</b>
<p class='linkify'>Plain URLs (not delimited):<br />
foo http://example.com bar...<br />
foo http://example.com:80 bar...<br />
foo http://example.com:80/path/ bar...<br />
foo http://example.com:80/path/file.txt bar...<br />
foo http://example.com:80/path/file.txt?query=val&var2=val2 bar...<br />
foo http://example.com:80/path/file.txt?query=val&var2=val2#fragment bar...<br />
foo http://example.com/(file's_name.txt) bar... (with ' and (parentheses))<br />
foo http://[2001:0db8:85a3:08d3:1319:8a2e:0370:7348] bar... ([IPv6 literal])<br />
foo http://[2001:0db8:85a3:08d3:1319:8a2e:0370:7348]/file.txt bar... ([IPv6] with path)<br />
</p>
<p class='linkify'>URLs ending with [.!',;:?] punctuation:<br />
foo http://example.com. bar...<br />
foo http://example.com! bar...<br />
foo http://example.com' bar...<br />
foo http://example.com, bar...<br />
foo http://example.com; bar...<br />
foo http://example.com: bar...<br />
foo http://example.com? bar...<br />
</p>
<p class='linkify'>URLs within matching "()[]{}<>" delimiters:<br />
foo (http://example.com) bar...<br />
foo [http://example.com] bar...<br />
foo {http://example.com} bar...<br />
foo <http://example.com> bar... (encoded as: &lt;URL&gt;)<br />
foo <http://example.com> bar... (encoded as: &#60;URL&#62;)<br />
foo <http://example.com> bar... (encoded as: &#x3C;URL&#x3E;)<br />
foo (http://example.com/(path)/file.txt) bar... (with inside (parentheses))<br />
foo (http://example.com/path/(file.txt)) bar... (with ending (parentheses))<br />
foo [http://[2001:0db8:85a3:08d3:1319:8a2e:0370:7348]] bar... ([IPv6 literal])<br />
</p>
<p class='linkify'>URLs within matching "()[]{}<>" delimiters ending with [.!',;:?] punctuation:<br />
foo (http://example.com.) bar...<br />
foo [http://example.com!] bar...<br />
foo {http://example.com'} bar...<br />
foo <http://example.com,> bar... (encoded as: &lt;URL&gt;)<br />
foo <http://example.com;> bar... (encoded as: &#60;URL&#62;)<br />
foo <http://example.com:> bar... (encoded as: &#x3C;URL&#x3E;)<br />
foo (http://example.com/(path)/file.txt?) bar... (with inside (parentheses))<br />
foo (http://example.com/path/(file.txt).) bar... (with ending (parentheses))<br />
foo [http://[2001:0db8:85a3:08d3:1319:8a2e:0370:7348]!] bar... ([IPv6 literal])<br />
</p>
<p class='linkify'>URLs within matching quotes:<br />
foo 'http://example.com' bar...<br />
foo 'http://example.com' bar... (encoded as: &apos;URL&apos; <strong>Note 1</strong>.)<br />
foo 'http://example.com' bar... (encoded as: &#39;URL&#39;)<br />
foo 'http://example.com' bar... (encoded as: &#039;URL&#039;)<br />
foo 'http://example.com' bar... (encoded as: &#x27;URL&#x27;)<br />
foo 'http://example.com' bar... (encoded as: &#x027;URL&#x027;)<br />
foo "http://example.com" bar...<br />
foo "http://example.com" bar... (encoded as: &quot;URL&quot;)<br />
foo "http://example.com" bar... (encoded as: &#34;URL&#34;)<br />
foo "http://example.com" bar... (encoded as: &#034;URL&#034;)<br />
foo "http://example.com" bar... (encoded as: &#x22;URL&#x22;)<br />
foo "http://example.com" bar... (encoded as: &#x022;URL&#x022;)<br />
</p>
<p><strong>Note 1</strong>. The &apos; entity is not part of the HTML 4 standard, and Internet Explorer 6 does
not recognize it. If you are viewing the HTML version of this page with IE, this entity may initially appear as: "&apos;". In Firefox, Opera and Safari, it appears as "'". However, the <tt>linkify_html()</tt> function converts each &apos; to its numeric HTML entity equivalent, &#39;, so once it has run (either by clicking on the paragraph or by loading the PHP version of the page), they should all appear correctly. Note also that the <a href="http://www.w3.org/TR/2002/REC-xhtml1-20020801/#C_16">W3C recommends</a> NOT using the &apos; entity in HTML documents, and using &#39; instead. This page uses it to demonstrate how this character is handled by the <tt>Linkify()</tt> function.</p>
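<p>A minimal sketch of the entity normalization described in Note 1, assuming a plain string replace is sufficient. The name <tt>normalizeApos()</tt> is hypothetical and is not taken from <tt>linkify.js</tt>:</p>

```javascript
// Hypothetical helper (an assumption, not the code in linkify.js):
// rewrite the named entity &apos; as the numeric entity &#39;,
// which HTML 4 user agents such as IE6 understand.
function normalizeApos(html) {
  return html.replace(/&apos;/g, "&#39;");
}

console.log(normalizeApos("foo &apos;http://example.com&apos; bar..."));
// prints: foo &#39;http://example.com&#39; bar...
```

<p>Running this before linkification means the URL pattern only has to recognize the numeric forms of the single quote.</p>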
<p class='linkify'>URLs within matching quotes and ending [.!',;:?] punctuation inside:<br />
foo 'http://example.com.' bar...<br />
foo 'http://example.com!' bar... (encoded as: &apos;URL&apos; <strong>Note 1</strong>.)<br />
foo 'http://example.com'' bar... (encoded as: &#39;URL&#39;)<br />
foo 'http://example.com,' bar... (encoded as: &#039;URL&#039;)<br />
foo 'http://example.com;' bar... (encoded as: &#x27;URL&#x27;)<br />
foo 'http://example.com:' bar... (encoded as: &#x027;URL&#x027;)<br />
foo "http://example.com?" bar...<br />
foo "http://example.com." bar... (encoded as: &quot;URL&quot;)<br />
foo "http://example.com!" bar... (encoded as: &#34;URL&#34;)<br />
foo "http://example.com'" bar... (encoded as: &#034;URL&#034;)<br />
foo "http://example.com," bar... (encoded as: &#x22;URL&#x22;)<br />
foo "http://example.com;" bar... (encoded as: &#x022;URL&#x022;)<br />
</p>
<p class='linkify'>URLs within matching quotes and ending [.!',;:?] punctuation outside:<br />
foo 'http://example.com'. bar...<br />
foo 'http://example.com'! bar... (encoded as: &apos;URL&apos; <strong>Note 1</strong>.)<br />
foo 'http://example.com'' bar... (encoded as: &#39;URL&#39;)<br />
foo 'http://example.com', bar... (encoded as: &#039;URL&#039;)<br />
foo 'http://example.com'; bar... (encoded as: &#x27;URL&#x27;)<br />
foo 'http://example.com': bar... (encoded as: &#x027;URL&#x027;)<br />
foo "http://example.com"? bar...<br />
foo "http://example.com". bar... (encoded as: &quot;URL&quot;)<br />
foo "http://example.com"! bar... (encoded as: &#34;URL&#34;)<br />
foo "http://example.com"' bar... (encoded as: &#034;URL&#034;)<br />
foo "http://example.com", bar... (encoded as: &#x22;URL&#x22;)<br />
foo "http://example.com"; bar... (encoded as: &#x022;URL&#x022;)<br />
</p>
<p class='linkify'>URLs with embedded quote and ampersand HTML entities:<br />
foo http://example.com/file's_name.txt bar... ("'" encoded as: &apos; <strong>Note 1</strong>.)<br />
foo http://example.com/file's_name.txt bar... ("'" encoded as: &#39;)<br />
foo http://example.com/file's_name.txt bar... ("'" encoded as: &#x27;)<br />
foo http://example.com/file&s_name.txt bar... ("&" encoded as: &amp;)<br />
</p>
</div>
<div><b>Malformed (improperly delimited) URL syntax (may not match 100% correctly):</b>
<p class='linkify'>URLs within only opening "()[]{}<>" delimiter:<br />
foo (http://example.com bar...<br />
foo [http://example.com bar...<br />
foo {http://example.com bar...<br />
foo <http://example.com bar... (encoded as: &lt;URL)<br />
foo <http://example.com bar... (encoded as: &#60;URL)<br />
foo <http://example.com bar... (encoded as: &#x3C;URL)<br />
foo (http://example.com/(path)/file.txt bar... (<strong>Note 2</strong>.)<br />
foo (http://example.com/path/(file.txt) bar... (<strong>Note 2</strong>.)<br />
foo [http://[2001:0db8:85a3:08d3:1319:8a2e:0370:7348] bar... (<strong>Note 2</strong>.)<br />
</p>
<p class='linkify'>URLs within only closing "()[]{}<>" delimiter:<br />
foo http://example.com) bar... (<strong>Note 2</strong>.)<br />
foo http://example.com] bar... (<strong>Note 2</strong>.)<br />
foo http://example.com} bar...<br />
foo http://example.com> bar... (encoded as: URL&gt;)<br />
foo http://example.com> bar... (encoded as: URL&#62;)<br />
foo http://example.com> bar... (encoded as: URL&#x3E;)<br />
foo http://example.com/(path)/file.txt) bar... (<strong>Note 2</strong>.)<br />
foo http://example.com/path/(file.txt)) bar... (<strong>Note 2</strong>.)<br />
foo http://[2001:0db8:85a3:08d3:1319:8a2e:0370:7348]] bar... (<strong>Note 2</strong>.)<br />
</p>
<p><strong>Note 2</strong>. The linkify function demonstrated by this web page uses a single regex replace operation, which is not smart enough to exclude the trailing delimiter that is erroneously included in these examples. A smarter linkify function could handle these cases with additional logic. As an example, the <tt>analyse_links()</tt> function in <tt>linkify.js</tt> checks for balanced bracket nesting to determine which links to mark red.</p>
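<p>One way such additional logic might look is sketched below. It is not the <tt>analyse_links()</tt> function from <tt>linkify.js</tt>; the names <tt>isBalanced()</tt> and <tt>trimTrailingDelimiters()</tt> are hypothetical, and the sketch only handles stray <em>closing</em> delimiters at the end of a matched URL:</p>

```javascript
// Hypothetical helpers (assumptions, not taken from linkify.js).

// Returns true when every (), [] and {} in the string is properly nested.
function isBalanced(url) {
  const openerFor = new Map([[")", "("], ["]", "["], ["}", "{"]]);
  const stack = [];
  for (const ch of url) {
    if (ch === "(" || ch === "[" || ch === "{") {
      stack.push(ch);
    } else if (openerFor.has(ch)) {
      // A closer must pair with the most recent unmatched opener.
      if (stack.pop() !== openerFor.get(ch)) return false;
    }
  }
  return stack.length === 0;
}

// Drops trailing ")", "]" or "}" characters for as long as the
// candidate URL ends in one of them and is still unbalanced.
function trimTrailingDelimiters(url) {
  let end = url.length;
  while (
    end > 0 &&
    ")]}".includes(url[end - 1]) &&
    !isBalanced(url.slice(0, end))
  ) {
    end -= 1;
  }
  return url.slice(0, end);
}

console.log(trimTrailingDelimiters("http://example.com/path/(file.txt))"));
// prints: http://example.com/path/(file.txt)
```

<p>A URL that is already balanced, such as one ending in a legitimate "(file.txt)", is returned unchanged, so the check can be applied to every match after the regex replace.</p>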
<p class='linkify'>URLs within only opening quotes:<br />
foo 'http://example.com bar...<br />
foo 'http://example.com bar... (encoded as: &apos;URL <strong>Note 1</strong>.)<br />
foo 'http://example.com bar... (encoded as: &#39;URL)<br />
foo 'http://example.com bar... (encoded as: &#039;URL)<br />
foo 'http://example.com bar... (encoded as: &#x27;URL)<br />
foo 'http://example.com bar... (encoded as: &#x027;URL)<br />
foo "http://example.com bar...<br />
foo "http://example.com bar... (encoded as: &quot;URL)<br />
foo "http://example.com bar... (encoded as: &#34;URL)<br />
foo "http://example.com bar... (encoded as: &#034;URL)<br />
foo "http://example.com bar... (encoded as: &#x22;URL)<br />
foo "http://example.com bar... (encoded as: &#x022;URL)<br />
</p>
<p class='linkify'>URLs within only closing quotes:<br />
foo http://example.com' bar...<br />
foo http://example.com' bar... (encoded as: URL&apos; <strong>Note 1</strong>.)<br />
foo http://example.com' bar... (encoded as: URL&#39;)<br />
foo http://example.com' bar... (encoded as: URL&#039;)<br />
foo http://example.com' bar... (encoded as: URL&#x27;)<br />
foo http://example.com' bar... (encoded as: URL&#x027;)<br />
foo http://example.com" bar...<br />
foo http://example.com" bar... (encoded as: URL&quot;)<br />
foo http://example.com" bar... (encoded as: URL&#34;)<br />
foo http://example.com" bar... (encoded as: URL&#034;)<br />
foo http://example.com" bar... (encoded as: URL&#x22;)<br />
foo http://example.com" bar... (encoded as: URL&#x022;)<br />
</p>
<p class='linkify'>URLs within only closing quotes and ending [.!',;:?] punctuation inside:<br />
foo http://example.com.' bar...<br />
foo http://example.com!' bar... (encoded as: URL&apos; <strong>Note 1</strong>.)<br />
foo http://example.com'' bar... (encoded as: URL&#39;)<br />
foo http://example.com,' bar... (encoded as: URL&#039;)<br />
foo http://example.com;' bar... (encoded as: URL&#x27;)<br />
foo http://example.com:' bar... (encoded as: URL&#x027;)<br />
foo http://example.com?" bar...<br />
foo http://example.com." bar... (encoded as: URL&quot;)<br />
foo http://example.com!" bar... (encoded as: URL&#34;)<br />
foo http://example.com'" bar... (encoded as: URL&#034;)<br />
foo http://example.com," bar... (encoded as: URL&#x22;)<br />
foo http://example.com;" bar... (encoded as: URL&#x022;)<br />
</p>
<p class='linkify'>URLs within only closing quotes and ending [.!',;:?] punctuation outside:<br />
foo http://example.com'. bar...<br />
foo http://example.com'! bar... (encoded as: URL&apos; <strong>Note 1</strong>.)<br />
foo http://example.com'' bar... (encoded as: URL&#39;)<br />
foo http://example.com', bar... (encoded as: URL&#039;)<br />
foo http://example.com'; bar... (encoded as: URL&#x27;)<br />
foo http://example.com': bar... (encoded as: URL&#x027;)<br />
foo http://example.com"? bar...<br />
foo http://example.com". bar... (encoded as: URL&quot;)<br />
foo http://example.com"! bar... (encoded as: URL&#34;)<br />
foo http://example.com"' bar... (encoded as: URL&#034;)<br />
foo http://example.com", bar... (encoded as: URL&#x22;)<br />
foo http://example.com"; bar... (encoded as: URL&#x022;)<br />
</p>
</div>
<div><b>Pre-linkified URLs in HTML or BBCode syntax (should never match):</b>
<p class='linkify'>URLs preceded with "=" (i.e. inside HTML tags):<br />
foo href=http://example.com bar... (unquoted, no spacing)<br />
foo href="http://example.com" bar... (double-quoted, no spacing)<br />
foo href='http://example.com' bar... (single-quoted, no spacing)<br />
foo href = http://example.com bar... (unquoted, with spacing)<br />
foo href = "http://example.com" bar... (double-quoted, with spacing)<br />
foo href = 'http://example.com' bar... (single-quoted, with spacing)<br />
</p>
<p class='linkify'>URLs preceded with "=" (i.e. inside BBCode tags):<br />
foo [url=http://example.com/path/]LINK[/url] bar...<br />
foo [url = http://example.com/path/]LINK[/url] bar...<br />
foo [url="http://example.com/path/"]LINK[/url] bar...<br />
foo [url = "http://example.com/path/"]LINK[/url] bar...<br />
foo [url='http://example.com/path/']LINK[/url] bar...<br />
foo [url = 'http://example.com/path/']LINK[/url] bar...<br />
foo [url]http://example.com/path/[/url] bar...<br />
</p>
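<p>The cases above are rejected by the prefix group of the pattern's last alternative, which requires that the URL is not preceded by "=" (with optional whitespace or a quote) or by "]". The sketch below copies that prefix group but pairs it with a deliberately simplified URL part, so it is an illustration under that assumption rather than the page's full pattern; <tt>prefixedUrl</tt> and <tt>linkifyPlain()</tt> are hypothetical names:</p>

```javascript
// Prefix group copied from the last alternative of the page's pattern.
// The URL part is simplified (assumption: host and path characters only).
const prefixedUrl =
  /((?:^|[^=\s'"\]])\s*['"]?|[^=\s]\s+)(\b(?:ht|f)tps?:\/\/[a-z0-9\-._~\/]+)/gim;

// Hypothetical helper: re-emit the prefix ($1) and wrap the URL ($2).
function linkifyPlain(text) {
  return text.replace(prefixedUrl, '$1<a href="$2">$2</a>');
}

// A plain URL is linkified.
console.log(linkifyPlain("foo http://example.com bar"));
// prints: foo <a href="http://example.com">http://example.com</a> bar

// URLs that follow "=" or "]" are left alone.
console.log(linkifyPlain("foo href = 'http://example.com' bar"));
console.log(linkifyPlain("foo [url]http://example.com/path/[/url] bar"));
```

<p>Because JavaScript regular expressions had no lookbehind when this page was written, the prefix is captured and written back in the replacement instead of being asserted.</p>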
</div>
<div class="">
<p>Here's the regular expression that plucks URLs from text (PHP version):</p>
<pre class="regex_x">
$url_pattern = '/# Rev:20100913_0900 github.com\/jmrware\/LinkifyURL
# Match http & ftp URL that is not already linkified.
# Alternative 1: URL delimited by (parentheses).
(\() # $1: "(" start delimiter.
((?:ht|f)tps?:\/\/[a-z0-9\-._~!$&\'()*+,;=:\/?#[\]@%]+) # $2: URL.
(\)) # $3: ")" end delimiter.
| # Alternative 2: URL delimited by [square brackets].
(\[) # $4: "[" start delimiter.
((?:ht|f)tps?:\/\/[a-z0-9\-._~!$&\'()*+,;=:\/?#[\]@%]+) # $5: URL.
(\]) # $6: "]" end delimiter.
| # Alternative 3: URL delimited by {curly braces}.
(\{) # $7: "{" start delimiter.
((?:ht|f)tps?:\/\/[a-z0-9\-._~!$&\'()*+,;=:\/?#[\]@%]+) # $8: URL.
(\}) # $9: "}" end delimiter.
| # Alternative 4: URL delimited by <angle brackets>.
(<|&(?:lt|\#60|\#x3c);) # $10: "<" start delimiter (or HTML entity).
((?:ht|f)tps?:\/\/[a-z0-9\-._~!$&\'()*+,;=:\/?#[\]@%]+) # $11: URL.
(>|&(?:gt|\#62|\#x3e);) # $12: ">" end delimiter (or HTML entity).
| # Alternative 5: URL not delimited by (), [], {} or <>.
( # $13: Prefix proving URL not already linked.
(?: ^ # Can be a beginning of line or string, or
| [^=\s\'"\]] # a non-"=", non-quote, non-"]", followed by
) \s*[\'"]? # optional whitespace and optional quote;
| [^=\s]\s+ # or... a non-equals sign followed by whitespace.
) # End $13. Non-prelinkified-proof prefix.
( \b # $14: Other non-delimited URL.
(?:ht|f)tps?:\/\/ # Required literal http, https, ftp or ftps prefix.
[a-z0-9\-._~!$\'()*+,;=:\/?#[\]@%]+ # All URI chars except "&" (normal*).
(?: # Either on a "&" or at the end of URI.
(?! # Allow a "&" char only if not start of an...
&(?:gt|\#0*62|\#x0*3e); # HTML ">" entity, or
| &(?:amp|apos|quot|\#0*3[49]|\#x0*2[27]); # a [&\'"] entity if
[.!&\',:?;]? # followed by optional punctuation then
(?:[^a-z0-9\-._~!$&\'()*+,;=:\/?#[\]@%]|$) # a non-URI char or EOS.
) & # If neg-assertion true, match "&" (special).
[a-z0-9\-._~!$\'()*+,;=:\/?#[\]@%]* # More non-& URI chars (normal*).
)* # Unroll-the-loop (special normal*)*.
[a-z0-9\-_~$()*+=\/#[\]@%] # Last char can\'t be [.!&\',;:?]
) # End $14. Other non-delimited URL.
/imx';
$url_replace = '$1$4$7$10$13<a href="$2$5$8$11$14">$2$5$8$11$14</a>$3$6$9$12';
</pre>
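<p>The replacement string concatenates the captures of all five alternatives. Only one alternative participates in any given match, so the groups belonging to the others are empty and contribute nothing. The following JavaScript sketch shows the same idea with just the first two alternatives; <tt>reducedPattern</tt> and <tt>linkifyReduced()</tt> are illustrative names, not part of <tt>linkify.js</tt>:</p>

```javascript
// Reduced pattern: only the "(...)" and "[...]" alternatives.
// The URL character class is copied from the full pattern above.
const reducedPattern =
  /(\()((?:ht|f)tps?:\/\/[a-z0-9\-._~!$&'()*+,;=:\/?#[\]@%]+)(\))|(\[)((?:ht|f)tps?:\/\/[a-z0-9\-._~!$&'()*+,;=:\/?#[\]@%]+)(\])/gi;

// $1/$4 = start delimiter, $2/$5 = URL, $3/$6 = end delimiter.
// In JavaScript, a group that did not participate in the match is
// substituted as an empty string, so only one set appears in the output.
const reducedReplace = '$1$4<a href="$2$5">$2$5</a>$3$6';

function linkifyReduced(text) {
  return text.replace(reducedPattern, reducedReplace);
}

console.log(linkifyReduced("foo (http://example.com) bar"));
// prints: foo (<a href="http://example.com">http://example.com</a>) bar
```

<p>The URL character class itself allows ")" and "]", so the greedy match first runs past the closing delimiter and then backtracks by one character to let the delimiter group match.</p>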
<p>Here's the JavaScript version (with some added line breaks):</p>
<pre class="regex">
var url_pattern = /(\()((?:ht|f)tps?:\/\/[a-z0-9\-._~!$&'()*+,;=:\/?#[\]@%]+)(\))
|(\[)((?:ht|f)tps?:\/\/[a-z0-9\-._~!$&'()*+,;=:\/?#[\]@%]+)(\])|(\{)((?:ht|f)tps?
:\/\/[a-z0-9\-._~!$&'()*+,;=:\/?#[\]@%]+)(\})|(<|&(?:lt|#60|#x3c);)((?:ht|f)tps?:
\/\/[a-z0-9\-._~!$&'()*+,;=:\/?#[\]@%]+)(>|&(?:gt|#62|#x3e);)|((?:^|[^=\s'"\]])\s
*['"]?|[^=\s]\s+)(\b(?:ht|f)tps?:\/\/[a-z0-9\-._~!$'()*+,;=:\/?#[\]@%]+(?:(?!&(?:
gt|#0*62|#x0*3e);|&(?:amp|apos|quot|#0*3[49]|#x0*2[27]);[.!&',:?;]?(?:[^a-z0-9\-.
_~!$&'()*+,;=:\/?#[\]@%]|$))&[a-z0-9\-._~!$'()*+,;=:\/?#[\]@%]*)*[a-z0-9\-_~$()*+
=\/#[\]@%])/img;
</pre>
<p id="headerlinks">Dynamic regex syntax highlighting provided by: <a href="http://jmrware.com/articles/2010/dynregexhl/DynamicRegexHighlighter.html">DynamicRegexHighlighter</a>.</p>
</div>
<div>
<p>Happy regexing!<br />
©2010 Jeff Roberson.<br />
Released as open source under the <a href="http://www.opensource.org/licenses/mit-license.php">MIT License</a>.</p>
</div>
</body></html>