.TH DETEX 1 "August 12, 1993" "Purdue University"
.SH NAME
detex \- a filter to strip \fITeX\fP commands from a .tex file
.SH SYNOPSIS
.B detex
[ \fB\-clnrstw1\fR ] [ \fB\-e\fI environment-list\fR ]
[ \fIfilename\fR[.tex] ... ]
.SH DESCRIPTION
.I Detex
reads each file in sequence, removes all comments and
.I TeX
control sequences
and writes the remainder on the standard output.
All text in math mode and display mode is removed.
By default,
.I detex
follows \einput commands.
If a file cannot be opened, a warning message is
printed and the command is ignored.
If the
.B \-n
option is used, no \einput or \einclude commands will be processed.
This allows single file processing.
If no input file is given on the command line,
.I detex
reads from standard input.
.PP
If the magic sequence ``\ebegin{document}'' appears in the text,
.I detex
assumes it is dealing with
.I LaTeX
source and recognizes additional constructs used in
.IR LaTeX .
These include the \einclude and \eincludeonly commands.
The
.B \-l
option can be used to force
.I LaTeX
mode and the
.B \-t
option can be used to force
.I TeX
mode regardless of input content.
.PP
Text in various environment modes of
.I LaTeX
is ignored. The default modes are array, eqnarray, equation, longtable,
picture, tabular and verbatim. The
.B \-e
option can be used to specify a comma-separated
.I environment-list
of environments to ignore. The list replaces the defaults, so specifying an
empty list causes no environments to be ignored.
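.PP
The replace-not-extend behavior of
.B \-e
can be sketched as follows. This is only an illustration in Python of the
rule described above, not code from
.IR detex ;
the names DEFAULT_ENVIRONMENTS and ignored_environments are hypothetical.
.PP
.nf
```python
# Hypothetical sketch of the -e rule; not taken from the detex source.
DEFAULT_ENVIRONMENTS = ["array", "eqnarray", "equation", "longtable",
                        "picture", "tabular", "verbatim"]


def ignored_environments(e_option=None):
    """Return the environments to ignore.

    e_option is the text given to -e, or None when -e was not used.
    """
    if e_option is None:
        # No -e on the command line: fall back to the documented defaults.
        return list(DEFAULT_ENVIRONMENTS)
    # A given list replaces the defaults; an empty string yields no names.
    return [name for name in e_option.split(",") if name]
```
.fi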
.PP
The
.B \-c
option can be used in
.I LaTeX
mode to have
.I detex
echo the arguments to \ecite,
\eref, and \epageref macros. This can be useful when sending the output to
a style checker.
.PP
.I Detex
assumes the standard character classes are being used for
.IR TeX .
.I Detex
allows white space between control sequences
and magic characters like `{' when recognizing things like
.I LaTeX
environments.
.PP
The
.B \-r
option tries to naively replace $..$, $$..$$, \e(..\e) and \e[..\e]
with nouns and verbs (in particular, "noun" and "verbs")
in a way that keeps sentences readable.
.PP
If the
.B \-w
flag is given, the output is a word list, one `word' (string of two or more
letters and apostrophes beginning with a letter)
per line, with all other characters ignored.
Without \fB\-w\fR the output follows the original,
with the deletions mentioned above. Newline characters are
preserved where possible
so that the lines of output match the input as closely as possible.
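.PP
The definition of a `word' used by
.B \-w
can be approximated with a regular expression. The Python sketch below
assumes ASCII letters only and is an illustration of the stated rule, not
the scanner that
.I detex
actually uses; the names WORD and word_list are hypothetical.
.PP
.nf
```python
import re

# A letter followed by one or more letters or apostrophes,
# that is, a string of two or more such characters starting with a letter.
WORD = re.compile("[A-Za-z][A-Za-z']+")


def word_list(text):
    """Return the words found in text, in order of appearance."""
    return WORD.findall(text)


if __name__ == "__main__":
    # Print one word per line, as the -w output is described.
    for word in word_list("It's a test of I and we"):
        print(word)
```
.fi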
.PP
The
.B \-1
option will prefix each printed line with `filename:linenumber:' indicating
where that line is coming from in terms of the original (La)TeX document.
.PP
The TEXINPUTS environment variable is used to find \einput and \einclude
files. Like \fITeX\fP, it interprets a leading or trailing `:' as the default
TEXINPUTS.
It does \fInot\fP support the `//' directory expansion magic sequence.
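.PP
The treatment of a leading or trailing `:' can be sketched as follows.
This Python fragment is only an illustration: DEFAULT_TEXINPUTS is a
placeholder value rather than the real built-in path, search_path is a
hypothetical name, and skipping an empty entry in the middle of the list
is an assumption, since only the leading and trailing cases are
documented here.
.PP
.nf
```python
# Placeholder default; the real built-in search path may differ.
DEFAULT_TEXINPUTS = ["/usr/local/tex/inputs"]


def search_path(texinputs):
    """Expand a TEXINPUTS value into a list of directories to search."""
    entries = texinputs.split(":")
    last = len(entries) - 1
    path = []
    for index, entry in enumerate(entries):
        if entry:
            path.append(entry)
        elif index == 0 or index == last:
            # An empty first or last entry stands for the default path.
            path.extend(DEFAULT_TEXINPUTS)
        # Assumption: an empty entry in the middle is ignored.
    return path
```
.fi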
.PP
Detex now handles the basic \fITeX\fP ligatures as a special case, replacing the
ligatures with acceptable character substitutes. This eliminates
spelling errors introduced by merely removing them. The ligatures are
\eaa, \eae, \eoe, \ess, \eo, \el (and their upper-case
equivalents). The special "dotless" characters \ei and \ej are also
replaced with i and j respectively.
.PP
Note that previous versions of
.I detex
would replace control sequences with a space character to prevent words
from running together.
However, this caused accents in the middle of words to break words, generating
"spelling errors" that were not desirable.
Therefore, the new version merely removes these accents.
The old functionality can be essentially duplicated by using the
.B \-s
option.
.SH SEE ALSO
.BR tex (1)
.SH DIAGNOSTICS
Nesting of \einput is allowed but the number of opened files must not
exceed the system's limit on the number of simultaneously opened files.
.I Detex
ignores unrecognized option characters after printing a warning message.
.SH AUTHOR
Originally written by Daniel Trinkle, Computer Science Department,
Purdue University.
.PP
Maintained by Piotr Kubowicz <https://github.com/pkubowicz/opendetex>.
.SH BUGS
.I Detex
is not a
.I TeX
interpreter (it essentially reads the input with a (f)lex program), so
it is easily confused by some constructs. Most errors result in too much
rather than too little output.
.PP
Running \fILaTeX\fR
source without a ``\ebegin{document}''
through \fIdetex\fR may produce
errors.
.PP
Suggestions for improvements are (mildly) encouraged.