Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

alter default config file processing #772

Closed
ler762 opened this issue Nov 8, 2018 · 13 comments
Closed

alter default config file processing #772

ler762 opened this issue Nov 8, 2018 · 13 comments

Comments

@ler762
Copy link
Contributor

ler762 commented Nov 8, 2018

The current configuration file processing behavior is

  read /etc/tidy.conf
  if evar HTML_TIDY exists
    read that file
  else
    read ~/tidy.conf

I don't see a way to prevent tidy from reading /etc/tidy.conf or to tell tidy to be quiet while processing /etc/tidy.conf other than adding a show-info: no as the first line of the file.

I want to be able to tell tidy not to read ~/.tidyrc or /etc/tidy.conf

If nothing else, it'd be nice to be able to run the regression tests without having to make sure ~/.tidyrc doesn't exist

So change the config file processing to

  if no '-config' option is specified then
    if evar HTML_TIDY exists
      read that file
    else if ~/tidy.conf exists
      read that file
    else if /etc/tidy.conf exists
      read that file

Note that if you have a ~/.tidyrc and you want /etc/tidy.conf read you'll need to have
config /etc/tidy.conf in your ~/.tidyrc

earlier discussion starts here:
#752 (comment)

What do you/anyone think about having tidy process the command line options first and allowing them to over-ride default behaviors?

@ler762
Copy link
Contributor Author

ler762 commented Nov 13, 2018

I find it easier to talk about a change if there's actual code to talk about.
See #774

@geoffmcl
Copy link
Contributor

@ler762 this seems to be an ever changing landscape ;=))

I had warmed to adding a new options, something like -no-config, or whatever... see Issue 30, to prevent these potentially 3 config files being loaded, but now you seem to have dropped this... have I got it wrong?

Such a non-invasive new option could certainly help with the regression test case... you have added a pre scan of the command line, which I think would be necessary for such a new option to work... although we could make it just if this is the first command line option, and only check one item...

I could see -

  if not -no-config option in command line...
    if defined and exists TIDY_CONFIG_FILE
       read TIDY_CONFIG_FILE
    if env HTML_TIDY exists
       read that file
    else if defined and exists TIDY_USER_CONFIG_FILE
       read TIDY_USER_CONFIG_FILE
  process command line...
    ...
      if arg == config
         read config
    ...

But it is a very minimal use case - such regression testers should be able to handle it... so can be skipped... minor...

Now, if I read this and the code you suggest - which could have been an embedded or attached diff/patch, rather than a PR, but - you seem to want to change the meaning of the -config <file> option, and use this as a configSpecified, ie no config signal...

And in addition, change the order of the files... put TIDY_USER_CONFIG_FILE first, then an else for ENV HTML_TIDY, then else for TIDY_CONFIG_FILE... Why?

I am not sure I understand why you want/need to potentially break many user configs that could be out there... tidy has been that way for more than 10 years... in fact the ENV HTML_TIDY and/or ~/.tidyrc since original tidy in 2000... nearly 20 years...

To change from the ordered additive config way it is now, to an else config case, different order! Why? What reason?

Can you give a use case? thanks...

@ler762
Copy link
Contributor Author

ler762 commented Nov 15, 2018

@geoffmcl my landscape has stayed the same :) I don't like tidy reading the default init files no matter what, so I'm coming at it from the viewpoint of -no-config is the shorter, more obvious equivalent of -config /dev/null. Both have the effect of telling tidy not to read the default init files.

I had warmed to adding a new options, something like -no-config ... but now you seem to have dropped this... have I got it wrong?

We can make it two separate issues if you like.

  1. add a -no-config option to tell tidy not to read any of the default init files.
  2. modify -config <filename> to also mean tidy doesn't read any of the default init files.

My only question for 1 is do you allow `-no-config`` anywhere on the command line or does it have to be the first option? I don't like positional parameters, so I'd go with anywhere.

If you don't do 2 then you have to modify every test script to include -no-config on the command line.
You also end up with the ugly kludge of -no-config -config <filename>

Such a non-invasive new option could certainly help with the regression test case... you have added a pre scan of the command line, which I think would be necessary for such a new option to work... although we could make it just if this is the first command line option, and only check one item...

I don't see any reason -no-config should be required to be the first option.

Now, if I read this and the code you suggest - you seem to want to change the meaning of the -config option, and use this as a configSpecified, ie no config signal...

Yes, I want to change the meaning of -config <file> to also mean don't read the default init files.

And in addition, change the order of the files... put TIDY_USER_CONFIG_FILE first, then an else for ENV HTML_TIDY, then else for TIDY_CONFIG_FILE... Why?

Because I like having just one init file and I don't like hidden variables (ie environment variables).
If we're changing the meaning of -config then this is the time to change how the init files are processed.

I am not sure I understand why you want/need to potentially break many user configs that could be out there... tidy has been that way for more than 10 years... in fact the ENV HTML_TIDY and/or ~/.tidyrc since original tidy in 2000... nearly 20 years...

The various mis-behaviors listed in #752 have existed for how long? My feeling is that if people were actually using /etc/tidy.conf and/or ~/.tidyrc these issues would have been raised & fixed long ago. So I'm guessing the potential breakage is small.

But this isn't something I care about all that much, so how about we leave it until -no-config & -config discussions are finished?

@geoffmcl
Copy link
Contributor

@ler762 maybe a new option could be say -no-env-config, given that -help-env shows these files... do not think -no-env-config -config <file> is an ugly kludge ;=))

But agree, such a new option is not really necessary... has a minor use case... so can be dropped, if no one is interested in coding it... and it should probably be a separate issue anyway...

It seems the #752 mis-behaviors are related the mute option, which was only introduced about a year ago... agree that aggressive use of this option would have lead to these wrinkles being smoothed out earlier, as with any new option... as I am trying to do in PR #764...

That gives us no indication of the usage of ~/.tidyrc, over nearly 20 years, nor of /etc/tidy.conf, over more than 10 years...

But why guess at the potential breakage when it can be avoided?

The current additive order of config files is not wrong!

It is as intended... Why change it? What perceived bug is being addressed?

No use case is offered for this order change... Without that, this issue wavers on a Won't Fix label...

Look forward to further feedback... thanks...

@ler762
Copy link
Contributor Author

ler762 commented Nov 18, 2018

@geoffmcl ok - what I need is a -no-config option to tell tidy not to read /etc/tidy.conf or ~/.tidyrc.
That or to delete my ~/.tidyrc ...

I have a ~/.tidyrc and the way I build tidy is

$ cat build-and-test.sh
#!/bin/sh

# starting from my local tidy git repository
cd tidy-html5/build/cmake

cmake ../.. -DCMAKE_BUILD_TYPE=Release \
            -DBUILD_SHARED_LIB:BOOL=OFF \
            -DCMAKE_INSTALL_PREFIX=/usr/local \
            -DTIDY_RC_NUMBER=ler762

# make && make install
  make
# go back up to the git <root-dir>
cd ../../..
# and down to where the regression test script lives
cd tidy-html5-tests/tools-sh
# run the test
./testall.sh ../../tidy-html5/build/cmake/tidy
# and show which tests returned different results
diff -U5 ../cases/testbase-expects/ ../cases/testbase-results

The regression tests fail if I have a ~/.tidyrc
If I build tidy with a patch that changes the -config <file> option to also mean "don't read /etc/tidy.conf or ~/.tidyrc" the regression tests work.

My other use case is for tidy used in a docbook => html make file. If I have a ~/.tidyrc then the tidy'ed result isn't going to match what any other team member gets when they build the documentation.

Adding a -no-config option fixes both problems.

maybe a new option could be say -no-env-config, given that -help-env shows these files

There's already a -config option, so it seems like -no-config is the obvious choice for the "don't read the default init files" option.

The current additive order of config files is not wrong!

That I can't tell tidy not to read any init file is, at least for me, a failing. And we touched on this before - you're on windows, so you don't see the problem. CMakeLists.txt has

#------------------------------------------------------------------------
# Runtime Configuration File Support
#   By default on Unix-like systems when building for the console program,
#   support runtime configuration files in /etc/ and in ~/. To prevent this,
#   set ENABLE_CONFIG_FILES to NO. Specify -DTIDY_CONFIG_FILE and/or
#   -DTIDY_USER_CONFIG_FILE to override the default paths in tidyplatform.h.
# @note: this section refactored to support #584.
#------------------------------------------------------------------------
if ( UNIX AND SUPPORT_CONSOLE_APP )
  option ( ENABLE_CONFIG_FILES "Set to OFF to disable Tidy runtime configuration file support" ON )
  <.. snip ..>
else ()
  option ( ENABLE_CONFIG_FILES "Set to ON to enable Tidy runtime configuration file support" OFF )

On the other hand, nobody else has commented on this issue so how about we just add a -no-config option to tell tidy not to read /etc/tidy.conf or ~/.tidyrc and close this out?

@geoffmcl
Copy link
Contributor

@ler762 seems two separate things... see if I can deal with each...

The regression tests fail if I have a ~/.tidyrc

This is a BUG and should be fixed!

It includes TIDY_CONFIG_FILE and/or TIDY_USER_CONFIG_FILE, what ever their value, and maybe ENV HTML_TIDY... in Windows and UNIX at least... this should probably be a separate issue... will try to get around to doing that, if no one beats me to it...

I am not talking about any valid, invalid or comment lines they may contain. The bug is triggered by the files presence, even as a null file...

Try $ sudo touch /etc/tidy.conf, if that is the compiled in file name for TIDY_CONFIG_FILE, then run tests 1003361, 1004051, 1004512, 1014993, ... and probably others...

FWIIW I use cmake ..\.. -DENABLE_CONFIG_FILES:BOOL=ON -DTIDY_CONFIG_FILE="C:/Users/user/AppData/Local/tidy.conf" -DTIDY_RC_NUMBER=Icfg-1 [others] in Windows, for testing, since the unix default of /etc/tidy.conf is not very applicable to a Windows file system...

Would never want to add a no config option just to paper over this problem! So this is certainly not a use case to add such an option...

"don't read the default init files" option

While the above bug is not a valid reason for this new option, it is also true that even valid options can upset, change tidy's output when conducting the regression tests, which require a strict, clean environment be used...

As suggested, the tools to run the tests, like your build-and-test.sh, could be adjusted to check more, and warn if there appears something unclean in the runtime environment, ...

Like using -help-env - sorry for my bad scripting, just an experiment -

#!/bin/sh
#< tidy-env.sh

TMPVAL=`tidy -help-env`
if [ ! "$?" = "0" ]; then
    echo "Running 'tidy -help-env' FAILED"
    exit 1
fi

echo "Value in TMPVAL"
echo "$TMPVAL"
echo "process it..."
TMPLINE=0
TMPEXIT=0
echo "$TMPVAL" | while read -r a; do
    TMPLINE=`expr $TMPLINE + 1`;
    if [ ! -z "$a" ]; then
        TMPCNT=`echo "$a" | wc -w`;
        if [ "$TMPCNT" = "1" ]; then
            if [ -f "$a" ]; then
                echo "$TMPLINE: WARNING: File '$a' FOUND!";
                TMPEXIT=1;
            else
                echo "$TMPLINE: File '$a' - NOT FOUND!";
            fi
        fi
    fi
done
exit $TMPEXIT
# eof

So that could be done for the test tools...

I repeat, given the quite weak use case, I was warming say -no-env-config, or even -no-def-config option...

do not think -no-env-config -config <file> is an ugly kludge ;=))

But do think your -no-config -config <file> could be read that you will also exclude the -config <file>... needs something extra...

Maybe even -no-pre-configs-please... but agree it should be anywhere on the command line...

And just so we are clear, would not particularly change the tests tools to include this option... but could see adding something like the above script, probably better written, to check and warn...

Such a tester is entirely responsible for his/her tidy setup... and at least in the Windows cmd tools, such an additional option can be added easily, if the user, for some reason of their own, does not want to, at least temporarily, remove, rename, ... offending files...

So once the name is chosen, and agreed, would at least consider some patches and/or PR to implement this... suggest patches, or pseudo code, since have some ideas about this... including docs... thanks...

@ler762
Copy link
Contributor Author

ler762 commented Nov 19, 2018

@geoffmcl I agree there's two separate issues here:

  • Some of the regression tests fail if you do a touch /etc/tidy.conf
  • The program design flaw of omitting an option to skip the processing of the default init files.

touch /etc/tidy.conf breaking some of the regression tests should be a separate bug discussion.

For fixing the design flaw, ''-no-env-config'' sounds too much like "ignore the HTML_TIDY env variable" so I'd go with ''-no-def-config'' allowed anywhere on the command line.

And just so we are clear, would not particularly change the tests tools to include this option...

Yup, changing the tests to include this option would be a pain.

So once the name is chosen, and agreed,

I'm ok with ''-no-def-config''

would at least consider some patches and/or PR to implement this... suggest patches, or pseudo code, since have some ideas about this... including docs... thanks...

In console/tidy.c, just before

    /*
     * Look for default configuration files using any of

add

    for ( int i = 1; i < argc && !skipDefaultInit; i++ )
    {
        ctmbstr arg = argv[i] + 1;
        if ( strcasecmp(arg, "no-def-config" )
        {
            skipDefaultInit = 1;
            break;
        }
    }

    if ( ! skipDefaultInit )
    {

and add the closing bracket right after

#endif /* TIDY_USER_CONFIG_FILE */

@ler762
Copy link
Contributor Author

ler762 commented Nov 19, 2018

drat if ( strcasecmp(arg, "no-def-config" ) == 0 )

@geoffmcl
Copy link
Contributor

@ler762 no I do not think this is a design flaw... it was by design... a choice... an option...

And omitting an option to skip the processing of the default init files is not a fault... and a request to add it now needs a use case...

While I was beginning to think there was a very minor regression test use case for such an option, I have now decided that is not so true...

If a regression tester can not look after the systems runtime environment, I certainly do not want to give them a handy out... run them in a clean VM... add the above script to warn, if they can't remember... be aware... does not seem too much to ask if they really need to run the tests, which are quite inadequate, as tests go...

As a sysadmin I might not want to allow any tidy user in my system to be able to avoid my /etc/tidy.conf... they would have sudo permission if trusted anyway...

So it seems this issue dies for want of a use case... Simply, to start again, Why is this needed?

Any last comments... thanks...

@ler762
Copy link
Contributor Author

ler762 commented Nov 23, 2018

The use case is simple: I'd like to have a ~/.tidyrc and also be able to get reproducible results when tidy is called in a script.

Note that my definition of reproducible results means that anyone using the same version of tidy as I am, using the same script as I am, gets the same results that I do no matter what's in their/my ~/.tidyrc, /etc/tidy.conf, or HTML_TIDY setting.

The regression tests are a good example - it'd be nice if they produced the same results if I have a ~/.tidyrc or not. It seems to me that the easiest way to do that is to have a -no-config option that says don't read any of the default init files

Same deal with a project I help out with -- the documentation is created using docbook to generate html files and tidy is used to clean up the generated html files. It'd be nice if I could do a make doc and get the same results as everybody else on the team even if I have a ~/.tidyrc. The easiest way to do that is add -no-config to the tidy call in the make file.

Or I could just delete my ./tidyrc

@geoffmcl
Copy link
Contributor

@ler762 if used in a script, why not ren/mv ~/.tidyrc xxxx, temporarily... and restore... seems easy...

While easy to achieve, like PR #774, it seems much more difficult to change tidy traditional order of parsing... but thanks for the thoughts...

No feedback in months...

Can we close this/these? thanks...

@ler762
Copy link
Contributor Author

ler762 commented Feb 10, 2019

if used in a script, why not ren/mv ~/.tidyrc xxxx, temporarily... and restore... seems easy...

Because it's so much easier to make sure the script invokes tidy as "tidy -no-config ..." than it is to temporarily rename ~/.tidyrc, call tidy and then undo the earlier rename.

If there is an /etc/tidy.conf you can't move or rename it without affecting everybody else that calls tidy during your

  • move /etc/tidy.conf aside
  • call tidy
  • restore /etc/tidy.conf

processing window.

@geoffmcl
Copy link
Contributor

@ler762 again no further feedback, for months... except so much easier... which is in debate...

Only a dev, running regression tests, maybe needs to do this... and scripting together build and test seems the wrong way to go... IMO...

Anyway, as yet, do not see sufficient arguments for changing tidy.c current config file processes, nor the introduction of a new option -no-config... sorry...

Accordingly, closing this... feel free to re-open...

Or better still open a new clean issue... thanks...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants