Log4j configuration issue #449

Closed
ronjakoi opened this issue Jan 5, 2018 · 7 comments

Comments

@ronjakoi

ronjakoi commented Jan 5, 2018

Apologies for commenting on a closed issue earlier.

If I try changing the line in log4j.properties to log4j.rootLogger=INFO, FILE_ONLY and run the command from the command line, the Collector doesn't output any logs to the terminal, but I still get this warning:

log4j:WARN No appenders could be found for logger (org.apache.http.client.protocol.ResponseProcessCookies).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

This is quite strange, because clearly the properties file is being read properly, as the configuration change is taking effect (no output to terminal).

Of course, I could also send stderr to /dev/null from my cron job, but it would be nice to receive errors and warnings by mail. That assumes warnings and errors other than this log4j one are printed to stderr and mailed along by cron.

@essiembre
Contributor

I cannot reproduce this. These lines are not printed for me. It looks as if the log4j.properties file is not loaded properly, or maybe there is no appender for org.apache.* in your file. Can you attach your full log4j.properties?

What if you hardcode the path to the log4j file in the collector-http.sh script? Does it make a difference?
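
One way to check which configuration log4j actually applies is its internal debug output. The fragment below is a sketch, not something suggested in this thread. It assumes log4j 1.2, where `log4j.debug` is a key that `PropertyConfigurator` reads from the properties file itself.

```properties
# Assumption: log4j 1.2. Makes log4j print its own configuration steps,
# so you can see which loggers and appenders it sets up from this file.
log4j.debug=true
```

Because this key only takes effect once the file has been parsed, it cannot show that a file was never found. Setting the same `log4j.debug` key as a JVM system property in the launch script should cover that case as well.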

Logging will be reworked at some point in a future release to give more flexibility to people integrating the Collector in their own solution. It may help with such issues as well. This can be tracked in #401 and Norconex/jef#6.

@ronjakoi
Author

ronjakoi commented Jan 8, 2018

Hardcoding the path in the script made no difference. If I set the path to a non-existent file, I get this:

log4j:ERROR Could not read configuration file from URL [file:/foo/log4j.properties].
java.io.FileNotFoundException: /foo/log4j.properties (No such file or directory)
	at java.io.FileInputStream.open0(Native Method)
	at java.io.FileInputStream.open(FileInputStream.java:195)
	at java.io.FileInputStream.<init>(FileInputStream.java:138)
	at java.io.FileInputStream.<init>(FileInputStream.java:93)
	at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
	at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
	at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:557)
	at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
	at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
	at com.norconex.collector.core.AbstractCollector.<clinit>(AbstractCollector.java:58)
log4j:ERROR Ignoring configuration file [file:/foo/log4j.properties].
log4j:WARN No appenders could be found for logger (com.norconex.collector.core.CollectorConfigLoader).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

Here is my complete log4j.properties:

#------------------------------------------------------------------------------
# Logging Level
#------------------------------------------------------------------------------

# Set level of information printed in log file/console
# (DEBUG > INFO > WARN > ERROR > FATAL)
# By default, use INFO
log4j.rootLogger=INFO, FILE_ONLY

# Other loggers (override above default setting)

# Default loggers for the collector:
log4j.logger.com.norconex.collector.http=INFO
log4j.logger.com.norconex.collector.core=INFO
log4j.logger.com.norconex.importer=INFO
log4j.logger.com.norconex.committer=INFO

# The following are CrawlerEvent types normally logged as INFO by a crawler.
# To disable the logging of certain event types, set their log level to 
# something higher than INFO (e.g., WARN or ERROR).
# To log additional information for an event type, set its log level to 
# something lower than INFO (e.g., DEBUG). This list is 
# non-exhaustive as some crawlers may add more:
log4j.logger.CrawlerEvent.CRAWLER_STARTED=INFO
log4j.logger.CrawlerEvent.CRAWLER_RESUMED=INFO
log4j.logger.CrawlerEvent.CRAWLER_FINISHED=INFO
log4j.logger.CrawlerEvent.REJECTED_DUPLICATE=INFO
log4j.logger.CrawlerEvent.REJECTED_FILTER=DEBUG
log4j.logger.CrawlerEvent.REJECTED_UNMODIFIED=INFO
log4j.logger.CrawlerEvent.REJECTED_NOTFOUND=INFO
log4j.logger.CrawlerEvent.REJECTED_BAD_STATUS=DEBUG
log4j.logger.CrawlerEvent.REJECTED_IMPORT=DEBUG
log4j.logger.CrawlerEvent.REJECTED_ERROR=DEBUG
log4j.logger.CrawlerEvent.DOCUMENT_PREIMPORTED=INFO
log4j.logger.CrawlerEvent.DOCUMENT_POSTIMPORTED=INFO
log4j.logger.CrawlerEvent.DOCUMENT_COMMITTED_ADD=INFO
log4j.logger.CrawlerEvent.DOCUMENT_COMMITTED_REMOVED=INFO
log4j.logger.CrawlerEvent.DOCUMENT_IMPORTED=INFO
log4j.logger.CrawlerEvent.DOCUMENT_METADATA_FETCHED=INFO
log4j.logger.CrawlerEvent.DOCUMENT_FETCHED=INFO
log4j.logger.CrawlerEvent.DOCUMENT_SAVED=INFO

log4j.logger.CrawlerEvent.REJECTED_ROBOTS_TXT=DEBUG
log4j.logger.CrawlerEvent.CREATED_ROBOTS_META=INFO
log4j.logger.CrawlerEvent.REJECTED_ROBOTS_META_NOINDEX=INFO
log4j.logger.CrawlerEvent.REJECTED_TOO_DEEP=INFO
log4j.logger.CrawlerEvent.REJECTED_CANONICAL=DEBUG
log4j.logger.CrawlerEvent.REJECTED_REDIRECTED=DEBUG
log4j.logger.CrawlerEvent.URLS_EXTRACTED=INFO

log4j.logger.org.apache=WARN
log4j.additivity.org.apache=false
#log4j.category.org.apache.velocity=WARN

# These loggers silence non-impacting errors:
log4j.logger.org.apache.pdfbox=ERROR
log4j.logger.org.apache.pdfbox.util.operator.SetTextFont=FATAL


#------------------------------------------------------------------------------
# APPENDER: FILE_ONLY
#------------------------------------------------------------------------------
# The collector programmatically adds a file appender.  To only use that file,
# specify this "FILE_ONLY" appender as the "log4j.rootLogger" instead of 
# the default "CONSOLE" value and it will ignore the console.
#
log4j.appender.FILE_ONLY=org.apache.log4j.varia.NullAppender

#------------------------------------------------------------------------------
# APPENDER: CONSOLE
#------------------------------------------------------------------------------
# Setup and adjust format for logging to console
# (Format example: "DEBUG [Class.method]: Here is the msg. "
# This is then followed by a stack trace, if an Exception was provided)
# NOTE: Using %M can be slow - it should only be used for debugging
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
#log4j.appender.CONSOLE.layout.ConversionPattern=%-5p [%C{1}.%M] %m%n
log4j.appender.CONSOLE.layout.ConversionPattern=%-5p [%C{1}] %m%n
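
The warning names a logger under `org.apache`, and the file above sets `log4j.additivity.org.apache=false` without attaching any appender to `org.apache`. In log4j 1.2, a logger with additivity turned off does not pass its events up to the root logger. One possible explanation (an assumption, not confirmed in this thread) is therefore that WARN events from `org.apache.*` never reach an appender. The sketch below shows two alternative changes that would test this. It also assumes that the file appender the Collector adds programmatically sits on the root logger.

```properties
# Option A: let org.apache events propagate to the root logger's appenders.
# Additivity defaults to true when the additivity line is left out.
log4j.logger.org.apache=WARN
#log4j.additivity.org.apache=false

# Option B (alternative to A): keep additivity off, but attach an appender
# to org.apache directly. FILE_ONLY is the NullAppender defined in this file,
# so org.apache events would be discarded rather than written anywhere.
#log4j.logger.org.apache=WARN, FILE_ONLY
#log4j.additivity.org.apache=false
```

If the warning disappears with either option, the additivity setting was the likely cause.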

@essiembre
Contributor

Sorry, I still can't reproduce it. I am afraid you'll have to live with that warning until the logging mechanism changes.

One last idea: try moving the log4j.properties file into the "classes" folder (which is on the classpath when you use the launch script).

@Krishna210414

Hi, can I prevent the Collector from creating one big log file? When I add the file property, it does not seem to create multiple log files. Can someone please share their configuration if they have already done this?

Also, I need to change the name of the log file.

@essiembre
Contributor

To reduce the log size, you can change the log levels in log4j.properties. The log file is backed up automatically at the beginning of the next run.
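
As a sketch of what changing the log levels could look like, the fragment below raises a few of the `CrawlerEvent` loggers from the properties file posted earlier in this thread above INFO, so those events are no longer written. Which events dominate a given log is an assumption here; pick the ones that matter for your own crawl.

```properties
# Raise per-document event loggers above INFO to stop logging them.
# (Which events are the noisiest is an assumption; adjust as needed.)
log4j.logger.CrawlerEvent.DOCUMENT_FETCHED=WARN
log4j.logger.CrawlerEvent.DOCUMENT_IMPORTED=WARN
log4j.logger.CrawlerEvent.URLS_EXTRACTED=WARN
log4j.logger.CrawlerEvent.REJECTED_UNMODIFIED=WARN
```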

The next major version (3.x) will give you more flexibility with logging (relying on SLF4J). Until then, I recommend modifying the launch script so that it handles the log file(s) the way you want between runs.

@essiembre
Contributor

You can now rely 100% on your own log4j configuration with the latest snapshot release. Have a look at: #593 (comment) to find out how.

@jetnet

jetnet commented Mar 25, 2021

This helps for Collector 2.9:

log4j.logger.org.apache.http.client.protocol.ResponseProcessCookies=FATAL
