-
Notifications
You must be signed in to change notification settings - Fork 0
a crawler for checking that all site is translated to a specified language
License
bogdartysh/languagechecker-crawler
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
This crawler is designed for single purpouse: to check that available web site (or portion of it) is fully translated from one language to another. As a user you could open distro folder, find the last shipped distro, edit task.properties file (set task_external_id, start_urls, url_pattern, origin_language_code, shouldbe_language_code, pages_to_fill_all_forms) and run from terminal start.sh (or start.bat). Results will be in the results folder Developer will need to: 1. install maven (maven.apache.org) and Java 8+ 3. open terminal and execute 3.1 mvn clean install 3.2 mvn assembly:single 3.3 cd target 3.4 adjust settings of the project (file task.properties, examples in tasks folder) 3.5 execute : java -jar -Xsx1024m -Xmx1024 languagechecker-crawler-1.0-SNAPSHOT-jar-with-dependencies.jar >results.csv results will be in results.csv (some times in cases of error that csv will be not readable in OpenOffice, so please check it with editor first). log files are in target/logs Dictionary files of you language should be named as ru.dict, en.dict, pl.dict and be places in target/dict folder. These files should be in utf-8 encoding and should consist of all words of chosen language. For example please see en.dict and pl.dict files (they are extracted from aspell project) For license info please see license file
About
a crawler for checking that all site is translated to a specified language
Resources
License
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published