Merge pull request #221 from DedSecInside/feature/torbotv2.1
Torbot v2.1.0
PSNAppz authored Aug 28, 2022
2 parents 79030c8 + 2aa776c commit 13841a8
Showing 84 changed files with 646 additions and 16,330 deletions.
2 changes: 2 additions & 0 deletions .flake8
@@ -0,0 +1,2 @@
[flake8]
max-line-length = 119
3 changes: 2 additions & 1 deletion .gitignore
@@ -21,7 +21,7 @@ __pycache*
__pycache__/

# Misc
torBot

.*.swp
.ropeproject/
.idea/
@@ -33,3 +33,4 @@ venv/
.DS_Store
.env
data/*.csv
torbot/modules/nlp/training_data/
2 changes: 0 additions & 2 deletions .hound.yml

This file was deleted.

66 changes: 66 additions & 0 deletions .style.yapf
@@ -0,0 +1,66 @@
[style]
based_on_style=pep8

# The column limit.
column_limit=119

# Align closing bracket with visual indentation.
align_closing_bracket_with_visual_indent=False

allow_split_before_dict_value = False

# Put closing brackets on a separate line, dedented, if the bracketed
# expression can't fit in a single line. Applies to all kinds of brackets,
# including function definitions and calls. For example:
#
# config = {
# 'key1': 'value1',
# 'key2': 'value2',
# } # <--- this bracket is dedented and on a separate line
#
# time_series = self.remote_client.query_entity_counters(
# entity='dev3246.region1',
# key='dns.query_latency_tcp',
# transform=Transformation.AVERAGE(window=timedelta(seconds=60)),
# start_ts=now()-timedelta(days=3),
# end_ts=now(),
# ) # <--- this bracket is dedented and on a separate line
dedent_closing_brackets=True

# Insert a space between the ending comma and closing bracket of a list,
# etc.
space_between_ending_comma_and_closing_bracket=False

# Split after the opening parenthesis which surrounds an expression if it doesn't
# fit on a single line.
split_before_expression_after_opening_paren=True

# Set to True to split list comprehensions and generators that have
# non-trivial expressions and multiple clauses before each of these
# clauses. For example:
#
# result = [
# a_long_var + 100 for a_long_var in xrange(1000)
# if a_long_var % 10]
#
# would reformat to something like:
#
# result = [
# a_long_var + 100
# for a_long_var in xrange(1000)
# if a_long_var % 10]
split_complex_comprehension=True

# Insert a blank line before a 'def' or 'class' immediately nested
# within another 'def' or 'class'. For example:
#
# class Foo:
# # <------ this blank line
# def method():
# ...
blank_line_before_nested_class_or_def=True

# The i18n function call names. The presence of this function stops
# reformatting on that line, because the string it has cannot be moved
# away from the i18n comment.
i18n_function_call=['_']
31 changes: 31 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,37 @@
--------------------
All notable changes to this project will be documented in this file.

## 2.1.0

### Added
* GoTor API - A Golang implementation of Core TorBot functionality.
* Phone number extractor - Extracts phone numbers from urls.
* Integrated NLP module with TorBot
* Major code refactoring

### Removed
* No longer using the tree module
* Poetry Implementation removed
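
The changelog entry for the phone number extractor does not say how numbers are detected. As a minimal sketch only, assuming numbers are taken from `tel:` links in a page's HTML, the idea could look like the following. The function name `extract_phone_numbers` and the regular expression are hypothetical illustrations, not TorBot's or GoTor's actual code.

```python
import re
from typing import List

# Assumption: phone numbers appear in anchors such as <a href="tel:+1-555-010-9999">.
# This pattern is illustrative and is not taken from the TorBot source.
TEL_LINK = re.compile(r'href=["\']tel:([^"\']+)["\']', re.IGNORECASE)


def extract_phone_numbers(html: str) -> List[str]:
    """Return the unique phone numbers found in tel: links, in page order."""
    numbers: List[str] = []
    for raw in TEL_LINK.findall(html):
        # Keep only digits and "+" so differently formatted duplicates collapse.
        cleaned = re.sub(r"[^\d+]", "", raw)
        if cleaned and cleaned not in numbers:
            numbers.append(cleaned)
    return numbers
```

A regex is enough for a sketch; a real crawler would more likely parse the page with beautifulsoup4, which is already listed among the Python dependencies.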

## 2.0.0

### Added
* Fix data collection and add progress indicator by @KingAkeem in #192
* convert port to integer by @KingAkeem in #193
* Use hiddenwiki.org as default URL for collecting data by @KingAkeem in #194
* Bump jinja2 from 2.11.2 to 2.11.3 in /src/api by @dependabot in #200
* Simplify LinkNode and add new display by @KingAkeem in #202
* Remove live flag by @KingAkeem in #203
* Poetry Implementation by @NeoLight1010 in #206
* Delete .DS_Store by @stefins in #204
* Fix the basic functionality of tree features by @KingAkeem in #214
* Save results as json by @KingAkeem in #215
* Organize data file location by @KingAkeem in #216
* Add CodeTriage link and image by @KingAkeem in #213
* Add website classification by @KingAkeem in #218
* Use GoTor HTTP service by @KingAkeem in #219


## 1.4.0 | Present

### Added
40 changes: 40 additions & 0 deletions CITATION.cff
@@ -0,0 +1,40 @@
# @InProceedings{10.1007/978-981-15-0146-3_19,
# author="Narayanan, P. S.
# and Ani, R.
# and King, Akeem T. L.",
# editor="Ranganathan, G.
# and Chen, Joy
# and Rocha, {\'A}lvaro",
# title="TorBot: Open Source Intelligence Tool for Dark Web",
# booktitle="Inventive Communication and Computational Technologies",
# year="2020",
# publisher="Springer Singapore",
# address="Singapore",
# pages="187--195",
# abstract="The dark web has turned into a dominant source of illegal activities. With several volunteered networks, it is becoming more difficult to track down these services. Open source intelligence (OSINT) is a technique used to gather intelligence on targets by harvesting publicly available data. Performing OSINT on the Tor network makes it a challenge for both researchers and developers because of the complexity and anonymity of the network. This paper presents a tool which shows OSINT in the dark web. With the use of this tool, researchers and Law Enforcement Agencies can automate their task of crawling and identifying different services in the Tor network. This tool has several features which can help extract different intelligence.",
# isbn="978-981-15-0146-3"
# }

cff-version: 1.2.0
message: "If you use this software, please cite the following paper:"
authors:
- family-names: P. S.
given-names: Narayanan
affiliation: Department of Computer Science and Applications, Amrita Vishwa Vidyapeetham, Amritapuri, Kollam, India
- family-names: Akeem T. L.
given-names: King
affiliation: USPA Technologies
- family-names: R
given-names: Ani
affiliation: Department of Computer Science and Applications, Amrita Vishwa Vidyapeetham, Amritapuri, Kollam, India
keywords:
- tor
- research
- osint
identifiers:
- type: doi
value: 10.1007/978-981-15-0146-3_19
license: GNU Public License
repository-code: https://github.com/DedSecInside/TorBot
title: TorBot - Open Source Intelligence Tool for Dark Web
date-released: 2020-01-30
101 changes: 49 additions & 52 deletions README.md
@@ -45,6 +45,7 @@ If its a new module, it should be put inside the modules directory.
The branch name should be your new feature name in the format <Feature_featurename_version(optional)>. For example, <i>Feature_FasterCrawl_1.0</i>.
Contributor name will be updated to the below list. 😀
<br>

<b> NOTE : The PR should be made only to `dev` branch of TorBot. </b>

### OS Dependencies
@@ -54,53 +55,62 @@ Contributor name will be updated to the below list. 😀

### Python Dependencies

(see pyproject.toml for more detail)
- beautifulsoup4
- pyinstaller
- PySocks
- termcolor
- requests
- requests_mock
- yattag
- numpy

(see requirements.txt for more details)
altgraph==0.17.2
beautifulsoup4==4.11.1
certifi==2022.5.18.1
charset-normalizer==2.0.12
decorator==5.1.1
ete3==3.1.2
idna==3.3
macholib==1.16
numpy==1.22.4
progress==1.6
pyinstaller==5.1
pyinstaller-hooks-contrib==2022.7
PySocks==1.7.1
python-dotenv==0.20.0
requests==2.28.0
requests-mock==1.9.3
six==1.16.0
soupsieve==2.3.2.post1
termcolor==1.1.0
threadsafe==1.0.0
urllib3==1.26.9
validators==0.20.0
yattag==1.14.0
pyqt5==5.15.6 (Install using apt/brew if pip installation fails.)
### Golang Dependencies
- https://github.com/KingAkeem/gotor (This service needs to be run in tandem with TorBot)

## Basic setup
## Installation

### From source
Before you run the torBot make sure the following things are done properly:

* Run tor service
`sudo service tor start`

* Make sure that your torrc is configured to SOCKS_PORT localhost:9050

* Install [Poetry](https://python-poetry.org/docs/)
* Open a new terminal and run `cd gotor && go run main.go -server`

* Disable Poetry virtualenvs (not required)
`poetry config settings.virtualenvs.create false`
* Install TorBot Python requirements using
`pip install -r requirements.txt`

* Install TorBot Python requirements
`poetry install`
Finally run the following command

On Linux platforms, you can make an executable for TorBot by using the install.sh script.
You will need to give the script the correct permissions using `chmod +x install.sh`
Now you can run `./install.sh` to create the torBot binary.
Run `./torBot` to execute the program.

An alternative way of running torBot is shown below, along with help instructions.

`python3 torBot.py or use the -h/--help argument`
`python3 run.py -h`
<pre>
usage: torBot.py [-h] [-v] [--update] [-q] [-u URL] [-s] [-m] [-e EXTENSION]
usage: run.py [-h] [-v] [--update] [-q] [-u URL] [-s] [-m] [-e EXTENSION]
[-i]

optional arguments:
-h, --help Show this help message and exit
-v, --version Show current version of TorBot.
--update Update TorBot to the latest stable version
-q, --quiet Prevent header from displaying
-u URL, --url URL Specify a website link to crawl, currently returns links on that page (if used alone e.g. python3 torBot.py -u https://www.github.com)
-u URL, --url URL Specify a website link to crawl, currently returns links on that page (if used alone e.g. python3 run.py -u https://www.github.com)
-s, --save Save results to a file in json format
-m, --mail Get e-mail addresses from the crawled sites
-e EXTENSION, --extension EXTENSION
@@ -113,11 +123,7 @@ optional arguments:

Read more about torrc here : [Torrc](https://github.com/DedSecInside/TorBoT/blob/master/Tor.md)
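
The setup steps expect Tor's SOCKS listener on localhost:9050. A minimal sketch for checking that something is accepting TCP connections on that port before starting a crawl is shown here. The helper name `is_port_open` is hypothetical and not part of TorBot. It only confirms that the port is reachable, not that the listener is actually a Tor SOCKS proxy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


if __name__ == "__main__":
    # 9050 is the SOCKS port named in the setup steps.
    if is_port_open("127.0.0.1", 9050):
        print("Something is accepting connections on 127.0.0.1:9050")
    else:
        print("Nothing is listening on 127.0.0.1:9050; is the tor service running?")
```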


#### Using the GUI


#### Using Docker
### Using Docker

- Ensure that you have a tor container running on port 9050.
- Build the image using following command (in the root directory):
@@ -127,6 +133,14 @@ Read more about torrc here : [Torrc](https://github.com/DedSecInside/TorBoT/blob

`docker run --link tor:tor --rm -ti dedsecinside/torbot`

### Using executable (Linux Only)

On Linux platforms, you can make an executable for TorBot by using the install.sh script.
You will need to give the script the correct permissions using `chmod +x install.sh`
Now you can run `./install.sh` to create the torBot binary.
Run `./torBot` to execute the program.


## TO-DO
- [x] Visualization Module
- [x] Implement BFS Search for webcrawler
@@ -140,27 +154,8 @@ Read more about torrc here : [Torrc](https://github.com/DedSecInside/TorBoT/blob
- [x] Increase efficiency

### Have ideas?
If you have new ideas that are worth implementing, mention them by starting a new issue with the title [FEATURE_REQUEST].
If the idea is worth implementing, congratz, you are now a contributor.

### Cite this [paper](https://link.springer.com/chapter/10.1007/978-981-15-0146-3_19)

@InProceedings{10.1007/978-981-15-0146-3_19,
author="Narayanan, P. S.
and Ani, R.
and King, Akeem T. L.",
editor="Ranganathan, G.
and Chen, Joy
and Rocha, {\'A}lvaro",
title="TorBot: Open Source Intelligence Tool for Dark Web",
booktitle="Inventive Communication and Computational Technologies",
year="2020",
publisher="Springer Singapore",
address="Singapore",
pages="187--195",
abstract="The dark web has turned into a dominant source of illegal activities. With several volunteered networks, it is becoming more difficult to track down these services. Open source intelligence (OSINT) is a technique used to gather intelligence on targets by harvesting publicly available data. Performing OSINT on the Tor network makes it a challenge for both researchers and developers because of the complexity and anonymity of the network. This paper presents a tool which shows OSINT in the dark web. With the use of this tool, researchers and Law Enforcement Agencies can automate their task of crawling and identifying different services in the Tor network. This tool has several features which can help extract different intelligence.",
isbn="978-981-15-0146-3"
}
If you have new ideas that are worth implementing, mention them by creating a new issue with the title [FEATURE_REQUEST].



### References
@@ -208,4 +203,6 @@ GNU Public License
- [X] [SubaruSama](https://github.com/SubaruSama) - New Contributor
- [X] [robly78746](https://github.com/robly78746) - New Contributor

... see all contributors here (https://github.com/DedSecInside/TorBot/graphs/contributors)


17 changes: 10 additions & 7 deletions docker/Dockerfile
@@ -1,21 +1,24 @@
FROM python:3
FROM python:3.9
LABEL maintainer="dedsec_inside"

# Install PyQt5

RUN apt-get update \
&& apt-get install -y --no-install-recommends python3-pyqt5 \
&& apt-get install -y virtualenv \
&& apt-get install -y tor \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*

WORKDIR /app

COPY . .

RUN pip install --no-cache-dir poetry
RUN poetry config virtualenvs.create false
RUN python -m poetry install --no-dev
# Create virtual env
RUN virtualenv venv --python=python3.9
RUN source venv/bin/activate
RUN pip install -r requirements.txt


RUN chmod +x install.sh
RUN bash install.sh

ENTRYPOINT ["./torBot", "--ip", "tor"]
ENTRYPOINT ["./run.py", "--ip", "tor"]
2 changes: 1 addition & 1 deletion gotor
Submodule gotor updated from ddf4a7 to d12394
18 changes: 5 additions & 13 deletions install.sh
@@ -1,25 +1,17 @@
#!/bin/bash

# Makes directory for dependencies and executable to be installed
mkdir -p tmp_build
mkdir -p tmp_build
mkdir -p tmp_dist

# attempt to install pyinstaller using pip, python3 is prioritized
if command -v poetry &> /dev/null; then
poetry install
poetry update
else
echo "poetry is required for installation."
exit 1
fi

pip install pyinstaller

# Creates executable file and sends dependencies to the recently created directories
pyinstaller --onefile --workpath ./tmp_build --distpath ./tmp_dist --paths=src src/torBot.py
pyinstaller --onefile --workpath ./tmp_build --distpath ./tmp_dist --paths=src torbot/main.py

# Puts the executable in the current directory
mv tmp_dist/torBot .
mv tmp_dist/torBot .

# Removes both directories and unneeded file
rm -r tmp_build tmp_dist
rm torBot.spec
rm torBot.spec