


Yet another content discovery tool written in Python.

What makes this tool different from others:

  • Sends HTTP requests asynchronously, which makes scans fast
  • Calibration mode that applies filters on its own
  • Recursive scan mode with additional flags
  • Report generation, so you can review your results later
  • Scanning multiple URLs in one run
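
The first bullet, together with the `-t` option described in the help text as the "size of the semaphore pool", suggests that requests run concurrently under a semaphore cap. Here is a minimal sketch of that pattern using only `asyncio`. It is an illustration, not crawpy's actual code: `fake_request`, `scan`, and `run_demo` are hypothetical names, and `fake_request` stands in for a real HTTP call so the example runs without network access.

```python
import asyncio


async def fake_request(url: str) -> int:
    """Hypothetical stand-in for an HTTP GET; returns a made-up status code."""
    await asyncio.sleep(0.01)  # simulate network latency
    return 200 if url.endswith("/admin") else 404


async def scan(urls, pool_size: int = 3):
    """Probe every URL concurrently with at most `pool_size` requests in flight."""
    semaphore = asyncio.Semaphore(pool_size)
    in_flight = 0  # requests currently holding the semaphore
    peak = 0       # highest value in_flight ever reached

    async def probe(url):
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                status = await fake_request(url)
            finally:
                in_flight -= 1
            return url, status

    results = await asyncio.gather(*(probe(u) for u in urls))
    return dict(results), peak


def run_demo():
    words = ["admin", "backup", "js", "login", "support", "test"]
    urls = [f"https://example.com/{w}" for w in words]
    return asyncio.run(scan(urls, pool_size=3))


if __name__ == "__main__":
    statuses, peak = run_demo()
    print(statuses, "peak concurrency:", peak)
```

All coroutines are created up front, but the semaphore keeps the number of simultaneous requests at or below the pool size, so a larger pool trades politeness toward the target for speed.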

An example run


An example run with auto calibration and recursive mode enabled


Example reports

Example reports can be found here



git clone https://github.com/morph3/crawpy
cd crawpy
pip3 install -r requirements.txt
# or, equivalently:
python3 -m pip install -r requirements.txt


morph3 ➜ crawpy/ [main✗] λ python3 crawpy.py --help
usage: crawpy.py [-h] [-u URL] [-w WORDLIST] [-t THREADS] [-rc RECURSIVE_CODES] [-rp RECURSIVE_PATHS] [-rd RECURSIVE_DEPTH] [-e EXTENSIONS] [-to TIMEOUT] [-follow] [-ac] [-fc FILTER_CODE] [-fs FILTER_SIZE] [-fw FILTER_WORD] [-fl FILTER_LINE] [-k] [-m MAX_RETRY]
                 [-H HEADERS] [-o OUTPUT_FILE] [-gr] [-l URL_LIST] [-lt LIST_THREADS] [-s] [-X HTTP_METHOD] [-p PROXY_SERVER]

optional arguments:
  -h, --help            show this help message and exit
  -u URL, --url URL     URL
  -w WORDLIST, --wordlist WORDLIST
  -t THREADS, --threads THREADS
                        Size of the semaphore pool
  -rc RECURSIVE_CODES, --recursive-codes RECURSIVE_CODES
                        Status codes to scan recursively. Example: 301,302,307
  -rp RECURSIVE_PATHS, --recursive-paths RECURSIVE_PATHS
                        Paths to scan recursively. Note that only the given recursive paths will be scanned initially. Example: admin,support,js,backup
  -rd RECURSIVE_DEPTH, --recursive-depth RECURSIVE_DEPTH
                        Recursive scan depth. Example: 2
  -e EXTENSIONS
                        Extensions to append to each word, separated by commas. Example: -e .php,.html,.txt
  -to TIMEOUT, --timeout TIMEOUT
                        Timeout. Using this option is not recommended for now: it produces many errors whose cause has not been identified yet
  -follow, --follow-redirects
                        Follow redirects
  -ac, --auto-calibrate
                        Automatically calibrate the filters
  -fc FILTER_CODE, --filter-code FILTER_CODE
                        Filter status code
  -fs FILTER_SIZE, --filter-size FILTER_SIZE
                        Filter size
  -fw FILTER_WORD, --filter-word FILTER_WORD
                        Filter words
  -fl FILTER_LINE, --filter-line FILTER_LINE
                        Filter line
  -k, --ignore-ssl      Ignore untrusted SSL certificate
  -m MAX_RETRY, --max-retry MAX_RETRY
                        Max retry
  -H HEADERS, --headers HEADERS
                        Headers, you can set the flag multiple times. For example: -H "X-Forwarded-For:", -H "Host: foobar"
  -o OUTPUT_FILE
                        Output folder
  -gr, --generate-report
                        Generate a report. The default path is crawpy/reports/<url>.txt
  -l URL_LIST, --list URL_LIST
                        Takes a list of URLs as input and runs crawpy on them via multiprocessing. Example: -l ./urls.txt
  -lt LIST_THREADS, --list-threads LIST_THREADS
                        Number of threads for running crawpy in parallel when running with a list of URLs
  -s, --silent          Make crawpy not produce output
  -X HTTP_METHOD, --http-method HTTP_METHOD
                        HTTP request method
  -p PROXY_SERVER
                        Proxy server, ex: ''
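
The `-fc`, `-fs`, `-fw`, and `-fl` options filter responses by status code, size, word count, and line count, and `-ac` derives such filters automatically. This README does not describe how crawpy does that internally, so the sketch below shows only one plausible approach. It assumes that calibration probes a few paths that should not exist and then filters the first property (size, then words, then lines) that all of those baseline responses share. The `Response` and `Filters` types and the `parse_csv_ints` and `calibrate` helpers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Response:
    """Hypothetical minimal view of an HTTP response."""
    status: int
    body: str

    @property
    def size(self) -> int:
        return len(self.body.encode())

    @property
    def words(self) -> int:
        return len(self.body.split())

    @property
    def lines(self) -> int:
        return len(self.body.splitlines())


@dataclass
class Filters:
    """Values to hide, mirroring -fc, -fs, -fw and -fl."""
    codes: set = field(default_factory=set)
    sizes: set = field(default_factory=set)
    words: set = field(default_factory=set)
    lines: set = field(default_factory=set)

    def hides(self, response: Response) -> bool:
        """True if any filter matches, i.e. the response should not be shown."""
        return (
            response.status in self.codes
            or response.size in self.sizes
            or response.words in self.words
            or response.lines in self.lines
        )


def parse_csv_ints(value: str) -> set:
    """Turn a comma-separated value such as '9,83' into {9, 83}."""
    return {int(part) for part in value.split(",") if part.strip()}


def calibrate(baseline) -> Filters:
    """Assumed strategy: filter the first property all baseline responses share."""
    filters = Filters()
    if not baseline:
        return filters
    candidates = (
        ("size", filters.sizes),
        ("words", filters.words),
        ("lines", filters.lines),
    )
    for attribute, target in candidates:
        values = {getattr(response, attribute) for response in baseline}
        if len(values) == 1:  # every "not found" probe agrees on this value
            target.update(values)
            break
    return filters
```

With this shape, `Filters(words=parse_csv_ints("9,83"))` corresponds to passing `-fw 9,83`, and the result of `calibrate` can be merged with any filters given by hand.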


python3 crawpy.py -u https://facebook.com/FUZZ -w ./common.txt -k -ac -e .php,.html
python3 crawpy.py -u https://google.com/FUZZ -w ./common.txt -k -fw 9,83 -rc 301,302 -rd 2 -ac
python3 crawpy.py -u https://morph3sec.com/FUZZ -w ./common.txt -e .php,.html -t 20 -ac -k
python3 crawpy.py -u https://google.com/FUZZ -w ./common.txt -ac -gr
python3 crawpy.py -u https://google.com/FUZZ -w ./common.txt -ac -gr -o /tmp/test.txt
sudo python3 crawpy.py -l urls.txt -lt 20 -gr -w ./common.txt -t 20 -o custom_reports -k -ac -s
python3 crawpy.py -u https://google.com/FUZZ -w ./common.txt -ac -gr -rd 1 -rc 302,301 -rp admin,backup,support -k
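
The commands above combine a `FUZZ` placeholder, a wordlist, extensions (`-e`), and the recursive options (`-rc`, `-rd`, `-rp`). The sketch below illustrates how these options could interact. It is a guess at the general technique rather than crawpy's implementation, and `expand_targets` and `next_level` are hypothetical helper names. It assumes that each word is tried bare and with every extension, and that a hit is rescanned only when its status code is among the recursive codes, its path is among the recursive paths (if any were given), and the depth limit has not been reached.

```python
def expand_targets(template, words, extensions=()):
    """Replace FUZZ with each word, both bare and with each extension appended."""
    if "FUZZ" not in template:
        raise ValueError("URL template must contain FUZZ")
    targets = []
    for word in words:
        for suffix in ("", *extensions):
            targets.append(template.replace("FUZZ", word + suffix))
    return targets


def next_level(template, hits, recursive_codes, depth, max_depth,
               recursive_paths=None):
    """Return FUZZ templates one directory deeper for hits that qualify.

    hits is a list of (word, status_code) pairs from the current level.
    """
    if depth >= max_depth:
        return []  # depth limit reached, stop recursing
    templates = []
    for word, status in hits:
        if status not in recursive_codes:
            continue
        if recursive_paths is not None and word not in recursive_paths:
            continue
        templates.append(template.replace("FUZZ", word + "/FUZZ"))
    return templates


if __name__ == "__main__":
    print(expand_targets("https://example.com/FUZZ", ["admin"], [".php", ".html"]))
    hits = [("admin", 301), ("js", 301), ("index", 200)]
    print(next_level("https://example.com/FUZZ", hits, {301, 302}, 0, 1,
                     {"admin", "backup", "support"}))
```

Returning new templates rather than scanning directly keeps the recursion explicit: the caller can feed each template back into `expand_targets` with `depth + 1` until `next_level` returns an empty list.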