by x32x01
Hi, Pentesters! In this article, we are going to focus on the Kali Linux tool “CeWL”, which builds a custom wordlist from the content of a website. Let’s explore the tool and the options it provides.

Table of Contents
  1. Introduction to Cewl
  2. Default Procedure
  3. Store this wordlist in a file
  4. Generating wordlists of a certain length
  5. Retrieval of Emails from the website
  6. To count the number of words repeated on the website
  7. Increase Spider depth
  8. Verbose Mode
  9. Alphanumeric Wordlist
  10. Cewl with Digest/Basic Authentication
  11. Lowercase all parsed words
  12. Proxy Support

Introduction to Cewl

CeWL (Custom Word List generator) is a Ruby program that crawls a given URL to a defined depth and returns a list of the words it finds. Password-cracking and brute-forcing tools such as John the Ripper, Medusa, and WFuzz can then use that list as a dictionary. CeWL also ships with a companion command-line app, FAB (Files Already Bagged), which uses the same metadata extraction techniques as CeWL to build author/producer lists from files that have already been downloaded.

CeWL comes preinstalled with Kali Linux. With this tool, we can quickly collect words and phrases from the pages of a target website.

Open a terminal in Kali Linux and type “cewl -h” to list all the options it accepts, each with a short description.
Syntax: cewl <url> [options]
001.png
General Options:
-h, --help: Show help.
-k, --keep: Keep the downloaded file.
-d <x>, --depth <x>: Depth to spider to, default 2.
-m, --min_word_length: Minimum word length, default 3.
-o, --offsite: Let the spider visit other sites.
-w, --write: Write the output to the file.
-u, --ua <agent>: User agent to send.
-n, --no-words: Don’t output the wordlist.
--with-numbers: Accept words with numbers in as well as just letters.
-a, --meta: Include meta data.
--meta_file <file>: Output file for meta data.
-e, --email: Include email addresses.
--email_file <file>: Output file for email addresses.
-c, --count: Show the count for each word found.
-v, --verbose: Verbose.
--debug: Extra debug information.

Authentication
--auth_type: Digest or basic.
--auth_user: Authentication username.
--auth_pass: Authentication password.

Proxy Support
--proxy_host: Proxy host.
--proxy_port: Proxy port, default 8080.
--proxy_username: Username for proxy, if required.
--proxy_password: Password for proxy, if required.

Default Procedure

Run cewl with only a URL to spider the site to the default depth of 2 and print every word it finds. The resulting list can be used as a dictionary for cracking passwords.
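A minimal sketch of a default run. The target URL is a placeholder (an assumption, not the site shown in the screenshot), so swap in a host you are authorised to test.

```shell
# Placeholder target (assumption); replace with a host you may test.
target="https://example.com"

# With no options, cewl uses its defaults: depth 2, minimum word length 3.
cmd="cewl $target"

# Print the command for review; paste it into the terminal to run it.
echo "$cmd"
```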
002.png

Store this wordlist in a file

To keep the wordlist for later use, pass the -w option followed by a file name. The output is then written to that text file instead of the terminal.
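As a sketch, saving the wordlist to dict.txt might look like this. The target URL is a placeholder assumption; dict.txt is the file name used in this article.

```shell
# Placeholder target (assumption); replace with a host you may test.
target="https://example.com"

# -w <file> writes the wordlist to the named file.
cmd="cewl $target -w dict.txt"

# Print the command for review; paste it into the terminal to run it.
echo "$cmd"
```

After a real run, something like "wc -l dict.txt" or "head dict.txt" is a quick way to confirm that the file was populated.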
In this walkthrough, dict.txt is the file where the wordlist is stored. Once the file has been created, you can open it to check that the output was saved.
003.png

Generating wordlists of a certain length

To leave short words out of the wordlist, use the -m option with a minimum length. Only words with at least that many characters are kept. Note that -m sets a lower bound, not an exact length.
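A possible form of the command, using the minimum length of 10 and the dict.txt file described in this section. The target URL is a placeholder assumption.

```shell
# Placeholder target (assumption); replace with a host you may test.
target="https://example.com"

# -m 10 keeps only words of 10 or more characters; -w saves them to dict.txt.
cmd="cewl $target -m 10 -w dict.txt"

# Print the command for review; paste it into the terminal to run it.
echo "$cmd"
```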
004.png
So basically, with a minimum length of 10, this creates a wordlist in which every word has at least 10 letters and stores it in the file dict.txt. A screenshot is attached for your reference.

Retrieval of Emails from the website

To retrieve email addresses from the website, use the -e option. Adding the -n option suppresses the wordlist itself, so only the emails are shown. As you can see in the attached screenshot, it found 1 email address on the website.
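A sketch of the email-only run described above, again assuming a placeholder target URL:

```shell
# Placeholder target (assumption); replace with a host you may test.
target="https://example.com"

# -e collects email addresses; -n suppresses the normal wordlist output.
cmd="cewl $target -n -e"

# Print the command for review; paste it into the terminal to run it.
echo "$cmd"
```

If the addresses should go to their own file, the --email_file <file> option from the help listing can be added.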
005.png

To count the number of words repeated on the website

To see how many times each word appears on a website, use the -c option, which shows a count for every word found.
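One way the count run could be written, with a placeholder target URL as the only assumption:

```shell
# Placeholder target (assumption); replace with a host you may test.
target="https://example.com"

# -c shows the count for each word found.
cmd="cewl $target -c"

# Print the command for review; paste it into the terminal to run it.
echo "$cmd"
```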
For your reference, the screenshot below shows the count printed for every word found on the website.
006.png

Increase Spider depth

Use the -d option with a depth number to make the spider follow links further into the site. A deeper crawl produces a larger wordlist, but it also takes longer to finish. The default depth is 2.
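A minimal sketch of a deeper crawl. Both the target URL and the depth of 3 are illustrative assumptions; any value above the default of 2 widens the crawl.

```shell
# Placeholder target (assumption); replace with a host you may test.
target="https://example.com"

# Illustrative depth (assumption): one level deeper than the default of 2.
depth=3

cmd="cewl $target -d $depth"

# Print the command for review; paste it into the terminal to run it.
echo "$cmd"
```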
007.png

Verbose Mode

The -v option turns on verbose mode, which prints extra detail about the crawl as it runs.
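A verbose run could be sketched like this, assuming a placeholder target URL:

```shell
# Placeholder target (assumption); replace with a host you may test.
target="https://example.com"

# -v enables verbose output while the site is being spidered.
cmd="cewl $target -v"

# Print the command for review; paste it into the terminal to run it.
echo "$cmd"
```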
So, this will display extended crawling results. A screenshot is attached below to give you a clear idea.
008.png

Alphanumeric Wordlist

Sometimes you may need an alphanumeric wordlist. For that, use the --with-numbers option, which accepts words that contain digits as well as letters.
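A sketch of an alphanumeric run, with a placeholder target URL as the only assumption:

```shell
# Placeholder target (assumption); replace with a host you may test.
target="https://example.com"

# --with-numbers accepts words containing digits as well as letters.
cmd="cewl $target --with-numbers"

# Print the command for review; paste it into the terminal to run it.
echo "$cmd"
```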
009.png

010.png

Cewl with Digest/Basic Authentication

Some web applications sit behind HTTP Basic or Digest authentication, and the basic command above will not give the desired results against them. In that case, pass the credentials to cewl with the authentication options so that it can authenticate before crawling.
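A rough sketch of an authenticated crawl. The target URL, the choice of digest, and the admin/password credentials are all hypothetical values for illustration, not the ones used in the screenshot.

```shell
# Placeholder target (assumption); replace with a host you may test.
target="https://example.com"

# Hypothetical credentials (assumption); use the ones for your own lab.
auth_user="admin"
auth_pass="password"

# --auth_type takes "digest" or "basic", depending on what the server uses.
cmd="cewl $target --auth_type digest --auth_user $auth_user --auth_pass $auth_pass"

# Print the command for review; paste it into the terminal to run it.
echo "$cmd"
```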

The authentication options used here are:
--auth_type: Digest or Basic
--auth_user: Authentication username
--auth_pass: Authentication password
011.png

Lowercase all parsed words

When you need the keywords in lowercase, use the --lowercase option, which lowercases every parsed word.
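A short sketch of a lowercase run, assuming a placeholder target URL:

```shell
# Placeholder target (assumption); replace with a host you may test.
target="https://example.com"

# --lowercase lowercases all parsed words before they are output.
cmd="cewl $target --lowercase"

# Print the command for review; paste it into the terminal to run it.
echo "$cmd"
```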
012.png

Proxy Support

The default cewl command will not work properly if the target is only reachable through a proxy server. We tried to access the application directly by its IP address, but because it sits behind a proxy, we got a Forbidden error page.
013.png
If we run the default cewl command here, it builds a wordlist from the error page. To get a proper wordlist of the web application, we passed the proxy details to cewl with the proxy options.
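A sketch of a crawl through a proxy. The target URL, proxy address, and port are hypothetical values for illustration, so substitute the details of your own proxy.

```shell
# Placeholder target (assumption); replace with a host you may test.
target="https://example.com"

# Hypothetical proxy details (assumption); use your own proxy's values.
proxy_host="192.168.1.10"
proxy_port="3128"

# Without --proxy_port, cewl falls back to its default of 8080.
cmd="cewl $target --proxy_host $proxy_host --proxy_port $proxy_port"

# Print the command for review; paste it into the terminal to run it.
echo "$cmd"
```

If the proxy itself needs a login, the --proxy_username and --proxy_password options from the help listing can be appended.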

The proxy options used here are:
--proxy_host: Your proxy host
--proxy_port: Port number of your proxy
014.png
 
