theHarvester

Welcome to another tutorial! This time we’re taking a look at an OSINT tool called theHarvester.

theHarvester is a tool written by Christian Martorella. The current version as of this writing is 2.6. You can always check for the latest version on the project’s GitHub page: https://github.com/laramies/theHarvester.

According to the author, theHarvester gathers email addresses, subdomains, virtual hosts, and employee names from a variety of public sources. It does its lookups on sites such as Google, Bing, LinkedIn, and Shodan.

Let’s take a look at the options that are available when running the program:

[Screenshot: theHarvester’s available options]

Most of the options are self-explanatory, but I did want to highlight a couple of them:

-f:
saves results to both an HTML file and an XML file at the same time, so you don’t have to specify a format; just give it a filename to save as

-b:
the search sources, such as Google, Bing, etc.; three of these sources require an API key, so keep that in mind when using them: BingAPI, GoogleCSE, and Shodan. Shodan will cost a little money, but the others are free to get. Here are the links to get the API keys:
BingAPI – http://www.bing.com/toolbox/bingsearch.api (free; 5,000 queries per month)
GoogleCSE – https://cse.google.com/cse (free)
Shodan – https://developer.shodan.io (there is a fee for this one)

-c:
DNS brute force; there is an issue in Kali when trying to run this option using the default install; I had to change the path in dnssearch.py so that it pointed to the dictionary file. This is probably a quick and dirty fix, but it worked.
1. Find the dictionary file “dns-names.txt” in “/usr/share/golismero/tools/theHarvester”
2. Copy it to “/usr/share/theharvester/”
3. Edit the file “/usr/share/theharvester/discovery/dnssearch.py”
4. Go to the “class dns_force” section, and change the following line: self.file = "/usr/share/theharvester/dns-names.txt"
5. Save the file, and you should be good to go
Linux paths are case-sensitive, so the path in step 4 has to match the directory from step 2 exactly.
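If you’d rather script the fix above than do it by hand, here is a minimal Python sketch of the same steps. The function names install_wordlist and point_dns_force_at are made up for this example and are not part of theHarvester. It also assumes the line to change is the first “self.file = ...” assignment after “class dns_force”, and it keeps a .bak copy of the file so the change is easy to undo.

```python
import re
import shutil
from pathlib import Path


def install_wordlist(src_wordlist, harvester_dir):
    """Copy the dictionary file into theHarvester's directory (steps 1 and 2)."""
    dest = Path(harvester_dir) / Path(src_wordlist).name
    shutil.copy(src_wordlist, dest)
    return dest


def point_dns_force_at(dnssearch_py, wordlist_path):
    """Point the first self.file assignment after 'class dns_force' at wordlist_path."""
    source_path = Path(dnssearch_py)
    text = source_path.read_text()

    class_start = text.find("class dns_force")
    if class_start == -1:
        raise ValueError("class dns_force not found")

    # Assumption: the assignment sits on a single line of the form self.file = ...
    pattern = re.compile(r"^([ \t]*self\.file[ \t]*=[ \t]*).*$", re.MULTILINE)
    match = pattern.search(text, class_start)
    if match is None:
        raise ValueError("no self.file assignment found after class dns_force")

    # Keep a backup next to the original before overwriting it.
    shutil.copy(source_path, source_path.with_name(source_path.name + ".bak"))

    new_line = f'{match.group(1)}"{wordlist_path}"'
    source_path.write_text(text[:match.start()] + new_line + text[match.end():])


# Example usage with the paths from the steps above (needs root on Kali):
# wordlist = install_wordlist(
#     "/usr/share/golismero/tools/theHarvester/dns-names.txt",
#     "/usr/share/theharvester",
# )
# point_dns_force_at("/usr/share/theharvester/discovery/dnssearch.py", wordlist)
```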

SEARCHES
Now let’s run through a few searches. The first one will just be a basic search using a single data source.

Basic Search
Syntax: theharvester -d domain -b source

[Screenshots: basic search output using a single data source]

You can see this returned a few email addresses and some hostnames that Google knows about.

DNS Brute Force
Syntax: theharvester -d domain -b source -c

[Screenshots: search output with DNS brute force enabled]

Now we see additional hosts discovered from brute forcing using theHarvester’s dictionary file.
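To make the idea of DNS brute forcing concrete, here is a rough Python sketch of the technique: take each word from a dictionary file, put it in front of the domain, and keep the names that resolve. This is not theHarvester’s actual code, just an illustration. The function name brute_force_hosts and the resolve parameter are made up for this example.

```python
import socket


def brute_force_hosts(domain, words, resolve=socket.gethostbyname):
    """Try word.domain for every word and return {hostname: ip} for names that resolve."""
    found = {}
    for word in words:
        word = word.strip()
        if not word:
            continue  # skip blank lines in the dictionary
        hostname = f"{word}.{domain}"
        try:
            found[hostname] = resolve(hostname)
        except OSError:
            # socket.gaierror is a subclass of OSError: the name did not resolve
            continue
    return found


# Example usage against a dictionary file such as dns-names.txt:
# with open("dns-names.txt") as handle:
#     hosts = brute_force_hosts("example.com", handle)
# for name, ip in hosts.items():
#     print(ip, name)
```

The resolve parameter is only there so you can swap in a different lookup function, for testing or to use a specific DNS library.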

There’s a lot of information you can get back from these searches, so you’ll want to save the results, and keep them handy for use later. You never know when one piece of information will be the key to compromising the target.

Saving Results
Syntax: theharvester -d domain -b source -f filename

[Screenshots: search run with results saved, and the resulting HTML report]

Once it finishes, it will save both an HTML file and an XML file to your current working directory. As you can see above, opening the HTML file shows a nicely formatted page that’s easy to navigate, and is something you could include in your final report to your customer.
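If you run these searches often, it can be handy to build the command line from a script. The sketch below is one way to do that in Python, using only the options covered in this tutorial (-d, -b, -c, -v, and -f). The function name build_command is made up for this example, and it assumes the program is on your PATH as “theharvester”, as in the syntax lines above.

```python
import subprocess


def build_command(domain, source, brute_force=False, vhosts=False, save_as=None):
    """Return the theharvester command line as a list of arguments."""
    command = ["theharvester", "-d", domain, "-b", source]
    if brute_force:
        command.append("-c")  # DNS brute force
    if vhosts:
        command.append("-v")  # virtual host lookup
    if save_as:
        command += ["-f", save_as]  # save results to HTML and XML
    return command


# Example usage (only run this against targets you are authorized to test):
# subprocess.run(build_command("example.com", "google", save_as="results"), check=True)
```

Passing the arguments as a list, rather than one big string, means the domain and filename never go through a shell.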

Virtual Hosts
Syntax: theharvester -d domain -b source -v

You may be searching on a domain that is hosted on a third-party web hosting provider, where there could be hundreds of domains all resolving to the same IP address. This scan will list those domains, and you’ll have to weed through them to see if any are relevant to your target.

[Screenshots: virtual host search results]
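When the virtual host list is long, a simple keyword match can serve as a first pass before you review it by hand. This is only a rough sketch: it assumes you have already copied the hostnames into a Python list, and the function name split_by_relevance and the keywords are made up for this example.

```python
def split_by_relevance(hostnames, keywords):
    """Split hostnames into (relevant, unrelated) by case-insensitive keyword match."""
    lowered_keywords = [keyword.lower() for keyword in keywords]
    relevant, unrelated = [], []
    for name in hostnames:
        if any(keyword in name.lower() for keyword in lowered_keywords):
            relevant.append(name)
        else:
            unrelated.append(name)
    return relevant, unrelated


# Example usage:
# relevant, unrelated = split_by_relevance(vhost_list, ["example", "examplecorp"])
```

A keyword match will miss related sites that use a different name, so the unrelated list is still worth a skim.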

What’s the potential security issue here? Say you’ve set up a web server, and you have a couple of friends who want you to host web sites for them. The first site is set up securely, with no coding issues, while the second site has a SQL injection vulnerability. An attacker may target the first site because there’s info they want to get, but seeing that it’s secure, they will look for another way in. It may be possible to compromise the web server through the SQLi vulnerability in the second site, giving the attacker access to the first site. Everything depends on how the web server is configured, and on whether you’re checking for these types of vulnerabilities from a sysadmin perspective.

One thing to note in the results above: you’ll see the tag “strong” showing up. This is a bug in theHarvester, and a bug report has been submitted to the author. He says it will be fixed in the next release.

API Keys
As I mentioned earlier, some of the data sources require you to have an API key in order to do searches against their engines. Once you get your API keys, you’ll have to edit the corresponding config file for that data source. Here’s an example with BingAPI:

[Screenshot: BingAPI search showing the missing API key error]

Notice the error telling you that you need to enter the key. It gives you the location of the file you need to edit. Open the file, enter your key, then be sure to save the file.

[Screenshot: the file with the API key entered]

theHarvester is a great tool for doing some OSINT on a target. It’s simple to use, but can return a lot of information to you, and may help you get a foothold into the target organization.

Have an awesome day!