This article was originally posted on nullsecure.org and has been republished with permission.

I’ve been pretty busy lately updating Tango to version 2.0 and working on threatnote, but another project I started recently is something @__eth0 and I are calling Gavel. Gavel is a set of Maltego transforms that query traffic records in each state. This project started out really ambitiously: we wanted to cover all 50 states, but we ran into several problems. Our goal was to provide a way to look up certain data available in the traffic records, including:

  • Address
  • Height
  • Weight
  • Age
  • License Plate Number
  • Car Make/Model

This is great Open-Source Intelligence (OSINT), and we wanted to make it easy for researchers to obtain through Maltego. As mentioned above, though, we ran into several problems that are preventing us from releasing it as a full-blown set of transforms.

Roadblocks

The first problem we hit was that some states require you to pay for each query you make against the database. If we hosted this transform on a server, we wouldn’t be able to cover the cost of each of these queries, and even if we provided the code to the users, I’m not sure we could code a good solution to handle the payment information for each query.

The next problem was that some states are broken out by county. This would create a lot of extra work for us, and by the time we finished one county, another one might have changed its code, so it’s a ton of maintenance work to get them all working. Also, some states/counties use CAPTCHAs for each query, and I’ve had no experience getting around them.

So, with those problems at hand, we decided to open-source this tool to the community in the hope that anyone who would benefit from this OSINT tool can code their own county and/or state. We’re aware we may never have all the states and counties covered; however, we’d like to get as many done as we can.

Currently, only Maryland is complete, so if you live there, you’re in luck! The code isn’t that difficult; it just took a little bit of work with the requests to get the exact responses we needed. The worst part is trying to parse the HTML, which I have no problem saying… I suck at.
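If you want to write a transform for your own state, the scraping pattern itself is manageable with the standard library. Here’s a rough sketch; the markup below is hypothetical (Maryland’s real case-search HTML differs), as is the “caseid” class name:

```python
# Sketch of the kind of HTML scraping a Gavel transform does: subclass
# HTMLParser and collect the table cells you care about. The markup here
# is made up for illustration.
from html.parser import HTMLParser

class CaseRowParser(HTMLParser):
    """Collect the text of every <td class="caseid"> cell."""
    def __init__(self):
        super().__init__()
        self.in_case_cell = False
        self.case_ids = []

    def handle_starttag(self, tag, attrs):
        if tag == "td" and ("class", "caseid") in attrs:
            self.in_case_cell = True

    def handle_data(self, data):
        if self.in_case_cell and data.strip():
            self.case_ids.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_case_cell = False

sample = """
<table>
  <tr><td class="caseid">6B00123456</td><td>SMITH, JOHN</td></tr>
  <tr><td class="caseid">6B00654321</td><td>SMITH, JOHN A</td></tr>
</table>
"""

parser = CaseRowParser()
parser.feed(sample)
print(parser.case_ids)  # ['6B00123456', '6B00654321']
```

The real transforms do the same thing against the live case-search results, just with messier markup.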

How It Works

To use Gavel, you’ll simply download the code we provide and import the transforms into Maltego. Once all the code is set up and in the right place, you would then just add a “Person” entity to your Maltego graph like so:


Next, you would right-click on the entity and run “Gavel — Get Names”. This transform searches through the state’s traffic records and returns the names of individuals that match your search. For example, if your name was John Smith, there would probably be a ton of case records for that name; that’s why we give you the names first, since it makes it easier to narrow down to the specific person you’re looking for. This step also adds to each entity the properties holding the case IDs it will need to query in the next step.

Next, you would right-click on the person of interest and select “Gavel — Get Addresses”. It will then iterate through the case IDs in the entity’s properties and return location and vehicle entities based on the information it finds.

Here’s a screenshot of what the end result would be.


In the image above, you can see at the top the original entity we added, “Brian Warehime”. Below that are the case records matching that name, each holding its case IDs in its properties. Below the name are all the addresses and vehicle information we could discover. (This is made-up data, since my last traffic stop was so many years ago that it aged out.)

You’ll notice on the right-hand side, in the “Property View” section, that we added additional properties to the person entity. We added the height, weight, and DOB for each target, which will help you validate that this is your target.

As for the vehicle entities, we display the license plate number; however, if you select the entity, in the properties area on the right-hand side you’ll find the year, make, and possible body style for the vehicle. See below for a screenshot:


Looking at the screenshot above, we can see it’s a 2000 GMC with a possible body style/make of “05”. I’m not sure where we could look that number up to find what model it corresponds to, so if you know, please let me know!

Installation

On the GitHub page, you’ll find a few files to download: a few Python scripts, the Maltego library we use, and an .mtz file.

First up, place the Python scripts in a location on your computer, like /Users/<yourname>/Maltego/Transforms or wherever. Next, place the Maltego library in the same directory as the two Python scripts you just moved.

Next, open up Maltego. Click on “Manage” in the titlebar, followed by clicking on the “Import Config” button. Locate the .mtz file you downloaded and click next. Make sure the “Local Transforms” and “Transform Sets” buttons are checked and click next. Once installed, click on “Finish”.

To make sure these transforms run correctly, we’ll need to set up your environment. Click on “Manage Transforms” in the menubar and it’ll open the “Transform Manager”. Next, scroll down until you find the Gavel transforms. Click on the first one and look at the bottom right of the window. You’ll see a few options like so:


First up, make sure the “Command Line” field points to your correct Python interpreter; for instance, I put /usr/bin/python for mine. Next, change the “Working Directory” to the location where you saved the transforms earlier.

Repeat the steps above for both Gavel transforms and you should be all set. One last thing before you go, though: I believe you need to download an entity expansion pack to use one of the entities I added (the car), which can be found here. It’ll still work without this; however, the entity will show up as a chess piece if its type is not found.

That should cover it, however, if those instructions don’t work, please feel free to email me or reach out to me on Twitter or something.

Future Development

With Maryland being the only state covered, we definitely want to expand this as far as we can. We’ll try to do other states as time allows, but that’s why we need your help!

@__eth0 has done a lot of work on Delaware and just needs to do some minor tweaking. Once that’s done, we’ll require users to add a property value of “State” when they create the person entity so we know which state to query.

A minor thing that I’ll most likely complete this week is adding to each entity the date it’s from, so each address and vehicle will have a month/year attribute that tells you how current the data is. One thing we thought would be useful as well is to correlate this information against state property records for validation. Anyone can go into a state’s property records and look up an address to see the current owners, so this would be an excellent way to validate the data from the case records.

You can find all the code on my Github page for what we have currently, and if you have any comments or questions, please feel free to reach out to us on Twitter at @brian_warehime or @__eth0.

#osint #maltego #privacy #projects

This was originally published on nullsecure.org and has been republished with permission.

Hey everyone, today we're doing something different. This is going to be a joint blog post from Ethan Dodge and me, in which we talk about phishing-defense coverage across the Alexa Top 100 domains, which will also expose the best attack vectors for phishing against these domains.

We're going to be using a combination of the new DNS reconnaissance tool DNStwist as well as some custom Python scripts to gather and analyze all the information we find, which we'll include in this post if you want to follow along or do your own research.

Overview

Here's a rundown of what we'll be doing to get all the information we need. We'll start by pulling down the Alexa Top 100 domains, then create a script to run them through a modified version of DNStwist, which gives us each permutated domain as well as the type of permutation (bitsquatting, insertion, omission, replacement, etc.). We'll then do a host lookup on each permutated domain to get the IP address hosting it. Lastly, we'll do a Whois lookup and a reverse DNS lookup on each IP we get and compare the registrar/pointer-record information against the domain to see if they match up.

After we have the comparison data, we'll be able to calculate which types of permutations are most covered against attacks (meaning the original domain owner registered the permutated domain, possibly to prevent phishing attempts), as well as which types are least covered.

Grabbing the Data

The first thing we need to do is grab the Alexa Top One Million sites and narrow that down to our scope of 100.

wget http://s3.amazonaws.com/alexa-static/top-1m.csv.zip
unzip top-1m.csv.zip

Then we'll just cut that down to 100 domains.

awk -F',' '{print $2}' top-1m.csv | head -n 100 > alexatop100.txt

This will give us something that looks like this:

https://i.imgur.com/4BMakl1.png

Getting Permutations of Domains

Starting out, we'll use DNStwist to get a list of permutations. We went ahead and modified the original script so it doesn't print extra information we don't need; we only wanted the type of permutation and the resulting domain.

Here's a screenshot of the resulting script being run using google.com as an example.

https://i.imgur.com/ewhsmic.png

If you want the modified version of dnstwist we used, you can grab it here.
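For a sense of what these permutation engines do, here's a rough, self-contained illustration of four of the permutation types. This is not dnstwist's actual code, which is far more thorough (keyboard adjacency, homoglyphs, TLD handling, and so on):

```python
# Rough re-implementations of four dnstwist-style permutation types,
# for illustration only. Each takes a domain label and returns the set
# of single-change variants.

def omission(label):
    """Drop one character at a time: google -> gogle, googl, ..."""
    return {label[:i] + label[i+1:] for i in range(len(label))}

def repetition(label):
    """Double one character at a time: google -> ggoogle, gooogle, ..."""
    return {label[:i] + label[i] + label[i:] for i in range(len(label))}

def transposition(label):
    """Swap each adjacent pair: google -> ogogle, gogole, ..."""
    return {label[:i] + label[i+1] + label[i] + label[i+2:]
            for i in range(len(label) - 1)}

def insertion(label, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Insert one letter at each interior position: google -> gaoogle, ..."""
    return {label[:i] + c + label[i:]
            for i in range(1, len(label)) for c in alphabet}

# e.g. omission("google") -> {'oogle', 'gogle', 'goole', 'googe', 'googl'}
print(sorted(omission("google")))
```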

Now that we have our list of domains, we'll use this bash one-liner to loop through each of them, run each through our modified dnstwist, and output the results into its own file in a new directory:

while read domain; do python dnstwist.py "$domain" > ~/Desktop/alexatop100/"$domain"; done < ~/Desktop/alexatop100.txt

Running the above takes about 5 seconds and gives us a directory looking like this:

https://i.imgur.com/wLbeUqU.png

Host Lookup

The next thing on our plate is a host lookup on the resulting permutated domains. We want to end up with a text file containing the permutated domain, the permutation type, and either the IP address (if the domain resolves) or the string “NXDOMAIN” (if it doesn't).

Here's the bash one-liner we used to loop through all the permutation files, run the host lookup on each entry, and write the results to a new file for later analysis.

for file in *; do python hostlookup.py $file; done

After letting the above command run for about 30 minutes or so, we're left with a directory that looks like this:

https://i.imgur.com/A5nLMyr.png

You'll notice the directory now has another 100 files suffixed with _hostlookup. Inside each file, we see each permutated domain with the IP address it resolved to.

https://i.imgur.com/dOoIYtA.png
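hostlookup.py itself isn't shown above, but a minimal stand-in for its core logic looks something like this (the real script also reads and writes the per-domain files):

```python
# Minimal stand-in for the core of hostlookup.py: resolve a domain and
# record either its IP address or the string "NXDOMAIN".
import socket

def resolve(domain):
    """Return the first A record for domain, or "NXDOMAIN" on failure."""
    try:
        return socket.gethostbyname(domain)
    except socket.gaierror:
        return "NXDOMAIN"

# A name under the reserved .invalid TLD can never resolve:
print(resolve("gooogle.invalid"))  # NXDOMAIN
```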

Reverse DNS Lookup

Initially we were going to run a reverse DNS lookup against the IPs to see what pointer records they had; however, we thought a Whois lookup would be more trustworthy, so we ended up doing both. In any event, here are the steps we took to run a reverse DNS lookup on all the domains and permutations.

Next, we wrote a Python script to grab the pointer record returned for each permutation. Its output includes the domain, IP, permutation type, the pointer record, and a True or False value depending on whether the original hostname was seen in the hostname we grabbed.

Running the above script in another bash one-liner:

for file in *; do python rdnslookup.py $file; done

we get another 100 files in our directory, suffixed with _rdns.

https://i.imgur.com/zGDQJMf.png

Inside each file we can see the resulting Pointer record and the True or False string.

https://i.imgur.com/hpTms00.png
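rdnslookup.py isn't shown above either; a minimal stand-in for its two pieces (the PTR lookup and the True/False match check) might look like this, with made-up hostnames in the example:

```python
# Stand-in for the two pieces of rdnslookup.py: a PTR lookup and the
# True/False check on whether the original name appears in the PTR record.
import socket

def ptr_lookup(ip):
    """Return the PTR record for ip, or None if there isn't one."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror):
        return None

def is_match(original_domain, hostname):
    """True if the registrable name appears in the PTR hostname."""
    if not hostname:
        return False
    base = original_domain.split(".")[0]  # "amazon.com" -> "amazon"
    return base.lower() in hostname.lower()

print(is_match("amazon.com", "ec2-52-1-2-3.compute-1.amazonaws.com"))  # True
print(is_match("google.com", "lb.example-hosting.net"))                # False
```

Note that this naive substring check is exactly what makes the data "unvetted" later on: a legitimately registered defensive domain whose PTR record doesn't contain the original name will score False.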

Moving on to the Whois lookup...

Whois Lookup

The last piece we need to code before we can analyze the data is the Whois lookup on all the IP addresses we grabbed in the host-lookup step.

For this part, we just want to grab the description field of the Whois info, which should tell us the company that owns that IP address. Between the Whois and reverse DNS lookups, we should be able to determine whether the owner of the IP address matches the permutated domain.
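We won't paste the whole Whois script, but the parsing piece reduces to pulling the first description-style line out of the raw output. Here's a stand-in over canned, abbreviated ARIN-style output; note that field names vary by registry (RIPE uses descr:, ARIN uses OrgName:):

```python
# Stand-in for the parsing half of our Whois step: pull the owner out of
# raw whois output. The sample text below is abbreviated/made up.
def whois_owner(whois_text):
    """Return the first descr:/OrgName: value found, or None."""
    for line in whois_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() in ("descr", "orgname"):
            return value.strip() or None
    return None

sample = """\
NetRange:  205.251.192.0 - 205.251.255.255
OrgName:   Amazon.com, Inc.
OrgId:     AMAZO-4
"""
print(whois_owner(sample))  # Amazon.com, Inc.
```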

Now that we have our final data from two different sources (Whois and reverse DNS), we can run some statistics on it to answer the questions we asked earlier in the post. First things first, though: we'll need to get Splunk set up to ingest the data.

Splunk Setup

In order for Splunk to recognize the fields, we'll configure the props.conf file in /opt/splunk/etc/system/local/ with the following settings:

[phishing]
REPORT-phishing = REPORT-phishing

[whois]
REPORT-whois = REPORT-whois

Next, we edit the transforms.conf file in /opt/splunk/etc/system/local/ like so:

[REPORT-phishing]
DELIMS = " "
FIELDS = "domain","ip","permtype","hostname","ismatch"

[REPORT-whois]
DELIMS = " "
FIELDS = "domain","ip","permtype","owner","ismatch"

That's all that needs to be done in order to parse the events, which will now look something like this:

https://i.imgur.com/dqKiCa4.png

Before We Begin

Before we get into the specific results from Whois and reverse DNS, it'll help to identify the different types of permutations, as described by Lenny Zeltser on his blog:

  • Bitsquatting, which anticipates a small portion of systems encountering hardware errors, resulting in the mutation of the resolved domain name by 1 bit (e.g., xeltser.com).
  • Homoglyph, which replaces a letter in the domain name with letters that look similar (e.g., ze1tser.com).
  • Repetition, which repeats one of the letters in the domain name (e.g., zeltsser.com).
  • Transposition, which swaps two letters within the domain name (e.g., zelster.com).
  • Replacement, which replaces one of the letters in the domain name, perhaps with a letter in proximity of the original letter on the keyboard (e.g., zektser.com).
  • Omission, which removes one of the letters from the domain name (e.g., zelser.com).
  • Insertion, which inserts a letter into the domain name (e.g., zerltser.com).
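As a worked example of the first item, a one-bit-flip generator is only a few lines, and confirms that xeltser.com really is one flipped bit away from zeltser.com (again, this is an illustration, not dnstwist's implementation):

```python
# One-bit-flip generator for the bitsquatting case above. Flipping bit 1
# of 'z' (0x7A) yields 'x' (0x78), which is how zeltser.com becomes
# xeltser.com.
import string

def bitsquat(domain):
    """All domains differing from `domain` by a single flipped bit,
    keeping only flips that still yield a letter, digit, or hyphen."""
    valid = set(string.ascii_lowercase + string.digits + "-")
    results = set()
    for i, ch in enumerate(domain):
        if ch == ".":  # leave the TLD separator alone
            continue
        for bit in range(8):
            flipped = chr(ord(ch) ^ (1 << bit)).lower()
            if flipped in valid and flipped != ch:
                results.add(domain[:i] + flipped + domain[i+1:])
    return results

print("xeltser.com" in bitsquat("zeltser.com"))  # True
```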

Whois Analysis

Ok, let's jump into the data! We'll start with the analysis of the whois data.

Below is a list of the top permutation types registered:

sourcetype=whois | top perm_type

Permutation Type Count
Replacement 2002
Insertion 1849
Bitsquatting 1347
Omission 454
Repetition 400
Transposition 347
Homoglyph 335
Concourse 313
Subdomain 193
Hyphenation 146

Alright, out of all the registered domains, how many permutated domains are potentially registered by the original domain owner?

sourcetype=whois is_match=true | stats count

Out of all the domains registered, according to our unvetted data, only 460 were registered by the original domain owner.

Now, let's see which permutation types are protected against the most.

Permutation Type Count
Insertion 146
Replacement 130
Bitsquatting 53
Repetition 41
Omission 35
Transposition 23
Homoglyph 19
Hyphenation 11
Subdomain 2

Out of all the Insertion permutated domains, let's identify the domains that are protected the most:

sourcetype=whois is_match=true perm_type="Insertion" | rex field=source "\/tmp\/(?<original_domain>[^_]+)"| top original_domain

Domain Count
amazon.com 29
microsoft.com 28
booking.com 26
amazon.co.uk 25
yahoo.com 15
amazon.in 9
netflix.com 7
wikipedia.org 2
yandex.ru 1
msn.com 1

Now, let's just do the most protected domains regardless of permutation:

sourcetype=whois is_match=true | rex field=source "\/tmp\/(?<original_domain>[^_]+)"| top original_domain

Domain Count
amazon.com 82
amazon.co.uk 63
microsoft.com 62
booking.com 55
amazon.in 54
yahoo.com 44
netflix.com 29
bing.com 17
apple.com 10
wikipedia.org 6

Looks like Amazon is the most concerned about someone ripping off their domain :)

Reverse DNS Analysis

Alright, let's move onto Reverse DNS Analysis.

Let's get the most common permutation types registered.

sourcetype=phishing | top perm_type

Permutation Type Count
Replacement 1163
Insertion 1025
Bitsquatting 796
Omission 236
Repetition 227
Homoglyph 203
Transposition 194
Subdomain 98
Hyphenation 95

Alright, out of all the registered domains, how many permutated domains are registered by the original domain owner?

sourcetype=phishing is_match=true | stats count

Out of all the domains registered, according to our unvetted data, only 381 were registered by the original domain owner.

Now, let's see which permutation types are protected against the most.

Permutation Type Count
Insertion 114
Replacement 108
Bitsquatting 47
Repetition 32
Omission 27
Transposition 25
Homoglyph 17
Hyphenation 10
Subdomain 1

Out of all the Insertion permutated domains, let's identify the domains that are protected the most:

sourcetype=phishing is_match=true perm_type="Insertion" | rex field=source "\/tmp\/(?<original_domain>[^_]+)" | top original_domain

Domain Count
amazon.com 29
booking.com 26
amazon.co.uk 25
yahoo.com 17
amazon.in 9
netflix.com 4
yandex.ru 1
wikipedia.org 1
msn.com 1
blogspot.com 1

Now, let's just do the most protected domains regardless of permutation:

sourcetype=phishing is_match=true | rex field=source "\/tmp\/(?<original_domain>[^_]+)" | top original_domain

Domain Count
amazon.com 82
amazon.co.uk 63
yahoo.com 56
booking.com 55
amazon.in 54
netflix.com 19
bing.com 12
google.de 8
google.com 7
wikipedia.org 5

Just like the Whois data, Amazon takes the cake for most permutated domains registered. It makes sense though, since that would be a great site to phish for credentials.

DDoS Protection Sites

We knew that we would have some incorrect data, since not every registered domain points exactly to its owner's name; for example, wikipedia.com is owned by Wikimedia, so we wouldn't count that as true.

One of the big things we noticed was the number of domains resolving to prolexic.com, a DDoS protection service that companies use to absorb DDoS attempts against their sites. We doubted that phishing domains and/or malicious actors would enlist the help of a DDoS protection service, since it probably costs a lot of money based on traffic. With that in mind, we are going to count prolexic.com hits as true and see what kind of results we get.

Let's rerun some of the original searches now...

First, how many permutated domains are protected?

sourcetype=phishing | eval ddos=if(searchmatch("hostname=*prolexic*"),"True","False") | search ddos="True" OR is_match="True" | stats count

We see that there are now 808 domains protected, instead of the original 381, big change!
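The eval logic above is easy to sanity-check in plain Python over rows shaped like our phishing sourcetype (the rows here are made up):

```python
# The Splunk eval above, re-done in plain Python over made-up rows:
# count a domain as protected if the owner matched OR the PTR record
# points at the DDoS-protection provider.
rows = [
    {"domain": "amaz0n.com",  "hostname": "a23-45.deploy.prolexic.com", "is_match": False},
    {"domain": "amazonn.com", "hostname": "ec2-1-2-3-4.amazonaws.com",  "is_match": True},
    {"domain": "gooogle.net", "hostname": "parked.example-host.net",    "is_match": False},
]

protected = sum(
    1 for r in rows
    if r["is_match"] or "prolexic" in r["hostname"]
)
print(protected)  # 2
```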

Now we'll see what permutation types are protected against the most:

Permutation Type Count
Replacement 243
Insertion 211
Bitsquatting 139
Repetition 48
Transposition 45
Omission 44
Homoglyph 40
Hyphenation 19
Subdomain 19

Lastly, what domains are protected the most:

Domain Count
amazon.com 91
amazon.co.uk 66
booking.com 59
yahoo.com 58
amazon.in 56
pinterest.com 41
netflix.com 26
google.es 21
paypal.com 19
imdb.com 15

Final Thoughts

So, some interesting things came out of this research. First, the most common types of permutated domains that companies register use the replacement or insertion techniques (netflox.com or netfliix.com). We also discovered that a majority of companies use DDoS protection services to register permutated domains (this isn't really a surprise, just interesting to note). Lastly, we see that amazon.com, booking.com, and yahoo.com are the most protected against potential phishing attempts.

That last point isn't too surprising either, but it's interesting to see which companies took the time and steps to register these other domains. Amazon and Yahoo! are definitely sites I would expect to see, more so Amazon than Yahoo; then again, since Yahoo has been around for a while, it makes sense.

If you're interested in checking out the data we used, you can find it here. If you have any questions or comments about this info, please feel free to reach out to us on Twitter, @brian_warehime or @egd_io.

#osint