This article was originally posted on nullsecure.org and has been republished with permission.

I’ve been pretty busy lately updating Tango to version 2.0 and working on threatnote, but another project I started recently is something @__eth0 and I are calling Gavel. Gavel is a set of Maltego transforms that query traffic records in each state. This project started out really ambitiously: we wanted to cover all 50 states, but we ran into several problems. Our goal was to provide a way to look up certain data available in the traffic records, including:

  • Address
  • Height
  • Weight
  • Age
  • License Plate Number
  • Car Make/Model

This is great Open-Source Intelligence (OSINT), and we wanted to make it easy for researchers to obtain through Maltego. As mentioned above, though, we ran into several problems that are preventing us from releasing it as a full-blown set of transforms.

Roadblocks

The first problem we hit was that some states require you to pay for each query you make against the database. If we hosted this transform on a server, we wouldn’t be able to cover the cost of each of these queries, and even if we provided the code to the users, I’m not sure we could code a good solution to handle the payment information for each query.

The next problem was that some states are broken out by county. This would create a lot of extra work for us, and by the time we finished one county, another one might have changed its code, so it’s a ton of maintenance work to get them all working. Also, some states/counties use CAPTCHAs for each query, and I’ve had no experience getting around them.

So, with those problems at hand, we decided to open-source this tool to the community in the hope that anyone who would benefit from this OSINT tool can code their own county and/or state. We’re aware we may never have all the states and counties covered; however, we’d like to get as many done as we can.

Currently, only Maryland is complete, so if you live there, you’re in luck! The code isn’t that difficult; it just took a little bit of work with the requests to get the exact responses we needed. The worst part is trying to parse the HTML, which I have no problem saying… I suck at.
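If you want to write a transform for your own state, the scraping pattern itself is manageable with the standard library. Here’s a rough sketch; the markup below is hypothetical (Maryland’s real case-search HTML differs), as is the “caseid” class name:

```python
# Sketch of the kind of HTML scraping a Gavel transform does: subclass
# HTMLParser and collect the table cells you care about. The markup here
# is made up for illustration.
from html.parser import HTMLParser

class CaseRowParser(HTMLParser):
    """Collect the text of every <td class="caseid"> cell."""
    def __init__(self):
        super().__init__()
        self.in_case_cell = False
        self.case_ids = []

    def handle_starttag(self, tag, attrs):
        if tag == "td" and ("class", "caseid") in attrs:
            self.in_case_cell = True

    def handle_data(self, data):
        if self.in_case_cell and data.strip():
            self.case_ids.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_case_cell = False

sample = """
<table>
  <tr><td class="caseid">6B00123456</td><td>SMITH, JOHN</td></tr>
  <tr><td class="caseid">6B00654321</td><td>SMITH, JOHN A</td></tr>
</table>
"""

parser = CaseRowParser()
parser.feed(sample)
print(parser.case_ids)  # ['6B00123456', '6B00654321']
```

The real transforms do the same thing against the live case-search results, just with messier markup.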

How It Works

To use Gavel, you’ll simply download the code we provide and import the transforms into Maltego. Once all the code is set up and in the right place, you would then just add a “Person” entity to your Maltego graph like so:


Next, you would right-click on the entity and run “Gavel — Get Names”. This transform searches through the state’s traffic records and returns the names of individuals that match your search. For example, if your name was John Smith, there would probably be a ton of case records for that name; that’s why we give you the names first, since it makes it easier to narrow down to the specific person you’re looking for. This step also adds to each entity the properties holding the case IDs it will need to query in the next step.

Next, you would right-click on the person of interest and select “Gavel — Get Addresses”. It will then iterate through the case IDs in the entity’s properties and return location and vehicle entities based on the information it finds.

Here’s a screenshot of what the end result would be.


In the image above, you can see at the top the original entity we added, “Brian Warehime”. Below that are the case records matching that name, each holding its case IDs in its properties. Below the name are all the addresses and vehicle information we could discover. (This is made-up data, since my last traffic stop was so many years ago that it aged out.)

You’ll notice on the right-hand side, in the “Property View” section, that we added additional properties to the person entity. We added the height, weight, and DOB for each target, which will help you validate that this is your target.

As for the vehicle entities, we display the license plate number; however, if you select the entity, in the properties area on the right-hand side you’ll find the year, make, and possible body style for the vehicle. See below for a screenshot:


Looking at the screenshot above, we can see it’s a 2000 GMC with a possible body style/make of “05”. I’m not sure where we could look that number up to find what model it corresponds to, so if you know, please let me know!

Installation

On the GitHub page, you’ll find a few files to download: a few Python scripts, the Maltego library we use, and an .mtz file.

First up, place the Python scripts in a location on your computer, like /Users/<yourname>/Maltego/Transforms or wherever. Next, place the Maltego library in the same directory as the two Python scripts you just moved.

Next, open up Maltego. Click on “Manage” in the titlebar, followed by clicking on the “Import Config” button. Locate the .mtz file you downloaded and click next. Make sure the “Local Transforms” and “Transform Sets” buttons are checked and click next. Once installed, click on “Finish”.

To make sure these transforms run correctly, we’ll need to set up your environment. Click on “Manage Transforms” in the menubar and it’ll open the “Transform Manager”. Next, scroll down until you find the Gavel transforms. Click on the first one and look at the bottom right of the window. You’ll see a few options like so:


First up, make sure the “Command Line” field points to your correct Python interpreter; for instance, I put /usr/bin/python for mine. Next, change the “Working Directory” to the location where you saved the transforms earlier.

Repeat the steps above for both Gavel transforms and you should be all set. One last thing before you go, though: I believe you need to download an entity expansion pack to use one of the entities I added (the car), which can be found here. It’ll still work without this; however, the entity will show up as a chess piece if its type is not found.

That should cover it, however, if those instructions don’t work, please feel free to email me or reach out to me on Twitter or something.

Future Development

With Maryland being the only state covered, we definitely want to expand this as far as we can. We’ll try to do other states as time allows, but that’s why we need your help!

@__eth0 has done a lot of work on Delaware and just needs to do some minor tweaking. Once that’s done, we’ll require users to add a property value of “State” when they create the person entity so we know which state to query.

A minor thing that I’ll most likely complete this week is adding to each entity the date it’s from, so each address and vehicle will have a month/year attribute that tells you how current the data is. One thing we thought would be useful as well is to correlate this information against state property records for validation. Anyone can go into a state’s property records and look up an address to see the current owners, so this would be an excellent way to validate the data from the case records.

You can find all the code on my Github page for what we have currently, and if you have any comments or questions, please feel free to reach out to us on Twitter at @brian_warehime or @__eth0.

#osint #maltego #privacy #projects

This was originally published on nullsecure.org and has been republished with permission.

Hey everyone, today we're doing something different. This is going to be a joint blog post from Ethan Dodge and me, in which we talk about phishing-defense coverage across the Alexa Top 100 domains, which will also expose the best attack vectors for phishing against these domains.

We're going to be using a combination of the new DNS reconnaissance tool DNStwist as well as some custom Python scripts to gather and analyze all the information we find, which we'll include in this post if you want to follow along or do your own research.

Overview

Here's a rundown of what we'll be doing to get all the information we need. We'll start by pulling down the Alexa Top 100 domains, then create a script to run them through a modified version of DNStwist, which gives us each permutated domain as well as the type of permutation (bitsquatting, insertion, omission, replacement, etc.). We'll then do a host lookup on each permutated domain to get the IP address hosting it. Lastly, we'll do a Whois lookup and a reverse DNS lookup on each IP we get and compare the registrar/pointer-record information against the domain to see if they match up.

After we have the comparison data, we'll be able to calculate which types of permutations are most covered against attacks (meaning the original domain owner registered the permutated domain, possibly to prevent phishing attempts), as well as which types are least covered.

Grabbing the Data

The first thing we need to do is grab the Alexa Top One Million sites and narrow that down to our scope of 100.

wget http://s3.amazonaws.com/alexa-static/top-1m.csv.zip
unzip top-1m.csv.zip

Then we'll just cut that down to 100 domains.

awk -F',' '{print $2}' top-1m.csv | head -n 100 > alexatop100.txt

This will give us something that looks like this:

https://i.imgur.com/4BMakl1.png

Getting Permutations of Domains

Starting out, we'll use DNStwist to get a list of permutations. We went ahead and modified the original script so it doesn't print extra information we don't need; we only wanted the type of permutation and the resulting domain.

Here's a screenshot of the resulting script being run using google.com as an example.

https://i.imgur.com/ewhsmic.png

If you want the modified version of dnstwist we used, you can grab it here.
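For a sense of what these permutation engines do, here's a rough, self-contained illustration of four of the permutation types. This is not dnstwist's actual code, which is far more thorough (keyboard adjacency, homoglyphs, TLD handling, and so on):

```python
# Rough re-implementations of four dnstwist-style permutation types,
# for illustration only. Each takes a domain label and returns the set
# of single-change variants.

def omission(label):
    """Drop one character at a time: google -> gogle, googl, ..."""
    return {label[:i] + label[i+1:] for i in range(len(label))}

def repetition(label):
    """Double one character at a time: google -> ggoogle, gooogle, ..."""
    return {label[:i] + label[i] + label[i:] for i in range(len(label))}

def transposition(label):
    """Swap each adjacent pair: google -> ogogle, gogole, ..."""
    return {label[:i] + label[i+1] + label[i] + label[i+2:]
            for i in range(len(label) - 1)}

def insertion(label, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Insert one letter at each interior position: google -> gaoogle, ..."""
    return {label[:i] + c + label[i:]
            for i in range(1, len(label)) for c in alphabet}

# e.g. omission("google") -> {'oogle', 'gogle', 'goole', 'googe', 'googl'}
print(sorted(omission("google")))
```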

Now that we have our list of domains, we'll use this bash one-liner to loop through each of them, run each through our modified dnstwist, and output the results into its own file in a new directory:

while read domain; do python dnstwist.py "$domain" > ~/Desktop/alexatop100/"$domain"; done < ~/Desktop/alexatop100.txt

Running the above takes about 5 seconds and gives us a directory looking like this:

https://i.imgur.com/wLbeUqU.png

Host Lookup

The next thing on our plate is a host lookup on the resulting permutated domains. We want to end up with a text file containing the permutated domain, the permutation type, and either the IP address (if the domain resolves) or the string “NXDOMAIN” (if it doesn't).

Here's the bash one-liner we used to loop through all the permutation files, run the host lookup on each entry, and write the results to a new file for later analysis.

for file in *; do python hostlookup.py $file; done

After letting the above command run for about 30 minutes or so, we're left with a directory that looks like this:

https://i.imgur.com/A5nLMyr.png

You'll notice the directory now has another 100 files suffixed with _hostlookup. Inside each file, we see each permutated domain with the IP address it resolved to.

https://i.imgur.com/dOoIYtA.png
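hostlookup.py itself isn't shown above, but a minimal stand-in for its core logic looks something like this (the real script also reads and writes the per-domain files):

```python
# Minimal stand-in for the core of hostlookup.py: resolve a domain and
# record either its IP address or the string "NXDOMAIN".
import socket

def resolve(domain):
    """Return the first A record for domain, or "NXDOMAIN" on failure."""
    try:
        return socket.gethostbyname(domain)
    except socket.gaierror:
        return "NXDOMAIN"

# A name under the reserved .invalid TLD can never resolve:
print(resolve("gooogle.invalid"))  # NXDOMAIN
```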

Reverse DNS Lookup

Initially we were going to run a reverse DNS lookup against the IPs to see what pointer records they had; however, we thought a Whois lookup would be more trustworthy, so we ended up doing both. In any event, here are the steps we took to run a reverse DNS lookup on all the domains and permutations.

Next, we wrote a Python script to grab the pointer record returned for each permutation. Its output includes the domain, IP, permutation type, the pointer record, and a True or False value depending on whether the original hostname was seen in the hostname we grabbed.

Running the above script in another bash one-liner:

for file in *; do python rdnslookup.py $file; done

we get another 100 files in our directory, suffixed with _rdns.

https://i.imgur.com/zGDQJMf.png

Inside each file we can see the resulting Pointer record and the True or False string.

https://i.imgur.com/hpTms00.png
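rdnslookup.py isn't shown above either; a minimal stand-in for its two pieces (the PTR lookup and the True/False match check) might look like this, with made-up hostnames in the example:

```python
# Stand-in for the two pieces of rdnslookup.py: a PTR lookup and the
# True/False check on whether the original name appears in the PTR record.
import socket

def ptr_lookup(ip):
    """Return the PTR record for ip, or None if there isn't one."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror):
        return None

def is_match(original_domain, hostname):
    """True if the registrable name appears in the PTR hostname."""
    if not hostname:
        return False
    base = original_domain.split(".")[0]  # "amazon.com" -> "amazon"
    return base.lower() in hostname.lower()

print(is_match("amazon.com", "ec2-52-1-2-3.compute-1.amazonaws.com"))  # True
print(is_match("google.com", "lb.example-hosting.net"))                # False
```

Note that this naive substring check is exactly what makes the data "unvetted" later on: a legitimately registered defensive domain whose PTR record doesn't contain the original name will score False.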

Moving on to the Whois lookup...

Whois Lookup

The last piece we need to code before we can analyze the data is the Whois lookup on all the IP addresses we grabbed in the host-lookup step.

For this part, we just want to grab the description field of the Whois info, which should tell us the company that owns that IP address. Between the Whois and reverse DNS lookups, we should be able to determine whether the owner of the IP address matches the permutated domain.
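We won't paste the whole Whois script, but the parsing piece reduces to pulling the first description-style line out of the raw output. Here's a stand-in over canned, abbreviated ARIN-style output; note that field names vary by registry (RIPE uses descr:, ARIN uses OrgName:):

```python
# Stand-in for the parsing half of our Whois step: pull the owner out of
# raw whois output. The sample text below is abbreviated/made up.
def whois_owner(whois_text):
    """Return the first descr:/OrgName: value found, or None."""
    for line in whois_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() in ("descr", "orgname"):
            return value.strip() or None
    return None

sample = """\
NetRange:  205.251.192.0 - 205.251.255.255
OrgName:   Amazon.com, Inc.
OrgId:     AMAZO-4
"""
print(whois_owner(sample))  # Amazon.com, Inc.
```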

Now that we have our final data from two different sources (Whois and reverse DNS), we can run some statistics on it to answer the questions we asked earlier in the post. First things first, though: we'll need to get Splunk set up to ingest the data.

Splunk Setup

In order for Splunk to recognize the fields, we'll configure the props.conf file in /opt/splunk/etc/system/local/ with the following settings:

[phishing]
REPORT-phishing = REPORT-phishing

[whois]
REPORT-whois = REPORT-whois

Next, we edit the transforms.conf file in /opt/splunk/etc/system/local/ like so:

[REPORT-phishing]
DELIMS = " "
FIELDS = "domain","ip","permtype","hostname","ismatch"

[REPORT-whois]
DELIMS = " "
FIELDS = "domain","ip","permtype","owner","ismatch"

That's all that needs to be done in order to parse the events, which will now look something like this:

https://i.imgur.com/dqKiCa4.png

Before We Begin

Before we get into the specific results from Whois and reverse DNS, it'll help to identify the different types of permutations, as described by Lenny Zeltser on his blog:

  • Bitsquatting, which anticipates a small portion of systems encountering hardware errors, resulting in the mutation of the resolved domain name by 1 bit (e.g., xeltser.com).
  • Homoglyph, which replaces a letter in the domain name with letters that look similar (e.g., ze1tser.com).
  • Repetition, which repeats one of the letters in the domain name (e.g., zeltsser.com).
  • Transposition, which swaps two letters within the domain name (e.g., zelster.com).
  • Replacement, which replaces one of the letters in the domain name, perhaps with a letter in proximity of the original letter on the keyboard (e.g., zektser.com).
  • Omission, which removes one of the letters from the domain name (e.g., zelser.com).
  • Insertion, which inserts a letter into the domain name (e.g., zerltser.com).
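As a worked example of the first item, a one-bit-flip generator is only a few lines, and confirms that xeltser.com really is one flipped bit away from zeltser.com (again, this is an illustration, not dnstwist's implementation):

```python
# One-bit-flip generator for the bitsquatting case above. Flipping bit 1
# of 'z' (0x7A) yields 'x' (0x78), which is how zeltser.com becomes
# xeltser.com.
import string

def bitsquat(domain):
    """All domains differing from `domain` by a single flipped bit,
    keeping only flips that still yield a letter, digit, or hyphen."""
    valid = set(string.ascii_lowercase + string.digits + "-")
    results = set()
    for i, ch in enumerate(domain):
        if ch == ".":  # leave the TLD separator alone
            continue
        for bit in range(8):
            flipped = chr(ord(ch) ^ (1 << bit)).lower()
            if flipped in valid and flipped != ch:
                results.add(domain[:i] + flipped + domain[i+1:])
    return results

print("xeltser.com" in bitsquat("zeltser.com"))  # True
```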

Whois Analysis

Ok, let's jump into the data! We'll start with the analysis of the whois data.

Below is a list of the top permutation types registered:

sourcetype=whois | top perm_type

Permutation Type Count
Replacement 2002
Insertion 1849
Bitsquatting 1347
Omission 454
Repetition 400
Transposition 347
Homoglyph 335
Concourse 313
Subdomain 193
Hyphenation 146

Alright, out of all the registered domains, how many permutated domains are potentially registered by the original domain owner?

sourcetype=whois is_match=true | stats count

Out of all the domains registered, according to our unvetted data, only 460 were registered by the original domain owner.

Now, let's see which permutation types are protected against the most.

Permutation Type Count
Insertion 146
Replacement 130
Bitsquatting 53
Repetition 41
Omission 35
Transposition 23
Homoglyph 19
Hyphenation 11
Subdomain 2

Out of all the Insertion permutated domains, let's identify the domains that are protected the most:

sourcetype=whois is_match=true perm_type="Insertion" | rex field=source "\/tmp\/(?<original_domain>[^_]+)"| top original_domain

Domain Count
amazon.com 29
microsoft.com 28
booking.com 26
amazon.co.uk 25
yahoo.com 15
amazon.in 9
netflix.com 7
wikipedia.org 2
yandex.ru 1
msn.com 1

Now, let's just do the most protected domains regardless of permutation:

sourcetype=whois is_match=true | rex field=source "\/tmp\/(?<original_domain>[^_]+)"| top original_domain

Domain Count
amazon.com 82
amazon.co.uk 63
microsoft.com 62
booking.com 55
amazon.in 54
yahoo.com 44
netflix.com 29
bing.com 17
apple.com 10
wikipedia.org 6

Looks like Amazon is the most concerned about someone ripping off their domain :)

Reverse DNS Analysis

Alright, let's move onto Reverse DNS Analysis.

Let's get the most common permutation types registered.

sourcetype=phishing | top perm_type

Permutation Type Count
Replacement 1163
Insertion 1025
Bitsquatting 796
Omission 236
Repetition 227
Homoglyph 203
Transposition 194
Subdomain 98
Hyphenation 95

Alright, out of all the registered domains, how many permutated domains are registered by the original domain owner?

sourcetype=phishing is_match=true | stats count

Out of all the domains registered, according to our unvetted data, only 381 were registered by the original domain owner.

Now, let's see which permutation types are protected against the most.

Permutation Type Count
Insertion 114
Replacement 108
Bitsquatting 47
Repetition 32
Omission 27
Transposition 25
Homoglyph 17
Hyphenation 10
Subdomain 1

Out of all the Insertion permutated domains, let's identify the domains that are protected the most:

sourcetype=phishing is_match=true perm_type="Insertion" | rex field=source "\/tmp\/(?<original_domain>[^_]+)" | top original_domain

Domain Count
amazon.com 29
booking.com 26
amazon.co.uk 25
yahoo.com 17
amazon.in 9
netflix.com 4
yandex.ru 1
wikipedia.org 1
msn.com 1
blogspot.com 1

Now, let's just do the most protected domains regardless of permutation:

sourcetype=phishing is_match=true | rex field=source "\/tmp\/(?<original_domain>[^_]+)" | top original_domain

Domain Count
amazon.com 82
amazon.co.uk 63
yahoo.com 56
booking.com 55
amazon.in 54
netflix.com 19
bing.com 12
google.de 8
google.com 7
wikipedia.org 5

Just like the Whois data, Amazon takes the cake for most permutated domains registered. It makes sense though, since that would be a great site to phish for credentials.

DDoS Protection Sites

We knew that we would have some incorrect data, since not every registered domain points exactly to its owner's name; for example, wikipedia.com is owned by Wikimedia, so we wouldn't count that as true.

One of the big things we noticed was the number of domains resolving to prolexic.com, a DDoS protection service that companies use to absorb DDoS attempts against their sites. We doubted that phishing domains and/or malicious actors would enlist the help of a DDoS protection service, since it probably costs a lot of money based on traffic. With that in mind, we are going to count prolexic.com hits as true and see what kind of results we get.

Let's rerun some of the original searches now...

First, how many permutated domains are protected?

sourcetype=phishing | eval ddos=if(searchmatch("hostname=*prolexic*"),"True","False") | search ddos="True" OR is_match="True" | stats count

We see that there are now 808 domains protected, instead of the original 381, big change!
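The eval logic above is easy to sanity-check in plain Python over rows shaped like our phishing sourcetype (the rows here are made up):

```python
# The Splunk eval above, re-done in plain Python over made-up rows:
# count a domain as protected if the owner matched OR the PTR record
# points at the DDoS-protection provider.
rows = [
    {"domain": "amaz0n.com",  "hostname": "a23-45.deploy.prolexic.com", "is_match": False},
    {"domain": "amazonn.com", "hostname": "ec2-1-2-3-4.amazonaws.com",  "is_match": True},
    {"domain": "gooogle.net", "hostname": "parked.example-host.net",    "is_match": False},
]

protected = sum(
    1 for r in rows
    if r["is_match"] or "prolexic" in r["hostname"]
)
print(protected)  # 2
```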

Now we'll see what permutation types are protected against the most:

Permutation Type Count
Replacement 243
Insertion 211
Bitsquatting 139
Repetition 48
Transposition 45
Omission 44
Homoglyph 40
Hyphenation 19
Subdomain 19

Lastly, what domains are protected the most:

Domain Count
amazon.com 91
amazon.co.uk 66
booking.com 59
yahoo.com 58
amazon.in 56
pinterest.com 41
netflix.com 26
google.es 21
paypal.com 19
imdb.com 15

Final Thoughts

So, some interesting things came out of this research. First, the most common types of permutated domains that companies register use the replacement or insertion techniques (netflox.com or netfliix.com). We also discovered that a majority of companies use DDoS protection services to register permutated domains (this isn't really a surprise, just interesting to note). Lastly, we see that amazon.com, booking.com, and yahoo.com are the most protected against potential phishing attempts.

That last point isn't too surprising either, but it's interesting to see which companies took the time and steps to register these other domains. Amazon and Yahoo! are definitely sites I would expect to see, more so Amazon than Yahoo; then again, since Yahoo has been around for a while, it makes sense.

If you're interested in checking out the data we used, you can find it here. If you have any questions or comments about this info, please feel free to reach out to us on Twitter, @brian_warehime or @egd_io.

#osint