Frances Coronel
November 2016
Objective
Create algorithms for generating honeywords: fake passwords that look real to an
adversary.
Problem 1
T is the empty set, so the algorithm uses no example passwords
First, the algorithm validates the given password; an invalid password is replaced with a random string.
If the password is valid, it is instead split into tokens. Tokenization breaks the password
into substrings of letters, digits, and punctuation. If a single-character uppercase token is followed by a lowercase token, the algorithm combines the two to form a
proper noun.
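The tokenization step can be sketched as follows. This is a minimal illustration, not the report's implementation: the name `split_tokens`, the ASCII-only character classes, and the choice to split letter runs by case are all assumptions made here.

```python
import re

# Assumed token classes: uppercase runs, lowercase runs, digit runs,
# and runs of anything else (treated as punctuation).
_TOKEN_RE = re.compile(r"[A-Z]+|[a-z]+|[0-9]+|[^A-Za-z0-9]+")


def split_tokens(password):
    """Split a password into tokens, merging 'A' + 'lice' into 'Alice'."""
    merged = []
    for tok in _TOKEN_RE.findall(password):
        prev = merged[-1] if merged else ""
        # A one-character uppercase token followed by a lowercase token
        # is joined into a single capitalized word (a "proper noun").
        if len(prev) == 1 and prev.isupper() and tok.islower():
            merged[-1] = prev + tok
        else:
            merged.append(tok)
    return merged


print(split_tokens("Alice2016!"))  # ['Alice', '2016', '!']
```

Joining the tokens back together always reproduces the original password, so later steps can vary individual tokens without losing characters.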
These token sets are used to create a cluster center by removing, adding, or re-ordering tokens. Variations of the tokens are then combined with the cluster center's
original tokens to create honeywords.
These honeywords are added to a collection called sweetwords. A random word is chosen
from this collection and is itself used to create a new cluster center.
The process of creating a honeyword is repeated until the desired number of sweetwords is
reached.
After they're generated, we run one final validation and remove any duplicates.
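The loop described above might look roughly like the sketch below. All names (`vary`, `new_center`, `generate_sweetwords`) are hypothetical, the variation rules are simplified stand-ins, and the sketch assumes the real password is counted among the sweetwords. Duplicates are rejected as they appear rather than in a final pass, and the initial validation step is omitted.

```python
import random
import re
import string

# Simplified tokenization (an assumption): letter runs, digit runs, other runs.
TOKEN_RE = re.compile(r"[A-Za-z]+|[0-9]+|[^A-Za-z0-9]+")


def vary(token, rng):
    """Return a random variation of one token within its character class."""
    if token.isdigit():
        return "".join(rng.choice(string.digits) for _ in token)
    if token.isalpha():
        return rng.choice([token.lower(), token.upper(), token.capitalize()])
    return "".join(rng.choice("!@#$%&*?") for _ in token)


def new_center(tokens, rng):
    """Derive a cluster center by removing, adding, or re-ordering tokens."""
    tokens = list(tokens)
    move = rng.choice(["remove", "add", "reorder"])
    if move == "remove" and len(tokens) > 1:
        tokens.pop(rng.randrange(len(tokens)))
    elif move == "add":
        tokens.insert(rng.randrange(len(tokens) + 1), str(rng.randrange(100)))
    else:
        rng.shuffle(tokens)
    return tokens


def generate_sweetwords(password, n, seed=None):
    """Return up to n distinct sweetwords, including the real password."""
    rng = random.Random(seed)
    sweetwords = [password]
    seen = {password}
    source = password
    for _ in range(50 * n):  # bounded so the loop always terminates
        if len(sweetwords) >= n:
            break
        center = new_center(TOKEN_RE.findall(source), rng)
        # Mix varied tokens with the center's original tokens.
        candidate = "".join(
            vary(t, rng) if rng.random() < 0.5 else t for t in center
        )
        if candidate and candidate not in seen:
            seen.add(candidate)
            sweetwords.append(candidate)
        # A random sweetword seeds the next cluster center.
        source = rng.choice(sweetwords)
    rng.shuffle(sweetwords)  # hide the real password's position
    return sweetwords
```

For example, `generate_sweetwords("Alice2016!", 10, seed=1)` returns a shuffled list of distinct strings that contains the original password.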
Problem 2
T is the set of the 100 most common RockYou passwords
Since there's a dataset to work with, we can use these top 100 passwords as a
starting point that can be manipulated to create realistic honeywords.
The honeywords are then derivatives of the passwords from the RockYou
dataset, created through a few key tokenization algorithms.
shortenString
truncates the string, since it is easier to delete characters than to add them
addOnCharacters
for the most part, characters are not added, to avoid creating bad passwords
replaceCharacters
replaces existing letters, digits, and punctuation with a random letter, digit,
or punctuation mark
tokenize
applies the above tokenizer methods at random to any given password in the data set
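The four methods above could be sketched in Python as shown below. The function names are snake_case versions of the ones listed. The cut lengths, the same-class replacement rule, and the 45/10/45 weighting that keeps appending rare are assumptions chosen for illustration, not values taken from the original code.

```python
import random
import string


def shorten_string(word, rng):
    """Drop 1 to 3 trailing characters, always keeping at least one."""
    if len(word) <= 1:
        return word
    cut = rng.randint(1, min(3, len(word) - 1))
    return word[:-cut]


def add_on_characters(word, rng):
    """Append one or two random digits (used sparingly)."""
    extra = "".join(rng.choice(string.digits) for _ in range(rng.randint(1, 2)))
    return word + extra


def replace_characters(word, rng):
    """Replace one random character with a random one of the same class."""
    if not word:
        return word
    i = rng.randrange(len(word))
    ch = word[i]
    if ch.isdigit():
        pool = string.digits
    elif ch.isalpha():
        pool = string.ascii_letters
    else:
        pool = string.punctuation
    return word[:i] + rng.choice(pool) + word[i + 1:]


def tokenize(word, rng):
    """Apply one of the methods above, chosen at random."""
    methods = [shorten_string, add_on_characters, replace_characters]
    # Assumed weights: appending characters is deliberately rare.
    method = rng.choices(methods, weights=[45, 10, 45])[0]
    return method(word, rng)


rng = random.Random()
print([tokenize(pw, rng) for pw in ["123456", "password", "iloveyou"]])
```

Passing an explicit `random.Random` instance keeps the helpers reproducible when a seed is supplied.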
Problem 3
T is the full RockYou dataset of passwords
This algorithm uses frequency analysis to create centers around the word, then changes
or appends digits at the end of potential honeywords (or sugarwords) to create branches.
The number of centers and branches is known as k.
The centers are generated from the input passwords:
1. Letters
a. a first-order Markov chain model/matrix is used to predict the next letter of the
word based on the current letter
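A first-order Markov model over letters can be sketched as below. This is an illustrative version under stated assumptions: the names `train_letter_model` and `next_letter` are hypothetical, letters are lowercased, and only adjacent letter pairs are counted. The two-password training list stands in for the RockYou data.

```python
import random
from collections import Counter, defaultdict


def train_letter_model(passwords):
    """Count letter-to-next-letter transitions over adjacent letter pairs."""
    counts = defaultdict(Counter)
    for pw in passwords:
        pw = pw.lower()
        for cur, nxt in zip(pw, pw[1:]):
            if cur.isalpha() and nxt.isalpha():
                counts[cur][nxt] += 1
    return counts


def next_letter(model, current, rng):
    """Sample the next letter in proportion to observed transition counts."""
    followers = model.get(current)
    if not followers:
        return None  # no transition was observed from this letter
    letters = list(followers)
    weights = [followers[c] for c in letters]
    return rng.choices(letters, weights=weights)[0]


model = train_letter_model(["password", "pass"])
print(next_letter(model, "p", random.Random()))  # 'a' is the only follower of 'p'
```

Each row of the transition matrix corresponds to one `Counter` here; dividing its counts by their sum gives the transition probabilities.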
References