Tuesday, 5 June 2012

AddAGram: A Puzzle challenge from ITA Software

Last week I came across this page where ITA Software posted a few puzzle challenges it used in its hiring process. I found the one named AddAGram very interesting and then decided to solve it.
Here is the question:

An "add-a-gram" is a sequence of words formed by starting with a 3-letter word, adding a letter and rearranging to form a 4-letter word, and so on. For example, here are add-a-grams of the words "CREDENTIALS" and "ANACHRONISM":
ail + s =
sail + n =
nails + e =
aliens + t =
salient + r =
entrails + c =
clarinets + e =
interlaces + d =
CREDENTIALS (length 11)
mar + c =
cram + h =
march + s =
charms + o =
chromas + n =
monarchs + i =
harmonics + a =
maraschino + n =
ANACHRONISM (length 11)

Test your own credentials: given the dictionary found here, what is the longest add-a-gram?

 Initially I thought it would be solvable using a trie by inserting the words in dictionary in a trie and then simply searching the trie for the longest path down with each node after 3 node which is an end node ie. each node after third node marked as end of some word. But then I realized that it was wrong after half way coding it. Then I went back to square one and rethought the problem and then realized that it can be solvable only by searching. So then I came up with the following approach. I started with the longest word in the dictionary and then found the next word that is at a distance of 1 deletion from the previous word then pushed all those on a stack and for each word in the stack searched for next word at a distance of 1. So I basically created a graph from the words in dictionary t runtime and ran a DFS on it. The biggest hurdle or problem with this approach is searching through the whole dictionary for a word of that length and then backtracking when error occurs is very costly. So then I came up with a simple optimization technique to reduce the amount of words to be searched by storing the words in a array of  vectors where vector at index i stores words of length i so dividing the access in 2 dimensions gives access to the all words of length i directly by calling dict[i].get(). Then finally we can rebuild the original sequence in the visited list of nodes.
It is written in java and it takes ~20 secs to run wit the dictionary of words given here.

Here is only the main logic part of the code:

You can find the complete program here

public class AddAGram{
 public static Vector<String>dic[];
 public static Stack<String>s;
 public static HashSet<String>vis=new HashSet<String>();
 public static boolean check(char []l,char []s){
  int i,j,len=l.length;
    if(s[i]==l[j]&&l[j]!=0) {s[i]=0;l[j]=0;len--;}
  if(len==1) return true;
  return false;
 static void addAll(String str){
  int l=str.length()-1,tot=dic[l].size();
  String other="";
  for(int i=0;i<tot;i++){
 static char diff(String small,String large){
  StringBuffer sb=new StringBuffer(large);
  for(int i=0;i<small.length();i++)
  return sb.charAt(0);
 public static void generateOriginal(String start,String end){
  LinkedList<String>list=new LinkedList<String>();
  for(String a:vis) list.add(a);
  String next="",prev=start;
  int k,j=end.length();
  for(int i=4;i<=j;){
 public static void main(String args[]) throws Exception{
  dic=new Vector[30];
  String str;int i,j,k;
  for(i=0;i<30;i++) dic[i]=new Vector<String>();
  InputReader sc=new InputReader(new FileInputStream("WORD.LST"));
  long init=System.currentTimeMillis();
  int z=0;
  String word="",tmp="";Vector<String>v;
    s=new Stack<String>();
     if(tmp.length()==3) {
      System.out.println("Time taken : "+(System.currentTimeMillis()-init));