I had this problem too and used the same technique. After using bits in an integer (with binary operators), instead of a boolean array, my program finished in time (about 4.8 seconds).
But I'm interested how people managed to solve it in less then a second. I think they do something like pre-generating, but I'm not sure about that.