I usually just track the position with two text variables, one for x and one for y, and when they equal a particular value it does whatever action goes with that spot.
The reason I do it that way is that it's a lot easier to work out the numbers. You just move the guy to whatever position you want, see what the variables read, put those values into your check, and when you're done move the text fields off screen.
I know that might not sound right, but I'm not very well right now and my thinking isn't very coherent at the moment.
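
Here's a rough sketch of the idea in Python, in case it helps. It isn't tied to any particular tool: `Player`, `TRIGGERS`, `debug_readout`, and `check_trigger` are names made up for illustration, and the coordinates and action names are placeholders. The readout function stands in for the on-screen text fields you'd use to find the values.

```python
# Hypothetical trigger table: (x, y) position -> name of the action to run.
# The numbers are placeholders; you'd fill in whatever the readout shows.
TRIGGERS = {
    (120, 80): "open_door",
    (200, 150): "start_dialog",
}


class Player:
    """Holds the two position variables."""

    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def move_to(self, x, y):
        self.x, self.y = x, y


def debug_readout(player):
    """Stand-in for the text fields that display x and y while testing."""
    return f"x={player.x} y={player.y}"


def check_trigger(player):
    """Return the action for the current position, or None if there isn't one."""
    return TRIGGERS.get((player.x, player.y))


player = Player()
player.move_to(120, 80)
print(debug_readout(player))   # read the values off while positioning the guy
print(check_trigger(player))   # look up the action for that exact spot
```

One thing to watch with this approach: the check only fires on an exact match, so if the guy moves in steps that skip over the spot, you'd want to compare against a small range instead.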