Couldn't something like this be implemented with regular hardware? Get two cheap £12 webcams, put them 30cm apart, and that should be enough to get some spatial information. After that it would be mostly software: detecting the motion of someone walking in front of the TV and recognising the "swipes" and other gestures.
Yes, I know. I make it sound easy and it probably isn't as easy as that, but it still sounds like a fun project.
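To give a rough idea of the software side, here is a minimal Python sketch of the two pieces mentioned above: turning the pixel offset between the two webcam images into a distance, and classifying a tracked hand movement as a swipe. It assumes the cameras are already calibrated and that something else is finding the hand in each frame, which is where most of the real work would be. The focal length, the swipe threshold, and the function names are all made up for illustration; only the 30cm spacing comes from the idea above.

```python
BASELINE_M = 0.30        # camera spacing from the post (30 cm)
FOCAL_LENGTH_PX = 700.0  # assumed focal length in pixels; a real rig needs calibration


def depth_from_disparity(disparity_px, focal_px=FOCAL_LENGTH_PX, baseline_m=BASELINE_M):
    """Distance in metres to a point that appears disparity_px apart in the two images.

    Uses the standard pinhole stereo relation: depth = focal * baseline / disparity.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def detect_swipe(xs, min_travel_px=150):
    """Classify a tracked hand's horizontal pixel positions as 'left', 'right' or None.

    xs is a hypothetical list of x coordinates, one per frame, oldest first.
    Directions are in image coordinates, so a mirrored camera would flip them.
    """
    if len(xs) < 2:
        return None
    travel = xs[-1] - xs[0]
    if travel >= min_travel_px:
        return "right"
    if travel <= -min_travel_px:
        return "left"
    return None
```

With these assumed numbers, a hand that shows up about 105 pixels apart in the two images would be roughly 2m from the cameras, and the wider the spacing, the better the depth resolution at that range.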