Actually, if you follow the ffmpeg lists, a lot of the issue comes down to the fact that ffmpeg's H.264 decoder (the one in libavcodec; x264 is only the encoder) isn't all that optimized. None of it is multithreaded. This becomes especially apparent if you compare it with CoreAVC on Windows, which in its standard form does not rely on any GPU acceleration and can comfortably decode a 1080p H.264 stream on a 2 GHz Core 2 Duo.
There have been a lot of commits going into ffmpeg on this issue since late 2006, and there is still a lot of work left to be done.
-Thom