Wil, just a couple of things to keep in mind. If you mount the Zoom on top of your camera you *may* pick up some noise from vibrations when the autofocus does its thing. The 5D was notorious for that with its built-in mic, which is why mics like the Rode and other shotgun mics have isolation mounts. If you are going to camera mount the recorder, you may want to take a look at something like this J-Rod mount, which says it incorporates a shock mount. Anything to dampen as much vibration as you can.
As for matching up in post, it is pretty simple really. Use a slate, or just clap your hands in front of the camera at each scene, then match up the audio spike with the frame where your hands touch. That's why something like this slate works well: it's very easy to match the closing of the sticks with the sound spike. You just have to remember to clap or slate for each scene, otherwise you will have to do a bit of searching in post for a good audio spike to match up with the video. Not hard to do, but it may take some digging.
Almost any movie editing software should be able to adjust the audio timeline; I see Movie Studio HD Platinum does. Even iMovie can split the native audio off the a/v track and let you overlay your better recorded audio. Just match the audio waveform spike with a known video frame and off you go! It sounds complicated, but most video scenes end up being only a few seconds long before they are all spliced together, so even if you are a touch off, the error won't have time to "grow".
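If you are curious what the editor is doing when you line things up, here is a rough sketch in Python of the arithmetic. It is not how any particular editor does it; the function name `find_clap_offset`, the sample rate, the frame rate, and the made-up audio data are all just assumptions for illustration. It finds the loudest sample in the recorder's audio (assuming the clap is the loudest thing in the clip) and works out how far to slide the audio so that spike lands on the frame where the sticks close.

```python
def find_clap_offset(samples, sample_rate, clap_frame, fps):
    """Return how many seconds to shift the external audio so that its
    loudest spike lands on the video frame where the clap happens.

    Hypothetical helper: assumes the clap is the loudest sound in the clip.
    A negative result means the audio has to slide earlier.
    """
    # Index of the loudest sample (largest absolute amplitude).
    peak_index = max(range(len(samples)), key=lambda i: abs(samples[i]))
    spike_time = peak_index / sample_rate   # when the spike occurs in the audio
    clap_time = clap_frame / fps            # when the hands/sticks meet in the video
    return clap_time - spike_time


# Made-up example: 2 seconds of silence at 48 kHz with one spike at 1.5 s.
sample_rate = 48000
samples = [0.0] * (2 * sample_rate)
samples[72000] = 1.0

# Suppose the sticks close on frame 30 of 24 fps footage (1.25 s in).
offset = find_clap_offset(samples, sample_rate, clap_frame=30, fps=24)
print(offset)  # -0.25, i.e. slide the recorder audio a quarter second earlier
```

In practice you just eyeball the waveform and drag, but the idea is the same: one known spike, one known frame, one offset.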