I recently worked on an FPGA design for a video-over-USB application. The design handled very high-resolution video and was fairly complicated, so detailed simulation was definitely the preferred way to implement and debug it. However, simulating a video design is very time consuming, because a single frame of video spans a long stretch of simulation time (tens to hundreds of milliseconds). On this project, simulating a single video frame at full resolution took 45 minutes. With simulation times that long, debugging is difficult because of the wait between trying something and seeing the result, and some problems don't show up until several frames in.
I have always advocated a simulation-heavy process for FPGA design, with self-checking automated test benches. The thinking is that time spent simulating in a controlled environment, where it is easy to have full visibility into the design, will more than pay for itself in time saved debugging problems in the lab. Even though FPGAs are reprogrammable and mistakes don't carry the big NRE penalty that they do for ASICs, a similar design process still makes sense.
Over the years I've seen a lot of FPGA designers use a different process: write code, test the design in hardware, fix problems, and iteratively retest in hardware, with minimal use of simulation. A middle-ground process, which I used in developing the ECC design block for my master's degree capstone project, is to simulate in detail at the block level (similar to unit testing in software) but then test and debug in hardware at the whole-system level. In that case, again, it was fairly time consuming to get a full simulation up and running, including the FPGA vendor's serial transceiver models. FPGA vendors have also added features over the years, such as embedded logic analyzers accessible through JTAG, that give designers visibility into signals inside the FPGA during debug.
There are pros and cons to both processes. In many cases, less simulation and more in-lab debugging might actually save time, now that tools such as embedded logic analyzers are available. On the other hand, running the FPGA synthesis and fitter tools can be almost as time consuming as running a simulation. In the video design that I worked on, it was critical to make the resolution programmable (at least through generics for simulation, if not fully programmable through registers in the hardware in real time), since lower resolutions simulate much more quickly and allow reasonable debug times, especially for multiple-frame issues. There are also tricks for speeding up FPGA synthesis and fitting, such as incremental compilation. There is no perfect solution to this problem for complicated FPGA designs.
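To give a rough feel for why a programmable resolution matters, here is a back-of-envelope sketch in Python. It assumes that simulation wall-clock time scales linearly with the number of pixels per frame, which ignores blanking intervals and any fixed per-frame overhead. The two resolutions are hypothetical stand-ins chosen for illustration; only the 45 minute figure comes from the project described above.

```python
def estimate_frame_sim_minutes(full_minutes, full_res, reduced_res):
    """Scale a measured per-frame simulation time by pixel count.

    ASSUMPTION: wall-clock time is proportional to pixels per frame.
    """
    full_pixels = full_res[0] * full_res[1]
    reduced_pixels = reduced_res[0] * reduced_res[1]
    return full_minutes * reduced_pixels / full_pixels


MEASURED_FULL_MINUTES = 45.0   # from the project: one full-resolution frame
FULL_RES = (1920, 1080)        # ASSUMPTION: hypothetical full resolution
REDUCED_RES = (320, 240)       # ASSUMPTION: hypothetical debug resolution

if __name__ == "__main__":
    per_frame = estimate_frame_sim_minutes(
        MEASURED_FULL_MINUTES, FULL_RES, REDUCED_RES
    )
    frames = 10  # e.g. a bug that only appears several frames in
    print(f"Estimated time per reduced frame: {per_frame:.2f} min")
    print(f"Estimated time for {frames} frames: {per_frame * frames:.1f} min")
```

With these made-up numbers a reduced-resolution frame comes in at under two minutes, so a ten-frame run fits in a coffee break instead of most of a working day. The real ratio will depend on the design and the simulator.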
One thing that has struck me as promising for speeding up simulations is a hardware-based simulation accelerator from GateRocket. The device is a plug-in board for a high-end simulation PC, with an FPGA on it. Designs are compiled into the FPGA and run in real time, with an interface to the simulator that gives visibility into the internal signals while speeding up the simulation. In concept it might really help, especially in video-type designs where most of the simulation time is spent on the design and the test benches are fairly simple. One thought that this type of device triggers for me is the need for synthesizable test benches. If the whole system, including the test benches, were running in real time, a simulation that used to take hours would take only seconds, giving you the speed benefits of debugging in hardware while still giving you full visibility of all the internal signals. I think the ability to synthesize higher-level test bench languages could be a key enabling technology for this process, as could writing test benches that run in software on embedded processors.
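The "hours to seconds" claim can be sanity-checked with some simple arithmetic, sketched here in Python. The 45 minutes per frame comes from the video project above; the 60 frames per second rate and the three-hour simulation run are assumptions picked for illustration. The sketch also assumes the whole workload, test bench included, runs at real-time speed, and it ignores any overhead from the simulator interface.

```python
def slowdown_factor(wall_seconds_per_frame, frame_period_seconds):
    """How many times slower than real time the simulator runs."""
    return wall_seconds_per_frame / frame_period_seconds


def hardware_seconds(sim_wall_seconds, factor):
    """Wall-clock time for the same workload at real-time speed."""
    return sim_wall_seconds / factor


WALL_SECONDS_PER_FRAME = 45 * 60   # from the project: 45 minutes per frame
FRAME_PERIOD_SECONDS = 1 / 60      # ASSUMPTION: 60 frames per second

if __name__ == "__main__":
    factor = slowdown_factor(WALL_SECONDS_PER_FRAME, FRAME_PERIOD_SECONDS)
    three_hours = 3 * 3600         # ASSUMPTION: a hypothetical long run
    print(f"Simulator slowdown vs. real time: about {factor:,.0f}x")
    print(f"3 h of simulation at real-time speed: "
          f"{hardware_seconds(three_hours, factor):.3f} s")
```

Under these assumptions the simulator runs several orders of magnitude slower than real time, so a multi-hour simulation would finish in well under a second of hardware time. That is why the test bench itself, if it stays in software, can quickly become the bottleneck.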
[1] Wilson, Ron. February 19, 2009. "Verifying FPGA Designs: Simulate, Emulate, or Hope for the Best?" EDN Magazine.