Appendix A - Fix Your Timestep!
Introduction
In the previous article we discussed how to integrate the equations of motion using a numerical integrator. Integration sounds complicated, but it's just a way to advance your physics simulation forward by some small amount of time called "delta time" (or dt for short).
But how should you choose this delta time value? This may seem like a trivial subject, but in fact there are many different ways to do it, each with its own strengths and weaknesses, so read on!
Fixed delta time
The simplest way to step forward is with fixed delta time, like 1/60th of a second:

    double t = 0.0;
    double dt = 1.0 / 60.0;

    while ( !quit )
    {
        integrate( state, t, dt );
        render( state );
        t += dt;
    }

In many ways this code is ideal. If you're lucky enough to have your delta time match the display refresh rate, and you can ensure that your update loop takes less than one frame worth of real time, then you already have the perfect solution for updating your physics simulation and you can stop reading this article.
But in the real world you may not know the display refresh rate ahead of time. VSYNC could be turned off, or you could be running on a slow computer which cannot update and render your frame fast enough to present it at 60fps.
In these cases your simulation will run faster or slower than you intended.
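To put rough numbers on this, here is a small sketch of how far off the simulation speed ends up when the loop always steps by a fixed dt. The function name `simulation_speed` is made up for this example and is not part of the article's code.

```cpp
#include <cassert>

// Ratio of simulated time to real time when every frame advances the
// simulation by a fixed dt, no matter how long the frame really took.
// 1.0 is real time; above 1.0 the game runs fast, below 1.0 it runs slow.
double simulation_speed( double framesPerSecond, double dt )
{
    assert( framesPerSecond > 0.0 && dt > 0.0 );
    return framesPerSecond * dt;
}
```

With dt fixed at 1/60, a 75Hz display gives a ratio of 1.25 (everything moves 25% too fast), while a machine that only manages 30fps gives 0.5 (half speed).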
Variable delta time
Fixing this seems simple. Just measure how long the previous frame took, then feed that value back in as the delta time for the next frame. This makes sense because if the computer is too slow to update at 60Hz and has to drop down to 30fps, you'll automatically pass in 1/30 as delta time. The same goes for a display refresh rate of 75Hz instead of 60Hz, or even the case where VSYNC is turned off on a fast computer:

    double t = 0.0;
    double currentTime = hires_time_in_seconds();

    while ( !quit )
    {
        // Measure how long the previous frame took and use it as the step.
        double newTime = hires_time_in_seconds();
        double frameTime = newTime - currentTime;
        currentTime = newTime;

        integrate( state, t, frameTime );
        t += frameTime;

        render( state );
    }

But there is a huge problem with this approach, which I will now explain. The problem is that the behavior of your physics simulation depends on the delta time you pass in. The effect could be as subtle as your game having a slightly different "feel" depending on framerate, or it could be as extreme as your spring simulation exploding to infinity, fast moving objects tunneling through walls and players falling through the floor!
One thing is for certain though: it's utterly unrealistic to expect your simulation to correctly handle any delta time passed into it. To understand why, consider what would happen if you passed in 1/10th of a second as delta time. How about one second? 10 seconds? 100? Eventually you'll find a breaking point.
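Here is a minimal sketch of such a breaking point. It assumes a damped spring with made-up constants (k = 10, b = 1) and a plain explicit Euler step, chosen only because it is short; `Spring`, `integrate_spring` and `max_displacement` are hypothetical names for this example, not the integrator from the previous article.

```cpp
#include <cassert>
#include <cmath>

// Damped spring pulled toward the origin: a = -k*x - b*v.
struct Spring
{
    double x;   // displacement
    double v;   // velocity
};

// One explicit Euler step with hypothetical constants k = 10, b = 1.
void integrate_spring( Spring & s, double dt )
{
    const double k = 10.0;
    const double b = 1.0;
    const double a = -k * s.x - b * s.v;
    s.x += s.v * dt;
    s.v += a * dt;
}

// Simulates `seconds` of motion with a constant step, starting at x = 1,
// and reports the largest |x| seen along the way.
double max_displacement( double dt, double seconds )
{
    assert( dt > 0.0 && seconds > 0.0 );
    Spring s = { 1.0, 0.0 };
    double peak = std::fabs( s.x );
    const int steps = static_cast<int>( seconds / dt + 0.5 );
    for ( int i = 0; i < steps; ++i )
    {
        integrate_spring( s, dt );
        peak = std::fmax( peak, std::fabs( s.x ) );
    }
    return peak;
}
```

With a step of 1/60 the spring settles down and the displacement never grows past its starting value. With a step of 0.25 the same code gains energy on every step and the displacement climbs into the hundreds within ten simulated seconds. For this particular integrator and these constants the tipping point sits at dt = b/k = 0.1; other integrators and other springs break at other values, but they all break somewhere.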
Semi-fixed timestep
It's much more realistic to say that your simulation is well behaved only if delta time is less than or equal to some maximum value. This is usually significantly easier in practice than attempting to make your simulation bulletproof at a wide range of delta time values.
With this knowledge at hand, here's a simple trick to ensure that you never pass in a delta time greater than the maximum value, while still running at the correct speed on different machines:

    double t = 0.0;
    double dt = 1.0 / 60.0;

    double currentTime = hires_time_in_seconds();

    while ( !quit )
    {
        double newTime = hires_time_in_seconds();
        double frameTime = newTime - currentTime;
        currentTime = newTime;

        // Consume the frame time in steps no larger than dt.
        while ( frameTime > 0.0 )
        {
            const double deltaTime = min( frameTime, dt );
            integrate( state, t, deltaTime );
            frameTime -= deltaTime;
            t += deltaTime;
        }

        render( state );
    }

The benefit of this approach is that we now have an upper bound on delta time. It's never larger than this value because if it is we subdivide the timestep. The disadvantage is that we're now taking multiple steps per display update, including one additional step to consume any remainder of frame time not divisible by dt. This is no problem if you are render bound, but if your simulation is the most expensive part of your frame you could run into the so-called "spiral of death".
What is the spiral of death? It's what happens when your physics simulation can't keep up with the steps it's asked to take. For example, if your simulation is told: "OK, please simulate X seconds worth of physics" and if it takes Y seconds of real time to do so where Y > X, then it doesn't take Einstein to realize that over time your simulation falls behind. It's called the spiral of death because being behind causes your update to simulate more steps to catch up, which causes you to fall further behind, which causes you to simulate more steps…
So how do we avoid this? In order to ensure a stable update I recommend leaving some headroom. You really need to ensure that it takes significantly less than X seconds of real time to update X seconds worth of physics simulation. If you can do this then your physics engine can "catch up" from any temporary spike by simulating more frames. Alternatively you can clamp at a maximum number of steps per frame and the simulation will appear to slow down under heavy load. Arguably this is better than spiraling to death, especially if the heavy load is just a temporary spike.
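The clamping option could look something like the sketch below. The helper `advance_clamped` and its `maxSteps` parameter are assumptions for illustration; the step callback stands in for the call to `integrate`.

```cpp
#include <algorithm>
#include <cassert>

// Splits one frame's worth of time into steps no larger than maxDt, but gives
// up after maxSteps so a slow simulation cannot fall further and further
// behind. Returns the time actually simulated; any excess frame time is
// dropped, which is what makes the simulation appear to slow down.
template <typename StepFn>
double advance_clamped( double frameTime, double maxDt, int maxSteps, StepFn step )
{
    assert( maxDt > 0.0 && maxSteps > 0 );
    double simulated = 0.0;
    for ( int i = 0; i < maxSteps && frameTime > 0.0; ++i )
    {
        const double deltaTime = std::min( frameTime, maxDt );
        step( deltaTime );      // e.g. integrate( state, t, deltaTime )
        frameTime -= deltaTime;
        simulated += deltaTime;
    }
    return simulated;
}
```

If a frame suddenly takes a full second, a cap of five steps at 1/60 simulates only 5/60 of a second and discards the rest, so the game briefly runs in slow motion instead of trying to grind through sixty steps in one frame.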
Free the physics
Now let's take it one step further. What if you want exact reproducibility from one run to the next given the same inputs? This comes in handy when trying to network your physics simulation using deterministic lockstep, but it's also generally a nice thing to know that your simulation behaves exactly the same from one run to the next without any potential for different behavior depending on the render framerate.
But, you ask, why is it necessary to have fully fixed delta time to do this? Surely the semi-fixed delta time with the small remainder step is "good enough"? And yes, you are right. It is good enough in most cases, but it is not exactly the same due to the limited precision of floating point arithmetic.
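A small experiment makes the "not exactly the same" part concrete. The sketch below runs the semi-fixed loop over one simulated second at two different frame times and compares where a falling body ends up. The `Body` struct, the gravity constant and the simple Euler step are assumptions for this example, and the remainder step is deliberately large so the difference is easy to see. Most of the gap here comes from the differently sized remainder steps, with floating point rounding adding its own small share on top.

```cpp
#include <algorithm>
#include <cassert>

struct Body
{
    double x;   // height
    double v;   // vertical velocity
};

// Semi-implicit Euler step for a body falling under constant gravity.
void integrate_fall( Body & b, double dt )
{
    const double gravity = -9.8;    // hypothetical constant for this example
    b.v += gravity * dt;
    b.x += b.v * dt;
}

// Runs the semi-fixed loop for `frames` frames of `frameTime` seconds each.
Body run_semi_fixed( int frames, double frameTime, double dt )
{
    assert( frames > 0 && frameTime > 0.0 && dt > 0.0 );
    Body b = { 0.0, 0.0 };
    for ( int f = 0; f < frames; ++f )
    {
        double remaining = frameTime;
        while ( remaining > 0.0 )
        {
            const double deltaTime = std::min( remaining, dt );
            integrate_fall( b, deltaTime );
            remaining -= deltaTime;
        }
    }
    return b;
}
```

Calling `run_semi_fixed( 50, 0.02, 0.01 )` takes two full steps per frame, while `run_semi_fixed( 40, 0.025, 0.01 )` takes two full steps plus a half-size remainder step. Both cover one second of simulated time, yet the final heights differ by about 0.005. That is harmless for most games, but fatal if you need two machines to agree exactly.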
What we want then is the best of both worlds: a fixed delta time value for the simulation plus the ability to render at different framerates. These two things seem completely at odds, and they are - unless we can find a way to decouple the simulation and rendering framerates.
Here's how to do it. Advance the physics simulation ahead in fixed dt time steps while also making sure that it keeps up with the timer values coming from the renderer so that the simulation advances at the correct rate. For example, if the display framerate is 50fps and the simulation runs at 100fps then we need to take two physics steps every display update. Easy.
What if the display framerate is 200fps? Well, in this case we would need to take half a physics step each display update, but we can't do that; we must advance with constant dt. So we take one physics step every two display updates.
Even trickier, what if the display framerate is 60fps, but we want our simulation to run at 100fps? There is no easy multiple. What if VSYNC is disabled and the display frame rate fluctuates from frame to frame?
If your head just exploded, don't worry. All that is needed to solve this is to change your point of view. Instead of thinking that you have a certain amount of frame time you must simulate before rendering, flip your viewpoint upside down and think of it like this: the renderer produces time and the simulation consumes it in discrete dt sized steps.
For example:

    double t = 0.0;
    const double dt = 0.01;

    double currentTime = hires_time_in_seconds();
    double accumulator = 0.0;

    while ( !quit )
    {
        double newTime = hires_time_in_seconds();
        double frameTime = newTime - currentTime;
        currentTime = newTime;

        // The renderer produces time...
        accumulator += frameTime;

        // ...and the simulation consumes it in dt sized steps.
        while ( accumulator >= dt )
        {
            integrate( state, t, dt );
            accumulator -= dt;
            t += dt;
        }

        render( state );
    }

Notice that, unlike the semi-fixed timestep, we only ever integrate with steps sized dt, so it follows that in the common case we have some unsimulated time left over at the end of each frame. This leftover time is passed on to the next frame via the accumulator variable and is not thrown away.
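To see the accumulator handle the awkward framerate combinations mentioned earlier, here is a sketch that counts how many physics steps each render frame triggers. It measures time in integer ticks rather than seconds, purely so the example is free of floating point rounding; the function `steps_per_frame` and the tick unit are assumptions made for this illustration.

```cpp
#include <cassert>
#include <vector>

// Returns how many fixed steps each render frame triggers. Time is measured
// in integer ticks (a hypothetical unit picked so that both the frame time
// and dt are whole numbers).
std::vector<int> steps_per_frame( int frames, long frameTicks, long dtTicks )
{
    assert( frames >= 0 && frameTicks > 0 && dtTicks > 0 );
    std::vector<int> counts;
    long accumulator = 0;
    for ( int f = 0; f < frames; ++f )
    {
        accumulator += frameTicks;          // the renderer produces time
        int steps = 0;
        while ( accumulator >= dtTicks )    // the simulation consumes it
        {
            accumulator -= dtTicks;
            ++steps;
        }
        counts.push_back( steps );
    }
    return counts;
}
```

Taking one tick as 1/600 of a second, a 60fps display is 10 ticks per frame and a 100fps simulation is 6 ticks per step. `steps_per_frame( 6, 10, 6 )` then gives 1, 2, 2, 1, 2, 2: five physics steps for every three frames, which is exactly the 100:60 ratio we wanted. For a 200fps display (3 ticks per frame) the result is 0, 1, 0, 1: one physics step every two display updates.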
The final touch
But what to do with this remaining time? It seems incorrect, doesn't it?
To understand what is going on, consider a situation where the display framerate is 60fps and the physics is running at 50fps. There is no nice multiple, and because each render frame (1/60 of a second) is shorter than one physics step (1/50 of a second), the accumulator causes the simulation to take one physics step on most frames and none on the occasional frame where the accumulated time has not yet reached dt.
Now consider that the majority of render frames will have some small remainder of frame time left in the accumulator that cannot be simulated because it is less than dt. This means we're displaying the state of the physics simulation at a time slightly different from the render time, causing a subtle but visually unpleasant stuttering of the physics simulation on the screen.
One solution to this problem is to interpolate between the previous and current physics state based on how much time is left in the accumulator:

    double t = 0.0;
    double dt = 0.01;

    double currentTime = hires_time_in_seconds();
    double accumulator = 0.0;

    State previousState;
    State currentState;

    while ( !quit )
    {
        double newTime = hires_time_in_seconds();
        double frameTime = newTime - currentTime;
        // Cap the frame time so a long stall cannot trigger the spiral of death.
        if ( frameTime > 0.25 )
            frameTime = 0.25;
        currentTime = newTime;

        accumulator += frameTime;

        while ( accumulator >= dt )
        {
            previousState = currentState;
            integrate( currentState, t, dt );
            t += dt;
            accumulator -= dt;
        }

        // Blend the last two physics states by the leftover fraction of a step.
        const double alpha = accumulator / dt;

        State state = currentState * alpha +
            previousState * ( 1.0 - alpha );

        render( state );
    }

This looks complicated but here is a simple way to think about it. Any remainder in the accumulator is effectively a measure of just how much more time is required before another whole physics step can be taken. For example, a remainder of dt/2 means that we are currently halfway between the current physics step and the next. A remainder of dt*0.1 means that the update is 1/10th of the way between the current and the next state.
We can use this remainder value to get a blending factor between the previous and current physics state simply by dividing by dt. This gives an alpha value in the range [0,1] which is used to perform a linear interpolation between the two physics states to get the current state to render. This interpolation is easy to do for single values and for vector state values. You can even use it with full 3D rigid body dynamics if you store your orientation as a quaternion and use a spherical linear interpolation (slerp) to blend between the previous and current orientations.
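The blend in the loop above relies on `State` supporting multiplication by a scalar and addition. As a minimal sketch, assuming a one-dimensional state with just a position and a velocity (the fields, the operators and the `interpolate` helper are all hypothetical), it could be written like this:

```cpp
#include <cassert>

// Hypothetical one-dimensional physics state. A real State would hold
// vectors, and an orientation would need slerp rather than this linear blend.
struct State
{
    double x;   // position
    double v;   // velocity
};

State operator*( const State & s, double k )
{
    return State{ s.x * k, s.v * k };
}

State operator+( const State & a, const State & b )
{
    return State{ a.x + b.x, a.v + b.v };
}

// Blends between the two most recent physics states.
// alpha = accumulator / dt: 0 gives previous, 1 gives current.
State interpolate( const State & previous, const State & current, double alpha )
{
    assert( alpha >= 0.0 && alpha <= 1.0 );
    return current * alpha + previous * ( 1.0 - alpha );
}
```

For example, blending a previous state at x = 0 with a current state at x = 2 using alpha = 0.5 renders the object at x = 1, halfway between the two physics steps.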