[Openswan dev] Problem and proposed bug fix to Openswan when the system's time jumps forward multiple decades

D. Hugh Redelmeier hugh at mimosa.com
Thu Aug 4 13:54:34 EDT 2011


| From: Antony Richards <arichards at cybertec.com.au>

| On 08/04/2011 04:37 PM, D. Hugh Redelmeier wrote:
| > | From: Antony Richards<arichards at cybertec.com.au>
| >
| > | I came across a bug in Openswan.  The box starts up.  Openswan connects.
| > | The
| > | date is set back to 1970, NTPD is started and sets the date to be 2011.
| >
| > That sounds like a bug in ntpd.  How does the date get set to 1970?
| >
| >    
| I explicitly turned off ntpd and set the date back to 1970.  The reason was
| that it quickly reproduced an issue I observed when an Linux embedded device
| I'm developing is first powered on with a pre-loaded configuration.

What were you observing in your embedded system?  Did pluto log the
message about time moving backwards?  Is your system behaving
reasonably?  Does it time-travel at great rates?

| That would solve the problem also, but using uptime gives a simpler solution.

Yes.

What I don't remember is whether we use results of now() in logging,
implicitly assuming that the values are meaningful to the reader (i.e.
now they are the best-guess at wall-clock time, but with another clock
that might not be the case).

If you switch to a monotonic clock, the monotonicity-creating feature
of now() ought to be replaced by a monotonicity-checking feature.

| What would the minimum jump in time be before adjusting delta?

If the time jumps past all pending event, logically it could be
clamped at the time of the last event, I would think.  I think that
there is almost always a pending event.  If there is not, then perhaps
the "delta" could be zeroed anyway: nothing is affected by delta in
that case.

| > It would be good for now() to check if n + delta results in an
| > overflow.  What to do then???  passert failure?
| >
| >    
| If all references in timer.c use changes in time then (and the changes are
| small, eg less than 1 year) then the overflow can be safely handled by doing
| unsigned differences.  Casting the result to a signed number will be correct
| for before, now or after.
| 
| eg
| unsigned int now;
| unsigned int scheduled;
| 
| if (((int)(scheduled - now)) < 0)
| {
|   act on the event.
| }
| 
| I believe this works for signed numbers also (I just did a paper exercise to
| prove it to myself).
| 
| So next_event() and handle_timer_event() need to handle time this way.  This
| should be done irrespective of a change to now().
| 
| I can create a new diff with this change in it - be about a day or so (I'll
| ensure it works with timer overflow).

I don't understand.

The "official" type is time_t.  I'd not want to monkey with that
without a good reason.  And int certainly isn't a good choice for a
replacement.  int might be only 16 bits.

If we were willing to play with types, using unsigned long instead of
time_t gives us more headroom (136 years from 1970 instead of 68 years
from 1970).  But that isn't the right fix.

| I'll send a new diff file soon covering these comments (ie changing the maths
| in timer.c and using sys_clock_gettime()).

I'm not yet 100% sold on this whole approach.


More information about the Dev mailing list