Bug #7124

More robust Acorn database rollback logic

Added by Tuukka Lehtonen 6 months ago. Updated 2 months ago.

Status: Closed
Start date: 2017-04-03
Priority: 4
Due date: 2017-04-09
Assignee: Tuukka Lehtonen
% Done: 100%
Category: -
Spent time: -
Target version: 2017-18
Release notes: Major improvement to how Acorn DB handles rollback. Previously the implementation could destroy the entire database if the user removed snapshot directories but forgot to remove the @main.state@ file. The information stored in main.state is now regarded as cached information only, and if it seems invalid or cannot be read, the same normal rollback logic is performed every time. In addition, rollback now automatically stores the revisions it deletes in timestamped @<workspace>/db/recovery/yyyy-M-d_HH-mm-ss/@ folders for later inspection and debugging. Previously the code just deleted all the extra revisions. Manually removing the @db/recovery@ folder is always a safe operation.
Tags: 1.29.0
Story points: -
Velocity based estimate: -
Release: Simantics 1.29.0
Release relationship: Auto

Description

Currently, a user who wants to perform a manual rollback can simply delete some directories, but the main.state file must also be deleted. Otherwise the whole database will be destroyed. This is not intended. The system should validate the information in main.state against the file system, and if something is wrong, main.state should be discarded and the information rebuilt from the file system. main.state should be regarded only as cached information about the status of the file system, and it should not be possible to accidentally destroy the database even if the user forgets to delete main.state when performing a rollback.
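
The "cached information only" idea described above can be sketched roughly as follows. This is not the Acorn implementation. It assumes a hypothetical layout in which revision directories under the database directory have numeric names and main.state holds the head revision number as plain text. The class and method names are invented for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.OptionalInt;

// Hypothetical sketch: main.state is only a cache of what the file system says.
final class CachedHeadState {

    /** Reads the cached head revision; any problem is treated as "no cache". */
    static OptionalInt readCached(Path mainState) {
        if (!Files.isRegularFile(mainState)) {
            return OptionalInt.empty();
        }
        try {
            String text = new String(Files.readAllBytes(mainState), StandardCharsets.UTF_8).trim();
            return text.matches("\\d{1,9}")
                    ? OptionalInt.of(Integer.parseInt(text))
                    : OptionalInt.empty();
        } catch (IOException e) {
            return OptionalInt.empty(); // unreadable cache is simply ignored
        }
    }

    /** Scans the file system for the highest numbered revision directory. */
    static OptionalInt scanHead(Path dbDir) throws IOException {
        int max = -1;
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(dbDir, Files::isDirectory)) {
            for (Path dir : dirs) {
                String name = dir.getFileName().toString();
                if (name.matches("\\d{1,9}")) { // assumed naming scheme
                    max = Math.max(max, Integer.parseInt(name));
                }
            }
        }
        return max < 0 ? OptionalInt.empty() : OptionalInt.of(max);
    }

    /** Trusts the cache only if the directory it points to still exists. */
    static OptionalInt resolveHead(Path dbDir) throws IOException {
        Path mainState = dbDir.resolve("main.state");
        OptionalInt cached = readCached(mainState);
        if (cached.isPresent()
                && Files.isDirectory(dbDir.resolve(Integer.toString(cached.getAsInt())))) {
            return cached;
        }
        // Cache is missing or stale: rebuild it from the file system.
        OptionalInt actual = scanHead(dbDir);
        if (actual.isPresent()) {
            Files.write(mainState,
                    Integer.toString(actual.getAsInt()).getBytes(StandardCharsets.UTF_8));
        } else {
            Files.deleteIfExists(mainState);
        }
        return actual;
    }
}
```

With this shape, deleting the newest revision directories and leaving a stale main.state behind only causes a rescan. Nothing is deleted based on the cache alone.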

Associated revisions

Revision 20dfd0ba
Added by Tuukka Lehtonen 5 months ago

Improved Acorn database rollback logic.

MainState will no longer destroy the entire database if the user removes
directories but forgets to remove the main.state file. The information
stored in main.state is now regarded as cached information only and if
it seems invalid or cannot be read, the same normal rollback logic will
be performed every time.

Another enhancement is that rollback will now automatically store the
revisions deleted by the rollback procedure in timestamped
<workspace>/db/recovery/yyyy-M-d_HH-mm-ss/ folders for later inspection
and debugging. Previously the code just deleted all the extra revisions.
Manually removing the recovery-folder is always a safe operation to
perform.
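
As an illustration of the recovery folder naming above, here is a minimal sketch that moves discarded revision directories into a timestamped folder instead of deleting them. The class name, method name and the way discarded revisions are passed in are assumptions made for this example. Only the db/recovery location and the yyyy-M-d_HH-mm-ss pattern come from the commit message.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.List;

// Hypothetical sketch of archiving rolled-back revisions.
final class RecoveryArchive {

    // Pattern taken from the commit message; month and day are not zero-padded.
    static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("yyyy-M-d_HH-mm-ss");

    /**
     * Moves each discarded revision directory into
     * dbDir/recovery/<timestamp>/ and returns that folder.
     */
    static Path archive(Path dbDir, List<Path> discarded, LocalDateTime now) throws IOException {
        Path target = dbDir.resolve("recovery").resolve(STAMP.format(now));
        Files.createDirectories(target);
        for (Path revision : discarded) {
            // A plain move works for non-empty directories only when source and
            // target are on the same file store, which holds inside one db folder.
            Files.move(revision, target.resolve(revision.getFileName()));
        }
        return target;
    }
}
```

For example, a rollback performed on 3 April 2017 at 09:05:07 would place its discarded revisions under db/recovery/2017-4-3_09-05-07/.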

Also fixed a bug in databoard Files class readFile methods that take
a File as argument. Previously all the functions constructed a
BinaryFile using the default mode which is "rw". This unintentionally
made the readFile methods create an empty file if the file did not
exist. All such methods have been changed to use mode "r".
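
The effect of the open mode can be reproduced with java.io.RandomAccessFile, used here as a stand-in. Whether BinaryFile delegates to RandomAccessFile internally is an assumption of this sketch, and the class and method names are invented for illustration.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.RandomAccessFile;

// Demonstrates why a read helper should not open files in "rw" mode.
final class OpenModeDemo {

    /** Opens and closes the file with the given mode and reports whether it exists afterwards. */
    static boolean existsAfterOpen(File file, String mode) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, mode)) {
            raf.length(); // touch the handle, no data is written
        } catch (FileNotFoundException e) {
            // Mode "r" on a missing file ends up here and creates nothing.
        }
        return file.exists();
    }
}
```

Calling existsAfterOpen on a missing file returns false with mode "r" and true with mode "rw", because "rw" creates the file as a side effect.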

refs #7124

Change-Id: I3ac04d2e33151b33f4982cf7a2edce7ddb896e11

Revision d5ca4ed7
Added by Tuukka Lehtonen 5 months ago

Fixed bad logical bug from Acorn's MainState.load rollback

The major bug was an inverted condition (a stray logical NOT) in the
MainState.load rollback, which caused database revisioning to be started
from 0 when the database was not empty. It should have been the other
way around.

Also cleaned up the database head.state validation code by not using
exceptions for flow control in validating head.state files.

refs #7124

Change-Id: I7cd57fa73d39a637c71159df63566aed5063fc40

Revision ca260ab8
Added by Jussi Koskela 5 months ago

Try to acquire DB lock even if the lock file already exists

The lock file may already exist if the program crashed or was terminated
forcefully. It is OK to try acquiring the lock on an existing lock file.

refs #7124

Change-Id: I1467dee3d889d18c68664f6df0b9fa9b13296351

Revision 25c9cc19
Added by Jussi Koskela 5 months ago

Improved logic in new head state creation.

Earlier, any IOException during the reading of the head state was
interpreted as an empty DB. This might cause an unwanted DB reset. It is
better to determine the need for an empty head state from the main
state's head directory.
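
The distinction made above can be sketched as follows. This is a simplified model, not the Acorn code. It assumes the main state either records a head directory or records none, and that an empty head state can be represented by an empty byte array. The names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical sketch: "empty database" is decided from the main state,
// never inferred from a failed read.
final class HeadStateLoader {

    /**
     * @param headDir the head directory recorded by the main state, or empty
     *                when the database has no revisions yet (assumed model)
     * @return the raw head state, or an empty array for a genuinely empty database
     * @throws IOException if a head directory exists but head.state cannot be read
     */
    static byte[] load(Optional<Path> headDir) throws IOException {
        if (!headDir.isPresent()) {
            return new byte[0]; // no head directory: start with an empty head state
        }
        // A read failure here is a real error and must not reset the database.
        return Files.readAllBytes(headDir.get().resolve("head.state"));
    }
}
```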

Switched AcornDatabase.start logic back to using RandomAccessFile for
touching the db/lock file. Using RandomAccessFile instead of
FileSystemProvider.newFileChannel in Windows better prevents any other
process from removing the lock file. The newFileChannel version did not
prevent the user from initially running 'del lock' to remove the file -
although the file will be recreated quickly by the system.

Also AcornDatabase.start now re-throws ProCoreException if
opening/locking the lock-file fails with IOException to prevent the
system from attempting to start up without a proper database to work
with. Previously the system just logged the start-up problem and
continued.
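
A rough sketch of the lock handling described above: the lock file is opened with RandomAccessFile whether or not it already exists, and any failure to open or lock it aborts start-up instead of being logged and ignored. DatabaseStartException is a hypothetical stand-in for ProCoreException, and the class is not the actual AcornDatabase code.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

// Hypothetical sketch of fail-fast database locking.
final class DbLockSketch implements AutoCloseable {

    /** Stand-in for ProCoreException, invented for this example. */
    static final class DatabaseStartException extends Exception {
        DatabaseStartException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    private final RandomAccessFile file;
    private final FileLock lock;

    private DbLockSketch(RandomAccessFile file, FileLock lock) {
        this.file = file;
        this.lock = lock;
    }

    /** Acquires the lock or throws. A leftover lock file from a crash is reused. */
    static DbLockSketch acquire(File lockFile) throws DatabaseStartException {
        RandomAccessFile raf = null;
        try {
            raf = new RandomAccessFile(lockFile, "rw"); // creates the file if missing
            FileLock lock = raf.getChannel().tryLock();
            if (lock == null) {
                throw new IOException("lock is held by another process");
            }
            return new DbLockSketch(raf, lock);
        } catch (IOException | OverlappingFileLockException e) {
            if (raf != null) {
                try {
                    raf.close();
                } catch (IOException ignored) {
                    // nothing more to do, the original failure is reported below
                }
            }
            throw new DatabaseStartException("Cannot lock " + lockFile, e);
        }
    }

    @Override
    public void close() throws IOException {
        lock.release();
        file.close();
    }
}
```

The existence of the lock file is never used as the signal here. Only a held lock blocks start-up, so a stale file left by a crash is harmless.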

refs #7124

Change-Id: I850b47d8f692e3d1b8ce177b9269540edc4dc272

Revision a6898527
Added by Tuukka Lehtonen 4 months ago

Improved Acorn database rollback logic.

MainState will no longer destroy the entire database if the user removes
directories but forgets to remove the main.state file. The information
stored in main.state is now regarded as cached information only and if
it seems invalid or cannot be read, the same normal rollback logic will
be performed every time.

Another enhancement is that rollback will now automatically store the
revisions deleted by the rollback procedure in timestamped
<workspace>/db/recovery/yyyy-M-d_HH-mm-ss/ folders for later inspection
and debugging. Previously the code just deleted all the extra revisions.
Manually removing the recovery-folder is always a safe operation to
perform.

Also fixed a bug in databoard Files class readFile methods that take
a File as argument. Previously all the functions constructed a
BinaryFile using the default mode which is "rw". This unintentionally
made the readFile methods create an empty file if the file did not
exist. All such methods have been changed to use mode "r".

refs #7124

Change-Id: I3ac04d2e33151b33f4982cf7a2edce7ddb896e11
(cherry picked from commit 20dfd0ba5e518a3706cd749c645a0a79480ea36f)

Revision 76f924bb
Added by Tuukka Lehtonen 4 months ago

Fixed bad logical bug from Acorn's MainState.load rollback

The major bug was an inverted condition (a stray logical NOT) in the
MainState.load rollback, which caused database revisioning to be started
from 0 when the database was not empty. It should have been the other
way around.

Also cleaned up the database head.state validation code by not using
exceptions for flow control in validating head.state files.

refs #7124

Change-Id: I7cd57fa73d39a637c71159df63566aed5063fc40
(cherry picked from commit d5ca4ed76bc83af27f2ade59ce49e35750aa4177)

Revision 4b277ed9
Added by Jussi Koskela 4 months ago

Try to acquire DB lock even if the lock file already exists

The lock file may already exist if the program crashed or was terminated
forcefully. It is OK to try acquiring the lock on an existing lock file.

refs #7124

Change-Id: I1467dee3d889d18c68664f6df0b9fa9b13296351
(cherry picked from commit ca260ab8f66fabd96ec3af80a3143ae09907d3db)

Revision 7cc6ec28
Added by Jussi Koskela 4 months ago

Improved logic in new head state creation.

Earlier, any IOException during the reading of the head state was
interpreted as an empty DB. This might cause an unwanted DB reset. It is
better to determine the need for an empty head state from the main
state's head directory.

Switched AcornDatabase.start logic back to using RandomAccessFile for
touching the db/lock file. Using RandomAccessFile instead of
FileSystemProvider.newFileChannel in Windows better prevents any other
process from removing the lock file. The newFileChannel version did not
prevent the user from initially running 'del lock' to remove the file -
although the file will be recreated quickly by the system.

Also AcornDatabase.start now re-throws ProCoreException if
opening/locking the lock-file fails with IOException to prevent the
system from attempting to start up without a proper database to work
with. Previously the system just logged the start-up problem and
continued.

refs #7124

Change-Id: I850b47d8f692e3d1b8ce177b9269540edc4dc272
(cherry picked from commit 25c9cc192b2611646b0a476bf205484500e92997)

Revision b4e846e4
Added by Tuukka Lehtonen 4 months ago

Check head.state file existence before validating its integrity

This prevents unnecessary NoSuchFileExceptions from being logged at
startup.
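
One way to express this check without exceptions as flow control is to return a status value. The sketch below is an assumption-laden illustration: the real integrity check is not described in this issue, so it is passed in as a caller-supplied predicate, and the class and enum names are invented.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Predicate;

// Hypothetical sketch of head.state validation that reports a status
// instead of throwing for the expected "file is not there" case.
final class HeadStateCheck {

    enum Status { MISSING, UNREADABLE, INVALID, VALID }

    /**
     * @param headState path to a head.state file
     * @param integrity placeholder for the real integrity check (assumption)
     */
    static Status check(Path headState, Predicate<byte[]> integrity) {
        if (!Files.isRegularFile(headState)) {
            return Status.MISSING; // expected at startup, nothing to log
        }
        try {
            byte[] content = Files.readAllBytes(headState);
            return integrity.test(content) ? Status.VALID : Status.INVALID;
        } catch (IOException e) {
            return Status.UNREADABLE; // a genuine I/O problem, worth logging by the caller
        }
    }
}
```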

refs #7124

Change-Id: I500c937ec8894f6c97dcfe73b01efc4adc9b59a6

Revision 9104cdc8
Added by Tuukka Lehtonen 4 months ago

Check head.state file existence before validating its integrity

This prevents unnecessary NoSuchFileExceptions from being logged at
startup.

refs #7124

Change-Id: I500c937ec8894f6c97dcfe73b01efc4adc9b59a6
(cherry picked from commit b4e846e4206e688050b659f2d66581d26e3dc1ce)

History

#1 Updated by Antti Villberg 6 months ago

  • Description updated

#2 Updated by Tuukka Lehtonen 5 months ago

  • Target version changed from 2017-14 to 2017-17

#3 Updated by Tuukka Lehtonen 5 months ago

  • Status changed from New to Feedback
  • % Done changed from 0 to 100

#5 Updated by Antti Villberg 5 months ago

  • Status changed from Feedback to Closed

#6 Updated by Tuukka Lehtonen 5 months ago

  • Tags changed from 1.28.0 to 1.29.0
  • Status changed from Closed to Feedback
  • Assignee changed from Antti Villberg to Tuukka Lehtonen
  • Target version changed from 2017-17 to 2017-18
  • Release changed from 53 to 54

Reopened because of severe bug in rollback.

#7 Updated by Tuukka Lehtonen 5 months ago

Teemu, your problems last Friday were most likely caused by the bug fixed in https://www.simantics.org:8088/r/#/c/482/3.

#8 Updated by Antti Villberg 4 months ago

  • Status changed from Feedback to Closed

#9 Updated by Tuukka Lehtonen 2 months ago

  • Release notes set to Major improvement to how Acorn DB handles rollback. Previously the implementation could destroy the entire database if the user removed snapshot directories but forgot to remove the @main.state@ file. The information stored in main.state is now regarded as cached information only, and if it seems invalid or cannot be read, the same normal rollback logic is performed every time. In addition, rollback now automatically stores the revisions it deletes in timestamped @<workspace>/db/recovery/yyyy-M-d_HH-mm-ss/@ folders for later inspection and debugging. Previously the code just deleted all the extra revisions. Manually removing the @db/recovery@ folder is always a safe operation.
