I have a proposal for an additional data consistency measure that will help prevent wrong promote decisions.
The proposal is to create a permanent parameter that stores the highest timeline number ever reached in this database cluster. The parameter is saved in the post-promote phase and consulted in pre-promote. This ensures that a failed master is never promoted.
Details:
post-promote: save the new timeline value to the crm_config section of the cluster configuration (CIB). Why crm_config and not a private attribute: a crm_config parameter persists across reboots and crashes, is node-independent, and is consistently reachable from any node within the quorate partition. Format:
crm_attribute --lifetime forever --type crm_config --name "$name" --update "$val"
pre-promote: get the timeline value of the local database and compare it with the global highest timeline value. If the local timeline is lower than the global highest, abort the promotion (set attr to abort).
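The two hooks described above could be sketched roughly as follows. This is only an illustration, not the resource agent's actual code: the attribute name `pgsql-highest-timeline` and all function names are hypothetical, and the local timeline is passed in as an argument because how the agent obtains it is not covered here. The sketch also assumes the stored value should only ever move upwards, so post-promote writes only when the new timeline is higher.

```sh
#!/bin/sh
# Hypothetical name for the cluster-wide attribute (an assumption, not from the agent).
ATTR_NAME="pgsql-highest-timeline"

# Print the highest timeline recorded in crm_config, or 0 if none is stored yet.
get_global_timeline() {
    ggt_val=$(crm_attribute --type crm_config --name "$ATTR_NAME" --query --quiet 2>/dev/null) || ggt_val=""
    echo "${ggt_val:-0}"
}

# Store a timeline value cluster-wide, using the command format from the proposal.
set_global_timeline() {
    crm_attribute --lifetime forever --type crm_config --name "$ATTR_NAME" --update "$1"
}

# pre-promote: succeed (status 0) only if the local timeline is not behind
# the highest timeline the cluster has ever reached.
pre_promote_check() {
    ppc_local=$1
    ppc_global=$(get_global_timeline)
    [ "$ppc_local" -ge "$ppc_global" ]
}

# post-promote: record the new timeline, but never lower the stored value
# (assumption: the attribute is meant to be monotonic).
post_promote_record() {
    ppr_local=$1
    ppr_global=$(get_global_timeline)
    if [ "$ppr_local" -gt "$ppr_global" ]; then
        set_global_timeline "$ppr_local"
    fi
}
```

In pre-promote the agent would call something like `pre_promote_check "$local_timeline" || abort`, where the abort step stands for whatever mechanism the agent already uses to cancel a promotion.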
Why it is needed:
- It ensures that a failed master (or a greatly lagging slave) is never promoted under any circumstances, even when fencing is not configured.
- It is just an additional measure that can be helpful, and it does not interfere with the current voting mechanisms.
- It is a prerequisite for the auto-rewind of failed masters that I'm considering implementing. I've already thought it through and will open a separate issue to discuss it, but in short: if the local timeline of the DB is lower than the global highest timeline when a local resource starts, we have successfully identified a failed master (or a greatly lagging slave) that needs a rewind or a basebackup.
I'm halfway through implementing the global timeline check, and I've opened this issue to ask whether it sounds desirable to you (my aim is to integrate as many changes as possible back into your project).
Jan