
Colocation score doubled for promotion score
Open, High, Public

Authored By
kgaillot
Thu, Jun 6, 3:17 PM
Referenced Files
Restricted File
Thu, Jun 6, 3:18 PM

Description

Reported by Andrew Beekhof

In the scheduler input F768756, rsc_SAPHana_I13_HDB00 has the below promotion scores on four nodes, plus a 99-score promoted colocation with vpcip.

  • ib1-az1-dbn1[1] (with vpcip): 100
  • ib1-az1-dbn2[2]: -12200
  • ib1-az2-dbn1[3]: 150
  • ib1-az2-dbn2[4]: -10000

Since commit 2984222d (in 2.1.7), the promotion priority on node 1 counts the colocation score twice, giving a total of 298 rather than 199 (100 + 99). This appears to be a bug rather than an improvement, but that needs to be verified.

Event Timeline

kgaillot triaged this task as High priority.
kgaillot created this task.
kgaillot created this object with edit policy "Restricted Project (Project)".
kgaillot updated the task description.

Yes, we're adding it twice, which does seem like a bug.

(pcmk__apply_coloc_to_priority)         trace: Applied master-prefers-vpcip to rsc_SAPHana_I13_HDB00:2 promotion priority (now 199 after adding 199)
(set_instance_priority)         trace: Assigning rsc_SAPHana_I13_HDB00:2 priority = 199
...
(sort_promotable_instances)     trace: Adding scores for rsc_SAPHana_I13_HDB00-clone: initial promotion priority for rsc_SAPHana_I13_HDB00:2 is 199
(add_promotion_priority_to_node_score)  trace: Added cumulative priority of rsc_SAPHana_I13_HDB00:2 (199) to score on ib1-az1-dbn1 (now 199)
(pcmk__add_this_with)   trace: Adding colocation master-prefers-vpcip (rsc_SAPHana_I13_HDB00-clone with vpcip using #uname @99) to 'this with' list for rsc_SAPHana_I13_HDB00-clone
(apply_coloc_to_dependent)      trace: Applying colocation master-prefers-vpcip (promoted rsc_SAPHana_I13_HDB00-clone with vpcip) @99
(add_node_scores_matching_attr)         trace: ib1-az1-dbn1: 199 + 0.000099 * 1000000 = 298
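
The trace above can be replayed as plain arithmetic. The sketch below is a simplified Python model, not Pacemaker code: the two helper functions are hypothetical and only borrow their names loosely from the log, the 1000000 multiplier is taken from the last trace line, and rounding the weighted score to the nearest integer is an assumption.

```python
# Simplified model of the trace above; NOT Pacemaker's implementation.
SCORE_SCALE = 1_000_000  # multiplier shown in the add_node_scores_matching_attr line

BASE_PROMOTION_SCORE = 100  # ib1-az1-dbn1, from the task description
COLOCATION_SCORE = 99       # master-prefers-vpcip


def apply_coloc_to_priority(priority, coloc_score):
    """First application: colocation score added to the promotion priority."""
    return priority + coloc_score


def apply_coloc_to_node_score(node_score, coloc_score):
    """Second application: node score + factor * scale, as in the last trace line.

    Rounding to the nearest integer is an assumption of this sketch.
    """
    factor = coloc_score / SCORE_SCALE  # 0.000099 in the trace
    return node_score + round(factor * SCORE_SCALE)


priority = apply_coloc_to_priority(BASE_PROMOTION_SCORE, COLOCATION_SCORE)
node_score = priority  # the priority is carried over to the node score unchanged
doubled = apply_coloc_to_node_score(node_score, COLOCATION_SCORE)
counted_once = BASE_PROMOTION_SCORE + COLOCATION_SCORE

print(priority, doubled, counted_once)
```

In this model the first step yields 199, matching the set_instance_priority line, and the second step yields 298, matching the final trace line. Counting the colocation once would leave the score at 199.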

None of pcmk__apply_coloc_to_priority(), add_promotion_priority_to_node_score(), or apply_coloc_to_dependent() is redundant as of right now (I tested dropping each), so this is going to require some work. Hoping it's not too bad...


Initial results look okay with dropping apply_coloc_to_dependent() and adding a block to pcmk__apply_coloc_to_priority()... I need to mess with it some more. We also might prefer moving logic into apply_coloc_to_dependent() and trying to get rid of pcmk__apply_coloc_to_priority() instead, if possible.

Side note, the test input itself is intriguing. The summary wouldn't change even if this issue were fixed and the scores were applied correctly. So the problem shouldn't have been noticeable with this particular input.
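
To see why the outcome is the same for this input, the per-node priorities from the description can be compared with the colocation score applied once and twice. This is a minimal sketch that assumes the instance is promoted on the node with the highest resulting priority, which simplifies what the scheduler actually does; the helper names are hypothetical.

```python
# Promotion scores from the task description; only ib1-az1-dbn1 hosts vpcip.
PROMOTION_SCORES = {
    "ib1-az1-dbn1": 100,
    "ib1-az1-dbn2": -12200,
    "ib1-az2-dbn1": 150,
    "ib1-az2-dbn2": -10000,
}
COLOCATION_SCORE = 99
VPCIP_NODE = "ib1-az1-dbn1"


def priorities(times_applied):
    """Return per-node priorities with the colocation score added N times."""
    result = dict(PROMOTION_SCORES)
    result[VPCIP_NODE] += times_applied * COLOCATION_SCORE
    return result


def best_node(scores):
    """Pick the node with the highest priority (simplifying assumption)."""
    return max(scores, key=scores.get)


once = priorities(1)   # ib1-az1-dbn1 at 199
twice = priorities(2)  # ib1-az1-dbn1 at 298

print(best_node(once), once[VPCIP_NODE])
print(best_node(twice), twice[VPCIP_NODE])
```

Under that assumption, ib1-az1-dbn1 outranks ib1-az2-dbn1 (150) in both cases, so only the reported score differs.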

It's noticeable solely in the scores.

Yeah, I just meant that I usually wouldn't check scores unless they're affecting the behavior somehow, so I'm surprised this was found.