diff --git a/doc/crm-flowchart.fig b/doc/crm-flowchart.fig deleted file mode 100644 index 6f778cb646..0000000000 --- a/doc/crm-flowchart.fig +++ /dev/null @@ -1,335 +0,0 @@ -#FIG 3.2 -Landscape -Center -Metric -A4 -59.40 -Single --2 -1200 2 -6 1620 1665 2970 2430 -2 4 0 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 2925 2385 2925 1890 1845 1890 1845 2385 2925 2385 -2 4 1 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 2835 2295 1755 2295 1755 1800 2835 1800 2835 2295 -2 4 1 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 2745 2205 1665 2205 1665 1710 2745 1710 2745 2205 -4 1 0 50 0 14 14 0.0000 4 120 360 2340 2115 RAs\001 --6 -6 6255 2520 7785 3375 -6 6345 2610 7695 3285 -4 1 0 50 0 14 14 0.0000 4 135 840 7020 2745 Cluster\001 -4 1 0 50 0 14 14 0.0000 4 135 1320 7020 3000 Information\001 -4 1 0 50 0 14 14 0.0000 4 120 480 7020 3255 Base\001 --6 -6 6255 2520 7785 3375 -2 4 0 2 0 7 50 0 -1 0.000 0 0 11 0 0 5 - 7740 3330 7740 2565 6300 2565 6300 3330 7740 3330 --6 --6 -6 7875 2520 8820 3150 -6 7875 2520 8820 3150 -2 4 0 2 0 7 50 0 -1 0.000 0 0 12 0 0 5 - 8773 3102 8773 2568 7922 2568 7922 3102 8773 3102 --6 -4 1 0 50 0 14 14 0.0000 4 180 720 8348 2762 Policy\001 -4 1 0 50 0 14 14 0.0000 4 180 720 8348 3037 Engine\001 --6 -6 8910 2520 10665 2925 -2 4 0 2 0 7 50 0 -1 0.000 0 0 11 0 0 5 - 10620 2880 10620 2565 8955 2565 8955 2880 10620 2880 -4 1 0 50 0 14 14 0.0000 4 135 1440 9765 2790 Transitioner\001 --6 -6 6480 1620 10305 2025 -2 4 0 2 0 7 50 0 -1 0.000 0 0 11 0 0 5 - 10260 1980 10260 1665 6525 1665 6525 1980 10260 1980 -4 1 0 50 0 14 16 0.0000 4 195 3600 8415 1890 Cluster Resource Manager\001 --6 -6 7875 4725 9450 5130 -2 4 0 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 9405 5085 9405 4770 7920 4770 7920 5085 9405 5085 -4 1 0 50 0 14 16 0.0000 4 150 1350 8685 4995 heartbeat\001 --6 -6 8730 4095 9990 4455 -2 4 0 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 9945 4410 9945 4140 8775 4140 8775 4410 9945 4410 -4 1 0 50 0 14 14 0.0000 4 180 1080 9360 4320 Messaging\001 --6 -6 7200 3825 8640 4680 -2 4 0 2 0 7 50 0 -1 6.000 0 0 11 
0 0 5 - 8595 4635 8595 3870 7245 3870 7245 4635 8595 4635 -4 1 0 50 0 14 14 0.0000 4 120 1080 7920 4050 Concensus\001 -4 1 0 50 0 14 14 0.0000 4 135 840 7920 4305 Cluster\001 -4 1 0 50 0 14 14 0.0000 4 180 1200 7920 4560 Membership\001 --6 -6 12465 1575 13815 2340 -2 4 0 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 13770 2295 13770 1800 12690 1800 12690 2295 13770 2295 -2 4 1 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 13680 2205 12600 2205 12600 1710 13680 1710 13680 2205 -2 4 1 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 13590 2115 12510 2115 12510 1620 13590 1620 13590 2115 -4 1 0 50 0 14 14 0.0000 4 120 360 13185 2025 RAs\001 --6 -6 17325 1530 21150 1935 -2 4 0 2 0 7 50 0 -1 0.000 0 0 11 0 0 5 - 21105 1890 21105 1575 17370 1575 17370 1890 21105 1890 -4 1 0 50 0 14 16 0.0000 4 195 3600 19260 1800 Cluster Resource Manager\001 --6 -6 18720 4635 20295 5040 -2 4 0 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 20250 4995 20250 4680 18765 4680 18765 4995 20250 4995 -4 1 0 50 0 14 16 0.0000 4 150 1350 19530 4905 heartbeat\001 --6 -6 19575 4005 20835 4365 -2 4 0 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 20790 4320 20790 4050 19620 4050 19620 4320 20790 4320 -4 1 0 50 0 14 14 0.0000 4 180 1080 20205 4230 Messaging\001 --6 -6 18045 3735 19485 4590 -2 4 0 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 19440 4545 19440 3780 18090 3780 18090 4545 19440 4545 -4 1 0 50 0 14 14 0.0000 4 120 1080 18765 3960 Concensus\001 -4 1 0 50 0 14 14 0.0000 4 135 840 18765 4215 Cluster\001 -4 1 0 50 0 14 14 0.0000 4 180 1200 18765 4470 Membership\001 --6 -6 18315 2115 19845 2970 -6 18405 2205 19755 2880 -4 1 0 50 0 14 14 0.0000 4 135 840 19080 2340 Cluster\001 -4 1 0 50 0 14 14 0.0000 4 135 1320 19080 2595 Information\001 -4 1 0 50 0 14 14 0.0000 4 120 480 19080 2850 Base\001 --6 -6 18315 2115 19845 2970 -2 4 0 2 0 7 50 0 -1 0.000 0 0 11 0 0 5 - 19800 2925 19800 2160 18360 2160 18360 2925 19800 2925 --6 --6 -6 6750 8370 8010 9090 -4 1 0 50 0 14 14 0.0000 4 120 1080 7380 8505 Concensus\001 -4 1 0 50 0 14 14 0.0000 4 135 840 7380 8760 Cluster\001 -4 
1 0 50 0 14 14 0.0000 4 180 1200 7380 9015 Membership\001 --6 -6 6300 9945 7830 10800 -2 4 0 2 0 7 50 0 -1 0.000 0 0 11 0 0 5 - 6345 9990 6345 10755 7785 10755 7785 9990 6345 9990 --6 -6 6390 10035 7740 10710 -4 1 0 50 0 14 14 0.0000 4 135 840 7065 10170 Cluster\001 -4 1 0 50 0 14 14 0.0000 4 135 1320 7065 10425 Information\001 -4 1 0 50 0 14 14 0.0000 4 120 480 7065 10680 Base\001 --6 -6 3240 1755 4905 2250 -2 4 0 2 0 7 50 0 -1 6.000 0 0 15 0 0 5 - 4859 2222 4859 1783 3286 1783 3286 2222 4859 2222 -4 1 0 50 0 14 14 0.0000 4 135 1320 4095 1980 Executioner\001 -4 1 0 50 0 12 14 0.0000 4 165 1080 4095 2160 (STONITH)\001 --6 -6 14085 1710 15750 2205 -2 4 0 2 0 7 50 0 -1 6.000 0 0 15 0 0 5 - 15704 2177 15704 1738 14131 1738 14131 2177 15704 2177 -4 1 0 50 0 14 14 0.0000 4 135 1320 14940 1935 Executioner\001 -4 1 0 50 0 12 14 0.0000 4 165 1080 14940 2115 (STONITH)\001 --6 -6 10485 10710 12150 11205 -2 4 0 2 0 7 50 0 -1 6.000 0 0 15 0 0 5 - 12104 11177 12104 10738 10531 10738 10531 11177 12104 11177 -4 1 0 50 0 14 14 0.0000 4 135 1320 11340 10935 Executioner\001 -4 1 0 50 0 12 14 0.0000 4 165 1080 11340 11115 (STONITH)\001 --6 -6 15300 4320 17415 4995 -2 2 3 2 1 7 50 0 -1 6.000 0 0 -1 0 0 5 - 15345 4365 17370 4365 17370 4950 15345 4950 15345 4365 -4 1 1 50 0 14 16 0.0000 4 150 1950 16380 4590 Adminstrative\001 -4 1 1 50 0 14 16 0.0000 4 180 1050 16380 4875 request\001 --6 -2 1 0 4 0 7 50 0 -1 10.000 0 0 -1 0 0 2 - 1350 6300 21600 6300 -2 1 1 4 0 7 50 0 -1 10.000 0 0 -1 0 0 2 - 1845 6795 21780 6795 -2 1 1 4 0 7 50 0 -1 10.000 0 0 -1 0 0 2 - 8775 6795 8775 5085 -2 1 1 4 0 7 50 0 -1 10.000 0 0 -1 0 0 2 - 19755 4995 19755 6795 -2 1 0 4 0 7 50 0 -1 10.000 0 0 -1 0 0 2 - 19350 6255 19350 4995 -2 1 0 4 0 7 50 0 -1 10.000 0 0 -1 0 0 2 - 8415 6300 8415 5085 -2 1 1 4 0 7 50 0 -1 10.000 0 0 -1 0 0 2 - 6750 7920 6750 6795 -2 1 0 4 0 7 50 0 -1 10.000 0 0 -1 0 0 2 - 6390 7920 6390 6300 -2 4 0 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 5040 3195 5040 2880 1575 2880 1575 3195 5040 3195 -2 1 1 2 
0 7 50 0 -1 6.000 0 0 11 1 1 2 - 1 1 1.00 120.00 150.00 - 1 1 1.00 120.00 150.00 - 2430 2880 2430 2385 -2 4 2 3 0 7 50 0 -1 2.000 0 0 11 0 0 5 - 5130 1620 5130 3285 1485 3285 1485 1620 5130 1620 -2 4 2 3 0 7 50 0 -1 2.000 0 0 11 0 0 5 - 10710 3465 10710 1530 6165 1530 6165 3465 10710 3465 -2 4 2 3 0 7 50 0 -1 2.000 0 0 11 0 0 5 - 10035 5175 10035 3780 7155 3780 7155 5175 10035 5175 -2 4 0 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 15885 3105 15885 2790 12420 2790 12420 3105 15885 3105 -2 1 1 2 0 7 50 0 -1 6.000 0 0 11 1 1 2 - 1 1 1.00 120.00 150.00 - 1 1 1.00 120.00 150.00 - 13275 2790 13275 2295 -2 1 1 2 0 7 50 0 -1 6.000 0 0 11 1 1 2 - 1 1 1.00 120.00 150.00 - 1 1 1.00 120.00 150.00 - 14850 2790 14850 2160 -2 4 2 3 0 7 50 0 -1 2.000 0 0 11 0 0 5 - 15975 1530 15975 3195 12330 3195 12330 1530 15975 1530 -2 4 2 3 0 7 50 0 -1 2.000 0 0 11 0 0 5 - 20880 5085 20880 3690 18000 3690 18000 5085 20880 5085 -2 4 2 3 0 7 50 0 -1 2.000 0 0 11 0 0 5 - 21195 3015 21195 1485 17280 1485 17280 3015 21195 3015 -2 4 2 3 0 7 50 0 -1 2.000 0 0 11 0 0 5 - 21375 5220 21375 900 12150 900 12150 5220 21375 5220 -2 4 0 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 10260 9810 10260 10125 13725 10125 13725 9810 10260 9810 -2 1 1 2 0 7 50 0 -1 6.000 0 0 11 1 1 2 - 1 1 1.00 120.00 150.00 - 1 1 1.00 120.00 150.00 - 12870 10125 12870 10620 -2 4 2 3 0 7 50 0 -1 2.000 0 0 11 0 0 5 - 10170 11385 10170 9720 13815 9720 13815 11385 10170 11385 -2 4 2 3 0 7 50 0 -1 2.000 0 0 11 0 0 5 - 5265 7830 5265 9225 8145 9225 8145 7830 5265 7830 -2 4 0 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 5355 8595 5355 8865 6525 8865 6525 8595 5355 8595 -2 4 0 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 5895 7920 5895 8235 7380 8235 7380 7920 5895 7920 -2 4 0 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 6705 8370 6705 9135 8055 9135 8055 8370 6705 8370 -2 4 2 3 0 7 50 0 -1 2.000 0 0 11 0 0 5 - 4725 7650 4725 11970 13950 11970 13950 7650 4725 7650 -2 4 0 2 0 7 50 0 -1 0.000 0 0 11 0 0 5 - 5040 11025 5040 11340 8775 11340 8775 11025 5040 11025 -2 4 2 3 0 7 50 0 -1 2.000 
0 0 11 0 0 5 - 4950 11430 4950 9900 8865 9900 8865 11430 4950 11430 -2 1 1 2 0 7 50 0 -1 6.000 0 0 11 1 1 2 - 1 1 1.00 120.00 150.00 - 1 1 1.00 120.00 150.00 - 11295 10125 11295 10755 -2 4 0 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 12375 10620 12375 11115 13455 11115 13455 10620 12375 10620 -2 4 1 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 12465 10710 13545 10710 13545 11205 12465 11205 12465 10710 -2 4 1 2 0 7 50 0 -1 6.000 0 0 11 0 0 5 - 12555 10800 13635 10800 13635 11295 12555 11295 12555 10800 -2 1 1 2 0 7 50 0 -1 6.000 0 0 11 1 1 2 - 1 1 1.00 120.00 150.00 - 1 1 1.00 120.00 150.00 - 4095 2925 4095 2295 -2 2 3 2 1 7 50 0 -1 6.000 0 0 -1 0 0 5 - 12735 4185 14985 4185 14985 4815 12735 4815 12735 4185 -2 4 2 3 2 7 50 0 -1 2.000 0 0 11 0 0 5 - 10935 5445 1305 5445 1305 855 10935 855 10935 5445 -3 2 1 2 4 7 50 0 -1 6.000 0 1 1 3 - 1 1 1.00 120.00 150.00 - 1 1 1.00 120.00 150.00 - 7740 3150 14355 3780 18360 2790 - 0.000 -1.000 0.000 -3 2 1 2 4 7 50 0 -1 6.000 0 1 0 2 - 1 1 1.00 120.00 150.00 - 10620 2745 12420 2970 - 0.000 0.000 -3 2 1 2 4 7 50 0 -1 6.000 0 1 0 2 - 1 1 1.00 120.00 150.00 - 10125 2880 11250 9810 - 0.000 0.000 -3 2 1 2 0 7 50 0 -1 6.000 0 1 0 3 - 1 1 1.00 120.00 150.00 - 7245 4365 5535 3375 6525 1845 - 0.000 -1.000 0.000 -3 2 1 2 0 7 50 0 -1 6.000 0 1 1 2 - 1 1 1.00 120.00 150.00 - 1 1 1.00 120.00 150.00 - 5040 3060 6300 2925 - 0.000 0.000 -3 2 1 2 0 7 50 0 -1 6.000 0 1 0 3 - 1 1 1.00 120.00 150.00 - 7245 4275 6930 4005 6930 3330 - 0.000 -1.000 0.000 -3 2 1 2 0 7 50 0 -1 6.000 0 1 0 3 - 1 1 1.00 120.00 150.00 - 6975 2565 7740 2340 8325 2565 - 0.000 -1.000 0.000 -3 2 1 2 0 7 50 0 -1 6.000 0 1 0 3 - 1 1 1.00 120.00 150.00 - 8325 2565 9000 2340 9765 2565 - 0.000 -1.000 0.000 -3 2 1 2 4 7 50 0 -1 6.000 0 1 0 4 - 1 1 1.00 120.00 150.00 - 10035 2565 9450 2115 6480 2205 4905 2880 - 0.000 -1.000 -1.000 0.000 -3 2 1 2 0 7 50 0 -1 6.000 0 1 0 3 - 1 1 1.00 120.00 150.00 - 18090 4275 16380 3285 17370 1755 - 0.000 -1.000 0.000 -3 2 1 2 0 7 50 0 -1 6.000 0 1 1 2 - 1 1 1.00 
120.00 150.00 - 1 1 1.00 120.00 150.00 - 15885 2970 18315 2565 - 0.000 0.000 -3 2 1 2 0 7 50 0 -1 6.000 0 1 0 3 - 1 1 1.00 120.00 150.00 - 18090 4185 17820 3645 18405 2925 - 0.000 -1.000 0.000 -3 2 1 2 0 7 50 0 -1 6.000 0 1 0 3 - 1 1 1.00 120.00 150.00 - 8055 8640 9765 9630 8775 11160 - 0.000 -1.000 0.000 -3 2 1 2 0 7 50 0 -1 6.000 0 1 1 2 - 1 1 1.00 120.00 150.00 - 1 1 1.00 120.00 150.00 - 10260 9945 7830 10350 - 0.000 0.000 -3 2 1 2 0 7 50 0 -1 6.000 0 1 0 3 - 1 1 1.00 120.00 150.00 - 8055 8730 8370 9000 7740 9990 - 0.000 -1.000 0.000 -3 2 1 2 4 7 50 0 -1 6.000 0 1 0 5 - 1 1 1.00 120.00 150.00 - 6345 10575 3330 9765 2610 6165 3780 4095 6300 3060 - 0.000 -1.000 -1.000 -1.000 0.000 -3 2 1 2 4 7 50 0 -1 6.000 0 0 1 5 - 1 1 1.00 120.00 150.00 - 6300 10305 4095 8460 3420 5850 4635 3825 6300 3195 - 0.000 -1.000 -1.000 -1.000 0.000 -3 2 1 2 0 7 50 0 -1 6.000 0 1 0 3 - 1 1 1.00 120.00 150.00 - 8055 2610 7785 2475 7380 2565 - 0.000 -1.000 0.000 -3 2 3 2 1 7 50 0 -1 6.000 0 1 0 4 - 1 1 1.00 120.00 150.00 - 16650 4365 17730 1755 13050 1350 10260 1800 - 0.000 -1.000 -1.000 0.000 -3 2 3 2 1 7 50 0 -1 6.000 0 1 1 3 - 1 1 1.00 120.00 150.00 - 1 1 1.00 120.00 150.00 - 14940 4185 17730 1800 18810 2160 - 0.000 -1.000 0.000 -4 1 0 50 0 14 16 0.0000 4 195 3300 3330 3105 Local Resource Manager\001 -4 1 0 50 0 14 14 0.0000 4 120 720 5625 4050 Events\001 -4 1 0 50 0 14 18 0.0000 4 240 4455 5850 1125 Designated Controller Node\001 -4 1 0 50 0 14 16 0.0000 4 195 3300 14175 3015 Local Resource Manager\001 -4 1 0 50 0 14 14 0.0000 4 120 720 16470 3960 Events\001 -4 1 0 50 0 14 18 0.0000 4 240 5280 16650 1215 Any client node in the partition\001 -4 1 0 50 0 14 14 0.0000 4 120 720 9675 8955 Events\001 -4 1 0 50 0 14 16 0.0000 4 150 1350 6660 8145 heartbeat\001 -4 1 0 50 0 14 14 0.0000 4 180 1080 5940 8775 Messaging\001 -4 1 0 50 0 14 16 0.0000 4 195 3600 6885 11250 Cluster Resource Manager\001 -4 1 0 50 0 14 16 0.0000 4 195 3300 12015 10035 Local Resource Manager\001 -4 1 0 50 0 14 14 0.0000 
4 120 360 13050 11025 RAs\001 -4 1 0 50 0 14 18 0.0000 4 240 5280 9495 11700 Any client node in the partition\001 -4 2 4 50 0 12 14 0.0000 4 120 720 2970 4920 status\001 -4 2 4 50 0 12 14 0.0000 4 120 840 3105 4680 Gathers\001 -4 0 4 50 0 12 14 0.0000 4 135 3000 10665 5940 Instructs and coordinates\001 -4 0 4 50 0 12 14 0.0000 4 165 1200 3825 4905 Replicates\001 -4 0 4 50 0 12 14 0.0000 4 165 1560 3735 5130 configuration\001 -4 1 1 50 0 14 16 0.0000 4 150 1950 13860 4410 Adminstrative\001 -4 1 1 50 0 14 16 0.0000 4 195 2100 13860 4695 status inquiry\001 diff --git a/doc/executioner.txt b/doc/executioner.txt deleted file mode 100644 index 2ad41e1256..0000000000 --- a/doc/executioner.txt +++ /dev/null @@ -1,115 +0,0 @@ -DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! - -NOTICE: Some ideas in this paper aren't yet well sorted. Some ideas aren't -complete. Some phrasings I'm myself not happy with yet. Some ideas need -further explanation. Most of the ideas presented are not final yet. It is -mostly a braindump. - -And did I say yet that this is still a DRAFT!!!!!! ? - - -Title: The Executioner -Author: Lars Marowsky-Brée -Acknowledgements: David Brower, Oracle - Alan Robertson, IBM - - -1. Summary - -Every node runs an instance of the fencing daemon ("executioner"). This daemon -knows which fencing devices are currently reachable and which nodes can be -fenced by them - ie, the current topology of the fencing mechanisms - and will -execute such requests on behalf of the CRM and report success or failure. - -A successful fencing operation shall imply that the target node of the request -can no longer access any shared resources in the cluster, until it has -"properly rejoined the cluster". - - -2. Fencing topology information - -(Note: modelled after the STONITH model by alanr in heartbeat) - -2.1. Static configuration data - -The mechanisms available for fencing need to be configured on each node. 
- -Provision should be made that this file can be the same on all nodes (to ease -configuration deployment). It shall be made easy to configure a device for a -list of nodes or all nodes. - -TODO: Can this configuration also be stored in the CIB configuration part? -This would collide with the concept that every node is the authoritative source -of information about itself. - - -2.2. Runtime topology - -Every device needs to support a low-latency "ping" operation, which shall -verify whether it can currently be reached from the local node; this shall -preferably not be affected by concurrent access. - -The devices can either autodiscover their targets (ie, like via STONITH -devices), or have to provide means of configuring this list; the list shall -only be queried from the device on explicit request by the CRM. - -The CRM shall be allowed to assume that the same device can fence the same set -of nodes for all clients. - - -3. Interaction with the CRM - -3.1. Policies - -The CRM will be responsible for retrying failed commands; the Executioner -shall only make exactly one attempt. It shall not retry the request on another -device in particular; it is permissible to retry the command on the same -device if it seems like an intermittent failure. - - -3.2. Queries/Commands issued - -3.2.1. Device reachable - -The Executioner shall verify whether the given device is still reachable by -the local node at this point in time. - -The verification shall be low-latency and lightweight and allow for concurrent -access from multiple nodes (if appropriate, ie for network switches). - - -3.2.2. Targets fenceable via device Y - -The node shall contact the device and return the list of nodes which it can -fence. - -The CRM will ensure that no other node in the partition is accessing device Y -right now. - -Results: - 0 Success; list of targets included - 1 Failed; device not reached - 2 Failed; device failed to return list of targets - - -3.2.3. 
Fencing request to fence node X via device Y - -This is a blocking, synchronous call. The CRM will ensure that no other node -in the partition is accessing device Y right now. - -(This is an issue for certain network power switches) - -The result code needs to distinguish between: - - 0 Fencing request succeeded - 1 Failed: device could not be reached, potential network issue - 2 Failed: device tried, but failed to acknowledge success - 3 Failed: internal device failure - -TODO: Is so much differentiation really necessary? Yes, it can help identify - quickly whether it is sensible to retry the fencing request via - another node to the same device, or whether the next fencing device - should be tried immediately; maybe that is overkill. - - - diff --git a/doc/msg-schema.txt b/doc/msg-schema.txt deleted file mode 100644 index f07c6aef9a..0000000000 --- a/doc/msg-schema.txt +++ /dev/null @@ -1,166 +0,0 @@ -Background -################## -First of all, go look at the diagram (comms.gif), read this, then -look at the diagram again. - -Next, some terminology... -Here I will use CRM to refer to the light blue section. That is, -the entire collection of processes/daemons/modules on a node that, -as a whole, manage resources in the cluster. CRMd refers to one -of the dark blue boxes. It is the "master subsystem" if you like. -Its role is to co-ordinate the actions of all the other pieces of -the puzzle, including those on other nodes. - -Key points from the diagram: -- All communications with the CRM are done with Heartbeat messages - routed through the CRMd. These messages contain a text - representation of an XML document, the schema of which is outlined - at the end of this document. -- All communications internal to the CRM (ie. between its subsystems) - are performed with IPC messages. Again all messages are routed via - the CRMd and contain the same XML documents as Heartbeat messages. 
-- All admin clients (eventually) end up sending Heartbeat messages - and are thus subject to the existing HA client security. -- The RPC layer allows the cluster to be controlled from non-member - hosts (subject to RPC security which is available for free). -- The option of synchronous or asynchronous RPC calls will be provided. - This will probably be in the form of a flag sent as part of the - function call. - -Advantages: -- The only source of "requests" is the CRMd which means it *never* has - to forward on "request" messages for any of its sub-systems. This is - useful for the security of the system (see security.txt). -- Potentially, most CRM<-->CRM communications can be replaced with RPC - calls. -- We are able to re-use existing security mechanisms (IPC, HA, RPC, - unix_auth via RPC) to protect the system. - -Message scenarios: -################## -There are really only 3 messaging scenarios in this system (excluding -broadcast vs. point-to-point). Again this is nice as it keeps down the -number of "special cases". - -1) Sub-system <--(IPC)--> CRM <--(IPC)--> Sub-system -2) Sub-system <--(IPC)--> Local CRM <--(Heartbeat)--> Remote CRM <--(IPC)--> Sub-system -3) Admin Client <--(Heartbeat Broadcast)--> Remote CRM <--(IPC)--> Sub-system - -Message examples: -################## -1.1) the DC telling the local LRM to start a resource -1.2) the LRM asking the CIB about a resource - -2.1) the DC telling a remote LRM to start a resource -2.2) the DC asking (all) the CIB(s) to provide their view of the world - -3.1) an admin request to add/remove/modify a resource -3.2) an admin request to force a failover of a resource or a -recomputation of the resource dependencies. - -Message Notes: -################## -Messages may be sent to the CRM from local sub-systems via IPC or from -other HA clients via Heartbeat. It is then the responsibility of the -CRM to unpack the message and pass it on to the correct sub-system. 
If -the destination sub-system is the DC and the DC is not running on the -current node, the message is discarded without error. - -Where the DC receives a message from another node, it will also keep -track of the sending host and the reference number so that it can direct -the replies appropriately. The exception to this is where the message -is from the DC. - -Messages to the DC are *always* sent as broadcast messages and the DC -*must always* acknowledge the message with either the results of the -message or a "thankyou" message. The reason for this is that the DC may -change or a DC election may be in progress. The implication of this is -that the sender should always set a timer and resend dc_messages if they -have not been acknowledged. The DC will be able to detect duplicates by -examining the destination sub-system and the reference number and we -will rely on HA to ensure the delivery of DC responses. - -All messages are full crm_messages. I toyed with only sending the *_request -or *_response piece of the message to and/or from the relevant sub-systems, -but it just got messy. This way, the routing role of the CRM is much easier. -And easier equals lower complexity, which means fewer bugs, which is good for -everyone. - -Schema Notes: -################### - -Key Attributes -=============== - -reference: provides the ability to track which request a response - is in relation to and where the local CRM should send it. -*_filter: allow the operation to be limited to a particular type, - id and/or priority -timeout: allows the receiver to know how long the sender is - expecting the task to take so we can act and report back - accordingly. - - -Attribute values -================= -Where the list ends with |... , the complete list of possibilities will be -fleshed out at a later date. - -Message Schema: -################### - - - - - - - - filter_type? #CDATA - filter_id? #CDATA> - - - - - - - - - - - - - - - - - - - - - - -